THE PRACTICE OF ENGLISH LANGUAGE TEACHING: TESTING & EVALUATION
TESTING AND ASSESSMENT
Measurement and evaluation have been with us for a long time.
Since the effect of testing on teaching and learning is unavoidable, testing is an important part of every language teaching and language learning experience.
WHY TEST?
Has the instruction been successful?
Were the materials for instruction at the right level?
Have all language skills been emphasized equally?
What points need reviewing?
Should the same materials be used next year, or do they need some modifications?
……….
Slide 4: TEST, EVALUATION, MEASUREMENT
TEST
Test is the narrowest of the three terms. It often connotes the presentation of a set of questions to be answered. In general, what distinguishes a test from other types of measurement is that it is designed to obtain a specific sample of behavior.
It is one type of measurement.
It may be used for pedagogic or descriptive purposes.
MEASUREMENT
It implies a broader sense. We can measure characteristics by means other than giving tests, e.g. using observation, rating scales, or other devices that allow us to obtain information in a quantitative form.
Slide 7: EVALUATION
It has been defined in a variety of ways.
In general, it refers to the systematic gathering of information for purposes of decision making.
In other words, evaluation is a professional judgment or a process that allows one to make a judgment about the desirability or value of a measure.
Slide 8: So, testing is not the only way in which information about people’s language ability can be gathered.
It is just one form of assessment.
Slide 9: ASSESSMENT: SUMMATIVE, FORMATIVE
FORMATIVE ASSESSMENT
Teachers use it:
To check on the progress of their students.
To see how far they have mastered what they should have learned.
And then to use this information to modify their future teaching plans.
It can also be the basis for feedback to the students.
e.g. informal tests or quizzes
SUMMATIVE ASSESSMENT
It is used at the end of the term, semester, or year in order to measure what has been achieved both by groups and by individuals.
In most cases, grades are given on the basis of performance on tests in addition to classroom performance.
e.g. final examination
Slide 12: Different reasons for testing learners, different kinds of tests (proficiency test, portfolio assessment)
PLACEMENT TEST
It is used to sort new students into relatively homogeneous language ability groupings so that they can start a course at approximately the same level as the other students in the class.
It is one of the most frequently used tests at different levels of language instruction.
DIAGNOSTIC TEST
It is designed to show what skills or knowledge a learner has or lacks. In other words, it is used to identify students’ strengths and weaknesses.
It is the reverse of the achievement test: while the achievement test is interested in success, the diagnostic test is interested in failure, in what has gone wrong, in order to develop remedies.
ACHIEVEMENT TEST
It is designed to measure the degree of students’ learning from a particular set of instructional materials.
It is directly related to language courses: such tests normally come after a program of instruction, or their items are drawn directly from the content of instruction.
e.g. final, midterm, and class examinations
PROFICIENCY TEST
It is used to measure the overall language ability of the learners regardless of any training they may have had in that language.
It seeks to answer the question: “Having learned this much, what can the student do with it?”
e.g. Test of English as a Foreign Language (TOEFL)
PORTFOLIO ASSESSMENT
Many educational institutions allow students to assemble a portfolio of their work over a period of time (a term or semester), so that the students can be assessed by looking at three or four of the best pieces of work from this period.
Slide 18: ADVANTAGES
Provide evidence of students’ effort
Help students to become more autonomous
Help them to self-monitor their own learning
DISADVANTAGES
It is time-consuming.
Teachers will need clear training in how to select items from the portfolio and how to give them grades.
In preparing their portfolios, students may have been helped by others.
Slide 19: Characteristics of a good test: reliability, validity
VALIDITY
A test is valid if it measures what it is supposed to measure (construct validity) or can be used for the purposes for which it is intended.
Valid + for: any given test may be valid for some purposes, but not for others.
Validity tells us what can be inferred from test scores.
Slide 21: Different kinds of validity
face validity: a test should look, on the face of it, as if it is valid. A test which consisted of only three multiple-choice items would not convince students of its face validity.
criterion-related validity: it is based on the extent to which performance on a newly developed test is related to some other criterion measure which is an indicator of the ability tested.
content validity: a test has content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.
Slide 22: How to make tests more valid?
Slide 23:
First, write explicit specifications for the test and make sure that you include a representative sample of the content of these in the test.
Second, whenever feasible, use direct testing.
Third, make sure that the scoring of responses relates directly to what is being tested.
Finally, do everything possible to make the test reliable. If a test is not reliable, it cannot be valid.
Slide 24: What is reliability?
RELIABILITY
Reliability is a quality of test scores.
It refers to the consistency of measures across different times, test forms, raters, and other characteristics of the measurement context.
Synonyms for reliability are:
dependability, stability, consistency, predictability, accuracy
How to make tests more reliable?
Take enough samples of behavior.
Exclude items which do not discriminate well between weaker and stronger students.
Do not allow candidates too much freedom.
Provide clear and explicit instructions.
Write unambiguous items.
Provide uniform and non-distracting conditions of administration.
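One way to act on the point about items that do not discriminate well is the upper-lower discrimination index from classical item analysis. The sketch below is only an illustration and is not taken from the slides: the function name, the sample data, and the 27% group size (a common convention) are all assumptions.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index for a single item.

    item_correct: 1 if the candidate answered this item correctly, else 0
    total_scores: each candidate's total score on the whole test
    fraction:     share of candidates placed in the upper and lower groups
                  (0.27 is an assumed convention, not from the slides)
    """
    if len(item_correct) != len(total_scores):
        raise ValueError("one item result and one total score per candidate")
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each comparison group
    # Rank candidates from strongest to weakest by total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    # Near 1: strong students get it right, weak students do not.
    # Near 0 or negative: the item does not separate the two groups.
    return p_upper - p_lower


# Hypothetical results for ten candidates.
totals = [95, 88, 80, 72, 65, 60, 52, 45, 38, 30]
item_a = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
item_b = [1, 0, 1, 0, 1, 0, 1, 1, 0, 1]

print(discrimination_index(item_a, totals))  # 1.0
print(discrimination_index(item_b, totals))  # 0.0
```

Here item_a separates stronger from weaker candidates, while item_b does not and would be a candidate for exclusion.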
……………..
Slide 27: Two kinds of testing
Discrete-point testing: only tests one thing at a time, and the answer is either correct or incorrect.
e.g. asking students to choose the correct form of a tense, or multiple-choice tests
Integrative testing: expects students to use a variety of language at any given time.
e.g. writing a composition, doing a conversational oral test
Types of test items
Direct test item
It requires the candidate to perform precisely the skill we wish to measure.
It tries to be as much like real-life language use as possible.
e.g. writing samples, oral interview
Indirect test item
It tries to measure the abilities which underlie the skills in which we are interested.
e.g. multiple-choice questions, cloze procedures, sentence reordering
Time Line: transformation and paraphrase, sentence reordering, multiple-choice items, cloze procedures
CLOZE PROCEDURES
They offer the ideal indirect but integrative test items.
Advantages:
They can be prepared quickly, and are an extremely cost-effective way of finding out about a testee’s overall knowledge.
Because the deletion of words is random, it avoids test designers’ failings.
Because of the randomness of the deleted words, anything may be tested (grammar, collocations, fixed phrases, …).
Supplying the correct word for the blank implies an understanding of context and a knowledge of that word and how it operates.
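The random deletion described above can be sketched in a few lines. This is a minimal illustration, not the slides' own procedure: the function name, the 20% deletion rate, the blank format, and the sample passage are all assumptions.

```python
import random
import string


def make_cloze(text, deletion_rate=0.2, seed=None):
    """Replace a random selection of words with numbered blanks.

    Returns the gapped text and the list of deleted words (the answer key).
    deletion_rate and seed are illustrative parameters, not from the slides.
    """
    rng = random.Random(seed)  # a fixed seed makes the test reproducible
    tokens = text.split()
    # Only tokens that contain something besides punctuation can be deleted.
    candidates = [i for i, tok in enumerate(tokens)
                  if tok.strip(string.punctuation)]
    if not candidates:
        return text, []
    n_blanks = min(len(candidates),
                   max(1, round(len(candidates) * deletion_rate)))
    chosen = sorted(rng.sample(candidates, n_blanks))
    answers = []
    for number, i in enumerate(chosen, start=1):
        word = tokens[i].strip(string.punctuation)
        answers.append(word)
        # Keep surrounding punctuation and blank out only the word itself.
        tokens[i] = tokens[i].replace(word, f"({number}) ______", 1)
    return " ".join(tokens), answers


passage = ("Testing is an important part of every language teaching "
           "and language learning experience.")
gapped, key = make_cloze(passage, deletion_rate=0.2, seed=1)
print(gapped)
print(key)
```

Running it with different seeds on the same passage deletes different words, so the words a student has to supply change from one version of the test to the next.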
Disadvantages:
In some cases, there are several possible answers.
The actual score a student gets depends on the particular words that are deleted, rather than on any general English knowledge (a problem of reliability).
Slide 34: DIRECT TEST ITEMS
To have valid and reliable direct test items, test designers need to do the following:
Slide 37: WRITING TESTS
1- assess the test situation
2- decide what to test
3- balance the elements
4- weight the scores
5- make the test work
Slide 38: OBJECTIVE SCORING: a method of scoring in which the scores are given according to some predetermined criteria. In this method, each correct answer usually counts as one point.
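As a small illustration of objective scoring, the sketch below marks a script against a predetermined answer key, one point per correct answer. The function name, the key, and the responses are hypothetical.

```python
def objective_score(answer_key, responses):
    """Count one point for every response that matches the answer key.

    answer_key: item number -> correct option (the predetermined criterion)
    responses:  item number -> option chosen by the candidate
    Unanswered items simply earn no point.
    """
    return sum(1 for item, correct in answer_key.items()
               if responses.get(item) == correct)


# Hypothetical four-item multiple-choice test.
answer_key = {1: "b", 2: "d", 3: "a", 4: "c"}
candidate = {1: "b", 2: "a", 3: "a"}  # item 4 left blank

print(objective_score(answer_key, candidate))  # 2
```

Because the key is fixed in advance, any scorer applying it to the same script arrives at the same mark.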
SUBJECTIVE SCORING: a method of scoring in which the scoring procedures do not follow any objective criteria. As a result, the fluctuation of scores from one scorer to another creates a serious problem. To compensate for the inadequacies of subjective scoring, the following solutions are recommended:
Slide 39: 1- Training
It means that scorers should not come to the task fresh. They should see some scripts at different levels.
e.g. they may be allowed to watch videoed oral tests in order to be trained to rate the samples of spoken language accurately and consistently in terms of predetermined descriptions of performance.
Slide 40: 2- More than one scorer
“More scorers, more reliability”
The more people who look at a script, the greater the chance that its true worth will be located somewhere between the various scores it is given.
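A minimal sketch of this idea, assuming the final mark is simply the mean of the scorers' marks. The function name, the 0-20 marking scale in the example, and the two-point disagreement threshold are hypothetical choices, not from the slides.

```python
from statistics import mean


def combine_scores(scores, max_spread=2.0):
    """Combine several scorers' marks for one script.

    Returns the mean mark, the spread between the highest and lowest mark,
    and a flag for scripts where scorers disagree by more than max_spread
    (an assumed threshold) and a further look may be worthwhile.
    """
    if not scores:
        raise ValueError("at least one score is required")
    spread = max(scores) - min(scores)
    return {
        "final": mean(scores),
        "spread": spread,
        "needs_review": spread > max_spread,
    }


# Three scorers marking two scripts out of 20 (made-up marks).
print(combine_scores([14, 15, 16]))
print(combine_scores([10, 15, 17]))
```

In the first script the scorers are close and the mean sits between their marks. In the second the spread is large, which is the kind of case where an extra check on the scoring would help.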
Sometimes we can use a moderator whose job is to check samples of scorers’ work to see that it conforms with the general standards laid down for the exam.
Slide 41: 3- Global assessment scale
One way of specifying scores is to use pre-defined descriptions of performance. Such descriptions say what the students need to be capable of in order to gain the required marks.
However, they are not without problems:
The descriptions may not exactly match the student who is being assessed.
Different teachers may not agree on the meaning of the scale descriptors.
Slide 42: 4- Analytic profiles
We can mark tests for different elements, instead of giving a general assessment.
A combination of global and analytic scoring gives us the best chance of reliable marking.
Slide 43: 5- Scoring and interacting during oral tests
If we separate the role of scorer (or examiner) from the role of interlocutor (the examiner who guides and provokes conversation), the scorer can observe and assess, free from the responsibility of keeping the interaction with the candidate or candidates going.
e.g. In tests of speaking, we can put students in pairs or groups for certain tasks. This helps to relax students in a way that interlocutor-candidate interaction might not.
TEACHING FOR TESTS
Backwash (washback) effect: the effect of the nature of a test on teaching and learning. In other words, it is the potential impact of a test on test takers and their characteristics, on teaching and learning activities, and on the educational system and society.
Harmful (negative) backwash: when the test and testing techniques are at variance with the objectives of the course.
Beneficial (positive) backwash: if a test is regarded as important and the stakes are high, preparation for it can come to dominate all teaching and learning activities; when the test and its techniques match the objectives of the course, this effect is beneficial.
What does teaching for tests mean?
Exam teachers
Those who quite reasonably want their students to pass the tests and exams they are going to take, so their teaching becomes dominated by the test. Suffering from the backwash effect, they might stick rigidly to exam-format activities in class.
In such a situation, the format of the test determines the format of the class.
Non-exam teachers
They might use a range of different activities.
Slide 46: Many teachers believe that teaching exam classes is extremely satisfying because:
Since students perceive a clear sense of purpose and are highly motivated to do as well as possible, they are in some sense “easier” to teach than students whose focus is less clear.
Also, in training students to develop good exam skills (e.g. working on their own, reviewing what they have done, learning to use reference tools, keeping an independent learning record, etc.), we push them towards autonomous learning.
Slide 47: However, to be a good exam-preparation teacher is not easy.
Such teachers need to be familiar with the test their students are taking, be able to answer their students’ concerns and worries, and walk a fine line between good exam preparation and the washback effect. So there are a number of things they can do in an exam class:
Train for test types
Discuss general exam skills
Do practice tests
Have fun
Ignore the test
Slide 48: 1- Train for test types
Generally, we can make students familiar with the test items they will have to face so that they can give their best and the test reveals their true level of English.
e.g. we can show them the various types of tests.
Help them to understand what the test designer is aiming for.
Help them to focus on what they are being asked to do and why.
and so on……
2- Discuss general exam skills
We can remind students about general test and exam skills and teach them how to organize their work so that they can revise effectively.
e.g. help them to pace themselves so that they do not spend a disproportionate amount of time on only one part of the exam.
Remind them how easily they can find the answer by reading the question carefully, and ………
Slide 49: 3- Do practice tests
It means giving students the chance to practice taking the test so that they get a feel for the experience.
During a course, students can sit practice papers or whole practice tests.
4- Have fun
Although students need to practice certain test types, this does not have to be done in a boring or tense way. Teachers can use a number of ways of having fun with tests and exams.
e.g. teachers can ask students to write their own test items, based on language they have been working on and the examples they have seen so far.
Slide 50: 5- Ignore the test
Exam teachers should be careful: only discussing exam techniques and taking practice tests in class may make lessons monotonous. In other words, in such classes there is a possibility that general English improvement will be compromised in favor of exam preparation.
To avoid this problem, we need to ignore the exam from time to time so that we have opportunities to work on general language issues and to encourage students to take part in the kind of motivating activities that are appropriate for all English lessons.
THANK YOU