Reliability
A reliable test provides consistent results no matter how many times the student takes it: the measurement tool consistently gives the same answer. If a person takes the test again, does he or she get a similar score or a different one? The more similar the scores are, the more reliable the test is. Would you trust something that tells you what your personality is like one day, but tells you something different the next?
Test performance can be influenced by:
A person’s psychological or physical state at the time of testing: motivation, fatigue, anxiety.
Environmental factors: temperature, noise, light, test administrator.
Look at Tables 1a and 1b. A group of students were supposed to take a test on Thursday, but they took it on Wednesday. Look at the scores. Which test seems to be more reliable?
The reliability coefficient
The reliability of a test is indicated by the reliability coefficient, “r”, which is expressed as a number ranging from 0 to 1. r = 0 indicates no reliability; r = 1 indicates perfect reliability.
Do not select or reject a test based solely on the reliability coefficient: consider the type of test, the type of reliability estimate reported, and the context in which the test will be used.
Good vocabulary, structure, and reading tests are usually in the .90 to .99 range, while auditory comprehension tests are more often in the .80 to .89 range. Oral production tests may be in the .70 to .79 range.
Test-retest method
The same test is given to the same people after a period of time. The reliability of the test can be estimated from the consistency of the scores between the two administrations.
Alternate form reliability
This indicates how consistent test scores are if a person takes two or more forms of a test. Two forms are given to the same people; both are designed to measure the same thing.
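A minimal sketch of how the consistency between two administrations (or two forms) could be quantified, assuming the Pearson correlation is used as the reliability estimate. The function name and the scores are hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two occasions.
first_sitting = [68, 46, 19, 89, 43]
second_sitting = [70, 44, 21, 87, 45]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest estimate: r = {r:.2f}")
```

The closer the value is to 1, the more similar the two sets of scores are. The same calculation applies to alternate forms by passing the scores from form A and form B.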
Coefficient of internal consistency
A high coefficient indicates that the items on a test are very similar to each other in content.
Split-half method
The test is split into two halves through the careful matching of items. It measures multiple characteristics.
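The split-half idea can be sketched as follows. Two assumptions are made here that the slide does not state: the halves are formed by a simple odd/even split of the items (rather than careful matching), and the half-test correlation is stepped up to full length with the Spearman-Brown formula, 2r / (1 + r). The response data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Estimate full-test reliability from an odd/even split.

    item_scores: one row per test taker, one column per item
    (1 = correct, 0 = incorrect).
    """
    # Total score on items 1, 3, 5, ... and on items 2, 4, 6, ...
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown correction: each half is only half as long as the test.
    return 2 * r_half / (1 + r_half)


# Hypothetical responses: five test takers, six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

reliability = split_half_reliability(responses)
print(f"split-half estimate: {reliability:.2f}")
```

An odd/even split is only one convenient choice; halves matched on content and difficulty, as the slide recommends, would replace the two slicing lines.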
Standard error of measurement
The standard error of measurement (SEM) gives the margin of error that you should expect in an individual test score. An SEM of “2” indicates that the test taker’s “true” score probably lies within 2 points in either direction of the score he or she receives on the test. If a person receives a 91 on the test, there is a good chance that the person’s true score lies somewhere between 89 and 93. The smaller the SEM, the more accurate the measurement.
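A short sketch of where an SEM such as 2 might come from, assuming the usual classical test theory formula SEM = SD × √(1 − r), which the slide itself does not give. The standard deviation of 10 and the reliability of .96 are hypothetical values chosen so that the result matches the example above.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM under classical test theory: SD * sqrt(1 - r)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed, sem, n_sem=1):
    """Range of n_sem standard errors on either side of the observed score."""
    return observed - n_sem * sem, observed + n_sem * sem


# Hypothetical test: score SD of 10 points, reliability coefficient of .96.
sem = standard_error_of_measurement(sd=10, reliability=0.96)
low, high = score_band(observed=91, sem=sem)
print(f"SEM = {sem:.1f}; band for a score of 91: {low:.0f} to {high:.0f}")
```

Note how the two ideas connect: for a fixed spread of scores, a higher reliability coefficient gives a smaller SEM, and a reliability of 1 would give an SEM of 0.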
Scorer reliability
Scorer reliability refers to the extent to which different people who score the same test agree. It indicates how consistent test scores are if the test is scored by two or more people.
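One possible way to put a number on agreement between two scorers is sketched below. It assumes categorical ratings (here pass/fail) and uses simple percent agreement plus Cohen's kappa, which corrects for agreement expected by chance; neither statistic is named in the slide. The ratings and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Proportion of scripts on which the two scorers gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two scorers, corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Chance agreement: product of each scorer's marginal proportions.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1:
        return 1.0  # both scorers used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of the same eight scripts by two scorers.
scorer_a = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
scorer_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]

agreement = percent_agreement(scorer_a, scorer_b)
kappa = cohens_kappa(scorer_a, scorer_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

For numerical marks rather than categories, a correlation between the two scorers' marks would be the more natural choice.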
How to make a test more reliable
Take enough samples of behavior. The more items on a test, the more reliable the test will be. Items should be independent of each other. The test should not be too long.
Exclude items which do not discriminate well between weaker and stronger students. Both groups perform on such items with similar degrees of success, so the items contribute little to the reliability of the test.
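Both points can be illustrated with a rough sketch. The first function assumes the Spearman-Brown formula, kr / (1 + (k − 1)r), to predict reliability when a test is lengthened by a factor k with comparable items. The second assumes a simple upper-group/lower-group discrimination index with groups formed from the top and bottom 27% of total scores. Neither formula appears in the slide, and all data and names are hypothetical.

```python
def predicted_reliability(r, k):
    """Spearman-Brown prediction for a test made k times as long."""
    return k * r / (1 + (k - 1) * r)


def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Share of the upper group answering correctly minus the lower group's share.

    item_correct: 1/0 per test taker for one item.
    total_scores: total test score per test taker, in the same order.
    """
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Indices of test takers ranked from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    upper_correct = sum(item_correct[i] for i in upper)
    lower_correct = sum(item_correct[i] for i in lower)
    return (upper_correct - lower_correct) / group_size


# Doubling a hypothetical test whose reliability is .70.
doubled = predicted_reliability(0.70, 2)
print(f"predicted reliability after doubling: {doubled:.2f}")

# Hypothetical total scores for ten students and their results on two items.
totals = [95, 88, 82, 75, 70, 64, 58, 50, 41, 33]
item_a = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]  # stronger students succeed more often
item_b = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # both groups succeed about equally

d_a = discrimination_index(item_a, totals)
d_b = discrimination_index(item_b, totals)
print(f"item A: D = {d_a:.2f}, item B: D = {d_b:.2f}")
```

An item with an index near 0, like item B here, is a candidate for exclusion because weaker and stronger students do about equally well on it.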
How to make a test more reliable
Do not allow candidates too much freedom. Compare the following composition tasks, each more restricted than the one before:
1. Write a composition on tourism.
2. Write a composition on tourism in this country.
3. Write a composition on how we might develop the tourist industry in this country.
4. Discuss the following measures intended to increase the number of foreign tourists coming to this country:
   Better advertising or information (where? what form should it take?)
   Improved facilities (hotels, transportation, communication)
   Training of personnel (guides, hotel managers, etc.)
How to make a test more reliable
Write unambiguous items. If an individual candidate might interpret a question in different ways on different occasions, the item is not contributing fully to the reliability of the test.
Provide clear and explicit instructions. This applies to both written and oral instructions. Spoken instructions should always be read from a prepared text.
How to make a test more reliable
Make candidates familiar with the format and testing techniques, for example by providing sample tests.
Provide uniform and non-distracting conditions of administration. Timing should be specified. The acoustic conditions should be similar for all administrations of a listening test.
How to make a test more reliable
Use items that permit scoring which is as objective as possible: multiple choice items, open-ended items.
Make comparisons between candidates as direct as possible. Scoring compositions that are all on one topic will be more reliable than scoring compositions where the candidates were allowed to choose from six topics.
How to make a test more reliable
Provide a detailed scoring key. The key should be as detailed as possible in its assignment of points.
Train scorers. They should have learned to score accurately.
Agree on acceptable responses and appropriate scores at the outset of scoring.
How to make a test more reliable
Identify candidates by number, not name. Scorers have expectations of candidates that they know.
Employ multiple, independent scoring. Where scoring is subjective, all scripts should be scored by at least two independent scorers.
Reliability and validity
A valid test must be reliable, but a reliable test may not be valid at all. A test can be internally consistent (reliable) without being an accurate measure of what you claim to be measuring (valid).