Language Assessment: Principles and Classroom Practices
Chapter 2: Principles of Language Assessment

Outline
1) Reliability
2) Validity
3) Practicality
4) Authenticity
5) Washback

Reliability
Reliability is the extent to which a test produces consistent scores across different administrations to similar groups of examinees. Reliability is synonymous with dependability, stability, consistency, predictability, and accuracy. Accordingly, reliability is defined as the extent to which a test is error-free. Your obtained score is only a partial representation of your true score (real ability), because factors other than the ability being tested also influence it. Therefore, a reliable test is one in which true-score variance is high and error-score variance is low. If a test were error-free, the true score would equal the observed score.

Classical true score measurement:
Obtained Score = Real Ability + Other Factors
X = Xt + Xe
where X = observed score, Xt = true score, Xe = error score.
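To make the variance decomposition concrete, here is a minimal sketch (not from the slides; the sample size, means, and standard deviations are hypothetical, and the error term is assumed independent of ability) that simulates observed scores under the classical true score model and estimates reliability as the ratio of true-score variance to observed-score variance:

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical seed, for reproducibility

true_scores = rng.normal(70, 10, size=1000)  # Xt: examinees' real ability
errors      = rng.normal(0, 4, size=1000)    # Xe: "other factors" (bad day, noise, ...)
observed    = true_scores + errors           # X = Xt + Xe

# Reliability under the classical model: share of observed-score
# variance that comes from true-score variance.
reliability = true_scores.var() / observed.var()
print(f"reliability ~= {reliability:.2f}")
```

With these hypothetical numbers, observed-score variance is roughly the sum of true-score and error variance (100 + 16), so the printed estimate lands near 100/116, about 0.86; shrinking the error variance pushes reliability toward 1.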

Reliability (continued)
Reliability falls into four kinds:
1) Student-related reliability: psychological and physical factors, including "bad day" anxiety, illness, fatigue, and the test taker's "test-wiseness," which can make an observed score deviate from one's true score.
2) Rater reliability, which falls into two categories:
   a) Inter-rater reliability: different scorers yield inconsistent scores for the same test (see the sketch below).
   b) Intra-rater reliability: inconsistency within a single scorer, arising from unclear scoring criteria, bias, and carelessness.
3) Test administration reliability: springs from the conditions in which the test is administered, such as a noisy classroom, the amount of light, or the chairs.
4) Test reliability: the test should fit its time constraints, be neither too long nor too short, and have clear items.
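As an illustration of inter-rater reliability, one simple (if rough) index is the correlation between two raters' scores for the same examinees. This is a sketch with hypothetical scores, not a procedure from the slides; more refined indices (e.g., intraclass correlation) exist:

```python
import numpy as np

# Hypothetical essay scores from two raters for the same ten test takers.
rater_a = np.array([4, 5, 3, 4, 2, 5, 3, 4, 4, 5])
rater_b = np.array([4, 4, 3, 5, 2, 5, 2, 4, 3, 5])

# Pearson correlation between the two score sets: values near 1.0
# suggest the raters rank and score examinees consistently.
r = np.corrcoef(rater_a, rater_b)[0, 1]
print(f"inter-rater correlation ~= {r:.2f}")
```

A low correlation would flag an inter-rater reliability problem: the observed score then depends heavily on who happens to mark the paper, not only on the examinee's ability.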

Which test is more reliable?
[Table 1 and Table 2, comparing two sets of scores, are not reproduced in this extraction.]

Validity
Validity is the degree of correspondence between the test content and the content of the material to be tested. For example, a valid test of reading ability actually measures reading ability itself, not previous knowledge.

Five ways to establish validity:
1) Content validity
2) Criterion validity
3) Construct validity
4) Consequential validity
5) Face validity

Validity (continued)
1) Content validity: if a test actually samples the subject matter about which conclusions are to be drawn, it can claim content-related evidence of validity. Two approaches:
   - Direct testing requires test takers to perform the target task itself.
   - Indirect testing requires learners to perform indirectly related tasks.
2) Criterion validity: the extent to which performance on a test is related to a criterion that is an indicator of the ability being tested. The criterion may be individuals' performance on another test or a known standard. Two types:
   - Concurrent validity: a test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself.
   - Predictive validity: the test tends to predict a student's likelihood of future success (e.g., TOEFL scores used to predict academic performance).
3) Construct validity: the extent to which a test measures just the construct it is supposed to measure. For example, a multiple-choice test is a doubtful measure of the construct "speaking"; oral production matches that construct directly.
4) Consequential validity: the positive or negative consequences of a particular test, including its impact on the preparation of test takers, its impact on learners, its social consequences, and washback.
5) Face validity: the extent to which the measurement method "on its face" appears to measure the particular ability. It is generally based on the subjective judgment of the examinees.

Practicality and Authenticity
Practicality is the relationship between the resources required in the design, development, and use of a test and the resources available for these activities. Brown (2004:19) defines practicality in terms of:
- Cost
- Time
- Administration
- Scoring/evaluation

Authenticity is the extent to which the tasks required on a given test are similar to normal "real life" language use; in other words, it is the degree of correspondence between test tasks and activities of target language use. The higher the correspondence, the more authentic the test. Authenticity may be present in the following ways:
1) The language in the test is as natural as possible.
2) Items are contextualized rather than isolated.
3) Topics are meaningful (relevant, interesting) to the learners.
4) Some thematic organization of items is provided, such as through a story or episode.
5) Tasks represent, or closely approximate, real-world tasks.

Washback/Backwash
Washback effect: generally, the influence of the nature of a test on teaching and learning. There are two kinds of washback:
1) Negative washback: occurs when a test and its testing techniques are at variance with the objectives of the course; such tests are considered to have a negative influence on teaching and learning. Example: students take an English course to be trained in the four language skills, but the test does not assess those skills.
2) Positive washback: results when a testing procedure encourages good teaching practices. Example: a consequence of many reading comprehension tests is the development of reading skills.