Types of Evaluation
• Formative - conducted to provide learners with evaluative information useful for improving
• Summative - provides information on the ability of the learner to do or know what he/she is supposed to do or know
Purposes of Testing - Individual
• Evaluating learner progress toward achieving educational goals
• Identifying areas that require remedial activity
• Assigning grades
• Identifying scholarship recipients
• Tool for program evaluation
• Identifying placement into a certain course
• Identifying strong or weak areas of curriculum
Learning Objectives
• Understand the pros and cons of various question types for written examinations
• Learn how to determine item difficulty and item discrimination
• Understand the psychometrics of a high-stakes test: validity, reliability, standard setting
Question Types
• Essay Items
• Short Answer and Completion Items
• Matching Items
• True-False and Multiple-Choice Tests
• Interviews
• Portfolios
... all can be scored and can be subject to test development
Pros and Cons of MCQs
Pros
• Useful for measuring learning outcomes at almost any level
• Easy to understand
• Easy to score
• Easily analyzed for effectiveness
• Allow broad coverage efficiently
Cons
• Good questions take a long time to write and are difficult to write
• Constrain creative responses from learners
• May have more than one correct answer
PREPARATION OF MCQs:
The paper developer should read the concept and learning outcome and then write the stem of the question.
Instructions for writing stem (AIMS UNICEF)
• Brief: avoid extra language in the stem
• Comprehensive: include the repetitive portion of the options in the stem
• Meaningful: as a complete statement
• Simple: use easy words
Instructions for writing stem
• Understandable: avoid negative statements
• Neutral: avoid bias of gender, medium, or area
• Independent: so that the student cannot get help from other questions
• Clear: use a reference for any statement
• Evaluator: of the learning outcome to be tested
• Free from content: the exact statement should not match the content in the book, because in the SOLO Taxonomy rote-learned responses have no level
Instructions for writing options
• All options should be written using logic.
• Options should be related to the main concept.
• The position of the correct option should be different in different questions.
• The order of options can also be used to adjust the difficulty level.
Instructions for writing options
• All options should be equally attractive to a student who is guessing by hit and trial.
• In options, avoid lifting phrases directly from the text.
• After writing the stem, write the correct option and then search for plausible distracters.
• Finally, shuffle the correct option and distracters.
• Answer options should be about the same length and parallel in grammatical structure.
Instructions for writing options
• Distracters must be incorrect, but plausible.
• Try to include common errors in distracters.
• To make distracters more plausible, use words that should be familiar to students.
• If a recognizable key word appears in the correct answer, it should appear in some or all of the distracters as well.
Use Rarely:
• Extreme words like "all," "always" and "never" (generally a wrong answer).
• Vague words or phrases like "usually," "typically" and "may be" (generally a correct answer).
• "All of the above" - eliminating one distracter immediately eliminates this, too.
• "None of the above" - use only when the correct answer can be absolutely correct, such as in math, grammar, historical dates, geography, etc. Do not use with negatively stated stems, as the resulting double negative is confusing. Studies do show that using "None of the above" makes a question more difficult, and it is a better choice when the alternative is a weak distracter.
STUDENTS' STRATEGIES FOR MULTIPLE CHOICE ITEMS:
• Pick the longest answer.
• Pick the 'b' alternative.
• Never pick an answer which uses the word 'always' or 'never'.
• If there are two answers which express opposites, pick one or the other and ignore the other alternatives.
• If in doubt, guess.
• Pick the scientific-sounding answer.
• Don't pick an answer which is too simple or obvious.
• Pick a word which you remember was related to the topic.
Tips for writing discriminating MCQs
Responses
• All answers should be plausible and homogeneous
• Items need to be independent of one another
• Answer choices should be similar in length and grammatical form
• List answer choices in alphabetical or numerical order
• Avoid 'all of the above' as a response
• Avoid technical flaws (tense or plurality, for example)
What Makes a Content-Appropriate Item
• Be sure that each item reflects a clearly defined learning outcome
Stem
• The stem of the item should be self-contained and written in clear and precise language.
• Avoid 'trigger' words (e.g. pill-rolling tremor)
• Avoid negatives, 'except' phrasing, absolutes and qualifiers in question stems.
Content
• Reflects specific content
• Able to be classified on the test blueprint
• Does not contain trivial content
• Independent from other items
• Not a trick question
• Vocabulary suitable for candidate population
• Not opinion-based
Guidelines for the Answer Options
• Maintain similarity in the answer options.
• Keep the answer options relatively equal in length.
• Keep all answer options parallel and grammatically consistent with the stem.
• Avoid cueing the right answer.
• Avoid words such as "always" and "never".
• Avoid repetitive wording in the answer options.
• Vary the location of the key.
• Place answer options in logical order.
Item Analysis
• Qualitative: looks at whether the content matches the information, attitude, characteristic or behavior being assessed
• Quantitative: item difficulty and item discrimination
Determining item difficulty
• The percentage of participants who get that item correct
• Item difficulty scores can range from 0 to 100%
• Low value = high difficulty; high value = low difficulty
• High (Difficult): <= 30% correct
• Medium (Moderate): > 30% and < 80% correct
• Low (Easy): >= 80% correct
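As a minimal sketch of the calculation above, the following Python computes the percent correct for one item and maps it to the three bands on this slide. The function names `item_difficulty` and `difficulty_band` and the 0/1 response encoding are illustrative assumptions, not part of the original material.

```python
def item_difficulty(responses):
    """Percent of test takers who answered the item correctly.

    `responses` is a list of 1 (correct) / 0 (incorrect) values; this
    encoding is an assumption made for the sketch.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return 100.0 * sum(responses) / len(responses)


def difficulty_band(percent_correct):
    """Map percent correct to the bands used on the slide."""
    if percent_correct <= 30:
        return "High (Difficult)"
    if percent_correct < 80:
        return "Medium (Moderate)"
    return "Low (Easy)"


# Example: 12 of 20 test takers answered correctly.
responses = [1] * 12 + [0] * 8
pct = item_difficulty(responses)
print(pct, difficulty_band(pct))
```

Note that a low percentage means a hard item, so the band labels run in the opposite direction to the number.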
Discrimination Index
For each item, the Discrimination Index distinguishes between the performance of students who did well on the exam and students who did poorly.
Discrimination Index
• Index of discrimination: the % of people answering correctly in the upper extreme group minus the % answering correctly in the lower extreme group
• Item discrimination scores can range from -1.00 to +1.00
• Example with 100 test takers: 20 of the top 25 students were correct but only 5 of the lowest 25 students were correct. DI = (20 - 5)/25 = 0.6
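The index can be sketched in a few lines of Python. With 20 of the top 25 and 5 of the lowest 25 answering correctly, the difference of proportions is 20/25 - 5/25 = 0.6. The function names, the 0/1 encoding, and the default 25% group fraction are assumptions made for illustration.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Proportion correct in the upper group minus proportion in the lower group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


def discrimination_from_scores(total_scores, item_correct, fraction=0.25):
    """Build upper/lower groups from total exam scores, then compute the index.

    `total_scores[i]` is test taker i's total score and `item_correct[i]`
    is 1 or 0 for the item under analysis (hypothetical inputs).
    """
    n = len(total_scores)
    k = max(1, int(n * fraction))
    # Rank test takers from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    upper_correct = sum(item_correct[i] for i in upper)
    lower_correct = sum(item_correct[i] for i in lower)
    return discrimination_index(upper_correct, lower_correct, k)


print(discrimination_index(20, 5, 25))  # 0.6
```

A negative value means the lower group outperformed the upper group on the item, which usually signals a flawed item or a wrong key.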
Test Validity
Validity: the extent to which inferences made from a test are appropriate, meaningful, or useful. Does my test measure what it is intended to measure?
• Content validity: expert review
• Criterion validity (predictive/concurrent): scores can be related to another known metric
• Construct validity: successfully differentiates between levels of learners
Test Reliability
Reliability: the test measures the underlying construct consistently (trustworthiness/stability)
• Test-retest reliability
• Alternate forms reliability
• Internal consistency reliability (Cronbach's alpha)
• Inter-rater reliability
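For internal consistency, Cronbach's alpha is commonly computed as k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes a hypothetical score matrix with one row per test taker and one column per item; the function name `cronbach_alpha` is illustrative.

```python
from statistics import pvariance


def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a list of rows (test takers) of item scores.

    Population variance is used for both the items and the totals; using
    sample variance for both gives the same ratio.
    """
    k = len(score_matrix[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [pvariance([row[j] for row in score_matrix]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in score_matrix])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0/1 scores: 5 test takers, 4 items.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(scores), 3))
```

Values closer to 1 indicate that the items vary together, which is what a test measuring a single construct should show.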
Your Turn!
Review the distributed questions and identify strengths and weaknesses in each.
Summary
• Utilize action verbs to write objectives
• Write your exam items based on the objectives
• Choose appropriate options with one best answer
• Avoid technical errors
• Utilize an item checklist to ensure that you have done all you can to write the best items possible
• Pretest your items
Item Discrimination: Examples
Number of students per group = 100

Item No.   Correct in upper 1/4   Correct in lower 1/4   Item Discrimination Index
1          90                     20                     0.7
2          80                     70                     0.1
3          100                    0                      1
4          100                    100                    0
5          50                     50                     0
6          20                     60                     -0.4
Distracter Analysis: Examples
(*) marks the correct answer.
Item 1 (options: A*, B, C, D, E, Omit)
• % of students in upper ¼: 20, 5
• % of students in the middle: 15, 10, 10, 10, 5
• % of students in lower ¼: 5, 5, 5, 10
Item 2 (options: A, B, C, D*, E, Omit)
• % of students in upper ¼: 5, 5, 15
• % of students in the middle: 10, 15, 5, 20
• % of students in lower ¼: 5, 10, 10
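A distracter table of this kind can be tallied from raw responses. The sketch below is one possible approach under stated assumptions: test takers are split into upper, middle, and lower groups by total score, and each cell is the percentage of all test takers (not of the group) who chose that option, which appears to be the convention in the examples here. The function name, inputs, and sample data are hypothetical.

```python
from collections import Counter


def distracter_table(choices, total_scores, options=("A", "B", "C", "D", "E"), fraction=0.25):
    """Percent of all test takers choosing each option, by score group.

    `choices[i]` is the option picked by test taker i (None or any value
    not in `options` counts as "Omit"); `total_scores[i]` is their total score.
    """
    n = len(choices)
    k = max(1, int(n * fraction))
    # Rank test takers from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    groups = {"upper": order[:k], "middle": order[k:n - k], "lower": order[n - k:]}
    table = {}
    for name, members in groups.items():
        counts = Counter(choices[i] if choices[i] in options else "Omit" for i in members)
        table[name] = {opt: 100.0 * counts.get(opt, 0) / n for opt in (*options, "Omit")}
    return table


# Hypothetical data: 8 test takers, correct answer is "A".
scores = [8, 7, 6, 5, 4, 3, 2, 1]
picks = ["A", "A", "A", "B", "A", "C", None, "B"]
for group, row in distracter_table(picks, scores).items():
    print(group, row)
```

A distracter that attracts more upper-group than lower-group students, or one that nobody chooses, is a candidate for rewriting.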