Item analysis

3,204 views 36 slides Mar 21, 2022

About This Presentation

Item analysis of psychological test construction.


Slide Content

Item Analysis

Item Analysis: A set of methods used to evaluate test items. The most common techniques involve assessment of item difficulty and item discriminability.

Item Difficulty: A form of item analysis used to assess how difficult items are. The most common index of difficulty is the percentage of test takers who respond with the correct choice.

The purpose of measuring item difficulty: To examine the difficulty level of test items, that is, whether the difficulty level set at the time of test construction was right. To see if there are any items that are correctly attempted by everyone. Any such ‘too easy’ items need to be removed, replaced, or revised.

Similarly, any items that are not correctly attempted by anyone are ‘too difficult’ and also need to be removed, replaced, or revised.

Item-Difficulty Index: The item difficulty index is expressed either as a percentage or as a proportion of the total number of test takers who attempted an item correctly. Item difficulty is calculated separately for every item.

It is denoted by a lowercase italicized ‘p’. A number attached to this ‘p’ indicates the item number whose difficulty level is described. For example, p1 indicates the item difficulty of item number 1, p2 is the difficulty level of item number 2, and so on.

The value of the item difficulty index may range from zero to 1. An item difficulty index of .60 means that the item was correctly attempted by 60% of test takers (60/100 = .60).

A zero item difficulty index would mean that nobody was able to attempt the item correctly. Therefore the item in question is a bad or poor item.

On the other hand, an item with a difficulty index of 1.00 is also a bad item because it was correctly attempted by 100% of test takers.

The common item difficulty index range of test items is between .3 and .8.
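The difficulty index described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 0/1 scoring layout, and the sample data are assumptions made for the example, and the .3 to .8 band is simply the common range quoted above.

```python
def item_difficulty(responses):
    """Return p for each item: the proportion of test takers answering correctly.

    `responses` holds one row per test taker; each row contains
    1 (correct) or 0 (incorrect) for every item. This layout is an assumption.
    """
    n_takers = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_takers for i in range(n_items)]


def flag_items(p_values, low=0.3, high=0.8):
    """Return the 1-based numbers of items whose p lies outside [low, high]."""
    return [i + 1 for i, p in enumerate(p_values) if p < low or p > high]


# Hypothetical data: 5 test takers, 4 items.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
p = item_difficulty(scores)  # item 1 is answered by everyone, item 3 by no one
flagged = flag_items(p)      # items to remove, replace, or revise
```

Here item 1 (p = 1.00) and item 3 (p = 0) are flagged, matching the ‘too easy’ and ‘too difficult’ cases.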

Average Index of Difficulty of a Test: Once the item difficulty index has been calculated for all items in a test, you can calculate the average difficulty index of the whole test as well. Simply add up the indices for the individual items and divide the sum by the total number of items.
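As a small worked example of this averaging step, assuming a hypothetical four-item test (the p values below are made up for illustration):

```python
def average_difficulty(p_values):
    """Mean of the item difficulty indices of a test."""
    return sum(p_values) / len(p_values)


# Hypothetical item difficulty indices for a 4-item test.
p_values = [0.60, 0.45, 0.75, 0.30]
avg = average_difficulty(p_values)  # (.60 + .45 + .75 + .30) / 4
```

The sum of the four indices is 2.10, so the average difficulty of this test is about .525.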

Taking Care of Guessing: In some tests the test taker can give the right answer simply by guessing. This happens mostly with items where multiple response options are provided.

The optimal average difficulty for MCQ type tests is calculated by taking the mid-point between the chance success proportion and 1.00. The chance success proportion is the likelihood of giving a correct response simply by guessing; in MCQs with 4 options it is .25, with 5 options it is .20, and with 3 options it is .33.

For example, in a test with 5 response options in each item, the chance success proportion is .20. The optimal item difficulty will be: (.20 + 1.00) / 2 = 1.20 / 2 = .60.
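The mid-point rule can be written as a one-line function. This is a sketch that assumes every item has the same number of equally likely options, so the chance success proportion is 1 divided by the number of options; the function name is hypothetical.

```python
def optimal_difficulty(n_options):
    """Mid-point between the chance success proportion and 1.00."""
    chance = 1 / n_options  # .25 for 4 options, .20 for 5 options
    return (chance + 1.0) / 2


four_option = optimal_difficulty(4)  # (.25 + 1.00) / 2
five_option = optimal_difficulty(5)  # (.20 + 1.00) / 2
```

With four options the optimal average difficulty comes out at .625, and with five options at .60.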

Item Discrimination: A test is supposed to discriminate between those who know and those who do not know, those who score high and those who score low, and those who have acquired a skill and those who have not.

A test will not be a good test if the people who are supposed to know the correct answer fail and those who are not supposed to know succeed. A good test differentiates between the high and low scorers.

Item Discrimination Index: Every test has its discrimination power. To see if the test discriminates between high and low achievers, a certain percentage of the high and low achievers are taken. The discrepancy between the two groups' correct responses is calculated in terms of percentages.

The item discrimination index is denoted by a lowercase italicized ‘d’. The higher the value of d, the greater the number of high scorers, relative to low scorers, answering the item correctly.

The item discrimination index is calculated by considering the number of people in high scorers (U) and the number of people in low scorers (L) who correctly answered an item.

The value of ‘d’ may range from -1 to +1. A value of +1 would be the ideal situation, where all test takers in the upper scoring group gave correct answers and none of the lower scorers did so. No test developer would like to get d = -1, which means that all high scorers failed and all low scorers passed this item.

A value of d = 0 indicates that the item does not differentiate between the two groups. The larger the value of ‘d’ of an item, the more it discriminates between the two groups.

Item number   U    L    U-L   n    (U-L)/n = d
1             10   10   0     10   0
2             10   0    10    10   1
3             0    10   -10   10   -1
4             9    2    7     10   .7
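The calculation of d can be sketched as follows. The sketch assumes that the upper and lower groups are the same size n, which the slides do not state explicitly; the function name and the four example items are illustrative.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """d = (U - L) / n, where U and L are the numbers of correct answers
    in the upper and lower scoring groups and n is the size of each group
    (equal group sizes are an assumption of this sketch)."""
    return (upper_correct - lower_correct) / group_size


# (U, L, n) for four hypothetical items.
items = [(10, 10, 10), (10, 0, 10), (0, 10, 10), (9, 2, 10)]
d_values = [discrimination_index(u, l, n) for u, l, n in items]
```

The four items give d = 0 (no discrimination), d = 1 (ideal), d = -1 (worst case), and d = .7.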

Item Response Theory: Item-Characteristic Curves

Item Response Theory: Item response theory is an approach that takes into consideration the probability of answering each individual item in a test right or wrong. The information regarding each item is plotted graphically.

The graph containing information about an item is called the item-characteristic curve.

Item-Characteristic Curves: Item difficulty and item discrimination can also be presented graphically. Item-characteristic curves are the graphs that represent these characteristics of a test. The horizontal axis represents the ability being tested, whereas the vertical axis shows the probability of a correct response or the proportion of examinees responding correctly to the item.

“A graph prepared as part of the process of item analysis. One graph is prepared for each item and shows the total test score on the X axis and the proportion of test takers passing the item on the Y axis.”

The shape or slope of the curve indicates whether the item is a good one or not, that is, how far it discriminates high scorers from low scorers. A steep slope indicates that the item discriminates between the two groups. A highly discriminating item will yield a very steep slope.
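The points of such a curve can be tabulated before plotting. The sketch below computes, for one item, the proportion of test takers passing the item at each total test score. It is an empirical tabulation under assumed data, not a fitted item response theory model, and the function name and sample values are hypothetical.

```python
from collections import defaultdict


def empirical_icc(total_scores, item_correct):
    """Map each total test score (X axis) to the proportion of test takers
    with that score who passed the item (Y axis)."""
    counts = defaultdict(lambda: [0, 0])  # score -> [number passing, number of takers]
    for total, correct in zip(total_scores, item_correct):
        counts[total][0] += correct
        counts[total][1] += 1
    return {score: passed / n for score, (passed, n) in sorted(counts.items())}


# Hypothetical data: total scores of 8 test takers and their result on one item.
totals = [2, 2, 4, 4, 6, 6, 8, 8]
item = [0, 0, 0, 1, 1, 1, 1, 1]
curve = empirical_icc(totals, item)
```

In this example the proportion passing rises from 0 at a total score of 2 to 1.0 at a total score of 6, which is the kind of steep rise the slides describe for a discriminating item.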

Good Item

Weak/Poor/Bad Item

Cross Validation: The validity of a test, we know, may be determined from the sample that was used for item selection. However, in order to have a better estimate of the validity of the test, the entire test needs to be validated on different samples as well.

If the validity of a test is computed from the original sample used for item selection, then there are chances that the validity index will be higher than the one expected to be obtained from a new or different sample of test takers.

It is expected so because of the possible chance variations.

“The amount of decrease in the strength of the relationship from the original sample to the sample with which the equation is used is known as shrinkage.”
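A rough numerical illustration of shrinkage follows. It assumes, as a simplification, that the strength of the relationship is measured by the Pearson correlation between test scores and a criterion, computed once in the original sample and once in a new sample. All data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def shrinkage(r_original, r_cross):
    """Decrease in the validity coefficient from the original to the new sample."""
    return r_original - r_cross


# Hypothetical test scores and criterion scores in two samples.
r_original = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # original sample
r_cross = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])      # cross-validation sample
loss = shrinkage(r_original, r_cross)
```

With these made-up numbers the validity coefficient drops from 1.0 in the original sample to .8 in the new sample, a shrinkage of .2.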

The factors that may affect the amount of shrinkage: the size of the original item pool, the proportion of test items retained, and the sample size.