VALIDITY AND RELIABILITY OF THE TOPIC NURSING RESEARCH.pptx




Slide Content

RELIABILITY AND VALIDITY PREPARED BY: Mr. Abhinav Bhatt

RELIABILITY Introduction: The reliability of a research instrument concerns the extent to which the instrument yields the same results on repeated trials. Although unreliability is always present to a certain extent, there will generally be a good deal of consistency in the results of a quality instrument gathered at different times. The tendency toward consistency found in repeated measurements is referred to as reliability.

The reliability of a quantitative instrument is a major criterion for assessing its quality and adequacy. An instrument's reliability is the consistency with which it measures the attribute. E.g., if a scale weighed a person at 120 pounds one minute and 150 pounds the next, we would consider it unreliable.

Reliability: repeated weighings read 48 kg, then 50 kg (consistent). Unreliability: repeated weighings read 50 kg, then 35 kg (inconsistent).

DEFINITION “The reliability of an instrument is the degree of consistency with which the instrument measures the target attribute.” According to Burroughs “Reliability constitutes the ability of a measuring instrument to produce the same answer on successive occasions when no change has occurred in the thing being measured.”

Example: Group A takes Test 1 on March 3rd (mean score = 89) and retakes it on March 12th. If the retest mean is similar (e.g., 93), the test is reliable; if it is very different (e.g., 41), the test is not reliable.

Types of Reliability

Stability Reliability - Test-Retest Method: Repeating a test is the simplest method of determining the agreement between two sets of scores. We estimate test-retest reliability when we administer the same test to the same sample on two different occasions. Although test-retest is sometimes the only available method, it is open to several serious objections. The approach assumes that there is no substantial change in the construct being measured between the two occasions, so the amount of time allowed between measures is critical.

We know that if we measure the same thing twice, the correlation between the two observations will depend in part on how much time elapses between the two measurement occasions. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation.

If the interval is short, many subjects will recall their first answers and spend their time on new material; beyond this immediate memory effect, the practice and confidence induced by familiarity with the material will almost certainly affect scores when the test is taken a second time.

If the interval between tests is rather long and the subjects are young children, growth changes will affect the retest scores. In general, growth increases initial scores by varying amounts and tends to lower the reliability coefficient.

The procedure for calculating the test-retest reliability of a research instrument involves the following steps: - Administer the research instrument to a sample of subjects on two different occasions. - Compare the scores from the two administrations by computing the correlation coefficient: r = Σ(x − x̄)(y − ȳ) / √[ Σ(x − x̄)² × Σ(y − ȳ)² ]

Interpretation of results: the correlation coefficient ranges from -1.00 through 0.00 to +1.00, and the results are interpreted as follows: - A score of +1.00 indicates perfect reliability. - A score of 0.00 indicates no reliability. - A score above 0.70 indicates an acceptable level of reliability for a tool.
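To make the calculation concrete, here is a minimal sketch in Python of the test-retest procedure using the correlation formula above. The scores, array names, and helper function are invented for illustration; they are not from the presentation.

```python
# Test-retest reliability: Pearson r between two administrations of the
# same instrument to the same sample (illustrative data, not real scores).
import numpy as np

def test_retest_reliability(first, second):
    """Pearson correlation coefficient between two sets of scores."""
    x = first - first.mean()
    y = second - second.mean()
    return float((x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum()))

# Hypothetical scores for ten subjects tested on March 3rd and March 12th.
march_3rd  = np.array([89., 92., 85., 90., 88., 95., 91., 87., 93., 90.])
march_12th = np.array([91., 94., 84., 92., 87., 96., 90., 88., 95., 89.])

r = test_retest_reliability(march_3rd, march_12th)
print(f"test-retest r = {r:.2f}")  # above 0.70 -> acceptable reliability
```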

Disadvantages: It is time consuming. The examinee's health, emotional condition, motivation and mental set-up may change. Uncontrolled environmental changes may take place during the second administration of the test. Many traits change over time; attitudes, behavior, knowledge and physical condition are modified by experiences between testings.

Equivalence: The focus of equivalence is the comparison of two versions of the same paper-and-pencil instrument, or of two observers measuring the same events. The comparison of two observers is referred to as inter-rater reliability.

Example: on March 3rd, Group A is scored simultaneously by Observer 1 and Observer 2. If the mean scores are similar (e.g., 99 and 98), the second measure is equivalent; if the mean scores differ widely (e.g., 54 and 99), it is not equivalent.

The comparison of two paper-and-pencil instruments is referred to as alternate-forms, equivalent-forms, or comparable-forms reliability.

The procedure for calculating inter-rater reliability involves comparing the number of agreements obtained between raters on a coding form with the number of possible agreements. The calculation uses the formula below.

For example, suppose a rating scale is developed to assess the cleanliness of a bone marrow transplantation unit; the scale may be administered by two different observers who rate the unit's cleanliness simultaneously but independently.

The reliability may be computed using the following equation: Reliability = Number of agreements / (Number of agreements + Number of disagreements)

For instance, let's say you had 100 observations that were being rated by two raters. For each observation, each rater could check one of three categories. Imagine that on 86 of the 100 observations the raters checked the same category. In this case, the percent agreement would be 86%. It's a crude measure, but it gives an idea of how much agreement exists, and it works no matter how many categories are used for each observation.
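As a sketch of that arithmetic, the snippet below computes percent agreement between two raters. The cleanliness category labels are invented for illustration.

```python
# Inter-rater reliability as percent agreement:
#   agreements / (agreements + disagreements)
# The ratings below are invented for illustration.
rater_1 = ["clean", "clean", "dirty", "clean", "moderate", "clean"]
rater_2 = ["clean", "dirty", "dirty", "clean", "moderate", "clean"]

agreements = sum(a == b for a, b in zip(rater_1, rater_2))
percent_agreement = agreements / len(rater_1)
print(f"agreement = {percent_agreement:.0%}")  # 5 of 6 ratings match -> 83%
```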

Homogeneity: It is also called internal consistency. Internal consistency is the extent to which a test's items or procedures all assess the same characteristic, skill or quality. It is a measure of the precision of the observers or of the measuring instruments used in the study. This type of reliability often helps researchers interpret data and predict the value of scores and the limits of the relationship among variables.

For example, a patient-satisfaction measurement scale developed to measure satisfaction with nursing care must include only subparts related to the measurement of satisfaction with nursing care. It should not include items on the patient's satisfaction with other aspects of care; including a subpart related to the patient's satisfaction with health care in general would be inappropriate in this scale.

Split-half reliability: The procedure for calculating the split-half reliability of a research instrument involves the following steps: Step 1: Divide the test into equivalent halves. Step 2: Compute a Pearson r between scores on the two halves of the test.

Divide the items of the research instrument into two equal parts by grouping either the odd- and even-numbered items or the first-half and second-half items. Administer the two subparts of the tool simultaneously, score them independently, and compute the correlation coefficient between the two sets of scores using the following formula: r = Σ(x − x̄)(y − ȳ) / √[ Σ(x − x̄)² × Σ(y − ȳ)² ]

Step 3: Adjust the half-test reliability using the Spearman-Brown formula. Because the formula above estimates the reliability of only half the items, the following formula is used to estimate the reliability of the entire test and thereby overcome the underestimation:

r′ = 2r / (1 + r), where r = the correlation coefficient computed on the split halves in Step 2, and r′ = the estimated reliability of the entire test.
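The split-half procedure and Spearman-Brown adjustment can be sketched as follows. The 0/1 item-response matrix is invented, and the odd/even split is one of the two grouping options mentioned above.

```python
# Split-half reliability: correlate odd-item and even-item subscores, then
# adjust with the Spearman-Brown formula r' = 2r / (1 + r). Data are made up.
import numpy as np

# Hypothetical 0/1 item responses: rows = subjects, columns = items.
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
])

odd_half = items[:, 0::2].sum(axis=1)    # items 1, 3, 5, 7
even_half = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8

r_half = np.corrcoef(odd_half, even_half)[0, 1]   # Pearson r of the halves
r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown adjustment
print(f"half-test r = {r_half:.2f}, full-test r = {r_full:.2f}")
```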

Factors influencing the reliability of test scores: 1. Extrinsic factors: - Group variability - Guessing by the examinees - Environmental conditions - Momentary fluctuations in the examinee

2. Intrinsic factors: - Length of the test - Range of the total scores - Homogeneity of the test - Difficulty value of the items - Discriminative value of the items - Scorer reliability

VALIDITY

VALIDITY Introduction Reliability and validity are not independent qualities of an instrument. A measuring device that is unreliable cannot possibly be valid. An instrument cannot validly measure an attribute if it is inconsistent and inaccurate. An unreliable instrument contains too much error to be a valid indicator of the target variable. An instrument can, however, be reliable without being valid.

Definition: According to Treece and Treece, "Validity refers to an instrument or test actually testing what it is supposed to be testing." According to Polit and Hungler, "Validity refers to the degree to which an instrument measures what it is supposed to measure."

According to the American Psychological Association, "Validity is the appropriateness, meaningfulness and usefulness of the inferences made from the scores of the instrument." Validity refers to the degree to which a tool measures what it is intended to measure; that is, whether the instrument or scale is quantifying what it claims to.

[Illustration: a weighing scale validly measures weight (✓) but not temperature (✗); a thermometer validly measures temperature (✓) but not weight (✗).]

Features of validity: Validity applies to all assessment, including performance or behavioral assessment. Validity is an ongoing process. Validity is not just a measurement principle; it is a social value that has powerful implications whenever evaluative judgments and decisions are made. Validity is not a fixed property of a test, because validation is not a fixed process but an unending one.

Factors influencing validity

Types of Validity

1. FACE VALIDITY Face validity refers to whether the instrument looks as though it is measuring the appropriate construct. Although face validity should not be considered strong evidence for an instrument's validity, it is helpful for a measure to have face validity if other types of validity have also been demonstrated. For example, it might be easier to persuade people to participate in a study if the instruments being used have face validity.

2. CONTENT VALIDITY It is also designated by other terms such as intrinsic validity, relevance, circular validity and representativeness. It refers to content representativeness, that is, the completeness with which the items cover the important areas of the domain they are attempting to represent (the sampling of content). Content validity is based on the extent to which a measurement reflects the specific intended domain of content.

Content validity assessment may also involve a more detailed procedure in which judges evaluate the items for each category. The percentage of agreement between the judges is the basis for the inclusion or rejection of items. After item analysis, content validity needs to be assessed again to determine whether the correct proportion of items for each subcategory has been maintained; if the proportion has not been maintained, items are exchanged and sometimes added until the content is sampled appropriately while retaining an acceptable level of reliability.

Content validity concerns the degree to which an instrument has an appropriate sample of items for the construct being measured and adequately covers the construct domain. Content validity is relevant for both affective measures (i.e., measures relating to feelings, emotions, and psychological traits) and cognitive measures.

Such a conceptualization of the content domain might come from a variety of sources, including rich first-hand knowledge, an exhaustive literature review, consultation with experts, or findings from a qualitative inquiry.

An instrument's content validity is necessarily based on judgment. There are no completely objective methods of ensuring the adequate content coverage of an instrument. However, it is becoming increasingly common to use a panel of substantive experts to evaluate and document the content validity of a new instrument.

Characteristics of content validity: - Developing items to cover all the areas of the content to be measured. - Selecting experts to evaluate the content validity of each item and of the total scale. - A structured procedure for the evaluation of the content validity of the instrument should be given to the experts. - The content validity is quantified by the use of a content validity index (a sketch of this calculation follows the list). - Several items in the instrument may need to be evaluated more than once in order to establish sufficient content validity.
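Here is a minimal sketch of quantifying content validity with a content validity index, assuming the common convention that an expert rating of 3 or 4 on a 4-point relevance scale counts as "relevant"; the ratings matrix is invented and the presentation does not specify this particular scheme.

```python
# Content validity index (CVI): the proportion of expert judges who rate
# each item as relevant. Ratings are invented for illustration.
import numpy as np

# rows = items, columns = experts; ratings on a 1-4 relevance scale
ratings = np.array([
    [4, 3, 4, 4],
    [4, 4, 3, 4],
    [2, 3, 2, 3],
    [4, 4, 4, 3],
])

relevant = ratings >= 3              # ratings of 3 or 4 count as relevant
item_cvi = relevant.mean(axis=1)     # I-CVI: per-item agreement among experts
scale_cvi = item_cvi.mean()          # scale-level average of the item CVIs
print("I-CVI per item:", item_cvi)   # a low I-CVI flags an item for revision
print(f"scale CVI = {scale_cvi:.2f}")
```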

3. Criterion related validity: Criterion validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

For example, imagine a hands-on driving test has been shown to be an accurate test of driving skills. A written driving test can then be validated using a criterion-related strategy in which scores on the written test are compared with scores on the hands-on driving test.

Criterion-related validity is obtained against a criterion that is available at present or will become available in the future. According to Cureton (1965), the criterion-related validity of a test is an estimate of the correlation coefficient between the test scores and the "true" criterion scores.

Types of criterion validity

a. Predictive validity: It is also called empirical validity or statistical validity. In predictive validity, a test is correlated against a criterion that becomes available in the future. Subsequently the test scores are correlated with the criterion scores, and the obtained correlation becomes the validity coefficient.

b. Concurrent validity: It is very similar to predictive validity, except that there is no time gap between obtaining the test scores and the criterion scores. The test is correlated with a criterion that is available at the present time. Concurrent validity can be determined by establishing either a relationship or a discrimination.
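Both forms of criterion-related validity reduce to correlating instrument scores with criterion scores; the difference is only in when the criterion is collected. A minimal sketch with invented scores:

```python
# Criterion-related (here, concurrent) validity: correlate scores on a new
# instrument with scores on an established criterion measure collected at
# the same time. All scores below are illustrative only.
import numpy as np

new_scale = np.array([12, 18, 9, 22, 15, 20, 11, 17])   # instrument under test
criterion = np.array([14, 19, 10, 24, 13, 21, 12, 18])  # validated measure

validity_coefficient = np.corrcoef(new_scale, criterion)[0, 1]
print(f"validity coefficient = {validity_coefficient:.2f}")
```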

4. Construct validity: It is also called factorial validity or trait validity. Construct validity is the extent to which the test may be said to measure a theoretical construct or trait.

In construct validity, the investigator is concerned with these questions: Is the concept under investigation being adequately measured? Is there a valid basis for the inferences made from the scores? Construct validation begins when the investigator formulates hypotheses, based on theory, about the characteristics of those being measured; the construct can be a dependent variable in some hypotheses and an independent variable in others. Evidence of construct validity is not established within a single study. Variables like anxiety, creativity, job satisfaction, attitudes toward numerous phenomena, intelligence, empathy, dependency, power, self-concept, etc. are all constructs.

These variables have been constructed from theory and practical data (i.e., quantitative data from research). For measures of variables that are constructs, evidence must be presented that the scores represent the degree to which an individual actually possesses or exhibits the construct. There are a number of ways to gather evidence about the construct validity of a scale. It should also be noted, however, that if the instrument developer has taken strong steps to enhance the content validity of the instrument, construct validity will also be strengthened.

Characteristics of construct validity: 1. Specifying the different measures of the construct. 2. Determining the extent of correlation between all or some of the measures of the construct. 3. Determining whether or not all or some measures act as if they were measuring the construct.

Putting Reliability and Validity Together: Imagine that I have three fish tank thermometers: a blue one, a red one, and a green one. The blue one always reads the same temperature no matter how hot or cold the water is. The red one shows a different temperature every time, even if I just measured it 5 seconds earlier. The green one seems to read accurately: warm when the water is warm and cold when the water is cool.

Complete the chart below:

Thermometer | Is it consistent (reliable)? | Is it measuring what it is supposed to (valid)? | Verdict
Blue: always reads the same temperature no matter what | Yes | No | Reliable but not valid
Red: shows a different temperature every time, even if nothing has changed | No | No | Not reliable, not valid
Green: warm when the water is warm, cold when the water is cool | Yes, the reading changes only if the temperature changes | Yes | Reliable and valid