VALIDITY AND RELIABILITY IN NURSING RESEARCH
RELIABILITY AND VALIDITY
Prepared by: Mr. Abhinav Bhatt
RELIABILITY
Introduction: The reliability of a research instrument concerns the extent to which the instrument yields the same results on repeated trials. Although unreliability is always present to a certain extent, there will generally be a good deal of consistency in the results of a quality instrument gathered at different times. The tendency toward consistency found in repeated measurements is referred to as reliability.
Reliability is a major criterion for assessing the quality and adequacy of a quantitative instrument. An instrument's reliability is the consistency with which it measures the target attribute. E.g., if a scale weighed a person at 120 pounds one minute and 150 pounds the next, we would consider it unreliable.
Illustration: a reliable scale shows essentially the same weight on repeated weighings; an unreliable scale shows different weights on repeated weighings (e.g., 50 kg, then 35 kg).
DEFINITION
“The reliability of an instrument is the degree of consistency with which the instrument measures the target attribute.”
According to Burroughs, “Reliability constitutes the ability of a measuring instrument to produce the same answer on successive occasions when no change has occurred in the thing being measured.”
Test-retest example: Group A takes Test 1 on March 3rd and retakes Test 1 on March 12th.
- If the mean scores are similar (e.g., 89 on March 3rd and 93 on March 12th), the test is reliable.
- If the mean scores are different (e.g., 89 on March 3rd and 41 on March 12th), the test is not reliable.
Types of Reliability
Stability Reliability (Test-Retest Method)
Repeating a test is the simplest method of determining the agreement between two sets of scores. We estimate test-retest reliability when we administer the same test to the same sample on two different occasions. Although the test-retest approach is sometimes the only available method, it is open to several serious objections. It assumes that there is no substantial change in the construct being measured between the two occasions, so the amount of time allowed between measures is critical.
If we measure the same thing twice, the correlation between the two observations will depend in part on how much time elapses between the two measurement occasions. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation.
This happens because the two observations are related over time: the closer the retest is in time, the more subjects will recall their first answers and spend their time on new material. Besides this immediate memory effect, practice and the confidence induced by familiarity with the material will almost certainly affect scores when the test is taken a second time.
If the interval between tests is long and the subjects are young children, developmental changes will affect the retest scores. In general, growth increases initial scores by varying amounts and tends to lower the reliability coefficient.
The procedure for calculating the test-retest reliability of a research instrument involves the following steps:
- Administer the research instrument to a sample of subjects on two different occasions.
- Compare the scores obtained on the two occasions and calculate the correlation coefficient using the following formula:
r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²)
Interpretation of results: the correlation coefficient ranges from -1.00 through 0.00 to +1.00, and the results are interpreted as follows:
- A score of +1.00 indicates perfect reliability.
- A score of 0.00 indicates no reliability.
- A score above 0.70 indicates an acceptable level of reliability for a tool.
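To make the two steps concrete, here is a minimal sketch in Python (all scores are hypothetical, and numpy is assumed to be available):

```python
# Test-retest reliability: correlate scores from two administrations
# of the same tool to the same sample. All scores are hypothetical.
import numpy as np

occasion_1 = np.array([85, 90, 78, 92, 88, 75, 95, 80])  # first administration
occasion_2 = np.array([87, 89, 80, 94, 85, 77, 96, 82])  # second administration

# Pearson correlation coefficient between the two sets of scores
r = np.corrcoef(occasion_1, occasion_2)[0, 1]
print(f"Test-retest reliability: r = {r:.2f}")

# Interpretation per the slide: +1.00 perfect, 0.00 none, above 0.70 acceptable
print("Acceptable" if r > 0.70 else "Below the 0.70 threshold")
```

For these made-up scores the two administrations track each other closely, so r falls well above the 0.70 criterion and the tool would be judged reliable.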
Disadvantages:
- It is time consuming.
- The examinee's health, emotional condition, motivation, and mental set may change between administrations.
- Uncontrolled environmental changes may take place during the second administration of the test.
- Many traits change over time: attitudes, behavior, knowledge, and physical condition are modified by the experiences between testings.
Equivalence
The focus of equivalence is the comparison of two versions of the same paper-and-pencil instrument, or of two observers measuring the same events. The comparison of two observers is referred to as inter-rater reliability.
Equivalence example: on March 3rd, Group A is rated by Observer 1 and Observer 2 at the same time.
- If the mean scores are similar (e.g., 99 from Observer 1 and 98 from Observer 2), the second measure is an equivalent test.
- If the mean scores are different (e.g., 54 from Observer 1 and 99 from Observer 2), the second measure is not an equivalent test.
The comparison of two paper-and-pencil instruments is referred to as alternate-forms, equivalent-forms, or comparable-forms reliability.
The procedure for calculating inter-rater reliability involves comparing the number of agreements obtained between raters on a coding form with the number of possible agreements. The calculation is performed using the formula shown below.
For example, suppose a rating scale is developed to assess the cleanliness of a bone marrow transplantation unit; the scale may be administered by two different observers simultaneously but independently.
The reliability may be computed by using the following equation:
Reliability = Number of agreements / (Number of agreements + Number of disagreements)
For instance, let's say you had 100 observations that were being rated by two raters. For each observation, a rater could check one of three categories. Imagine that on 86 of the 100 observations the raters checked the same category. In this case, the percent agreement would be 86%. It is a crude measure, but it does give an idea of how much agreement exists, and it works no matter how many categories are used for each observation.
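A minimal sketch of this percent-agreement calculation in Python; the two rating lists are hypothetical:

```python
# Inter-rater reliability as percent agreement. The rating lists are
# hypothetical; each element is the category (1, 2, or 3) a rater
# assigned to one observation.
rater_1 = [1, 2, 2, 3, 1, 3, 2, 1, 1, 2]
rater_2 = [1, 2, 3, 3, 1, 3, 2, 1, 2, 2]

agreements = sum(a == b for a, b in zip(rater_1, rater_2))
total = len(rater_1)  # agreements + disagreements

# Reliability = agreements / (agreements + disagreements)
percent_agreement = agreements / total * 100
print(f"Percent agreement: {percent_agreement:.0f}%")  # 80% for these data
```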
Homogeneity
It is also called internal consistency. Internal consistency is the extent to which the items of a test or procedure assess the same characteristic, skill, or quality. It is a measure of the precision between the observers or between the measuring instruments used in a study. This type of reliability often helps researchers interpret data and predict the value of scores and the limits of the relationships among variables.
For example, a patient satisfaction scale developed to measure satisfaction with nursing care must include only sub-parts related to the measurement of satisfaction with nursing care. It should not include items on the patient's satisfaction with other aspects of care; including a sub-part related to the patient's satisfaction with health care in general would be inappropriate in this scale.
Split-half reliability
The procedure for calculating the split-half reliability of a research instrument involves the following steps:
Step 1: Divide the test into equivalent halves.
Step 2: Compute a Pearson r between the scores on the two halves of the test.
Divide the items of the research instrument into two equal parts, grouping either by odd- and even-numbered questions or into first-half and second-half item groups. Administer the two sub-parts of the tool simultaneously, score them independently, and compute the correlation coefficient between the two sets of scores using the following formula:
r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²)
Step 3: Adjust the half-test reliability using the Spearman-Brown formula. Because the formula above estimates the reliability of only half the items, it underestimates the reliability of the entire scale; the following formula is therefore used to estimate the reliability of the entire test.
r′ = 2r / (1 + r)
where r = the correlation coefficient computed on the split halves with the formula of Step 2, and r′ = the estimated reliability of the entire test.
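The three steps can be sketched in Python as follows (the item scores are hypothetical; numpy is assumed to be available):

```python
# Split-half reliability with the Spearman-Brown correction.
# Rows = subjects, columns = item scores (hypothetical 0/1 data).
import numpy as np

items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
])

# Step 1: divide the test into halves (odd- vs even-numbered items)
half_a = items[:, 0::2].sum(axis=1)  # items 1, 3, 5, 7, 9
half_b = items[:, 1::2].sum(axis=1)  # items 2, 4, 6, 8, 10

# Step 2: Pearson r between the two half-test scores
r_half = np.corrcoef(half_a, half_b)[0, 1]

# Step 3: Spearman-Brown correction for the full-length test
r_full = 2 * r_half / (1 + r_half)
print(f"Half-test r = {r_half:.2f}, estimated full-test reliability = {r_full:.2f}")
```

Note that for any positive r the corrected coefficient is larger than the half-test correlation, which is exactly the underestimation the Spearman-Brown formula compensates for.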
Factors influencing the reliability of test scores
1. Extrinsic factors:
- Group variability
- Guessing by the examinees
- Environmental conditions
- Momentary fluctuations in the examinee
2. Intrinsic factors:
- Length of the test
- Range of the total scores
- Homogeneity of the test
- Difficulty value of the items
- Discriminative value of the items
- Scorer reliability
VALIDITY
Introduction: Reliability and validity are not independent qualities of an instrument. A measuring device that is unreliable cannot possibly be valid. An instrument cannot validly measure an attribute if it is inconsistent and inaccurate. An unreliable instrument contains too much error to be a valid indicator of the target variable. An instrument can, however, be reliable without being valid.
Definition
According to Treece and Treece, “Validity refers to an instrument or test actually testing what it is supposed to be testing.”
According to Polit and Hungler, “Validity refers to the degree to which an instrument measures what it is supposed to be measuring.”
According to the American Psychological Association, “Validity is the appropriateness, meaningfulness, and usefulness of the inferences made from the scores of the instrument.” Validity refers to the degree to which the tool measures what it is intended to measure; it refers to whether the instrument or scale is quantifying what it claims to.
Illustration: an instrument may be a valid measure of weight (√) but not of temperature (✗), or a valid measure of temperature (√) but not of weight (✗).
Features of validity:
- Validity applies to all assessment, including performance or behavioral assessment.
- Validity is an ongoing process.
- Validity is not just a measurement principle; it is a social value that has powerful implications whenever evaluative judgments and decisions are made.
- Validity is not a fixed property of the test, because validation is not a fixed process but rather an unending one.
Factors influencing validity
Types of Validity
1. FACE VALIDITY
Face validity refers to whether the instrument looks as though it is measuring the appropriate construct. Although face validity should not be considered strong evidence for an instrument's validity, it is helpful for a measure to have face validity when other types of validity have also been demonstrated. For example, it might be easier to persuade people to participate in a study if the instruments being used have face validity.
2. CONTENT VALIDITY
It is also designated by other terms such as intrinsic validity, relevance, circular validity, and representativeness. It refers to the representativeness of the content, that is, the completeness with which the items cover the important areas of the domain they are attempting to represent (the sampling of content). Content validity is based on the extent to which a measurement reflects the specific intended domain of content.
Content validity assessment may also involve a more detailed procedure in which items are developed for each category. The percentage of agreement between the judges is considered the basis for the inclusion or rejection of items. After item analysis, content validity needs to be assessed again to determine whether the correct proportion of items for each sub-category has been maintained; if the proportion has not been maintained, items are exchanged, and sometimes added, until the content is sampled appropriately while retaining an acceptable level of reliability.
Content validity concerns the degree to which an instrument has an appropriate sample of items for the construct being measured and adequately covers the construct domain. Content validity is relevant for both affective measures (i.e., measures relating to feelings, emotions, and psychological traits) and cognitive measures.
Such a conceptualization might come from a variety of sources, including rich first-hand knowledge, an exhaustive literature review, consultation with experts, or findings from a qualitative inquiry.
An instrument's content validity is necessarily based on judgment. There are no completely objective methods of ensuring the adequate content coverage of an instrument. However, it is becoming increasingly common to use a panel of substantive experts to evaluate and document the content validity of a new instrument.
Characteristics of content validity:
- Items are developed to cover all the areas of the content to be measured.
- Experts are selected to evaluate the content validity of each item and of the total scale.
- A structured procedure for the evaluation of the content validity of the instrument should be given to the experts.
- The content validity is quantified by the use of a content validity index.
- Several items in the instrument may need to be evaluated more than once in order to establish sufficient content validity.
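The slides do not give a formula for the content validity index, but a widely used convention (an assumption here, not stated above) is the item-level CVI: each expert rates an item's relevance on a 4-point scale, and the I-CVI is the proportion of experts giving a rating of 3 or 4. A minimal sketch:

```python
# Item-level content validity index (I-CVI), under the conventional
# assumption of 4-point relevance ratings with 3 or 4 counting as
# "relevant". All ratings below are hypothetical.
ratings = [
    [4, 3, 4, 4, 3],  # item 1: ratings from five experts
    [2, 3, 4, 3, 4],  # item 2
    [1, 2, 3, 2, 1],  # item 3
]

for i, item in enumerate(ratings, start=1):
    relevant = sum(1 for r in item if r >= 3)  # experts rating 3 or 4
    i_cvi = relevant / len(item)
    print(f"Item {i}: I-CVI = {i_cvi:.2f}")

# Items with a low I-CVI (item 3 here, 0.20) are candidates for
# revision or rejection, per the agreement-based procedure above.
```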
3. Criterion-related validity: Criterion validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure that has been demonstrated to be valid.
For example, imagine that a hands-on driving test has been shown to be an accurate test of driving skills. A written driving test can then be validated using a criterion-related strategy in which scores on the written test are compared with scores on the hands-on driving test.
Criterion-related validity is validity obtained against a criterion that is available at present or will become available in the future. According to Cureton (1965), the criterion-related validity of a test is an estimate of the correlation coefficient between the test scores and the “true” criterion scores.
Types of criterion validity
a. Predictive validity: It is also called empirical validity or statistical validity. In predictive validity, a test is correlated against a criterion that becomes available in the future. The test scores are subsequently correlated with the criterion scores, and the obtained correlation becomes the validity coefficient.
b. Concurrent validity: It is very similar to predictive validity, except that there is no time gap between obtaining the test scores and the criterion scores. The test is correlated with a criterion that is available at the present time. Concurrent validity can be determined by establishing relationships or discrimination.
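Both forms of criterion-related validity reduce to correlating instrument scores with criterion scores; the difference is only when the criterion is measured. A minimal sketch using the driving-test example above (all scores are hypothetical, and numpy is assumed):

```python
# Criterion-related validity: correlate the new written driving test
# with the already-validated hands-on test (the criterion).
# Concurrent validity: the criterion is measured at the same time.
# Predictive validity: the criterion is measured later.
import numpy as np

written_test = np.array([72, 85, 60, 90, 78, 55, 88, 69])   # new instrument
hands_on_test = np.array([70, 88, 58, 93, 75, 60, 85, 72])  # criterion

# The validity coefficient is the correlation with the criterion.
validity_coefficient = np.corrcoef(written_test, hands_on_test)[0, 1]
print(f"Criterion-related validity: r = {validity_coefficient:.2f}")
```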
4. Construct validity: It is also called factorial validity or trait validity. Construct validity is the extent to which a test may be said to measure a theoretical construct or trait.
In construct validity, the investigator is concerned with these questions: Is the concept under investigation being adequately measured? Is there a valid basis for the inferences made from the scores? Validation begins when the investigator formulates hypotheses, based on theory, about the characteristics of those being measured; the construct can be a dependent variable in some hypotheses and an independent variable in others. Evidence of construct validity is not established within a single study. Variables like anxiety, creativity, job satisfaction, attitudes toward numerous phenomena, intelligence, empathy, dependency, power, self-concept, etc., are all constructs.
These variables have been constructed from theory and empirical data (i.e., quantitative data from research). For measures of variables that are constructs, evidence must be presented that the scores represent the degree to which an individual actually possesses or exhibits the construct. There are a number of ways to gather evidence about the construct validity of a scale. It should also be noted, however, that if the instrument developer has taken strong steps to enhance the content validity of the instrument, construct validity will also be strengthened.
Characteristics of construct validity:
1. Specifying the different measures of the construct.
2. Determining the extent of correlation between all or some of the measures of the construct.
3. Determining whether or not all or some measures act as if they were measuring the construct.