Scale Development in Research Methodology.pptx

Uploaded by PawasChaturvedi1 · 17 views · 35 slides · Aug 19, 2024

About This Presentation

Scale Development in Research


Slide Content

“Scale Development and Use of Statistical Tools in Different Scales” — Prof. Sampada Kumar Swain, Department of Tourism Studies, Pondicherry University

Introduction:
“Questionnaires are the most commonly used method of data collection in the field of research” (Stone, 1978).
“Perhaps the greatest difficulty in conducting survey research is assuring the accuracy of measurement of the constructs under examination” (Barrett, 1972).
“Even with advanced techniques such as meta-analysis, strong conclusions often cannot be drawn from a body of research due to problems with measurement” (Hinkin, 1995).
“Attention has moved beyond simple questions of reliability and now includes more ‘direct’ assessments of validity” (Churchill, 1979, p. 65).
“A measure is valid when the differences in observed scores reflect true differences on the characteristic one is attempting to measure and nothing else” (Churchill, 1979, p. 65).
Technically, the process of measurement, or operationalisation, involves “rules for assigning numbers to objects to represent quantities of attributes” (Nunnally, 1967, p. 2).
The American Psychological Association (1985) states that measures should demonstrate content validity, criterion-related validity, construct validity and internal consistency.

(Malhotra, 3rd edition, pp. 334-368)

Comparative Scaling Techniques (Malhotra, 3rd edition, pp. 334-368): 1. Paired comparison scaling 2. Rank order scaling 3. Constant sum scaling 4. Q-sort and other procedures

Non-Comparative Scaling Techniques (Malhotra, 3rd edition, pp. 334-368): 1. Continuous rating scales 2. Likert scale 3. Semantic differential scale 4. Stapel scale

Scale Development Process (Hinkin, 1998, p. 106)

Steps in scale development (DeVellis, 2017)


Specify the Domain of the Construct: The researcher must be exacting in delineating what is included in the definition and what is excluded. Consider, for example, the construct of the Memorable Tourism Experience (MTE), selectively constructed from tourism experiences based on the individual’s assessment of the experience. Identifying and understanding the components of the tourism experience that strongly affect individuals and lead to memorability necessitated an inquiry into these experiential components while cross-referencing the literature on memory and memorable experiences. In the memory literature, researchers have found various factors that increase the memorability of an event, including affective feelings, cognitive evaluations, and novel events. Brewer (1988) found that affective thoughts are an important part of memory and that events related to emotions are more likely to be remembered. (Kim et al., 2012)

Research the intended meaning and breadth of the theoretical concept (Carpenter, 2018)

Domain of construct: Tourist Experience (Kim et al., 2012)

Step 1: Item Generation
Two basic approaches to item development are used:
DEDUCTIVE: sometimes called “logical partitioning” or “classification from above”.
INDUCTIVE: “grouping” or “classification from below”.
The rule of thumb for the number of items per construct (DeVellis, 2003) is an initial pool 3 to 4 times larger than the final scale; the final scale should retain a minimum of 3 items per dimension.
Content validity, sometimes called face validity, is a subjective but systematic evaluation of how well the content of a scale represents the measurement task at hand: the researcher or someone else examines whether the scale items adequately cover the entire domain of the measured construct.
Criterion validity reflects whether a scale performs as expected in relation to other selected variables (criterion variables) chosen as meaningful criteria.
Construct validity addresses the question of what construct or characteristic the scale is actually measuring. When assessing construct validity, the researcher attempts to answer theoretical questions about why the scale works and what deductions can be made concerning the underlying theory.

Guidelines for item development (Hinkin, 1995):
Statements should be as simple and as short as possible.
The language used should be familiar to the target respondents.
Keep all items consistent in terms of perspective.
Items should address only a single issue; “double-barrelled” items should not be used.
Items using negative wording should be worded carefully.
Harvey, Billings, and Nilan (1985) suggest that at least 4 items per scale are needed to test the homogeneity of items within each latent construct (no hard and fast rule).
Any scale should comprise the minimum number of items that adequately tap the domain of interest.
It should be anticipated that approximately one-half of the created items will be retained for the final scale, so at least twice as many items as needed in the final scale should be generated and administered in the survey questionnaire.

For example: construct definition of Memorable Tourism Experiences (Kim et al., 2012)

Factors and Items of Memorable Tourism Experiences (Kim et al., 2012)

Item generation for the Memorable Tourism Experience scale: A set of 58 MTE items related to the 16 construct domains was initially generated from a review of tourism and leisure research pertaining to participants’ experiences. The researchers then interviewed 62 individuals using open-ended questions (e.g., recall your most memorable tourism experience and list five words to describe it). The main purposes of this step were to identify themes or construct dimensions that constituted one’s MTE and to ensure the content validity of the construct domains predetermined from the literature review. Content analysis of the responses identified 62 different words describing MTEs; on review, words that could be categorised under one theme were merged. Through this process, the 62 words were reduced to nine themes (hedonism, social interaction, knowledge, novelty, happiness, relaxation, challenge, unexpected happenings, and adverse feelings). By combining the items generated from the two sources (the open-ended survey and the literature review), a total of 84 items were developed as a basis for measuring MTEs. A jury of three experts reviewed this set of 84 items to ensure content validity (DeVellis, 2003). (Kim et al., 2012)

Step 2: Questionnaire Administration
Use the items that have survived the content validity assessment to measure the construct under examination.
The items should be presented to a sample representative of the actual population of interest.
The new items should be administered along with other established measures to examine the “nomological network”: the relationships between existing measures and the newly developed scales.
A Likert scale should be used for item scaling.
Sample size: recommendations for the item-to-response ratio range from 1:4 (Rummel, 1970) to at least 1:10 (Schwab, 1980) for each set of scales to be factor analysed. A sample of 150 observations should be sufficient to obtain an accurate solution in Exploratory Factor Analysis as long as item intercorrelations are reasonably strong (Guadagnoli & Velicer, 1988). For Confirmatory Factor Analysis, a minimum sample size of 200 has been recommended (Hoelter, 1983).
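The sample-size rules of thumb above can be sketched as a quick planning check. This is an illustrative helper, not from the slides; the function name and the example item count are assumptions, while the ratios and floors are the ones cited.

```python
# Minimal sketch: apply the item-to-response ratios and sample-size floors
# cited in the slides to a given item pool size.

def recommended_sample_size(n_items: int) -> dict:
    """Rule-of-thumb sample sizes for factor-analysing n_items items."""
    return {
        "ratio 1:4 (Rummel, 1970)": n_items * 4,
        "ratio 1:10 (Schwab, 1980)": n_items * 10,
        "EFA floor (Guadagnoli & Velicer)": 150,
        "CFA floor (Hoelter, 1983)": 200,
    }

# Hypothetical 25-item pool: the binding constraint is the largest number.
targets = recommended_sample_size(25)
minimum_n = max(targets.values())
```

In practice one would plan for the largest of these figures, since the ratio rules and the absolute floors must all be satisfied simultaneously.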

Step 3: Initial Item Reduction
Once the data have been collected, factor analysis is recommended to further refine the new scale. Factor analysis reduces a set of observed variables to a smaller set of variables (a parsimonious representation), providing evidence of construct validity.
A key assumption of the domain sampling model is that all items belonging to a common domain should have similar average intercorrelations.
The number of factors to retain depends on both the underlying theory and the quantitative results.
Retain items that load on a single appropriate factor; the .40 criterion level appears to be the most commonly used for judging a factor loading as meaningful (Ford et al., 1986). Retain the items with higher communalities.
To purify the scale, items with poor item-total correlations (r < .40) must be eliminated.
Principal Component Analysis is used when the objective is data reduction: the minimum number of factors required to account for the maximum portion of the total variance in the original set of variables. The total explained variance should be greater than 50%.
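The item-total purification rule above (drop items with r < .40) can be sketched in a few lines. This is an illustrative example on simulated data, not the procedure from any cited study; using the corrected item-total correlation (each item against the sum of the remaining items) is a common refinement assumed here.

```python
import numpy as np

def corrected_item_total(responses: np.ndarray) -> np.ndarray:
    """responses: (n_respondents, n_items). Returns each item's correlation
    with the sum of all OTHER items (corrected item-total correlation)."""
    n_items = responses.shape[1]
    total = responses.sum(axis=1)
    r = np.empty(n_items)
    for j in range(n_items):
        rest = total - responses[:, j]  # exclude the item itself
        r[j] = np.corrcoef(responses[:, j], rest)[0, 1]
    return r

# Simulated data: 4 items driven by one latent trait plus 1 unrelated item.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
good = latent + rng.normal(scale=0.5, size=(200, 4))
noise = rng.normal(size=(200, 1))
data = np.hstack([good, noise])

r = corrected_item_total(data)
keep = r >= 0.40  # the .40 purification criterion from the slides
```

The unrelated item falls below the .40 threshold and would be eliminated, while the four coherent items survive.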

(Cronbach’s alpha should be greater than .70)

Internal Consistency Assessment:
Reliability is the accuracy or precision of a measuring instrument and is a necessary condition for validity (Kerlinger, 1986).
Cronbach’s alpha is the most widely accepted measure of internal consistency reliability (Price & Mueller, 1986). A large coefficient alpha (.70 for exploratory measures; Nunnally, 1978) indicates strong item covariance and suggests that the sampling domain has been captured adequately (Churchill, 1979).
Split-half reliability: a form of internal consistency reliability in which the items constituting the scale are divided into two halves and the resulting half scores are correlated.
Nomological validity: the extent to which the scale correlates in theoretically predicted ways with measures of different but related constructs. A theoretical model is formulated that leads to further deductions, tests and inferences.
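Cronbach's alpha can be computed directly from its definition: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below uses simulated data for illustration; the data-generating values are assumptions, not from the slides.

```python
import numpy as np

def cronbach_alpha(responses: np.ndarray) -> float:
    """responses: (n_respondents, n_items) matrix of item scores.
    alpha = k/(k-1) * (1 - sum(item variances) / var(total score))."""
    k = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)
    total_var = responses.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated 5-item scale with strong item intercorrelations.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 1))
items = latent + rng.normal(scale=0.8, size=(300, 5))

alpha = cronbach_alpha(items)  # comfortably above the .70 criterion here
```

With weaker item intercorrelations alpha drops, which is exactly the signal that the sampling domain has not been captured adequately.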

Step 4: Confirmatory Factor Analysis
CFA allows the researcher to quantitatively assess the quality of the factor structure, providing further evidence of the construct validity of the new measure. CFA results should include, at a minimum, the chi-square statistic, the degrees of freedom, and the recommended goodness-of-fit indices for each competing model.
The purpose of CFA is twofold. First, to assess the goodness of fit of the measurement model, comparing a single common-factor model with a multitrait model whose number of factors equals the number of constructs in the new measure (Jöreskog & Sörbom, 1989); the multitrait model restricts each item to load only on its appropriate factor. Second, to examine the fit of individual items within the specified model using modification indices and t-values.
Comparative Fit Index (CFI) and Relative Noncentrality Index (RNI): both involve a correction for degrees of freedom. The CFI, ranging from 0 to 1, is recommended when assessing the degree of fit of a single model; the RNI, ranging from -1 to 1, is recommended when comparing the fit of competing models.
To further assess the quality of the model, t-values and modification indices can be used. By selecting a desired level of significance (usually p < .05; Bagozzi, Yi, & Phillips, 1991), items that are not significant may need to be deleted.
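The CFI mentioned above is a simple function of the chi-square statistics and degrees of freedom of the fitted model and a baseline (independence) model. The sketch below shows that computation; the input numbers are hypothetical, not results from any cited study.

```python
# Minimal sketch of the Comparative Fit Index:
# CFI = 1 - noncentrality(model) / noncentrality(baseline),
# where noncentrality = max(chi2 - df, 0).

def cfi(chi2_model: float, df_model: int,
        chi2_null: float, df_null: int) -> float:
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    if max(d_model, d_null) == 0.0:
        return 1.0  # model fits at least as well as its df would predict
    return 1.0 - d_model / max(d_null, d_model)

# Hypothetical multitrait model vs. its independence baseline:
print(round(cfi(chi2_model=180.0, df_model=120,
                chi2_null=1500.0, df_null=153), 3))  # prints 0.955
```

Values close to 1 indicate good fit; conventional cutoffs (e.g., CFI ≥ .90 or .95) come from the later fit-index literature rather than from the slides.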

Step 5: Convergent/Discriminant Validity
Further evidence of construct validity can be gathered by examining the extent to which the scales correlate with other measures designed to assess similar constructs (convergent validity) and do not correlate with dissimilar measures (discriminant validity). The Multitrait-Multimethod Matrix (MTMM) developed by Campbell and Fiske is used to determine convergent and discriminant validity.
Convergent validity is achieved when the correlations between measures of similar constructs obtained with different methods, such as self-reported performance and performance evaluation data (monotrait-heteromethod), are substantial.
Discriminant validity is achieved when three conditions are satisfied:
1. The correlation between measures of the same construct using different methods is greater than the correlations between different constructs measured with different methods.
2. The correlation between measures of the same construct using different methods is larger than the correlations between different constructs measured with a common method.
3. A similar pattern of correlations exists in each of the submatrices formed by correlating measures of different constructs obtained with the same method and with different methods.

Step 6: Replication It is recommended that when items are added to or deleted from a measure, the “new” scale be administered to another independent sample (Anderson & Gerbing, 1991; Schwab, 1980). The use of a new sample also allows the application of the measure in a substantive test. It is therefore necessary to collect another data set from an appropriate sample and repeat the scale-testing process with the new scales. The replication should include confirmatory factor analysis, assessment of internal consistency, and assessment of convergent, discriminant and criterion-related validity.

Thank You