Methods in Social Resaerch- Quantitative Research.pptx

Methods in Social Research – Quantitative research Ph.D. Programme in Global studies Università degli studi di Urbino Carlo Bo Tim Goedemé, PhD Lecture 1 – 24/11/2020

Overview of the course Main aims Basic understanding of strengths and pitfalls of various types of quantitative research Being able to identify and reflect upon the quality of social indicators and social research To be familiar with survey research and the total survey error paradigm Being able to critically reflect upon the identification of causality Having an understanding of some often-used quantitative research techniques in social policy research Being aware of key points of attention when setting up your own quantitative research project

Overview of the course Introduction to quantitative research and social indicators Survey data and total survey error, including sampling variance Causality Quantitative research techniques to identify drivers Setting up your own research project Perspective of (social) policy, poverty and inequality This is an introduction to give you some handles to understand quantitative research, better grasp the main issues and points of attention, and give some direction for your own research, not a statistics course

Introduction to quantitative research Quantitative research: concepts and definitions Summarising quantitative data Social indicators: an introduction

Main questions What is quantitative research? What are key metrics, tables and graphs to summarize quantitative data? What are the most important characteristics of good indicators? How can one define comparability?

Part 1: Quantitative research

Quantitative research What is qualitative research? Quantitative research: “ Quantitative research is the process of collecting and analyzing numerical data. It can be used to find patterns and averages, make predictions, test causal relationships, and generalize results to wider populations. Quantitative research is the opposite of qualitative research, which involves collecting and analyzing non-numerical data (e.g. text, video, or audio).” (Source: https://www.scribbr.com/methodology/quantitative-research/ ) <<Discussion>>

Quantitative research “Quantitative research is empirical research where the data are in the form of numbers” “Qualitative research is empirical research where the data are not in the form of numbers” (Punch, 2014, p. 3) However, lots of data in quantitative research are (initially) not in the form of numbers: gender, occupational status, economic sector of activity. And sometimes, in qualitative studies, numerical data are very important (e.g. dynamics of household debt, reference budgets research on how much people need to participate in society) Mixed-methods designs, number of observations, data generation process and degree of ‘generalisability’

Quantitative research Types: Descriptive Correlational Causal Methodological

Quantitative research Basis: start from data and try to turn that into a number that can be meaningfully interpreted Key issue: replicability, i.e. if others follow the same procedure, they should (be able to) arrive at the same result So the question is: are the data of sufficient quality? is the procedure appropriate and adequate? do the authors interpret the result correctly?

Quantitative research Basic ingredients: variables and indicators Quantitative database: variables and observations

Quantitative research Variables (columns) Respondents / units of observation (rows) A record An observation / data point (cell) Structure of a database
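
The database structure described on this slide can be sketched in a few lines of Python. This is only an illustration: the respondents, variable names and values below are invented, not taken from any survey.

```python
# Hypothetical toy database: each dict is one record (row) for one respondent.
records = [
    {"id": 1, "gender": "female", "age": 34, "wage": 2100.0},
    {"id": 2, "gender": "male", "age": 51, "wage": 1850.0},
    {"id": 3, "gender": "female", "age": 27, "wage": 1600.0},
]

# A variable (column) is the same field collected across all records.
variables = list(records[0].keys())
wage_column = [r["wage"] for r in records]

# A record is one full row; an observation (cell) is the value of
# one variable for one unit of observation.
second_record = records[1]
cell = records[1]["age"]

print(variables)
print(wage_column)
print(second_record, cell)
```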

Quantitative research Types of units of observation Social entities: Individuals, households, municipalities, companies, countries Social phenomena: Transactions, court cases, crimes, purchases Other types: durables, animals, … Types of variables Categorical: Dichotomous (only two values) Ordinal (logical order, but no clear (numerical) distance between categories) Nominal (no logical order) Numerical: ordinal, continuous (interval vs. ratio); discrete variables (only integers) – fully continuous variables Refers to measurement and how it is recorded in the database, not how it is in reality (e.g. weight in reality, versus recorded in varying classes of 5-10kg) Cases in between (read Heeringa et al., 2010, section 5.2.3, p. 119-120)

Quantitative research Structure of a database

Quantitative research Variables vs. indicators Indicator: a summary statistic which tries to measure a (social) phenomenon or a (past, current or future) state of affairs Can be based on one or a combination of variables Can be based on a combination of (many) indicators Need not be based on survey data (e.g. the minimum level of social benefits can be derived from a database with programmed legislation) Examples: Average or median income, GDP/capita, CPI, Gini coefficient, Human Development Index (HDI) => a database can be a collection of (e.g. country-level) indicators, which themselves are generated on the basis of other databases

Quantitative research Types of analysis: Univariate (distribution of one variable) Bivariate (joint distribution of two variables, correlation) Multivariate (joint distribution of more than two variables, correlation)

Quantitative research Types of variables (bis) Dependent variable (response variable) Independent variable (treatment variable) Control variables

Quantitative research Types of databases Cross-sectional One moment in time Longitudinal Repeated cross-sections (multiple moments, different units of observation) Panel data (multiple moments, same units of observation) Survey data, Administrative data (register data)

Quantitative research Different types of data: Population data => information on all elements of the target population => Some elements of ‘total survey error paradigm’ still relevant (missing data, coverage errors, etc.) Information on a sample (i.e. a selection of the population) Non-random samples => cannot generalise with confidence to target population, no good indicators of statistical reliability Random samples => can generalise with more confidence, with indicators of statistical reliability (including confidence intervals, but also non-response rates)

Quantitative research Non-random samples: Convenience samples, self-selected samples, purposefully selected samples (e.g. quota samples) No direct theoretical support to generalise findings to the population Random samples: Human influence (both known and unknown) is ‘removed’ from the selection process All elements in the population have a non-zero probability of selection For all elements in the sample, this probability of selection is known Simple random sampling: no stratification, no clustering, equal probability of selection for all population members Samples can be randomly wrong => require estimate of reliability! See also: Groves et al., 2009, p. 97ff
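
That "samples can be randomly wrong" can be made concrete with a small simulation. The sketch below assumes a synthetic population of incomes (invented here), draws simple random samples from it, and attaches a standard error and an approximate 95% interval to one estimate. The finite population correction is ignored for simplicity.

```python
import random
import statistics

random.seed(2020)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 incomes (synthetic, right-skewed).
population = [random.lognormvariate(10, 0.5) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def srs_estimate(pop, n):
    """Draw a simple random sample without replacement and return the
    sample mean together with its estimated standard error."""
    sample = random.sample(pop, n)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # no finite population correction
    return mean, se

# One sample: point estimate plus an approximate 95% confidence interval.
mean, se = srs_estimate(population, 500)
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"population mean {true_mean:.0f}, estimate {mean:.0f} [{low:.0f}, {high:.0f}]")

# Several samples from the same population give different estimates.
estimates = [srs_estimate(population, 500)[0] for _ in range(5)]
print([round(e) for e in estimates])
```

Each run of `srs_estimate` follows the same procedure yet returns a slightly different mean, which is why a random sample needs an accompanying estimate of reliability.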

Quantitative research Strengths: (testable) potential for high degree of generalisability Replicability Support or reject quantitative and causal claims Helps to simplify to an understandable degree complex phenomena and changes Weaknesses: Risk of over-simplification (‘superficial’) Sometimes hard to identify and test causal mechanisms Limited possibilities to take variables and considerations on board that were not thought of in the design phase Over-confidence & misrepresentation Some of these also apply to qualitative research

Quantitative research Essential for: Knowledge of incidence and distribution of phenomena / characteristics, correlations in a target population ‘Proving’ causal relationships in specific and broader populations But quantitative research does not automatically lead to these things Requires careful sample selection, data treatment, analysis and interpretation

Part 2: Summarising data Databases are too big and not very telling to publish / present as such We need meaningful summary measures that tell us what the data look like They are estimated from the data. Very often an additional step is required to estimate what the estimated value tells about the target population (see afternoon) Two ways of summarising data: tables (i.e. numbers) and graphs

Summarising data The type of appropriate metric depends on the type of variable E.g. more limited possibilities with nominal variables vs. continuous variables Hierarchy: nominal – ordinal – interval – ratio As continuous variables can be made ordinal, what is applicable to ordinal is also applicable to continuous (with loss of information) Usually not the other way around Same holds for nominal vs. ordinal Exception: dichotomous (‘dummy’) variables => in analysis often treated as if they are continuous

Summarising data ‘Typical values’ (univariate analysis) Totals (e.g. how many unemployed are there in Italy?) Mode (most common value of a variable) Average Median (the observation in the middle of the distribution, when units of observation are ranked from lowest to highest; in case of even numbers: arithmetic average of two observations in the middle of the distribution)
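
A minimal sketch of these typical values, using Python's standard `statistics` module. The wages and statuses are invented; the example is chosen so that one very high wage pulls the average well above the median.

```python
import statistics

# Hypothetical monthly wages (numerical) and labour market status (nominal).
wages = [1200, 1500, 1500, 1800, 2100, 9000]
status = ["employed", "employed", "unemployed", "employed", "inactive", "employed"]

total_wages = sum(wages)                  # total
mode_status = statistics.mode(status)     # most common value
mean_wage = statistics.mean(wages)        # arithmetic average
median_wage = statistics.median(wages)    # even n: average of the two middle values

print(total_wages, mode_status, mean_wage, median_wage)
```

Here the average (2850) is far above the median (1650) because of the single wage of 9000, which illustrates why the median is often preferred for skewed variables such as income.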

Summarising data Illustration geometric average
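
The original illustration is not recoverable from the slide text. As a stand-in, the sketch below assumes a typical use case: averaging yearly growth factors (the numbers are invented). The geometric average is the constant factor that, applied every year, gives the same cumulative growth.

```python
import statistics

# Hypothetical yearly growth factors: +10%, -10%, +5%.
growth_factors = [1.10, 0.90, 1.05]

arithmetic = statistics.fmean(growth_factors)
geometric = statistics.geometric_mean(growth_factors)

# Cumulative growth over the three years.
cumulative = 1.10 * 0.90 * 1.05

print(f"arithmetic average: {arithmetic:.4f}")
print(f"geometric average:  {geometric:.4f}")
print(f"geometric ** 3 = {geometric ** 3:.4f}, cumulative = {cumulative:.4f}")
```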

Summarising data A simple line graph X-axis (independent) Y-axis (dependent) Origin, includes zero in case of dependent (in this case) Title, which also indicates unit of measurement Gridlines help to read the graph Axis titles are often very helpful to read the graph Source: my imagination Source: essential

Summarising data

Summarising data 2. Summarising the distribution Relative and cumulative frequencies Proportion = relative frequency = number / total number Percentage = relative frequency x 100 Percentage point difference (p.p.) = Percentage(A) – Percentage(B) Percentage change = 100 x (Percentage(A) – Percentage(B)) / Percentage(B) Quantiles = rank from low to high, then divide in groups of equal numbers of observations; quantiles are the % cut-off points. Median = 50% cut-off Percentiles = Cut-offs when subdividing in 100 groups Deciles = Cut-offs when subdividing in 10 groups Quintiles = 5 groups Quartiles = 4 groups
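
The difference between a percentage point difference and a percentage change is easy to get wrong, so a short sketch may help. Percentage change is taken here as the relative change, 100 x (A – B) / B. The function names and the unemployment figures are invented for illustration; note also that software packages differ slightly in how they interpolate quantiles.

```python
import statistics

def percentage(count, total):
    """Relative frequency expressed as a percentage."""
    return 100 * count / total

def pp_difference(pct_a, pct_b):
    """Percentage point difference between two percentages."""
    return pct_a - pct_b

def pct_change(pct_a, pct_b):
    """Relative change of A compared with B, in percent."""
    return 100 * (pct_a - pct_b) / pct_b

# Hypothetical: an unemployment rate rises from 8% (B) to 10% (A).
print(pp_difference(10, 8))   # 2 percentage points
print(pct_change(10, 8))      # 25.0 percent

# Quantiles as cut-off points: 3 quartile cut-offs, 9 decile cut-offs.
incomes = list(range(1, 101))  # hypothetical values 1..100
quartiles = statistics.quantiles(incomes, n=4)
deciles = statistics.quantiles(incomes, n=10)
print(quartiles, statistics.median(incomes))
```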

Summarising data The frequency distribution of equivalised disposable income in Italy in 2017 Source: EU-SILC 2018 UDB, own computations

Summarising data Number of children per household in Italy, EU-SILC 2018 Note: children defined as those aged below 18 years, private households only. Source: EU-SILC 2018 UDB, own computations. What do these numbers mean?

Summarising data Source: https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51

Summarising data

Summarising data 3. Summarising the distribution: dispersion Absolute measures: Maximum – minimum Interquartile range = p75-p25 Standard deviation (=s) Variance (=s²) Relative measures: Decile ratio (D9/D1) Coefficient of variation ( = s / mean) Gini coefficient and many other indicators of inequality Standard deviation of the sample mean, i.e. the standard error (in a sample)
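
The dispersion measures listed on this slide can be sketched as follows. The incomes are invented, the data are unweighted, and the standard error formula assumes simple random sampling. The Gini function uses one common rank-based formula; it is meant as an illustration, not as the method used for any published figure.

```python
import math
import statistics

# Hypothetical incomes, unweighted.
incomes = [800, 1000, 1200, 1500, 1800, 2200, 2600, 3000, 4000, 9000]
n = len(incomes)

# Absolute measures.
value_range = max(incomes) - min(incomes)
q1, q2, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1                                # interquartile range p75 - p25
s = statistics.stdev(incomes)                # sample standard deviation
variance = statistics.variance(incomes)      # s squared

# Relative measures.
deciles = statistics.quantiles(incomes, n=10)
decile_ratio = deciles[8] / deciles[0]       # D9 / D1
cv = s / statistics.fmean(incomes)           # coefficient of variation

# Standard error of the mean, assuming simple random sampling.
se_mean = s / math.sqrt(n)

def gini(values):
    """Gini coefficient: sum_i (2i - n - 1) * x_i / (n * sum(x)),
    with x sorted from low to high and i starting at 1."""
    xs = sorted(values)
    k = len(xs)
    weighted = sum((2 * i - k - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (k * sum(xs))

print(value_range, iqr, round(s, 1), round(cv, 2))
print(round(decile_ratio, 2), round(se_mean, 1), round(gini(incomes), 3))
```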

Summarising data 4. Measures of association between variables Risk ratio (relative risk) = relative frequency(A) / relative frequency(B) Pearson’s correlation coefficient (-1 – 0 – +1), positive vs. negative Spearman’s rank-order coefficient = Pearson’s correlation of the ranks Multiple variables: regression coefficients Regression is a technique to analyse how one or more variables (simultaneously) correlate with a dependent variable Differences in mean values / medians in Y by groups of X
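
To illustrate that Spearman's coefficient is Pearson's correlation of the ranks, the sketch below implements both by hand. The function names and the age and wage values are invented; the wage series is deliberately monotone but non-linear.

```python
def pearson(x, y):
    """Pearson's correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def ranks(values):
    """1-based ranks from low to high; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank-order coefficient = Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: wage rises with age, but not linearly.
age = [20, 30, 40, 50, 60]
wage = [1000, 1500, 2500, 5000, 12000]

print(round(pearson(age, wage), 3))
print(round(spearman(age, wage), 3))
```

Because the relationship is perfectly monotone, the rank-based coefficient reaches 1, while Pearson's coefficient stays below 1 since the relationship is not a straight line.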

Summarising data Cross-tabulation of health status by immigration status for persons between 18 and 65 years old in Italy, EU-SILC 2018 Note: only persons living in private households Source: EU-SILC 2018 UDB, own computations Risk ratio = 10.9 / 12.8 ≈ 0.85 Immigrants are about 15% less likely to have health problems
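
The risk ratio can be checked with a short calculation. The two percentages are the rounded values quoted on the slide; the assignment of 10.9% to immigrants and 12.8% to non-immigrants is inferred from the interpretation given there, and unrounded figures could give a slightly different result.

```python
# Rounded percentages with health problems, as quoted on the slide.
pct_immigrants = 10.9
pct_non_immigrants = 12.8

# Risk ratio (relative risk) = relative frequency(A) / relative frequency(B).
risk_ratio = pct_immigrants / pct_non_immigrants

# Relative difference in percent, and the percentage point difference.
relative_difference = 100 * (risk_ratio - 1)
pp_difference = pct_immigrants - pct_non_immigrants

print(round(risk_ratio, 2))           # 0.85
print(round(relative_difference, 1))  # -14.8
print(round(pp_difference, 1))        # -1.9
```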

Summarising data https://en.wikipedia.org/wiki/Correlation_and_dependence#/media/File:Pearson_Correlation_Coefficient_and_associated_scatterplots.png

Summarising data https://en.wikipedia.org/wiki/Correlation_and_dependence

Summarising data Source: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient (last accessed 23/11/2020) Scatterplots

Summarising data Source: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient (last accessed 23/11/2020)

Summarising data Linear regression Source: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module9-Correlation-Regression/PH717-Module9-Correlation-Regression7.html (last accessed 23/11/2020)
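
A minimal sketch of a linear regression with one independent variable, fitted by ordinary least squares. The function name and the data are invented; the data lie exactly on a straight line so that the result is easy to verify by hand.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    slope = covariance(x, y) / variance(x); intercept = mean(y) - slope * mean(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = ols_fit(x, y)
print(intercept, slope)
```

The slope is read as the expected difference in the dependent variable for a one-unit difference in the independent variable; with real data the points scatter around the line rather than lying on it.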

Summarising data

Summarising data What would be an appropriate way to summarize each of the variables? (typical values, distribution, dispersion) What is the proportion of males? What is the mode of professional status? What is the average wage? What is the median of wage? Would the median or the average be the best way to summarise wage? How would you find out whether there is an association between age and wage; age and agreement; and gender and agreement?

Summarising data There are many more relevant metrics; this was really only scratching the surface By now there is also a sizeable literature on the best way to present data graphically (e.g. do not use pseudo-3D graphs in Excel) Many resources online. My favourite handbook (somewhat more advanced, but takes better account of the data generation process): Heeringa, S. G., West, B. T. and Berglund, P. A. (2010), Applied Survey Data Analysis, Boca Raton: Chapman & Hall/CRC, 467p.

Part 3: (Social) indicators An indicator = a summary statistic which tries to measure a phenomenon or a (past, current or future) state of affairs

Indicators Concept – definition – metric – indicator Concept: description of the phenomenon in relation to related phenomena Definition: exact description which allows one to (theoretically) identify the phenomenon of interest at the exclusion of others Metric: Typically a mathematical formula which expresses how a single or multiple variables will be used and combined to measure the phenomenon Indicator: An implementation of the metric with observed variables / real data (i.e. operationalisation) *Multiple concepts are sometimes measured with the same indicator, often multiple indicators are necessary / possible to fully capture a single concept
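
The step from metric to indicator can be illustrated with a relative income poverty measure. This example is an assumption added for illustration, not taken from the slide: the metric is the share of units with an income below 60% of the median (the convention used for the EU at-risk-of-poverty rate), and the indicator is that metric applied to data. The incomes are invented, unweighted and not equivalised.

```python
import statistics

def at_risk_of_poverty_rate(incomes, threshold_share=0.6):
    """Metric: percentage of units with income below
    threshold_share x median income."""
    threshold = threshold_share * statistics.median(incomes)
    poor = sum(1 for income in incomes if income < threshold)
    return 100 * poor / len(incomes)

# Indicator: the metric implemented with (here hypothetical) observed data.
incomes = [500, 900, 1200, 1500, 1800, 2000, 2400, 3000, 4500, 8000]
rate = at_risk_of_poverty_rate(incomes)
print(rate)  # median 1900, threshold 1140, 2 of 10 below: 20.0
```

Changing `threshold_share` or the income concept gives a different indicator for the same concept, which is one reason why several indicators are often needed to capture a single concept.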

Indicators Quality criteria of individual indicators (Atkinson et al., 2002): Validity: an indicator should identify the essence of the problem and have a clear and accepted normative interpretation (Face validity, transparency & acceptability) Internal vs. external validity => should always be evaluated in function of the objective of the exercise! Reliability: an indicator should be robust and statistically validated Responsiveness: an indicator should be responsive to effective policy interventions but not subject to manipulation

Quality criteria Comparability (Goedemé et al., 2015) Place & time: an indicator should be measurable in a sufficiently comparable way across member states Procedural comparability: the same procedures are implemented for measuring a phenomenon or characteristic at different occasions – different times or different places Substantive comparability (i.e. functional equivalence): the same phenomenon is captured similarly in different (social) contexts Operational feasibility, timeliness and potential for revision

Quality criteria For the portfolio of indicators (Atkinson et al., 2002): Balance across different dimensions (and be comprehensive but selective rather than exhaustive) Mutual consistency of indicators & proportionate weight of indicators Transparency and accessibility

Some points to remember Quantitative data are often helpful, sometimes essential They have to be treated and interpreted carefully A definition is not an indicator, multiple indicators are often necessary to measure a concept Validity, reliability and comparability are key quality characteristics of indicators

References Atkinson, A. B., Cantillon, B., Marlier, E. and Nolan, B. (2002), Social Indicators: The EU and Social Inclusion, Oxford: Oxford University Press, 240p. Chapter 2. Groves, R. M., Fowler, F. J., Couper, M. P., et al. (2009), Survey Methodology (Second edition), New Jersey: John Wiley & Sons. Heeringa, S. G., West, B. T. and Berglund, P. A. (2010), Applied Survey Data Analysis, Boca Raton: Chapman & Hall/CRC, 467p. Punch, K. F. (2014), Introduction to Social Research, London: Sage.
