Statistics in economics and business Week 2.pdf

kalleylee05 18 views 45 slides Aug 27, 2025


STATISTICS
IN ECONOMICS AND BUSINESS
Nguyen Huyen Trang
Faculty of Statistics - National Economics University

LECTURE 2: DATA COLLECTION
•Data measurement
•Source of data
•Sampling process
•Types of sampling

DATA MEASUREMENT
•Nominal
•Ordinal
•Interval
•Ratio

NOMINAL SCALE
•The labels are numerically coded
•There is no logical order among the labels or their numeric codes
Sex Code
Male 1
Female 2

ORDINAL SCALE
The classified data can be ranked or ordered.
Example statement: LEARNING STATISTICS IS INTERESTING!
Response | Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree
Coding 1 | 1 | 2 | 3 | 4 | 5
Coding 2 | 5 | 4 | 3 | 2 | 1

INTERVAL SCALE
Similar to the ordinal level, but differences between data values are equal and meaningful
Rank of each team in each round:
Team | Round 1 | Round 2 | Round 3
A | 1 | 2 | 3
B | 2 | 3 | 1
C | 3 | 1 | 2
Which team is the winner?
Score of each team in each round:
Team | Round 1 | Round 2 | Round 3 | Total
A | 18 | 18 | 16 | 52
B | 15 | 16 | 18 | 49
C | 11 | 19 | 17 | 47
Team A is the winner!
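The two tables above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the first table holds each team's rank per round and the second holds the raw scores (an interpretation of the slide); the variable names are illustrative only.

```python
# Scores and ranks copied from the slide; the dictionary names are hypothetical.
scores = {"A": [18, 18, 16], "B": [15, 16, 18], "C": [11, 19, 17]}
ranks = {"A": [1, 2, 3], "B": [2, 3, 1], "C": [3, 1, 2]}

# Summing ranks (ordinal data) gives a three-way tie, so no winner can be named.
rank_totals = {team: sum(values) for team, values in ranks.items()}

# Summing scores (differences are meaningful) separates the teams.
score_totals = {team: sum(values) for team, values in scores.items()}
winner = max(score_totals, key=score_totals.get)

print(rank_totals)
print(score_totals)
print("Winner:", winner)
```

The rank totals are identical for all three teams, which is why the ordinal table alone cannot answer the question.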

INTERVAL SCALE
There is no natural zero point
→ratios cannot be calculated
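A common textbook illustration of this point, not taken from the slides, is temperature. The sketch below assumes Celsius as the interval scale and Kelvin as the matching ratio scale; the helper name celsius_to_kelvin is hypothetical.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift the zero point from the freezing point of water to absolute zero."""
    return celsius + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on an interval scale: the gap is the same either way.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios depend on where zero sits, so "20 is twice 10" holds only in Celsius.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```

Because the Celsius zero is arbitrary, the Celsius ratio of 2 says nothing about how much "hotter" 20 degrees is than 10 degrees.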

RATIO SCALE
The interval level with a natural zero starting point
Every arithmetic operation can be used, including ratios

LEVEL OF MEASUREMENT
Type | Level | What you can do
Qualitative (Categorical) | Nominal (coded by numbers) | Listing, Grouping
Qualitative (Categorical) | Ordinal (used to rank) | Listing, Grouping, Sorting, maybe ±
Quantitative (Scale): Discrete or Continuous | Interval, Ratio | Listing, Grouping, Sorting, Math operations: ±, ×, ÷, …

EXERCISE 1 – GROUP WORK
What type of data and measurement scale would each of the following represent?
(1) What is your favorite sport?
(2) Do you like opera?
(3) How many hours per week do you watch television?
(4) What kind of music do you like?
(5) To what degree do you enjoy reading novels?
(6) On a scale from 1 (Dislike) to 7 (Like), how much do you like Italian food?
(7) In what state were you born?

EXERCISE 1 – GROUP WORK
Place these variables in the following classification table:
Nominal
Ordinal
Discrete Continuous
Interval
Ratio

EXERCISE 2
What is the level of measurement for each of the following variables?
A. student’s major
B. distance students travel to class
C. student scores on the first statistics test
D. a classification of students by state of birth
E. a ranking of students as freshman, sophomore, junior, and senior
F. number of hours students study per week

SOURCES OF DATA
Primary source: collected for the particular purpose
Secondary source: already exists, collected for some other purpose
Both must be:
•Relevant
•Accurate
•Current
•Impartial


PRIMARY VS SECONDARY DATA
GROUP WORK
What are the advantages and disadvantages of primary and secondary data?

PRIMARY VS SECONDARY DATA?
•Focus group
•Statistical Yearbook of Vietnam
•Survey
•Interview
•Trade Association Report

POPULATION VS SAMPLE
Population:
The set of all elements of interest
N represents the population size, which may be infinite
Sample:
A part of the population that is selected to represent the entire group
n represents the sample size, which is finite

CENSUS VS SAMPLING
A census is a study of every unit, everyone or everything, in a population
Sampling is a method of studying a few selected units instead of every unit in the population

REASONS TO TAKE A SAMPLE
•Collecting information from the entire population is sometimes impossible
•Enables research/surveys to be done more quickly and in a timely manner
•Less expensive and often more accurate than a large census
•Allows for minimal damage or loss
•Can be used to validate census data

AN IMPORTANT REQUIREMENT
A sample must be representative of the population.

SAMPLING PROCESS
i. Define Population
ii. Specify Sampling Frame
iii. Determine Sampling Method: Probability Sampling or Non-Probability Sampling
iv. Determine Appropriate Sample Size
v. Execute Sampling Design

MOVING FROM POPULATION TO SAMPLE
Population
→Sampling frame (a list of all items of the population)
→Sample

TYPES OF SAMPLING

PROBABILITY VS NON-PROBABILITY SAMPLING
Feature | Probability sampling | Non-probability sampling
Meaning | Every subject of the population has a known, non-zero chance of being selected for the sample | The researcher selects the sample based on subjective judgment rather than random selection
Alternately known as | Random sampling | Non-random sampling
Basis of selection | Randomly | Arbitrarily
Opportunity of selection | Fixed and known | Not specified and unknown
Research | Conclusive | Exploratory
Result | Unbiased | Biased
Method | Objective | Subjective
Inferences | Statistical | Analytical
Hypothesis | Tested | Generated

PROBABILITY SAMPLING

SIMPLE RANDOM SAMPLING
•Informal method: random picking. This is the easiest way and can be applied to a small population (picking a name out of a hat, choosing the short straw, a lottery draw, …)
•Formal method: use a table of random numbers or a software program

SIMPLE RANDOM SAMPLING
•Five steps in applying this method
i. Obtain a complete sampling frame
ii. Give each case a unique number starting at one
iii. Decide on the required sample size
iv. Select numbers for the sample size from a table of random numbers
v. Select the cases that correspond to the randomly chosen numbers
•Example: Randomly call a few students to take attendance
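The five steps above can be sketched with a software random number generator in place of a printed table. This is only an illustration: the function name simple_random_sample, the class list, and the seed are assumptions, and Python's standard random module stands in for whatever program is actually used.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n cases from a complete sampling frame without replacement."""
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    # Step ii: give each case a unique number starting at one.
    numbered = {number: case for number, case in enumerate(frame, start=1)}
    # Step iv: select n distinct numbers at random.
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)
    # Step v: select the cases that correspond to the chosen numbers.
    return [numbered[number] for number in sorted(chosen_numbers)]

# Hypothetical class list of 40 students (step i), sample size 5 (step iii).
students = [f"Student {i}" for i in range(1, 41)]
called = simple_random_sample(students, n=5, seed=42)
print(called)
```

Fixing the seed makes the draw reproducible, which is useful when the selection has to be documented.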

TABLE OF RANDOM NUMBERS
54033935397490257237839400383070718700154548745727980851451238614
92744532239060836942713827136865638241139237439008765537928614332
17716956902158444015676229532821217209443022673254405063880850946
99153066304828763905436109753715845172953932721392847397407180258
32607841095616987115942179304181437842233892577017804827078893096
25123113078887615580354701526692263495085960382354937822477557586
62173290616858276463262616861677488615331677798307562492997096282
60706305347561481804102397653551098788062405943888305213011918724
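One common way to read such a table is to take the digits in fixed-width groups and keep only the numbers that fall inside the numbered frame. The sketch below applies that reading to the first row of the table above; the population size of 40, the sample size of 5, and the function name pick_from_random_table are assumptions made for illustration.

```python
def pick_from_random_table(digits, population_size, n, width=2):
    """Read width-digit numbers left to right, skipping out-of-range values and repeats."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == n:
            break
    return chosen

# First row of the table of random numbers shown above.
FIRST_ROW = "54033935397490257237839400383070718700154548745727980851451238614"

selected = pick_from_random_table(FIRST_ROW, population_size=40, n=5)
print(selected)
```

Numbers above the population size (such as 54 or 74) and repeated numbers are simply passed over, so every case in the frame keeps the same chance of selection.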

WHEN DO WE APPLY THIS?
▪Have a good sampling frame
▪Population is geographically concentrated
▪Data collection technique does not involve travelling

SYSTEMATIC RANDOM SAMPLING
Choose every k-th individual to be a part of the sample

SYSTEMATIC RANDOM SAMPLING
Steps to obtain a systematic sample:
•Obtain a sampling frame
•Determine the population size: N
•Determine the sample size required: n
•Divide the population of N individuals into groups of k individuals, where k = N/n
•Randomly select one individual from the 1st group
•Select every k-th individual thereafter
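The steps above can be sketched as follows. This is a minimal illustration, assuming k is rounded down when N is not a multiple of n; the function name systematic_sample and the numbered frame of 100 units are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Return the interval k and a systematic sample of n cases from the frame."""
    population_size = len(frame)
    if not 0 < n <= population_size:
        raise ValueError("sample size must be between 1 and the frame size")
    k = population_size // n            # sampling interval k = N / n, rounded down
    rng = random.Random(seed)
    start = rng.randint(1, k)           # random position within the first group
    positions = [start + i * k for i in range(n)]  # every k-th position thereafter
    return k, [frame[position - 1] for position in positions]

# Hypothetical frame of 100 numbered units, sample of 10.
frame = list(range(1, 101))
k, chosen = systematic_sample(frame, n=10, seed=0)
print(k, chosen)
```

Only the starting point is random; every later selection is fixed by k, which is why a periodic pattern in the frame can bias the sample.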

STRATIFIED RANDOM SAMPLING
•Population is divided into two or more groups called strata
•Subsamples are randomly selected from each stratum

STRATIFIED RANDOM SAMPLING
•The sampling procedure is more complicated
•Steps to take a stratified sample:
i. Select the stratifying variable
ii. Divide the sampling frame into strata or categories
iii. Draw a systematic or random sample of each stratum
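The steps above can be sketched as follows. The example assumes proportional allocation (the same sampling fraction in every stratum), which is one common choice rather than something the slides prescribe; the function name stratified_sample and the student data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    # Divide the sampling frame into strata using the stratifying variable.
    strata = defaultdict(list)
    for case in frame:
        strata[stratum_of(case)].append(case)
    # Draw a random sample within each stratum (at least one case per stratum).
    sample = {}
    for name, members in strata.items():
        size = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical frame: 60 freshmen and 40 seniors, stratified by year.
students = [(f"S{i:03d}", "freshman" if i <= 60 else "senior") for i in range(1, 101)]
by_year = stratified_sample(students, stratum_of=lambda s: s[1], fraction=0.1, seed=1)
print({year: len(cases) for year, cases in by_year.items()})
```

With a 10% fraction, each year group appears in the sample in the same proportion as in the frame.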

CLUSTER RANDOM SAMPLING
▪Divide the population into separate groups, called clusters.
▪Two types of cluster sampling:
•One-stage cluster
•Two-stage cluster

CLUSTER RANDOM SAMPLING
▪One-stage cluster
•Randomly select subsets
•Sample all participants in the selected subsets
▪Two-stage cluster
•Randomly select subsets
•Conduct simple random sampling of participants in the selected subsets
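The difference between the two variants can be sketched in one function. This is an illustration under assumed names: cluster_sample, its per_cluster parameter, and the five classes of eight students are all hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """One-stage if per_cluster is None, otherwise two-stage cluster sampling."""
    rng = random.Random(seed)
    # Stage 1: randomly select whole clusters (subsets).
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            # One-stage: take every participant in the selected cluster.
            sample.extend(members)
        else:
            # Two-stage: simple random sample inside the selected cluster.
            sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen, sample

# Hypothetical population: five classes of eight students each.
clusters = {f"Class {c}": [f"{c}{i}" for i in range(1, 9)] for c in "ABCDE"}
one_stage_clusters, one_stage = cluster_sample(clusters, n_clusters=2, seed=3)
two_stage_clusters, two_stage = cluster_sample(clusters, n_clusters=2, per_cluster=3, seed=3)
print(len(one_stage), len(two_stage))
```

Only the selected classes need a full member list, which is why cluster sampling works when no complete list of individual units exists.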

MULTI-STAGE RANDOM SAMPLING
•A complex form of cluster and stratified sampling
•Carried out in stages
•Using smaller and smaller sampling units at each stage

PROBABILITY SAMPLING
Technique | Advantages | Disadvantages
Random | Easy to conduct; requires no additional information except the contact info; meets the assumptions of many statistical procedures | Identification of all members of the population can be difficult; can be expensive and unfeasible for a large population
Systematic | Easy to construct, execute, compare, and understand; spread over the population | High sampling bias if periodicity exists
Stratified | More accurate sample; effective representation of all subgroups | Problematic if strata are not clearly defined; complex to apply in practice
Cluster | Time efficient; cost efficient (reduces field cost); applicable where no complete list of units is available | May not be representative of the whole population

NON-PROBABILITY SAMPLING
The process of selecting
sample without using
statistical probability
theory

QUOTA SAMPLING
▪Similar to stratified sampling: population is divided into subsets
▪Select the participants from each subset based on a specified proportion
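A rough sketch of how quotas are filled in practice: respondents are taken in whatever order they become available until each subset reaches its quota, with no random selection involved. The function name quota_sample, the respondent list, and the quotas are hypothetical.

```python
def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled, so stop recruiting
    return sample

# Hypothetical stream of respondents as (name, sex), with a quota of 2 per group.
respondents = [("p1", "M"), ("p2", "M"), ("p3", "F"), ("p4", "M"), ("p5", "F"), ("p6", "F")]
sample = quota_sample(respondents, quotas={"M": 2, "F": 2}, group_of=lambda p: p[1])
print(sample)
```

Unlike stratified sampling, who ends up in each subset depends on arrival order and the interviewer's choices, so selection probabilities are unknown.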

PURPOSIVE SAMPLING
-Also known as: Judgmental Sampling, Selective Sampling, Subjective Sampling
-Relies on the judgement of the researcher

VOLUNTEER SAMPLING
▪Participants self-select to become part of a study because
they volunteer when asked, or respond to an advert
▪Two types of volunteer sampling:
-Snowball
-Self selection

SNOWBALL SAMPLING
▪Known as network or chain-referral sampling
▪Existing participants recruit future participants among their acquaintances
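Chain referral can be pictured as walking outward through a network from a few initial participants. The sketch below is one simple way to model it, assuming each participant names a fixed list of acquaintances; the function name snowball_sample and the referral network are hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Recruit seeds first, then the people they refer, until the target is reached."""
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        sample.append(person)
        # Each new participant passes on the names of their acquaintances.
        queue.extend(referrals.get(person, []))
    return sample

# Hypothetical referral network: who each participant would recommend.
referrals = {"An": ["Binh", "Chi"], "Binh": ["Dung"], "Chi": ["An", "Em"]}
recruited = snowball_sample(referrals, seeds=["An"], target_size=4)
print(recruited)
```

Everyone recruited is reachable from the starting participant, which illustrates how a single network can end up over-represented.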

SELF SELECTION SAMPLING
▪Individuals identify their wish to take part in the study
▪Individuals volunteer to be part of the sample

CONVENIENCE SAMPLING
▪Known as Haphazard or Accidental sampling
▪Sample units are only selected if they can be accessed easily and
conveniently

NON-PROBABILITY SAMPLING
Technique | Advantages | Disadvantages
Quota | Low cost, time and administration; no need for a list of population elements | Dependent on subjective decisions; not possible to generalise
Purposive | Selects only individuals who are relevant to the research purpose; less costly, more convenient | No guarantee that the chosen sample is truly representative of the population; limited generalizability
Volunteer | Volunteers may have an interest in the subject, so they are less likely to give biased information; doesn’t require a lot of screening | Over-representation of a particular network; takes a long time to get enough people to do the experiment
Convenience | High levels of simplicity and ease; less time and cost required; useful in pilot studies | Highest level of sampling error; sample is not representative of the population