PERTEMUAN-01-02 mengenai probabilitas statistika ekonomi dan umum.ppt
BayuYakti1
123 slides
Sep 10, 2024
Statistics for Business and
Economics
Chapter 1
Statistics, Data, &
Statistical Thinking
Learning Objectives
1.Define Statistics
2.Describe the Uses of Statistics
3.Distinguish Descriptive & Inferential Statistics
4.Define Population, Sample, Parameter, and
Statistic
5.Define Quantitative and Qualitative Data
6.Define Random Sample
Descriptive Statistics
1.Involves
•Collecting Data
•Presenting Data
•Characterizing Data
2.Purpose
•Describe Data
[Figure: sample summary measures X̄ = 30.5 and S² = 113, with a chart of $ by quarter (Q1 to Q4) on a 0 to 50 scale]
Inferential Statistics
1.Involves
•Estimation
•Hypothesis Testing
2.Purpose
•Make decisions about population characteristics
Key Terms
1.Population (Universe)
• All items of interest
2.Sample
• Portion of population
3.Parameter
• Summary measure about population
4.Statistic
• Summary measure about sample
•P in Population & Parameter
•S in Sample & Statistic
Types of Data
•Quantitative Data
•Qualitative Data
Quantitative Data
Measured on a numeric
scale.
• Number of defective
items in a lot.
• Salaries of CEOs of
oil companies.
• Ages of employees at
a company.
Qualitative Data
Classified into categories.
• College major of each
student in a class.
• Gender of each employee
at a company.
• Method of payment
(cash, check, credit card).
Random Sample
Every sample of size n has an equal chance of
selection.
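The definition above can be illustrated with a minimal sketch. The population of 100 employee IDs, the sample size, and the seed are all hypothetical values chosen for illustration; `random.sample` draws without replacement, so every subset of size n is equally likely to be chosen.

```python
import random

# Hypothetical population: employee IDs 1..100 (illustrative only).
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw can be repeated
n = 10                   # sample size, also an assumption

# Draw n distinct items; each subset of size n has the same chance.
sample = rng.sample(population, n)

print(sample)
```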
Conclusion
1.Defined Statistics
2.Described the Uses of Statistics
3.Distinguished Descriptive & Inferential
Statistics
4.Defined Population, Sample, Parameter,
and Statistic
5.Defined Quantitative and Qualitative Data
6.Defined Random Sample
Probability & Statistics
Methods for Describing
Sets of Data
Learning Objectives
1.Describe Qualitative Data Graphically
2.Describe Quantitative Data Graphically
3.Explain Numerical Data Properties
4.Describe Summary Measures
5.Analyze Numerical Data Using
Summary Measures
Thinking Challenge
Our market share far
exceeds all
competitors! - VP
[Figure: bar chart of market share for Us, Y, and X, with a vertical axis running from 30% to 36%]
Data Presentation
•Qualitative Data: Summary Table, Bar Graph, Pie Chart, Pareto Diagram
•Quantitative Data: Stem-&-Leaf Display, Frequency Distribution, Histogram
Presenting
Qualitative Data
Summary Table
1.Lists categories & number of elements in category
2.Obtained by tallying responses in category
3.May show frequencies (counts), % or both
(Each row is a category; counts come from the tally.)
Major        Count
Accounting     130
Economics       20
Management      50
Total          200
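As a rough sketch of the tallying step, the snippet below rebuilds a list of responses from the counts in the table (the individual responses are hypothetical, since the slide only gives totals) and derives both frequencies and percentages.

```python
from collections import Counter

# Hypothetical raw responses reconstructed from the slide's counts.
responses = ["Accounting"] * 130 + ["Economics"] * 20 + ["Management"] * 50

counts = Counter(responses)          # tally responses per category
total = sum(counts.values())
percents = {major: 100 * c / total for major, c in counts.items()}

for major, c in counts.items():
    print(f"{major:<12}{c:>6}{percents[major]:>8.1f}%")
print(f"{'Total':<12}{total:>6}")
```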
Bar Graph
[Figure: bar graph of frequency (0 to 150) by major: Acct., Econ., Mgmt.]
•Vertical bars for qualitative variables
•Bar height shows frequency or %
•Zero point
•Percent used also
•Equal bar widths
Pie Chart
1.Shows breakdown of total quantity into categories
2.Useful for showing relative differences
3.Angle size
•(360°)(percent)
[Figure: pie chart of majors: Acct. 65%, Mgmt. 25%, Econ. 10%; the Econ. slice spans (360°)(10%) = 36°]
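The angle rule (360°)(percent) can be applied to every slice at once. This is a small sketch using the percentages shown for the majors example; the variable names are illustrative.

```python
# Percent shares from the majors example.
shares = {"Acct.": 65, "Econ.": 10, "Mgmt.": 25}

# Angle of each slice: 360 degrees times the share expressed as a fraction.
angles = {major: 360 * pct / 100 for major, pct in shares.items()}

for major, angle in angles.items():
    print(f"{major:<6} {angle:.0f} degrees")
```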
Pareto Diagram
Like a bar graph, but with the categories arranged by
height in descending order from left to right.
[Figure: Pareto diagram of frequency (0 to 150) by major, in the order Acct., Mgmt., Econ.]
•Vertical bars for qualitative variables
•Bar height shows frequency or %
•Zero point
•Percent used also
•Equal bar widths
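The only step that separates a Pareto diagram from a bar graph is the ordering of categories by height. A minimal sketch, using the counts from the majors example and a crude text bar (one `#` per 10 students, an arbitrary choice made here):

```python
# Counts from the majors example.
counts = {"Accounting": 130, "Economics": 20, "Management": 50}

# Arrange categories by count, largest first (left to right in the diagram).
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for major, count in pareto_order:
    print(f"{major:<12}{'#' * (count // 10)} {count}")
```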
Thinking Challenge
You’re an analyst for IRI. You want to show the
market shares held by Web browsers in 2006.
Construct a bar graph, pie chart, & Pareto diagram
to describe the data.
Browser Mkt. Share (%)
Firefox 14
Internet Explorer 81
Safari 4
Others 1
Bar Graph Solution*
[Figure: bar graph of market share (%) by browser, in the order Firefox, Internet Explorer, Safari, Others; vertical axis from 0% to 100%]
Pie Chart Solution*
[Figure: pie chart of market share: Internet Explorer 81%, Firefox 14%, Safari 4%, Others 1%]
Pareto Diagram Solution*
[Figure: Pareto diagram of market share (%) by browser, in the order Internet Explorer, Firefox, Safari, Others; vertical axis from 0% to 100%]
Presenting
Quantitative Data
Stem-and-Leaf Display
1. Divide each observation
into stem value and leaf
value
• Stem value defines
class
• Leaf value defines
frequency (count)
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem | Leaves
2 | 1 4 4 6 7 7
3 | 0 2 8
4 | 1
(For example, 26 has stem 2 and leaf 6.)
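A stem-and-leaf display for two-digit values can be sketched as follows. The helper name `stem_and_leaf` is hypothetical, and the code assumes the stem is the tens digit and the leaf is the units digit, as in the example data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group integers by tens digit (stem) and units digit (leaf)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)


data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)

for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The number of leaves on each stem is the class frequency, so the display doubles as a sideways histogram.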
Frequency Distribution
Table Steps
1.Determine range
2.Select number of classes
• Usually between 5 & 15 inclusive
3.Compute class intervals (width)
4.Determine class boundaries (limits)
5.Compute class midpoints
6.Count observations & assign to classes
1. Determine the range
Range (R) = highest value – lowest value
2. Number of classes
C = 1 + (10/3) × log N (N = number of observations; log is base 10)
3. Class interval (width)
CI = R / C (rounded)
a. Measured as upper class limit – lower class limit
b. Or between the lower class boundary and the upper class boundary
4. Class boundaries
Lowest boundary <= lowest value
Highest boundary >= highest value
New lowest boundary = lowest boundary – x (any value)
New highest boundary = highest boundary + x (any value)
5. The new CI should be a rounded (integer) value
6. Class midpoint
CM = (lower boundary + upper boundary) / 2
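The first three steps can be sketched numerically. The snippet assumes the logarithm in the class-count rule is base 10 and uses the ten raw values from the frequency distribution example in these slides; variable names are illustrative.

```python
import math

data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

# Step 1: range = highest value - lowest value.
data_range = max(data) - min(data)

# Step 2: suggested number of classes, C = 1 + (10/3) * log10(N).
n_classes = 1 + (10 / 3) * math.log10(len(data))

# Step 3: class interval before rounding, CI = R / C.
width = data_range / n_classes

print(data_range, round(n_classes, 2), round(width, 2))
```

The rule is only a starting point: the worked example in these slides uses three classes of width 10, which keeps the boundaries simple.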
Frequency Distribution Table
Example
Raw Data: 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Class (Boundaries)   Midpoint   Frequency
15.5 – 25.5          20.5       3
25.5 – 35.5          30.5       5
35.5 – 45.5          40.5       2
Midpoint = (Lower + Upper Boundaries) / 2
Width = upper boundary – lower boundary (here 25.5 – 15.5 = 10)
Relative Frequency &
% Distribution Tables
Relative Frequency Distribution
Class          Prop.
15.5 – 25.5    .3
25.5 – 35.5    .5
35.5 – 45.5    .2
Percentage Distribution
Class          %
15.5 – 25.5    30.0
25.5 – 35.5    50.0
35.5 – 45.5    20.0
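The frequency, relative frequency, and percentage columns can all be derived from the raw data and the class boundaries. A minimal sketch, assuming a value belongs to a class when it lies above the lower boundary and at or below the upper boundary (no data value falls exactly on a boundary here, so the convention does not affect the result):

```python
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
boundaries = [15.5, 25.5, 35.5, 45.5]

rows = []
for lower, upper in zip(boundaries, boundaries[1:]):
    frequency = sum(lower < x <= upper for x in data)  # count values in class
    rows.append({
        "class": (lower, upper),
        "midpoint": (lower + upper) / 2,
        "frequency": frequency,
        "proportion": frequency / len(data),
        "percent": 100 * frequency / len(data),
    })

for row in rows:
    lower, upper = row["class"]
    print(f"{lower} - {upper}  {row['midpoint']:>5}  {row['frequency']:>2}  "
          f"{row['proportion']:.1f}  {row['percent']:.1f}")
```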
Numerical Data Properties
Thinking Challenge
... employees cite low pay --
most workers earn only
$20,000.
... President claims average
pay is $70,000!
[Figure: pay levels shown: $400,000; $70,000; $50,000; $30,000; $20,000]
Standard Notation
Measure              Sample   Population
Mean                 X̄        μ
Standard Deviation   S        σ
Variance             S²       σ²
Size                 n        N
Numerical Data Properties
Central Tendency
(Location)
Variation
(Dispersion)
Shape
Numerical Data Properties & Measures
•Central Tendency: Mean, Median, Mode
•Variation: Range, Interquartile Range, Variance, Standard Deviation
•Relative Standing: Percentiles, Z–scores
Central Tendency
Mean
1.Measure of central tendency
2.Most common measure
3.Acts as ‘balance point’
4.Affected by extreme values (‘outliers’)
5.Formula (sample mean)
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Mean Example
Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + X_3 + X_4 + X_5 + X_6}{6} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$$
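The same calculation as a short sketch, using the six raw values of the example:

```python
data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

# Sample mean: sum of the observations divided by the sample size n.
mean = sum(data) / len(data)

print(round(mean, 2))
```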
Median
1.Measure of central tendency
2.Middle value in ordered sequence
• If n is odd, middle value of sequence
• If n is even, average of 2 middle values
3.Position of median in sequence: Positioning Point = (n + 1) / 2
4.Not affected by extreme values
Median Example
Odd-Sized Sample
•Raw Data: 24.1  22.6  21.5  23.7  22.6
•Ordered:  21.5  22.6  22.6  23.7  24.1
•Position: 1  2  3  4  5
Positioning Point = (n + 1) / 2 = (5 + 1) / 2 = 3.0
Median = 22.6
Median Example
Even-Sized Sample
•Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
•Ordered:  4.9  6.3  7.7  8.9  10.3  11.7
•Position: 1  2  3  4  5  6
Positioning Point = (n + 1) / 2 = (6 + 1) / 2 = 3.5
Median = (7.7 + 8.9) / 2 = 8.30
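Both median examples follow the same procedure: order the data, find the positioning point (n + 1)/2, and average the two neighbouring values when that point falls between positions. A sketch with a hypothetical helper `median_by_position`:

```python
def median_by_position(values):
    """Return (positioning point, median) using the (n + 1) / 2 rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered sequence
    if position.is_integer():
        # Odd n: the median is the single middle value.
        return position, ordered[int(position) - 1]
    # Even n: average the two values on either side of the position.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return position, (lower + upper) / 2


odd = median_by_position([24.1, 22.6, 21.5, 23.7, 22.6])
even = median_by_position([10.3, 4.9, 8.9, 11.7, 6.3, 7.7])

print(odd, even)
```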
Mode
1.Measure of central tendency
2.Value that occurs most often
3.Not affected by extreme values
4.May be no mode or several modes
5.May be used for quantitative or qualitative
data
Mode Example
•No Mode
Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
•One Mode
Raw Data: 6.3  4.9  8.9  6.3  4.9  4.9
•More Than 1 Mode
Raw Data: 21  28  28  41  43  43
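The three cases can be checked with a small counting sketch. The helper `modes` is hypothetical, and it adopts one convention as an assumption: when no value occurs more than once, the data set is reported as having no mode.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumed convention: no repeated value means no mode
    return sorted(v for v, c in counts.items() if c == highest)


no_mode = modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7])
one_mode = modes([6.3, 4.9, 8.9, 6.3, 4.9, 4.9])
two_modes = modes([21, 28, 28, 41, 43, 43])

print(no_mode, one_mode, two_modes)
```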
Thinking Challenge
You’re a financial analyst
for Prudential-Bache
Securities. You have
collected the following
closing stock prices of new
stock issues: 17, 16, 21, 18,
13, 16, 12, 11.
Describe the stock prices
in terms of central
tendency.
Central Tendency Solution*
Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_8}{8} = \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = 15.5$$
Central Tendency Solution*
Median
•Raw Data: 17  16  21  18  13  16  12  11
•Ordered:  11  12  13  16  16  17  18  21
•Position: 1  2  3  4  5  6  7  8
Positioning Point = (n + 1) / 2 = (8 + 1) / 2 = 4.5
Median = (16 + 16) / 2 = 16
Central Tendency Solution*
Mode
Raw Data: 17  16  21  18  13  16  12  11
Mode = 16
Summary of
Central Tendency Measures
Measure   Formula              Description
Mean      ΣXi / n              Balance Point
Median    (n + 1) / 2 Position Middle Value When Ordered
Mode      none                 Most Frequent
Shape
Shape
1.Describes how data are distributed
2.Measures of Shape
• Skew = Symmetry
•Left-Skewed: Mean < Median
•Symmetric: Mean = Median
•Right-Skewed: Median < Mean
Variation
Range
1.Measure of dispersion
2.Difference between largest & smallest
observations
Range = X_largest – X_smallest
3.Ignores how data are distributed
[Figure: two data sets, each plotted on an axis from 7 to 10; for both, Range = 10 – 7 = 3]
Variance &
Standard Deviation
1.Measures of dispersion
2.Most common measures
3.Consider how data are distributed
4.Show variation about mean (X̄ or μ)
[Figure: number line from 4 to 12 with X̄ = 8.3 marked]
Sample Variance Formula
n – 1 in denominator! (Use N if Population Variance)
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$$
Sample Standard Deviation
Formula
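$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}$$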
Variance Example
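Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 8.3$$
$$S^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \cdots + (7.7 - 8.3)^2}{6 - 1} = 6.368$$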
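The variance example can be reproduced step by step. This sketch follows the sample formula directly, with n – 1 in the denominator; variable names are illustrative.

```python
data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

n = len(data)
mean = sum(data) / n

# Squared deviation of each observation from the sample mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance: divide by n - 1 (use n for a population variance).
sample_variance = sum(squared_deviations) / (n - 1)

print(round(sample_variance, 3))
```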
Thinking Challenge
•You’re a financial analyst
for Prudential-Bache
Securities. You have
collected the following
closing stock prices of
new stock issues: 17, 16,
21, 18, 13, 16, 12, 11.
•What are the variance
and standard deviation
of the stock prices?
Variation Solution*
Sample Variance
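Raw Data: 17  16  21  18  13  16  12  11
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 15.5$$
$$S^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8 - 1} = 11.14$$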
Variation Solution*
Sample Standard Deviation
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{11.14} = 3.34$$
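As a cross-check of the stock price solution, Python's standard `statistics` module provides `variance` and `stdev`, both of which use the sample (n – 1) denominator.

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

variance = statistics.variance(prices)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(prices)      # square root of the sample variance

print(round(variance, 2), round(std_dev, 2))
```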
Summary of
Variation Measures
Measure                           Formula                                        Description
Range                             X_largest – X_smallest                         Total Spread
Standard Deviation (Sample)       $\sqrt{\sum (X_i - \bar{X})^2 / (n - 1)}$      Dispersion about Sample Mean
Standard Deviation (Population)   $\sqrt{\sum (X_i - \mu)^2 / N}$                Dispersion about Population Mean
Variance (Sample)                 $\sum (X_i - \bar{X})^2 / (n - 1)$             Squared Dispersion about Sample Mean
Interpreting Standard
Deviation
Interpreting Standard Deviation:
Chebyshev’s Theorem
•Applies to any shape data set
•No useful information about the fraction of data in the interval x̄ – s to x̄ + s
•At least 3/4 of the data lies in the interval x̄ – 2s to x̄ + 2s
•At least 8/9 of the data lies in the interval x̄ – 3s to x̄ + 3s
•In general, for k > 1, at least 1 – 1/k² of the data lies in the interval x̄ – ks to x̄ + ks
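The general bound 1 – 1/k² is easy to tabulate. A minimal sketch with a hypothetical helper name; it rejects k <= 1 because the theorem gives no useful information there.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 4))
```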
Interpreting Standard Deviation:
Chebyshev’s Theorem
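[Figure: number line marked x̄ – 3s, x̄ – 2s, x̄ – s, x̄, x̄ + s, x̄ + 2s, x̄ + 3s. Within 1s: no useful information. Within 2s: at least 3/4 of the data. Within 3s: at least 8/9 of the data.]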
Chebyshev’s Theorem Example
•Previously we found the mean
closing stock price of new stock
issues is 15.5 and the standard
deviation is 3.34.
•Use this information to form an
interval that will contain at least
75% of the closing stock prices of
new stock issues.
Chebyshev’s Theorem Example
At least 75% of the closing stock prices of new stock issues will lie within 2 standard deviations of the mean.
x̄ = 15.5, s = 3.34
(x̄ – 2s, x̄ + 2s) = (15.5 – 2∙3.34, 15.5 + 2∙3.34)
= (8.82, 22.18)
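The interval can be computed and then compared with the eight closing prices from the earlier stock example. This is only a sketch; the comparison against the raw prices is an added check, not part of the slide's solution.

```python
mean, s, k = 15.5, 3.34, 2

# Interval of k standard deviations around the mean.
interval = (mean - k * s, mean + k * s)

# Closing prices from the stock example, used to check the guarantee.
prices = [17, 16, 21, 18, 13, 16, 12, 11]
inside = sum(interval[0] <= p <= interval[1] for p in prices)
fraction_inside = inside / len(prices)

print(tuple(round(b, 2) for b in interval), fraction_inside)
```

Chebyshev's theorem guarantees at least 75% for k = 2; the observed fraction may be higher, as it is for this small data set.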
Interpreting Standard Deviation:
Empirical Rule
•Applies to data sets that are mound shaped and symmetric
•Approximately 68% of the measurements lie in the
interval μ – σ to μ + σ
•Approximately 95% of the measurements lie in the
interval μ – 2σ to μ + 2σ
•Approximately 99.7% of the measurements lie in the
interval μ – 3σ to μ + 3σ
Interpreting Standard Deviation:
Empirical Rule
μ – 3σ μ – 2σ μ – σ μ μ + σ μ +2σ μ + 3σ
Approximately 68% of the measurements
Approximately 95% of the measurements
Approximately 99.7% of the measurements
Empirical Rule Example
Previously we found the mean
closing stock price of new
stock issues is 15.5 and the
standard deviation is 3.34. If
we can assume the data is
symmetric and mound shaped,
calculate the percentage of the
data that lie within the intervals
x̄ ± s, x̄ ± 2s, x̄ ± 3s.
Empirical Rule Example
According to the Empirical Rule, approximately 68% of
the data will lie in the interval (x̄ – s, x̄ + s),
(15.5 – 3.34, 15.5 + 3.34) = (12.16, 18.84)
Approximately 95% of the data will lie in the interval
(x̄ – 2s, x̄ + 2s),
(15.5 – 2∙3.34, 15.5 + 2∙3.34) = (8.82, 22.18)
Approximately 99.7% of the data will lie in the interval
(x̄ – 3s, x̄ + 3s),
(15.5 – 3∙3.34, 15.5 + 3∙3.34) = (5.48, 25.52)
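The three intervals follow one pattern, so they can be generated in a loop. A sketch assuming, as the example does, that the data are mound shaped and symmetric:

```python
mean, s = 15.5, 3.34

# Approximate coverage (%) within k standard deviations, per the Empirical Rule.
coverage = {1: 68.0, 2: 95.0, 3: 99.7}

intervals = {
    k: (round(mean - k * s, 2), round(mean + k * s, 2)) for k in coverage
}

for k, (low, high) in intervals.items():
    print(f"about {coverage[k]}% of the data in ({low}, {high})")
```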
Numerical Measures of
Relative Standing
Numerical Measures of
Relative Standing: Percentiles
•Describes the relative location of a
measurement compared to the rest of the data
•The p-th percentile is a number such that p% of
the data falls below it and (100 – p)% falls
above it
•Median = 50th percentile
Percentile Example
•You scored 560 on the GMAT exam. This
score puts you in the 58th percentile.
•What percentage of test takers scored lower
than you did?
•What percentage of test takers scored higher
than you did?
Percentile Example
•What percentage of test takers scored lower
than you did?
58% of test takers scored lower than 560.
•What percentage of test takers scored higher
than you did?
(100 – 58)% = 42% of test takers scored
higher than 560.
Numerical Measures of
Relative Standing: Z–Scores
•Describes the relative location of a
measurement compared to the rest of the data
Sample z–score: $z = \frac{x - \bar{x}}{s}$
Population z–score: $z = \frac{x - \mu}{\sigma}$
Measures the number of standard deviations
away from the mean a data value is located
Z–Score Example
•The mean time to assemble a
product is 22.5 minutes with a
standard deviation of 2.5 minutes.
•Find the z–score for an item that
took 20 minutes to assemble.
•Find the z–score for an item that
took 27.5 minutes to assemble.
Z–Score Example
x = 20, μ = 22.5, σ = 2.5
$$z = \frac{x - \mu}{\sigma} = \frac{20 - 22.5}{2.5} = -1.0$$
x = 27.5, μ = 22.5, σ = 2.5
$$z = \frac{x - \mu}{\sigma} = \frac{27.5 - 22.5}{2.5} = 2.0$$
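Both z-scores come from the same one-line formula. A minimal sketch with a hypothetical helper `z_score`:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


z_fast = z_score(20, 22.5, 2.5)    # item assembled in 20 minutes
z_slow = z_score(27.5, 22.5, 2.5)  # item assembled in 27.5 minutes

print(z_fast, z_slow)
```

A negative z-score marks a value below the mean and a positive one a value above it.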
Quartiles & Box Plots
Quartiles
1.Measure of noncentral tendency
2. Split ordered data into 4 quarters
[Figure: ordered data divided into four 25% segments by Q1, Q2, and Q3]
3. Position of i-th quartile
Positioning Point of Q_i = i(n + 1) / 4
Interquartile Range
1.Measure of dispersion
2.Also called midspread
3.Difference between third & first quartiles
•Interquartile Range = Q3 – Q1
4.Spread in middle 50%
5.Not affected by extreme values
Thinking Challenge
•You’re a financial analyst for
Prudential-Bache Securities.
You have collected the
following closing stock prices
of new stock issues: 17, 16,
21, 18, 13, 16, 12, 11.
•What are the quartiles, Q1 and Q3, and the interquartile range?
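One way to explore this challenge is sketched below. It relies on an assumption the slides do not state: when a quartile's position i(n + 1)/4 falls between two ordered values, the code interpolates linearly between them, which is what `statistics.quantiles` does with `method="exclusive"`. Other textbooks round the position instead, so the figures produced here may differ from the deck's own solution.

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

# Quartile positions i(n + 1)/4 with linear interpolation (assumed rule).
q1, q2, q3 = statistics.quantiles(prices, n=4, method="exclusive")

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```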
Graphing Bivariate
Relationships
•Describes a relationship between two
quantitative variables
•Plot the data in a Scattergram
[Figure: three scattergrams of y against x, showing a positive relationship, a negative relationship, and no relationship]
Scattergram Example
•You’re a marketing analyst for Hasbro Toys.
You gather the following data:
Ad $ (x)   Sales (Units) (y)
1          1
2          1
3          2
4          2
5          4
•Draw a scattergram of the data
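A scattergram can be roughed out in plain text without any plotting library. The sketch below marks each (x, y) pair with `*` on a small grid; the layout choices are illustrative only.

```python
ad_spend = [1, 2, 3, 4, 5]  # x: Ad $
sales = [1, 1, 2, 2, 4]     # y: Sales (Units)

points = set(zip(ad_spend, sales))
x_values = range(1, max(ad_spend) + 1)

rows = []
for y in range(max(sales), 0, -1):  # top row is the largest y
    cells = "".join(" *" if (x, y) in points else " ." for x in x_values)
    rows.append(f"{y} |{cells}")
rows.append("   " + "".join(f" {x}" for x in x_values))  # x-axis labels

print("\n".join(rows))
```

The points rise from lower left to upper right, which is the pattern of a positive relationship.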
Time Series Plot
•Used to graphically display data produced
over time
•Shows trends and changes in the data over
time
•Time recorded on the horizontal axis
•Measurements recorded on the vertical axis
•Points connected by straight lines
Time Series Plot Example
•The following data shows
the average retail price of
regular gasoline in New
York City for 8 weeks in
2006.
•Draw a time series plot
for this data.
Date           Average Price
Oct 16, 2006   $2.219
Oct 23, 2006   $2.173
Oct 30, 2006   $2.177
Nov 6, 2006    $2.158
Nov 13, 2006   $2.185
Nov 20, 2006   $2.208
Nov 27, 2006   $2.236
Dec 4, 2006    $2.298
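Before plotting, the changes over time that the plot is meant to reveal can be computed directly. This sketch lists week-to-week changes and the lowest and highest weekly prices; the variable names are illustrative.

```python
prices = [
    ("Oct 16, 2006", 2.219), ("Oct 23, 2006", 2.173),
    ("Oct 30, 2006", 2.177), ("Nov 6, 2006", 2.158),
    ("Nov 13, 2006", 2.185), ("Nov 20, 2006", 2.208),
    ("Nov 27, 2006", 2.236), ("Dec 4, 2006", 2.298),
]

# Change from each week to the next, rounded to the data's precision.
changes = [
    (later[0], round(later[1] - earlier[1], 3))
    for earlier, later in zip(prices, prices[1:])
]

lowest = min(prices, key=lambda item: item[1])
highest = max(prices, key=lambda item: item[1])

for date, change in changes:
    print(f"{date:<13}{change:+.3f}")
print("lowest:", lowest, "highest:", highest)
```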
Time Series Plot Example
[Figure: time series plot of Price (vertical axis from 2.05 to 2.35) against Date (10/16 to 12/4)]
Distorting the Truth
with Descriptive Techniques
Errors in Presenting Data
1.Using ‘chart junk’
2.No relative basis in
comparing data
batches
3.Compressing the
vertical axis
4.No zero point on the
vertical axis
No Relative Basis
[Figure: A’s by Class (FR, SO, JR, SR), shown two ways. Bad presentation: frequencies on a 0 to 300 axis. Good presentation: percentages on a 0% to 30% axis.]
No Zero Point
on Vertical Axis
[Figure: Monthly Sales ($) for J, M, M, J, S, N, shown two ways. Bad presentation: vertical axis from 36 to 45. Good presentation: vertical axis from 0 to 60.]
Conclusion
1.Described Qualitative Data Graphically
2.Described Numerical Data Graphically
3.Explained Numerical Data Properties
4.Described Summary Measures
5.Analyzed Numerical Data Using Summary
Measures