OBJECTIVES: After the successful discussion in class, you will be able to: Define statistics and discuss the history of statistics; Understand the terms related to statistics; and Differentiate the two branches of Statistics.
STATISTICS It is imperative that every Filipino student have a clear understanding of the nature and definition of Statistics as a discipline. However, understanding the nature of statistics requires students to go beyond its definition and appreciate its applications.
Definitions of Statistics Plural Sense: Statistics pertains to numerical data or figures that can be counted and presented from observations made by an individual. Singular Sense: Statistics is a branch of science that deals with the collection, tabulation, presentation, analysis, and interpretation of data that can be used when making decisions in the face of uncertainty.
HISTORY OF STATISTICS Simple forms of statistics have been used to record numbers of people, animals, and inanimate objects on skins, slabs, or sticks of wood and on the walls of caves. Before 3000 B.C., the Babylonians used small clay tablets to keep records. The Egyptians analyzed the population and material wealth of their country before building the pyramids.
Aside from secular records, statistics can be seen in Biblical accounts such as the one recorded in Numbers 31:25-41, when, during the time of Moses, statistics were gathered for purposes such as taxation, military service, and priestly duties. The word "statistics" evolved from the word “Statistik”, which was popularized by the German political scientist Gottfried Achenwall. The word was derived from the Latin word “Status” or the Italian word “Statista”, which means “State”. Its English use was introduced early in the 19th century by Sir John Sinclair to mean “the collection and classification of data”.
SIGNIFICANCE OF STATISTICS Good decisions are driven by data. Proper evaluation and interpretation of data can guide the best decision in practically every facet of daily life.
Significance of Statistics in the following areas: Society Business and Industry Transportation Safety Peace and Order Sports Consumers Health Agriculture
AREAS OF STATISTICS Descriptive Statistics: The discipline of quantitatively describing the main features of a collection of data without drawing conclusions about a larger group. Descriptive statistics can be thought of as a straightforward presentation of facts. Inferential Statistics: Concerned with the analysis of a subset of data, or sample, leading to predictions or inferences about the entire set of data, or population. The results from the sample are generalized to the larger population that the sample represents.
Figure 1.1 Age-Sex Population Pyramid Philippines: 2010
Lesson 2 TERMS AND SYMBOLS USED IN STATISTICS
OBJECTIVES: After the successful discussion in class, you will be able to: Understand the terms related to statistics; Identify the different categories and types of variables; Differentiate a population from a sample; Distinguish a parametric test from a non-parametric test; and Become familiar with the different statistical symbols used in this book.
VARIABLES A variable is any characteristic, number, or quantity that can be measured or counted and that varies over time or across the individuals or objects under consideration. Arrows are used to illustrate relationships among variables.
Types of variables
1. Independent Variable – the experimental or predictor variable
2. Dependent Variable – the outcome that is the presumed effect
3. Extraneous Variable – a variable that interferes or interacts with the independent variable (IV)
Classification of Variables Variables can be categorized according to the manner of measurement and presentation.
1. Qualitative Variable – a categorical variable. It describes the qualities or characteristics of the samples.
2. Quantitative Variable – gives numerical responses representing an amount or number of something.
3. Discrete Quantitative Variable – a countable variable. It assumes fixed or countable values of whatever is being measured.
4. Continuous Quantitative Variable – a non-countable variable. Its values are not restricted to a countable set; it can take any value associated with a point on an interval of the real line.
Kinds of Variables
STATISTICAL TERMS Measurement – the process carried out by an individual to determine the value or label of the variable based on what has been observed. Observation – the realized value of the variable. Population – the collection of things or observational units under consideration. It is an aggregate set of individuals with varied characteristics and can be classified according to age and sex.
Three (3) Factors Influencing Population 1. Fertility – this pertains to the birth rate in a community 2. Mortality – this is the death rate in a community 3. Migration – this is the number of people moving in and out of a community. The Statistical Population is the set of all possible values of the variable.
POPULATION – changes in a population and its structure significantly affect policy making and the development of a community. SAMPLE – commonly defined as a subset of the population that shares the characteristics of the population. SAMPLING – the process of selecting a part or subset of a population. You can use your sample to make inferences about your population.
Common Reasons for Sampling Limited budget, time constraints, lack of manpower, accessibility, peace and order of the area, size of the population, and availability and cost of the experimental materials.
Types of Sampling Techniques Probability Sampling: Each member of the population has a known chance of being chosen; in simple random sampling, every member has the same chance. Non-Probability Sampling: Members of the population do not have an equal chance of being selected. No-Rule Sampling: The sample is taken without following any rule.
Types of Probability Sampling
Random Sampling
Systematic Sampling
Stratified Sampling
   A. Proportionate Stratified Random Sampling
   B. Disproportionate Stratified Random Sampling
Cluster Sampling
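The probability sampling techniques listed above can be illustrated with a minimal Python sketch. It covers simple random, systematic, and proportionate stratified sampling only. The function names, the fixed seed, and the toy population of 100 numbered members split into two strata are assumptions made for illustration; they do not come from the lesson.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Pick a random start, then take every k-th member, where k = N // n."""
    rng = random.Random(seed)
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def proportionate_stratified_sample(strata, n, seed=0):
    """Sample each stratum in proportion to its share of the population.

    Rounding can make the total differ slightly from n when the
    proportions are not whole numbers.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population: members numbered 1 to 100, split 60/40 into two strata.
population = list(range(1, 101))
strata = {"male": list(range(1, 61)), "female": list(range(61, 101))}

srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10)
strat = proportionate_stratified_sample(strata, 10)

print("Simple random:", srs)
print("Systematic:   ", sys_sample)
print("Stratified:   ", strat)
```

With a sample of 10 drawn from strata of 60 and 40 members, the proportionate version takes 6 and 4 members respectively. A disproportionate version would set the stratum sizes by some other rule.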
Information is limited to useful facts that an analyst or a decision maker can use in solving problems. Parameter is a numerical figure describing a characteristic of the population. Data mining is simply the process of examining or going into the details of the data. Important details are carefully studied and scrutinized. Sex is the biological description of a person as either male or female. Operational Definition is how you define a word or words according to its use and purpose.
KINDS OF STATISTICAL TEST 1. PARAMETRIC TEST Requires normality of the distribution, and the level of measurement should be either interval or ratio. The observations must be independent and the populations must have the same variances. When these assumptions hold, parametric tests are more efficient than their non-parametric counterparts. They are also considered appropriate for small samples, provided the assumptions above are met.
KINDS OF PARAMETRIC TEST 1. T-test 2. Z-test 3. F-test 4. Analysis of Variance 5. Pearson Product Moment Coefficient of Correlation 6. Simple Linear Regression 7. Multiple Regression Analysis
2. NON-PARAMETRIC TEST Does not require normality of the distribution, and the level of measurement should be either nominal or ordinal.
STATISTICAL SYMBOLS ∑ N Xm n i F X <F Md >F X Cs Y SD R S² R² CD
Lesson 3 LEVELS OF MEASUREMENT
OBJECTIVES: After the successful discussion in class, you will be able to: Define and distinguish nominal, ordinal, interval, and ratio scales; Enumerate the limitations of each scale; Apply measurement scales in choosing statistical tools; and Give examples of errors that can result from an inadequate understanding of the proper use of measurement scales.
Variables are considered the substance of statistics. Before we can conduct statistical investigation and analysis, we first need to fully understand the nature and measurement of the variables to be studied. An in-depth understanding of your variables leads to correct and usable inferences or conclusions. When we choose the statistical tools for studying our variables, we need to consider the four measurement scales. For a parametric test, the measurement scale should be either interval or ratio, and for a non-parametric test, the measurement scale should be either nominal or ordinal.
FOUR LEVELS OF MEASUREMENT SCALES
NOMINAL LEVEL
ORDINAL LEVEL
INTERVAL LEVEL
RATIO LEVEL
Lesson 4 Data Collection and Presentation
Data Collection and Presentation Data can be defined as groups of information that represent the qualitative or quantitative attributes of a variable or set of variables. In other words, data can be any set of information that describes a given entity. Data in statistics can be classified into grouped data and ungrouped data.
Ungrouped Data Ungrouped data is data as it was first gathered, that is, raw data. Any list of numbers that you can think of is an example of ungrouped data. Example: The marks obtained by 20 students in class in a certain examination are given below: 21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19, 19
Array An arrangement of ungrouped data in ascending or descending order of magnitude is called an array. Example: The marks obtained by 20 students in class in a certain examination are given below: 21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19, 19 (Arrange the scores in ascending order)
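As a minimal sketch of the exercise above, the 20 marks can be arranged into an array with Python's built-in `sorted`. The variable names are chosen here for illustration only.

```python
# Marks of the 20 students, as listed in the example (ungrouped data).
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# An array is the same data arranged by magnitude.
ascending = sorted(marks)
descending = sorted(marks, reverse=True)

print(ascending)
# [12, 15, 15, 17, 17, 17, 19, 19, 19, 19, 19, 21, 21, 21, 23, 23, 23, 23, 25, 25]
print(descending)
```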
Frequency Distribution Table or Frequency Chart A frequency is the number of times a data value occurs. For example, if ten students score 80 in Statistics, then the score of 80 has a frequency of 10. Frequency is often represented by the letter f. A Frequency Chart is made by arranging data values in ascending order of magnitude along with their frequencies.
Example: We take each observation from the data, one at a time, and indicate the frequency (the number of times the observation has occurred in the data) by small lines called tally marks. For convenience, we write tally marks in bunches of five, the fifth one crossing the other four diagonally. In the table so formed, the sum of all the frequencies is equal to the total number of observations in the given data. Table columns: Marks | Tally Marks | Frequency
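The tallying procedure can be sketched in Python, assuming the data are the marks of the 20 students from the ungrouped-data example. The `tally` helper and its text rendering of a bunch of five (`||||/`) are illustrative choices, not notation from the lesson.

```python
from collections import Counter

# Marks of the 20 students from the ungrouped-data example.
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

def tally(f):
    """Render a frequency as tally marks in bunches of five.

    "||||/" stands for one bunch: four strokes plus the crossing fifth.
    """
    bunches, rest = divmod(f, 5)
    return " ".join(["||||/"] * bunches + (["|" * rest] if rest else []))

freq = Counter(marks)  # maps each mark to the number of times it occurs

# One row per distinct mark, in ascending order of magnitude.
rows = [(mark, tally(freq[mark]), freq[mark]) for mark in sorted(freq)]

print(f"{'Marks':<6} {'Tally Marks':<12} {'Frequency':>9}")
for mark, marks_tally, f in rows:
    print(f"{mark:<6} {marks_tally:<12} {f:>9}")

total = sum(freq.values())
print("Total frequency:", total)  # equals the number of observations, 20
```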
Grouped Data Grouped data is data that has been organized into groups known as classes. Grouped data has been ‘classified’, so some level of data analysis has already taken place and the data is no longer raw.
When the data values are spread out, it is difficult to set up a frequency table with a row for every data value because the table would have too many rows. So we group the data into class intervals (or groups) to help us organize, interpret, and analyze the data. Each class is bounded by two figures, which are called class limits. The figure on the left side of a class is called its lower limit and the one on its right is called its upper limit. Ideally, we should have between five and ten rows in a frequency table. Bear this in mind when deciding the size of the class interval (or group).
TYPES OF GROUPED FREQUENCY DISTRIBUTION 1. Exclusive Form (or Continuous Interval Form): A frequency distribution in which the lower limit of each class is included and the upper limit is excluded is called an exclusive form.
2. Inclusive Form (or Discontinuous Interval Form): A frequency distribution in which both the lower limit and the upper limit of each class are included is called an inclusive form.
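The difference between the two forms shows up when a value falls exactly on a class limit. The sketch below uses hypothetical class limits (10-20 and 20-30 for the exclusive form, 10-19 and 20-29 for the inclusive form) and hypothetical function names, chosen only to illustrate the definitions above.

```python
def in_exclusive_class(x, lower, upper):
    """Exclusive form: the lower limit is included, the upper limit is excluded."""
    return lower <= x < upper

def in_inclusive_class(x, lower, upper):
    """Inclusive form: both the lower and the upper limit are included."""
    return lower <= x <= upper

# Hypothetical classes for illustration.
exclusive_classes = [(10, 20), (20, 30)]  # continuous: the limits touch
inclusive_classes = [(10, 19), (20, 29)]  # discontinuous: gap between 19 and 20

score = 20
exclusive_home = [c for c in exclusive_classes if in_exclusive_class(score, *c)]
inclusive_home = [c for c in inclusive_classes if in_inclusive_class(score, *c)]

print(exclusive_home)  # [(20, 30)]: 20 is excluded from the class 10-20
print(inclusive_home)  # [(20, 29)]
```

In both forms each score belongs to exactly one class, which is what makes the frequencies add up to the number of observations.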
Calculating Class Interval Given a set of raw or ungrouped data, how would you group that data into suitable classes that are easy to work with and at the same time meaningful? The first step is to determine how many classes you want to have. Next, you subtract the lowest value in the data set from the highest value in the data set and then you divide by the number of classes that you want to have.
Formula: Class interval = (Highest value - Lowest value) / Number of classes
The general rules for constructing a frequency distribution are:
1. There should not be too few or too many classes.
2. Insofar as possible, equal class intervals are preferred, but the first and last classes can be open-ended to accommodate extreme values.
3. Each class should have a class mark to represent the class. It is also called the class midpoint of the i-th class, and it can be found by taking the simple average of the class boundaries or the class limits of that class.
Example: Group the following raw data into ten classes. An array of the marks of 25 students in ascending order. 8, 10, 11, 12, 14, 16, 16, 16, 20, 24, 25, 25, 25, 29, 30, 33, 35, 36, 37, 40, 40, 42, 45, 45, 48.
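One way to work this example is sketched below in Python. The class interval is the highest value minus the lowest value, divided by the number of classes, which gives (48 - 8) / 10 = 4. The sketch assumes the exclusive form and, as an additional assumption, closes the last class on the right so that the highest mark (48) is not left out. Other conventions are possible and would give slightly different classes.

```python
# Marks of the 25 students, already arranged in ascending order.
marks = [8, 10, 11, 12, 14, 16, 16, 16, 20, 24, 25, 25, 25,
         29, 30, 33, 35, 36, 37, 40, 40, 42, 45, 45, 48]

num_classes = 10
lowest, highest = min(marks), max(marks)

# Class interval = (highest value - lowest value) / number of classes
width = (highest - lowest) / num_classes  # (48 - 8) / 10 = 4.0

# Class limits in exclusive form: lower limit included, upper limit excluded.
limits = [(lowest + i * width, lowest + (i + 1) * width)
          for i in range(num_classes)]

# Count the marks in each class. Assumption: the last class also includes
# its upper limit, so the highest mark is counted in the tenth class.
freq = [0] * num_classes
for x in marks:
    idx = min(int((x - lowest) // width), num_classes - 1)
    freq[idx] += 1

# Class mark (midpoint): simple average of the two class limits.
class_marks = [(lower + upper) / 2 for lower, upper in limits]

print(f"{'Class':<12} {'Class mark':>10} {'Frequency':>10}")
for (lower, upper), mark, f in zip(limits, class_marks, freq):
    print(f"{lower:g}-{upper:g}".ljust(12), f"{mark:>10g} {f:>10}")

print("Total:", sum(freq))  # 25, the number of observations
```

The ten classes run from 8-12 up to 44-48, and their frequencies add up to 25, the number of students.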
GRAPHICAL METHODS Frequency distributions are usually illustrated graphically by plotting various types of graphs: 1. Bar Graph - A bar graph is a way of summarizing a set of categorical data. It displays the data using a number of rectangles of the same width, each of which represents a particular category. Bar graphs can be displayed horizontally or vertically, and they are usually drawn with a gap between the bars (rectangles).
2. Histogram - A histogram is a way of summarizing data that are measured on an interval scale (either discrete or continuous). It is often used in exploratory data analysis to illustrate the features of the distribution of the data in a convenient form.
3. Pie Chart - A pie chart is used to display a set of categorical data. It is a circle, which is divided into segments. Each segment represents a particular category. The area of each segment is proportional to the number of cases in that category.
4. Line Graph - A line graph is particularly useful when we want to show the trend of a variable over time. Time is displayed on the horizontal axis (x-axis) and the variable is displayed on the vertical axis (y-axis).