Data Analysis technique, data collection, data analysis

EktaJolly 190 views 95 slides Dec 09, 2023

About This Presentation

Introduction to data analysis


Slide Content

Data Analysis: The data, after collection, has to be processed and analysed in accordance with the outline laid down for the purpose at the time of developing the research plan. This is essential for a scientific study and for ensuring that we have all relevant data for making contemplated comparisons and analysis. Processing implies editing, coding, classification and tabulation of collected data so that they are amenable to analysis. The term analysis refers to the computation of certain measures along with searching for patterns of relationship that exist among data-groups.

Processing Operations: Editing, Coding, Classification, Tabulation.

Editing: The process of examining the collected raw data to detect errors and omissions and to correct these when possible. It involves a careful scrutiny of the completed questionnaires and/or schedules. It ensures that the data are accurate, consistent with other facts gathered, uniformly entered, as complete as possible and well arranged to facilitate coding and tabulation. With regard to the points or stages at which editing should be done, one can talk of field editing and central editing. Field editing consists in the review of the reporting forms by the investigator for completing (translating or rewriting) what the latter has written in abbreviated and/or illegible form at the time of recording the respondents' responses. This type of editing is necessary because individual writing styles can often be difficult for others to decipher.

Central editing should take place when all forms or schedules have been completed and returned to the office. It implies thorough editing by a single editor, or by a team of editors in case of a large inquiry. Editor(s) may correct the obvious errors. In case of inappropriate or missing replies, the editor can sometimes determine the proper answer by reviewing the other information in the schedule, and the respondent can be contacted for clarification.

Editors must keep in view several points while performing their work: They should be familiar with the instructions given to the interviewers and coders as well as with the editing instructions supplied. While crossing out an original entry for one reason or another, they should draw just a single line through it so that it remains legible. They must make entries (if any) on the form in some distinctive colour and in a standardised form. They should initial all answers which they change or supply. The editor's initials and the date of editing should be placed on each completed form or schedule.

Coding: It refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem. They must also possess the characteristic of exhaustiveness (there must be a class for every data item) and that of mutual exclusivity, which means that a specific answer can be placed in one and only one class. Another rule to be observed is that of unidimensionality, by which is meant that every class is defined in terms of only one concept. Through coding, the several replies may be reduced to a small number of classes which contain the critical information required for analysis.

Classification: Most research studies result in a large volume of raw data which must be reduced into homogeneous groups if we are to get meaningful relationships. This fact necessitates classification of data, which is the process of arranging data in groups or classes on the basis of common characteristics. Data having a common characteristic are placed in one class, and in this way the entire data get divided into a number of groups or classes. Classification can be one of the following two types, depending upon the nature of the phenomenon involved: classification according to attributes, and classification according to class intervals.

Tabulation: When a mass of data has been assembled, it becomes necessary for the researcher to arrange the same in some kind of concise and logical order. This procedure is referred to as tabulation. Thus, tabulation is the process of summarizing raw data and displaying the same in compact form (i.e., in the form of statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows. Tabulation is essential for the following reasons: It conserves space and reduces explanatory and descriptive statements to a minimum. It facilitates the process of comparison. It facilitates the summation of items and the detection of errors and omissions. It provides a basis for various statistical computations.
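
As a small illustration of classification according to class intervals followed by tabulation, the sketch below groups a list of ages into ten-year classes and prints a frequency table. The `ages` data, the class width and the helper name `class_interval` are assumptions made for this example only; they do not come from the slides.

```python
from collections import Counter

# Hypothetical raw data (illustrative only): ages of 12 respondents.
ages = [23, 35, 41, 29, 52, 38, 47, 31, 26, 44, 58, 33]


def class_interval(value, width=10):
    """Return the label of the class interval (e.g. '20-29') containing value."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"


# Classification: every item falls into exactly one class
# (the classes are exhaustive and mutually exclusive).
frequency = Counter(class_interval(age) for age in ages)

# Tabulation: display classes and frequencies in rows and columns, with a total.
print(f"{'Age class':<12}{'Frequency':>10}")
for label in sorted(frequency):
    print(f"{label:<12}{frequency[label]:>10}")
print(f"{'Total':<12}{sum(frequency.values()):>10}")
```

The total row doubles as a quick check for omissions: it should equal the number of raw observations.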

Generally Accepted Principles of Tabulation: Every table should have a clear, concise and adequate title so as to make the table intelligible without reference to the text, and this title should always be placed just above the body of the table. Every table should be given a distinct number to facilitate easy reference. The column headings (captions) and the row headings (stubs) of the table should be clear and brief. The units of measurement under each heading or sub-heading must always be indicated. Explanatory footnotes, if any, concerning the table should be placed directly beneath the table, along with the reference symbols used in the table. The source or sources from which the data in the table have been obtained must be indicated just below the table. Usually the columns are separated from one another by lines, which make the table more readable and attractive.

Lines are always drawn at the top and bottom of the table and below the captions. There should be thick lines to separate the data under one class from the data under another class, and the lines separating the sub-divisions of the classes should be comparatively thin. The columns may be numbered to facilitate reference. Those columns whose data are to be compared should be kept side by side. Similarly, percentages and/or averages must also be kept close to the data. It is generally considered better to approximate figures before tabulation, as this reduces unnecessary detail in the table itself. In order to emphasise the relative significance of certain categories, different kinds of type, spacing and indentation may be used.

It is important that all column figures be properly aligned. Decimal points and (+) or (–) signs should be in perfect alignment. Abbreviations should be avoided to the extent possible, and ditto marks should not be used in the table. Miscellaneous and exceptional items, if any, should usually be placed in the last row of the table. The table should be made as logical, clear, accurate and simple as possible. If the data happen to be very large, they should not be crowded into a single table, for that would make the table unwieldy and inconvenient. Totals of rows should normally be placed in the extreme right column, and totals of columns at the bottom. The arrangement of the categories in a table may be chronological, geographical, alphabetical or according to magnitude to facilitate comparison.

Elements/Types of Analysis: By analysis we mean the computation of certain indices or measures along with searching for patterns of relationship that exist among the data groups. It involves estimating the values of unknown parameters and testing hypotheses for drawing inferences. Analysis may, therefore, be categorized as descriptive analysis and inferential analysis (inferential analysis is often known as statistical analysis). Descriptive analysis is largely the study of distributions of one variable. This sort of analysis may be in respect of one variable (described as unidimensional analysis), in respect of two variables (described as bivariate analysis), or in respect of more than two variables (described as multivariate analysis). We may as well talk of correlation analysis and causal analysis.

Elements/Types of Analysis (continued): Correlation analysis studies the joint variation of two or more variables to determine the amount of correlation between them. Causal analysis is concerned with the study of how one or more variables affect changes in another variable. It is thus a study of functional relationships existing between two or more variables. This analysis can be termed regression analysis. Causal analysis is considered relatively more important in experimental research. In modern times, with the availability of computer facilities, there has been a rapid development of multivariate analysis, which may be defined as "all statistical methods which simultaneously analyse more than two variables".

Multivariate analysis: Multiple regression analysis: This analysis is adopted when the researcher has one dependent variable which is presumed to be a function of two or more independent variables. The objective of this analysis is to make a prediction about the dependent variable based on its covariance with all the concerned independent variables. Multiple discriminant analysis: This analysis is appropriate when the researcher has a single dependent variable that cannot be measured, but can be classified into two or more groups on the basis of some attribute. The object of this analysis is to predict an entity's possibility of belonging to a particular group based on several predictor variables. Multivariate analysis of variance (or multi-ANOVA): An extension of two-way ANOVA, wherein the ratio of among-group variance to within-group variance is worked out on a set of variables.

Canonical analysis: This analysis can be used in case of both measurable and non-measurable variables for the purpose of simultaneously predicting a set of dependent variables from their joint covariance with a set of independent variables. Inferential analysis is concerned with the various tests of significance for testing hypotheses in order to determine with what validity data can be said to indicate some conclusion or conclusions.

Statistics in Research: The role of statistics in research is to function as a tool in designing research, analysing its data and drawing conclusions therefrom. Clearly the science of statistics cannot be ignored by any research worker, even though he may not have occasion to use statistical methods in all their details and ramifications. The important statistical measures are: measures of central tendency or statistical averages; measures of dispersion; measures of asymmetry (skewness); measures of relationship; and other measures.

Measures of central tendency (or statistical averages) tell us the point about which items have a tendency to cluster. Mean, median and mode are the most popular averages.

Median: Arrange your numbers in numerical order. Count how many numbers you have. If you have an odd number, divide by 2 and round up to get the position of the median number. If you have an even number, divide by 2, go to the number in that position and average it with the number in the next higher position to get the median. Mode: To find the mode, or modal value, it is best to put the numbers in order and then count how many there are of each number. A number that appears most often is the mode.

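Find the mean, median, and mode for the following lists of values. List 1: 13, 18, 13, 14, 13, 16, 14, 21, 13. Mean = 15, Median = 14, Mode = 13. List 2: 1, 2, 4, 7. Mean = 3.5, Median = (2+4)/2 = 3, Mode: none (no value repeats). Other averages: geometric mean (G.M.) and harmonic mean (H.M.).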
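
The two worked lists above can be checked with Python's standard `statistics` module. This is only a quick sketch; `statistics.multimode` returns every value tied for the highest count, so a list in which no value repeats comes back whole, which is how the "no mode" case shows up here.

```python
import statistics

first = [13, 18, 13, 14, 13, 16, 14, 21, 13]
second = [1, 2, 4, 7]

for values in (first, second):
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    modes = statistics.multimode(values)  # all values tied for most frequent
    # If every distinct value is returned, nothing repeats: there is no mode.
    has_mode = len(modes) < len(set(values))
    print(values)
    print("  mean   =", mean_value)
    print("  median =", median_value)
    print("  mode   =", modes if has_mode else "none")
```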

Measures of Dispersion: An average can represent a series only as well as a single figure can. It fails to give any idea about the scatter of the values in the series around the average. In order to measure this scatter, statistical devices called measures of dispersion are calculated. Important measures of dispersion are: range, mean deviation and standard deviation. https://geographyfieldwork.com/DataPresentationScatterGraphs.htm#

Range: It is the simplest possible measure of dispersion and is defined as the difference between the values of the extreme items of a series. Range = (highest value of an item in a series) - (lowest value of an item in a series). It gives an idea of the variability very quickly, but the drawback is that range is affected very greatly by fluctuations of sampling. Its value is never stable, being based on only two values of the variable. As such, range is mostly used as a rough measure of variability and is not considered an appropriate measure in serious research studies.

Mean deviation: It is the average of the differences of the values of items from some average of the series. In calculating mean deviation we ignore the minus sign of deviations while taking their total. Standard deviation: It is the most widely used measure of dispersion of a series and is commonly denoted by the symbol sigma ($\sigma$). Standard deviation is defined as the square root of the average of squares of deviations, when such deviations for the values of individual items in a series are obtained from the arithmetic average.
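
The three measures of dispersion can be computed directly from their definitions. The sketch below reuses the nine-value list from the central tendency example and takes deviations from the arithmetic mean; the standard deviation follows the definition given above (average of squared deviations, i.e. dividing by n rather than n - 1).

```python
import math

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]
n = len(values)
mean = sum(values) / n  # arithmetic average, 15.0 for this list

# Range: highest value minus lowest value.
value_range = max(values) - min(values)

# Mean deviation: average of absolute deviations (minus signs ignored).
mean_deviation = sum(abs(v - mean) for v in values) / n

# Standard deviation: square root of the average squared deviation.
std_deviation = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

print("range          =", value_range)
print("mean deviation =", round(mean_deviation, 4))
print("std deviation  =", round(std_deviation, 4))
```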

Measures of Asymmetry: When the distribution of items in a series happens to be perfectly symmetrical, we then have the following type of curve for the distribution:

Such a curve is described as a normal curve and the related distribution as a normal distribution. It is a perfectly bell-shaped curve, in which case the values of the mean ($\bar{X}$), the median (M) and the mode (Z) are just the same and skewness is altogether absent. If the curve is distorted (whether on the right side or on the left side), we have an asymmetrical distribution, which indicates that there is skewness. If the curve is distorted on the right side, we have positive skewness, but when the curve is distorted towards the left, we have negative skewness.

Skewness is, thus, a measure of asymmetry and shows the manner in which the items are clustered around the average

Measures of Relationship: The statistical measures that we have used so far are in the context of a univariate population, i.e., measurement of only one variable. If for every measurement of a variable, X, there is a corresponding value of a second variable, Y, the resulting pairs of values are called a bivariate population. Similarly, the data can be multivariate. There are several methods of determining the relationship between variables, but no method can tell us for certain that a correlation is indicative of a causal relationship.

Measures of Relationship (continued): Two types of questions arise in bivariate or multivariate populations. (1) Does there exist association or correlation between the two (or more) variables? If yes, of what degree? (2) Is there any cause and effect relationship between the two variables? If yes, of what degree and in which direction? The first question is answered by the use of the correlation technique and the second question by the technique of regression.

Measures of Relationship (continued): There are several methods of applying the two techniques, but the important ones are as under. In case of a bivariate population: correlation can be studied through cross tabulation, Charles Spearman's coefficient of correlation and Karl Pearson's coefficient of correlation, whereas cause and effect relationship can be studied through simple regression equations. In case of a multivariate population: correlation can be studied through the coefficient of multiple correlation and the coefficient of partial correlation, whereas cause and effect relationship can be studied through multiple regression.
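
For a bivariate population, the two named correlation coefficients can be sketched in a few lines. The `x` and `y` values below are made up for illustration, and the Spearman helper assumes there are no tied values, which is the case in which the shortcut formula $1 - 6\sum d^2 / (n(n^2 - 1))$ applies.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's coefficient: covariation scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank 1 is the smallest value; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Charles Spearman's rank coefficient (no-ties shortcut formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]

print("Pearson r    =", round(pearson_r(x, y), 4))
print("Spearman rho =", round(spearman_rho(x, y), 4))
```

Neither coefficient says anything about cause and effect; both only measure the degree of association.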

Simple Regression Analysis: Regression is the determination of a statistical relationship between two or more variables. In simple regression, we have only two variables: one variable (defined as independent) is the cause of the behaviour of another one (defined as the dependent variable). Regression can only interpret what exists physically, i.e., there must be a physical way in which the independent variable X can affect the dependent variable Y. The basic relationship between X and Y is given by $\hat{Y} = a + bX$, where $\hat{Y}$ denotes the estimated value of Y for a given value of X.

Least-Square Method: The generally used method to find the 'best' fit that a straight line of this kind can give is the least-square method.

Least Square Curve Fitting method (figure; axis labels a and b). Source: S. S. Shashtri, "Introductory Methods of Numerical Analysis", 2012, PHI Learning, N. Delhi.

Read $a_0 = a$, $a_1 = b$.

A sigmoid function is a mathematical function having a characteristic "S"-shaped curve or sigmoid curve. A common example of a sigmoid function is the logistic function, shown in the first figure and defined by the formula $\sigma(x) = \dfrac{1}{1 + e^{-x}}$.

Definition: Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. It is a statistical technique used to derive coefficient values for equations that express the value of one (dependent) variable as a function of another (independent) variable. https://www2.slideshare.net/shopnohinami/curve-fitting-53775511?from_action=save

What is curve fitting: Curve fitting is the process of constructing a curve, or mathematical function, which lies as close as possible to the series of data. By curve fitting we can mathematically construct the functional relationship between the observed facts and the parameter values. It is highly effective in the mathematical modelling of some natural processes. https://www2.slideshare.net/shopnohinami/curve-fitting-53775511?from_action=save

Interpolation & Curve fitting: In many application areas, one is faced with the task of describing data, often measured, with an analytic function. There are two approaches to this problem: 1. In interpolation, the data is assumed to be correct and what is desired is some way to describe what happens between the data points. 2. The other approach is called curve fitting or regression: one looks for some smooth curve that "best fits" the data, but does not necessarily pass through any data points.

Curve fitting: There are two general approaches for curve fitting: • Least squares regression: the data exhibit a significant degree of scatter. The strategy is to derive a single curve that represents the general trend of the data. • Interpolation: the data is very precise. The strategy is to pass a curve or a series of curves through each of the points.

General approach for curve fitting

Engineering Applications of Curve Fitting Technique: In engineering, two types of applications are encountered. Trend analysis: predicting values of the dependent variable, which may include extrapolation beyond data points or interpolation between data points. Hypothesis testing: comparing an existing mathematical model with measured data.

Data scatter (figure; panel titles: Positive Correlation, Positive Correlation, No Correlation)

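Mathematical Background: Variance: representation of spread by the square of the standard deviation.
$$S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} \qquad \text{or} \qquad S_y^2 = \frac{\sum y_i^2 - \left(\sum y_i\right)^2 / n}{n - 1}$$
Coefficient of variation: has the utility to quantify the spread of data relative to the mean.
$$\text{c.v.} = \frac{S_y}{\bar{y}} \times 100\%$$
Here $\bar{y}$ is the mean and $S_y$ is the standard deviation (S.D.).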
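
A short numerical check of the variance and coefficient of variation formulas follows. It is only a sketch: the `y` values are borrowed from the straight-line example that appears later in the deck, and the two variance expressions are evaluated side by side to show that they agree.

```python
import math

# y values taken from the straight-line fitting example later in the deck.
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(y)
mean_y = sum(y) / n

# Variance from the definition: squared deviations about the mean over n - 1.
variance_definition = sum((v - mean_y) ** 2 for v in y) / (n - 1)

# Variance from the computational shortcut using sums of y and y squared.
variance_shortcut = (sum(v * v for v in y) - sum(y) ** 2 / n) / (n - 1)

std_dev = math.sqrt(variance_definition)
coeff_variation = std_dev / mean_y * 100  # percent

print("variance (definition) =", round(variance_definition, 4))
print("variance (shortcut)   =", round(variance_shortcut, 4))
print("standard deviation    =", round(std_dev, 4))
print("c.v. (%)              =", round(coeff_variation, 2))
```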

Least square method

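Linear Regression: Criteria for a "Best" Fit (1). Minimize the sum of the residuals:
$$\min \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$$
With this criterion positive and negative residuals can cancel (figure annotation: $e_1 = -e_2$).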

Linear Regression: Criteria for a "Best" Fit (2). Minimize the sum of the absolute values of the residuals:
$$\min \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$$

Linear Regression: Criteria for a "Best" Fit (3). Minimize the maximum residual (minimax criterion):
$$\min \max_{1 \le i \le n} |e_i| = \max_{1 \le i \le n} |y_i - a_0 - a_1 x_i|$$

Linear curve fitting (straight line): Given a set of data points $(x_i, f(x_i))$, find a curve that best captures the general trend, where $g(x)$ is the approximation function. Try to fit a straight line through the data.

Linear Regression: Least Squares Fit. Minimize the sum of the squares of the residuals:
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_{i,\text{measured}} - y_{i,\text{model}})^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
Minimizing $S_r$ yields a unique line for a given set of data.

Linear Regression: Least Squares Fit (continued).
$$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
The coefficients $a_0$ and $a_1$ that minimize $S_r$ must satisfy the following conditions:
$$\frac{\partial S_r}{\partial a_0} = 0, \qquad \frac{\partial S_r}{\partial a_1} = 0$$

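Linear Regression: Determination of $a_0$ and $a_1$.
$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum \left[ (y_i - a_0 - a_1 x_i)\, x_i \right] = 0$$
$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$
$$0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2$$
Since $\sum a_0 = n a_0$, these become
$$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$
These are 2 equations with 2 unknowns and can be solved simultaneously.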

Linear Regression: Determination of $a_0$ and $a_1$ (continued).
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a_0 = \bar{y} - a_1 \bar{x}$$

Error Quantification of Linear Regression: The sum of the squares of the residuals around the regression line is $S_r$:
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
The total sum of the squares around the mean for the dependent variable, y, is $S_t$:
$$S_t = \sum (y_i - \bar{y})^2$$

Example: The table below gives the temperature T in °C and the resistance R in Ω of a circuit. If $R = a_0 + a_1 T$, find the values of $a_0$ and $a_1$.
T (°C):  10    20    30    40    50    60
R (Ω):   20.1  20.2  20.4  20.6  20.8  21

Solution:
T = x_i   R = y_i   x_i^2 = T^2   x_i y_i = TR   g(x_i) = Y
10        20.1      100           201            20.05
20        20.2      400           404            20.24
30        20.4      900           612            20.42
40        20.6      1600          824            20.61
50        20.8      2500          1040           20.80
60        21        3600          1260           20.98
Sums: $\sum x_i = 210$, $\sum y_i = 123.1$, $\sum x_i^2 = 9100$, $\sum x_i y_i = 4341$

Solution (continued): The normal equations are $6a_0 + 210a_1 = 123.1$ and $210a_0 + 9100a_1 = 4341$, which give $a_0 = 19.867$ and $a_1 = 0.01857$. Hence $g(x) = 19.867 + 0.01857\,T$.
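
The closed-form expressions for $a_1$ and $a_0$ can be applied to the temperature and resistance data in a few lines. This is a minimal sketch; the function name `fit_line` is chosen for this example and is not part of the slides.

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n  # a0 = mean(y) - a1 * mean(x)
    return a0, a1


T = [10, 20, 30, 40, 50, 60]                # temperature, deg C
R = [20.1, 20.2, 20.4, 20.6, 20.8, 21.0]    # resistance, ohm

a0, a1 = fit_line(T, R)
print(f"R = {a0:.3f} + {a1:.5f} * T")

# Fitted values g(x_i), comparable with the last column of the solution table.
for t in T:
    print(t, round(a0 + a1 * t, 2))
```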

Least Squares Fit of a Straight Line: Example. Fit a straight line to the x and y values in the following table:
x_i    y_i    x_i y_i   x_i^2
1      0.5    0.5       1
2      2.5    5         4
3      2      6         9
4      4      16        16
5      3.5    17.5      25
6      6      36        36
7      5.5    38.5      49
Sum:
28     24     119.5     140
$\sum x_i = 28$, $\sum y_i = 24.0$, $\sum x_i^2 = 140$, $\sum x_i y_i = 119.5$, $\bar{x} = 28/7 = 4$, $\bar{y} = 24/7 = 3.428571$

Least Squares Fit of a Straight Line: Example (continued).
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{7 \times 119.5 - 28 \times 24}{7 \times 140 - 28^2} = 0.8392857$$
$$a_0 = \bar{y} - a_1 \bar{x} = 3.428571 - 0.8392857 \times 4 = 0.07142857$$
$$Y = 0.07142857 + 0.8392857\,x$$

Least Squares Fit of a Straight Line: Example (Error Analysis).
$$S_r = \sum e_i^2 = 2.9911, \qquad S_t = \sum (y_i - \bar{y})^2 = 22.7143$$
$$r^2 = \frac{S_t - S_r}{S_t} = 0.868, \qquad r = \sqrt{r^2} = \sqrt{0.868} = 0.932$$

Least Squares Fit of a Straight Line: Example (Error Analysis, continued). The standard deviation (quantifies the spread around the mean):
$$s_y = \sqrt{\frac{S_t}{n - 1}} = \sqrt{\frac{22.7143}{7 - 1}} = 1.9457$$
The standard error of estimate (quantifies the spread around the regression line):
$$s_{y/x} = \sqrt{\frac{S_r}{n - 2}} = \sqrt{\frac{2.9911}{7 - 2}} = 0.7735$$
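
The fit and its error analysis for the seven-point example can be reproduced as follows. It is a sketch written directly from the formulas on the slides; the variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(x)

# Slope and intercept from the normal equations.
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_xy = sum(a * b for a, b in zip(x, y))
a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
a0 = sum_y / n - a1 * sum_x / n

# S_t: spread around the mean; S_r: spread around the regression line.
mean_y = sum_y / n
S_t = sum((v - mean_y) ** 2 for v in y)
S_r = sum((b - a0 - a1 * a) ** 2 for a, b in zip(x, y))

r_squared = (S_t - S_r) / S_t        # coefficient of determination
s_y = math.sqrt(S_t / (n - 1))       # standard deviation
s_yx = math.sqrt(S_r / (n - 2))      # standard error of estimate

print(f"y = {a0:.8f} + {a1:.7f} x")
print(f"S_r = {S_r:.4f}, S_t = {S_t:.4f}")
print(f"r^2 = {r_squared:.3f}, r = {math.sqrt(r_squared):.3f}")
print(f"s_y = {s_y:.4f}, s_y/x = {s_yx:.4f}")
```

Because $s_{y/x}$ is smaller than $s_y$, the straight line describes the data better than the mean alone does.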

Linearization of Nonlinear Relationships: Linear regression assumes that the relationship between the dependent and independent variables is linear. However, a few types of nonlinear functions can be transformed into linear regression problems: the exponential equation, the power equation and the saturation-growth-rate equation.

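Linearization of Nonlinear Relationships 1. The exponential equation $y = a_1 e^{b_1 x}$. Taking natural logarithms gives $\ln y = \ln a_1 + b_1 x$, which has the linear form $y^* = a_0 + a_1 x$ with $y^* = \ln y$ (in the linear form the intercept equals $\ln a_1$ of the model and the slope equals $b_1$).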

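Linearization of Nonlinear Relationships 2. The power equation $y = a_2 x^{b_2}$. Taking logarithms gives $\log y = \log a_2 + b_2 \log x$, which has the linear form $y^* = a_0 + a_1 x^*$ with $y^* = \log y$, $x^* = \log x$, intercept $\log a_2$ and slope $b_2$.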

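Linearization of Nonlinear Relationships 3. The saturation-growth-rate equation $y = a_3 \dfrac{x}{b_3 + x}$. Inverting gives
$$\frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3} \cdot \frac{1}{x}$$
which has the linear form $y^* = a_0 + a_1 x^*$ with $y^* = 1/y$, $x^* = 1/x$, $a_0 = 1/a_3$ and $a_1 = b_3/a_3$.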

Example: Fit the equation $y = a_2 x^{b_2}$ to the data in the following table:
x_i   y_i   X* = log x_i   Y* = log y_i
1     0.5   0.0000         -0.3010
2     1.7   0.3010         0.2304
3     3.4   0.4771         0.5315
4     5.7   0.6021         0.7559
5     8.4   0.6990         0.9243
Sum:
15    19.7  2.079          2.141
Taking logarithms, $\log y = \log(a_2 x^{b_2}) = \log a_2 + b_2 \log x$. Let $Y^* = \log y$, $X^* = \log x$, $a_0 = \log a_2$ and $a_1 = b_2$, so that $Y^* = a_0 + a_1 X^*$.

Example (continued):
X_i   Y_i    X*_i = log(X)   Y*_i = log(Y)   X*Y*     X*^2
1     0.5    0.0000          -0.3010         0.0000   0.0000
2     1.7    0.3010          0.2304          0.0694   0.0906
3     3.4    0.4771          0.5315          0.2536   0.2276
4     5.7    0.6021          0.7559          0.4551   0.3625
5     8.4    0.6990          0.9243          0.6460   0.4886
Sum:
15    19.700 2.079           2.141           1.424    1.169
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{5 \times 1.424 - 2.079 \times 2.141}{5 \times 1.169 - 2.079^2} = 1.75$$
$$a_0 = \bar{y} - a_1 \bar{x} = 0.4282 - 1.75 \times 0.41584 = -0.300$$
(here $x_i$ and $y_i$ stand for the transformed values $X^*_i$ and $Y^*_i$)

Linearization of Nonlinear Functions: Example. $\log y = -0.300 + 1.75 \log x$, so $y = 10^{-0.300}\, x^{1.75} \approx 0.50\, x^{1.75}$.
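
The power-equation example can be recomputed by transforming the data with base-10 logarithms and then fitting a straight line. The sketch below assumes the fifth y value is 8.4, which is the value consistent with the column sum of 19.7; `fit_line` is a helper name introduced for this example.

```python
import math


def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n
    return a0, a1


x = [1, 2, 3, 4, 5]
y = [0.5, 1.7, 3.4, 5.7, 8.4]

# Linearize: log y = log(a2) + b2 * log x.
X_star = [math.log10(v) for v in x]
Y_star = [math.log10(v) for v in y]

a0, a1 = fit_line(X_star, Y_star)

# Transform back to the parameters of y = a2 * x**b2.
a2 = 10 ** a0
b2 = a1

print(f"log y = {a0:.3f} + {a1:.3f} log x")
print(f"y = {a2:.3f} * x^{b2:.3f}")
```

Note that the least-squares criterion is applied to the transformed values, so the result minimizes the squared error in $\log y$, not in $y$ itself.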

Polynomial Regression: Some engineering data is poorly represented by a straight line. For these cases a curve is better suited to fit the data. The least squares method can readily be extended to fit the data to higher order polynomials.

Polynomial Regression (cont’d) A parabola is preferable

Polynomial Regression (cont'd): A 2nd order polynomial (quadratic) is defined by:
$$y = a_0 + a_1 x + a_2 x^2 + e$$
The residuals between the model and the data:
$$e_i = y_i - a_0 - a_1 x_i - a_2 x_i^2$$
The sum of squares of the residuals:
$$S_r = \sum e_i^2 = \sum \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$$

Polynomial Regression (cont'd): A system of 3x3 equations needs to be solved to determine the coefficients of the polynomial:
$$\begin{pmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{pmatrix}$$
The standard error and the coefficient of determination:
$$s_{y/x} = \sqrt{\frac{S_r}{n - 3}}, \qquad r^2 = \frac{S_t - S_r}{S_t}$$

Polynomial Regression (cont'd): General case. The mth-order polynomial:
$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e$$
A system of (m+1)x(m+1) linear equations must be solved to determine the coefficients of the mth-order polynomial. The standard error:
$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}}$$
The coefficient of determination:
$$r^2 = \frac{S_t - S_r}{S_t}$$

Polynomial Regression: Example. Fit a second order polynomial to the data:
x_i   y_i     x_i^2   x_i^3   x_i^4   x_i y_i   x_i^2 y_i
0     2.1     0       0       0       0         0
1     7.7     1       1       1       7.7       7.7
2     13.6    4       8       16      27.2      54.4
3     27.2    9       27      81      81.6      244.8
4     40.9    16      64      256     163.6     654.4
5     61.1    25      125     625     305.5     1527.5
Sum:
15    152.6   55      225     979     585.6     2488.8
$\sum x_i = 15$, $\sum y_i = 152.6$, $\sum x_i^2 = 55$, $\sum x_i^3 = 225$, $\sum x_i^4 = 979$, $\sum x_i y_i = 585.6$, $\sum x_i^2 y_i = 2488.8$, $\bar{x} = 15/6 = 2.5$, $\bar{y} = 152.6/6 = 25.433$

2nd order polynomial: Example. Fit $y = a_0 + a_1 x + a_2 x^2$ to the data:
x_i   f_i   x_i^2   x_i^3   x_i^4   f_i x_i   f_i x_i^2   g(x)
1     4     1       1       1       4         4           4.505
2     11    4       8       16      22        44          10.15
4     19    16      64      256     76        304         19.43
6     26    36      216     1296    156       936         26.03
8     30    64      512     4096    240       1920        29.95
Sums: $\sum x_i = 21$, $\sum f_i = 90$, $\sum x_i^2 = 121$, $\sum x_i^3 = 801$, $\sum x_i^4 = 5665$, $\sum f_i x_i = 498$, $\sum f_i x_i^2 = 3208$

2nd order polynomial: Example (continued). The normal equations are:
$5a_0 + 21a_1 + 121a_2 = 90$
$21a_0 + 121a_1 + 801a_2 = 498$
$121a_0 + 801a_1 + 5665a_2 = 3208$
Solving gives $a_0 = -1.81$, $a_1 = 6.65$, $a_2 = -0.335$. So the required equation is $g(x) = -1.81 + 6.65x - 0.335x^2$.
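
The normal equations of polynomial regression can be assembled and solved programmatically. The sketch below uses plain Gaussian elimination with partial pivoting as the solver, which is an assumption of this example (the slides do not say how the 3x3 system was solved); `solve_linear_system` and `fit_polynomial` are illustrative names.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest pivot candidate to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (M[r][n] - tail) / M[r][r]
    return solution


def fit_polynomial(x, y, order):
    """Least-squares polynomial coefficients [a0, a1, ..., am]."""
    size = order + 1
    # Entry (i, j) of the normal matrix is the sum of x**(i + j).
    A = [[sum(v ** (i + j) for v in x) for j in range(size)] for i in range(size)]
    # Entry i of the right-hand side is the sum of x**i * y.
    b = [sum((v ** i) * w for v, w in zip(x, y)) for i in range(size)]
    return solve_linear_system(A, b)


x = [1, 2, 4, 6, 8]
f = [4, 11, 19, 26, 30]
a0, a1, a2 = fit_polynomial(x, f, 2)
print(f"g(x) = {a0:.3f} + {a1:.3f} x + {a2:.4f} x^2")
```

The same function handles any order m; only the size of the system changes. For high orders the normal equations become ill-conditioned, so this direct approach is best kept to low-order fits.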

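Exponential function: Fit $y = a e^{bx}$ to the data:
x:  1    2    3   4    5
y:  1.5  4.5  6   8.5  11
Solution: $y = a e^{bx}$, so $\ln y = \ln(a e^{bx}) = \ln a + bx$. This has the linear form $Y = a_0 + a_1 X$, where $Y = \ln y = f_i$, $a_0 = \ln a$, $a_1 = b$ and $X = x$.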

Solution (continued):
X = x_i   y_i   Y = ln y_i   x_i^2   x_i Y_i   g(x)
1         1.5   0.405        1       0.405     2.06
2         4.5   1.504        4       3.008     3.27
3         6     1.791        9       5.373     5.186
4         8.5   2.14         16      8.56      8.22
5         11    2.39         25      11.95     13.03
Sums: $\sum x_i = 15$, $\sum f_i = 8.23$, $\sum x_i^2 = 55$, $\sum f_i x_i = 29.296$

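Solution (continued): The normal equations are $5a_0 + 15a_1 = 8.23$ and $15a_0 + 55a_1 = 29.296$, which give $a_0 = 0.2642$ and $a_1 = 0.4606$. Then $a = e^{0.2642} = 1.3024$ and $b = 0.4606$. Required equation: $g(x) = 1.3024\, e^{0.4606x}$.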
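
The exponential fit can be checked with the same linearization carried out in full floating-point precision. This is a sketch only; because the slide rounds the logarithms to three decimals, the coefficients printed here differ slightly in the third decimal place from the hand calculation.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1.5, 4.5, 6.0, 8.5, 11.0]
n = len(x)

# Linearize y = a * exp(b * x):  ln y = ln a + b * x.
Y = [math.log(v) for v in y]

sum_x, sum_Y = sum(x), sum(Y)
sum_xx = sum(v * v for v in x)
sum_xY = sum(a * b for a, b in zip(x, Y))

# Straight-line fit Y = a0 + a1 * x from the normal equations.
a1 = (n * sum_xY - sum_x * sum_Y) / (n * sum_xx - sum_x ** 2)
a0 = (sum_Y - a1 * sum_x) / n

# Back-transform to the parameters of the exponential model.
a = math.exp(a0)
b = a1

print(f"Y = {a0:.4f} + {a1:.4f} x")
print(f"g(x) = {a:.4f} * exp({b:.4f} x)")
for xi in x:
    print(xi, round(a * math.exp(b * xi), 3))
```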

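Example: Power function. Fit $y = a x^b$ to the data:
x:  2   2.5   3    3.5     4
y:  7   8.5   11   12.75   15
Solution: $y = a x^b$, so $\ln y = \ln a + b \ln x$. This has the linear form $Y = a_0 + a_1 X$, where $Y = \ln y$, $a_0 = \ln a$, $X = \ln x$ and $a_1 = b$.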

Solution (continued):
x     y       ln x = X   ln y = Y   X^2      XY       g(x)
2     7       0.6931     1.946      0.480    1.3487   6.868
2.5   8.5     0.9163     2.140      0.8396   1.9608   8.813
3     11      1.098      2.397      1.2056   2.6319   10.806
3.5   12.75   1.252      2.545      1.5675   3.1863   12.838
4     15      1.386      2.708      1.9209   3.7532   14.904
Sums: $\sum X_i = 5.3454$, $\sum f_i = 11.736$, $\sum X_i^2 = 6.0136$, $\sum f_i X_i = 12.8809$

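Solution (continued): The normal equations are $5a_0 + 5.3454a_1 = 11.736$ and $5.3454a_0 + 6.0136a_1 = 12.8809$, which give $a_0 = 1.1521$ and $a_1 = 1.1178$. Then $a = e^{a_0} = e^{1.1521} = 3.1648$ and $b = a_1 = 1.1178$. Required equation: $g(x) = 3.1648\, x^{1.1178}$.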

Polynomial Regression: Example (cont'd).
x_i   y_i     y_model   e_i^2     (y_i - ȳ)^2
0     2.1     2.4786    0.14332   544.42889
1     7.7     6.6986    1.00286   314.45929
2     13.6    14.64     1.08158   140.01989
3     27.2    26.303    0.80491   3.12229
4     40.9    41.687    0.61951   239.22809
5     61.1    60.793    0.09439   1272.13489
Sum:
15    152.6             3.74657   2513.39333
The standard error of estimate:
$$s_{y/x} = \sqrt{\frac{3.74657}{6 - 3}} = 1.12$$
The coefficient of determination:
$$r^2 = \frac{2513.39 - 3.74657}{2513.39} = 0.99851, \qquad r = \sqrt{r^2} = 0.99925$$
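
The error statistics for the quadratic example can be recomputed from the tabulated measured and model values. The sketch below takes the `y_model` column as given (rounded as printed on the slide), so the sums agree with the slide only to about three significant figures.

```python
import math

y = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]
y_model = [2.4786, 6.6986, 14.64, 26.303, 41.687, 60.793]  # from the table
n = len(y)
order = 2  # quadratic, so m + 1 = 3 fitted coefficients

mean_y = sum(y) / n
S_r = sum((obs - fit) ** 2 for obs, fit in zip(y, y_model))  # around the curve
S_t = sum((obs - mean_y) ** 2 for obs in y)                  # around the mean

s_yx = math.sqrt(S_r / (n - (order + 1)))  # standard error of estimate
r_squared = (S_t - S_r) / S_t              # coefficient of determination

print(f"S_r = {S_r:.4f}, S_t = {S_t:.4f}")
print(f"s_y/x = {s_yx:.2f}")
print(f"r^2 = {r_squared:.5f}, r = {math.sqrt(r_squared):.5f}")
```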