REGRESSION ANALYSIS
[And Its Application in Business]
M. Ravishankar
Introduction. . .
Carl F. Gauss (1777-1855), who contributed to physics, mathematics, and astronomy, is regarded as the father of regression analysis.
The term “regression” was first used in 1877 by Francis Galton.
Regression Analysis. . .
It is the study of the relationship between variables.
It is one of the most commonly used tools for business analysis.
It is easy to use and applies to many situations.
Regression types. . .
Simple Regression: a single explanatory variable.
Multiple Regression: two or more explanatory variables.
Dependent variable: the single variable being explained or predicted by the regression model.
Independent variable: the explanatory variable(s) used to predict the dependent variable.
Coefficients (β): values, computed by the regression tool, that describe the relationship between each explanatory variable and the dependent variable.
Residuals (ε): the portion of the dependent variable that isn’t explained by the model, i.e. the model’s under- and over-predictions.
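In symbols, these terms fit together in the usual linear regression model. The notation below (β_0 for the intercept and x_1, ..., x_k for k explanatory variables) is a standard convention assumed here for illustration; it is not taken from the slides.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon
```

With k = 1 this reduces to simple regression; with k > 1 it is multiple regression.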
Regression Analysis. . .
Linear Regression: a straight-line relationship of the form y = mx + b.
Non-linear Regression: a curved relationship, for example a logarithmic relationship.
Regression Analysis. . .
Cross-Sectional: data gathered from the same time period.
Time Series: data observed over equally spaced points in time.
Simple Linear Regression Model. . .
Only one independent variable, x.
The relationship between x and y is described by a linear function.
Changes in y are assumed to be caused by changes in x.
Types of Regression Models. . .
Estimated Regression Model. . .
The sample regression line provides an estimate of the population regression line:
\[ \hat{y}_i = b_0 + b_1 x \]
where:
\hat{y}_i = Estimated (or predicted) y value
b_0 = Estimate of the regression intercept
b_1 = Estimate of the regression slope
x = Independent variable
The individual random error terms e_i have a mean of zero.
Simple Linear Regression Example. . .
A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet).
A random sample of 10 houses is selected.
Dependent variable (y) = house price in $1000s
Independent variable (x) = square feet
Sample Data
House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
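The least-squares line for this sample can be computed directly from the ten data points. The following is a minimal sketch in plain Python; the names `prices`, `sqft`, and `fit_line` are illustrative and do not come from the slides.

```python
# Sample data from the slide: price in $1000s (y) and size in square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_line(sqft, prices)
print(f"intercept b0 = {b0:.5f}")
print(f"slope     b1 = {b1:.5f}")
```

The printed values should agree with the Intercept and Square Feet coefficients reported by the regression tool (98.24833 and 0.10977).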
Output. . .
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

The regression equation is:
\[ \text{house price} = 98.24833 + 0.10977\,(\text{square feet}) \]
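The ANOVA rows of this output can be reproduced by splitting the total variation in price into an explained part and a residual part. The sketch below assumes the usual definitions (SST = total, SSR = regression, SSE = residual sum of squares, and F = MSR / MSE); the variable names are illustrative.

```python
# Sample data: price in $1000s (y) and size in square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
s_xx = sum((x - x_bar) ** 2 for x in sqft)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values and the three sums of squares.
fitted = [b0 + b1 * x for x in sqft]
sst = sum((y - y_bar) ** 2 for y in prices)               # total
ssr = sum((f - y_bar) ** 2 for f in fitted)               # regression
sse = sum((y - f) ** 2 for y, f in zip(prices, fitted))   # residual

# Degrees of freedom for one explanatory variable.
df_regression = 1
df_residual = n - 2

msr = ssr / df_regression
mse = sse / df_residual
f_stat = msr / mse

print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"MSE = {mse:.4f}, F = {f_stat:.4f}")
```

Note that SSR + SSE equals SST, which is why the Total row is the sum of the Regression and Residual rows.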
Graphical Presentation. . .
House price model: scatter plot and regression line [figure not reproduced].
Slope = 0.10977
Intercept = 98.248
Interpretation of the Intercept, b_0
b_0 is the estimated average value of y when the value of x is zero (if x = 0 is in the range of observed x values).
Here, no houses had 0 square feet, so b_0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet.
Interpretation of the Slope Coefficient, b_1
b_1 measures the estimated change in the average value of y as a result of a one-unit change in x.
Here, b_1 = 0.10977 tells us that the average value of a house increases by 0.10977 ($1000) = $109.77 for each additional square foot of size.
Example: House Prices
Estimated regression equation, with coefficients rounded to four significant digits:
\[ \text{house price} = 98.25 + 0.1098\,(\text{square feet}) \]
Predict the price for a house with 2000 square feet:
\[ \text{house price} = 98.25 + 0.1098\,(2000) = 317.85 \]
The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850.
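The prediction is a direct substitution into the regression equation. The sketch below uses a hypothetical helper, `predict_price`, and shows that the figure of 317.85 corresponds to coefficients rounded to 98.25 and 0.1098; with the five-decimal coefficients from the regression output the result is slightly lower.

```python
def predict_price(square_feet, intercept=98.24833, slope=0.10977):
    """Predicted house price in $1000s for a given size in square feet."""
    return intercept + slope * square_feet


# Coefficients as reported in the regression output.
full_precision = predict_price(2000)

# Coefficients rounded to 98.25 and 0.1098.
rounded = predict_price(2000, intercept=98.25, slope=0.1098)

print(f"output coefficients:  {full_precision:.2f} ($1000s)")
print(f"rounded coefficients: {rounded:.2f} ($1000s)")
```

The gap between the two results is about $60 on a predicted price of roughly $318,000, which illustrates how rounding a slope is magnified when it is multiplied by a large x value. Note also that 2000 square feet lies inside the observed range of sizes (1100 to 2450), so the prediction is not an extrapolation.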
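Coefficient of Determination, R²
The coefficient of determination is the proportion of the total variation in y that is explained by the regression:
\[ R^2 = \frac{SSR}{SST} \]
where SSR is the regression sum of squares and SST is the total sum of squares.
Note: In the single independent variable case, the coefficient of determination is
\[ R^2 = r^2 \]
where:
R² = Coefficient of determination
r = Simple correlation coefficient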
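For the house price sample, both routes to R² can be checked numerically. This sketch assumes R² = SSR / SST (explained over total sum of squares) and the usual Pearson formula for r; the variable names are illustrative.

```python
import math

# Sample data: price in $1000s (y) and size in square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
s_xx = sum((x - x_bar) ** 2 for x in sqft)
s_yy = sum((y - y_bar) ** 2 for y in prices)  # this is SST

# Route 1: explained variation over total variation.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in sqft)
r_squared = ssr / s_yy

# Route 2: square of the simple correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"R^2 = {r_squared:.5f}")
print(f"r   = {r:.5f}, r^2 = {r ** 2:.5f}")
```

The two routes agree, and r matches the "Multiple R" entry that a regression tool reports for a single explanatory variable.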
Examples of Approximate R² Values
[Figure: two scatter plots of y against x, each labeled R² = 1.]
Perfect linear relationship between x and y: 100% of the variation in y is explained by variation in x.
Examples of Approximate R² Values
[Figure: two scatter plots of y against x, labeled 0 < R² < 1.]
Weaker linear relationship between x and y: some but not all of the variation in y is explained by variation in x.
Examples of Approximate R² Values
[Figure: scatter plot of y against x, labeled R² = 0.]
No linear relationship between x and y: the value of y does not depend on x (none of the variation in y is explained by variation in x).
Output. . .
From the regression output, R Square = 0.58082.
58.08% of the variation in house prices is explained by variation in square feet.
Standard Error of Estimate. . .
The standard deviation of the variation of observations around the regression line is estimated by
\[ s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}} \]
where:
SSE = Sum of squares error
n = Sample size
k = Number of independent variables in the model
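As a check against the house price example, the standard error of the estimate can be computed from the residuals. The sketch assumes the usual formula, the square root of SSE / (n - k - 1), with k = 1 for simple regression; the variable names are illustrative.

```python
import math

# Sample data: price in $1000s (y) and size in square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)   # sample size
k = 1           # number of independent variables

x_bar = sum(sqft) / n
y_bar = sum(prices) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar

# Residuals: observed price minus the price predicted by the line.
residuals = [y - (b0 + b1 * x) for x, y in zip(sqft, prices)]
sse = sum(e ** 2 for e in residuals)

std_error_estimate = math.sqrt(sse / (n - k - 1))
print(f"standard error of estimate = {std_error_estimate:.5f} ($1000s)")
```

The result should match the "Standard Error" entry under Regression Statistics (41.33032), meaning that observed prices typically fall about $41,330 away from the fitted line.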
The Standard Deviation of the Regression Slope
The standard error of the regression slope coefficient (b_1) is estimated by
\[ s_{b_1} = \frac{s_\varepsilon}{\sqrt{\sum (x - \bar{x})^2}} \]
where:
s_{b_1} = Estimate of the standard error of the least squares slope
s_\varepsilon = \sqrt{SSE / (n - 2)} = Sample standard error of the estimate
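For the house price sample, the slope's standard error, its t statistic, and a 95% confidence interval can be sketched as follows. The code assumes the standard error of the slope is the standard error of the estimate divided by the square root of the sum of squared x-deviations. The critical value 2.306 is the two-tailed 5% value of the t distribution with 8 degrees of freedom, taken from a standard t table rather than computed; the variable names are illustrative.

```python
import math

# Sample data: price in $1000s (y) and size in square feet (x).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

s_xx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate (n - 2 degrees of freedom).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_e = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: slope = 0.
s_b1 = s_e / math.sqrt(s_xx)
t_stat = b1 / s_b1

# 95% confidence interval for the slope.
T_CRITICAL_DF8 = 2.306  # assumed table value for df = 8, two-tailed 5%
lower = b1 - T_CRITICAL_DF8 * s_b1
upper = b1 + T_CRITICAL_DF8 * s_b1

print(f"s_b1 = {s_b1:.5f}, t = {t_stat:.5f}")
print(f"95% CI for slope: ({lower:.5f}, {upper:.5f})")
```

These figures should line up with the Square Feet row of the regression output: Standard Error 0.03297, t Stat 3.32938, and the Lower 95% and Upper 95% bounds.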
Output. . .
In the regression output, the Standard Error under Regression Statistics (41.33032) is the standard error of the estimate, and the Standard Error in the Square Feet row (0.03297) is the standard error of the slope.
M. RAVISHANKAR, MBA(AB) 2008-2010 Batch, NIFTTEA KNITWEAR FASHION INSTITUTE, TIRUPUR, OXYGEN024@GMAIL.COM