Posted on 30-Dec-2015
OPIM 303-Lecture #8
Jose M. Cruz
Assistant Professor
Session 8 - Overview
• Simple Regression Model
• Determining the best fit
• “Goodness of Fit”
  – R2
  – Confidence Intervals
  – Hypothesis tests
  – Residual Analysis
Purpose of Regression Analysis
• Regression analysis is used primarily to model causality and provide prediction
  – Predicts the value of a dependent (response) variable based on the value of at least one independent (explanatory) variable
  – Explains the effect of the independent variables on the dependent variable
Types of Regression Models
Positive Linear Relationship
Negative Linear Relationship
Relationship NOT Linear
No Relationship
Simple Linear Regression Model
• Relationship between variables is described by a linear function
• A change in one variable causes a change in the other variable
• One variable depends on the other
[Figure: the population regression line (conditional mean)]
Population Linear Regression
Population regression line is a straight line that describes the dependence of the average value (conditional mean) of one variable on the other
Yi = β0 + β1 Xi + εi

where
– Yi = dependent (response) variable
– Xi = independent (explanatory) variable
– β0 = population Y-intercept
– β1 = population slope coefficient
– εi = random error

and μY|X = β0 + β1 X is the population regression line (conditional mean).
Population Linear Regression
(continued)
Yi = β0 + β1 Xi + εi

[Figure: the random error εi is the vertical distance between the observed value Yi and the conditional mean μY|X = β0 + β1 Xi on the population regression line]
Sample regression line provides an estimate of the population regression line as well as a predicted value of Y
Sample Linear Regression
Yi = b0 + b1 Xi + ei

where
– b0 = sample Y-intercept
– b1 = sample slope coefficient
– ei = residual

Ŷ = b0 + b1 X is the sample regression line (fitted regression line, predicted value).
Sample Linear Regression
(continued)

• b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared residuals:

Σ (i=1..n) (Yi − Ŷi)² = Σ (i=1..n) ei²

• b0 provides an estimate of β0
• b1 provides an estimate of β1
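The minimization above has a closed-form solution: b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² and b0 = Ȳ − b1 X̄. A minimal sketch in plain Python (the function name is illustrative):

```python
def least_squares_fit(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared X-deviations
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    # Intercept: the fitted line passes through the point (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Sanity check on points lying exactly on Y = 1 + 2X
b0, b1 = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
```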
Sample Linear Regression
(continued)
[Figure: observed values of Y plotted against X, showing the population line Yi = β0 + β1 Xi + εi, the sample line Ŷi = b0 + b1 Xi, and the residual ei, the vertical distance between the observed Yi and the fitted Ŷi]
Interpretation of the Slope and the Intercept
• β0 = E(Y | X = 0) is the average value of Y when the value of X is zero.
• β1 measures the change in the average value of Y, E(Y | X), as a result of a one-unit change in X.
Interpretation of the Slope and the Intercept
(continued)

• b0 is the estimated average value of Y when the value of X is zero; b0 estimates E(Y | X = 0).
• b1 is the estimated change in the average value of Y as a result of a one-unit change in X; b1 estimates β1.
Simple Linear Regression: Example
You want to examine the linear dependency of the annual sales of produce stores on their size in square footage. Sample data for seven stores were obtained. Find the equation of the straight line that fits the data best.
Store  Square Feet  Annual Sales ($1000)
1      1,726        3,681
2      1,542        3,395
3      2,816        6,653
4      5,555        9,543
5      1,292        3,318
6      2,208        5,563
7      1,313        3,760
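As a quick check, the least-squares formulas can be applied directly to the seven stores above (a sketch in plain Python; variable names are illustrative):

```python
# Store data from the example: size (sq ft) and annual sales ($1000)
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(sales) / n

ssx = sum((x - x_bar) ** 2 for x in square_feet)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, sales))

b1 = sxy / ssx             # sample slope
b0 = y_bar - b1 * x_bar    # sample intercept
# b0 ≈ 1636.415 and b1 ≈ 1.487, matching the Excel output shown next
```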
Scatter Diagram: Example
[Figure: scatter diagram, Annual Sales ($000) vs. Square Feet; X axis 0–6000, Y axis 0–12,000]
Excel Output
Equation for the Sample Regression Line: Example
Ŷi = b0 + b1 Xi = 1636.415 + 1.487 Xi

From Excel printout:

             Coefficients
Intercept    1636.414726
X Variable 1 1.486633657
Excel Output
Regression Statistics
Multiple R 0.970557
R Square 0.941981
Adjusted R Square 0.930378
Standard Error 611.7515
Observations 7
ANOVA
df  SS  MS  F  Significance F
Regression 1 30380456 30380456 81.17909 0.000281
Residual 5 1871200 374239.9
Total 6 32251656
             Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept    1636.415      451.4953        3.624433  0.015149  475.8109   2797.019
X Variable 1 1.486634      0.164999        9.009944  0.000281  1.06249    1.910777
Graph of the Sample Regression Line: Example
[Figure: scatter diagram with the fitted line Ŷi = 1636.415 + 1.487 Xi, Annual Sales ($000) vs. Square Feet]
Interpretation of Results: Example
The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.

The model estimates that for each additional square foot of store size, expected annual sales increase by an estimated $1,487 (sales are measured in $1000s).

Ŷi = 1636.415 + 1.487 Xi
How Good is the regression?
• R2
• Residual Plots
• Analysis of Variance
• Confidence Intervals
• Hypothesis (t) tests
Coefficient of Correlation
• Measures the strength of the linear relationship between two quantitative variables
r = Σ (i=1..n) (Xi − X̄)(Yi − Ȳ) / √[ Σ (i=1..n) (Xi − X̄)² · Σ (i=1..n) (Yi − Ȳ)² ]
The Coefficient of Determination
• Denoted by R2
• Measures the proportion of variation in Y that is explained by the independent variable X in the regression model
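For the produce-stores data, both coefficients can be computed directly from the definitions (a sketch; variable names are illustrative):

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(sales) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, sales))
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssy = sum((y - y_bar) ** 2 for y in sales)

r = sxy / sqrt(ssx * ssy)   # coefficient of correlation
r_sq = r ** 2               # coefficient of determination
# r ≈ 0.9706 and r² ≈ 0.9420, matching "Multiple R" and "R Square"
# in the Excel output for this example
```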
Coefficients of Determination (r 2) and Correlation (r)
[Figure: four panels illustrating r² and r for the fitted line Ŷi = b0 + b1 Xi]
– r² = 1, r = +1: all points lie exactly on an upward-sloping line
– r² = 1, r = −1: all points lie exactly on a downward-sloping line
– r² = .8, r = +0.9: points scattered closely around the line
– r² = 0, r = 0: no linear relationship; the fitted line is horizontal
Linear Regression Assumptions
1. Linearity
2. Normality
  – Y values are normally distributed for each X
  – Probability distribution of error is normal
3. Homoscedasticity (Constant Variance)
4. Independence of Errors
Residual Analysis
• Purposes
  – Examine linearity
  – Evaluate violations of assumptions
• Graphical Analysis of Residuals
  – Plot residuals vs. Xi, Yi, and time
Residual Analysis for Linearity
[Figure: residual plots e vs. X. A curved pattern in the residuals indicates the relationship is not linear; a patternless horizontal band indicates linearity]
• Y values are normally distributed around the regression line.
• For each X value, the “spread” or variance around the regression line is the same.
Variation of Errors around the Regression Line
[Figure: normal error distributions f(e) with equal spread, centered on the sample regression line at X1 and X2]
Residual Analysis for Homoscedasticity
[Figure: standardized residual (SR) plots vs. X. A funnel shape, with spread increasing in X, indicates heteroscedasticity; a constant-width band indicates homoscedasticity]
[Figure: residual plot vs. Square Feet (0–6000)]

Residual Analysis: Excel Output for Produce Stores Example
Excel Output
Observation  Predicted Y   Residuals
1            4202.344417   -521.3444173
2            3928.803824   -533.8038245
3            5822.775103   830.2248971
4            9894.664688   -351.6646882
5            3557.14541    -239.1454103
6            4918.90184    644.0981603
7            3588.364717   171.6352829
Residual Analysis for Independence
[Figure: residuals e plotted against time. A cyclical pattern indicates the errors are not independent; no particular pattern indicates independence]

Residuals are plotted against time to detect any autocorrelation (a graphical approach).
The ANOVA Table in Excel
ANOVA

Source      df     SS   MS                 F        Significance F
Regression  p      SSR  MSR = SSR/p        MSR/MSE  P-value of the F test
Residuals   n-p-1  SSE  MSE = SSE/(n-p-1)
Total       n-1    SST
Measures of VariationThe Sum of Squares: Example
Excel Output for Produce Stores
ANOVA
df SS MS F Significance F
Regression 1 30380456.12 30380456 81.17909 0.000281201
Residual 5 1871199.595 374239.92
Total 6 32251655.71
Measures of Variation: Produce Store Example
Regression Statistics
Multiple R         0.9705572
R Square           0.94198129
Adjusted R Square  0.93037754
Standard Error     611.751517
Observations       7
Excel Output for Produce Stores
r2 = .94
94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage
Inference about the Slope: t Test
• t test for a population slope– Is there a linear dependency of Y on X ?
• Null and alternative hypotheses
  – H0: β1 = 0 (no linear dependency)
  – H1: β1 ≠ 0 (linear dependency)
• Test statistic:

t = (b1 − β1) / S_b1,  where  S_b1 = S_YX / √( Σ (i=1..n) (Xi − X̄)² )

with d.f. = n − 2
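A sketch of the test-statistic computation for the produce-stores data, following the formulas above (variable names are illustrative):

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, sales)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: S_YX = sqrt(SSE / (n - 2))
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, sales))
s_yx = sqrt(sse / (n - 2))

s_b1 = s_yx / sqrt(ssx)    # standard error of the slope
t_stat = (b1 - 0) / s_b1   # test statistic under H0: beta1 = 0
# t ≈ 9.0099 with n - 2 = 5 degrees of freedom, matching the Excel printout
```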
Example: Produce Store
Data for Seven Stores: Estimated Regression Equation:
The slope of this model is 1.487.
Is square footage of the store affecting its annual sales?
Store  Square Feet  Annual Sales ($000)
1      1,726        3,681
2      1,542        3,395
3      2,816        6,653
4      5,555        9,543
5      1,292        3,318
6      2,208        5,563
7      1,313        3,760

Ŷi = 1636.415 + 1.487 Xi
Inferences about the Slope: t Test Example
H0: β1 = 0
H1: β1 ≠ 0
α = .05
df = 7 − 2 = 5
Critical values: t = ±2.5706 (rejection regions of .025 in each tail)

Test statistic, from the Excel printout:

           Coefficients  Standard Error  t Stat  P-value
Intercept  1636.4147     451.4953        3.6244  0.01515
Footage    1.4866        0.1650          9.0099  0.00028

t = b1 / S_b1 = 1.4866 / 0.1650 = 9.0099

Decision: Reject H0, since 9.0099 > 2.5706.

Conclusion: There is evidence that square footage affects annual sales.
Inferences about the Slope: Confidence Interval Example
Confidence interval estimate of the slope:

b1 ± t(n−2) · S_b1

Excel printout for produce stores:

             Lower 95%   Upper 95%
Intercept    475.810926  2797.01853
X Variable 1 1.06249037  1.91077694

At the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0.

Conclusion: There is a significant linear dependency of annual sales on the size of the store.
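The interval can be reproduced from the data (a sketch; assumes SciPy is available for the t critical value):

```python
from math import sqrt
from scipy.stats import t  # for the t critical value

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, sales)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the slope: S_b1 = S_YX / sqrt(SSX)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, sales))
s_b1 = sqrt(sse / (n - 2)) / sqrt(ssx)

t_crit = t.ppf(0.975, n - 2)  # two-sided 95% CI, df = 5
lower, upper = b1 - t_crit * s_b1, b1 + t_crit * s_b1
# (lower, upper) ≈ (1.062, 1.911): the interval excludes 0
```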
Confidence Intervals for Estimators
             Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept    1636.415      451.4953        3.624433  0.015149  475.8109   2797.019
X Variable 1 1.486634      0.164999        9.009944  0.000281  1.06249    1.910777