
Karl Schmedders

MANAGERIAL ECONOMICS & DECISION SCIENCES
Visiting Professor of Managerial Economics & Decision Sciences

PhD, 1996, Operations Research, Stanford University
MS, 1992, Operations Research, Stanford University
Vordiplom, 1990, Business Engineering, Universität Karlsruhe, Highest Honors, Ranked first in a class of 350

EMAIL: [email protected]
OFFICE: Jacobs Center Room 528

Karl Schmedders is Associate Professor in the Department of Managerial Economics and Decision Sciences. He holds a PhD in Operations Research from Stanford University. Professor Schmedders' research interests include computational economics, general equilibrium theory, asset pricing, and portfolio selection. His work has been published in Econometrica, The Review of Economic Studies, The Journal of Finance, and many other academic journals. He teaches courses in decision science in both the MBA and the EMBA programs at Kellogg. Professor Schmedders has been named to the Faculty Honor Roll in every quarter he has taught at Kellogg. He has received numerous teaching awards, including the 2002 Lawrence G. Lavengood Outstanding Professor of the Year award. Professor Schmedders is the only Kellogg faculty member to have received the 'Ehrenmedaille' (Honorary Medal) of Kellogg's partner school WHU.

Research Interests

Mathematical economics, in particular general equilibrium models involving time and uncertainty
Asset pricing
Mathematical programming


Managerial Statistics

Course Description

In this course we will cover the following topics:

Confidence Intervals
Hypothesis Tests
Regression Analysis

Our objective is to cover the first two topics quickly. While they are important in themselves, many people describe them as rather "dry" course material. However, they will be of great help to us when we cover the main subject of the course, regression analysis. Regressions are extremely useful and can deliver eye-opening insights in many managerial situations. You will solve some entertaining case studies that show the power of regression analysis. We will cover the material in this case packet as well as the following chapters of the textbook:

Sections 13.1 and 13.2 of Chapter 13; Section 14.1 of Chapter 14; Chapters 15, 16, 19, 21, and 23.

Time permitting, we will also cover parts of chapter 25. There will be several team assignments. After the conclusion of the course, there will be an in-class final exam on the first day of the following module, that is, on April 1, 2016. The final grades in this course will be determined as follows.

Team assignments: 40%
Class participation: 10%
Final exam: 50%


In case you would like to prepare for our course, you should start reading the relevant sections of Chapters 13 and 14 in our textbook. Before you do that, please also consider the following suggestions.

1) Review the material on the normal distribution from your probability course. In particular, you should review the use of the functions NORMDIST, NORMSDIST, NORMINV, and NORMSINV in Excel.

2) We will use the software KStat that was developed at Kellogg. Ideally, you should install KStat on your laptop before our first class.
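For those who would rather preview these functions outside Excel, here is a minimal Python sketch of the same four operations using scipy.stats (an assumption on my part: scipy is installed; the parameter values are illustrative, not from the course):

# Python equivalents of the four Excel normal-distribution functions (a sketch).
from scipy.stats import norm

mu, sigma = 100, 15  # illustrative parameters

print(norm.cdf(110, loc=mu, scale=sigma))   # NORMDIST(110, 100, 15, TRUE)
print(norm.cdf(1.0))                        # NORMSDIST(1.0), standard normal
print(norm.ppf(0.95, loc=mu, scale=sigma))  # NORMINV(0.95, 100, 15)
print(norm.ppf(0.95))                       # NORMSINV(0.95)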

I realize that all of you are very busy and may not have the time to prepare at length for our course. Please note, however, that the better you prepare, the faster we can cover the early parts of the course material, and the more time we will have for the fun part: regression analysis. Of course, I am happy to help you with your preparation. Please do not hesitate to contact me with any questions or concerns. My email address is [email protected].

September 29, 1998

When Scientific Predictions Are So Good They're Bad By WILLIAM K. STEVENS

NOAH had it easy. He got his prediction straight from the horse's mouth and was left in no doubt about what to do.

But when the Red River of the North was rising to record levels in the spring of 1997, the citizens and officials of Grand Forks, N.D., were not so privileged. They had to rely on scientists' predictions about how high the water would rise. And in this case, Federal experts say, the flood forecast may have been issued and used in a way that made things worse.

The problem, the experts said, was that more precision was assigned to the forecast than was warranted. Officials and citizens tended to take as gospel an oft-repeated National Weather Service prediction that the river would crest at a record 49 feet. Actually, there was a wider range of probabilities; the river ultimately crested at 54 feet, forcing 50,000 people to abandon their homes fast. The 49-foot forecast had lulled the town into a false sense of security, said Dr. Roger A. Pielke Jr. of the National Center for Atmospheric Research in Boulder, Colo., a consultant on a subsequent inquiry by the weather service.

In fixating on the single number of 49 feet, the people involved in the Grand Forks disaster made a common error in the use of predictions and forecasts, experts who have studied the case say. It was, they say, a case of what Alfred North Whitehead, the mathematician and philosopher, once termed ''misplaced concreteness.'' And whether the problem is climate change, earthquakes, droughts or floods, they say the tendency to overlook uncertainties, margins of error and ranges of probability can lead to damaging misjudgments.

The problem was the topic of a workshop this month at Estes Park, Colo. In part, participants said, the problem arises because decision makers sometimes want to avoid making hard choices in uncertain situations. They would rather place responsibility on the predictors.

Scientifically based predictions, typically using computerized mathematical models, have become pervasive in modern society. But only recently has much attention been paid to the proper use -- and misuse -- of predictions. The Estes Park workshop, of which Dr. Pielke was an organizer, was an attempt to come to grips with the question. The workshop was sponsored by the Geological Society of America and the National Center for Atmospheric Research.

People have predicted and prophesied for millenniums, of course, through means ranging from the visions of shamans and the warnings of biblical prophets to the examination of animal entrails. With the arrival of modern science, people teased out fundamental laws of physical and chemical behavior and used them to make better and better predictions.

But once science moves beyond the relatively deterministic processes of physics and chemistry, prediction gets more complicated and chancier. The earth's atmosphere, for instance, often frustrates efforts to predict the weather and long-term climatic changes because scientists have not nailed down all of its physical workings and because a substantial measure of chaotic unpredictability is inherent in the climate system. The result is a considerable range of uncertainty, much more so than is popularly associated with science. So while computer modeling has often made reasonable predictions possible, they are always uncertain; results are by definition a model of reality, not reality itself.

The accuracy of predictions varies widely. Some, like earthquake forecasts, have proved so disappointing that experts have turned instead to forecasting longer-term earthquake potential in a general sense and issuing last-second warnings to distant communities once a quake has begun.

In some cases, the success of a prediction is near impossible to judge. For instance, it will take thousands of years to know whether the environmental effects of buried radioactive waste will be as predicted.

On the other hand, daily weather forecasts are checked almost instantly and are used to improve the next day's forecast. But weather forecasting is also a success, the assembled experts agreed, because people know its shortcomings and take them into consideration. Weather forecasts ''are wrong a lot of the time, but people expect that and they use them accordingly,'' said Robert Ravenscroft, a Nebraska rancher who attended the workshop as a ''user'' of predictions.

A prediction is to be distrusted, workshop participants said, when it is made by the group that will use it as a basis for policy making -- especially when the prediction is made after the policy decision has been taken. In one example offered at the workshop, modeling studies purported to show no harmful environmental effects from a gold mine that a company had decided to dig.

Another type of prediction miscue emerged last March in connection with asteroids, the workshop participants were told by Dr. Clark R. Chapman, a planetary scientist at the Southwest Research Institute in Boulder. An astronomer erroneously calculated that there was a chance of one-tenth of 1 percent that a mile-wide asteroid would strike Earth in 30 years. The prediction created an international stir but was withdrawn a day later after further evidence turned up.

This ''uncharacteristically bad'' prediction, said Dr. Chapman, would not have been issued had it been subjected to normal review by the forecaster's scientific peers. But, he said, there was no peer-review apparatus set up to make sure that ''off-the-wall predictions don't get out.'' (Such a committee has since been established by NASA.)

Most sins committed in the name of prediction, however, appear to stem from the uncertainty inherent in almost all forecasts. ''People don't understand error bars,'' said one scientist, referring to margins of error. Global climate change and the Red River flood offer two cases in point.

Computer models of the climate system are the major instruments used by scientists to project changes in climate that might result from increasing atmospheric concentrations of heat-trapping gases, like carbon dioxide, emitted by the burning of fossil fuels.

Basing its forecast on the models, a panel of scientists set up by the United Nations has projected that the average surface temperature of the globe will rise by 2 to 6 degrees Fahrenheit, with a best estimate of 3.5 degrees, in the next century, and more after that. This compares with a rise of 5 to 9 degrees since the depths of the last ice age. The temperature has increased by about 1 degree over the last century.

But the magnitude and nature of any climate changes produced by any given amount of carbon dioxide are uncertain. Moreover, it is unclear how much of the gas will be emitted over the next few years, said Dr. Jerry D. Mahlman, a workshop participant who directs the National Oceanic and Atmospheric Administration's Geophysical Fluid Dynamics Laboratory at Princeton, N.J. The laboratory is one of the world's major climate modeling centers, and the oldest.

This uncertainty opens the way for two equal and opposite sins of misinterpretation. ''The uncertainty is used as a reason for governments not to act,'' in the words of Dr. Ronald D. Brunner, a political scientist at the University of Colorado at Boulder. On the other hand, people often put too much reliance on the precise numbers.

In the debate over climate change, the tendency is to state all the uncertainties and caveats associated with the climate model projections -- and then forget about them, said Dr. Steve Rayner, a specialist in global climate change in the District of Columbia office of the Pacific Northwest National Laboratory. This creates a ''fallacy of misplaced confidence,'' he said, explaining that the specific numbers in the model forecasts ''take on a validity not allowed by the caveats.'' This tendency to focus unwisely on specific numbers was termed ''fallacious quantification'' by Dr. Naomi Oreskes, a historian at the University of California at San Diego.

Where uncertainty rules, many at the workshop said, it might be better to stay away from specific numbers altogether and issue a more generalized forecast. In climate change, this might mean using the models as a general indication of the direction in which the climate is going (whether it is warming, for instance) and of the approximate magnitude of the change, while taking the numbers with a grain of salt.

None of which means that the models are not a helpful guide to public policy, said Dr. Mahlman and other experts. For example, the models say that a warming atmosphere, like today's, will produce heavier rains and snows, and some evidence suggests that this is already happening in the United States, possibly contributing to damaging floods. Local planners might be well advised to consider this, Dr. Mahlman said.

One problem in Grand Forks was that lack of experience with such a damaging flood aggravated the uncertainty of the flood forecast. Because the river had never before been observed at the 54-foot level, the models on which the prediction was based were ''flying blind,'' said Dr. Pielke; there was no historical basis on which to produce a reliable forecast.

But this was apparently lost on local officials and the public, who focused on the specific forecast of a 49-foot crest. This number was repeated so often, according to the report of an inquiry by the National Weather Service, that it ''contributed to an impression of certainty.'' Actually, the report said, the 49-foot figure ''created a sense of complacency,'' because it was only a fraction of a foot higher than the record flood of 1979, which the city had survived.

''They came down with this number and people fixated on it,'' Tom Mulhern, the Grand Forks communications officer, said in an interview. The dikes protecting the city had been built up with sandbags to contain a 52-foot crest, and everyone figured the town was safe, he said.

It is difficult to know what might have happened had the uncertainty of the forecast been better communicated. But it is possible, said Mr. Mulhern, that the dikes might have been sufficiently enlarged and people might have taken more steps to preserve their possessions. As it was, he said, ''some people didn't leave till the water was coming down the street.''

Photo: Petty Officer Tim Harris patrolled an area of Grand Forks, N.D., in April 1997, where the Red River flooded the houses up to the second story. Residents, relying on the precision of forecasts, were forced to flee quickly. (Reuters)(pg. F6)


1 – Sampling

Managerial Statistics

KH 19

Course material adapted from Sections 13.1, 13.2, and 14.1 of our textbook Statistics for Business, 2e © 2013 Pearson Education, Inc.

2

Learning Objectives

Describe why sampling is important

Understand the implications of sampling variation

Explain the flaw of averages

Define the concept of a sampling distribution

Determine the mean and standard deviation for the sampling distribution of the sample mean

Describe the Central Limit Theorem and its importance

Determine the mean and standard deviation for the sampling distribution of the sample proportion

3

Descriptive statistics

Collecting, presenting, and describing data

Inferential statistics

Drawing conclusions and/or making decisions concerning a population based only on sample data

Tools of Business Statistics

4

A Population is the set of all items or individuals of interest.
Examples: all likely voters in the next election; all parts produced today; all sales receipts for March.

A Sample is a subset of the population.
Examples: 1,000 voters selected at random for interview; a few parts selected for destructive testing; random receipts selected for audit.

Populations and Samples

5

Properties of Samples

A representative sample is a sample that reflects the composition of the entire population.

A sample is biased if a systematic error occurs in the selection of the sample. For example, the sample may systematically omit a portion of the population.

6

Population vs. Sample

[Diagram: a population of items (a through z) and a sample drawn from it (b, c, g, i, n, o, r, u, y).]

7

Why Sample?

Less time consuming than a census

Less costly to administer than a census

It is possible to obtain statistical results of a sufficiently high precision based on samples

8

Two Surprising Properties

Surprise 1: The best way to obtain a representative sample is to pick members of the population at random.

Surprise 2: Larger populations do not require larger samples.

9

Randomization

A randomly selected sample is representative of the whole population (avoids bias).

Randomization ensures that on average a sample mimics the population.

Randomization enables us to infer characteristics of the population from a sample.

10

Comparison of Two Random Samples

Two large samples (each with 8,000 data points) drawn at random from a population of 3.5 million customers of a bank

11

(In)Famous Biased Sample

The Literary Digest predicted a landslide defeat for Franklin D. Roosevelt in the 1936 presidential election. They selected their sample from, among others, a list of telephone numbers. The size of their sample was about 2.4 million!

Telephones were a luxury during and soon after the Great Depression. Roosevelt's supporters tended to be poor and were grossly underrepresented in the sample.

12

Simple Random Sample (SRS)

A Simple Random Sample (SRS) is a sample of n data points chosen by a method that has an equal chance of picking any sample of size n from the population.

An SRS is the standard to which all other sampling methods are compared.

An SRS is the foundation for virtually all of the theory of statistics.

13

Inferential Statistics

Making statements about a population by examining sample results.

[Diagram: sample statistics (known) are used, via inference, to learn about population parameters (unknown, but estimable from sample evidence).]

14

Tools of Inferential Statistics

Drawing conclusions and/or making decisions concerning a population based on sample results.

Estimation. Example: Estimate the population mean age using the sample mean age.

Hypothesis Testing. Example: Use sample evidence to test the claim that the population mean age is 40.5 years.

15

Estimating Parameters

Parameter: a characteristic of the population (e.g., mean µ)

Statistic: an observed characteristic of a sample (e.g., sample averages $\bar{x}$, $\bar{y}$)

Estimate: using a statistic to approximate a parameter

16

Notation for Statistics and Parameters

Sample statistic (known): mean $\bar{x}$, proportion $\hat{p}$, standard deviation $s$.
Population parameter (unknown): mean $\mu$, proportion $p$, standard deviation $\sigma$.

17

Sampling Variation

Sampling Variation is the variability in the value of a statistic from sample to sample.

Two samples from the same population will rarely (if ever) yield the same estimate.

Sampling variation is the price we pay for working with a sample rather than the population.

18

The Flaw of Averages

19

The Flaw of Averages

Our culture encodes a strong bias either to neglect or ignore variation. We tend to focus instead on measures of central tendency, and as a result we make some terrible mistakes, often with considerable practical import.

Stephen Jay Gould, 1941 – 2002,

evolutionary biologist, historian of science

(continued)

20

Point Estimates

A sample statistic is a point estimate: it provides a single number (e.g., the sample mean) for an unknown population parameter (e.g., the population mean).

A point estimate delivers no information on the possible sampling variation.

A key step in any careful statistical analysis is to quantify the effect of sampling variation.

21

Definitions

An estimator of a population parameter is a random variable that depends on sample information, whose value provides an approximation to this unknown parameter.

A specific value of that random variable is called an estimate.

22

Sampling Distributions

The sampling distribution is the probability distribution that describes how a statistic, such as the mean, varies from sample to sample.

23

Testing of GPS Chips

A manufacturer of GPS chips selects samples for highly accelerated life testing (HALT).

HALT scores range from 1 (failure on first test) to 16 (chip endured all 15 tests without failure).

Even when the production process is functioning normally, there is variation among HALT scores.

24

Testing 400 Chips

Distribution of individual HALT scores

25

Distribution of Daily Average Scores

Distribution of average HALT scores

(54 samples, each with sample size n=20)

26

Benefits of Averaging

Averaging reduces variation: The sample-to-sample variance among average HALT scores is smaller than the variance among individual HALT scores.

The distribution of average HALT scores appears more “bell shaped” than the distribution of individual HALT scores.

27

Sampling Distributions

[Overview: the sampling distribution of the sample mean and the sampling distribution of the sample proportion.]

28

Expected Value of Sample Mean

Let $x_1, x_2, \ldots, x_n$ represent a random sample from a population.

The sample mean value of these observations is defined as

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .$$

The random variable "sample mean" is denoted by $\bar{X}$ and its specific value in the sample by $\bar{x}$.

29

Standard Error of the Mean

Different samples from the same population will yield different sample means.

A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} .$$

The standard error of the mean decreases as the sample size increases.

30

Standard Error of the Mean

The standard error is proportional to σ. As population data become more variable, sample averages become more variable.

The standard error is inversely proportional to the square root of the sample size n. The larger the sample size, the smaller the sampling variation of the averages.

(continued)

31

If the Population is Normal

If a population is normally distributed with mean $\mu$ and standard deviation $\sigma$, then the sampling distribution of the sample mean $\bar{X}$ is also normally distributed, with

$$E(\bar{X}) = \mu \quad\text{and}\quad SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} .$$

32

Sampling Distribution Properties

$$E(\bar{X}) = \mu \qquad (\bar{X}\ \text{is unbiased})$$

[Figure: the normal population distribution and the normal sampling distribution of the mean share the same mean $\mu$.]

33

Sampling Distribution Properties

(continued)

As $n$ increases, $SE(\bar{X})$ decreases.

[Figure: sampling distributions of the mean for a larger sample size (narrower) and a smaller sample size (wider), both centered at $\mu$.]

34

If the Population is not Normal

We can apply the Central Limit Theorem: even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough.

Properties of the sampling distribution:

$$E(\bar{X}) = \mu \quad\text{and}\quad SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} .$$

35

Central Limit Theorem

As the sample size $n$ gets large enough, the sampling distribution of $\bar{X}$ becomes almost normal regardless of the shape of the population.

[Figure: as $n$ increases, the sampling distribution of the mean approaches a normal curve.]

36

If the Population is not Normal (continued)

Sampling distribution properties:

Central tendency: $E(\bar{X}) = \mu$

Variation: $SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

[Figure: a non-normal population distribution and the sampling distribution of the mean, which becomes normal as $n$ increases; larger sample sizes give a narrower sampling distribution, smaller sample sizes a wider one.]

37

How Large is Large Enough?

For most distributions, a sample size of n > 30 will give a sampling distribution that is nearly normal.

For normal population distributions, the sampling distribution of the mean is always normally distributed regardless of the sample size.
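To see the theorem at work, here is a minimal simulation sketch (assuming numpy is available; the exponential population and the sample sizes are illustrative choices, not taken from the slides):

# Simulate the CLT: sample means from a skewed population become normal.
import numpy as np

rng = np.random.default_rng(seed=1)

# A decidedly non-normal population: exponential with mean 1 (so sigma = 1).
population = rng.exponential(scale=1.0, size=1_000_000)

for n in (2, 5, 30):
    # Draw 10,000 samples of size n and compute each sample mean.
    idx = rng.integers(0, population.size, size=(10_000, n))
    means = population[idx].mean(axis=1)
    # The means center on mu = 1 with SE roughly sigma / sqrt(n).
    print(f"n={n:2d}: mean of means = {means.mean():.3f}, "
          f"SD of means = {means.std():.3f}, sigma/sqrt(n) = {1/np.sqrt(n):.3f}")

For n = 30 the distribution of the means is already close to the bell shape the slides describe.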

38

More Formal Condition

Sample Size Condition for an application of the central limit theorem:

A normal model provides an accurate approximation to the sampling distribution of $\bar{X}$ if the sample size $n$ is larger than 10 times the squared skewness and larger than 10 times the absolute value of the kurtosis:

$$n > 10 K_3^2 \quad\text{and}\quad n > 10\,|K_4| .$$

39

Average HALT Scores

Design of the chip-making process indicates that the HALT score of a chip has a mean µ = 7 with a standard deviation σ = 4.

Sampling distribution of average HALT scores (n = 20):

$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right) = N\!\left(7, \frac{4^2}{20}\right), \qquad SE(\bar{X}) = \frac{4}{\sqrt{20}} \approx 0.89 .$$

40

Average HALT Scores

The sampling distribution of average HALT scores is (approximately) a normal distribution with mean 7 and standard deviation 0.89.

(continued)
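The slide's numbers are easy to verify; a short sketch, assuming scipy is installed (the final probability query is an illustrative addition, not from the slides):

# Check the sampling distribution of average HALT scores.
import math
from scipy.stats import norm

mu, sigma, n = 7, 4, 20
se = sigma / math.sqrt(n)                 # standard error of the mean
print(f"SE = {se:.2f}")                   # ~ 0.89

# E.g., probability that a daily average of n = 20 chips falls below 6:
print(f"P(Xbar < 6) = {norm.cdf(6, loc=mu, scale=se):.3f}")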

41

Sampling Distributions of Sample Proportions

[Overview: the sampling distribution of the sample mean (covered above); next, the sampling distribution of the sample proportion.]

42

Population Proportions p

p = the proportion of the population having some characteristic.

The sample proportion $\hat{p}$ provides an estimate of p:

$$\hat{p} = \frac{\#\ \text{items in the sample with the characteristic of interest}}{\text{sample size}}, \qquad 0 \le \hat{p} \le 1 .$$

$\hat{p}$ has a binomial distribution, but can be approximated by a normal distribution when n is large enough.

43

Sampling Distribution

Normal approximation, with properties

$$E(\hat{p}) = p \quad\text{and}\quad SE(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$$

(where p = population proportion).

[Figure: histogram of the approximately normal sampling distribution of $\hat{p}$.]

44

Sample Size Condition

Sample size condition for proportions:

$$n\hat{p} \ge 10 \quad\text{and}\quad n(1-\hat{p}) \ge 10 .$$

If this condition holds, then the distribution of the sample proportion $\hat{p}$ is approximately a normal distribution.
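A quick simulation check of these formulas (assuming numpy; the values p = 0.2 and n = 100 are my choice, anticipating the spam-filter example used later in the course):

# Simulate sample proportions and compare with E(p_hat) and SE(p_hat).
import numpy as np

rng = np.random.default_rng(seed=2)
p, n = 0.2, 100

# 10,000 samples of size n; each sample proportion is a binomial count over n.
p_hats = rng.binomial(n, p, size=10_000) / n

# Theory: E(p_hat) = p = 0.2 and SE(p_hat) = sqrt(p(1-p)/n) = 0.04.
print(f"mean = {p_hats.mean():.3f}, SD = {p_hats.std():.3f}, "
      f"theory SE = {np.sqrt(p * (1 - p) / n):.3f}")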

45

Take Aways

Understand the notion of sampling variation.

Appreciate the dangers of the flaw of averages.

Grasp the concept of a sampling distribution.

Have an idea of the central limit theorem.

Know the sampling distributions of a sample mean and of a sample proportion.

46

Pitfalls

Do not confuse a sample statistic for the population parameter.

Do not fall for the flaw of averages.

2 – Confidence Intervals

Managerial Statistics

KH 19

Course material adapted from Chapter 15 of our textbook Statistics for Business, 2e © 2013 Pearson Education, Inc.

2

Learning Objectives

Distinguish between a point estimate and a confidence interval estimate

Construct and interpret a confidence interval of a population proportion

Construct and interpret a confidence interval of a population mean

3

Point and Interval Estimates

A point estimate is a single number.

A Confidence Interval provides additional information about variability.

[Diagram: a confidence interval extends from a lower confidence limit to an upper confidence limit, centered on the point estimate; the distance between the limits is the width of the confidence interval.]

4

Point Estimates

We can estimate a population parameter with a sample statistic (a point estimate):

mean: $\mu$ is estimated by $\bar{x}$
proportion: $p$ is estimated by $\hat{p}$

5

Confidence Interval Estimate

An interval gives a range of values:

Takes into consideration variation in sample statistics from sample to sample

Based on observations from a single sample

Provides more information about a population characteristic than does a point estimate

Relies on the sampling distribution of the statistic

Stated in terms of a level of confidence; we can never be 100% confident

6

Estimation Process

[Diagram: from a population whose mean $\mu$ is unknown, draw a random sample; the sample mean is $\bar{x}$ = 50, leading to the statement "I am 95% confident that $\mu$ is between 40 and 60."]

7

General Formula

The general formula for all confidence intervals is:

$$\text{Point Estimate} \;\pm\; (\text{Reliability Factor}) \times (\text{Standard Error})$$

The value of the reliability factor depends on the desired level of confidence.

8

Confidence Intervals

[Overview: confidence intervals for the population proportion and for the population mean.]

9

Confidence Interval for the Proportion

Recall that the Central Limit Theorem implies a normal model for the sampling distribution of $\hat{p}$:

$$E(\hat{p}) = p \quad\text{and}\quad SE(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} .$$

$SE(\hat{p})$ is called the Standard Error of the Proportion.

10

Interpretation

In 95% of samples, the sample statistic lies within 1.96 standard errors of the population parameter.

11

Interpretation

(continued)

The probability that the sample proportion deviates by less than 1.96 standard errors of the proportion from the true (but unknown) population proportion p is 95%:

$$P\big({-1.96}\,SE(\hat{p}) \le p - \hat{p} \le {+1.96}\,SE(\hat{p})\big) = 0.95 .$$

12

95% Confidence Interval for p

For 95% of samples, the interval formed by reaching 1.96 standard errors to the left and right of $\hat{p}$ will contain p.

Problem: We do not know the value of the standard error of the proportion, $SE(\hat{p})$, since it depends on the true (but unknown) parameter p.

We estimate this standard error using $\hat{p}$ in place of p:

$$se(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} .$$

13

Confidence Interval for p

The 100(1 – α)% confidence interval for p is

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\le\; p \;\le\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where $z_{\alpha/2}$ is the standard normal value for the level of confidence desired (the "reliability factor"), $\hat{p}$ is the sample proportion, and n is the sample size.

14

Finding the Reliability Factor, $z_{\alpha/2}$

Consider a 95% confidence interval: $1 - \alpha = 0.95$, so $\alpha/2 = 0.025$ in each tail.

In z units, the limits are $z = -1.96$ and $z = 1.96$; in $\hat{p}$ units, the interval runs from the lower confidence limit through the point estimate to the upper confidence limit.

[Figure: standard normal curve with central area 0.95 and area 0.025 in each tail.]

15

Common Levels of Confidence

Most commonly used confidence level is 95%.

Confidence Level | Confidence Coefficient (1 – α) | $z_{\alpha/2}$ value
80%    | .80   | 1.28
90%    | .90   | 1.645
95%    | .95   | 1.96
98%    | .98   | 2.33
99%    | .99   | 2.58
99.8%  | .998  | 3.08
99.9%  | .999  | 3.27

16

Affinity Credit Card

Before deciding to offer an affinity credit card to alumni of a university, the credit card company wants to know how many customers will accept the offer.

Population: Alumni of the university

Parameter of interest: Proportion p of alumni who will return the application for the credit card

17

SRS of Alumni

Question: What should we conclude about the proportion p in the population of 100,000 alumni who will accept the offer if the card is launched on a wider scale?

Method: Construct a confidence interval based on the results of a simple random sample.

18

SRS of Alumni

(continued)

The credit card issuer sent preapproved applications to a sample of 1000 alumni. Of these, 140 accepted the offer and received the card.

Summary statistics: n = 1000, $\hat{p}$ = 140/1000 = 0.14.

19

Checklist for Application of Normal

SRS condition. The sample is a simple random sample from the relevant population.

Sample size condition (for proportion). Both $n\hat{p}$ and $n(1-\hat{p})$ are larger than 10.

20

Credit Card: Confidence Interval

The estimated standard error is

$$se(\hat{p}) = \sqrt{\frac{0.14\,(1-0.14)}{1000}} \approx 0.01097 .$$

The 95% confidence interval is

$$0.14 \pm 1.96 \times 0.01097 \approx [0.1185,\ 0.1615] .$$
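The same interval in a few lines of Python (a sketch, assuming scipy is installed):

# 95% confidence interval for the proportion of alumni accepting the card.
import math
from scipy.stats import norm

n, accepted = 1000, 140
p_hat = accepted / n                        # 0.14
se = math.sqrt(p_hat * (1 - p_hat) / n)     # estimated standard error ~ 0.01097

z = norm.ppf(0.975)                         # reliability factor ~ 1.96
lo, hi = p_hat - z * se, p_hat + z * se
print(f"95% CI for p: [{lo:.4f}, {hi:.4f}]")  # ~ [0.1185, 0.1615]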

21

Credit Card: Conclusion

With 95% confidence, the population proportion that will accept the offer is between 11.85% and 16.15%.

If the bank decides to launch the credit card, might 20% of the alumni accept the offer? It’s not impossible but rather unlikely given the information in our sample; 20% is outside the 95% confidence interval for the unknown proportion p.

22

Margin of Error

The confidence interval

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\le\; p \;\le\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

can also be written as

$$\hat{p} \pm \text{ME},$$

where ME is called the Margin of Error:

$$\text{ME} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} .$$

23

Reducing the Margin of Error

The width of the confidence interval is equal to twice the margin of error,

$$\text{ME} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} .$$

The margin of error can be reduced if the sample size is increased ($n \uparrow$), or the confidence level is decreased ($(1-\alpha) \downarrow$).

24

Margin of Error in the News

You often read in the news statements like the following:

The CNN/USA Today/Gallup poll taken March 7-10 showed that 52% of Americans say… . The poll had a margin of error of plus or minus four percentage points.

No confidence level is given!

The assumed confidence level is typically 95%. In addition, the 1.96 is rounded up to 2.

25

Margin of Error in the News

(continued)

For an interpretation of this statement we use the confidence interval formula $\hat{p} \pm \text{ME}$, where

$$\text{ME} = 0.04 \;\ge\; 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} .$$

We can have (slightly more than) 95% confidence that the true proportion of Americans saying … is between 48% and 56%.

26

Confidence Intervals

[Overview: confidence intervals for the population proportion (covered above); next, the population mean.]

27

Sampling Distribution of the Mean

Recall that the Central Limit Theorem implies a normal model for the sampling distribution of $\bar{X}$:

$$E(\bar{X}) = \mu \quad\text{and}\quad SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} .$$

$SE(\bar{X})$ is called the Standard Error of the Mean.

28

Interpretation

The probability that the sample mean deviates by less than 1.96 standard errors of the mean from the true (but unknown) population mean $\mu$ is 95%:

$$P\big({-1.96}\,SE(\bar{X}) \le \mu - \bar{X} \le {+1.96}\,SE(\bar{X})\big) = 0.95 .$$

Once again, the sample statistic lies within about two standard errors of the corresponding population parameter in 95% of samples.

29

Confidence Interval for μ

Since the population standard deviation $\sigma$ is unknown, we estimate it using the sample standard deviation,

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} .$$

This step introduces extra uncertainty, since s varies from sample to sample.

As an adjustment, we use the t-distribution instead of the normal distribution.

30

Student’s t-Distribution

Consider an SRS of n observations with mean $\bar{x}$ and standard deviation s from a normally distributed population with mean $\mu$. Then the variable

$$T_{n-1} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

follows the Student's t-distribution with (n – 1) degrees of freedom.

31

Student’s t-Distribution

The t-distribution is a family of distributions.

The t-value depends on the degrees of freedom (df): the number of observations that are free to vary after the sample mean has been calculated,

df = n – 1.

32

Student's t-Distribution (continued)

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.

[Figure: t-distributions with df = 5 and df = 13 compared with the standard normal (t with df = ∞).]

Note: t → Z as n increases.

33

t-Distribution Values

With comparison to the Z value:

Confidence Level | t (df = 10) | t (df = 20) | t (df = 30) | Z
.80 | 1.372 | 1.325 | 1.310 | 1.282
.90 | 1.812 | 1.725 | 1.697 | 1.645
.95 | 2.228 | 2.086 | 2.042 | 1.960
.99 | 3.169 | 2.845 | 2.750 | 2.576

Note: t → Z as n increases.

34

Confidence Interval for μ

Assumptions: The population is normally distributed. If the population is not normal, use a "large" sample, and use the Student's t-distribution.

100(1 – α)% Confidence Interval for μ:

$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2,\,n-1}$ is the reliability factor from the t-distribution with n – 1 degrees of freedom and an area of α/2 in each tail.

35

Affinity Credit Card

Before deciding to offer an affinity credit card to alumni of a university, the credit card company wants to know how large a balance the alumni who accept the offer will carry.

Population: (Future) credit card balances of (future) customers among the alumni of the university

Parameter of interest: Mean μ of (future) balances carried by alumni on their affinity credit card

36

SRS of Alumni

The 140 alumni who accepted the offer and received the affinity credit card have been carrying an average monthly balance of $\bar{x}$ = $1,990.50 with a standard deviation of s = $2,833.33.

37

SRS of Alumni

Question: What should we conclude about the average future credit card balance μ on the new affinity credit card for this particular university?

Method: Construct confidence interval.

(continued)

38

Checklist for Application of Normal

SRS condition. The sample is a simple random sample from the relevant population.

Sample size condition (for mean). The sample size is larger than 10 times the squared skewness and 10 times the absolute value of the kurtosis.

39

Credit Card: Confidence Interval

The estimated standard error is

$$se(\bar{X}) = \frac{2{,}833.33}{\sqrt{140}} \approx 239.46 .$$

The t-value for a 95% confidence interval with 139 degrees of freedom is

T.INV.2T(0.05,139) = 1.97718.

The 95% confidence interval is

$$1{,}990.50 \pm 1.97718 \times 239.46 = [1{,}517.04,\ 2{,}463.96] .$$
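A sketch of the same t-based interval in Python (assuming scipy is installed; the t-quantile matches Excel's T.INV.2T(0.05,139)):

# 95% confidence interval for the mean monthly balance.
import math
from scipy.stats import t

n, xbar, s = 140, 1990.50, 2833.33
se = s / math.sqrt(n)                    # estimated standard error ~ 239.46

t_crit = t.ppf(0.975, df=n - 1)          # ~ 1.97718
lo, hi = xbar - t_crit * se, xbar + t_crit * se
print(f"95% CI for mu: [{lo:.2f}, {hi:.2f}]")  # ~ [1517.04, 2463.96]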

40

Credit Card: Conclusion

We are 95% confident that the true but unknown µ lies between $1,517.04 and $2,463.96.

If the bank decides to launch the credit card, might the average balance be $1,250? It’s not impossible but based on the sample results it’s rather unlikely.

41

Confidence Interval and Confidence Level

If P(a ≤ p ≤ b) = 1 – α, then the interval from a to b is called a 100(1 – α)% confidence interval of p.

The quantity (1 – α) is called the confidence level of the interval (α between 0 and 1).

In repeated samples of the population, the true value of the parameter p would be contained in 100(1 – α)% of intervals calculated this way.

42

Intervals and Level of Confidence

[Figure: sampling distribution of the proportion with $E(\hat{p}) = p$; confidence intervals constructed from repeated samples extend from $\hat{p} - z_{\alpha/2}\,se(\hat{p})$ to $\hat{p} + z_{\alpha/2}\,se(\hat{p})$; 100(1 – α)% of the intervals constructed contain p, and 100(α)% do not.]

43

Confidence Level, (1 – α)

Suppose the confidence level is 95%. This is also written (1 – α) = 0.95.

A relative frequency interpretation: from repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter.

44

Common Confusions: Wrong Interpretations

"95% of all customers keep a balance of $1,517 to $2,464."
The CI gives a range for the population mean µ, not the balance of individual customers.

"The mean balance of 95% of samples of 140 accounts will fall between $1,517 and $2,464."
The CI provides a range for µ, not the means of other samples.

45

Common Confusions: Wrong Interpretations (continued)

"The mean balance is between $1,517 and $2,464."
The average balance in the population may not fall within the CI. The confidence level of the interval is 95%; it may not contain µ.

46

Correct Interpretation

We are 95% confident that the mean monthly credit card balance for the population of customers who accept an application lies between $1,517 and $2,464.

The phrase “95% confident” is our way of saying that we are using a procedure that produces an interval containing the unknown mean in 95% of samples.

47

Transforming Confidence Intervals

Obtaining Ranges for Related Quantities

If [L,U] is a 100(1 – α)% confidence interval for µ, then [c×L,c×U] is a 100 (1 – α)% confidence interval for c×µ and [c+L,c+U] is a 100(1 – α)% confidence interval for c+µ.

48

Application: Property Taxes

Motivation

A mayor is considering a tax on business that is proportional to the amount spent to lease property in her city. How much revenue would a 1% tax generate?

49

Property Taxes

Method

Need a confidence interval for µ (average cost of a lease) to obtain a confidence interval for the amount raised by the tax. Check conditions (SRS and sample size) before proceeding.

50

Property Taxes

Mechanics (continued)

Univariate statistics: Total Lease Cost

mean: 478,603.48
standard deviation: 535,342.56
standard error of the mean: 35,849.19
minimum: 20,409.00
median: 290,559.00
maximum: 2,820,213.00
range: 2,799,804.00
skewness: 1.953
kurtosis: 4.138
number of observations: 223
t-statistic for computing 95%-confidence intervals: 1.9707

51

Property Taxes

Mechanics

(continued)

95% confidence interval for average lease cost:

478,603 ± 1.9707 × 35,849 = [407,955, 549,252]

95% confidence interval for average tax revenue per business (1% tax):

0.01 × [407,955, 549,252] = [4,079.55, 5,492.52]
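Because [c×L, c×U] is a CI for c×µ, the tax intervals follow by pure arithmetic; a plain-Python check using the slide's values:

# Transform the CI for the average lease cost into CIs for the tax raised.
lo = 478603 - 1.9707 * 35849   # lower limit for average lease cost
hi = 478603 + 1.9707 * 35849   # upper limit

tax_rate, n_businesses = 0.01, 4500
print(f"CI per business: [{tax_rate * lo:,.2f}, {tax_rate * hi:,.2f}]")
print(f"CI total revenue: [{tax_rate * n_businesses * lo:,.0f}, "
      f"{tax_rate * n_businesses * hi:,.0f}]")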

52

Conclusion

Message

We are 95% confident that the average cost of a lease is between $407,955 and $549,252. The 95% confidence interval for tax raised per business is therefore [$4079, $5493]. Since the number of businesses leased in the city is 4,500, we are 95% confident that the amount raised will be between $18,358,000 and $24,716,000.

53

Best Practices

Be sure that the data are an SRS from the population.

Stick to 95% confidence intervals.

Round the endpoints of intervals when presenting the results.

Use full precision for intermediate calculations.

54

Pitfalls

Do not claim that a 95% confidence interval holds µ.

Do not use a confidence interval to describe other samples.

Do not manipulate the sampling to obtain a particular confidence interval.

3 – Hypothesis Tests

Managerial Statistics

KH 19

Course material adapted from Chapter 16 of our textbook Statistics for Business, 2e © 2013 Pearson Education, Inc.

2

Learning Objectives

Formulate null and alternative hypotheses for applications involving

a single population proportion

a single population mean

Execute the four steps of a hypothesis test

Know how to use and interpret p-values

Know what Type I and Type II errors are

3

Motivating Example

An office manager is evaluating software to filter SPAM e-mails (cost $15,000). To make it profitable, the software must reduce SPAM to less than 20%. Should the manager buy the software?

The manager wants to test the software.

4

Motivating Example

To demonstrate how well the software works, the software vendor applied its filtering system to email arriving at the office. After passing through the filter, a sample of 100 messages contained only 11% spam (and no valid messages were removed).

(continued)

5

Motivating Example

(continued)

Question: Okay, 11% is better than 20%. But does that mean the manager should buy this software?

Method: Use a Hypothesis Test to answer this question.

Idea: Use the sample result, $\hat{p} = 0.11$, to decide whether the software will be profitable, p < 0.2.

6

What is a Hypothesis?

A hypothesis is a claim about the value of an unknown parameter:

population proportion

population mean

Example: The proportion of spam will be below 20%, that is, p < 0.2.

Example: The average monthly rent for all rental properties exceeds $500, that is, μ > 500.

7

The Null Hypothesis, H0

The Null Hypothesis, H0, states the claim to be tested; it specifies a default course of action and preserves the status quo. Example: The proportion of spam that slips past the filter is at least 20% (H0: p ≥ 0.2).

H0 is always about a population parameter, not about a sample statistic:

H0: p ≥ 0.20 (about the parameter p) — not H0: $\hat{p}$ ≥ 0.20 (about the statistic $\hat{p}$).

8

The Null Hypothesis, H0

We begin with the assumption that the null hypothesis is true.

Similar idea to the notion of innocent untilproven guilty

Always contains an "=", "≤", or "≥" sign

May or may not be rejected

(continued)

9

The Alternative Hypothesis, Ha

The Alternative Hypothesis, Ha (H1), is the opposite of the null hypothesis. Example: The proportion of spam that slips past

the filter is less than 20% (Ha: p < 0.2).

Ha never contains an "=", "≤", or "≥" sign.

Ha may or may not be supported.

Ha is generally the hypothesis that the decision maker is trying to support.

10

Spam Filter: Hypotheses

Step 1 of a hypothesis test:

Define the hypotheses H0 and Ha.

H0: p ≥ p0 = 0.20

Ha: p < p0 = 0.20

11

Two Possible Options

We may decide to reject H0 (accept Ha).

Alternatively, we may decide not to reject H0 (we do not accept Ha).

There is no third option.

12

Reason for Rejecting H0

[Figure: sampling distribution of $\hat{p}$ centered at p = 0.2 (if H0 is true); the observed value 0.11 lies far in the left tail.]

If it is unlikely that we would get a sample proportion of this value (0.11) if in fact p = 0.2 were the population proportion, then we reject the null hypothesis that p ≥ 0.2.

13

Errors in Decision-Making

Type I Error

Reject a true null hypothesis. Example: Buy software that will not reduce spam to below 20% of incoming emails.

Considered a serious type of error.

The threshold probability of a Type I Error is α, called the level of significance (or simply the α-level) of the test. It is set in advance by the decision maker.

14

Errors in Making Decisions

Type II Error

Fail to reject a false null hypothesis. Example: Do not buy software that would have reduced spam to below 20% of incoming emails.

The probability of a Type II Error is β; 1 – β is also called the power of a test.

(continued)

15

Outcomes and Probabilities

Possible Hypothesis Test Outcomes

                       Actual Situation
Decision               H0 True             H0 False
Do Not Reject H0       No error (1 – α)    Type II Error (β)
Reject H0              Type I Error (α)    No error (1 – β)

Key: Outcome (Probability)

16

Type I & II Errors

Type I and Type II errors cannot happen atthe same time.

Type I error can only occur if H0 is true.

Type II error can only occur if H0 is false.

17

Evaluation of Hypotheses

Sample proportion $\hat{p} = 0.11 < 0.2$. Is this relationship sufficient to reject the null hypothesis?

No! The claim is about the population proportion p. Maybe we just have a lucky (unlucky?) sample. That is, the test result may be due to sampling error.

18

Evaluation of Hypotheses

Hypothesis tests rely on the sampling distribution of the statistic that estimates the parameter specified in the null and the alternative.

Key question: What is the chance of getting a sample that differs from H0 by as much as (or even more than) this one if H0 is true?

(continued)

19

Spam Filter

A sample of size n = 100 delivered a sample proportion of $\hat{p} = 0.11$.

Question: Assuming H0: p ≥ 0.20 is true, how likely is this deviation of 0.09 (or more)?

Assuming H0 is true, the sampling distribution of $\hat{p}$ is approximately normal with mean p = 0.20 and $SE(\hat{p}) = \sqrt{0.20(1-0.20)/100} = 0.04$ (note that the hypothesized "boundary" value $p_0$ = 0.20 is used to calculate SE).

20

Spam Filter

(continued)

What is the chance of finding a sample proportion of $\hat{p} = 0.11$ or even smaller?

21

Test Statistic

Step 2 of a hypothesis test: Calculate the test statistic.

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.11 - 0.20}{\sqrt{0.2\,(1-0.2)/100}} = -2.25$$

22

Meaning of Test Statistic

The test statistic measures the difference between the sample outcome and the boundary value of the null hypothesis in multiples of the standard error.

Spam filter example: The sample proportion lies 2.25 standard errors of the proportion below the boundary value in the null hypothesis.

Since the sampling distribution is assumed to be normal, the test statistic for proportions is also called the z-statistic.

23

From Test Statistic to Probability

Since the sampling distribution of the sample proportion is (approximately) normal, we can calculate the probability of a sample outcome of at least 2.25 standard errors below the mean.

This probability is the famous p-value.

24

p-value

Step 3 of a hypothesis test:

Calculate the p-value.

p = NORM.S.DIST(-2.25,1) ≈ 0.012

p = NORM.DIST(0.11,0.2,0.04,1) ≈ 0.012

25

Calculating the p-value

$$z = \frac{\hat{p} - p_0}{SE(\hat{p})} = -2.25$$

Under the null hypothesis (H0: p ≥ 0.2), our sample proportion is at least 2.25 standard errors below the population proportion. The probability of such a sample outcome is 1.2% (the p-value):

p-value = NORMSDIST(-2.25) = 0.012

[Figure: standard normal curve with the area to the left of z = -2.25 shaded.]
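The whole test fits in a few lines; a sketch assuming scipy is installed:

# z-test for the spam-filter proportion (H0: p >= 0.2 vs Ha: p < 0.2).
import math
from scipy.stats import norm

n, p_hat, p0 = 100, 0.11, 0.20
se = math.sqrt(p0 * (1 - p0) / n)   # boundary value p0 is used for SE: 0.04

z = (p_hat - p0) / se               # -2.25
p_value = norm.cdf(z)               # left tail, since Ha: p < 0.2; ~ 0.012
print(f"z = {z:.2f}, p-value = {p_value:.3f}")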

26

Type I Error and p-value

Question: Suppose we decide to reject H0. What is the probability of a Type I error?

Answer: The p-value is the (maximal) chance of a Type I error if H0 is rejected based on the observed test statistic.

27

Level of Significance

Common practice is to reject H0 only if the p-value is less than a preset threshold.

This threshold that sets the maximum tolerance for a Type I error is called level of significance or α-level.

Statistically significant difference from the null hypothesis: Data contradicts H0 and leads us to reject H0 since p-value < α.

28

Decision

Step 4 of a hypothesis test:

Compare p-value to α and make a decision.

p-value = 0.012 < 0.05 = α

We reject H0 and accept the alternative hypothesis Ha. The spam software reduces the proportion of spam e-mails to less than 20%. The office manager should buy the software.

29

Summary

30

Take Aways I

The Four Steps of a Hypothesis Test:

1. Define H0 and Ha.
2. Calculate the test statistic.
3. Calculate the p-value.
4. Compare the p-value to the significance level α. Make a decision. Accept Ha if p-value < α.

31

Take Aways II

Hypothesis Testing: The Idea

We always try to prove the alternative hypothesis, Ha.

We then assume that its opposite (the null hypothesis) is true.

H0 and Ha must be collectively exhaustive and mutually exclusive.

We can never possibly prove H0!

32

Take Aways III

We ask the question: how likely is it to obtain our evidence, given that the null hypothesis is (supposedly) true? This probability is called the p-value.

Not likely (small p-value): we have statistically "proven" the alternative hypothesis, so we reject the null.

Likely (p-value not small): we cannot reject the null.

33

Application: Burger King Ads

Motivation

The Burger King ad featuring Coq Roq won critical acclaim (and resulted in much controversy as well as several lawsuits). In a sample of 2,500 homes, MediaCheck found that only 6% saw the ad. An ad must be viewed by 5% or more of households to be effective. Based on these sample results, should the local sponsor run this ad?

34

Burger King Ads

Method

Perform a hypothesis test. Set up the null and alternative hypotheses:

H0: p ≤ 0.05
Ha: p > 0.05

Use α = 0.05. Note that p is the population proportion who watch this ad. (Both SRS and sample size conditions are met.)

35

Burger King Ads

Mechanics (continued)

Perform the necessary calculations for an evaluation of the null hypothesis:

$$z = \frac{0.06 - 0.05}{\sqrt{0.05\,(1-0.05)/2500}} = 2.294$$

NORM.S.DIST(2.294,1) = 0.9891

p-value = 1 – 0.9891 = 0.0109 < 0.05 = α

Reject H0.
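The same calculation as a Python sketch (assuming scipy is installed):

# z-test for the Burger King ad (H0: p <= 0.05 vs Ha: p > 0.05).
import math
from scipy.stats import norm

n, p_hat, p0 = 2500, 0.06, 0.05
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)   # ~ 2.294

p_value = norm.sf(z)    # upper tail, since Ha: p > 0.05; ~ 0.0109
print(f"z = {z:.3f}, p-value = {p_value:.4f}")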

36

Conclusion

Message

The hypothesis test shows a statistically significant result. We can conclude that more than 5% of households watch this ad. The Burger King Coq Roq ad is cost effective and should be run.

37

Hypothesis Test of a Mean

Hypothesis tests of the mean are similar to tests of proportions.

H0 and Ha are claims about the unknown population mean μ. For example,

H0: µ ≤ µ0 and Ha: µ > µ0.

The test statistic uses the random variable $\bar{X}$, the sample mean.

Unlike in the test of proportions, the standard error is not specified, since σ is unknown.

38

Hypothesis Test of a Mean

(continued)

Just as in the calculation of a CI, we estimate the unknown population standard deviation σ with the known sample standard deviation s:

$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} \quad\text{is estimated by}\quad se(\bar{X}) = \frac{s}{\sqrt{n}} .$$

The resulting test statistic is

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} .$$

39

Hypothesis Test of a Mean

In a hypothesis test of a mean the test statistic is called a t-statistic since the appropriate sampling distribution is the t-distribution.

Specifically, the distribution of the t-statistic in a hypothesis test of a mean is the t-distribution with n-1 degrees of freedom.

We use this distribution to calculate the p-value.

(continued)

40

Denver Rental Properties

A firm is considering expanding into the Denver area. In order to cover costs, the firm needs rents in this area to average more than $500 per month. Are Denver rents high enough to justify the expansion?

41

Univariate Statistics

The firm obtained rents for a sample of size n = 45; the average rent was $647.33 with a sample standard deviation s = $298.77.

Univariate statistics: Rent ($/Month)

mean: 647.3333333
standard deviation: 298.7656424
standard error of the mean: 44.53735239
minimum: 140
median: 610
maximum: 1600
range: 1460
skewness: 0.617
kurtosis: 0.992
number of observations: 45
t-statistic for computing 95%-confidence intervals: 2.0154

42

Hypotheses H0 and Ha

Let µ = mean monthly rent for all rental properties in the Denver area.

Step 1: Set up the hypotheses.

H0: µ ≤ µ0 = 500

Ha: µ > µ0 = 500

43

Test Statistic

Step 2: Compute the test statistic.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{647.33 - 500}{44.5374} = 3.308$$

The average rent in the sample is 3.308 standard errors of the mean above the boundary value in the null hypothesis.

44

p-value

Step 3: Calculate the p-value.

T.DIST.RT(3.308,44) = 0.0009394

The p-value is 0.09394% and thus below 0.1%.

45

Make a Decision

Step 4: Compare the p-value to α and make a decision.

p-value = 0.0009394 < 0.05 = α

We reject H0 and accept Ha. We conclude that the average rent in the Denver area exceeds the break-even value.
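A sketch of the same t-test in Python (assuming scipy; t_dist.sf plays the role of Excel's T.DIST.RT):

# t-test for Denver rents (H0: mu <= 500 vs Ha: mu > 500).
import math
from scipy.stats import t as t_dist

n, xbar, s, mu0 = 45, 647.33, 298.77, 500
se = s / math.sqrt(n)                   # ~ 44.54

t_stat = (xbar - mu0) / se              # ~ 3.308
p_value = t_dist.sf(t_stat, df=n - 1)   # right tail ~ 0.0009394
print(f"t = {t_stat:.3f}, p-value = {p_value:.7f}")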

46

Summary: Tests of a Mean

47

Checklist

SRS condition: the sample is a simple random sample from the relevant population.

Sample size condition. Unless the population is normally distributed, a normal model can be used to approximate the sampling distribution of $\bar{X}$ if the sample size n is larger than 10 times both the squared skewness and the absolute value of the kurtosis.

48

Application: Returns on IBM Stock

Motivation

Does stock in IBM return more, on average, than T-Bills? From 1980 through 2005, T-Bills returned 0.5% each month.

49

Returns on IBM Stock

Method

Let µ = mean of all future monthly returns for IBM stock. Set up the hypotheses as follows (Step 1):

H0: µ ≤ 0.005

Ha: µ > 0.005

The sample consists of monthly returns on IBM for 312 months (January 1980 – December 2005).

50

Returns on IBM Stock

(continued)

The sample yields $\bar{x}$ = 0.01063 and s = 0.08053.

Univariate statistics: IBM Return

mean: 0.01063365
standard deviation: 0.08053206
standard error of the mean: 0.00455923
minimum: -0.2619
median: 0.0065
maximum: 0.3538
range: 0.6157
skewness: 0.303
kurtosis: 1.624
number of observations: 312
t-statistic for computing 95%-confidence intervals: 1.9676

51

Returns on IBM Stock

Mechanics (continued)

Step 2: Calculation of test statistic.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.0106 - 0.005}{0.0045590} = 1.236$$

Step 3: Calculation of p-value.

T.DIST.RT(1.236,311) ≈ 0.1088

Step 4: Compare p-value to α = 0.05.

p-value = 0.1088 > 0.05 = α. Do NOT reject H0.

52

Conclusion

Message

According to monthly IBM returns from 1980 through 2005, the IBM stock does not generate statistically significantly higher earnings than comparable investments in US Treasury Bills.

53

Failure to Reject H0

Our failure to reject H0 and to prove Ha does not mean the null is true. We did not prove the null hypothesis.

Our sample evidence is just too weak to prove Ha at a 5% or even 10% significance level. If we had rejected H0, then the chance of making a Type I error (p-value of about 11%) would have been too high for the given level of significance.

If the α-level had been 15% then we could have proven Ha.

54

Significance vs. Importance

Statistical significance does not mean that you have made a practically important or meaningful discovery.

The size of the sample affects the p-value of a test. With enough data, a trivial difference from H0 leads to a statistically significant outcome. Such a trivial difference may be practically unimportant.

55

Confidence Interval vs. Test

Confidence intervals make positive statements about the population: a confidence interval provides a range of parameter values that are compatible with the observed data.

Hypothesis tests provide negative statements: a test provides a precise analysis of specific hypothesized values for a parameter. A test attempts to reject a specific hypothesis for a parameter.

56

Two-tailed Hypothesis Test

Hypotheses in a Two-tailed Hypothesis Test are of the following form:

mean: H0: µ = 0.005 Ha: µ ≠ 0.005

proportion: H0: p = 0.2 Ha: p ≠ 0.2

The calculation of the test statistic is identical to the calculation in a One-tailed Hypothesis Test.

57

Two-Tailed Hypothesis Test

By convention, the p-value in a two-tailed test is defined as two times the p-value of the corresponding one-tailed test.

As a consequence, the two-tailed p-value does not have the intuitive interpretation along the lines

“The probability of the sample result assuming the null is true”.

This convention leads to a paradox.

(continued)

58

One-tailed Test on IBM Returns

Step 1: H0: µ ≤ 0.005, Ha: µ > 0.005

Step 2: Calculation of test statistic.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.0106 - 0.005}{0.004559} = 1.236$$

Step 3: Calculation of p-value.

T.DIST.RT(1.236,311) ≈ 0.1088

Step 4: Compare p-value to α = 0.15.

p-value = 0.1088 < 0.15 = α. Reject H0.

59

Two-tailed Test on IBM Returns

Step 1: H0: µ = 0.005 Ha: µ ≠ 0.005

Step 2: Calculation of test statistic.

t = (x̄ − µ0) / (s/√n) = (0.0106 − 0.005) / 0.004559 ≈ 1.236

Step 3: Calculation of p-value.

T.DIST.2T(1.236,311) ≈ 0.2175

Step 4: Compare p-value to α = 0.15.

p-value = 0.2175 > 0.15 = α.

Do NOT reject H0.
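A short Python sketch of the convention, using the IBM t-statistic (scipy assumed available; the two-tailed p-value is simply twice the one-tailed one):

from scipy import stats

# The same t-statistic yields a one-tailed and a two-tailed p-value
# (t = 1.236 with 311 degrees of freedom, from the IBM example).
t, df = 1.236, 311
p_one = stats.t.sf(t, df)        # approx. 0.1088
p_two = 2 * stats.t.sf(t, df)    # approx. 0.2175
print(p_one, p_two)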

60

Paradox

According to the one-tailed hypothesis test we can prove that µ > 0.005. But according to the two-tailed test we cannot prove that µ ≠ 0.005.

That’s the paradox!

The reason for the convention leading to the paradox is to obtain a sensible relation between two-tailed hypothesis tests and confidence intervals.

61

Two-tailed Tests and Confidence Interval

The hypothesis Ha: µ ≠ 0.005 can be proved at the significance level α if and only if the (1 − α)×100% confidence interval does not include 0.005.

62

Summary

Discussed hypothesis testing methodology

Introduced four-step process of hypothesis testing

Defined p-value

Performed z-test for the proportion

Performed t-test for the mean

Discussed two-tailed hypothesis test

63

Best Practices

Be sure that the data are a simple random sample (SRS) from the population.

Pick the hypotheses before looking at the data.

Pick the α-level before you compute the test statistic and the p-value.

Think about whether α = 0.05 is appropriate for each test.

Report a p-value to summarize the outcome of a test.

64

Pitfalls

Do not confuse statistical significance with substantive importance.

Do not think that the p-value is the probability that the null hypothesis is true.

Avoid cluttering a test summary with jargon.

4 – Simple Linear Regression

Managerial Statistics

KH 19

Course material adapted from Chapter 19 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

2

Learning Objectives

Calculate and interpret the simple linear regression equation for a set of data

Describe the meaning of the coefficients of the regression equation in the context of business applications

Examine and interpret the scatterplot and the residual plot as they relate to a regression

Understand the meaning (and limitation) of the R-squared statistic

3

Diamond Prices

Motivation: What is the relationship between the price and weight of diamonds?

Method: Using a sample of 320 emerald-cut diamonds of various weights, regression analysis produces an equation that relates price to weight.

Mechanics: Let y denote the response (“dependent”) variable (price) and let x denote the explanatory (“independent”) variable (weight).

4

Scatterplot of Price vs. Weight

[Figure: scatterplot of Price ($) against Weight (carats).]

5

Linear Equation

There appears to be a linear trend.

We identify the trend line (“best-fit line” or “fitted line”) by an intercept b0 and a slope b1.

The equation of the fitted line is

Estimated Price = b0 + b1 × Weight.

In generic terms, ŷ = b0 + b1x.

6

Residuals

Not all data points will lie on the best-fit line.

The Residuals are the vertical deviations from the data points to the line: e = y − ŷ.

7

Method of Least Squares

The Method of Least Squares determines the best-fit line by minimizing the sum of squared residuals.

The method uses differential calculus to obtain the values of the coefficients b0 and b1 that minimize the sum of squared residuals, also called the sum of squared errors, SSE.

8

Minimizing SSE

Let the index i indicate the ith data point, (xi,yi).

min SSE = min Σi ei² = min Σi (yi − ŷi)² = min Σi [yi − (b0 + b1xi)]²

9

Least Square Regression

The method of least squares generates the following coefficient values:

b1 = [ Σ i=1..n (xi − x̄)(yi − ȳ) ] / [ Σ i=1..n (xi − x̄)² ] = r × (sY / sX)

b0 = ȳ − b1 × x̄
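As an illustration, a minimal Python sketch of these formulas on made-up data (the weights and prices below are hypothetical and are not the diamond sample):

import numpy as np

# Least-squares coefficients on made-up data (hypothetical weights/prices).
x = np.array([0.30, 0.35, 0.40, 0.45, 0.50])
y = np.array([850.0, 960.0, 1100.0, 1280.0, 1370.0])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
# Equivalent correlation form: b1 = r * (s_Y / s_X)
r = np.corrcoef(x, y)[0, 1]
assert np.isclose(b1, r * y.std(ddof=1) / x.std(ddof=1))
print(b0, b1)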

10

Diamonds: Fitted Line

The least squares regression equation relating diamond prices to weight is

Estimated Price =

43.5 + 2670 Weight

Regression: Price ($)
                               constant       Weight (carats)
coefficient                    43.48910163    2669.745803
std error of coef              71.90155144    172.4731816
t-ratio                        0.6048         15.4792
p-value                        54.5715%       0.0000%
beta-weight                                   0.6555

standard error of regression   170.2149256
R-squared                      42.97%
adjusted R-squared             42.79%
number of observations         320
residual degrees of freedom    318
t-statistic for computing 95%-confidence intervals   1.9675

11

Using the Fitted Line

The average price of a diamond that weighs 0.4 carat is

Estimated Price = 43.49 + 2669.75 × 0.4

≈ 1111.39,

that is, the estimated price is (about) $1,111.

A diamond that weighs 0.5 carat costs, on average, about $267 more than one weighing 0.4 carat.

12

Illustration

13

Interpreting the Slope

The slope coefficient b1 describes how differences in the explanatory variable x associate with differences in the response y.

In the diamond example, we can interpret the slope b1 as the marginal cost of an additional carat (i.e., the marginal cost is $2,670 per carat).

14

Interpreting the Intercept

The intercept b0 estimates the average response when x = 0 (where the line crosses the y axis).

The intercept is the portion of y that is present for all values of x.

In the diamond example we can interpret b0 as a fixed cost of $43.49 per diamond.

15

Interpreting the Intercept

In many applications, the intercept coefficient does not have a useful interpretation.

Unless the range of x values includes zero, the value for b0 is the result of an extrapolation.

(continued)

16

Residual Plot

A Residual Plot shows the variation that remains in the data after accounting for the linear relationship defined by the fitted line. Put differently, the plot shows the variation of the data points around the fitted line.

The residuals should be plotted against the predicted values of y (or against x) to check for patterns.

17

Residual Plot

If the least squares line captures the association between x and y, then a plot of residuals should stretch out horizontally with consistent vertical scatter. No particular pattern should be visible.

Our task is to visually check for the absence of a pattern.

(continued)

18

Residuals vs. Predicted Values

[Figure: residual plot, residuals vs. predicted values of Price ($).]

19

Variation of Residuals

The standard deviation of the residuals measures how much the residuals vary around the fitted line.

This standard deviation is called the Standard Error of Regression or the Root Mean Squared Error (RMSE).

se² = SSE / (n − 2) = (e1² + e2² + … + en²) / (n − 2)
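A small Python sketch of this definition (the function name is ours):

import numpy as np

# Standard error of regression (RMSE) from the residuals e of a simple regression.
def standard_error_of_regression(e: np.ndarray) -> float:
    sse = np.sum(e ** 2)                # sum of squared residuals
    return np.sqrt(sse / (len(e) - 2))  # n - 2: two estimated coefficients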

20

Diamonds

For the diamond example, se = 170.21.

The standard error of regression is $170.21.

Regression: Price ($)
                               constant       Weight (carats)
coefficient                    43.48910163    2669.745803
std error of coef              71.90155144    172.4731816
t-ratio                        0.6048         15.4792
p-value                        54.5715%       0.0000%
beta-weight                                   0.6555

standard error of regression   170.2149256
R-squared                      42.97%
adjusted R-squared             42.79%
number of observations         320
residual degrees of freedom    318
t-statistic for computing 95%-confidence intervals   1.9675

21

Measures of Variation

[Figure: decomposition of variation at a data point (xi, yi): SST = Σ(yi − ȳ)², SSR = Σ(ŷi − ȳ)², SSE = Σ(yi − ŷi)².]

22

Measures of Variation

SST = total sum of squares: variation of the yi values around their mean ȳ.

SSR = regression sum of squares: explained variation attributable to the linear relationship between x and y.

SSE = error sum of squares (sum of squared errors): variation attributable to factors other than the linear relationship between x and y.

(continued)

23

Measures of Variation

Total variation is made up of two parts:

SST = SSR + SSE

SST = Σ(yi − ȳ)²  (Total Sum of Squares)
SSR = Σ(ŷi − ȳ)²  (Regression Sum of Squares)
SSE = Σ(yi − ŷi)²  (Error Sum of Squares)

where:
ȳ = average value of the dependent variable
yi = observed values of the dependent variable
ŷi = predicted value of y for the given xi value

(continued)

24

The Coefficient of Determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable.

The coefficient of determination is also called R-squared and is denoted by r2 or R2.

Coefficient of Determination, R2

R² = SSR / SST = regression sum of squares / total sum of squares

note: 0 ≤ R² ≤ 1
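A minimal Python sketch of the decomposition and of R² (the function name is ours; the identity SST = SSR + SSE holds for a least-squares fit with an intercept):

import numpy as np

# Variation decomposition and R-squared from observed y and fitted y_hat.
def r_squared(y: np.ndarray, y_hat: np.ndarray) -> float:
    sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
    sse = np.sum((y - y_hat) ** 2)         # error sum of squares
    assert np.isclose(sst, ssr + sse)      # holds for least-squares fits with intercept
    return ssr / sst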

25

Examples of R-squared Values

r² = 1

[Figure: two scatterplots of Y against X, each with r² = 1.]

r2 = 1

Perfect linear relationship between X and Y:

100% of the variation in Y is explained by variation in X.

26

Examples of R-squared Values

0 < r² < 1

[Figure: two scatterplots of Y against X showing weaker linear relationships.]

Weaker linear relationships between X and Y:

Some but not all of the variation in Y is explained by variation in X.

(continued)

27

Examples of R-squared Values

r² = 0

No linear relationship between X and Y:

The value of Y does not depend on X. (None of the variation in Y is explained by variation in X.)

[Figure: scatterplot of Y against X with r² = 0.]

(continued)

28

Diamonds

For the diamond example,

r² = 0.4297.

The R-squared is 43%. That is, the regression explains 43% of the variation in price.

Regression: Price ($)
                               constant       Weight (carats)
coefficient                    43.48910163    2669.745803
std error of coef              71.90155144    172.4731816
t-ratio                        0.6048         15.4792
p-value                        54.5715%       0.0000%
beta-weight                                   0.6555

standard error of regression   170.2149256
R-squared                      42.97%
adjusted R-squared             42.79%
number of observations         320
residual degrees of freedom    318
t-statistic for computing 95%-confidence intervals   1.9675

29

Checklist for Simple Regression

Linear: Examine the scatterplot to see if the pattern resembles a straight line.

Random residual variation: Examine the residual plot to make sure no pattern exists.

(No obvious lurking variable: Think about whether other explanatory variables may better explain the linear association between x and y.)

30

Application: Lease Costs

Motivation

How can a dealer anticipate the effect of age on the value of a used car? The dealer estimates that $4,000 is enough to cover the depreciation per year.

31

Lease Costs

Method

Use regression analysis to find the equation that relates y (resale value in dollars) to x (age of the car in years). The car dealer has data on the prices and ages of 218 used BMWs in the Philadelphia area.

32

Lease Costs

Mechanics

(Think about lurking variables)

Check scatterplot

Run regression

Check residual plot

(continued)

33

Lease Costs: Scatterplot

[Figure: scatterplot of Price against Age (years) with fitted line: Price = 39851.7199 − 2905.5284 Age.]

34

Lease Costs: Regression

Mechanics

Regression: Price
                               constant      Age
coefficient                    39851.7199    -2905.5284
std error of coef              758.460867    219.3264
t-ratio                        52.5429       -13.2475
p-value                        0.0000%       0.0000%
beta-weight                                  -0.6695

standard error of regression   3366.63713
R-squared                      44.83%
adjusted R-squared             44.57%
number of observations         218
residual degrees of freedom    216
t-statistic for computing 95%-confidence intervals   1.9710

35

Lease Costs: Residual Plot

[Figure: residual plot, residuals vs. predicted values of Price.]

36

Lease Costs: Regression

Mechanics

The linear regression equation is

Estimated Price = 39,851.72 – 2,905.53 Age

The R-squared is 0.4483, and the standard error of regression is se = $3,366.64.

37

Conclusion

Message

The results indicate that used BMWs decline in resale value by $2,900 per year. The current lease price of $4,000 per year appears profitable. However, the fitted line leaves more than half of the variation unexplained.

Leases longer than 5 years would require extrapolation.

38

Best Practices

Always look at the scatterplot.

Know the substantive context of the model.

Describe the intercept and slope using units of the data.

Limit predictions to the range of observed conditions.

39

Pitfalls

Do not assume that changing x causes changes in y.

Do not forget lurking variables.

Do not trust summaries like R-squared without looking at plots.

Do not call a regression with a high R-squared “good” or a regression with a low R-squared “bad”.

5 – Simple Regression Model

Managerial Statistics

KH 19

Course material adapted from Chapter 21 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

2

Learning Objectives

Understand the framework of the simple linear regression model

Calculate and interpret confidence intervals for the regression coefficients

Perform hypothesis tests on the regression coefficients

Understand the difference between confidence and prediction intervals for the predicted value

3

Berkshire Hathaway

Motivation: How can we test the CAPM (Capital Asset Pricing Model) for Berkshire Hathaway stock?

Method: Formulate the simple regression with the percentage excess return on Berkshire Hathaway stock as y and the percentage excess return on the whole stock market (the "value-weighted stock market index") as x.

4

From Description to Inference

We want not only to describe the historical relationship between x and y that is evident in the data, but also to make inferences about the underlying population.

We have to think of our data as a sample from a population.

5

From Description to Inference

Naturally, the question arises: what conclusions can we derive from the sample about the population?

The central idea is to use inference related to regression: standard errors, confidence intervals and hypothesis tests.

(continued)

6

Model of the Population

The Simple Linear Regression Model (SRM) is a model for the association in the population between an explanatory variable x and a response variable y.

The SRM equation describes how the (conditional) mean of y depends on x.

The SRM assumes that these means lie on a straight line with intercept β0 and slope β1:

µy|x = E(Y | X = x) = β0 + β1x

7

Model of the Population

The response variable y is a random variable. The actual values vary around the mean. The deviations of responses around their (conditional) mean are called errors: ε = y − µy|x.

Errors ε can be positive or negative. They have zero mean; that is, the average deviation from the line is zero.

(continued)

8

Simple Linear Regression Model

The population regression model:

y = β0 + β1x + ε

where y is the dependent variable, x is the independent variable, β0 is the population y-intercept, β1 is the population slope coefficient, and ε is the random error term. β0 + β1x is the linear component and ε is the random error component.

9

(continued)

Simple Linear Regression Model

yi = β0 + β1xi + εi

[Figure: at each xi, the observed value of y deviates from the average value of y on the line (intercept β0, slope β1) by the random error εi.]

10

Data Generating Process

The “true regression line” is a characteristic of the population, not the observed data.

The true line’s parameters β0 and β1 are (and will remain) unknown!

The SRM is a model and offers a simplified view of the population.

The observed data points are a simple random sample from the population.

The fitted line provides an estimate of the population regression line.

11

Simple Linear Regression Equation

The simple linear regression equation provides an estimate of the population regression line:

ŷi = b0 + b1xi

where ŷi is the estimated (or predicted) y value for observation i, xi is the value of x for observation i, b0 is the estimate of the regression intercept, and b1 is the estimate of the regression slope.

The individual error terms (residuals) ei are

ei = yi − ŷi = yi − (b0 + b1xi),

where yi is the observed value of y for observation i.

12

Estimates vs. Parameters

13

From Description to Inference

We want to use the estimated regression line to make inferences about the true relationship between the explanatory and the response variable.

The central idea is to use the standard statistical tools: standard errors, confidence intervals and hypothesis tests.

The application of these tools requires us to make some assumptions.

14

SRM: Classical Assumptions

(1) The regression model is linear.

(2) The error term ε has zero mean, E(ε) = 0.

(3) The explanatory variable x and the error term ε are uncorrelated.

(4) The error terms are uncorrelated with each other.

15

SRM: Classical Assumptions

(5) The error term has a constant variance, Var(ε) = σe² for any value of x. (homoskedasticity)

(6) The error terms are normally distributed.

(This assumption is optional but usually invoked.)

(continued)

16

Inference

If assumptions (1) – (6) hold, then we can easily compute confidence intervals for the unknown parameters β0 and β1. Similarly, we can perform hypothesis tests for these parameters.

17

Modeling Process: Practical Checklist

Before looking at plots or running a regression, ask the following questions:

Does a linear relationship make sense to us?

What type of relationship (sign of coefficients) do we expect?

Could there be lurking variables?

Then begin working with data.

18

Modeling Process: Practical Checklist

Plot y versus x and verify a linear association in the scatterplot.

Compute the fitted line.

Plot the residuals versus the predicted values (or x) and inspect the residual plot. Do the …

… residuals appear to be independent?

… residuals appear to have similar variances?

(… residuals appear to be nearly normal?)

(Time series require additional checks.)

(continued)

19

CAPM: Berkshire Hathaway

Check scatterplot: relationship appears linear

[Figure: scatterplot of % Change Berk-Hath against % Change Market.]

20

CAPM: Berkshire Hathaway

Run simple linear regression

(continued)

Regression: % Change Berk-Hath

constant % Change Market

coefficient 1.39620459 0.72234946

std error of coef 0.33968223 0.07776332

t-ratio 4.1103 9.2891

p-value 0.0049% 0.0000%

beta-weight 0.4334

standard error of regression 6.51740865

R-squared 18.79%

adjusted R-squared 18.57%

number of observations 375

residual degrees of freedom 373

t-statistic for computing

95%-confidence intervals 1.9663

21

CAPM: Berkshire Hathaway

Check residual plot: no pattern visible

(continued)

[Figure: residual plot, residuals vs. predicted values of % Change Berk-Hath.]

22

Standard Errors of the Coefficients

The Standard Errors of the Coefficients describe the sample-to-sample variability of the coefficients b0 and b1.

The estimated standard error of b1, se(b1), is

se(b1) = se / ( sx × √(n − 1) )

23

Estimated Standard Error of b1

The estimated standard error of b1 depends on three factors:

Standard deviation of the residuals, se. As se increases, the standard error se(b1) increases.

Sample size n. As n increases, the standard error se(b1) decreases.

Standard deviation sx of x. As sx increases, the standard error se(b1) decreases.
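A small sketch of the formula; the sx value below is illustrative only (not taken from any slide), chosen so the result lands near the diamond regression's se(b1) of about 172.5:

import math

# se(b1) = s_e / (s_x * sqrt(n - 1)); the s_x here is a hypothetical input.
def se_b1(s_e: float, s_x: float, n: int) -> float:
    return s_e / (s_x * math.sqrt(n - 1))

print(se_b1(s_e=170.21, s_x=0.0553, n=320))  # approx. 172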

24

CAPM: Berkshire Hathaway

CAPM regression for Berkshire HathawayRegression: % Change Berk-Hath

constant % Change Market

coefficient 1.39620459 0.72234946

std error of coef 0.33968223 0.07776332

t-ratio 4.1103 9.2891

p-value 0.0049% 0.0000%

beta-weight 0.4334

standard error of regression 6.51740865

R-squared 18.79%

adjusted R-squared 18.57%

number of observations 375

residual degrees of freedom 373

t-statistic for computing

95%-confidence intervals 1.9663

25

Confidence Intervals

Confidence intervals for the coefficients

The 95% confidence interval for β1 is

b1 ± t0.025,n−2 × se(b1).

The 95% confidence interval for β0 is

b0 ± t0.025,n−2 × se(b0).

26

Confidence Intervals: CAPM

The 95% confidence interval for β1 is

0.72234 ± 1.9663×0.077763 = [0.5694, 0.8753].

The 95% confidence interval for β0 is

1.3962 ± 1.9663×0.33968 = [0.7283, 2.064].
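These interval endpoints can be reproduced in a few lines of Python (the numbers come from the regression output above):

from scipy import stats

# 95% CI for the CAPM slope, from the regression output above.
b1, se_b1, df = 0.72234946, 0.07776332, 373
t_crit = stats.t.ppf(0.975, df)                    # approx. 1.9663
lo, hi = b1 - t_crit * se_b1, b1 + t_crit * se_b1  # approx. [0.5694, 0.8753]
print(lo, hi)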

27

Hypothesis Tests

Hypothesis tests on the coefficients

Test statistic for H0: β1 = 0:

t = b1 / se(b1)

Test statistic for H0: β0 = 0:

t = b0 / se(b0)

28

Hypothesis Tests: CAPM

Hypothesis test of statistical significance for β1: The t-statistic of 9.2891 with a p-value of less than 0.0001% indicates that the slope is significantly different from zero.

Hypothesis test of statistical significance for β0: The t-statistic of 4.1103 with a p-value of 0.0049% indicates that the intercept is significantly different from zero.
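A quick Python check of the slope's t-ratio and its two-sided p-value (the numbers come from the output above):

from scipy import stats

# t-ratio and two-sided p-value for H0: beta1 = 0 (CAPM slope).
b1, se_b1, df = 0.72234946, 0.07776332, 373
t = b1 / se_b1                  # approx. 9.289
p = 2 * stats.t.sf(abs(t), df)  # essentially zero
print(t, p)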

29

Application: Locating a Gas Station

Motivation

Does traffic volume affect gasoline sales? How much more gasoline can be expected to be sold at a gas station with an average of 40,000 drive-bys a day compared to one with an average of 32,000 drive-bys?

30

Gas Station

Method

Use sales data from a recent month obtained from 80 gas stations (from the same franchise).

Run a regression of sales against traffic volume.

The 95% confidence interval for 8,000 times the estimated slope will indicate how much more gas the busier location is expected to sell.

31

Gas Station

Mechanics

(Think about lurking variables)

Check scatterplot

Run regression

Check residual plot

(continued)

32

Gas Station: Scatterplot

Mechanics

Check scatterplot: relationship appears linear

[Figure: scatterplot of Sales (000 gal.) against Traffic Volume (000).]

33

Gas Station: Regression

Mechanics

Run a regression

Regression: Sales (000 gal.)

constant Traffic Volume (000)

coefficient -1.3380974 0.23672864

std error of coef 0.94584359 0.02431421

t-ratio -1.4147 9.7362

p-value 16.1132% 0.0000%

beta-weight 0.7407

standard error of regression 1.5054068

R-squared 54.86%

adjusted R-squared 54.28%

number of observations 80

residual degrees of freedom 78

t-statistic for computing

95%-confidence intervals 1.9908

34

Gas Station: Residual Plot

Mechanics

Check the residual plot: no pattern

[Figure: residual plot, residuals vs. predicted values of Sales (000 gal.).]

35

Gas Station: Regression

Mechanics

The linear regression equation is

Estimated Sales = -1.338 + 0.23673 Traffic Vol.

The 95% confidence interval for β1 is

0.23673 ± 1.9908×0.024314 = [0.1883, 0.2851].

The 95% confidence interval for 8000×β1 is

8000×[0.1883, 0.2851] ≈ [1507, 2281].

36

Conclusion

Message

Based on a sample of 80 gas stations, we expect that a station located at a site with 40,000 drive-bys will sell, on average, from 1,507 to 2,281 more gallons of gas daily than a location with 32,000 drive-bys.

37

Standard Errors of the Fitted Value

The fitted value ŷ for a given value of x is an estimator of two different unknown values:

It is a point estimate for the average value of y for all data points with the particular x value.

It is a point estimate for the y value of a single observation with this particular x value.

It is much more difficult to make a prediction about a single observation than to make a prediction about an average value.

38

SE Estimated Mean

[Figure: fitted line ŷ = b0 + b1x (y = Sales, x = Traffic Volume); at x = 40, ŷ = 8.13. The standard error of ŷ for estimating µy|x is the SE of the estimated mean; it yields a confidence interval for average Sales at Traffic Volume = 40.]

39

SE Prediction

[Figure: the same fitted line at x = 40, ŷ = 8.13. The standard error of ŷ for estimating the average y at x is the SE of the estimated mean; the standard error for estimating an individual y is the SE of prediction, which yields a prediction interval for Sales at Traffic Volume = 40. (SE of prediction)² = (SE of est. mean)² + (SE of regression)².]

40

Standard Errors of the Fitted Value

The Standard Error of the Estimated Mean captures the variability of the estimated mean of y around µy|x, the (true but unknown) population average of y at the given x.

The fitted value ŷ = b0 + b1x is our estimator for the average y at x. The SE of Estimated Mean is a measure of its sample-to-sample variation.

41

Standard Errors of the Fitted Value

The Standard Error of Regression, se, measures the variability of the individual y around the fitted line.

By SRM assumption (5) (homoskedasticity), the standard deviation of y around the average µy|x does not vary with x; this standard deviation is estimated by the SE of Regression. (Note: it is not the standard error of any estimator.)

(continued)

42

The Standard Error of Prediction captures the variability of any individual observation y around µy|x, the (true but unknown) population average y at any given x.

(SE of Prediction)² = (SE of Est. Mean)² + (SE of Regression)²

Standard Errors of the Fitted Value (continued)

43

Two Different Intervals

Confidence Interval: An interval designed to hold an unknown population parameter with some level (often 95%) of confidence.

Prediction Interval: An interval designed to hold a fraction of the values of the variable y (for a given value of x).

A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than a parameter of a population.

44

CI vs. PI

(1 − α) Confidence Interval for a mean

Predicted Value ± TINV(α,df)×SE Est. Mean

Prediction Interval for a single observation

Predicted Value ± TINV(α,df)×SE Prediction

Prediction intervals are sensitive to SRM assumptions (5), constant variance, and (6), normal errors.
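A minimal Python sketch of both recipes, using the gas-station numbers shown on the next slide:

import math
from scipy import stats

# CI and PI at Traffic Volume = 40, using the numbers on the next slide.
y_hat, se_mean, se_reg, df = 8.131048, 0.173427, 1.505407, 78
t_crit = stats.t.ppf(0.975, df)                  # approx. 1.9908
se_pred = math.sqrt(se_mean ** 2 + se_reg ** 2)  # approx. 1.5154
ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)  # approx. [7.786, 8.476]
pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)  # approx. [5.114, 11.148]
print(ci, pi)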

45

Gas Station: CI and PI

Prediction, using most-recent regression
                               constant    Traffic Volume (000)
coefficients                   -1.3381     0.236729
values for prediction                      40

predicted value of Sales (000 gal.)   8.131048
standard error of prediction          1.515364
standard error of regression          1.505407
standard error of estimated mean      0.173427

confidence level 95.00%
t-statistic 1.9908
residual degr. freedom 78

confidence limits for prediction:      lower 5.114191, upper 11.14791
confidence limits for estimated mean:  lower 7.785781, upper 8.476316

95% CI: [7.786, 8.476]

95% PI: [5.114, 11.148]

46

Interpretation of Intervals

We are 95% confident that average sales at gas stations with 40,000 drive-bys per day are between 7,786 gallons and 8,476 gallons.

We are 95% confident that sales at an individual gas station with 40,000 drive-bys per day are between 5,114 gallons and 11,148 gallons.

47

Best Practices

Verify that your model makes sense, both visually and substantively.

Consider other possible explanatory variables.

Check the conditions, in the listed order.

Use confidence intervals to express what you know about the slope and intercept.

Check the assumptions of the SRM carefully before using prediction intervals.

Be careful when extrapolating.

48

Pitfalls

Don’t overreact to residual plots.

Do not mistake varying amounts of data for unequal variances.

Do not confuse confidence intervals with prediction intervals.

Do not expect that r² and se must improve with a larger sample.

6 – Multiple Regression

Managerial Statistics

KH 19

Course material adapted from Chapter 23 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

2

Learning Objectives

Apply multiple regression analysis to decision-making situations in business

Analyze and interpret multiple regression models

Understand the difference between partial and marginal slopes

Decide when to exclude variables from a regression model

3

Chain of Women’s Apparel Stores

Motivation: How are sales at a chain of women’s apparel stores (annually in dollars per square foot of retail space) affected by competition (number of competing apparel stores in the same shopping mall)?

First approach: Formulate a simple regression with sales at stores of this chain as the response variable y and the number of competing stores as the explanatory variable x.

4

Scatterplot of Sales vs. Competitors

[Figure: scatterplot of Sales ($/sq ft) against Competitors.]

5

Simple Linear Regression

Positive relationship: more competitors, higher sales!

Does this make sense?

Regression: Sales ($/sq ft)
                               constant      Competitors
coefficient                    502.201557    4.63517778
std error of coef              25.4436616    8.74691578
t-ratio                        19.7378       0.5299
p-value                        0.0000%       59.8029%
beta-weight                                  0.0666

standard error of regression   105.778443
R-squared                      0.44%
adjusted R-squared             -1.14%
number of observations         65
residual degrees of freedom    63
t-statistic for computing 95%-confidence intervals   1.9983

6

Interpretation

A large number of competitors is indicative of a shopping mall in a location with a high median household income. Put differently, the number of competitors and the median household income are positively correlated.

The simple regression of Sales on Competitors mixes the decrease in sales associated with increased competition with the increase in sales associated with higher income levels (which accompany a larger number of competitors).

7

Apparel Sales: Multiple Regression

Multiple regression with 2 explanatory variables:

Median household income in the area (in thousands of dollars)

Number of competing apparel stores in the same mall

Response variable as before:

Sales at stores of the chain (annually in dollars per square foot of retail space)

8

Apparel Sales: Multiple Regression

Estimated Sales = 60.359 + 7.966 Income – 24.165 Competitors

Regression: Sales ($/sq ft)
                               constant      Income ($000)   Competitors
coefficient                    60.3586702    7.965979876     -24.16503223
std error of coef              49.290165     0.838249629     6.38991396
t-ratio                        1.2246        9.5031          -3.7817
p-value                        22.5374%      0.0000%         0.0353%
beta-weight                                  0.8727          -0.3473

standard error of regression   68.03062709
R-squared                      59.47%
adjusted R-squared             58.17%
number of observations         65
residual degrees of freedom    62
t-statistic for computing 95%-confidence intervals   1.9990
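The course works from spreadsheet output; as an alternative, here is a hedged sketch of how one might fit the same regression in Python with the statsmodels library. The file name and column names are hypothetical stand-ins for the chain's data:

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file and column names standing in for the chain's data.
df = pd.read_csv("apparel.csv")
fit = smf.ols("Sales ~ Income + Competitors", data=df).fit()
print(fit.params)     # intercept and the two partial slopes
print(fit.summary())  # std errors, t-ratios, p-values, R-squared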

9

Sales: Residual Plot

Check the residual plot: no pattern

[Figure: residual plot, residuals vs. predicted values of Sales ($/sq ft).]

10

Interpreting the Equation

The slope 7.966 for Income implies that a store in a location with a median household income higher by $10,000 sells, on average, $79.66 more per square foot than a store in a less affluent location with the same number of competitors.

The slope -24.165 for Competitors implies that, among stores in equally affluent locations, each additional competitor lowers average sales by $24.165 per square foot.

11

Multiple Regression

The Multiple Regression Model (MRM) is a model for the association in the population between multiple explanatory variables x1, x2, …,xk and a response y.

While the SRM bundles all but one explanatory variable into the error term, multiple regression allows for the inclusion of several variables in the model.

Multiple regression separates the effects of each explanatory variable on the response and reveals which really matter.

12

Multiple Regression Model

Idea: Examine the linear relationship between a response (y) and two or more explanatory variables (xi).

Multiple regression model with k independent variables:

y = β0 + β1x1 + β2x2 + … + βkxk + ε

where β0 is the y-intercept, β1, …, βk are the population slopes, and ε is the random error.

13

Multiple Regression Equation

The coefficients of the multiple regression model are estimated using sample data.

Estimated multiple regression equation:

ŷ = b0 + b1x1 + b2x2 + … + bkxk

where b0 is the estimated intercept and b1, …, bk are the estimated slope coefficients.

14

Graph for Two-Variable Model

[Figure: regression plane ŷ = b0 + b1x1 + b2x2 over the (x1, x2) plane.]

15

Residuals in a Two-Variable Model

[Figure: sample observation (x1i, x2i, yi) and the fitted plane ŷ = b0 + b1x1 + b2x2; the residual is ei = yi − ŷi.]

16

MRM: Classical Assumptions

(1) The regression model is linear.

(2) The error term ε has zero mean, E(ε) = 0.

(3) All explanatory variables x1, x2, …,xk are uncorrelated with the error term ε.

(4) Observations of the error term are uncorrelated with each other.

17

MRM: Classical Assumptions

(5) The error term has a constant variance, Var(ε) = σe² for any value of x. (homoskedasticity)

(6) No explanatory variable is a perfect linear function of any other explanatory variables.

(7) The error terms are normally distributed.(This assumption is optional but usually invoked.)

(continued)

18

Multiple vs. Simple Regressions

Partial slope: slope of an explanatory variable in a multiple regression that statistically excludes the effects of other explanatory variables.

Marginal slope: slope of the explanatory variable in a simple regression.

Partial and Marginal slopes only agree when the explanatory variables are uncorrelated.

19

Partial Slopes: Women’s Apparel

Competitors has a direct negative effect on Sales. Income has a positive effect on Sales.

Competitors and Income are positively correlated.

[Figure: path diagram linking Competitors, Income, and Sales; Income → Sales (+), Competitors ↔ Income (+).]

20

Marginal Slope: Women’s Apparel

The direct effect of Competitors on Sales is negative (–). The indirect effect (via Income) is positive (+ × +).

The marginal slope of Competitors in the simple regression is now the sum of these two effects.

[Figure: path diagram; marginal slope of Competitors on Sales = direct effect (−) plus indirect effect via Income (+ × +).]

21

Partial vs. Marginal Slopes

The MRM separates the individual effects of all explanatory variables (into the partial slopes). Indirect effects (resulting from correlation among explanatory variables) are not present.

The SRM does not separate individual effects and so indirect effects are present. The marginal slope of the (single) explanatory variable reflects both the direct effect of this variable as well as the indirect effect(s) due to missing explanatory variable(s).

22

Apparel Sales: Multiple Regression

Estimated Sales = 60.359 + 7.966 Income – 24.165 Competitors

Regression: Sales ($/sq ft)
                               constant      Income ($000)   Competitors
coefficient                    60.3586702    7.965979876     -24.16503223
std error of coef              49.290165     0.838249629     6.38991396
t-ratio                        1.2246        9.5031          -3.7817
p-value                        22.5374%      0.0000%         0.0353%
beta-weight                                  0.8727          -0.3473

standard error of regression   68.03062709
R-squared                      59.47%
adjusted R-squared             58.17%
number of observations         65
residual degrees of freedom    62
t-statistic for computing 95%-confidence intervals   1.9990

23

Inference in Multiple Regression

Hypothesis test of statistical significance for β1: The t-ratio of 9.5031 with a p-value of less than 0.0001% indicates that the partial slope of Income is significantly different from zero.

Hypothesis test of statistical significance for β2: The t-statistic of -3.7817 with a p-value of 0.0353% indicates that the partial slope of Competitors is significantly different from zero.

24

Inference in Multiple Regression

Both explanatory variables, Income and Competitors, have a statistically significant effect on the response, Sales.

Hypothesis test of statistical significance for β0: The t-statistic of 1.2246 with a p-value of 22.5374% indicates that the constant coefficient is not significantly different from zero.

(continued)

25

Prediction with a Multiple Regression

Prediction, using most-recent regression
                               constant    Income ($000)   Competitors
coefficients                   60.35867    7.965979876     -24.16503223
values for prediction                      50              3

predicted value of Sales ($/sq ft)    386.1626
standard error of prediction          69.9607
standard error of regression          68.0306
standard error of estimated mean      16.3198

confidence level 95.00%
t-statistic 1.9990
residual degr. freedom 62

confidence limits for prediction:      lower 246.3131, upper 526.0120
confidence limits for estimated mean:  lower 353.5398, upper 418.7853

26

Prediction with a Multiple Regression

The 95% prediction interval for annual sales per square foot at a location with median household income of $50,000 and 3 competitors is [$246.31, $526.01].

The 95% confidence interval for average annual sales per square foot at locations with median household income of $50,000 and 3 competitors is [$353.54, $418.79].

(continued)

27

Application: Subprime Mortgages

Motivation

A banking regulator would like to verify how lenders use credit scores to determine the interest rate paid by subprime borrowers. The regulator would like to separate the effect of the credit score from other variables such as the loan-to-value (LTV) ratio, the income of the borrower, and the value of the home.

28

Subprime Mortgages

Method

Use multiple regression on data obtained for 372 mortgages from a credit bureau. The explanatory variables are the LTV, credit score (FICO), income of the borrower, and home value. The response is the annual percentage rate of interest on the loan (APR).

29

Subprime Mortgages

Mechanics

Run regression

Check residual plot

(continued)

30

Subprime Mortgages: Regression

Regression: APR
                               constant      LTV           FICO          Stated Income ($000)   Home Value ($000)
coefficient                    23.7253652    -1.588843     -0.0184318    0.000403212            -0.000752082
std error of coef              0.6859028     0.51971233    0.00135016    0.003326563            0.000818648
t-ratio                        34.5900       -3.0572       -13.6515      0.1212                 -0.9187
p-value                        0.0000%       0.2398%       0.0000%       90.3591%               35.8862%
beta-weight                                  -0.1339       -0.6008       0.0047                 -0.0362

standard error of regression   1.24383566
R-squared                      46.31%
adjusted R-squared             45.73%
number of observations         372
residual degrees of freedom    367
t-statistic for computing 95%-confidence intervals   1.9664

31

Subprime Mortgages: Residual Plot

Mechanics

Check the residual plot: no pattern.

[Figure: residual plot, residuals vs. predicted values of APR.]

32

Subprime Mortgages: Regression

Mechanics

The linear regression equation is

Estimated APR = 23.725 − 1.5888 LTV − 0.01843 FICO + 0.0004032 Stated Income − 0.000752 Home Value

The first two variables, LTV and Credit Score (FICO) have low p-values. The remaining two variables, Stated Income and Home Value, have high p-values.

33

Conclusion

Message

Regression analysis shows that the credit score (FICO) of the borrower and the loan LTV affect interest rates in the market. Neither the income of the borrower nor the home value improves a model with these two variables.

34

Dropping Variables

Since the variables Stated Income and Home Value have no statistically significant effect on the response variable APR, we may decide to drop them from the regression.

We run a new regression with only two explanatory variables, LTV and Credit Score (FICO).

35

New Regression

Estimated APR = 23.691 – 1.5773 LTV – 0.018566 FICO

Regression: APR
                               constant      LTV           FICO
coefficient                    23.6913824    -1.5773413    -0.0185656
std error of coef              0.64984629    0.51842379    0.00134003
t-ratio                        36.4569       -3.0426       -13.8546
p-value                        0.0000%       0.2514%       0.0000%
beta-weight                                  -0.1329       -0.6051

standard error of regression   1.24189462
R-squared                      46.19%
adjusted R-squared             45.90%
number of observations         372
residual degrees of freedom    369
t-statistic for computing 95%-confidence intervals   1.9664

36

Removing Variables

Multiple regressions may often indicate that some of the explanatory variables are not statistically significant.

Depending on the context of the analysis, we may decide to remove insignificant variables from the regression.

If we remove such variables then we should do so one at a time to make sure that we don’t omit a useful variable.

37

Best Practices

Know the business context of your model.

Distinguish marginal from partial slopes.

Check the assumptions of the model before interpreting the output.

38

Pitfalls

Don’t confuse a multiple regression with several simple regressions.

Don’t believe that you have all of the important variables. Do not think that you have found causal effects.

Do not interpret an insignificant t-ratio to mean that an explanatory variable has no effect.

Don’t think that the order of the explanatory variables in a regression matters.

Don’t remove several explanatory variables from your model at once.

7 – Dummy Variables

Managerial Statistics

KH 19

Course material adapted from Chapter 25 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

2

Learning Objectives

Incorporate qualitative variables into regression models by using dummy variables

Interpret the effect of a dummy variable on the regression equation

Analyze interaction effects by introducing slope dummy variables

Apply and interpret regression models with slope dummy variables

3

Dummy Variable

A Dummy Variable is a variable that only takes values 0 or 1. It usually expresses a qualitative difference; e.g., whether the observation is for a man or a woman, or from customer A or B, etc.

For example, we can define a dummy variable Group as follows:

Group = 0, if the data point is for a woman
Group = 1, if the data point is for a man

4

Gender and Salaries

Motivation: How can we examine the impact of the variables ‘years of experience’ and ‘gender (male/female)’ on average salaries of managers?

Method: Represent the categorical variable gender by a dummy variable. Then run a regression with the response variable Salary and two explanatory variables, years of experience and the new dummy variable.

5

Regression with a Dummy

Estimated Salary = 133.47 + 0.8537 Years + 1.024 Group

Regression: Salary ($000)
                               constant      Years of Experience   Group
coefficient                    133.467579    0.853708343           1.024190096
std error of coef              2.13151142    0.192481379           2.057626623
t-ratio                        62.6164       4.4353                0.4978
p-value                        0.0000%       0.0016%               61.9298%
beta-weight                                  0.3449                0.0387

standard error of regression   11.77881458
R-squared                      13.11%
adjusted R-squared             12.09%
number of observations         174
residual degrees of freedom    171
t-statistic for computing 95%-confidence intervals   1.9739

6

Substituting Values for the Dummy

Estimated Salary = 133.47 + 0.8537 Years + 1.024 Group

Equation for women (Group = 0):
Estimated Salary = 133.47 + 0.8537 Years

Equation for men (Group = 1):
Estimated Salary = 134.49 + 0.8537 Years

7

Effect of the Dummy Coefficient

After substituting the two values 0 and 1 for the dummy variable, we obtain two regression equations.

The equation for Group = 0 yields a relationship between Salary and Years for women.

The equation for Group = 1 yields a relationship between Salary and Years for men.

The two lines have different intercepts but identical slopes.

The coefficient of the dummy variable, bGroup = 1.024, determines the difference between the intercepts of the two regression lines.

8

In General Terms

Regression with two variables, x1 and dum:

ŷ = b0 + b1x1 + b2 dum

Substituting values for the dummy:

dum = 0: ŷ = b0 + b1x1

dum = 1: ŷ = b0 + b1x1 + b2 = (b0 + b2) + b1x1

The two lines have different intercepts but the same slope.
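A sketch of such a dummy-variable regression in Python with statsmodels, assuming a hypothetical file salaries.csv with columns Salary, Years, and Gender coded 'man'/'woman' (all names are ours, not the course's data):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical salaries.csv with columns Salary, Years, Gender ('man'/'woman').
df = pd.read_csv("salaries.csv")
df["Group"] = (df["Gender"] == "man").astype(int)  # dummy: 1 = man, 0 = woman
fit = smf.ols("Salary ~ Years + Group", data=df).fit()
print(fit.params)  # the Group coefficient is the intercept shift b2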

9

Illustration

[Figure: two parallel lines with common slope b1 and intercepts b0 and b0 + b2.]

If H0: β2 = 0 is rejected, then the dummy variable dum has a significant effect on the response y.

10

Dummy: Gender and Salaries

The coefficient of the dummy variable Group, bGroup, can be interpreted as the difference in starting salaries between men and women.

The coefficient is bGroup = 1.024. So, on average, men have higher starting salaries than women.

The p-value of this coefficient is 61.9298%. Therefore, the difference in starting salaries appears to be statistically insignificant.

11

Possible Interaction Effect

There is no significant difference between starting salaries of men and women. But, perhaps, a significant difference arises during the time of employment. Put differently, one group of employees may see larger pay increases than the other one.

Such an effect is called an Interaction Effect. The variables Group and Years interact in their respective effects on the response variable Salary.

12

Slope Dummy Variable

How can we detect the presence of such an interaction effect?

We need to include an Interaction (Variable), also called a Slope Dummy Variable.

This new variable is the product of an explanatory variable and a dummy variable.

13

In General Terms

Regression with the variables x1, dum, and x1×dum:

ŷ = b0 + b1x1 + b2 dum + b3 (x1×dum)

Substituting values for the dummy:

dum = 0: ŷ = b0 + b1x1

dum = 1: ŷ = b0 + b1x1 + b2 + b3x1 = (b0 + b2) + (b1 + b3)x1

The two lines have different intercepts and different slopes.
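Continuing the hypothetical salary sketch, the interaction can be added with a product term; in statsmodels formula syntax, Years:Group is the product Years × Group, whose coefficient estimates b3:

import pandas as pd
import statsmodels.formula.api as smf

# Same hypothetical data as before; Years:Group is the product Years * Group.
df = pd.read_csv("salaries.csv")
df["Group"] = (df["Gender"] == "man").astype(int)
fit = smf.ols("Salary ~ Years + Group + Years:Group", data=df).fit()
print(fit.params)  # coefficient on Years:Group estimates the slope difference b3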

14

Illustration

[Figure: two lines, one with intercept b0 and slope b1, the other with intercept b0 + b2 and slope b1 + b3.]

If H0: β2 = 0 is rejected, then the dummy variable dum has a significant effect on the response y.

If H0: β3 = 0 is rejected, then the slope dummy variable x1×dum has a significant effect on the response y.

15

Dummy and Slope Dummy

Regression: Salary ($000)
                               constant      Years of Experience   Group         Group x Years
coefficient                    130.988793    1.175983272           4.61128123    -0.41492239
std error of coef              3.49019381    0.407570912           4.497011759   0.462459128
t-ratio                        37.5305       2.8853                1.0254        -0.8972
p-value                        0.0000%       0.4417%               30.6627%      37.0876%
beta-weight                                  0.4751                0.1743        -0.2314

standard error of regression   11.78553688
R-squared                      13.52%
adjusted R-squared             11.99%
number of observations         174
residual degrees of freedom    170
t-statistic for computing 95%-confidence intervals   1.9740

16

Substituting Values for the Dummy

Estimated Salary = 130.99 + 1.176 Years + 4.611 Group – 0.4149 Group×Years

Equation for women (Group = 0):
Estimated Salary = 130.99 + 1.176 Years

Equation for men (Group = 1):
Estimated Salary = 135.60 + 0.7611 Years

17

Significance

Question: Is there a statistically significant difference between salaries paid to women and salaries paid to men?

Answer: The differences in salaries are statistically insignificant. The p-values of the dummy variable Group and the slope dummy variable Group×Years both exceed 30%.

18

Principle of Marginality

Principle of Marginality: if the slope dummy is statistically significant, retain it as well as both of its components regardless of their level of significance.

If the interaction is not statistically significant, remove it from the regression and re-estimate the equation. A model without an interaction term is simpler to interpret since the lines fit to the groups are parallel.

19

Prediction with Slope Dummy

Predictions, using most-recent regression
                               coefficients   values for prediction
constant                       130.98879
Years of Experience            1.1759833      10         10
Group                          4.6112812      0          1
Group x Years                  -0.4149224     0          10

predicted value of Salary ($000)      142.7486   143.2107
standard error of prediction          11.92218   11.84443
standard error of regression          11.78554   11.78554
standard error of estimated mean      1.799847   1.179728

confidence level 95.00%
t-statistic 1.9740
residual degr. freedom 170

confidence limits for prediction:      lower 119.214 / 119.8296, upper 166.2832 / 166.5918
confidence limits for estimated mean:  lower 139.1957 / 140.8819, upper 146.3016 / 145.5395

20

Best Practices

Be thorough in your search for confounding variables.

Consider interactions.

Choose an appropriate baseline group.

Write out the fits for separate groups.

Be careful interpreting the coefficient of the dummy variable.

(Check for comparable variances in the groups.)

(Use color-coding or different plot symbols to identify subsets of observations in plots.)

21

Pitfalls

Don’t think that you have adjusted for all of the confounding factors.

Don’t confuse the different types of slopes.

Don’t forget to check the conditions of the MRM.

©2014 by the Kellogg School of Management at Northwestern University. This case was developed with support from the December 2009 graduates of the Executive MBA Program (EMP-76). This case was prepared by Professor Karl Schmedders with the assistance of Charlotte Snyder and Sophie Tinz. Cases are developed solely as the basis for class discussion. Cases are not intended to serve as endorsements, sources of primary data, or illustrations of effective or ineffective management. To order copies or request permission to reproduce materials, call 800-545-7685 (or 617-783-7600 outside the United States or Canada) or e-mail [email protected]. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the permission of Kellogg Case Publishing.

REVISED MARCH 19, 2014

KARL SCHMEDDERS KEL754

Germany's Bundesliga: Does Money Score Goals?

Some people believe football is a matter of life and death; I am very disappointed with that attitude. I can assure you it is much, much more important than that.

– William "Bill" Shankly (1913–1981), Scottish footballer and legendary Liverpool manager

"Tor! [Goal!]" yelled the jubilant announcer as 22-year-old midfielder Toni Kroos of FC Bayern München fired a blistering shot past Borussia Dortmund's goalkeeper. After sixty-six minutes of scoreless football ("soccer" in the United States) on December 1, 2012, Bayern had pulled ahead of the reigning German champion and Cup winner.

A sigh escaped Franz Dully, a financial analyst who covered football clubs belonging to the Union of European Football Associations (UEFA). He was disappointed for two reasons: Not only had a bout with the flu kept him home, but as a staunch Dortmund fan he had a decidedly nonprofessional interest in the outcome. The day's showdown between Germany's top professional teams and archrivals would possibly be the deciding match for the remainder of the season; with only three more matches before the mid-season break, FC Bayern had already obtained the coveted title of Herbstmeister (winter champion).

History had shown that the league leader at the break often went on to win the coveted German Bundesliga Championship title. It was no guarantee, however, as Dortmund had demonstrated last season when the club had overcome Bayern's mid-season lead to take the title in May. This year Bayern, the league's traditional frontrunner, was determined to reclaim its glory (and trophy).

As the station cut to the delighted Bayern fans in the stands, the phone rang. Dully knew exactly who would be on the other end of the line.

"Tough break, comrade! Wish you were here!" yelled his friend Max Vogel. Dully could barely hear him over the Bayern fans celebrating at Allianz Arena.

"Let's skip the schadenfreude, shall we? It's most unbecoming."


"Who, me?" Vogel asked. "Surely you jest. I would never take pleasure in my childhood friend's suffering. But disappointment is inevitable when you root for the underdog."

"That underdog, as you call it, has taken the title for the last two years and we're going for three in a row."

Vogel was undeterred. "Fortunately, I had the foresight to move to Munich, city of champions. Remember the old saying: Money scores goals. And Bayern has the most."

"Money is no guarantee of success," Dully countered.

"Really?" his friend shot back. "Haven't billionaires from Russia, America, and Abu Dhabi bought the last three English Premier League titles for Chelsea, Manchester United, and Manchester City?"

"Well, money certainly helps," Dully conceded. "But you're using British examples, and German football is altogether different. To quote our mutual patron saint Sepp Herberger: 'The ball is round' and football is anything but predictable. This match isn't over until the whistle blows, and that's true for the season, too."

"Well, you're the numbers wizard. If anyone can calculate whether money offers an advantage, it's you. Your readers might find it interesting if you managed to prove what football fans think they already know."

"I'll see," said Dully, without enthusiasm.

"I'll drink a beer for you in the meantime! Feel better! Tschüss!"

Dully grunted and put the phone down, but his friend's offhand remark stuck with him. With one eye on the game, he leaned over the side of his chair and felt around for his laptop. He dreaded Vogel's gloating if Bayern held onto its lead to win the match; perhaps he could quiet him down if he met his friend's challenge to show that money correlated with winning football matches as surely as a talented striker.

The Bundesliga

Football was widely recognized as one of Germany's top pastimes. Since the German Football Association (DFB) was founded in 1900, it had grown to encompass nearly 27,000 clubs and 6.3 million people around the country.1 Initially the game was played only at an amateur level, although semi-professional teams emerged after World War II.

Professional football in Germany appeared later than in many of its international counterparts. The country's top professional league, known as the Bundesliga, was formed on July 28, 1962, after Yugoslavia stunned the German national team with a quarter-final World Cup defeat. Sixteen clubs initially were granted admission to the new league based on athletic performance, economics, and infrastructural criteria. Enthusiasm developed quickly, and 327,000 people watched Germany's first professional football matches on August 24, 1963.2

1 Deutscher Fussball-Bund, "History," http://www.dfb.de/index.php?id=311002 (accessed January 4, 2013).

The Bundesliga was organized in two divisions, the 1 and 2 Bundesliga, with the former drawing far more fan attention than the latter. In 2001 the German Football League was formed to oversee all regular-season and playoff matches, licensing, and operations for both divisions. As of 2012, eighteen teams competed in each division.

The season ran from August to May, with most games played on weekends. Each team played every other team twice, once at home and once away. The winner of each match earned three points, the loser received no points, and a draw earned one point for each team. At the end of the season, the top team from the 1 Bundesliga was awarded the "Deutsche Meisterschaft" (German Championship, the Bundesliga title). (The fans jokingly referred to the cup given to the champion as the "Salad Bowl.") In 2012 the top three teams of the 1 Bundesliga qualified for the prestigious European club championship known as the Champions League, and the fourth-place team was given the opportunity to compete in a playoff round for a Champions League spot. Within the league, the bottom two teams from the 1 Bundesliga were relegated to the 2 Bundesliga and the top two teams from the 2 Bundesliga were promoted. The team that came in third from the bottom in the 1 Bundesliga played the third-place team of the 2 Bundesliga for the final spot in the top league for the following season.

Based on the number of spectators, German football was the most popular sport in the world after the U.S. National Football League: it had higher attendance per game than Major League Baseball, the National Basketball Association, and the National Hockey League in the United States. More people attended football games in Germany than in any other country (see Exhibit 1). From a performance perspective, the UEFA ranked the Bundesliga as the third best league in Europe after Spain and England.3 Germany had also distinguished itself as one of the two most successful participants in World Cup history.4

* * *

Dully roared with glee a few minutes later as Dortmund midfielder Mario Götze evened the score with a shot that sliced through a pack of players before finding the bottom corner of the Bayern goal.

This is the magic of German football, he reflected. The neck-and-neck races between the top few teams, the surprises, the upsets, the legends like Franz Beckenbauer and Lothar Matthäus. And of course, there were the magical moments, perhaps none more so than that rainy 1954 day when Germany's David defeated the Hungarian Goliath and stunned the world by winning the World Cup in what came to be called the Miracle of Berne.

"Call me mad, call me crazy!"5 the announcer had shrieked over the airwaves when Helmut Rahn nudged the ball past Hungarian goalkeeper Gyuli Grosics and gave Germany the lead over the Hungarians, a team that had gone unbeaten for thirty-one straight games in the preceding four years and was considered the undisputed superpower of world football.6 Minutes later, the Germans raised the Jules Rimet World Cup trophy high for the first time.

2 Silvio Vella, "The Birth of Professional Football in Germany," Malta Independent, July 28, 2012.
3 UEFA Rankings, http://www.uefa.com/memberassociations/uefarankings/country/index.html (accessed January 4, 2013).
4 FIFA, "All-Time FIFA World Cup Ranking 1930–2010," http://www.fifa.com/aboutfifa/officialdocuments/doclists/matches.html (accessed January 4, 2013).
5 Ulrich Hesse-Lichtenberger, Tor!: The Story of German Football (London: WSC Ltd, 2003), 126.

Bundesliga Finances: The Envy of International Football

Most European football clubs wrestled with finances: In the 2010–2011 season, the twenty clubs in the English Premier League showed £2.4 billion in debt,7 a figure surpassed by the twenty Spanish La Liga clubs, which hit €3.53 billion (£2.9 billion).8 In contrast, the thirty-six Bundesliga clubs showed a net profit of €52.5 million in 2010–2011. The Bundesliga had the distinction of being the most profitable football league in the world.

In 2010–2011 the Bundesliga had revenues of €2.29 billion, more than half of which came from advertising and media management (see Exhibit 2).9 Television was one of the largest sources of income. This money was split between the football clubs according to their performance during the season.

Secrets of the Bundesliga's success included club ownership policies, strict licensing rules, and low ticket costs. With a few notable exceptions, German football clubs were large membership associations with the same majority owner: their members. League regulations dictated a 50+1 rule, which meant that club members had to maintain control of 51 percent of shares. This left room for private investment without risking instability as a result of individual entrepreneurs with deep pockets taking over teams and jeopardizing long-term financial stability for short-term success on the field.

Bundesliga licensing procedures mandated that clubs had to open their books to league accountants and not spend more than they made in order to avoid fines and be granted a license to play the following year. Among a host of other stipulations, precise rules established liquidity and debt requirements; Teutonic efficiency had little patience for inflated transfer fees and spiraling wages that could send clubs into financial ruin.

Football player salaries were the highest of any sport in the world. A 2012 ESPN survey revealed that seven of the ten highest-paying sports teams were football clubs, with U.S. major league baseball and basketball clubs rounding out the set. FC Barcelona’s players led the world’s professional athletes with an average salary of $8.68 million, a weekly salary of $166,934. Real Madrid players followed close behind with an average salary of $7.80 million per year.10

While the salaries were impressive, the cost of transferring players between countries and leagues could be even more so. A transfer fee was paid to a club for relinquishing a player (either still under contract or with an expired contract) to an international counterpart, and such transfers were regulated by football’s world governing body, the Fédération Internationale de Football Association (FIFA).

6 FIFA, “1954 World Cup Switzerland,” http://www.fifa.com/worldcup/archive/edition=9/overview.html. 7 Deloitte Annual Review of Football Finance, May 31, 2012. 8 “La Liga Debt Crisis Casts a Shadow Over On-Pitch Domination,” Daily Mail, April 19, 2012. 9 Bundesliga Annual Report 2012, p. 50. 10 Jeff Gold, “Highest-Paying Teams in the World,” ESPN, May 2, 2012.


Historically, transfers were permitted twice a year: for a longer period during the summer between seasons, and for a shorter period during the winter partway through the season. FIFA reported that $3 billion was spent transferring players between teams in 2011 and that a transfer was conducted every 45 minutes.11 Although the average transfer fee was $1.5 million in 2011, clubs often paid top dollar to secure star power. In 2011 thirty-five players transferred at fees exceeding €15 million,12 including Javier Pastore, who transferred from Palermo to Paris Saint-Germain for €42 million.13 The highest transfer fee ever paid was €94 million, by Real Madrid to Manchester United for Cristiano Ronaldo in 2009.

After financial crises in the business world demonstrated that no company was “too big to fail,” and evidence to this effect began mounting in the football world, UEFA approved financial fair play legislation in 2010 requiring teams to live within their means or face elimination from competition. The policies were designed to prevent football teams from crumpling under oppressive debt and to ensure a more stable economic future for the game.14 The legislation was to be phased in over several years, with some key components taking effect in the 2011–2012 season.

Because the Bundesliga already operated under a system that linked expenditure to revenue, wealth was relatively evenly distributed among the clubs, and teams could not vastly outspend one another, as was frequently the case in the Spanish La Liga and the English Premier League. As a result, a greater degree of competitive parity made for exciting matches and a close competition for the Deutsche Meisterschaft.

The league’s reasonable ticket prices made Germany arguably one of the greatest places in the world to be a football fan. A BBC survey revealed that the average price of the cheapest match ticket in the Premier League was £28.30 ($46); season tickets to Dortmund matches, for example, cost only €225 ($14 per game, including three Champions League games) and included free rail travel. In comparison, season tickets to Arsenal matches (the most expensive in the Premier League) cost £1,955 ($3,154) for 2012–2013.15

Germany had some of the biggest and most modern stadiums in the world as the result of €1.4 billion spent by the government on expanding and refurbishing them in preparation for hosting the 2006 World Cup.16 According to the London Times, two German stadiums made the list of the world’s ten best football venues: Signal Iduna Park (formerly known as Westfalenstadion) in Dortmund, ranked number one, and the Allianz Arena in Munich, at number five.

During the 2010–2011 season, more than 17 million people watched Bundesliga football matches live in stadiums, and 1. Bundesliga attendance averaged a record-breaking 42,101 per game.17 The average attendance at Dortmund’s Signal Iduna Park in the first half of the 2012–2013 season was 80,577.18

11 Tom McGowan, “A FIFA First: Football’s Transfer Figures Released,” CNN, March 6, 2012. 12 Mark Chaplin, “Financial Fair Play’s Positive Effects,” UEFA News, August 31, 2012. 13 “PSG Complete Record-Breaking Pastore Transfer,” UEFA News, August 6, 2011. 14 “Financial Fair Play Regulations Are Approved,” UEFA News, May 27, 2010. 15 “Ticket Prices: Arsenal Costliest,” ESPN News, October 18, 2012. 16 “German Football Success: A League Apart,” The Economist, May 16, 2012. 17 Bundesliga Annual Report 2012, p. 56.


In addition, around 18 million people, nearly a quarter of the country, tuned in to the Bundesliga matches on television each weekend.19 No other leisure-time activity consistently generated that level of interest in Germany.

FC Bayern München

In the Bundesliga’s fifty-year history, FC Bayern München had been a perennial powerhouse; the club boasted twenty-one title victories and an aggregate advantage of nearly 500 points in the “eternal league table.”

Conventional wisdom held that clubs with a higher market value were more likely to win championships because they could afford to pay the highest wages and transfer fees to attract the best talent. FC Bayern was the eighth highest-paying sports team in the world, with an average salary of $5.9 million per player, according to ESPN in 2012.20 The highest transfer fee ever paid in the Bundesliga came in the summer of 2012, when Bayern bought midfielder Javi Martinez from the Spanish team Athletic Bilbao for €40 million.21 Bayern’s appearance in the Champions League in eleven of the previous twelve years (including one first-place and two second-place finishes) raised the team to new heights on the international stage and increased its brand value; in 2012 it was the second most valuable football club brand in the world according to Brand Finance, a leading independent brand valuation consultancy (see Table 1).

Table 1: Bundesliga Club Brand Value and Average Player Salary

Club                 Number of Titles   2012 Brand Rank   2012 Brand Value ($ millions)   Average Annual Salary per Player, 2011–2012 Season ($)
FC Bayern München    21                 2                 786                             5,907,652
FC Schalke 04        0                  10                266                             4,187,722
Borussia Dortmund    5                  11                227                             3,122,824
Hamburger SV         3                  17                153                             2,579,904
VfB Stuttgart        3                  28                71                              2,721,154
SV Werder Bremen     4                  30                68                              2,734,924

Source: Brand Finance Football Brands 2012 and Jeff Gold, “Highest-Paying Teams in the World,” ESPN, May 2, 2012.

Bayern was also the only Bundesliga club to appear on the Forbes magazine list of the fifty most valuable sports franchises worldwide. It was one of five football teams that consistently appeared alongside the National Football League teams that dominated the list; from 2010 to 2012, the club’s ranking climbed from 27 to 14. In 2012 the magazine estimated that Bayern had the fourth-highest revenue of any football team in the world and valued the club at $1.23 billion.22

18 “Europe’s Getting to Know Dortmund,” Bundesliga News, December 26, 2012. 19 “Sky Strikes Bundesliga Deal with Deutsche Telekom,” Reuters, January 4, 2013. 20 Gold, “Highest-Paying Teams in the World.” 21 “Javi Martinez Joins Bayern Munich,” ESPN News, August 29, 2012. 22 Kurt Badenhausen, “Manchester United Tops the World’s 50 Most Valuable Sports Teams,” Forbes, July 16, 2012.


Despite Bayern’s privileged position, competition in the league remained strong. All eighteen of the 1. Bundesliga teams ranked among the top 200 highest-paying sports teams in the world, with average salaries above $1.3 million per year for the 2011–2012 season.23 The Bundesliga’s depth kept seasons interesting: since 2000, five different teams had won the title and two more had been Herbstmeister, the mid-season leader (see Exhibit 3).

Seeking Correlation

Dully flipped off the television and went to the kitchen to get some food. The match had ended in a 1–1 draw, leaving the country in suspense over whether Bayern would run away from the pack in the league table or if Dortmund could catch up. The phone rang again.

“Have you proven me right yet?” Vogel asked above the din.

“No,” said Dully. “I’m averse to promoting ‘financial doping.’”

“You always were an idealist,” Vogel observed. “Or a purist or something.”

“I’m the complement to your cynicism.”

“Ah yes, that must be why we get along so well. I’d like to see your analysis, though, when you actually come up with some.”

“Funny you should ask for that,” Dully said. “I’ll get back to you. Maybe.”

After a few more minutes of banter followed by well-intentioned plans for catching up someday soon, the friends hung up. Dully returned to the living room and flopped on the couch.

The analyst wondered about the future of a Bundesliga with one team that was much wealthier than the rest: would it remain competitive and exciting or, as Vogel said, would “money shoot goals” and give rich Bayern the German championship year after year?

Dully returned to the spreadsheet he had started during the match, looking for a statistical correlation between money and Bundesliga success.
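A minimal sketch of the calculation Dully has in mind, translated to Python; the figures below are illustrative placeholders rather than the case data:

    import pandas as pd

    # Illustrative placeholder figures (market value in EUR millions and
    # season points); NOT the actual case data set
    df = pd.DataFrame({
        "market_value_mio": [466, 269, 145, 92, 85, 60],
        "points":           [81, 81, 64, 53, 48, 42],
    })

    # Pearson correlation between club market value and season points
    print(df["market_value_mio"].corr(df["points"]))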

23 Gold, �Highest-Paying Teams in the World.�


Exhibit 1: Comparison of Sporting League Attendance Worldwide, 2010–2011 Season

League                          Average Attendance per Game
U.S. National Football League   66,960
German Bundesliga               42,690
Australian A-League             38,243
British Premier League          35,283
U.S. Major League Baseball      30,066
Spanish La Liga                 29,128
Mexican Liga MX                 27,178
Italian Serie A                 24,031
French Ligue 1                  19,912
Dutch Eredivisie                19,116

Source: ESPN Soccer Zone, WorldFootball.net, and Bundesliga Annual Report 2012, p. 56.

Exhibit 2: Bundesliga Revenue

1. BUNDESLIGA REVENUE

Sector             Revenue (€ in thousands)   % of Revenue
Match earnings     411,164                    21.17
Advertisement      522,699                    26.92
Media management   519,629                    26.76
Transfers          195,498                    10.07
Merchandising      79,326                     4.08
Other              213,665                    11.00
Total              1,941,980                  100

Source: “Bundesliga Report 2012: The Economic State of German Professional Football,” January 23, 2012.

TOTAL REVENUE FOR 1. AND 2. BUNDESLIGA

Sector             Revenue (€ in thousands)   % of Revenue
Match earnings     469,510                    20.41
Advertisement      634,010                    27.57
Media management   629,079                    27.35
Transfers          215,110                    9.35
Merchandising      89,493                     3.89
Other              262,779                    11.43
Total              2,299,980                  100

Source: “Bundesliga Report 2012: The Economic State of German Professional Football,” January 23, 2012.


Exhibit 3: Bundesliga Mid-Season Leaders and Champions

Season       Mid-Season Leader           Champion
2012–2013    FC Bayern München           (season in progress)
2011–2012    FC Bayern München           Borussia Dortmund
2010–2011    Borussia Dortmund           Borussia Dortmund
2009–2010    Bayer 04 Leverkusen         FC Bayern München
2008–2009    1899 Hoffenheim             VfL Wolfsburg
2007–2008    FC Bayern München           FC Bayern München
2006–2007    SV Werder Bremen            VfB Stuttgart
2005–2006    FC Bayern München           FC Bayern München
2004–2005    FC Bayern München           FC Bayern München
2003–2004    SV Werder Bremen            SV Werder Bremen
2002–2003    FC Bayern München           FC Bayern München
2001–2002    Bayer 04 Leverkusen         Borussia Dortmund
2000–2001    FC Schalke 04               FC Bayern München
1999–2000    FC Bayern München           FC Bayern München
1998–1999    FC Bayern München           FC Bayern München
1997–1998    1.FC Kaiserslautern         1.FC Kaiserslautern
1996–1997    FC Bayern München           FC Bayern München
1995–1996    Borussia Dortmund           Borussia Dortmund
1994–1995    Borussia Dortmund           Borussia Dortmund
1993–1994    Eintracht Frankfurt         FC Bayern München
1992–1993    FC Bayern München           SV Werder Bremen
1991–1992    Eintracht Frankfurt         VfB Stuttgart
1990–1991    SV Werder Bremen            1.FC Kaiserslautern
1989–1990    FC Bayern München           FC Bayern München
1988–1989    FC Bayern München           FC Bayern München
1987–1988    SV Werder Bremen            SV Werder Bremen
1986–1987    Hamburger SV                FC Bayern München
1985–1986    SV Werder Bremen            FC Bayern München
1984–1985    FC Bayern München           FC Bayern München
1983–1984    VfB Stuttgart               VfB Stuttgart
1982–1983    Hamburger SV                Hamburger SV
1981–1982    1.FC Köln                   Hamburger SV
1980–1981    Hamburger SV                FC Bayern München
1979–1980    FC Bayern München           FC Bayern München
1978–1979    1.FC Kaiserslautern         Hamburger SV
1977–1978    1.FC Köln                   1.FC Köln
1976–1977    Borussia Mönchengladbach    Borussia Mönchengladbach
1975–1976    Borussia Mönchengladbach    Borussia Mönchengladbach
1974–1975    Borussia Mönchengladbach    Borussia Mönchengladbach
1973–1974    FC Bayern München           FC Bayern München
1972–1973    FC Bayern München           FC Bayern München
1971–1972    FC Schalke 04               FC Bayern München
1970–1971    FC Bayern München           Borussia Mönchengladbach
1969–1970    Borussia Mönchengladbach    Borussia Mönchengladbach
1968–1969    FC Bayern München           FC Bayern München
1967–1968    1.FC Nürnberg               1.FC Nürnberg
1966–1967    Eintracht Braunschweig      Eintracht Braunschweig
1965–1966    TSV 1860 München            TSV 1860 München
1964–1965    SV Werder Bremen            SV Werder Bremen
1963–1964    1.FC Köln                   1.FC Köln

Source: Bundesliga, “History Stats,” http://www.bundesliga.com/en/stats/history (accessed January 4, 2013).


Questions

PART I

1. What were the smallest, average, and largest market values of football teams in the Bundesliga in the 2011–2012 season?

2. Develop a regression model that predicts the number of points a team earns in a season based on its market value. Write down the estimated regression equation.

3. Are the regression coefficients statistically significant? Explain.

4. Carefully interpret the slope coefficient in your regression in the context of the case.

5. Conventional wisdom among football traditionalists states that the aggregate number of points at the end of a Bundesliga season closely correlates with the market value of a club. Simply put, “money scores goals,” which in turn lead to wins and points. Comment on this wisdom in light of your regression equation.

6. Some of the (estimated) market values at the beginning of the 2012–2013 season were as follows:

SC Freiburg €46,650,000

1.FSV Mainz 05 €46,000,000

Eintracht Frankfurt €49,400,000

Provide a point estimate for the difference between the number of points Eintracht Frankfurt and 1.FSV Mainz 05 will earn in the 2012–2013 season.

7. Provide a point estimate and a 95% interval for the number of points SC Freiburg will earn in the 2012–2013 season.
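In class the regressions are run in KStat; for readers who prefer to cross-check in code, here is a hedged sketch of questions 2 and 7 in Python. The file name is hypothetical, the points column name is an assumption, and the market-value column follows the name quoted in question 10:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical export of the case data set to CSV
    df = pd.read_csv("bundesliga_2011_2012.csv")

    # Question 2: regress season points on market value (EUR millions)
    model = smf.ols("Points ~ Marketvalue_2011_Mio", data=df).fit()
    print(model.summary())  # coefficients, t-ratios, p-values (question 3)

    # Question 7: point estimate and 95% prediction interval for SC Freiburg
    freiburg = pd.DataFrame({"Marketvalue_2011_Mio": [46.65]})
    pred = model.get_prediction(freiburg)
    print(pred.summary_frame(alpha=0.05))  # obs_ci_* columns give the interval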

PART II

The first half of a Bundesliga season ends in mid-December. After a break for the holiday season and potentially bad winter weather (which could lead to the cancellation of games), the league resumes play in late January.

8. Develop a regression model that predicts the number of points a team earns at the end of a season based on its market value and the number of points it earned during the first half of the season. Write down the estimated regression equation.

9. Carefully interpret the two slope coefficients in your regression in the context of the case.

10. Compare your regression equation to the simple linear regression you obtained in Part I. How did the coefficient of the variable Marketvalue_2011_Mio (€ in millions) change? Provide an explanation for the difference.


11. Drop all insignificant variables (use α = 0.05). Write down the final regression equation.

12. At the beginning of the 2012–2013 season, the market value of Borussia Mönchengladbach was estimated to be €88,350,000; the market value of 1.FC Nürnberg was estimated at €41,500,000. During the first half of the 2012–2013 season, Borussia Mönchengladbach earned 25 points and 1.FC Nürnberg, 20 points.

Provide a point estimate and an 80% interval for the number of points Borussia Mönchengladbach will earn in the 2012–2013 season.

13. Provide a point estimate for the difference between the number of points Borussia Mönchengladbach and 1.FC Nürnberg will earn in the 2012–2013 season.

14. An intuitive claim may be that, on average, a team earns twice as many points in an entire season as it earns in the first half of the season. Put differently, on average, the total number of a team’s points should just be two times the number of points at mid-season. Can you reject this claim based on your regression model (at a significance level of α = 0.05)?
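Note that question 14 tests a null value other than zero, so the t-ratio printed in standard regression output cannot be read off directly. A hedged sketch of the usual computation, continuing the Python sketch above and assuming the multiple regression from question 8 has been fit as model with the half-season points variable named Points_Mid (a hypothetical name):

    import scipy.stats as st

    # H0: the coefficient on mid-season points equals 2; H1: it does not
    b = model.params["Points_Mid"]   # estimated coefficient
    se = model.bse["Points_Mid"]     # its standard error
    t_stat = (b - 2) / se
    p_value = 2 * st.t.sf(abs(t_stat), model.df_resid)
    print(t_stat, p_value)           # reject H0 at the 5% level if p < 0.05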

©2015 by the Kellogg School of Management at Northwestern University. This case was prepared by Markus Schulze (Kellogg-WHU ’16) under the supervision of Professor Karl Schmedders. It is based on Markus Schulze’s EMBA master’s thesis. Cases are developed solely as the basis for class discussion. Cases are not intended to serve as endorsements, sources of primary data, or illustrations of effective or ineffective management. To order copies or request permission to reproduce materials, call 847.491.5400 or e-mail [email protected]. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means—electronic, mechanical, photocopying, recording, or otherwise—without the permission of Kellogg Case Publishing.

KARL SCHMEDDERS AND MARKUS SCHULZE 5-215-250 

Solid as Steel: Production Planning at ThyssenKrupp

On Monday, March 31, 2014, production manager Markus Schulze received a call from Reinhardt Täger, senior vice president of ThyssenKrupp Steel Europe’s production operations in Bochum, Germany. Täger was preparing to meet with the company’s chief operating officer and was eager to learn the reasons why the current figures of one of Bochum’s main production lines were far behind schedule. Schulze explained that the line had had three major breakdowns in early March and therefore would miss the planned utilization rate for that month. Consequently, the scheduled production volume could not be carried out. Schulze knew that a lack of production capacity utilization would lead to unfulfilled orders at the end of the planning period. In a rough steel market with fierce competition, however, delivery performance was an important differentiation factor for ThyssenKrupp.

Täger wanted a chance to review the historic data, so he and Schulze agreed to meet later that week to continue their discussion.

After looking over the production figures from the past ten years, Täger was shocked. When he met with Schulze later that week, he expressed his frustration. “Look at the historic data!” Täger said. “All but one of the annual deviations from planned production are negative. We never achieved the production volumes we promised in the planning meetings. We need to change that!”

“I agree,” Schulze replied. “Our capacity planning is based on forecast figures that are not met in reality, which means we can’t fulfill all customers’ orders in time. And the product cost calculations are affected, too.”

“You’re right,” Täger said. “We need appropriate planning figures to meet the agreed delivery time in the contracts with our customers. What do you think would be necessary for that?”

“Hm, I guess we need a broad analysis of the data to identify the root causes,” Schulze answered. “It’ll take some time to build queries for the databases and aggregate the data. And—”

“Stop!” Täger interrupted him. “We need data for the next planning period. The planning meeting for May is in two weeks.”


ThyssenKrupp Steel Europe

ThyssenKrupp Steel Europe, a major European steel company, was formed in a 1999 merger between historic German steel makers Thyssen and Krupp, both of which had been founded in the nineteenth century. ThyssenKrupp Steel Europe annually produced up to 12 million metric tons of steel with its 26,000 employees. In fiscal year 2013–2014, the company accounted for €9 billion of sales, roughly a quarter of the group sales of its parent company, ThyssenKrupp AG, which traded on the DAX 30 (an index of the top thirty blue-chip German companies). Its main drivers of success were customer orientation and reliability in terms of product quality and delivery time.

Bochum Production Lines

The production lines at ThyssenKrupp Steel’s Bochum site were supplied with interim products delivered from the steel mills in Duisburg, 40 kilometers west of Bochum. Usually, slabs1 were brought to Bochum by train and then processed in the hot rolling mill (see Figure 1). The outcome of this production step was coiled hot strip2 (see Figure 2) with mill scale3 on its surface. Whether the steel would undergo further processing in the cold rolling mill or would be sold directly as “pickled hot strip,” the mill scale needed to be removed from the surface.

The production line in which Täger and Schulze were interested, a so-called push pickling line (PPL), was designed to remove mill scale from the upstream hot rolling process. To remove the scale, the hot strip was uncoiled in the line and the head of the strip was pushed through the line. The processing part of the line held pickling containers filled with hot hydrochloric acid, which removed the scale from the surface. Following this pickling, the strip was pushed through a rinsing section to remove any residual acid from the surface. After oiling for corrosion protection, the strip was coiled again. The product of this step, pickled hot strip, could be sold to B2B customers, mainly in the automotive industry.

Other types of pickling lines were operated as continuous lines, in which the head of a new strip was welded to the tail of the one that preceded it. The differentiating factor of a PPL was its batching process, which involved pushing in each strip individually. Production downtimes due to push-in problems did not occur at continuous lines, but with PPLs this remained a concern.

1 Slabs are solid blocks of steel formed in a continuous casting process and then cut into lengths of about 20 meters. 2 A coiled hot strip is an intermediate product in steel production. Slabs are rolled at temperatures above 1,000°C. As they thin out they become longer; the result is a flat strip that needs to be coiled. 3 Mill scale is an iron oxide layer on the hot strip’s surface that is created just after hot rolling, when the steel is exposed to air (which contains oxygen). Mill scale protects the steel to a certain extent, but it is unwanted in further processes such as stamping or cold rolling.

Figure 1. Source: ThyssenKrupp AG, http://www.thyssenkrupp.com/en/presse/bilder.html&photo_id=898.

Figure 2. Source: ThyssenKrupp AG, http://www.thyssenkrupp.com/en/presse/bilder.html&photo_id=891.


Nevertheless, ThyssenKrupp chose to build a PPL in 2000 because increasing demand for high-strength steel made it profitable to invest in such a production line. At that time, high-strength steel grades could not be welded to one another with existing machines, and strips with a thickness of more than 7.0 millimeters could not be processed in continuous lines.

The material produced on the PPL was not simply a commodity called steel. Rather, it was a portfolio of different steel grades—that is, different metallurgical compositions with specific mechanical properties. (For purposes of this case, the top five steel grades in terms of annual production volume have been randomly assigned numbers from 1 to 5.) Within these top five grades were two high-strength steel grades. These high-strength grades were rapidly cooled after the hot rolling process—from around 1,000°C down to below 100°C. Removing the mill scale generated during this rapid cooling process required a different process speed in the pickling line. Only one of the five grades could be processed without limitations in speed and without expected downtimes.

Performance Indicators

At ThyssenKrupp, managers responsible for production lines needed to report regularly on the performance of the lines and the fulfillment of individual objectives. The output, or throughput, of the production lines had always been an important metric. Even amid overcapacities and customers’ increasing demands concerning product quality, line throughput remained part of the set of key performance indicators, which were used for internal benchmarking against comparable production lines at other sites. The line-specific variable production cost was calculated as cost over throughput and was expressed in euros per metric ton. Capacity planning was based on these figures and ultimately determined delivery time performance. In the steel industry, production reports contained performance indicators at different levels of aggregation. A very important metric was throughput (tons4 produced) per time unit5; the performance indicator run time ratio6 (RTR) was the portion of time used for production (run time) relative to the operating time of a production line.

Operating time = Calendar time – (legal holidays, shortages,7 all scheduled maintenance)

Run time = Operating time – (breakdowns, maintenance downtime beyond the scheduled allowance, set-up time)
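A small sketch of this arithmetic in Python, mirroring the example in footnote 6 (the function name and the use of minutes are illustrative choices, not part of the case):

    def run_time_ratio(operating_minutes: float, downtime_minutes: float) -> float:
        """Run time ratio (RTR) in percent: run time over operating time."""
        run_time = operating_minutes - downtime_minutes
        return 100 * run_time / operating_minutes

    # Footnote 6 example: an eight-hour shift (480 min) with 48 min of downtime
    print(run_time_ratio(480, 48))  # 90.0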

Both figures were reported not only on a daily basis (i.e., a 24-hour production period) but also monthly and per fiscal year. Deviations from planned figures were typically noted in automated reports containing database queries, so every plant manager received an overview of past periods.

4 Throughout this case, the term “ton” refers to a metric ton. 5 Tons produced are usually reported by shift (eight hours), by month, and eventually by fiscal year. 6 The metric run time ratio is calculated as run time over operating time (e.g., 8 hours of operating time, or 480 minutes, with 48 minutes of downtime yields a RTR of 90%). 7 Shortages can refer to material shortages, lack of orders, labor disputes, or energy/fuel shortages (external).


Deviation from Planned Throughput

Steel production lines had typical operating characteristics and an average performance calculated from an average production portfolio, mostly determined empirically using historic figures. For planning purposes, this fixed average was usually used to place order volumes on the production lines and in this way “fill capacities.” Each month, actual orders were then booked up to this amount, capped by the line capacity. Each month’s production figures had three possible outcomes.

The first possibility was that the planned throughput would be reached and extra capacity would remain at the end of the month. In this case, the extra capacity would be filled with orders from the next month if the intermediate products were already available for processing. Otherwise, the line would stand idle without fulfilling orders. This mode was very expensive because idle capacity was wasted while fixed costs were incurred regardless.

The second possibility was that the planned throughput would not be reached, meaning that orders would be left unfulfilled at the end of the month. This mode was also very expensive because the planned capacity could not be used and real production costs were higher than pre-calculated. Product cost calculations would then result in prices that were too low, so contribution margins would be much lower than expected, or even negative.

In the third scenario, the planned throughput would be met exactly (a deviation within ±100 tons per month, or ±1,200 tons per year, counted as on target). This was the ideal case, but it had occurred only once in the first ten years of the line’s history (see the annual figures in Table 1).

Table 1: Annual Deviation from Planned Production in the First Ten Years of Line Operation

Year of Operation       Annual Deviation from Planned Production (tons)
1                       -23,254
2                       -22,691
3                       +1,115
4                       -22,774
5                       -2,807
6                       -20,363
7 (financial crisis)    -66,810
8                       -21,081
9                       -4,972
10                      -9,486

Each month, production management had to explain the deviation from planned figures. Many reasonable explanations had been given in the past. Major breakdowns were a common explanation because downtimes directly influenced the RTR. The RTR theory (the lower the run time ratio, the larger the negative deviation from the plan) was often cited as the dominating force behind the PPL not achieving the planned throughput.

The production engineers’ gut feeling was that a straightforward reason would explain patterns that showed peaks “against the RTR theory,” namely the material structure: The resulting


A specific metric of the material structure was the ratio of meters per ton (MPT), a dimension indicator. The MPT theory reflected the fact that material with a low thickness and/or a low width carried a lower weight per meter; in other words, at a constant process speed it took longer to put one ton of such material through the production line. According to the MPT theory, negative deviations in months with average or above-average RTR could be explained by this metric.

Data

Schulze realized he had to compile data carefully in order to have any hope of finding possible explanations for the deviations from planned throughput. He decided to define aggregate clusters for material dimensions such as the width and the thickness of the strips.

The technical data of the Bochum PPL relevant to the data collection were:

Width: 800 to 1,650 mm
Thickness: 1.5 to 12.5 mm
Maximum throughput: 80,000 tons per month

Then Schulze reviewed available past production data, beginning with the night shift on October 1, 2013, up until the early shift on April 4, 2014. Unfortunately, he had to omit a few shifts during this six-month period because of missing or obviously erroneous data. Schulze’s data set accompanies this case in a spreadsheet.

The explanation of the variables in the data set is as follows:

Shift: The day and time at the beginning of a shift.

Shift type: The production line operated 24/7 with three eight-hour shifts; the early shift (“E”) started at 6 a.m., the late (or Midday) shift (“M”) started at 2 p.m., and the night shift (“N”) started at 10 p.m.

Shift number: ThyssenKrupp Steel used a continuous rolling shift system with five different shift groups (shift group 1, shift group 2, etc.). The binary variables indicate whether shift group i worked a particular shift.

Weekday: The line operated Monday through Sunday, but engineers usually worked Monday to Friday on a dayshift basis (usually starting at 7 a.m.).

Throughput: The throughput (in tons) during a shift.

Delta throughput: The deviation (in tons) of actual throughput from planned throughput.


MPT: A dimension indicator (meters per ton).

Thickness clusters: Each cluster represented a certain scope of material thickness in millimeters within the technical feasible range of the production line. Strips fell into one of three clusters. The variables “thickness 1,” “thickness 2,” and “thickness 3” denote the number of strips from the first, second, and third thickness clusters, respectively, that were processed during a shift.

Width clusters: Each cluster represented a certain scope of material width in millimeters within the technically feasible range of the production line. Strips fell into one of three width clusters. The variables “width 1,” “width 2,” and “width 3” denote the number of strips from the first, second, and third width clusters, respectively, that were processed during a shift.

Steel grades: Strips of many different steel grades were processed on the line. The steel grades 1 to 5 are the grades with the largest portion by volume. The variables “grade 1,” “grade 2,” “grade 3,” “grade 4,” and “grade 5” denote the proportion (in %) of steel of that grade that was processed during a given shift. The remaining strips were of other steel grades; their proportion is given by “grade rest.”

RTR: The run time ratio (in %), which is calculated as run time divided by operating time.

Schulze quickly realized he had data on more variables than he could employ for his analysis. Obviously, the total number of strips in the three width clusters had to be the same as the total number of strips in the three thickness clusters. Similarly, the proportions of the six different steel grades always added up to 100%. Schulze also decided to omit the dimension indicator (MPT) for his own analysis, as he now had much more detailed and reliable information about the size of the strips.
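Schulze’s observation is the classic dummy-variable trap: the three width counts and the three thickness counts share the same strip total, and the six grade shares always sum to 100%, so a regression including all of them plus an intercept would have perfectly collinear regressors. A minimal sketch of one remedy, assuming the case spreadsheet has been exported with column names matching the variable descriptions above (the file name and exact labels are assumptions):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("ppl_shifts.csv")  # hypothetical export of the case data

    # Drop one width cluster and the residual grade share so the remaining
    # regressors are no longer perfectly collinear; omit MPT, as Schulze does.
    model = smf.ols(
        "delta_throughput ~ thickness_1 + thickness_2 + thickness_3"
        " + width_1 + width_2"                                # width_3 omitted
        " + grade_1 + grade_2 + grade_3 + grade_4 + grade_5"  # grade_rest omitted
        " + RTR",
        data=df,
    ).fit()
    print(model.summary())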

After the analysis of the aggregated and clustered data, Schulze looked at his prediction model for delta throughput. From his experience, he knew he had found the key drivers for deviations from the planned production volume. “Look at this equation,” he said to the production engineer in charge of the PPL. “The model coefficients determine the outcome, which is the deviation from planning. If we had the forecast figures for May, I could predict the deviation based on this model. Please get the numbers of coils from the different clusters and the proportions of the different steel grades. For the RTR, I’m guessing 86% is an appropriate figure.”


Assignment Questions

PART A: INITIAL ANALYSIS

First, obtain an initial overview of the data. Next, plan to examine the two theories proposed by the production engineers.

Questions:

1. Perform a univariate analysis and answer the following questions:

a. What is the average number of strips per shift?

b. Strips of which thickness cluster are the most common, and strips of which thickness cluster are the least common?

c. What are the minimum, average, and maximum values of delta throughput and RTR?

d. Are there shifts during which the PPL processes strips of only steel grade 1, or of only steel grade 2, etc.?

2. Can the RTR theory adequately explain the deviations from the planned production figures? Explain why or why not.

3. Is the MPT theory sufficient to explain the deviations? Explain why or why not.

PART B: SCHULZE’S MODEL

Now interpret Schulze’s model.

Questions:

4. Develop a sound regression model that can be used to predict delta throughput based on the characteristics of the strips scheduled for production. Include only explanatory variables that have a coefficient with a 10% level of significance.

5. Interpret the coefficient of RTR for the PPL and provide a 90% confidence interval for the value of the coefficient (in the population).

6. A strip of thickness 1 and width 1 is replaced by a strip of thickness 3 and width 3. This change does not affect any other aspect of the production. Provide an estimate for the change in delta throughput.
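A hedged note on question 6: the swap lowers the thickness 1 and width 1 counts by one each and raises the thickness 3 and width 3 counts by one each, so the predicted change in delta throughput is the corresponding combination of estimated slopes,

Change in delta throughput = (b_thickness3 – b_thickness1) + (b_width3 – b_width1),

where b_x denotes the estimated coefficient of variable x in your final model, taken as zero for any cluster you omitted as the baseline.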

PART C: PREDICTION OF MAY THROUGHPUT

Two weeks after the first phone call about the deviations of production figures from planned volumes, Schulze was happy to have a sound prediction model on hand. Now he was looking forward to applying the model to future planning periods. The planning meeting for May was scheduled for the next day, and the production engineers had provided the requested material-structure data that would serve as input for the model.

“Let’s see what the prediction tells us,” Schulze said to Täger. As usual, the initial plan included an average capacity of 750 tons per shift. “I’m pretty sure the initial estimate will yield a useful first benchmark, but we also need to look at the uncertainty in the forecast,” Schulze continued, and he entered the data.



“All right,” Täger replied. “I can see the predicted deviation from planned production for the next month in the model. We should show this in the planning meeting tomorrow and adjust the line capacity for May.”

The next day, the predicted outcome was included in the monthly planning for the very first time. A new era of production planning at ThyssenKrupp Steel Europe had begun.

Next, determine Schulze’s forecast.

Questions:

7. The table below shows the data provided by the production engineers. Because of major upcoming maintenance on the PPL, only 84 shifts were planned for the month of May. Provide an estimate for the average delta throughput per shift in May based on these estimated figures. (The actual figures are, of course, still unknown.)

Table 2: Planned Production in May (units of all forecasts: numbers of strips)

Characteristic   Forecast
Thickness 1      996
Thickness 2      1,884
Thickness 3      434
Width 1          1,242
Width 2          1,191
Grade 1          109
Grade 2          709
Grade 3          167
Grade 4          243
Grade 5          121
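A sketch of how the question 7 estimate might be assembled, continuing the hypothetical Python model above. Two conversions are needed because the model is fit per shift: Table 2 gives monthly strip counts, so they must be divided by the 84 planned shifts, and the grade forecasts must be converted from strip counts into percentage shares of all strips; the RTR is set to Schulze’s guess of 86%. (If your final model dropped some of these variables, drop the matching columns.)

    import pandas as pd

    shifts = 84
    total_strips = 996 + 1884 + 434  # same total across thickness and width

    may = pd.DataFrame({
        "thickness_1": [996 / shifts],
        "thickness_2": [1884 / shifts],
        "thickness_3": [434 / shifts],
        "width_1":     [1242 / shifts],
        "width_2":     [1191 / shifts],
        # grade shares in %, matching the data set's definition
        "grade_1":     [100 * 109 / total_strips],
        "grade_2":     [100 * 709 / total_strips],
        "grade_3":     [100 * 167 / total_strips],
        "grade_4":     [100 * 243 / total_strips],
        "grade_5":     [100 * 121 / total_strips],
        "RTR":         [86.0],  # Schulze's assumed run time ratio
    })

    pred = model.get_prediction(may)
    # mean_ci_* columns: 90% confidence interval for the average (question 8)
    print(pred.summary_frame(alpha=0.10))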

8. Provide a 90% confidence interval for the average delta throughput per shift in May.

9. An RTR of 86% for a production facility such as the Bochum PPL is considered a good value. A value of 90% would be considered world class. The effort to increase production performance measured in RTR by just one percentage point, from 86% to 87%, is assumed to be very costly. In light of your model, would you expect such a performance improvement to pay for itself?

PART D: ADDITIONAL ANALYSIS

Schulze’s prediction model led to an intensive discussion in the production-planning meeting that provided him with much food for thought. As a result, he decided to analyze whether the inclusion of some human or timing factors potentially could enhance his prediction model.

In the final part of the analysis, consider some enhancements to your model.


Questions:

10. Determine whether, for given production quantities, the performance of the PPL depends on the group working each shift. Can you detect any significantly over- or under-performing shift groups?

11. Tests and rework are regularly scheduled on early shifts during the week (but not on weekends). Both involve interruptions and slower process speeds, which are not recorded as downtimes and are not included in the RTR. As a result, all else being equal, early shifts during the week should process less steel than the other shifts. Can you show the presence of this effect?

12. Provide a final critical evaluation of your prediction model. What are the key insights with respect to production planning at the Bochum PPL? What are the weaknesses of your model?

KH19, Exercises


Exercises

QUESTION 1

Unoccupied seats on flights cause airlines to lose revenues. A large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each of the flights in the sample. The sample mean is 14.5 seats and the sample standard deviation is s = 8.2 seats.

a) Provide a 95% confidence interval for the mean number of unoccupied seats per flight during the past year.

b) Provide an 80% confidence interval for the mean number of unoccupied seats per flight during the past year.

c) Can you prove, at a 2% level of significance, that the average number of unoccupied seats per flight during the last year was smaller than 15.5?
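For readers who want to check their hand calculations, the sketch below mirrors them in Python (one assumption of this note: the large sample of n = 225 justifies normal critical values):

    from math import sqrt
    from scipy import stats

    n, xbar, s = 225, 14.5, 8.2
    se = s / sqrt(n)  # standard error of the sample mean

    # Parts (a) and (b): two-sided confidence intervals for the mean
    for conf in (0.95, 0.80):
        z = stats.norm.ppf((1 + conf) / 2)
        print(conf, xbar - z * se, xbar + z * se)

    # Part (c): one-sided test of H0: mu >= 15.5 against H1: mu < 15.5
    z_stat = (xbar - 15.5) / se
    print(z_stat, stats.norm.cdf(z_stat))  # reject at the 2% level if p < 0.02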

QUESTION 2

During the National Football League (NFL) season, Las Vegas odds-makers establish a point spread on each game for betting purposes. The final scores of NFL games were compared against the final spreads established by the odds-makers ahead of the game. The difference between the game outcome and point spread is called the point-spread error. For example, before the 2003 Super Bowl the Oakland Raiders were established as 3-point favorites over the Tampa Bay Buccaneers. Tampa Bay won the game by 27 points and so the point-spread error was –30. (Had the Oakland Raiders won the game by 10 points then the point-spread error would have been +7.) In a sample of 240 NFL games the average point-spread error was –1.6. The sample standard deviation was s = 13.3.

Can you reject that the true mean point-spread error for all NFL games is zero? (significance level α = 0.05)


QUESTION 3

In a random sample of 95 manufacturing firms, 67 respondents have indicated that their company attained ISO certification within the last two years. Find a 99% confidence interval for the population proportion of companies that have been certified within the last two years.

QUESTION 4

Of a random sample of 361 owners of small businesses that had gone into bankruptcy, 105 reported conducting no marketing studies prior to opening the business. Can you reject the null hypothesis that at most 25% of all members of this population conducted no marketing studies before opening the business (significance level α = 0.05)?
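Questions 3 and 4 both rely on the large-sample normal approximation for proportions; a minimal sketch of the two computations in Python (the formulas are standard, only the code is new here):

    from math import sqrt
    from scipy import stats

    # Question 3: 99% confidence interval for a population proportion
    n, x = 95, 67
    p_hat = x / n
    z = stats.norm.ppf(0.995)  # two-sided 99% critical value
    se = sqrt(p_hat * (1 - p_hat) / n)
    print(p_hat - z * se, p_hat + z * se)

    # Question 4: test H0: p <= 0.25 against H1: p > 0.25
    n, x, p0 = 361, 105, 0.25
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error under the null
    z_stat = (p_hat - p0) / se0
    print(z_stat, 1 - stats.norm.cdf(z_stat))  # reject at 5% if p-value < 0.05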

QUESTION 5

Hertz contracts with Uniroyal to provide tires for Hertz’s rental car fleet. A clause in the contract states that the tires must have a life expectancy of at least 28,000 miles. Of the 10,000 cars in the Hertz fleet, 400 are based in Chicago. The Chicago garage tested the tires on 60 of their cars. The life spans of the 60 tire sets are listed in the file tires.xls. If Hertz wants to use a 1% level of significance, should Hertz seek relief from (i.e., sue) Uniroyal? That is, can Hertz prove that the tires did not meet the contractually agreed (average) life expectancy?

QUESTION 6

Tyler Realty would like to be able to predict the selling price of new homes. They have collected data on size (“sqfoot” in square feet) and selling price (“price” in thousands of dollars), which are stored in the file tyler.xls. Download this file from the course homepage and answer the following questions.

a) Develop a scatter diagram for these data with size on the horizontal axis using KStat. Display the best-fit line in the scatter diagram.

b) Develop an estimated regression equation. Report the KStat regression output.

c) Predict the selling price for a home that is 2,000 square feet.
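The exercise asks for KStat output; as a cross-check, here is a hedged Python equivalent, assuming tyler.xls contains exactly the two columns named above:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_excel("tyler.xls")  # columns: sqfoot, price (in $ thousands)

    # Part (b): estimated regression equation price = b0 + b1 * sqfoot
    model = smf.ols("price ~ sqfoot", data=df).fit()
    print(model.summary())

    # Part (c): predicted selling price for a 2,000-square-foot home
    print(model.predict(pd.DataFrame({"sqfoot": [2000]})))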


QUESTION 7

The time between eruptions of the Old Faithful geyser in Yellowstone National Park is random but is related to the duration of the previous eruption. In order to investigate this relationship you collect data on 21 eruptions. For each observed eruption, you write down its duration (call it DUR) and the waiting time to the next eruption (call it TIME). That is, your variables are:

DUR    Duration of the previous eruption (in minutes)

TIME   Time until the next eruption (in minutes)

You obtain the following regression output from KStat.

Regression: TIME
                       Constant      DUR
Coefficient            31.01311      9.79006898
std error of coef      4.41658492    1.29990618
t-ratio                7.0220        7.5314
p-value                0.0001%       0.0000%

a) Write down the estimated regression equation, and verbally interpret the intercept and the slope coefficients (in terms of geysers and eruption times).

b) The most recent eruption lasted 3 minutes. What is your best estimate for the time till the next eruption?

c) Based on your regression, what is the difference between the average time until the next eruption after a 3.2-minute eruption and the average time until the next eruption after a 3-minute eruption?
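Since the output above already contains the fitted coefficients, parts (b) and (c) reduce to arithmetic with the estimated equation TIME = 31.013 + 9.790 × DUR; a sketch:

    b0, b1 = 31.01311, 9.79006898  # intercept and slope from the KStat output

    # Part (b): expected waiting time after a 3-minute eruption
    print(b0 + b1 * 3)             # roughly 60.4 minutes

    # Part (c): difference in average waiting times, 3.2- vs. 3-minute eruption
    print(b1 * (3.2 - 3.0))        # the slope times 0.2, roughly 2 minutes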