§15.3–Estimation and Prediction

Tom Lewis

Fall Term 2009

Tom Lewis () §15.3–Estimation and Prediction Fall Term 2009 1 / 10

Outline

1 Estimating the conditional mean of the response variable

2 Estimating the observed value of the response variable

Estimating the conditional mean of the response variable

The problem

We will assume that y = β1x + β0 + ε satisfies the assumptions of the regression model and that the standard deviation of ε is σ. Given a fixed value xp of the explanatory variable, what is our best estimate for the value of β1xp + β0?

The basic idea

Given a sample {x1, x2, . . . , xn} of values of the explanatory variable, together with the observed responses, we can construct the regression equation ŷ = b1x + b0. The value

ŷp = b1xp + b0

should be a close approximation to β1xp + β0, which is called the conditional mean of the response variable (at x = xp).

To understand how good an approximation ŷp gives, we need to understand its variability.
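As a minimal sketch of the point estimate, the following Python computes b1 and b0 by least squares and evaluates ŷp = b1xp + b0. The function names `fit_line` and `point_estimate` and the five data points are hypothetical and chosen only for illustration; the data lie exactly on y = −2x + 30, so the fit should recover that line.

```python
def fit_line(xs, ys):
    """Return (b1, b0) for the least-squares line y = b1*x + b0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b1, b0


def point_estimate(b1, b0, x_p):
    """Estimate the conditional mean of the response at x = x_p."""
    return b1 * x_p + b0


# Hypothetical, noise-free data lying exactly on y = -2x + 30.
xs = [1, 2, 3, 4, 5]
ys = [28, 26, 24, 22, 20]

b1, b0 = fit_line(xs, ys)
y_hat_p = point_estimate(b1, b0, 8.5)
print(b1, b0, y_hat_p)
```

With noisy data the same code applies unchanged; only then does ŷp differ from β1xp + β0, which is why its variability matters.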

Estimating the conditional mean of the response variable

Theorem

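The random variable ŷp is normally distributed.

The mean of ŷp is β1xp + β0.

The standard deviation of ŷp is

\[ \sigma \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}}}. \]

In other words,

\[ z = \frac{\hat{y}_p - (\beta_1 x_p + \beta_0)}{\sigma \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{S_{xx}}}} \]

is a standard normal random variable.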
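The theorem can be checked informally by simulation. The sketch below repeatedly generates responses from y = β1x + β0 + ε, refits the line, and compares the spread of the resulting point estimates with the stated standard deviation. The design x = 1, 2, ..., 10, the parameter values, the seed, and the function names are all assumptions made for illustration; they are not taken from the slides' data set.

```python
import math
import random
import statistics


def theoretical_sd(sigma, n, x_p, x_bar, s_xx):
    """Standard deviation of the point estimate, as stated in the theorem."""
    return sigma * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)


def simulate_point_estimates(xs, beta1, beta0, sigma, x_p, reps, seed=0):
    """Refit the regression line on fresh simulated responses, reps times."""
    rng = random.Random(seed)
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    estimates = []
    for _ in range(reps):
        ys = [beta1 * x + beta0 + rng.gauss(0, sigma) for x in xs]
        y_bar = sum(ys) / n
        s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        b1 = s_xy / s_xx
        b0 = y_bar - b1 * x_bar
        estimates.append(b1 * x_p + b0)
    return estimates


xs = list(range(1, 11))  # assumed design: x = 1, ..., 10 (x_bar = 5.5, Sxx = 82.5)
estimates = simulate_point_estimates(xs, -2.0, 30.0, 1.0, 8.5, reps=20000)

sim_mean = statistics.mean(estimates)
sim_sd = statistics.stdev(estimates)
theory_sd = theoretical_sd(1.0, 10, 8.5, 5.5, 82.5)
print(f"simulated mean {sim_mean:.3f}, simulated sd {sim_sd:.3f}, theory sd {theory_sd:.3f}")
```

The simulated mean should sit near β1xp + β0 = 13 and the simulated standard deviation near the theoretical value, up to sampling noise.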

Estimating the conditional mean of the response variable

Theorem

The random variable

\[ t = \frac{\hat{y}_p - (\beta_1 x_p + \beta_0)}{s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{S_{xx}}}} \]

has a t-distribution with n − 2 degrees of freedom.
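The confidence interval for the conditional mean follows from this theorem in one step. The derivation below is a sketch under two assumptions not spelled out on this slide: that s_e is the usual standard error of the estimate, s_e = sqrt(SSE / (n − 2)), and that t_{α/2} denotes the value with area α/2 to its right under the t-curve with n − 2 degrees of freedom.

```latex
\[
P\left( -t_{\alpha/2} \le
  \frac{\hat{y}_p - (\beta_1 x_p + \beta_0)}
       {s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{S_{xx}}}}
  \le t_{\alpha/2} \right) = 1 - \alpha ,
\]
so solving the inequalities for $\beta_1 x_p + \beta_0$ gives the
$(1 - \alpha)$-level confidence interval
\[
\hat{y}_p \;\pm\; t_{\alpha/2}\, s_e
  \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}}} .
\]
```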

Estimating the conditional mean of the response variable

Problem

The data in the set LinModSec3.txt was created according to the model y = −2x + 30 + ε, where ε is a standard normal random variable. Here are some summary statistics:

Sxx     Syy       Sxy      b1        b0
82.5    338.536   -164.6   -1.9952   30.093

Therefore, with SST = Syy, SSR = Sxy²/Sxx, and SSE = SST − SSR:

SST       SSR           SSE
338.536   328.4019394   10.13406061

Create a 95% confidence interval for the conditional mean of the response variable at xp = 8.5.
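One way to carry out the computation is sketched below in Python. The slide does not state n or the sample mean of x, so the values n = 10 and x̄ = 5.5 are assumptions (they are consistent with Sxx = 82.5 if the x-values were 1, 2, ..., 10, but that is a guess). The critical value 2.306 is the tabled t-value for 8 degrees of freedom at the 95% level, and the function name is hypothetical.

```python
import math

# Summary statistics quoted in the problem.
S_XX = 82.5
SSE = 10.13406061
B1 = -1.9952
B0 = 30.093

# ASSUMPTIONS (not given on the slide): sample size and mean of x.
N = 10
X_BAR = 5.5
T_CRIT = 2.306  # tabled t-value, area 0.025 to the right, N - 2 = 8 df


def mean_confidence_interval(x_p, b1, b0, sse, n, x_bar, s_xx, t_crit):
    """Confidence interval for the conditional mean of the response at x_p."""
    s_e = math.sqrt(sse / (n - 2))  # standard error of the estimate
    y_hat_p = b1 * x_p + b0         # point estimate
    margin = t_crit * s_e * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)
    return y_hat_p - margin, y_hat_p + margin


low, high = mean_confidence_interval(8.5, B1, B0, SSE, N, X_BAR, S_XX, T_CRIT)
print(f"95% CI for the conditional mean at x_p = 8.5: ({low:.3f}, {high:.3f})")
```

Under these assumed values the interval comes out to roughly (11.95, 14.32), which contains the true conditional mean −2(8.5) + 30 = 13. With the actual n and x̄ from LinModSec3.txt the endpoints may differ.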

Estimating the observed value of the response variable

The problem

We will assume that y = β1x + β0 + ε satisfies the assumptions of the regression model and that the standard deviation of ε is σ. Given a fixed value xp of the explanatory variable, what is our best estimate for the value of yp, the observed value of the response variable?

The basic idea

Given a sample {x1, x2, . . . , xn} of values of the explanatory variable, together with the observed responses, we can construct the regression equation ŷ = b1x + b0. The value

ŷp = b1xp + b0

should be a close approximation to yp.

To understand how good an approximation ŷp gives, we need to understand the variability of yp − ŷp.

Estimating the observed value of the response variable

Theorem

The random variable yp − ŷp is normally distributed.

The mean of yp − ŷp is 0.

The standard deviation of yp − ŷp is

\[ \sigma \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}}}. \]

In other words,

\[ z = \frac{y_p - \hat{y}_p}{\sigma \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{S_{xx}}}} \]

is a standard normal random variable.
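The extra 1 under the square root, compared with the standard deviation of the point estimate alone, can be explained by a short variance calculation. This is a sketch that assumes the error in the new observation is independent of the sample used to fit the line, which is part of the usual regression model assumptions.

```latex
\[
E(y_p - \hat{y}_p) = (\beta_1 x_p + \beta_0) - (\beta_1 x_p + \beta_0) = 0 ,
\]
\[
\operatorname{Var}(y_p - \hat{y}_p)
  = \operatorname{Var}(y_p) + \operatorname{Var}(\hat{y}_p)
  = \sigma^2 + \sigma^2\left( \frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}} \right)
  = \sigma^2\left( 1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}} \right) .
\]
```

The first term is the variability of a single observation about its conditional mean; the second is the variability of the fitted line at x = xp.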

Estimating the observed value of the response variable

Theorem

The random variable

\[ t = \frac{y_p - \hat{y}_p}{s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{S_{xx}}}} \]

has a t-distribution with n − 2 degrees of freedom.
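Solving this t-statistic for yp gives an interval for a single observed response, usually called a prediction interval. As a sketch, with t_{α/2} taken to be the value with area α/2 to its right under the t-curve with n − 2 degrees of freedom (notation assumed here, not defined on the slide):

```latex
\[
\hat{y}_p \;\pm\; t_{\alpha/2}\, s_e
  \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}}} .
\]
```

Because of the additional 1 under the root, this interval is always wider than the corresponding interval for the conditional mean at the same xp.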

Estimating the observed value of the response variable

Problem

The data in the set LinModSec3.txt was created according to the model y = −2x + 30 + ε, where ε is a standard normal random variable. Here are some summary statistics:

Sxx     Syy       Sxy      b1        b0
82.5    338.536   -164.6   -1.9952   30.093

Therefore, with SST = Syy, SSR = Sxy²/Sxx, and SSE = SST − SSR:

SST       SSR           SSE
338.536   328.4019394   10.13406061

Create a 95% prediction interval (that is, an interval estimate for the observed value of the response variable) at xp = 8.5.
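The Python sketch below computes the interval for a single observed value and, for comparison, the half-width of the interval for the conditional mean. As before, n = 10 and x̄ = 5.5 are assumptions not stated on the slide (they match Sxx = 82.5 only if the x-values were 1, 2, ..., 10), 2.306 is the tabled 95% t-value for 8 degrees of freedom, and the function name is hypothetical.

```python
import math

# Summary statistics quoted in the problem.
S_XX = 82.5
SSE = 10.13406061
B1 = -1.9952
B0 = 30.093

# ASSUMPTIONS (not given on the slide): sample size and mean of x.
N = 10
X_BAR = 5.5
T_CRIT = 2.306  # tabled t-value, area 0.025 to the right, N - 2 = 8 df


def interval_margins(x_p, sse, n, x_bar, s_xx, t_crit):
    """Return (mean_margin, prediction_margin) at x_p."""
    s_e = math.sqrt(sse / (n - 2))  # standard error of the estimate
    leverage = 1 / n + (x_p - x_bar) ** 2 / s_xx
    mean_margin = t_crit * s_e * math.sqrt(leverage)
    prediction_margin = t_crit * s_e * math.sqrt(1 + leverage)
    return mean_margin, prediction_margin


x_p = 8.5
y_hat_p = B1 * x_p + B0
mean_margin, prediction_margin = interval_margins(x_p, SSE, N, X_BAR, S_XX, T_CRIT)
low, high = y_hat_p - prediction_margin, y_hat_p + prediction_margin
print(f"95% prediction interval at x_p = 8.5: ({low:.3f}, {high:.3f})")
print(f"half-widths: mean {mean_margin:.3f}, prediction {prediction_margin:.3f}")
```

Under these assumed values the prediction interval is roughly (10.28, 15.99), more than twice as wide as the interval for the conditional mean, since a single observation carries its own error ε in addition to the uncertainty in the fitted line.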
