Extrapolation Technique Summarized


Page 1: Extrapolation Technique Summarized

• The extrapolation technique (aka curve fitting) is a simplistic model that uses past gross population trends to project future population levels.

• “The defining characteristics of trend extrapolation is that future values of any variable are determined solely by its historical values.” (SLPP, p. 161 emphasis added)

• Basic Procedure:
  1) Identify the overall past trend and fit the proper curve
  2) Project future populations based upon your chosen curve

• We fit most of these curves using a linear equation. A linear transformation is required to make projections for all but the Parabolic Curve.

• Advantages:
  1) Low data requirements
  2) Very easy methodology
  3) 1 + 2 = Low resource requirements (money, skills, etc.)

• Disadvantages:
  1) Uses only aggregate data
  2) Assumes that past trends will predict the future

Page 2: Extrapolation Technique Summarized

Visualizing the Technique

[Figure: Leon County Population, 1940-1990. X-axis: Year (1940 to 1990); Y-axis: Population (0 to 250,000).]

Page 3: Extrapolation Technique Summarized

The Curves to Be Fit

• Linear Curve: Plots a straight line based on the formula: Y = a + bX

• Geometric Curve: Plots a curve based upon a rate of compounding growth over discrete intervals via the formula: Y = ae^(bX)

• Parabolic (Polynomial) Curve: A curve with “one bend” and a constantly changing slope. Formula: Y = a + bX + cX^2

• Modified Exponential Curve*: An asymptotic growth curve that recognizes that a region will reach an upper limit of growth. It takes the form: Y = c + ab^X

• Gompertz Curve*: Describes a growth pattern that is quite slow at first, increases for a time, and then tapers off as the population approaches a growth limit. Form: Y = ca^(b^X)

• Logistic Curve*: Similar to the Gompertz Curve, this is useful for describing phenomena that grow slowly at first, increase rapidly, and then slow as they approach a growth limit. Form: Y = (c + ab^X)^(-1)

* = Asymptotic Curves

Page 4: Extrapolation Technique Summarized

The Linear Curve (Y = a + bX)

• Fits a straight line to population data. The amount of growth per period is assumed to be constant, with non-compounding incremental growth. Calculated exactly as in linear regression (least-squares criterion).

• Advantages:
  --Simplest curve
  --Most widely used
  --Useful for slow or non-growth areas

• Disadvantages:
  --Rarely appropriate to demographic data

• Example:

Y = 55,000 + 6,000(X)

In plain language, this equation tells us that for each year that passes, we can project an additional 6,000 people will be added to the population. So, in 10 years we would project 60,000 more people using this equation (6,000 * 10).

• Evaluation: Generally used as a starting point for curve fitting.

Page 5: Extrapolation Technique Summarized

Manatee County Linear Curve

Year    Actual Data    Projection
1950         34,704        21,421
1960         69,168        67,862
1970         97,115       114,303
1980        148,442       160,743
1990        211,707       207,184
2000        264,002       253,625
2010                      300,066
2020                      346,507
2030                      392,948

Y Int: -9034568.9
Slope: 4644.09714
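As a rough illustration of the least-squares calculation behind the Manatee County linear table, the sketch below fits Y = a + bX to the 1950-2000 figures using only the Python standard library. The function name `fit_line` and the variable names are illustrative assumptions, not part of the original slides.

```python
# Minimal sketch: ordinary least squares for Y = a + bX (standard library only).
# Data are the Manatee County figures from the linear curve table.
years = [1950, 1960, 1970, 1980, 1990, 2000]
population = [34704, 69168, 97115, 148442, 211707, 264002]


def fit_line(xs, ys):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(years, population)

# Project beyond the base period by plugging future years into the fitted line.
for year in (2010, 2020, 2030):
    print(year, round(intercept + slope * year))
```

The fitted slope is the projected number of people added per year, which is why each ten-year step in the projection column grows by the same amount.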

Page 6: Extrapolation Technique Summarized

[Figure: Manatee County Linear Regression Projections. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 450,000); series: Actual Data, Projection.]

Page 7: Extrapolation Technique Summarized

The Geometric Curve (Y = ae^(bX))

• In this curve, growth is assumed to compound at set intervals at a constant rate. To transform this equation into a linear equation, we use logarithms.

• Advantages:
  --Assumes a constant rate of growth
  --Still simple to use

• Disadvantage:
  --Does not take into account a growth limit

• Example:

Y = 55,000 * (1.00 + 0.06)^X

In plain language, this equation tells us that we have a 6% growth rate. After one year we project a population of 58,300. After 10 years we would project a population of 98,497.

• Evaluation: Pretty good for short-term projections in fast-growing areas. However, over the long run, this curve usually generates unrealistically high numbers.
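The compounding in the geometric example can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the second function is included to show why taking logarithms turns the curve into a straight line.

```python
import math

base_population = 55_000   # population at X = 0, from the example
growth_rate = 0.06         # 6% growth per period


def geometric(x):
    """Y = a * (1 + r)^X: constant-rate growth compounded once per period."""
    return base_population * (1 + growth_rate) ** x


def geometric_via_logs(x):
    """Same projection on the log scale, where the curve is linear in X:
    log10(Y) = log10(a) + X * log10(1 + r)."""
    log_y = math.log10(base_population) + x * math.log10(1 + growth_rate)
    return 10 ** log_y


print(round(geometric(1)))    # 58300
print(round(geometric(10)))   # 98497
```

Both functions return the same values, so a linear fit on the logs of population is equivalent to fitting the geometric curve directly.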

Page 8: Extrapolation Technique Summarized

Manatee County Geometric Curve

Year    Actual Data    Log of Pop    Log Proj    Projection
1950         34,704        4.5404      4.6158        41,281
1960         69,168        4.8399      4.7885        61,454
1970         97,115        4.9873      4.9613        91,484
1980        148,442        5.1716      5.1341       136,189
1990        211,707        5.3257      5.3069       202,741
2000        264,002        5.4216      5.4797       301,813
2010                                   5.6525       449,298
2020                                   5.8253       668,855
2030                                   5.9981       995,702

Y Int: -29.080
Slope: 0.0173
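The log columns in the geometric table suggest the fit was done as a linear regression on base-10 logarithms of population. The sketch below follows that assumption; `fit_line` and `project` are illustrative names, and small differences from the slide values can arise from rounding.

```python
import math

years = [1950, 1960, 1970, 1980, 1990, 2000]
population = [34704, 69168, 97115, 148442, 211707, 264002]


def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: transform the data so the geometric curve becomes a straight line.
log_population = [math.log10(p) for p in population]

# Step 2: fit the line on the log scale.
intercept, slope = fit_line(years, log_population)


def project(year):
    """Step 3: project on the log scale, then take the antilog."""
    return 10 ** (intercept + slope * year)


for year in (2010, 2020, 2030):
    print(year, round(project(year)))
```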

Page 9: Extrapolation Technique Summarized

[Figure: Manatee County Geometric Curve Projections. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 1,200,000); series: Actual Data, Projection.]

Page 10: Extrapolation Technique Summarized

The Parabolic Curve (Y = a + bX + cX^2)

• Generally has a constantly changing slope and one bend. Very similar to the Linear Curve except for the additional parameter (c). The curve grows very quickly when c > 0 and declines quickly when c < 0.

• Advantage:
  --Models fast growing areas

• Disadvantages:
  --Poor for long range projections (familiar refrain?)
  --No Growth Limit
  --More complex

• Example:

Y = 43.46 + 8.78(X) + 0.581(X^2)

When X = 0, Y = 43.46. When X = 6, Y = 117.1

• Evaluation: Exactly the same as the Geometric Curve; good for fast growing areas, but poor over the long run.

Page 11: Extrapolation Technique Summarized

Manatee County Parabolic Curve (Even Number of Observations)

Year   Actual Data   Index Value   Index Squared   Index ^4   Product of Index   Product of Index Squared   Projection
                                                               and Observed       and Observed
1950        34,704            -5              25        625            -173520                     867600       35,136
1960        69,168            -3               9         81            -207504                     622512       65,118
1970        97,115            -1               1          1             -97115                      97115      103,330
1980       148,442             1               1          1             148442                     148442      149,771
1990       211,707             3               9         81             635121                    1905363      204,441
2000       264,002             5              25        625            1320010                    6600050      267,341
2010                           7              49       2401                                                    338,471
2020                           9              81       6561                                                    417,830
2030                          11             121      14641                                                    505,419
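The index and product columns in the parabolic table are the ingredients of a least-squares fit of Y = a + bX + cX^2. The sketch below is one way to carry out that calculation, assuming the symmetric index values (-5 to 5) shown in the table; variable names are illustrative.

```python
population = [34704, 69168, 97115, 148442, 211707, 264002]
index = [-5, -3, -1, 1, 3, 5]   # symmetric index for an even number of observations

n = len(index)
sum_y = sum(population)
sum_x2 = sum(x ** 2 for x in index)                            # "Index Squared" total
sum_x4 = sum(x ** 4 for x in index)                            # "Index ^4" total
sum_xy = sum(x * y for x, y in zip(index, population))         # index * observed
sum_x2y = sum(x ** 2 * y for x, y in zip(index, population))   # index^2 * observed

# Because the index is symmetric, sum(x) and sum(x^3) are zero, so the
# least-squares normal equations reduce to:
#   sum_xy  = b * sum_x2
#   sum_y   = n * a + c * sum_x2
#   sum_x2y = a * sum_x2 + c * sum_x4
b = sum_xy / sum_x2
c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
a = (sum_y - c * sum_x2) / n


def parabolic(x):
    """Projected population at index value x (2010 -> 7, 2020 -> 9, 2030 -> 11)."""
    return a + b * x + c * x ** 2


for year, x in ((2010, 7), (2020, 9), (2030, 11)):
    print(year, round(parabolic(x)))
```

Centering the index on zero is what makes the arithmetic manageable by hand: the odd-powered sums vanish and b can be solved independently of a and c.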

Page 12: Extrapolation Technique Summarized

[Figure: Manatee County Parabolic Curve. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 600,000); series: Actual Data, Projections.]

Page 13: Extrapolation Technique Summarized

Modified Exponential Curve (Y = c + ab^X)

• The first of the Asymptotic Curves. Takes into account an upper or lower limit when computing projected values. The asymptote can be derived from local analysis or supplied by the model itself.

• Advantages:
  --Growth limit is introduced
  --“Best fitting” growth limit

• Disadvantages:
  --Much more complex calculations
  --Misleading “Growth limit” (high and low)

• Example:

Yc = 114 - 64(0.75)^X

The growth limit is 114. As the number of time periods X gets larger, the projection gets closer to the growth limit. When X = 0, Y = 50; when X = 2, Y = 78, etc.

• Evaluation: This curve largely depends upon the growth limit. If the limit is reasonable, then the curve can be a good one. Also, the ability to calculate the growth limit within the model is very useful.
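To see how the modified exponential example approaches its growth limit, the following sketch evaluates the example equation for a few values of X. The function name and default arguments are illustrative; the coefficients are those of the example.

```python
def modified_exponential(x, c=114.0, a=-64.0, b=0.75):
    """Y = c + a * b^X. With a < 0 and 0 < b < 1, Y rises toward the limit c."""
    return c + a * b ** x


for x in (0, 2, 5, 10, 50):
    print(x, round(modified_exponential(x), 2))
```

Each additional period closes a fixed share (here 25%) of the remaining gap between the projection and the limit, so the curve never exceeds 114.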

Page 14: Extrapolation Technique Summarized

Manatee County Modified Exponential Curve

Year    Index    Actual Data    Projection
1950        0         34,704        38,242
1960        1         69,168        65,630
1970        2         97,115       100,535
1980        3        148,442       145,022
1990        4        211,707       201,722
2000        5        264,002       273,987
2010        6                      366,090
2020        7                      483,476
2030        8                      633,087
Total                825,138

Page 15: Extrapolation Technique Summarized

[Figure: Manatee County Pop Projections, Best Fitting Mod Exp Curve. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 700,000); series: Actual Data, Projection.]

Page 16: Extrapolation Technique Summarized

The Gompertz Curve (Y = ca^(b^X))

• Describes a growth pattern that is initially quite slow, increases for a period and then tapers off. Like the Mod Exp curve, the upper limit can be assumed or derived by the model.

• Advantage:
  --Reflects very common growth patterns

• Disadvantages:
  --Getting even more complex
  --Misleading growth limit (limit can be high or low)

• Example:

log Yc = 2.699 - 1.056(0.9221)^X

The equation itself is tough to understand. When X = 0, log Y = 1.64, so Y = 44.0 (via antilog calculation). Note: Antilog of 2.699 is 500 (the growth limit).

• Evaluation: A very useful curve that can be fitted to all kinds of growth patterns. However, as with the previous curve, using an assumed growth limit can be problematic unless it is reasonable and makes sense for the case at hand.
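The Gompertz example is easier to follow when evaluated step by step: compute the log of Y, then take the antilog. The sketch below assumes base-10 logarithms, which is consistent with the antilog of 2.699 being about 500; the function name is illustrative.

```python
def gompertz(x, log_c=2.699, log_a=-1.056, b=0.9221):
    """log10(Y) = log10(c) + log10(a) * b^X, returned on the original scale."""
    log_y = log_c + log_a * b ** x
    return 10 ** log_y


growth_limit = 10 ** 2.699   # antilog of the constant term

print(round(gompertz(0), 1))     # value at X = 0
print(round(growth_limit, 1))    # the curve's upper limit
```

As X grows, the term (0.9221)^X shrinks toward zero, so log Y approaches 2.699 and Y approaches the growth limit.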

Page 17: Extrapolation Technique Summarized

Manatee County Gompertz Curve

Year    Index    Actual Data    Log of Obs Value    Log of Proj    Projection
1950        0         34,704              4.5404         4.5788        37,910
1960        1         69,168              4.8399         4.8015        63,319
1970        2         97,115              4.9873         4.9952        98,906
1980        3        148,442              5.1716         5.1636       145,754
1990        4        211,707              5.3257         5.3100       204,186
2000        5        264,002              5.4216         5.4373       273,726
2010        6                                            5.5480       353,169
2020        7                                            5.6442       440,755
2030        8                                            5.7278       534,378
Total                825,138

Page 18: Extrapolation Technique Summarized

[Figure: Manatee County Pop Projections, Best Fitting Gompertz Curve. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 600,000); series: Actual Data, Projection.]

Page 19: Extrapolation Technique Summarized

The Logistic Curve (Y = (c + ab^X)^(-1))

• Very similar to the Mod Exp and the Gompertz curves, except that we are taking the reciprocals of the observed values. A very popular curve.

• Advantages:
  --Has proven to be a good projection tool
  --Considered a bit more stable than the Gompertz curve

• Disadvantages:
  --Complex!
  --Hard to interpret the formula

• Example:

Yc^(-1) = 0.0020 + 0.0217(0.8015)^X

Another difficult-to-interpret equation. When X = 0, Y = 42.1. When X = 6, Y = 128.9. Note: Reciprocal of 0.0020 is 500 (the growth limit, GL).

• Evaluation: Considered to be the “best” of the extrapolation curves. It reflects a well-known growth pattern. It is more stable than the Gompertz curve and it does not have a misleading growth limit.
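The logistic example can be evaluated the same way as the other asymptotic curves: compute the reciprocal of Y, then invert it. Note one assumption in this sketch: the second coefficient is taken to be 0.0217, the value that reproduces the stated results of roughly 42 at X = 0 and 129 at X = 6. The function name is illustrative.

```python
def logistic(x, c=0.0020, a=0.0217, b=0.8015):
    """1/Y = c + a * b^X, returned as Y. The value of `a` is an assumption
    chosen to match the example's stated results."""
    reciprocal = c + a * b ** x
    return 1.0 / reciprocal


growth_limit = 1.0 / 0.0020   # reciprocal of the constant term

for x in (0, 6, 20, 50):
    print(x, round(logistic(x), 1))
```

Because the term a * b^X shrinks toward zero as X grows, 1/Y approaches 0.0020 and Y approaches the growth limit of 500.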

Page 20: Extrapolation Technique Summarized

Manatee County Logistic Curve

Year    Index    Actual Data    Recip of Observed    Recip of Proj    Projection
1950        0         34,704           0.00002882      0.000026959        37,093
1960        1         69,168           0.00001446      0.000016313        61,300
1970        2         97,115           0.00001030      0.000010246        97,601
1980        3        148,442           0.00000674      0.000006788       147,321
1990        4        211,707           0.00000472      0.000004817       207,588
2000        5        264,002           0.00000379      0.000003694       270,700
2010        6                                          0.000003054       327,434
2020        7                                          0.000002689       371,848
2030        8                                          0.000002481       403,002

Page 21: Extrapolation Technique Summarized

[Figure: Manatee County Pop Projections, Best Fitting Logistic Curve. X-axis: Year (1950 to 2030); Y-axis: Population (0 to 450,000); series: Actual Data, Projection.]

Page 22: Extrapolation Technique Summarized

The Curve Fitting Procedure

1) Plot the data in a chart
2) Eyeball the data:
   --Identify and eliminate “erroneous data”
   --Identify past population trends
   --Eliminate curves that don’t fit the data
3) Process the data using the chosen curves and plot your results in charts
4) Use quantitative procedures to identify best-fitting curves
5) Make your choice of forecast based upon a combination of quantitative and qualitative evaluations of the various projections

• Many issues affect the fit of the various curves:
  --Choice of the Base Period, including the Base Year
  --Calibration of projections
  --Use of Growth Limits
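Step 4 does not name a specific quantitative procedure. As one possible illustration, the sketch below compares three of the Manatee County fits using the mean absolute percentage error (MAPE) over the base period. MAPE is an assumption here, chosen because it is a common fit measure; the projected values are copied from the linear, geometric, and parabolic tables.

```python
actual = [34704, 69168, 97115, 148442, 211707, 264002]   # 1950-2000

# Base-period projections copied from the Manatee County tables.
projections = {
    "linear":    [21421, 67862, 114303, 160743, 207184, 253625],
    "geometric": [41281, 61454, 91484, 136189, 202741, 301813],
    "parabolic": [35136, 65118, 103330, 149771, 204441, 267341],
}


def mape(observed, projected):
    """Mean absolute percentage error, in percent (lower means a closer fit)."""
    errors = [abs(p - o) / o for o, p in zip(observed, projected)]
    return 100 * sum(errors) / len(errors)


# Rank the curves from best fit to worst fit over the base period.
ranking = sorted(projections, key=lambda name: mape(actual, projections[name]))

for name in ranking:
    print(f"{name:10s} {mape(actual, projections[name]):5.1f}%")
```

A close fit to the base period does not by itself make a projection reasonable, which is why step 5 combines the quantitative ranking with qualitative judgment.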

Page 23: Extrapolation Technique Summarized

Understanding Extrapolation

• One basic principle for using the extrapolation technique effectively is: The choice of the Base Period can have a significant impact upon the projection generated.

In the Manatee County example, if we use a varying Base Period and the Lin Reg method, we get the following results:

Year    Actual Data    1920-2000    1950-2000    1980-2000
1970         97,115
1980        148,442
1990        211,707
2000        264,002
2010                     253,817      300,066      323,610
2020                     284,749      346,507      381,390
2030                     315,680      392,948      439,170
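The base-period effect can be reproduced for two of the three columns with the data given in these slides. The sketch below refits the linear curve for the 1950-2000 and 1980-2000 base periods; the 1920-2000 column is omitted because the 1920-1940 figures are not listed here. Function names are illustrative.

```python
census = {1950: 34704, 1960: 69168, 1970: 97115,
          1980: 148442, 1990: 211707, 2000: 264002}


def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def project_from_base_period(first_year, last_year, target_years):
    """Fit the line only to observations inside the base period, then project."""
    xs = [year for year in sorted(census) if first_year <= year <= last_year]
    ys = [census[year] for year in xs]
    intercept, slope = fit_line(xs, ys)
    return {t: round(intercept + slope * t) for t in target_years}


targets = [2010, 2020, 2030]
print("1950-2000:", project_from_base_period(1950, 2000, targets))
print("1980-2000:", project_from_base_period(1980, 2000, targets))
```

The shorter base period captures only the faster recent growth, so its slope is steeper and its projections are higher.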

Page 24: Extrapolation Technique Summarized

[Figure: The Effect of Different Base Periods on Population Projections. X-axis: Year (1920 to 2030); Y-axis: Population (0 to 500,000); series: Actual Data, 1920-2000, 1950-2000, 1980-2000.]

Page 25: Extrapolation Technique Summarized

Improving Extrapolation Projections through Calibration

• The Linear Curve also helps to illustrate one improvement to the extrapolation technique: Oftentimes analysts “calibrate” their model to fit the projection to the observed data.

• Calibration is very simply an adjustment that makes the projected population consistent with the launch year population.

• Calibration is calculated by subtracting the estimated population from the observed population in the Launch Year (Observed – Estimated).

• In our Manatee County example, the adjustment for BY1950 is:
  Observed Pop 2000: 264,002
  Estimated Pop 2000: 253,625
  Calibration: +10,377

• This figure is then added to all subsequent projections that use this combination of curve type (Lin Reg) and base period (1950-2000).

• Calibration is typically used with the Lin Reg technique, but can be used with the others as well.
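The calibration arithmetic can be sketched in a few lines, using the launch-year values and the uncalibrated linear projections (base period 1950-2000) from the Manatee County example. Variable names are illustrative.

```python
observed_launch = 264_002    # observed population in the launch year (2000)
estimated_launch = 253_625   # linear-curve estimate for the launch year

# Uncalibrated linear projections for base period 1950-2000.
uncalibrated = {2010: 300_066, 2020: 346_507, 2030: 392_948}

# Calibration = Observed - Estimated in the launch year.
calibration = observed_launch - estimated_launch

# Shift every subsequent projection by the same amount.
calibrated = {year: value + calibration for year, value in uncalibrated.items()}

print(calibration)   # 10377
print(calibrated)
```

The adjustment shifts the whole projected line up or down without changing its slope, so the projection passes through the observed launch-year population.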

Page 26: Extrapolation Technique Summarized

Improving Extrapolation Projections through Upper Limits

• The three asymptotic curves (Mod Exp, Gompertz, Logistic) have two derivations that offer an opportunity to “fine tune” our projections:

1) Under one approach the model itself calculates a limit to population growth.

2) Alternatively the analyst can set an “upper limit” for the population.

• This upper limit can be generated by a carrying capacity analysis (as in Monroe County, the Keys) or from some other study that generates an upper population bound.

• The concept of growth limits has been found to be very useful in projections as populations cannot grow infinitely… there is some limit to their growth.

• There is evidence that incorporating this concept into the extrapolation technique generates better projections.
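As a sketch of the second approach, the code below fits a modified exponential curve to the Manatee County data with an analyst-supplied upper limit. The 1.2 million limit is used only as an illustrative assumption, and the log-linear fitting method shown is one common way to do this, not necessarily the procedure used to produce the slides' charts.

```python
import math

index = [0, 1, 2, 3, 4, 5]   # 1950 = 0, 1960 = 1, ..., 2000 = 5
population = [34704, 69168, 97115, 148442, 211707, 264002]
upper_limit = 1_200_000      # assumed by the analyst, not estimated


def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# With c fixed, Y = c + a * b^X (a < 0) rearranges to c - Y = (-a) * b^X, so
# log10(c - Y) = log10(-a) + X * log10(b) is a straight line in X.
gap_logs = [math.log10(upper_limit - p) for p in population]
log_neg_a, log_b = fit_line(index, gap_logs)
a = -(10 ** log_neg_a)
b = 10 ** log_b


def project(x):
    """Projected population at index x; stays below the assumed limit."""
    return upper_limit + a * b ** x


for year, x in ((2010, 6), (2020, 7), (2030, 8)):
    print(year, round(project(x)))
```

Changing `upper_limit` and rerunning shows how strongly the projections depend on the assumed limit, which is why the limit should come from a defensible source such as a carrying capacity analysis.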

Page 27: Extrapolation Technique Summarized

Manatee County Example, BP 1950-2000

[Figure: County Population Projections, Best Fitting Modified Exponential Curve (Limit Calculated by Model). X-axis: Year (1950 to 2030); Y-axis: Population (0 to 700,000); series: Actual Data, Projection.]

[Figure: County Mod Exp UL Pop Projections (Upper Limit Assumed To Be 1.2 Million People). X-axis: Year (1950 to 2020); Y-axis: Population (0 to 400,000); series: Actual Data, Projection.]