Project 1: Linear models for data
By Kayli-Anna Barriteau · 2014. 2. 14.
Juvenile height
The average height of juveniles as a function of age is shown below. The first number in each pair
represents the age (in years), while the second number represents the height of the juvenile (in inches).
ageheightdata = {{1, 30}, {3, 36}, {5, 43}, {7, 48}, {9, 51}, {11, 56}, {13, 61}}
{{1, 30}, {3, 36}, {5, 43}, {7, 48}, {9, 51}, {11, 56}, {13, 61}}
lm = LinearModelFit[ageheightdata, x, x]
FittedModel[28.80357142857143 + 2.5178571428571415 x]
Plot[lm[x], {x, 1, 13}]
1) FittedModel[28.80357142857143 + 2.5178571428571415 x]
^ This is our straight-line function that represents the data.
According to the graph, a child at birth (0 years) will be around 29 inches tall, whereas a 27-year-old
would be around 97 inches. I do not think that this linear model is accurate at that age, because I believe
that growing stops before the age of 27. The model predicts that the juvenile will grow to just over eight
feet (97 inches is about 8 feet 1 inch), which is rarely seen. For that to happen, the juvenile would have
to keep growing at the same rate every year, and as I mentioned before, the human body stops growing in
height before the age of 27.
2) lm[0]
28.80357142857143
3) lm[27]
96.78571428571425
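As a cross-check outside the notebook, the fit and the two predictions above can be reproduced with an ordinary least-squares calculation. This is only a sketch in Python, not part of the original Mathematica work; the names linear_fit and predict are made up for illustration.

```python
# Cross-check of the juvenile height fit with ordinary least squares.
# linear_fit and predict are illustrative names, not notebook functions.

def linear_fit(points):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# (age in years, height in inches), copied from ageheightdata.
age_height = [(1, 30), (3, 36), (5, 43), (7, 48), (9, 51), (11, 56), (13, 61)]
intercept, slope = linear_fit(age_height)

def predict(age):
    """Height in inches predicted by the fitted line at the given age."""
    return intercept + slope * age

height_at_birth = predict(0)
height_at_27 = predict(27)
print(f"height = {intercept:.4f} + {slope:.4f} * age")
print(f"age 0:  {height_at_birth:.1f} in")
print(f"age 27: {height_at_27:.1f} in ({height_at_27 / 12:.2f} ft)")
```

Dividing the age-27 prediction by 12 converts inches to feet, which makes it easier to judge how unrealistic the extrapolation is.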
U.S. carbon dioxide emissions
The first number in each pair is the year, and the second number is the annual emission estimate
in teragrams of carbon dioxide equivalents.
data = {{1991, 4697.7}, {1992, 4801.0}, {1993, 4921.9}, {1994, 4991.7}, {1995, 5040.6},
{1996, 5231.6}, {1997, 5296.9}, {1998, 5332.7}, {1999, 5399.6}, {2000, 5583.2},
{2001, 5518.8}, {2002, 5554.8}, {2003, 5615.4}, {2004, 5709.4}, {2005, 5748.7}}
{{1991, 4697.7}, {1992, 4801.}, {1993, 4921.9}, {1994, 4991.7}, {1995, 5040.6},
{1996, 5231.6}, {1997, 5296.9}, {1998, 5332.7}, {1999, 5399.6}, {2000, 5583.2},
{2001, 5518.8}, {2002, 5554.8}, {2003, 5615.4}, {2004, 5709.4}, {2005, 5748.7}}
1) p1 = ListPlot[data]
[Plot: emissions (about 4700 to 5750) against year, 1991 to 2005]
The estimated amount of carbon dioxide equivalent as a function of the year, as a plotted graph.
lm = LinearModelFit[data, x, x]
2 kayli.nb
2) FittedModel[-142896.82047619132 + 74.17071428571467 x]
^ The data modeled by a linear function of the year.
p2 = Plot[lm[x], {x, 1991, 2005}]
[Plot: the fitted line lm[x] for 1991 to 2005]
3) Show[{p1, p2}]
[Plot: the data points and the fitted line shown together]
lm["RSquared"]
0.966382
From the visual plot and from the RSquared value, the model falls short of a perfect fit (RSquared = 1) by
about 0.033618. Overall, I believe it to be a pretty good fit because the value is much closer to 1 than to 0.
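For readers who want to see where the RSquared value comes from, here is a minimal Python sketch that recomputes it from its usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. It assumes that LinearModelFit reports the ordinary (unadjusted) coefficient of determination; the helper names linear_fit and r_squared are illustrative only.

```python
# Recompute the emissions fit and its R-squared from the definition
# R^2 = 1 - SS_res / SS_tot. Helper names are illustrative.

def linear_fit(points):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(points, intercept, slope):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot

# (year, teragrams of CO2 equivalents), copied from the data list above.
emissions = [
    (1991, 4697.7), (1992, 4801.0), (1993, 4921.9), (1994, 4991.7), (1995, 5040.6),
    (1996, 5231.6), (1997, 5296.9), (1998, 5332.7), (1999, 5399.6), (2000, 5583.2),
    (2001, 5518.8), (2002, 5554.8), (2003, 5615.4), (2004, 5709.4), (2005, 5748.7),
]
intercept, slope = linear_fit(emissions)
r2 = r_squared(emissions, intercept, slope)
unexplained = 1 - r2  # the gap to a perfect fit
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"R^2 = {r2:.6f}, unexplained share = {unexplained:.6f}")
```

Read this way, the gap of about 0.034 is the share of the year-to-year variation in emissions that the straight line does not account for.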
4) The estimated carbon dioxide emissions for 2006 through 2011 are obtained by evaluating the function
at each of those years.
5) The actual estimated carbon dioxide emissions for 2006 through 2011 are as follows:
2006 = 5665.8
2007 = 5767.7
2008 = 5590.6
2009 = 5222.4
2010 = 5408.1
2011 = 5277.2
6) I believe that the calculated values are higher than the recorded values because the model was fitted
to 1991 through 2005, when pollution was probably not seen as much of a concern as it is today. The
recorded numbers are probably lower than what was calculated because, as time progressed, environmental
laws and other efforts to prevent air pollution have caused plants to find ways to cut back on how much
CO2 they release into the atmosphere.
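The comparison in parts 4 through 6 can be sketched in Python by evaluating the fitted line at each year and subtracting the recorded value. The coefficients are copied from the FittedModel output above and the recorded values from part 5; the names predict, predicted and overshoot are illustrative, not from the notebook.

```python
# Evaluate the fitted emissions line for 2006-2011 and compare it with the
# recorded estimates listed in part 5. Names are illustrative.

INTERCEPT = -142896.82047619132  # from the FittedModel output
SLOPE = 74.17071428571467        # teragrams per year

# Recorded estimates (teragrams of CO2 equivalents), copied from part 5.
actual = {
    2006: 5665.8,
    2007: 5767.7,
    2008: 5590.6,
    2009: 5222.4,
    2010: 5408.1,
    2011: 5277.2,
}

def predict(year):
    """Emissions predicted by the linear model for the given year."""
    return INTERCEPT + SLOPE * year

predicted = {year: predict(year) for year in actual}
overshoot = {year: predicted[year] - actual[year] for year in actual}

for year in sorted(actual):
    print(f"{year}: model {predicted[year]:8.1f}  recorded {actual[year]:8.1f}  "
          f"difference {overshoot[year]:+7.1f}")
```

A positive difference means the model overestimates. Because the line keeps rising by the same amount each year while the recorded values level off and fall, the difference grows over the period.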
Life expectancy
The Centers for Disease Control and Prevention keep data on U.S. life expectancy for ages 0
through 100, in 5-year increments. The first number in each pair is age, and the second number
is the life expectancy at that age.
agedata = {{0, 78.5}, {1, 78.0}, {5, 74.1}, {10, 69.1},
{15, 64.1}, {20, 59.3}, {25, 54.6}, {30, 49.8}, {35, 45.1}, {40, 40.4},
{45, 35.8}, {50, 31.3}, {55, 27.1}, {60, 23.0}, {65, 19.1}, {70, 15.5},
{75, 12.1}, {80, 9.1}, {85, 6.6}, {90, 4.7}, {95, 3.3}, {100, 2.3}}
1) p1 = ListPlot[agedata]
Life expectancy versus age as a plotted graph.
[Plot: life expectancy (0 to 80) against age (0 to 100)]
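2) LinearModelFit[agedata, x, x]
A linear function fitted to the life expectancy data. The result is not assigned to lm here, so lm in the
commands that follow still refers to the carbon dioxide model from the previous section.
FittedModel[75.2608 - 0.811454 x]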
p2 = Plot[lm[x], {x, 0, 100}]
[Plot: lm[x] for x from 0 to 100; the vertical axis runs from about -142,000 to -136,000]
Show[{p1, p2}]
[Plot: the life expectancy data points, with age from 0 to 100 and life expectancy from 0 to 80]
lm["RSquared"]
0.966382
3) A major discrepancy that I see between the model and the data is that, although the fitted line
75.2608 - 0.811454 x and the plotted data are both clearly decreasing (negative slope), the graph of lm
is increasing. They go in opposite directions, and the RSquared value says that it is a good fit when
visually it is not. Both the increasing line and the RSquared value of 0.966382 match the carbon dioxide
model from the previous section, which indicates that lm was never reassigned to the life expectancy fit.
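Solve[lm[x] == 0, x]
4) {{x → 1926.5935599020293}}
^ According to lm, life expectancy reaches 0 at the age of about 1,926. This does not make sense, because
a human rarely makes it long past 100. The value comes from the carbon dioxide model still stored in lm;
the fitted life expectancy line 75.2608 - 0.811454 x reaches 0 at an age of about 92.7.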
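To see what a straight line fitted to the life expectancy data itself looks like, the fit can be recomputed independently. The Python sketch below is an outside cross-check, not the notebook's code; linear_fit and zero_age are illustrative names.

```python
# Fit a line to the life expectancy data directly and find where it reaches 0.
# linear_fit and zero_age are illustrative names.

def linear_fit(points):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# (age, remaining life expectancy in years), copied from agedata.
age_data = [
    (0, 78.5), (1, 78.0), (5, 74.1), (10, 69.1), (15, 64.1), (20, 59.3),
    (25, 54.6), (30, 49.8), (35, 45.1), (40, 40.4), (45, 35.8), (50, 31.3),
    (55, 27.1), (60, 23.0), (65, 19.1), (70, 15.5), (75, 12.1), (80, 9.1),
    (85, 6.6), (90, 4.7), (95, 3.3), (100, 2.3),
]
intercept, slope = linear_fit(age_data)

# Solve intercept + slope * age == 0 for age.
zero_age = -intercept / slope

print(f"life expectancy = {intercept:.4f} + ({slope:.6f}) * age")
print(f"the line reaches 0 at age {zero_age:.1f}")
```

The slope is negative, as the plotted data suggest, and the line crosses zero in the low nineties. The recorded life expectancy at ages 95 and 100 is still positive, so even the correctly fitted line underestimates at the oldest ages.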
T = Table[{x, lm[x]}, {x, 0, 100}]
{{0, -142897.}, {1, -142823.}, {2, -142748.}, {3, -142674.}, {4, -142600.},
{5, -142526.}, {6, -142452.}, {7, -142378.}, {8, -142303.}, {9, -142229.},
{10, -142155.}, {11, -142081.}, {12, -142007.}, {13, -141933.}, {14, -141858.},
{15, -141784.}, {16, -141710.}, {17, -141636.}, {18, -141562.}, {19, -141488.},
{20, -141413.}, {21, -141339.}, {22, -141265.}, {23, -141191.}, {24, -141117.},
{25, -141043.}, {26, -140968.}, {27, -140894.}, {28, -140820.}, {29, -140746.},
{30, -140672.}, {31, -140598.}, {32, -140523.}, {33, -140449.}, {34, -140375.},
{35, -140301.}, {36, -140227.}, {37, -140153.}, {38, -140078.}, {39, -140004.},
{40, -139930.}, {41, -139856.}, {42, -139782.}, {43, -139707.}, {44, -139633.},
{45, -139559.}, {46, -139485.}, {47, -139411.}, {48, -139337.}, {49, -139262.},
{50, -139188.}, {51, -139114.}, {52, -139040.}, {53, -138966.}, {54, -138892.},
{55, -138817.}, {56, -138743.}, {57, -138669.}, {58, -138595.}, {59, -138521.},
{60, -138447.}, {61, -138372.}, {62, -138298.}, {63, -138224.}, {64, -138150.},
{65, -138076.}, {66, -138002.}, {67, -137927.}, {68, -137853.}, {69, -137779.},
{70, -137705.}, {71, -137631.}, {72, -137557.}, {73, -137482.}, {74, -137408.},
{75, -137334.}, {76, -137260.}, {77, -137186.}, {78, -137112.}, {79, -137037.},
{80, -136963.}, {81, -136889.}, {82, -136815.}, {83, -136741.}, {84, -136666.},
{85, -136592.}, {86, -136518.}, {87, -136444.}, {88, -136370.},
{89, -136296.}, {90, -136221.}, {91, -136147.}, {92, -136073.},
{93, -135999.}, {94, -135925.}, {95, -135851.}, {96, -135776.},
{97, -135702.}, {98, -135628.}, {99, -135554.}, {100, -135480.}}
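TableForm[T]
5) A table of lm[x] for all ages 0 through 100 (in 1-year increments). Because lm still holds the carbon
dioxide model, the values are large and negative rather than life expectancies.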
0 −142 897.
1 −142 823.
2 −142 748.
3 −142 674.
4 −142 600.
5 −142 526.
6 −142 452.
7 −142 378.
8 −142 303.
9 −142 229.
10 −142 155.
11 −142 081.
12 −142 007.
13 −141 933.
14 −141 858.
15 −141 784.
16 −141 710.
17 −141 636.
18 −141 562.
19 −141 488.
20 −141 413.
21 −141 339.
22 −141 265.
23 −141 191.
24 −141 117.
25 −141 043.
26 −140 968.
27 −140 894.
28 −140 820.
29 −140 746.
30 −140 672.
31 −140 598.
32 −140 523.
33 −140 449.
34 −140 375.
35 −140 301.
36 −140 227.
37 −140 153.
38 −140 078.
39 −140 004.
40 −139 930.
41 −139 856.
42 −139 782.
43 −139 707.
44 −139 633.
45 −139 559.
46 −139 485.
47 −139 411.
48 −139 337.
49 −139 262.
50 −139 188.
51 −139 114.
52 −139 040.
53 −138 966.
54 −138 892.
55 −138 817.
56 −138 743.
57 −138 669.
58 −138 595.
59 −138 521.
60 −138 447.
61 −138 372.
62 −138 298.
63 −138 224.
64 −138 150.
65 −138 076.
66 −138 002.
67 −137 927.
68 −137 853.
69 −137 779.
70 −137 705.
71 −137 631.
72 −137 557.
73 −137 482.
74 −137 408.
75 −137 334.
76 −137 260.
77 −137 186.
78 −137 112.
79 −137 037.
80 −136 963.
81 −136 889.
82 −136 815.
83 −136 741.
84 −136 666.
85 −136 592.
86 −136 518.
87 −136 444.
88 −136 370.
89 −136 296.
90 −136 221.
91 −136 147.
92 −136 073.
93 −135 999.
94 −135 925.
95 −135 851.
96 −135 776.
97 −135 702.
98 −135 628.
99 −135 554.
100 −135 480.
High school senior alcohol consumption
U.S. federal survey data indicate a decline in alcohol consumption by young people over
several decades. Below is data on the decline in the proportion of high school seniors who have
consumed alcohol within the previous 30 days, from 1980 through 2010. The first number in each
pair is the year of the survey; the second number is the proportion of high school seniors who
report consuming alcohol in the previous 30 days.
alcoholdata =
{{1980, 0.72}, {1990, 0.571}, {2000, 0.5009}, {2009, 0.435}, {2010, 0.412}}
1) The data as a plotted graph
p1 = ListPlot[alcoholdata]
[Plot: proportion of seniors (about 0.41 to 0.72) against survey year, 1980 to 2010]
2) A fitted linear model for the data
lm = LinearModelFit[alcoholdata, x, x]
FittedModel[19.5976 - 0.0095454 x]
p2 = Plot[lm[x], {x, 1980, 2010}]
[Plot: the fitted line lm[x] for 1980 to 2010]
Show[{p1, p2}]
[Plot: the data points and the fitted line shown together]
lm["RSquared"]
0.972251
3) Visually and based on the RSquared value, it is a good fit: the value is much closer to 1 than to 0.
4) According to the linear model, the proportion of seniors who drink decreases by 0.0095454 per year,
or about 0.95 percentage points per year.
T = Table[{x, lm[x]}, {x, 1980, 2010}]
{{1980, 0.697688}, {1981, 0.688143}, {1982, 0.678597},
{1983, 0.669052}, {1984, 0.659507}, {1985, 0.649961}, {1986, 0.640416},
{1987, 0.63087}, {1988, 0.621325}, {1989, 0.61178}, {1990, 0.602234},
{1991, 0.592689}, {1992, 0.583143}, {1993, 0.573598}, {1994, 0.564053},
{1995, 0.554507}, {1996, 0.544962}, {1997, 0.535416}, {1998, 0.525871},
{1999, 0.516326}, {2000, 0.50678}, {2001, 0.497235}, {2002, 0.487689},
{2003, 0.478144}, {2004, 0.468599}, {2005, 0.459053}, {2006, 0.449508},
{2007, 0.439962}, {2008, 0.430417}, {2009, 0.420871}, {2010, 0.411326}}
5) TableForm[{T}, TableHeadings → {{"r1", "r2", "r3"}, {"c1", "c2"}}]
Year    % drinkers
1980    0.6976881546894091
1981    0.6881427527405677
1982    0.6785973507917262
1983    0.6690519488428812
ListPlot[T]
[Plot: the model values for 1980 to 2010, falling from about 0.70 to about 0.41]
^ A table and a list plot of the model's values for every year between 1980 and 2010
Solve[lm[x] == 0, x]
{{x → 2053.09}}
The values above and below make sense. The model reaches a proportion of 0 (no seniors drinking) around
2053 and a proportion of 1 (every senior drinking) around 1948, so a steady decrease between these two
years is possible.
Solve[lm[x] == 1, x]
{{x → 1948.33}}
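The two Solve results come from inverting the line: if the model is a + b x, then it equals a target value t at x = (t - a) / b. A short Python sketch of that inversion follows, refitting the line from the survey data rather than using the rounded coefficients; linear_fit and year_when are illustrative names.

```python
# Invert the fitted line to find the years at which the modeled proportion
# would be 0 and 1. linear_fit and year_when are illustrative names.

def linear_fit(points):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# (survey year, proportion of seniors), copied from alcoholdata.
alcohol_data = [(1980, 0.72), (1990, 0.571), (2000, 0.5009), (2009, 0.435), (2010, 0.412)]
intercept, slope = linear_fit(alcohol_data)

def year_when(target):
    """Year at which intercept + slope * year equals the target proportion."""
    return (target - intercept) / slope

year_zero = year_when(0.0)  # nobody drinking
year_one = year_when(1.0)   # everybody drinking

print(f"proportion = {intercept:.4f} + ({slope:.7f}) * year")
print(f"model reaches 0 in {year_zero:.2f} and 1 in {year_one:.2f}")
```

Outside the range between those two years the line gives proportions below 0 or above 1, so the model can only be meaningful inside that window.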
World population estimates
Data below gives estimates of world population since 1950
populationdata = {{1950, 2525779000}, {1951, 2572851000}, {1952, 2619292000},
{1953, 2665865000}, {1954, 2713172000}, {1955, 2761651000}, {1956, 2811572000},
{1957, 2863043000}, {1958, 2916030000}, {1959, 2970396000}, {1960, 3026003000},
{1961, 3082830000}, {1962, 3141072000}, {1963, 3201178000}, {1964, 3263739000},
{1965, 3329122000}, {1966, 3397475000}, {1967, 3468522000}, {1968, 3541675000},
{1969, 3616109000}, {1970, 3691173000}, {1971, 3766754000}, {1972, 3842874000},
{1973, 3919182000}, {1974, 3995305000}, {1975, 4071020000}, {1976, 4146136000},
{1977, 4220817000}, {1978, 4295665000}, {1979, 4371528000}, {1980, 4449049000},
{1981, 4528235000}, {1982, 4608962000}, {1983, 4691560000}, {1984, 4776393000},
{1985, 4863602000}, {1986, 4953377000}, {1987, 5045316000}, {1988, 5138215000},
{1989, 5230452000}, {1990, 5320817000}, {1991, 5408909000}, {1992, 5494900000},
{1993, 5578865000}, {1994, 5661086000}, {1995, 5741822000}, {1996, 5821017000},
{1997, 5898688000}, {1998, 5975304000}, {1999, 6051478000}, {2000, 6127700000},
{2001, 6204147000}, {2002, 6280854000}, {2003, 6357992000}, {2004, 6435706000},
{2005, 6514095000}, {2006, 6593228000}, {2007, 6673106000},
{2008, 6753649000}, {2009, 6834722000}, {2010, 6916183000}}
Below is the data as a plotted graph.
1) p1 = ListPlot[populationdata]
[Plot: world population (about 3×10^9 to 7×10^9) against year, 1950 to 2010]
2) lm = LinearModelFit[populationdata, x, x]
Below is a linear model fitted to the data.
FittedModel[-1.46235×10^11 + 7.61555×10^7 x]
p2 = Plot[lm[x], {x, 1950, 2010}]
[Plot: the fitted line lm[x] for 1950 to 2010]
3) Show[{p1, p2}]
[Plot: the data points and the fitted line shown together]
From a visual standpoint and from the RSquared value (which is about 1), the model is almost a
perfect fit to the data.
lm["RSquared"]
0.995437
According to the linear model, the annual rate of increase is about 76,155,500 people per year.
lm[2050]
9.88396×10^9
^ I do not think this makes sense, because the calculated value is above the recorded data by
almost 1 billion.
Solve[lm[x] == 0, x]
{{x → 1920.21}}
^ This is absolutely wrong, because it tells us that the population was 0 in 1920, when I
know for a fact that my great-grandmother was around, and I’m pretty sure she was not the only
human on this planet.
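The extrapolations in this section can be reproduced from the fitted coefficients alone. The Python sketch below uses the coefficients as printed in the FittedModel output, which are rounded to six significant digits, so the results can differ from the notebook's in the last digits. The names predict and zero_year are illustrative.

```python
# Extrapolate the world population line forward to 2050 and backward to the
# year it would reach 0. Coefficients are the rounded ones printed above,
# so the last digits may differ slightly from the notebook output.

INTERCEPT = -1.46235e11  # people
SLOPE = 7.61555e7        # people per year

def predict(year):
    """World population predicted by the linear model for the given year."""
    return INTERCEPT + SLOPE * year

population_2050 = predict(2050)
zero_year = -INTERCEPT / SLOPE  # solves INTERCEPT + SLOPE * year == 0

# Compare the line with the first and last recorded values in populationdata.
recorded = {1950: 2_525_779_000, 2010: 6_916_183_000}
shortfall = {year: value - predict(year) for year, value in recorded.items()}

print(f"model for 2050: {population_2050:.4e}")
print(f"model reaches 0 in {zero_year:.2f}")
for year, gap in shortfall.items():
    print(f"{year}: recorded value exceeds the line by {gap:.3e}")
```

The line sits below the recorded population at both ends of the data range. That pattern is what one would expect if the data curve upward, which may be part of why extrapolating the straight line far outside 1950 to 2010 gives implausible answers.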
Rates of diabetes, US states: 1994-2010
The Centers for Disease Control and Prevention have data on rates of diabetes for each U.S.
state for the years 1994 through 2010.
In[5]:= CAdata = {{1994, 4.7}, {1995, 4.8}, {1996, 5.4}, {1997, 5.4}, {1998, 5.7},
{1999, 6.1}, {2000, 6.4}, {2001, 6.9}, {2002, 7}, {2003, 7.2}, {2004, 7.1},
{2005, 7.4}, {2006, 7.6}, {2007, 8.1}, {2008, 8.4}, {2009, 8.7}, {2010, 8.9}}
In[6]:= ListPlot[CAdata]
Out[6]= [Plot: California diabetes rate (about 5 to 9) against year, 1994 to 2010]
In[11]:= lm = LinearModelFit[CAdata, x, x]
Linear model for the California data
Out[29]= FittedModel[-511.353 + 0.258824 x]
When this model is used to calculate the rate for each year, the output differs from the recorded
value by a few tenths at most.
In[30]:= lm[1994]    Out[30]= 4.74118
In[31]:= lm[1995]    Out[31]= 5.
In[32]:= lm[1996]    Out[32]= 5.25882
In[33]:= lm[1997]    Out[33]= 5.51765
In[34]:= lm[1998]    Out[34]= 5.77647
In[35]:= lm[1999]    Out[35]= 6.03529
In[36]:= lm[2000]    Out[36]= 6.29412
In[37]:= lm[2001]    Out[37]= 6.55294
In[38]:= lm[2002]    Out[38]= 6.81176
In[39]:= lm[2003]    Out[39]= 7.07059
In[40]:= lm[2004]    Out[40]= 7.32941
In[41]:= lm[2005]    Out[41]= 7.58824
In[42]:= lm[2006]    Out[42]= 7.84706
In[43]:= lm[2007]    Out[43]= 8.10588
In[44]:= lm[2008]    Out[44]= 8.36471
In[45]:= lm[2009]    Out[45]= 8.62353
In[46]:= lm[2010]    Out[46]= 8.88235
In[26]:= TXdata = {{1994, 5.2}, {1995, 4.7}, {1996, 5}, {1997, 5.1}, {1998, 5.9},
{1999, 6}, {2000, 6.5}, {2001, 6.8}, {2002, 7.4}, {2003, 7.6}, {2004, 7.9},
{2005, 7.8}, {2006, 8.8}, {2007, 9.3}, {2008, 9.8}, {2009, 9.6}, {2010, 9.5}}
^ Data for Texas
In[27]:= ListPlot[TXdata]
Out[27]= [Plot: Texas diabetes rate (about 5 to 10) against year, 1994 to 2010]
In[48]:= lm = LinearModelFit[TXdata, x, x]
Out[48]= FittedModel[-675.315 + 0.340931 x]
^ Linear model for the data regarding Texas.
In[49]:= lm[1994]    Out[49]= 4.50196
In[50]:= lm[1995]    Out[50]= 4.84289
In[51]:= lm[1996]    Out[51]= 5.18382
In[52]:= lm[1997]    Out[52]= 5.52475
In[53]:= lm[1998]    Out[53]= 5.86569
In[54]:= lm[1999]    Out[54]= 6.20662
In[55]:= lm[2000]    Out[55]= 6.54755
In[56]:= lm[2001]    Out[56]= 6.88848
In[57]:= lm[2002]    Out[57]= 7.22941
In[58]:= lm[2003]    Out[58]= 7.57034
In[59]:= lm[2004]    Out[59]= 7.91127
In[60]:= lm[2005]    Out[60]= 8.25221
In[61]:= lm[2006]    Out[61]= 8.59314
In[62]:= lm[2007]    Out[62]= 8.93407
In[63]:= lm[2008]    Out[63]= 9.275
In[64]:= lm[2009]    Out[64]= 9.61593
In[65]:= lm[2010]    Out[65]= 9.95686
Just like the linear function for California, the linear function for Texas's data gives calculated
values that are off from the recorded data by a few tenths. If we were to use California's linear
model for Texas, it would be even further off.
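The claim about using one state's model on the other state's data can be checked numerically. The Python sketch below refits both lines and measures the error of each on the Texas data, using the largest absolute difference and the sum of squared differences as two possible yardsticks. Those yardsticks and the helper names are choices made for this illustration, not something taken from the notebook.

```python
# Compare how well the California and Texas lines describe the Texas data.
# Helper names and the choice of error measures are illustrative.

def linear_fit(points):
    """Return (intercept, slope) of the least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def errors(points, model):
    """Differences between recorded values and the line given as (intercept, slope)."""
    intercept, slope = model
    return [y - (intercept + slope * x) for x, y in points]

years = range(1994, 2011)
ca_rates = [4.7, 4.8, 5.4, 5.4, 5.7, 6.1, 6.4, 6.9, 7, 7.2, 7.1, 7.4, 7.6, 8.1, 8.4, 8.7, 8.9]
tx_rates = [5.2, 4.7, 5, 5.1, 5.9, 6, 6.5, 6.8, 7.4, 7.6, 7.9, 7.8, 8.8, 9.3, 9.8, 9.6, 9.5]
ca_data = list(zip(years, ca_rates))
tx_data = list(zip(years, tx_rates))

ca_model = linear_fit(ca_data)
tx_model = linear_fit(tx_data)

max_ca_on_ca = max(abs(e) for e in errors(ca_data, ca_model))
max_tx_on_tx = max(abs(e) for e in errors(tx_data, tx_model))
sse_tx_on_tx = sum(e ** 2 for e in errors(tx_data, tx_model))
sse_ca_on_tx = sum(e ** 2 for e in errors(tx_data, ca_model))

print(f"California line on California data: largest miss {max_ca_on_ca:.2f}")
print(f"Texas line on Texas data:           largest miss {max_tx_on_tx:.2f}")
print(f"Sum of squared errors on Texas data: Texas line {sse_tx_on_tx:.2f}, "
      f"California line {sse_ca_on_tx:.2f}")
```

Since a least-squares line minimizes the sum of squared errors on the data it was fitted to, any other line, including California's, cannot do better on the Texas data by that measure.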