Running head: PROJECT #2 – MULTIPLE REGRESSION ANALYSIS 1
Project #2 – Multiple Regression Analysis
Francis T. Harten
Long Island University / Post
*****************************************************************
** The following Stata code was utilized for EDD 1006          **
** Project #2, Multiple Regression Analysis,                   **
** due March 25th, 2013.                                       **
** Original Stata data set loaded from the class web site of   **
** Professor Red Owl, with the following added Stata code      **
** from Francis T. Harten.                                     **
*****************************************************************
*Step 1
use "http://myweb.liu.edu/~redowl/data/reportcards200405.dta", clear
*Step 2
codebook, compact
*Step 3
describe
*Step 4
summarize
*Step 5
tab1 district co enroll enroll000 ppexp ppexp000 povrate attend csize medtchexp ///
    passeng4 passmat4 passeng8 passmat8 passengr passmathr passsocsr gradrate pct4col ///
    pass8 pass4 passregents
*Step 6
regress pct4col enroll000, beta
*Step 7
regress pct4col enroll000, beta
*Step 8
regress pct4col ppexp000, beta
*Step 9
regress pct4col csize, beta
*Step 10
regress pct4col attend, beta
*Step 11
regress pct4col pass4, beta
*Step 12
regress pct4col passregents, beta
*Step 13
regress pct4col medtchexp, beta
*Step 14
regress pct4col co, beta
*Step 15
regress pct4col enroll000 ppexp000 csize attend pass4 passregents medtchexp co, beta
*Step 16
estat vif
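* Illustrative sketch (an addition, not part of the original assignment code):
* a VIF can be checked by hand as 1/(1 - R-squared), where R-squared comes
* from regressing one predictor on the remaining predictors. The choice of
* ppexp000 is only an example; the value displayed should agree with the
* ppexp000 row reported by estat vif.
quietly regress ppexp000 enroll000 csize attend pass4 passregents medtchexp co
display "VIF for ppexp000 = " 1/(1 - e(r2))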
*Step 17
correlate pct4col enroll000 ppexp000 csize attend pass4 passregents medtchexp co
*Step 18
graph matrix pct4col enroll000 ppexp000 csize attend pass4 passregents medtchexp co, half
*****************************************************************
** The following Stata code produces linear and polynomial     **
** plots of the dependent variable (pct4col) vs. the           **
** independent variables specified in Data Analysis            **
** Project #2 (Multiple Regression).                           **
** This provides visual evidence of the linearity or           **
** nonlinearity of the relationships between the dependent     **
** variable and each independent variable.                     **
** Provided by Red Owl for use in EDD 1006 Project #2          **
** Francis T. Harten 03-25-13                                  **
*****************************************************************

* Step 1
* Load the data set from the web.
use "http://myweb.liu.edu/~redowl/data/reportcards200405.dta", clear

* Step 2
set more off

* Step 3
* Turn on feature to place all graphs in tabs of graph window
* instead of separate graph windows. (This may not work on Macs, but
* that feature is just cosmetic and will not affect the ultimate graphs.)
set autotabgraphs on

* Step 4
* This drops any graphs that may be in memory from previous analyses.
* The "capture" command instructs Stata to ignore errors if no graphs
* are already in memory and none need to be dropped.
capture graph drop _all

* Step 5
* This loop produces the graph for each non-binary independent variable
* in the varlist.
foreach var of varlist enroll000 ppexp000 csize attend pass4 passregents medtchexp {
    twoway (lpolyci pct4col `var') ///
        (lfit pct4col `var', lcolor(blue)) ///
        (sc pct4col `var', msize(small)), ///
        legend(off) title(`var') scheme(s1color) name(gr_`var')
}

* Step 6
* This produces the graph for the single binary independent variable,
* co, and omits the polynomial line. Any line between two points
* (i.e., the averages of the categories of a binary variable) will always
* be a straight line, so a polynomial fit would not be meaningful.
twoway (lfitci pct4col co) ///
    (lfit pct4col co, lcolor(blue)) ///
    (sc pct4col co, mcolor(maroon) msize(small)), ///
    xlabel(1(1)2, valuelabel) legend(off) title(co) scheme(s1color) name(gr_co)
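
* Illustrative sketch (an addition, not part of the code provided for the
* project): with a binary predictor coded 1/2, the fitted line passes
* through the two group means, so the slope on co should equal the
* difference in mean pct4col between the two counties. The scalar name
* mean_suffolk is hypothetical.
quietly regress pct4col co
display "slope on co         = " _b[co]
quietly summarize pct4col if co==1
scalar mean_suffolk = r(mean)
quietly summarize pct4col if co==2
display "difference in means = " r(mean) - mean_suffolk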

* Step 7
* This produces the first set of combined graphs from the loop above.
graph combine gr_enroll000 gr_ppexp000 gr_csize gr_attend, ///
    cols(2) title(Review of Linearity) scheme(s1color) name(gr_comb1)

* Step 8
* This produces the second set of combined graphs from the loop above.
graph combine gr_pass4 gr_passregents gr_medtchexp gr_co, ///
    cols(2) title(Review of Linearity) scheme(s1color) name(gr_comb2)

* Step 9
* This shows the nonlinear relationship between pct4col and ppexp
* for all districts.
twoway (lpolyci pct4col ppexp000) ///
    (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12.8 19)) ///
    (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22)) ///
    (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25)) ///
    (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) ///
    xline(19) xline(22) legend(off) ///
    title(Regression Lines - All Districts) scheme(s1color) name(gr_ppexp000all)

* Step 10
* This adds an overall linear regression line to the polynomial
* and segmented regression lines.
twoway (lpolyci pct4col ppexp000) ///
    (lfit pct4col ppexp000, lcolor(red) range(12.8 25)) ///
    (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12.8 19)) ///
    (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22)) ///
    (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25)) ///
    (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) ///
    xline(19) xline(22) legend(off) ///
    title(Regression Lines - All Districts) scheme(s1color) name(gr_ppexp000all2)
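
* Illustrative sketch (an addition, not part of the original code): one
* common numeric check on the curvature seen in these plots is to add a
* squared term for ppexp000 and inspect its coefficient and p-value.
* This is a different technique from the segmented lfit lines above, and
* the quadratic form is an assumption rather than the assigned model.
regress pct4col c.ppexp000##c.ppexp000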

* Step 11
* This shows the nonlinear relationship between pct4col and ppexp
* for Suffolk County districts.
preserve
keep if co==1
twoway (lpolyci pct4col ppexp000) ///
    (lfit pct4col ppexp000, lcolor(red) range(12 25)) ///
    (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12 19)) ///
    (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 26)) ///
    (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(20(10)100) xtitle(ppexp000) xlabel(12(2)26) ///
    xline(19) xline(22) legend(off) ///
    title(Suffolk County) scheme(s1color) name(gr_ppexp000suf)
restore

* Step 12
* This shows the nonlinear relationship between pct4col and ppexp
* for Nassau County districts.
preserve
keep if co==2
twoway (lpolyci pct4col ppexp000) ///
    (lfit pct4col ppexp000, lcolor(red) range(12 25)) ///
    (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12 19)) ///
    (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22)) ///
    (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25)) ///
    (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) ///
    xline(19) xline(22) legend(off) ///
    title(Nassau County) scheme(s1color) name(gr_ppexp000nas)
restore

* Step 13
* This shows the nonlinear relationship between pct4col and csize
* for all districts.
twoway (lpolyci pct4col csize) ///
    (lfit pct4col csize if csize<=20, lcolor(blue) range(16 20)) ///
    (lfit pct4col csize if csize>=20 & csize<=23, lcolor(blue) range(20 23)) ///
    (lfit pct4col csize if csize>=23, lcolor(blue) range(23 25)) ///
    (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) ///
    xline(20) xline(23) legend(off) ///
    title(Regression Lines - All Districts) scheme(s1color) name(gr_csizeall)

* Step 14
* This adds an overall linear regression line to the polynomial
* and segmented regression lines.
twoway (lpolyci pct4col csize) ///
    (lfit pct4col csize, lcolor(red) range(16 25)) ///
    (lfit pct4col csize if csize<=20, lcolor(blue) range(16 20)) ///
    (lfit pct4col csize if csize>=20 & csize<=23, lcolor(blue) range(20 23)) ///
    (lfit pct4col csize if csize>=23, lcolor(blue) range(23 25)) ///
    (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) ///
    xline(20) xline(23) legend(off) ///
    title(Regression Lines - All Districts) scheme(s1color) name(gr_csizeall2)

* Step 15
* This shows the nonlinear relationship between pct4col and csize
* for Suffolk County districts.
preserve
keep if co==1
twoway (lpolyci pct4col csize) ///
    (lfit pct4col csize, lcolor(red) range(16 25)) ///
    (lfit pct4col csize if csize<=20 & csize<=24, lcolor(blue) range(16 20)) ///
    (lfit pct4col csize if csize>=20 & csize<=24, lcolor(blue) range(20 24)) ///
    (lfit pct4col csize if csize>=24, lcolor(blue) range(24 25)) ///
    (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) ///
    xline(20) xline(24) legend(off) ///
    title(Regression Lines - Suffolk County) scheme(s1color) name(gr_csizesuf)
restore

* Step 16
* This shows the nonlinear relationship between pct4col and csize
* for Nassau County districts.
preserve
keep if co==2
twoway (lpolyci pct4col csize) ///
    (lfit pct4col csize, lcolor(red) range(18 23)) ///
    (lfit pct4col csize if csize<=20, lcolor(blue) range(18 20)) ///
    (lfit pct4col csize if csize>=20, lcolor(blue) range(20 23)) ///
    (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) ///
        mlabsize(1.5) mlabcolor(black)), ///
    ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(18(2)24) ///
    xline(20) legend(off) ///
    title(Regression Lines - Nassau County) scheme(s1color) name(gr_csizenas)
restore
  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   12.1   Copyright 1985-2011 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)
Single-user Stata perpetual license:
Serial number: 30120594666
Licensed to: Francis T. harten
Long Island University
Notes:
. use "http://datalibrary.us/reportcards200405.dta", clear
. codebook, compact
Variable     Obs  Unique      Mean     Min       Max  Label
--------------------------------------------------------------------------------------
district      92      92         .       .         .  School District Name
co            92       2  1.434783       1         2  County
enroll        92      92  4627.261     249     17158  Enrollment
enroll000     92      92  4.627261    .249    17.158  Enrollment in 1000s
ppexp         92      91  16927.52   12799     25022  Per Pupil Expenditures
ppexp000      92      91  16.92752  12.799    25.022  Per Pupil Expenditures in $1000s
povrate       92      33  12.93478       0        89  Poverty Rate
attend        92       8  94.93478      90        97  Average Percent Attendance
csize         92      10  21.47826      16        25  Average Class Size
medtchexp     92      12  11.15217       7        18  Median Teacher Workforce Experience in Years
passeng4      92      34  83.72826      52        99  4th Grade Pass Rate Eng Regents
passmat4      92      21  93.38043      72       100  4th Grade Pass Rate Math Regents
passeng8      92      42  66.97826      18        90  8th Grade Pass Rate Eng Regents
passmat8      92      39  74.23913      23        93  8th Grade Pass Rate Math Regents
passengr      92      31      87.5      30       100  English Regents Pass Rate
passmathr     92      34  86.05435      21        99  Math Regents Pass Rate
passsocsr     92      34  86.52174      24       100  Social Studies Regents Pass Rate
gradrate      92      39  84.05435      33       100  Graduation Rate
pct4col       92      43  63.26087      11       100  % Grads Who Go to College in Year after Graduation
pass8         92      61   70.6087      22        91  Avg Pass Rate 8th Grade Eng & Math Regents
pass4         92      41  88.55435      64      99.5  Avg Pass Rate 4th Grade Eng & Math Regents
passregents   92      60  86.69203      25  99.66666  Avg Pass Rate Eng Math SocStud Regents
--------------------------------------------------------------------------------------
. describe
Contains data from http://datalibrary.us/reportcards200405.dta
  obs:            92
 vars:            22                          29 Feb 2012 12:35
 size:         6,256
--------------------------------------------------------------------------------------
                storage display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------
district        str25   %25s                  School District Name
co              byte    %11.0g     coval      County
enroll          int     %8.0g                 Enrollment
enroll000       float   %9.0g                 Enrollment in 1000s
ppexp           long    %8.0g                 Per Pupil Expenditures
ppexp000        float   %9.0g                 Per Pupil Expenditures in $1000s
povrate         byte    %8.0g                 Poverty Rate
attend          byte    %8.0g                 Average Percent Attendance
csize           byte    %8.0g                 Average Class Size
medtchexp       float   %9.0g                 Median Teacher Workforce Experience in Years
passeng4        byte    %8.0g                 4th Grade Pass Rate Eng Regents
passmat4        byte    %8.0g                 4th Grade Pass Rate Math Regents
passeng8        byte    %8.0g                 8th Grade Pass Rate Eng Regents
passmat8        byte    %8.0g                 8th Grade Pass Rate Math Regents
passengr        byte    %8.0g                 English Regents Pass Rate
passmathr       byte    %8.0g                 Math Regents Pass Rate
passsocsr       byte    %8.0g                 Social Studies Regents Pass Rate
gradrate        byte    %8.0g                 Graduation Rate
pct4col         byte    %8.0g                 % Grads Who Go to College in Year after Graduation
pass8           float   %9.0g                 Avg Pass Rate 8th Grade Eng & Math Regents
pass4           float   %9.0g                 Avg Pass Rate 4th Grade Eng & Math Regents
passregents     float   %9.0g                 Avg Pass Rate Eng Math SocStud Regents
--------------------------------------------------------------------------------------
Sorted by:
. summarize
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
district | 0
co | 92 1.434783 .4984448 1 2
enroll | 92 4627.261 3024.397 249 17158
enroll000 | 92 4.627261 3.024397 .249 17.158
ppexp | 92 16927.52 3152.303 12799 25022
-------------+--------------------------------------------------------
ppexp000 | 92 16.92752 3.152303 12.799 25.022
povrate | 92 12.93478 16.82476 0 89
attend | 92 94.93478 1.412692 90 97
csize | 92 21.47826 1.680386 16 25
medtchexp | 92 11.15217 2.467079 7 18
-------------+--------------------------------------------------------
passeng4 | 92 83.72826 9.032424 52 99
passmat4 | 92 93.38043 5.706835 72 100
passeng8 | 92 66.97826 15.25244 18 90
passmat8 | 92 74.23913 16.04757 23 93
passengr | 92 87.5 11.80566 30 100
-------------+--------------------------------------------------------
passmathr | 92 86.05435 13.03787 21 99
passsocsr | 92 86.52174 12.48338 24 100
gradrate | 92 84.05435 12.98551 33 100
pct4col | 92 63.26087 16.19311 11 100
pass8 | 92 70.6087 15.23819 22 91
-------------+--------------------------------------------------------
pass4 | 92 88.55435 7.105352 64 99.5
passregents | 92 86.69203 12.19142 25 99.66666
. tab1 district co enroll enroll000 ppexp ppexp000 povrate attend csize medtchexp passeng4 passmat4 passeng8 passmat8 passengr passmathr passsocsr gradrate pct4col pass8 pass4 passregents
-> tabulation of district
School District Name | Freq. Percent Cum.
--------------------------+-----------------------------------
Amityville | 1 1.09 1.09
Babylon | 1 1.09 2.17
Baldwin | 1 1.09 3.26
Bay Shore | 1 1.09 4.35
Bayport-Blue Point | 1 1.09 5.43
Bethpage | 1 1.09 6.52
Brentwood | 1 1.09 7.61
Carle Place | 1 1.09 8.70
Center Moriches | 1 1.09 9.78
Central Islip | 1 1.09 10.87
Cold Spring Harbor | 1 1.09 11.96
Commack | 1 1.09 13.04
Comsewogue | 1 1.09 14.13
Connetquot | 1 1.09 15.22
Copiague | 1 1.09 16.30
Deer Park | 1 1.09 17.39
East Hampton | 1 1.09 18.48
East Islip | 1 1.09 19.57
East Meadow | 1 1.09 20.65
East Rockaway | 1 1.09 21.74
East Williston | 1 1.09 22.83
Elwood | 1 1.09 23.91
Farmingdale | 1 1.09 25.00
Freeport | 1 1.09 26.09
Garden City | 1 1.09 27.17
Glen Cove | 1 1.09 28.26
Great Neck | 1 1.09 29.35
Greenport | 1 1.09 30.43
Half Hollow Hills | 1 1.09 31.52
Hampton Bays | 1 1.09 32.61
Harborfields | 1 1.09 33.70
Hauppauge | 1 1.09 34.78
Hempstead | 1 1.09 35.87
Herricks | 1 1.09 36.96
Hewlett-Woodmere | 1 1.09 38.04
Hicksville | 1 1.09 39.13
Huntington | 1 1.09 40.22
Island Trees | 1 1.09 41.30
Islip | 1 1.09 42.39
Jericho | 1 1.09 43.48
Kings Park | 1 1.09 44.57
Lawrence | 1 1.09 45.65
Levittown | 1 1.09 46.74
Lindenhurst | 1 1.09 47.83
Locust Valley | 1 1.09 48.91
Long Beach | 1 1.09 50.00
Longwood | 1 1.09 51.09
Lynbrook | 1 1.09 52.17
Malverne | 1 1.09 53.26
Manhasset | 1 1.09 54.35
Massapequa | 1 1.09 55.43
Middle Country | 1 1.09 56.52
Miller Place | 1 1.09 57.61
Mineola | 1 1.09 58.70
Mount Sinai | 1 1.09 59.78
North Babylon | 1 1.09 60.87
North Shore | 1 1.09 61.96
Northport-East Northport | 1 1.09 63.04
Oceanside | 1 1.09 64.13
Oyster Bay-East Norwich | 1 1.09 65.22
Patchogue-Medford | 1 1.09 66.30
Plainedge | 1 1.09 67.39
Plainview-Old Bethpage | 1 1.09 68.48
Port Jefferson | 1 1.09 69.57
Port Washington | 1 1.09 70.65
Riverhead | 1 1.09 71.74
Rockville Centre | 1 1.09 72.83
Rocky Point | 1 1.09 73.91
Roosevelt | 1 1.09 75.00
Roslyn | 1 1.09 76.09
Sachem | 1 1.09 77.17
Sag Harbor | 1 1.09 78.26
Sayville | 1 1.09 79.35
Seaford | 1 1.09 80.43
Shelter Island | 1 1.09 81.52
Shoreham-Wading River | 1 1.09 82.61
Smithtown | 1 1.09 83.70
South Country | 1 1.09 84.78
South Huntington | 1 1.09 85.87
Southampton | 1 1.09 86.96
Southold | 1 1.09 88.04
Syosset | 1 1.09 89.13
Three Village | 1 1.09 90.22
Uniondale | 1 1.09 91.30
Wantagh | 1 1.09 92.39
West Babylon | 1 1.09 93.48
West Hempstead | 1 1.09 94.57
West Islip | 1 1.09 95.65
Westbury | 1 1.09 96.74
Westhampton Beach | 1 1.09 97.83
William Floyd | 1 1.09 98.91
Wyandanch | 1 1.09 100.00
--------------------------+-----------------------------------
Total | 92 100.00
-> tabulation of co
County | Freq. Percent Cum.
------------+-----------------------------------
Suffolk Co. | 52 56.52 56.52
Nassau Co. | 40 43.48 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of enroll
Enrollment | Freq. Percent Cum.
------------+-----------------------------------
249 | 1 1.09 1.09
679 | 1 1.09 2.17
934 | 1 1.09 3.26
1014 | 1 1.09 4.35
1266 | 1 1.09 5.43
1267 | 1 1.09 6.52
1386 | 1 1.09 7.61
1465 | 1 1.09 8.70
1628 | 1 1.09 9.78
1663 | 1 1.09 10.87
1711 | 1 1.09 11.96
1730 | 1 1.09 13.04
1752 | 1 1.09 14.13
1833 | 1 1.09 15.22
1957 | 1 1.09 16.30
1975 | 1 1.09 17.39
2132 | 1 1.09 18.48
2254 | 1 1.09 19.57
2284 | 1 1.09 20.65
2372 | 1 1.09 21.74
2437 | 1 1.09 22.83
2537 | 1 1.09 23.91
2606 | 1 1.09 25.00
2702 | 1 1.09 26.09
2750 | 1 1.09 27.17
2755 | 1 1.09 28.26
2819 | 1 1.09 29.35
2843 | 1 1.09 30.43
2851 | 1 1.09 31.52
2865 | 1 1.09 32.61
2945 | 1 1.09 33.70
3035 | 1 1.09 34.78
3063 | 1 1.09 35.87
3104 | 1 1.09 36.96
3140 | 1 1.09 38.04
3219 | 1 1.09 39.13
3283 | 1 1.09 40.22
3355 | 1 1.09 41.30
3535 | 1 1.09 42.39
3553 | 1 1.09 43.48
3589 | 1 1.09 44.57
3617 | 1 1.09 45.65
3622 | 1 1.09 46.74
3647 | 1 1.09 47.83
3660 | 1 1.09 48.91
3662 | 1 1.09 50.00
3946 | 1 1.09 51.09
4013 | 1 1.09 52.17
4077 | 1 1.09 53.26
4126 | 1 1.09 54.35
4160 | 1 1.09 55.43
4203 | 1 1.09 56.52
4212 | 1 1.09 57.61
4399 | 1 1.09 58.70
4483 | 1 1.09 59.78
4786 | 1 1.09 60.87
4787 | 1 1.09 61.96
4801 | 1 1.09 63.04
4896 | 1 1.09 64.13
4911 | 1 1.09 65.22
4999 | 1 1.09 66.30
5161 | 1 1.09 67.39
5309 | 1 1.09 68.48
5472 | 1 1.09 69.57
5482 | 1 1.09 70.65
5811 | 1 1.09 71.74
5874 | 1 1.09 72.83
6137 | 1 1.09 73.91
6189 | 1 1.09 75.00
6242 | 1 1.09 76.09
6323 | 1 1.09 77.17
6410 | 1 1.09 78.26
6453 | 1 1.09 79.35
6475 | 1 1.09 80.43
6677 | 1 1.09 81.52
6913 | 1 1.09 82.61
6951 | 1 1.09 83.70
7125 | 1 1.09 84.78
7482 | 1 1.09 85.87
7561 | 1 1.09 86.96
7972 | 1 1.09 88.04
7987 | 1 1.09 89.13
8004 | 1 1.09 90.22
8353 | 1 1.09 91.30
9144 | 1 1.09 92.39
9745 | 1 1.09 93.48
9974 | 1 1.09 94.57
10191 | 1 1.09 95.65
10541 | 1 1.09 96.74
11520 | 1 1.09 97.83
15528 | 1 1.09 98.91
17158 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of enroll000
Enrollment |
in 1000s | Freq. Percent Cum.
------------+-----------------------------------
.249 | 1 1.09 1.09
.679 | 1 1.09 2.17
.934 | 1 1.09 3.26
1.014 | 1 1.09 4.35
1.266 | 1 1.09 5.43
1.267 | 1 1.09 6.52
1.386 | 1 1.09 7.61
1.465 | 1 1.09 8.70
1.628 | 1 1.09 9.78
1.663 | 1 1.09 10.87
1.711 | 1 1.09 11.96
1.73 | 1 1.09 13.04
1.752 | 1 1.09 14.13
1.833 | 1 1.09 15.22
1.957 | 1 1.09 16.30
1.975 | 1 1.09 17.39
2.132 | 1 1.09 18.48
2.254 | 1 1.09 19.57
2.284 | 1 1.09 20.65
2.372 | 1 1.09 21.74
2.437 | 1 1.09 22.83
2.537 | 1 1.09 23.91
2.606 | 1 1.09 25.00
2.702 | 1 1.09 26.09
2.75 | 1 1.09 27.17
2.755 | 1 1.09 28.26
2.819 | 1 1.09 29.35
2.843 | 1 1.09 30.43
2.851 | 1 1.09 31.52
2.865 | 1 1.09 32.61
2.945 | 1 1.09 33.70
3.035 | 1 1.09 34.78
3.063 | 1 1.09 35.87
3.104 | 1 1.09 36.96
3.14 | 1 1.09 38.04
3.219 | 1 1.09 39.13
3.283 | 1 1.09 40.22
3.355 | 1 1.09 41.30
3.535 | 1 1.09 42.39
3.553 | 1 1.09 43.48
3.589 | 1 1.09 44.57
3.617 | 1 1.09 45.65
3.622 | 1 1.09 46.74
3.647 | 1 1.09 47.83
3.66 | 1 1.09 48.91
3.662 | 1 1.09 50.00
3.946 | 1 1.09 51.09
4.013 | 1 1.09 52.17
4.077 | 1 1.09 53.26
4.126 | 1 1.09 54.35
4.16 | 1 1.09 55.43
4.203 | 1 1.09 56.52
4.212 | 1 1.09 57.61
4.399 | 1 1.09 58.70
4.483 | 1 1.09 59.78
4.786 | 1 1.09 60.87
4.787 | 1 1.09 61.96
4.801 | 1 1.09 63.04
4.896 | 1 1.09 64.13
4.911 | 1 1.09 65.22
4.999 | 1 1.09 66.30
5.161 | 1 1.09 67.39
5.309 | 1 1.09 68.48
5.472 | 1 1.09 69.57
5.482 | 1 1.09 70.65
5.811 | 1 1.09 71.74
5.874 | 1 1.09 72.83
6.137 | 1 1.09 73.91
6.189 | 1 1.09 75.00
6.242 | 1 1.09 76.09
6.323 | 1 1.09 77.17
6.41 | 1 1.09 78.26
6.453 | 1 1.09 79.35
6.475 | 1 1.09 80.43
6.677 | 1 1.09 81.52
6.913 | 1 1.09 82.61
6.951 | 1 1.09 83.70
7.125 | 1 1.09 84.78
7.482 | 1 1.09 85.87
7.561 | 1 1.09 86.96
7.972 | 1 1.09 88.04
7.987 | 1 1.09 89.13
8.004 | 1 1.09 90.22
8.353 | 1 1.09 91.30
9.144 | 1 1.09 92.39
9.745 | 1 1.09 93.48
9.974 | 1 1.09 94.57
10.191 | 1 1.09 95.65
10.541 | 1 1.09 96.74
11.52 | 1 1.09 97.83
15.528 | 1 1.09 98.91
17.158 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of ppexp
Per Pupil |
Expenditure |
s | Freq. Percent Cum.
------------+-----------------------------------
12799 | 1 1.09 1.09
12814 | 1 1.09 2.17
12863 | 1 1.09 3.26
13086 | 1 1.09 4.35
13148 | 1 1.09 5.43
13227 | 1 1.09 6.52
13229 | 1 1.09 7.61
13295 | 1 1.09 8.70
13433 | 1 1.09 9.78
13467 | 1 1.09 10.87
13594 | 1 1.09 11.96
13633 | 1 1.09 13.04
13775 | 1 1.09 14.13
13780 | 1 1.09 15.22
13795 | 1 1.09 16.30
14023 | 1 1.09 17.39
14126 | 1 1.09 18.48
14194 | 1 1.09 19.57
14225 | 1 1.09 20.65
14274 | 1 1.09 21.74
14398 | 1 1.09 22.83
14456 | 1 1.09 23.91
14487 | 1 1.09 25.00
14494 | 1 1.09 26.09
14526 | 1 1.09 27.17
14596 | 1 1.09 28.26
14651 | 1 1.09 29.35
14704 | 1 1.09 30.43
14707 | 1 1.09 31.52
14822 | 1 1.09 32.61
14841 | 1 1.09 33.70
14993 | 1 1.09 34.78
15034 | 1 1.09 35.87
15142 | 1 1.09 36.96
15211 | 1 1.09 38.04
15236 | 1 1.09 39.13
15362 | 1 1.09 40.22
15466 | 1 1.09 41.30
15573 | 1 1.09 42.39
15607 | 1 1.09 43.48
15636 | 1 1.09 44.57
15647 | 1 1.09 45.65
15757 | 1 1.09 46.74
16122 | 1 1.09 47.83
16148 | 1 1.09 48.91
16227 | 1 1.09 50.00
16466 | 1 1.09 51.09
16502 | 1 1.09 52.17
16530 | 1 1.09 53.26
16663 | 1 1.09 54.35
16932 | 1 1.09 55.43
17000 | 1 1.09 56.52
17003 | 1 1.09 57.61
17081 | 1 1.09 58.70
17204 | 1 1.09 59.78
17324 | 1 1.09 60.87
17325 | 2 2.17 63.04
17368 | 1 1.09 64.13
17370 | 1 1.09 65.22
17513 | 1 1.09 66.30
17552 | 1 1.09 67.39
17615 | 1 1.09 68.48
17768 | 1 1.09 69.57
17833 | 1 1.09 70.65
17916 | 1 1.09 71.74
18020 | 1 1.09 72.83
18027 | 1 1.09 73.91
18178 | 1 1.09 75.00
18388 | 1 1.09 76.09
18404 | 1 1.09 77.17
18864 | 1 1.09 78.26
19181 | 1 1.09 79.35
19434 | 1 1.09 80.43
19634 | 1 1.09 81.52
19688 | 1 1.09 82.61
19917 | 1 1.09 83.70
21038 | 1 1.09 84.78
21296 | 1 1.09 85.87
21664 | 1 1.09 86.96
21695 | 1 1.09 88.04
21705 | 1 1.09 89.13
21910 | 1 1.09 90.22
22245 | 1 1.09 91.30
22301 | 1 1.09 92.39
22419 | 1 1.09 93.48
22529 | 1 1.09 94.57
23162 | 1 1.09 95.65
24063 | 1 1.09 96.74
24654 | 1 1.09 97.83
24981 | 1 1.09 98.91
25022 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of ppexp000
Per Pupil |
Expenditure |
s in $1000s | Freq. Percent Cum.
------------+-----------------------------------
12.799 | 1 1.09 1.09
12.814 | 1 1.09 2.17
12.863 | 1 1.09 3.26
13.086 | 1 1.09 4.35
13.148 | 1 1.09 5.43
13.227 | 1 1.09 6.52
13.229 | 1 1.09 7.61
13.295 | 1 1.09 8.70
13.433 | 1 1.09 9.78
13.467 | 1 1.09 10.87
13.594 | 1 1.09 11.96
13.633 | 1 1.09 13.04
13.775 | 1 1.09 14.13
13.78 | 1 1.09 15.22
13.795 | 1 1.09 16.30
14.023 | 1 1.09 17.39
14.126 | 1 1.09 18.48
14.194 | 1 1.09 19.57
14.225 | 1 1.09 20.65
14.274 | 1 1.09 21.74
14.398 | 1 1.09 22.83
14.456 | 1 1.09 23.91
14.487 | 1 1.09 25.00
14.494 | 1 1.09 26.09
14.526 | 1 1.09 27.17
14.596 | 1 1.09 28.26
14.651 | 1 1.09 29.35
14.704 | 1 1.09 30.43
14.707 | 1 1.09 31.52
14.822 | 1 1.09 32.61
14.841 | 1 1.09 33.70
14.993 | 1 1.09 34.78
15.034 | 1 1.09 35.87
15.142 | 1 1.09 36.96
15.211 | 1 1.09 38.04
15.236 | 1 1.09 39.13
15.362 | 1 1.09 40.22
15.466 | 1 1.09 41.30
15.573 | 1 1.09 42.39
15.607 | 1 1.09 43.48
15.636 | 1 1.09 44.57
15.647 | 1 1.09 45.65
15.757 | 1 1.09 46.74
16.122 | 1 1.09 47.83
16.148 | 1 1.09 48.91
16.227 | 1 1.09 50.00
16.466 | 1 1.09 51.09
16.502 | 1 1.09 52.17
16.53 | 1 1.09 53.26
16.663 | 1 1.09 54.35
16.932 | 1 1.09 55.43
17 | 1 1.09 56.52
17.003 | 1 1.09 57.61
17.081 | 1 1.09 58.70
17.204 | 1 1.09 59.78
17.324 | 1 1.09 60.87
17.325 | 2 2.17 63.04
17.368 | 1 1.09 64.13
17.37 | 1 1.09 65.22
17.513 | 1 1.09 66.30
17.552 | 1 1.09 67.39
17.615 | 1 1.09 68.48
17.768 | 1 1.09 69.57
17.833 | 1 1.09 70.65
17.916 | 1 1.09 71.74
18.02 | 1 1.09 72.83
18.027 | 1 1.09 73.91
18.178 | 1 1.09 75.00
18.388 | 1 1.09 76.09
18.404 | 1 1.09 77.17
18.864 | 1 1.09 78.26
19.181 | 1 1.09 79.35
19.434 | 1 1.09 80.43
19.634 | 1 1.09 81.52
19.688 | 1 1.09 82.61
19.917 | 1 1.09 83.70
21.038 | 1 1.09 84.78
21.296 | 1 1.09 85.87
21.664 | 1 1.09 86.96
21.695 | 1 1.09 88.04
21.705 | 1 1.09 89.13
21.91 | 1 1.09 90.22
22.245 | 1 1.09 91.30
22.301 | 1 1.09 92.39
22.419 | 1 1.09 93.48
22.529 | 1 1.09 94.57
23.162 | 1 1.09 95.65
24.063 | 1 1.09 96.74
24.654 | 1 1.09 97.83
24.981 | 1 1.09 98.91
25.022 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of povrate
Poverty |
Rate | Freq. Percent Cum.
------------+-----------------------------------
0 | 4 4.35 4.35
1 | 10 10.87 15.22
2 | 8 8.70 23.91
3 | 5 5.43 29.35
4 | 9 9.78 39.13
5 | 6 6.52 45.65
6 | 5 5.43 51.09
7 | 3 3.26 54.35
8 | 4 4.35 58.70
9 | 5 5.43 64.13
10 | 2 2.17 66.30
11 | 3 3.26 69.57
12 | 3 3.26 72.83
13 | 1 1.09 73.91
14 | 1 1.09 75.00
16 | 1 1.09 76.09
18 | 1 1.09 77.17
19 | 1 1.09 78.26
21 | 1 1.09 79.35
25 | 3 3.26 82.61
26 | 3 3.26 85.87
28 | 1 1.09 86.96
30 | 2 2.17 89.13
34 | 1 1.09 90.22
35 | 1 1.09 91.30
37 | 1 1.09 92.39
41 | 1 1.09 93.48
42 | 1 1.09 94.57
49 | 1 1.09 95.65
58 | 1 1.09 96.74
67 | 1 1.09 97.83
72 | 1 1.09 98.91
89 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of attend
Average |
Percent |
Attendance | Freq. Percent Cum.
------------+-----------------------------------
90 | 1 1.09 1.09
91 | 2 2.17 3.26
92 | 2 2.17 5.43
93 | 6 6.52 11.96
94 | 22 23.91 35.87
95 | 19 20.65 56.52
96 | 33 35.87 92.39
97 | 7 7.61 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of csize
Average |
Class Size | Freq. Percent Cum.
------------+-----------------------------------
16 | 1 1.09 1.09
17 | 1 1.09 2.17
18 | 4 4.35 6.52
19 | 4 4.35 10.87
20 | 12 13.04 23.91
21 | 19 20.65 44.57
22 | 26 28.26 72.83
23 | 17 18.48 91.30
24 | 7 7.61 98.91
25 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of medtchexp
Median |
Teacher |
Workforce |
Experience |
in Years | Freq. Percent Cum.
------------+-----------------------------------
7 | 3 3.26 3.26
8 | 11 11.96 15.22
9 | 14 15.22 30.43
10 | 12 13.04 43.48
11 | 10 10.87 54.35
12 | 18 19.57 73.91
13 | 9 9.78 83.70
14 | 6 6.52 90.22
15 | 3 3.26 93.48
16 | 4 4.35 97.83
17 | 1 1.09 98.91
18 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passeng4
4th Grade |
Pass Rate |
Eng Regents | Freq. Percent Cum.
------------+-----------------------------------
52 | 1 1.09 1.09
56 | 1 1.09 2.17
63 | 1 1.09 3.26
64 | 1 1.09 4.35
65 | 1 1.09 5.43
67 | 1 1.09 6.52
70 | 1 1.09 7.61
71 | 2 2.17 9.78
72 | 2 2.17 11.96
73 | 1 1.09 13.04
74 | 1 1.09 14.13
75 | 2 2.17 16.30
76 | 3 3.26 19.57
78 | 3 3.26 22.83
79 | 2 2.17 25.00
80 | 3 3.26 28.26
81 | 2 2.17 30.43
82 | 4 4.35 34.78
83 | 5 5.43 40.22
84 | 7 7.61 47.83
85 | 2 2.17 50.00
86 | 6 6.52 56.52
87 | 6 6.52 63.04
88 | 5 5.43 68.48
89 | 2 2.17 70.65
90 | 7 7.61 78.26
91 | 6 6.52 84.78
92 | 3 3.26 88.04
93 | 3 3.26 91.30
94 | 1 1.09 92.39
95 | 2 2.17 94.57
96 | 1 1.09 95.65
98 | 3 3.26 98.91
99 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passmat4
4th Grade |
Pass Rate |
Math |
Regents | Freq. Percent Cum.
------------+-----------------------------------
72 | 1 1.09 1.09
79 | 2 2.17 3.26
80 | 2 2.17 5.43
82 | 1 1.09 6.52
83 | 3 3.26 9.78
85 | 2 2.17 11.96
86 | 2 2.17 14.13
87 | 1 1.09 15.22
88 | 1 1.09 16.30
89 | 4 4.35 20.65
90 | 1 1.09 21.74
91 | 2 2.17 23.91
92 | 5 5.43 29.35
93 | 5 5.43 34.78
94 | 8 8.70 43.48
95 | 11 11.96 55.43
96 | 12 13.04 68.48
97 | 9 9.78 78.26
98 | 7 7.61 85.87
99 | 8 8.70 94.57
100 | 5 5.43 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passeng8
8th Grade |
Pass Rate |
Eng Regents | Freq. Percent Cum.
------------+-----------------------------------
18 | 1 1.09 1.09
20 | 1 1.09 2.17
22 | 1 1.09 3.26
30 | 1 1.09 4.35
36 | 1 1.09 5.43
44 | 1 1.09 6.52
45 | 2 2.17 8.70
46 | 2 2.17 10.87
48 | 1 1.09 11.96
50 | 1 1.09 13.04
51 | 2 2.17 15.22
52 | 1 1.09 16.30
54 | 4 4.35 20.65
55 | 1 1.09 21.74
56 | 1 1.09 22.83
58 | 2 2.17 25.00
61 | 2 2.17 27.17
63 | 1 1.09 28.26
64 | 3 3.26 31.52
65 | 5 5.43 36.96
66 | 3 3.26 40.22
67 | 2 2.17 42.39
68 | 2 2.17 44.57
69 | 5 5.43 50.00
70 | 1 1.09 51.09
71 | 4 4.35 55.43
72 | 4 4.35 59.78
73 | 1 1.09 60.87
74 | 3 3.26 64.13
75 | 3 3.26 67.39
76 | 3 3.26 70.65
77 | 5 5.43 76.09
78 | 3 3.26 79.35
79 | 2 2.17 81.52
81 | 4 4.35 85.87
82 | 3 3.26 89.13
83 | 1 1.09 90.22
84 | 1 1.09 91.30
85 | 2 2.17 93.48
86 | 3 3.26 96.74
88 | 1 1.09 97.83
90 | 2 2.17 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passmat8
8th Grade |
Pass Rate |
Math |
Regents | Freq. Percent Cum.
------------+-----------------------------------
23 | 1 1.09 1.09
24 | 1 1.09 2.17
26 | 1 1.09 3.26
33 | 1 1.09 4.35
44 | 1 1.09 5.43
46 | 1 1.09 6.52
49 | 1 1.09 7.61
51 | 2 2.17 9.78
52 | 1 1.09 10.87
55 | 3 3.26 14.13
57 | 1 1.09 15.22
58 | 2 2.17 17.39
59 | 1 1.09 18.48
61 | 2 2.17 20.65
63 | 1 1.09 21.74
65 | 2 2.17 23.91
67 | 2 2.17 26.09
70 | 1 1.09 27.17
71 | 4 4.35 31.52
72 | 1 1.09 32.61
73 | 2 2.17 34.78
74 | 4 4.35 39.13
75 | 5 5.43 44.57
76 | 4 4.35 48.91
78 | 2 2.17 51.09
79 | 2 2.17 53.26
80 | 2 2.17 55.43
81 | 4 4.35 59.78
82 | 2 2.17 61.96
84 | 6 6.52 68.48
85 | 8 8.70 77.17
86 | 2 2.17 79.35
87 | 1 1.09 80.43
88 | 4 4.35 84.78
89 | 1 1.09 85.87
90 | 3 3.26 89.13
91 | 1 1.09 90.22
92 | 6 6.52 96.74
93 | 3 3.26 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passengr
English |
Regents |
Pass Rate | Freq. Percent Cum.
------------+-----------------------------------
30 | 1 1.09 1.09
36 | 1 1.09 2.17
54 | 1 1.09 3.26
65 | 1 1.09 4.35
67 | 1 1.09 5.43
68 | 1 1.09 6.52
70 | 1 1.09 7.61
75 | 4 4.35 11.96
77 | 1 1.09 13.04
78 | 1 1.09 14.13
80 | 1 1.09 15.22
81 | 1 1.09 16.30
82 | 2 2.17 18.48
83 | 4 4.35 22.83
84 | 3 3.26 26.09
85 | 1 1.09 27.17
86 | 6 6.52 33.70
87 | 4 4.35 38.04
88 | 3 3.26 41.30
89 | 5 5.43 46.74
90 | 4 4.35 51.09
91 | 3 3.26 54.35
92 | 4 4.35 58.70
93 | 7 7.61 66.30
94 | 5 5.43 71.74
95 | 5 5.43 77.17
96 | 9 9.78 86.96
97 | 5 5.43 92.39
98 | 2 2.17 94.57
99 | 4 4.35 98.91
100 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passmathr
Math |
Regents |
Pass Rate | Freq. Percent Cum.
------------+-----------------------------------
21 | 1 1.09 1.09
43 | 1 1.09 2.17
44 | 1 1.09 3.26
46 | 1 1.09 4.35
57 | 1 1.09 5.43
68 | 1 1.09 6.52
70 | 1 1.09 7.61
71 | 1 1.09 8.70
72 | 1 1.09 9.78
73 | 1 1.09 10.87
74 | 1 1.09 11.96
75 | 1 1.09 13.04
77 | 2 2.17 15.22
79 | 1 1.09 16.30
80 | 3 3.26 19.57
81 | 1 1.09 20.65
82 | 3 3.26 23.91
83 | 4 4.35 28.26
84 | 2 2.17 30.43
85 | 2 2.17 32.61
86 | 4 4.35 36.96
87 | 2 2.17 39.13
88 | 3 3.26 42.39
89 | 7 7.61 50.00
90 | 3 3.26 53.26
91 | 5 5.43 58.70
92 | 5 5.43 64.13
93 | 3 3.26 67.39
94 | 8 8.70 76.09
95 | 9 9.78 85.87
96 | 8 8.70 94.57
97 | 2 2.17 96.74
98 | 2 2.17 98.91
99 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passsocsr
Social |
Studies |
Regents |
Pass Rate | Freq. Percent Cum.
------------+-----------------------------------
24 | 1 1.09 1.09
38 | 1 1.09 2.17
47 | 1 1.09 3.26
61 | 1 1.09 4.35
66 | 2 2.17 6.52
67 | 1 1.09 7.61
72 | 1 1.09 8.70
74 | 2 2.17 10.87
75 | 1 1.09 11.96
76 | 1 1.09 13.04
77 | 2 2.17 15.22
78 | 1 1.09 16.30
79 | 1 1.09 17.39
80 | 2 2.17 19.57
81 | 1 1.09 20.65
82 | 2 2.17 22.83
83 | 2 2.17 25.00
84 | 4 4.35 29.35
85 | 4 4.35 33.70
86 | 2 2.17 35.87
87 | 3 3.26 39.13
88 | 5 5.43 44.57
89 | 2 2.17 46.74
90 | 5 5.43 52.17
91 | 3 3.26 55.43
92 | 9 9.78 65.22
93 | 4 4.35 69.57
94 | 5 5.43 75.00
95 | 10 10.87 85.87
96 | 4 4.35 90.22
97 | 4 4.35 94.57
98 | 1 1.09 95.65
99 | 2 2.17 97.83
100 | 2 2.17 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of gradrate
Graduation |
Rate | Freq. Percent Cum.
------------+-----------------------------------
33 | 1 1.09 1.09
39 | 1 1.09 2.17
43 | 1 1.09 3.26
52 | 1 1.09 4.35
53 | 1 1.09 5.43
65 | 1 1.09 6.52
66 | 1 1.09 7.61
67 | 1 1.09 8.70
68 | 1 1.09 9.78
69 | 2 2.17 11.96
72 | 1 1.09 13.04
73 | 2 2.17 15.22
74 | 1 1.09 16.30
75 | 4 4.35 20.65
76 | 1 1.09 21.74
77 | 1 1.09 22.83
78 | 2 2.17 25.00
79 | 2 2.17 27.17
80 | 1 1.09 28.26
81 | 2 2.17 30.43
82 | 3 3.26 33.70
83 | 1 1.09 34.78
84 | 3 3.26 38.04
85 | 3 3.26 41.30
86 | 3 3.26 44.57
87 | 5 5.43 50.00
88 | 2 2.17 52.17
89 | 5 5.43 57.61
90 | 6 6.52 64.13
91 | 4 4.35 68.48
92 | 4 4.35 72.83
93 | 6 6.52 79.35
94 | 6 6.52 85.87
95 | 2 2.17 88.04
96 | 4 4.35 92.39
97 | 1 1.09 93.48
98 | 2 2.17 95.65
99 | 3 3.26 98.91
100 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of pct4col
% Grads Who |
Go to |
College in |
Year after |
Graduation | Freq. Percent Cum.
------------+-----------------------------------
11 | 1 1.09 1.09
31 | 2 2.17 3.26
38 | 2 2.17 5.43
41 | 3 3.26 8.70
45 | 5 5.43 14.13
46 | 1 1.09 15.22
48 | 1 1.09 16.30
49 | 2 2.17 18.48
50 | 2 2.17 20.65
51 | 3 3.26 23.91
54 | 3 3.26 27.17
55 | 5 5.43 32.61
56 | 5 5.43 38.04
57 | 1 1.09 39.13
58 | 1 1.09 40.22
59 | 2 2.17 42.39
60 | 2 2.17 44.57
61 | 2 2.17 46.74
62 | 3 3.26 50.00
63 | 3 3.26 53.26
64 | 2 2.17 55.43
65 | 2 2.17 57.61
66 | 2 2.17 59.78
67 | 3 3.26 63.04
69 | 3 3.26 66.30
70 | 2 2.17 68.48
71 | 2 2.17 70.65
73 | 3 3.26 73.91
75 | 3 3.26 77.17
77 | 2 2.17 79.35
78 | 2 2.17 81.52
79 | 3 3.26 84.78
82 | 1 1.09 85.87
84 | 1 1.09 86.96
85 | 2 2.17 89.13
86 | 1 1.09 90.22
87 | 1 1.09 91.30
88 | 2 2.17 93.48
89 | 1 1.09 94.57
90 | 2 2.17 96.74
91 | 1 1.09 97.83
95 | 1 1.09 98.91
100 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of pass8
Avg Pass |
Rate 8th |
Grade Eng & |
Math |
Regents | Freq. Percent Cum.
------------+-----------------------------------
22 | 1 1.09 1.09
23 | 1 1.09 2.17
26.5 | 2 2.17 4.35
42.5 | 1 1.09 5.43
45.5 | 1 1.09 6.52
46 | 1 1.09 7.61
48.5 | 1 1.09 8.70
50 | 1 1.09 9.78
51 | 2 2.17 11.96
52 | 1 1.09 13.04
52.5 | 1 1.09 14.13
54.5 | 2 2.17 16.30
56.5 | 1 1.09 17.39
57 | 1 1.09 18.48
58 | 1 1.09 19.57
59.5 | 1 1.09 20.65
60.5 | 1 1.09 21.74
62.5 | 2 2.17 23.91
63.5 | 1 1.09 25.00
64 | 1 1.09 26.09
64.5 | 1 1.09 27.17
65 | 1 1.09 28.26
66 | 1 1.09 29.35
67.5 | 1 1.09 30.43
68.5 | 1 1.09 31.52
69 | 1 1.09 32.61
69.5 | 2 2.17 34.78
70 | 2 2.17 36.96
70.5 | 2 2.17 39.13
71 | 1 1.09 40.22
72.5 | 4 4.35 44.57
73 | 2 2.17 46.74
73.5 | 1 1.09 47.83
74 | 2 2.17 50.00
74.5 | 1 1.09 51.09
75 | 2 2.17 53.26
75.5 | 2 2.17 55.43
76.5 | 2 2.17 57.61
77 | 2 2.17 59.78
77.5 | 2 2.17 61.96
78 | 2 2.17 64.13
78.5 | 1 1.09 65.22
79 | 3 3.26 68.48
79.5 | 1 1.09 69.57
80.5 | 3 3.26 72.83
81 | 2 2.17 75.00
81.5 | 1 1.09 76.09
82 | 2 2.17 78.26
82.5 | 1 1.09 79.35
83 | 3 3.26 82.61
83.5 | 1 1.09 83.70
84 | 1 1.09 84.78
84.5 | 2 2.17 86.96
85 | 2 2.17 89.13
85.5 | 1 1.09 90.22
87 | 3 3.26 93.48
87.5 | 1 1.09 94.57
88.5 | 1 1.09 95.65
89 | 2 2.17 97.83
90 | 1 1.09 98.91
91 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of pass4
Avg Pass |
Rate 4th |
Grade Eng & |
Math |
Regents | Freq. Percent Cum.
------------+-----------------------------------
64 | 1 1.09 1.09
66 | 1 1.09 2.17
71 | 1 1.09 3.26
72.5 | 1 1.09 4.35
73 | 1 1.09 5.43
73.5 | 1 1.09 6.52
78 | 1 1.09 7.61
78.5 | 3 3.26 10.87
79.5 | 1 1.09 11.96
80 | 1 1.09 13.04
81.5 | 1 1.09 14.13
82.5 | 2 2.17 16.30
83 | 3 3.26 19.57
83.5 | 1 1.09 20.65
84 | 3 3.26 23.91
84.5 | 1 1.09 25.00
85 | 1 1.09 26.09
85.5 | 2 2.17 28.26
86 | 2 2.17 30.43
86.5 | 1 1.09 31.52
88 | 6 6.52 38.04
89 | 2 2.17 40.22
89.5 | 3 3.26 43.48
90 | 5 5.43 48.91
90.5 | 6 6.52 55.43
91 | 4 4.35 59.78
91.5 | 3 3.26 63.04
92 | 4 4.35 67.39
92.5 | 6 6.52 73.91
93 | 2 2.17 76.09
93.5 | 2 2.17 78.26
94 | 1 1.09 79.35
94.5 | 5 5.43 84.78
95 | 2 2.17 86.96
95.5 | 2 2.17 89.13
96 | 1 1.09 90.22
96.5 | 4 4.35 94.57
97 | 1 1.09 95.65
98 | 1 1.09 96.74
99 | 2 2.17 98.91
99.5 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
-> tabulation of passregents
Avg Pass |
Rate Eng |
Math |
SocStud |
Regents | Freq. Percent Cum.
------------+-----------------------------------
25 | 1 1.09 1.09
39.33333 | 1 1.09 2.17
49 | 1 1.09 3.26
57 | 1 1.09 4.35
66.66666 | 1 1.09 5.43
67 | 1 1.09 6.52
69 | 1 1.09 7.61
72 | 1 1.09 8.70
74 | 1 1.09 9.78
75.33334 | 1 1.09 10.87
76 | 1 1.09 11.96
77.33334 | 1 1.09 13.04
77.66666 | 1 1.09 14.13
78 | 1 1.09 15.22
79 | 1 1.09 16.30
79.33334 | 1 1.09 17.39
79.66666 | 1 1.09 18.48
80.66666 | 1 1.09 19.57
82 | 1 1.09 20.65
82.33334 | 1 1.09 21.74
82.66666 | 1 1.09 22.83
83 | 1 1.09 23.91
83.33334 | 1 1.09 25.00
83.66666 | 1 1.09 26.09
84.33334 | 1 1.09 27.17
85 | 1 1.09 28.26
85.33334 | 2 2.17 30.43
85.66666 | 1 1.09 31.52
86 | 1 1.09 32.61
86.33334 | 1 1.09 33.70
86.66666 | 2 2.17 35.87
87 | 1 1.09 36.96
87.33334 | 2 2.17 39.13
88 | 2 2.17 41.30
88.33334 | 2 2.17 43.48
88.66666 | 3 3.26 46.74
89 | 1 1.09 47.83
90 | 1 1.09 48.91
90.33334 | 1 1.09 50.00
90.66666 | 2 2.17 52.17
91 | 1 1.09 53.26
91.33334 | 2 2.17 55.43
91.66666 | 1 1.09 56.52
92 | 2 2.17 58.70
92.33334 | 6 6.52 65.22
92.66666 | 2 2.17 67.39
93.33334 | 2 2.17 69.57
93.66666 | 1 1.09 70.65
94 | 3 3.26 73.91
94.33334 | 3 3.26 77.17
94.66666 | 2 2.17 79.35
95 | 4 4.35 83.70
95.33334 | 3 3.26 86.96
95.66666 | 3 3.26 90.22
96 | 1 1.09 91.30
96.66666 | 3 3.26 94.57
97.33334 | 2 2.17 96.74
98.66666 | 1 1.09 97.83
99 | 1 1.09 98.91
99.66666 | 1 1.09 100.00
------------+-----------------------------------
Total | 92 100.00
. regress pct4col enroll000, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =    3.97
       Model |  1008.31333     1  1008.31333           Prob > F      =  0.0493
    Residual |  22853.4258    90  253.926953           R-squared     =  0.0423
-------------+------------------------------           Adj R-squared =  0.0316
       Total |  23861.7391    91  262.216914           Root MSE      =  15.935
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
   enroll000 |  -1.100622   .5523253    -1.99   0.049                -.2055638
       _cons |   68.35374    3.04827    22.42   0.000                        .
------------------------------------------------------------------------------
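The summary statistics in the upper right of the table follow directly from the sums of squares on the left. The sketch below recomputes them from the printed values only; the scalar names are illustrative and are not part of the original do-file.

```stata
* Values copied from the ANOVA table for: regress pct4col enroll000, beta
scalar ss_model = 1008.31333
scalar ss_total = 23861.7391
scalar ms_resid = 253.926953
scalar r2       = ss_model/ss_total

display "R-squared     = " %6.4f (r2)
* n = 92 observations, 1 predictor, so 91 total df and 90 residual df
display "Adj R-squared = " %6.4f (1 - (1 - r2)*91/90)
display "F(1, 90)      = " %6.2f (ss_model/ms_resid)
display "Root MSE      = " %6.3f (sqrt(ms_resid))
* With a single predictor, |Beta| equals the square root of R-squared
display "|Beta|        = " %9.7f (sqrt(r2))
```

Each displayed value should agree with the corresponding entry in the table up to rounding; the sign of Beta is taken from the coefficient.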
. regress pct4col ppexp000, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =   12.43
       Model |  2895.02137     1  2895.02137           Prob > F      =  0.0007
    Residual |  20966.7178    90  232.963531           R-squared     =  0.1213
-------------+------------------------------           Adj R-squared =  0.1116
       Total |  23861.7391    91  262.216914           Root MSE      =  15.263
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
    ppexp000 |   1.789275   .5075692     3.53   0.001                 .3483171
       _cons |   32.97287   8.738007     3.77   0.000                        .
------------------------------------------------------------------------------
. regress pct4col csize, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =    3.27
       Model |  835.986169     1  835.986169           Prob > F      =  0.0740
    Residual |   23025.753    90    255.8417           R-squared     =  0.0350
-------------+------------------------------           Adj R-squared =  0.0243
       Total |  23861.7391    91  262.216914           Root MSE      =  15.995
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
       csize |  -1.803723   .9978284    -1.81   0.074                -.1871753
       _cons |   102.0017    21.4964     4.75   0.000                        .
------------------------------------------------------------------------------
. regress pct4col attend, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =   32.61
       Model |  6346.29455     1  6346.29455           Prob > F      =  0.0000
    Residual |  17515.4446    90  194.616051           R-squared     =  0.2660
-------------+------------------------------           Adj R-squared =  0.2578
       Total |  23861.7391    91  262.216914           Root MSE      =   13.95
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
      attend |    5.91142   1.035192     5.71   0.000                 .5157142
       _cons |  -497.9385   98.28651    -5.07   0.000                        .
------------------------------------------------------------------------------
. regress pct4col pass4, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =   49.96
       Model |  8518.01997     1  8518.01997           Prob > F      =  0.0000
    Residual |  15343.7192    90  170.485768           R-squared     =  0.3570
-------------+------------------------------           Adj R-squared =  0.3498
       Total |  23861.7391    91  262.216914           Root MSE      =  13.057
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
       pass4 |   1.361642   .1926361     7.07   0.000                  .597473
       _cons |  -57.31848     17.113    -3.35   0.001                        .
------------------------------------------------------------------------------
. regress pct4col passregents, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =   92.56
       Model |  12098.4906     1  12098.4906           Prob > F      =  0.0000
    Residual |  11763.2485    90  130.702761           R-squared     =  0.5070
-------------+------------------------------           Adj R-squared =  0.5015
       Total |  23861.7391    91  262.216914           Root MSE      =  11.433
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
 passregents |   .9457814   .0983032     9.62   0.000                 .7120567
       _cons |  -18.73084   8.605051    -2.18   0.032                        .
------------------------------------------------------------------------------
. regress pct4col medtchexp, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =    1.90
       Model |  494.508029     1  494.508029           Prob > F      =  0.1710
    Residual |  23367.2311    90  259.635901           R-squared     =  0.0207
-------------+------------------------------           Adj R-squared =  0.0098
       Total |  23861.7391    91  262.216914           Root MSE      =  16.113
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
   medtchexp |   .9448936   .6846658     1.38   0.171                 .1439579
       _cons |   52.72325   7.818131     6.74   0.000                        .
------------------------------------------------------------------------------
. regress pct4col co, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =    4.17
       Model |  1056.69105     1  1056.69105           Prob > F      =  0.0441
    Residual |  22805.0481    90  253.389423           R-squared     =  0.0443
-------------+------------------------------           Adj R-squared =  0.0337
       Total |  23861.7391    91  262.216914           Root MSE      =  15.918
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
          co |   6.836538   3.347777     2.04   0.044                 .2104374
       _cons |   53.45192   5.081951    10.52   0.000                        .
------------------------------------------------------------------------------
. regress pct4col enroll000 ppexp000 csize attend pass4 passregents medtchexp co, beta
      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  8,    83) =   19.24
       Model |  15503.6188     8  1937.95235           Prob > F      =  0.0000
    Residual |  8358.12034    83  100.700245           R-squared     =  0.6497
-------------+------------------------------           Adj R-squared =  0.6160
       Total |  23861.7391    91  262.216914           Root MSE      =  10.035
------------------------------------------------------------------------------
     pct4col |      Coef.  Std. Err.        t   P>|t|                     Beta
-------------+----------------------------------------------------------------
   enroll000 |   .0302549   .4037618     0.07   0.940                 .0056507
    ppexp000 |   2.082902   .5055581     4.12   0.000                 .4054772
       csize |   .2630747   .7725277     0.34   0.734                 .0272997
      attend |   1.570267     .91467     1.72   0.090                 .1369906
       pass4 |   .2538726   .2337281     1.09   0.281                 .1113964
 passregents |   .7451234   .1280158     5.82   0.000                  .560986
   medtchexp |    -1.3152   .5588814    -2.35   0.021                -.2003755
          co |   1.073157   2.573468     0.42   0.678                 .0330332
       _cons |   -200.811   83.39689    -2.41   0.018                        .
------------------------------------------------------------------------------
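The Beta column reports standardized coefficients, and the fit statistics can be rebuilt from the results that regress leaves behind in e(). The sketch below assumes the reportcards200405 data are in memory; the scalar name sd_x is illustrative and is not part of the original do-file.

```stata
* Re-run the full model quietly so the e() results are available
quietly regress pct4col enroll000 ppexp000 csize attend pass4 passregents medtchexp co

* Fit statistics rebuilt from the stored sums of squares and degrees of freedom
display "R-squared     = " %6.4f (e(mss)/(e(mss) + e(rss)))
display "Adj R-squared = " %6.4f (1 - (1 - e(r2))*(e(N) - 1)/e(df_r))
display "F statistic   = " %6.2f ((e(mss)/e(df_m))/(e(rss)/e(df_r)))

* Standardized coefficient for ppexp000: b * sd(x) / sd(y)
quietly summarize ppexp000 if e(sample)
scalar sd_x = r(sd)
quietly summarize pct4col if e(sample)
display "Beta(ppexp000) = " %9.7f (_b[ppexp000]*sd_x/r(sd))
```

The last line should match the Beta entry for ppexp000 in the table up to rounding; swapping in another predictor name gives its standardized coefficient the same way.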
. estat vif
    Variable |       VIF       1/VIF
-------------+----------------------
       pass4 |      2.49    0.401233
    ppexp000 |      2.30    0.435703
 passregents |      2.20    0.454311
   medtchexp |      1.72    0.582081
       csize |      1.52    0.656663
      attend |      1.51    0.662773
          co |      1.49    0.672539
   enroll000 |      1.35    0.742098
-------------+----------------------
    Mean VIF |      1.82
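Each VIF is 1/(1 - R-squared) from regressing that predictor on all the other predictors, and the 1/VIF column is the corresponding tolerance. A minimal sketch for pass4, assuming the same data are in memory:

```stata
* Auxiliary regression: pass4 on the remaining seven predictors
quietly regress pass4 enroll000 ppexp000 csize attend passregents medtchexp co

display "VIF(pass4) = " %5.2f (1/(1 - e(r2)))
display "1/VIF      = " %8.6f (1 - e(r2))
```

Note that this auxiliary regression replaces the stored estimates, so the original regress command would need to be run again before any further estat commands on the main model.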
. save "http://datalibrary.us/reportcards200405.dta", replace
may not write files over Internet
r(633);
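Error r(633) arises because Stata can read a data set from a web address but cannot write to one. A sketch of the usual workaround is to save to a local file instead; the file name below is hypothetical, and the file is written to the current working directory.

```stata
* Show the current working directory, then save a local copy there
pwd
save "reportcards200405_local.dta", replace
```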
. * Step 1
.
. * Load the data set from the web.
.
. use "http://myweb.liu.edu/~redowl/data/reportcards200405.dta", clear
.
.
.
. * Step 2
.
. set more off
.
.
.
. * Step 3
.
. * Turn on feature to place all graphs in tabs of graph window
.
. * instead of separate graph windows. (This may not work on Macs, but
.
. * that feature is just cosmetic and will not affect the ultimate graphs.)
.
. set autotabgraphs on
(set autotabgraphs preference recorded)
.
.
.
. * Step 4
.
. * This drops any graphs that may be in memory from previous analyses.
.
. * The "capture" command instructs Stata to ignore errors if no graphs
.
. * are already in memory and none need to be dropped.
.
. capture graph drop _all
.
.
.
. * Step 5
.
. * This loop produces the graph for each non-binary independent variable
.
. * in the varlist.
.
. foreach var of varlist enroll000 ppexp000 csize attend pass4 passregents medtchexp {
2.
. twoway (lpolyci pct4col `var') (lfit pct4col `var', lcolor(blue))
> (sc pct4col `var', msize(small)), legend(off) title(`var')
> scheme(s1color) name(gr_`var')
3.
. }
.
.
.
. * Step 6
.
. * This produces the graph for the single binary independent variable,
.
. * co, and omits the polynomial line. Any line between two points
.
. * (i.e., the averages of the categories of a binary variable) will always
.
. * be a straight line, so a polynomial fit would not be meaningful.
.
. twoway (lfitci pct4col co) (lfit pct4col co, lcolor(blue))
> (sc pct4col co, mcolor(maroon) msize(small)), xlabel(1(1)2, valuelabel)
> legend(off) title(co) scheme(s1color) name(gr_co)
.
.
.
. * Step 7
.
. * This produces the first set of combined graphs from the loop above.
.
. graph combine gr_enroll000 gr_ppexp000 gr_csize gr_attend, cols(2)
> title(Review of Linearity) scheme(s1color) name(gr_comb1)
.
.
.
. * Step 8
.
. * This produces the second set of combined graphs from the loop above.
.
. graph combine gr_pass4 gr_passregents gr_medtchexp gr_co, cols(2)
> title(Review of Linearity) scheme(s1color) name(gr_comb2)
.
.
.
. * Step 9
.
. * This shows the nonlinear relationship between pct4col and ppexp
.
. * for all districts.
.
. twoway (lpolyci pct4col ppexp000)
> (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12.8 19))
> (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22))
> (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25))
> (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) xline(19) xline(22)
> legend(off) title(Regression Lines - All Districts) scheme(s1color) name(gr_ppexp000all)
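The Step 9 graph overlays separate lfit lines on three ranges of ppexp000. One way to carry the same idea into a regression, offered only as a sketch and not as part of the original analysis, is a linear spline with knots at 19 and 22. The variable names pp_lo, pp_mid, and pp_hi are hypothetical. Unlike the separately fitted segments in the graph, a spline forces the segments to join at the knots.

```stata
* Knots at 19 and 22 mirror the xline() positions used in the Step 9 graph
mkspline pp_lo 19 pp_mid 22 pp_hi = ppexp000

* Each coefficient is the slope of pct4col on ppexp000 within its range
regress pct4col pp_lo pp_mid pp_hi

* Test whether the three slopes are equal, i.e., whether one straight line suffices
test pp_lo = pp_mid = pp_hi
```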
.
.
.
. * Step 10
.
. * This adds an overall linear regression line to the polynomial and segmented
. * regression lines.
.
. twoway (lpolyci pct4col ppexp000)
> (lfit pct4col ppexp000, lcolor(red) range(12.8 25))
> (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12.8 19))
> (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22))
> (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25))
> (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) xline(19) xline(22)
> legend(off) title(Regression Lines - All Districts) scheme(s1color) name(gr_ppexp000all2)
.
.
.
. * Step 11
.
. * This shows the nonlinear relationship between pct4col and ppexp
.
. * for Suffolk County districts.
.
. preserve
.
. keep if co==1
(40 observations deleted)
.
. twoway (lpolyci pct4col ppexp000)
> (lfit pct4col ppexp000, lcolor(red) range(12 25))
> (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12 19))
> (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 26))
> (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(20(10)100) xtitle(ppexp000) xlabel(12(2)26) xline(19) xline(22)
> legend(off) title(Suffolk County) scheme(s1color) name(gr_ppexp000suf)
.
. restore
.
.
.
. * Step 12
.
. * This shows the nonlinear relationship between pct4col and ppexp
.
. * for Nassau County districts.
.
. preserve
.
. keep if co==2
(52 observations deleted)
.
. twoway (lpolyci pct4col ppexp000)
> (lfit pct4col ppexp000, lcolor(red) range(12 25))
> (lfit pct4col ppexp000 if ppexp000<=19, lcolor(blue) range(12 19))
> (lfit pct4col ppexp000 if ppexp000>=19 & ppexp000<=22, lcolor(blue) range(19 22))
> (lfit pct4col ppexp000 if ppexp000>=22, lcolor(blue) range(22 25))
> (sc pct4col ppexp000, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(ppexp000) xlabel(12(2)26) xline(19) xline(22)
> legend(off) title(Nassau County) scheme(s1color) name(gr_ppexp000nas)
.
. restore
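Steps 11 and 12 restrict the data with preserve, keep, and restore. An alternative, sketched here under the assumption that co==1 identifies Suffolk County as in the log above, is to put a single if qualifier on the twoway command so the data in memory are never altered. The graph name gr_suf_sketch is hypothetical, and the /// line continuation works only inside a do-file.

```stata
* Same kind of plot for one county, without dropping any observations
twoway (lpolyci pct4col ppexp000) ///
       (lfit pct4col ppexp000, lcolor(red)) ///
       (sc pct4col ppexp000, msize(small)) if co==1, ///
       legend(off) title(Suffolk County) scheme(s1color) ///
       name(gr_suf_sketch, replace)
```

The preserve and restore approach in the log is equally valid; the if qualifier simply removes the risk of leaving the data set trimmed if the do-file stops between keep and restore.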
.
.
.
. * Step 13
.
. * This shows the nonlinear relationship between pct4col and csize
.
. * for all districts.
.
. twoway (lpolyci pct4col csize)
> (lfit pct4col csize if csize<=20, lcolor(blue) range(16 20))
> (lfit pct4col csize if csize>=20 & csize<=23, lcolor(blue) range(20 23))
> (lfit pct4col csize if csize>=23, lcolor(blue) range(23 25))
> (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) xline(20) xline(23)
> legend(off) title(Regression Lines - All Districts) scheme(s1color) name(gr_csizeall)
.
.
.
. * Step 14
.
. * This adds an overall linear regression line to the polynomial and segmented
. * regression lines.
.
. twoway (lpolyci pct4col csize)
> (lfit pct4col csize, lcolor(red) range(16 25))
> (lfit pct4col csize if csize<=20, lcolor(blue) range(16 20))
> (lfit pct4col csize if csize>=20 & csize<=23, lcolor(blue) range(20 23))
> (lfit pct4col csize if csize>=23, lcolor(blue) range(23 25))
> (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) xline(20) xline(23)
> legend(off) title(Regression Lines - All Districts) scheme(s1color) name(gr_csizeall2)
.
.
.
. * Step 15
.
. * This shows the nonlinear relationship between pct4col and csize
.
. * for Suffolk County districts.
.
. preserve
.
. keep if co==1
(40 observations deleted)
.
. twoway (lpolyci pct4col csize)
> (lfit pct4col csize, lcolor(red) range(16 25))
> (lfit pct4col csize if csize<=20 & csize<=24, lcolor(blue) range(16 20))
> (lfit pct4col csize if csize>=20 & csize<=24, lcolor(blue) range(20 24))
> (lfit pct4col csize if csize>=24, lcolor(blue) range(24 25))
> (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(16(2)26) xline(20) xline(24)
> legend(off) title(Regression Lines - Suffolk County) scheme(s1color) name(gr_csizesuf)
.
. restore
.
.
.
. * Step 16
.
. * This shows the nonlinear relationship between pct4col and csize
.
. * for Nassau County districts.
.
. preserve
.
. keep if co==2
(52 observations deleted)
.
. twoway (lpolyci pct4col csize)
> (lfit pct4col csize, lcolor(red) range(18 23))
> (lfit pct4col csize if csize<=20, lcolor(blue) range(18 20))
> (lfit pct4col csize if csize>=20, lcolor(blue) range(20 23))
> (sc pct4col csize, mcolor(orange) mlabel(district) msize(.5) mlabsize(1.5) mlabcolor(black)),
> ytitle(pct4col) ylabel(10(10)100) xtitle(csize) xlabel(18(2)24) xline(20)
> legend(off) title(Regression Lines - Nassau County) scheme(s1color) name(gr_csizenas)
.
. restore
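The graphs named in Steps 5 through 8 exist only in memory until they are written to disk with graph save. A loop can save them all with one command per graph name. This is a sketch: it assumes the graphs listed are still in memory and writes each file to the current working directory rather than to any particular folder.

```stata
* Save each named graph as <name>.gph in the current working directory
foreach g in gr_enroll000 gr_ppexp000 gr_csize gr_attend ///
             gr_pass4 gr_passregents gr_medtchexp gr_co ///
             gr_comb1 gr_comb2 {
    graph save `g' "`g'.gph", replace
}
```

The same loop extends to the Step 9 through Step 16 graphs by adding their names to the list.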
[Figure: scatterplot matrix (lower half) from the graph matrix command. Panels, in order: % Grads Who Go to College in Year after Graduation; Enrollment in 1000s; Per Pupil Expenditures in $1000s; Average Class Size; Average Percent Attendance; Avg Pass Rate 4th Grade Eng & Math Regents; Avg Pass Rate Eng Math SocStud Regents; Median Teacher Workforce Experience in Years; County.]
[Figure "enroll000": pct4col (y-axis, 0 to 100) plotted against enroll000 (x-axis, 0 to 20).]
. graph save gr_enroll000 "C:\Users\Owner\Documents\gr_enroll000.gph"
(file C:\Users\Owner\Documents\gr_enroll000.gph saved)
[Figure "ppexp000": pct4col (y-axis, 0 to 100) plotted against ppexp000 (x-axis, 10 to 25).]
. graph save gr_ppexp000 "C:\Users\Owner\Documents\gr_ppexp000.gph"
(file C:\Users\Owner\Documents\gr_ppexp000.gph saved)
[Figure "csize": pct4col (y-axis, 0 to 100) plotted against csize (x-axis, 16 to 26).]
. graph save gr_csize "C:\Users\Owner\Documents\gr_csize.gph"
(file C:\Users\Owner\Documents\gr_csize.gph saved)
[Figure "attend": pct4col (y-axis, 0 to 100) plotted against attend (x-axis, 90 to 98).]
. graph save gr_attend "C:\Users\Owner\Documents\gr_attend.gph"
(file C:\Users\Owner\Documents\gr_attend.gph saved)
[Figure "pass4": pct4col (y-axis, 0 to 100) plotted against pass4 (x-axis, 60 to 100).]
. graph save gr_pass4 "C:\Users\Owner\Documents\gr_pass4.gph"
(file C:\Users\Owner\Documents\gr_pass4.gph saved)
[Figure "passregents": pct4col (y-axis, 0 to 100) plotted against passregents (x-axis, 20 to 100).]
. graph save gr_passregents "C:\Users\Owner\Documents\gr_passregents.gph"
(file C:\Users\Owner\Documents\gr_passregents.gph saved)
[Figure "medtchexp": pct4col (y-axis, 0 to 100) plotted against medtchexp (x-axis, 5 to 20).]
. graph save gr_medtchexp "C:\Users\Owner\Documents\gr_medtchexp.gph"
(file C:\Users\Owner\Documents\gr_medtchexp.gph saved)
[Figure "co": pct4col (y-axis, 0 to 100) plotted against County (x-axis categories: Suffolk Co., Nassau Co.).]
. graph save gr_co "C:\Users\Owner\Documents\gr_co.gph"
(file C:\Users\Owner\Documents\gr_co.gph saved)
[Figure "Review of Linearity": combined graph with four panels (enroll000, ppexp000, csize, attend), each plotting pct4col on a y-axis from 0 to 100.]
. graph save gr_comb1 "C:\Users\Owner\Documents\gr_comb1.gph"
(file C:\Users\Owner\Documents\gr_comb1.gph saved)
[Figure "Review of Linearity": combined graph with four panels (pass4, passregents, medtchexp, co), each plotting pct4col on a y-axis from 0 to 100.]
. graph save gr_comb2 "C:\Users\Owner\Documents\gr_comb2.gph"
(file C:\Users\Owner\Documents\gr_comb2.gph saved)
[Figure "Regression Lines - All Districts": pct4col (y-axis, 10 to 100) plotted against ppexp000 (x-axis, 12 to 26), with each point labeled by district name.]
. graph save gr_ppexp000all "C:\Users\Owner\Documents\gr_ppexp000all.gph"
(file C:\Users\Owner\Documents\gr_ppexp000all.gph saved)
[Figure "Regression Lines - All Districts": pct4col (y-axis, 10 to 100) plotted against ppexp000 (x-axis, 12 to 26), with each point labeled by district name.]
. graph save gr_ppexp000all2 "C:\Users\Owner\Documents\gr_ppexp000all2.gph"
(file C:\Users\Owner\Documents\gr_ppexp000all2.gph saved)
[Figure "Suffolk County": pct4col (y-axis, 20 to 100) plotted against ppexp000 (x-axis, 12 to 26), with each point labeled by district name.]
. graph save gr_ppexp000suf "C:\Users\Owner\Documents\gr_ppexp000suf.gph"
(file C:\Users\Owner\Documents\gr_ppexp000suf.gph saved)
[Figure "Nassau County": pct4col (y-axis, 10 to 100) plotted against ppexp000 (x-axis, 12 to 26), with each point labeled by district name.]
. graph save gr_ppexp000nas "C:\Users\Owner\Documents\gr_ppexp000nas.gph"
(file C:\Users\Owner\Documents\gr_ppexp000nas.gph saved)
[Figure "Regression Lines - All Districts": pct4col (y-axis, 10 to 100) plotted against csize (x-axis, 16 to 26), with each point labeled by district name.]
. graph save gr_csizeall "C:\Users\Owner\Documents\gr_csizeall.gph"
(file C:\Users\Owner\Documents\gr_csizeall.gph saved)
(Graph: scatterplot of pct4col, 10 to 100, against csize, 16 to 26, with points labeled by district name for all Suffolk and Nassau districts. Title: Regression Lines - All Districts)
. graph save gr_csizeall2 "C:\Users\Owner\Documents\gr_csizeall2.gph"
(file C:\Users\Owner\Documents\gr_csizeall2.gph saved)
(Graph: scatterplot of pct4col, 10 to 100, against csize, 16 to 26, with points labeled by Suffolk County district name. Title: Regression Lines - Suffolk County)
. graph save gr_csizesuf "C:\Users\Owner\Documents\gr_csizesuf.gph"
(file C:\Users\Owner\Documents\gr_csizesuf.gph saved)
(Graph: scatterplot of pct4col, 10 to 100, against csize, 18 to 24, with points labeled by Nassau County district name. Title: Regression Lines - Nassau County)
. graph save gr_csizenas "C:\Users\Owner\Documents\gr_csizenas.gph"
(file C:\Users\Owner\Documents\gr_csizenas.gph saved)
Results and Discussion
The purpose of this project was to conduct a multiple regression analysis on a set of
selected variables from the 2004-05 New York State Education Department school district report
cards (http://nysed.gov). The goal was to determine whether certain independent variables in the
data predict the percentage of high school graduates who enroll in a four-year college within
one year of graduation. An important factor in judging the effectiveness of a high school's
educational procedures and curriculum is the ratio of its graduation rate to its post-secondary
enrollment ( ). The New York State Report Cards annual report provides information on the
“enrollment, demographic, attendance, suspension, dropout, teacher, assessment, accountability,
graduation rate, post-graduate plan, career and technical education, and fiscal data for public
and charter schools, districts, and the State” ( ). From the report, a set of seven research
questions was developed to be answered by multiple regression analysis. The questions focus on
the school district factors that affect the rates at which students enroll in four-year colleges
immediately after high school. The districts of interest were those in Nassau and Suffolk
counties, which together account for 92 school districts in the data. The questions are
addressed later in this report based on the analysis of the data.
Regression Method
The data submitted to the New York State Education Department’s report cards were collected
from all state school districts and their reporting officials: school district officials, school
superintendents, and school principals. Those officials were given every opportunity to review,
verify, and correct the reported data. The data on graduation rates refer to a specific
graduating class, designated as a cohort, to which all references are made. This project
analyzed the 2004-2005 cohort, specifically the percentage of its graduates who enrolled in a
four-year college within one year of graduation. The school districts of interest were those in
Nassau County and Suffolk County. These two counties contain n = 92 separate reporting school
districts, which are the observations in the data.
The initial data set contained 22 variables; some of these variables had incomplete
information for some districts and were therefore excluded. A multiple regression analysis was
conducted with the following independent variables (IVs) as predictors: County (co), Enrollment
in 1000s (enroll000), Per Pupil Expenditures in $1000s (ppexp000), Average Percent Attendance
(attend), Average Class Size (csize), Median Teacher Workforce Experience in Years
(medtchexp), Average Pass Rate 4th Grade English and Math Regents (pass4), and Average Pass
Rate English, Math, and Social Studies Regents (passregents). Percent Graduates Who Go to
College in Year After Graduation (pct4col) was the dependent variable (DV).
First, the F-test is statistically significant, hence the overall model is statistically
significant [F (8, 83) = 19.24, p < .00005] (see Table 1). The model produced an R-squared of
.6497, which means that approximately 65% of the variance in pct4col is accounted for by the
model. The overall F-test gives good confidence that the results are not merely due to chance.
However, this does not indicate that every IV is significant; some may even warrant removal
later. Because the F statistic and its p-value (Prob > F) show that the overall regression model
is statistically significant, we can use the model and interpret its results. The following
discussion focuses on the eight independent predictors that together account for 65.0%
(Table 1) of the variance in the percent of high school graduates who go on to college in the
year after their high school graduation in Nassau and Suffolk counties when all 92 school
districts are taken into account.
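The overall-fit figures quoted above can be re-derived from the sums of squares printed in Table 1. The following is a minimal sketch, assuming only the values shown in that table; the scalar names are illustrative and are not part of the original do-file.

```stata
* Values copied from the ANOVA portion of Table 1
scalar ss_model = 15503.619
scalar ss_total = 23861.739
scalar ss_resid = 8358.1203
scalar df_model = 8
scalar df_resid = 83

* R-squared: share of the total variation accounted for by the model
scalar r2 = ss_model / ss_total

* F statistic: model mean square divided by residual mean square
scalar f_stat = (ss_model / df_model) / (ss_resid / df_resid)

* Upper-tail probability of that F value under the null hypothesis
scalar p_value = Ftail(df_model, df_resid, f_stat)

display "R-squared = " %6.4f r2
display "F(8, 83)  = " %6.2f f_stat
display "Prob > F  = " %12.0g p_value
```

Stata prints Prob > F as 0.0000 whenever the p-value rounds to zero at four decimal places, which is why the text reports it as p < .00005.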
(Insert Table 1 Approximately Here)
In the Stata regression shown in (Table 1), the predicted regression equation is:
pct4col = -200.81 + 0.03*enroll000 + 2.08*ppexp000 + 0.26*csize + 1.57*attend + 0.25*pass4
+ 0.75*passregents - 1.32*medtchexp + 1.07*co
In this equation the constant, -200.81, is the predicted percent of graduates who go to college
in the year after high school graduation (pct4col) when all independent variables are held at
zero. Because no district has zero attendance or zero expenditures, the constant has no
practical interpretation on its own. Each remaining coefficient gives the predicted change in
pct4col for a one-unit increase in that independent variable while the other independent
variables are held constant. (Table 1) reports these multiple regression coefficients, and
(Table 3) reports the coefficients obtained when each predictor is entered alone in a simple
linear regression. For this project the results from (Table 1) are discussed at length. They
show that the dependent variable (pct4col) is predicted to:
Increase by 0.03 units when enrollment in 1000s (enroll000) goes up by one unit.
Increase by 2.08 units when per pupil expenditures in $1000s (ppexp000) goes up by one unit.
Increase by 0.26 units when the average class size (csize) goes up by one unit.
Increase by 1.57 units when the average percent attendance (attend) goes up by one unit.
Increase by 0.25 units when the average pass rate on the 4th grade English and Math Regents
(pass4) goes up by one unit.
Increase by 0.75 units when the average pass rate on the English, Math, and Social Studies
Regents (passregents) goes up by one unit.
Decrease by 1.32 units when the median teacher workforce experience in years (medtchexp)
goes up by one unit.
Increase by 1.07 units when the county variable (co) goes up by one unit, that is, in moving
from the county with the lower code to the county with the higher code.
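To show how the coefficients combine into a prediction, the sketch below evaluates the regression equation for one district. The coefficients are copied from Table 1; the district values (the x_ scalars) are hypothetical, chosen only for illustration, and do not describe any real district in the data. The `///` line continuation assumes the code is run from a do-file.

```stata
* Coefficients copied from Table 1 (multiple regression)
scalar b_cons        = -200.811
scalar b_enroll000   = 0.0302549
scalar b_ppexp000    = 2.082902
scalar b_csize       = 0.2630747
scalar b_attend      = 1.570267
scalar b_pass4       = 0.2538726
scalar b_passregents = 0.7451234
scalar b_medtchexp   = -1.3152
scalar b_co          = 1.073157

* Hypothetical district values, for illustration only
scalar x_enroll000   = 5
scalar x_ppexp000    = 18
scalar x_csize       = 21
scalar x_attend      = 95
scalar x_pass4       = 85
scalar x_passregents = 80
scalar x_medtchexp   = 12
scalar x_co          = 1

* Predicted pct4col: the constant plus each coefficient times its value
scalar yhat = b_cons + b_enroll000*x_enroll000 + b_ppexp000*x_ppexp000 ///
    + b_csize*x_csize + b_attend*x_attend + b_pass4*x_pass4 ///
    + b_passregents*x_passregents + b_medtchexp*x_medtchexp + b_co*x_co

* Same district with ppexp000 raised by one unit, all else unchanged
scalar yhat_more = yhat + b_ppexp000

display "Predicted pct4col             = " %6.2f yhat
display "Change for +1 unit of ppexp000 = " %6.2f (yhat_more - yhat)
```

The second display line returns the ppexp000 coefficient itself, which is what "holding the other variables constant" means in practice.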
(Insert Table 2 Approximately Here)
(Insert Table 3 Approximately Here)
We now focus more precisely on the eight predictors: whether each is statistically
significant and, if so, the direction of its relationship to the dependent variable, the percent
of high school graduates who go on to a four-year college in the year after graduation
(pct4col). This is done by answering the seven research questions posed in this class project
of multiple regression analysis. The research questions are addressed below using the output
shown in the Stata section that precedes this report:
Research Question 1
Enrollment in 1000s (enroll000, b = 0.030) is not significant (p = 0.940). The coefficient is
positive, which would indicate that larger district enrollment is related to a higher percentage
of that district's graduates going on to a four-year college in the year after graduation.
Because the coefficient is close to zero and not significant, however, the model provides no
evidence that enrollment size predicts college enrollment once the other predictors are held
constant.
Research Question 2
Budgetary resources, measured as per pupil expenditures in $1000s (ppexp000, b = 2.083), are
highly significant (p < 0.001). The coefficient is positive, which indicates that greater
spending per pupil is related to a higher percentage of a district's graduates going on to a
four-year college in the year after graduation, which is what we would expect.
Research Question 3
The average class size in a school district (csize, b = 0.263) is not significant (p = 0.734).
The coefficient is positive, which would indicate that larger classes are related to a higher
percentage of a district's graduates going on to a four-year college in the year after
graduation. This is the opposite of what we would expect, since smaller classes allow a lower
ratio of students per teacher and more individualized instruction. Because the coefficient is
not significant, however, the model provides no evidence that class size predicts college
enrollment in either direction once the other predictors are held constant.
Research Question 4
The average percent attendance in a school district (attend, b = 1.570) is not significant at
the .05 level (p = 0.090). The coefficient is positive, which would indicate that a higher
overall attendance rate is related to a higher percentage of a district's graduates going on to
a four-year college in the year after graduation, which is what we would expect: with higher
average attendance, students as a whole receive more in-class instruction time, allowing for
greater retention and understanding of instructional materials.
(Insert Graph 1 Approximately Here)
Research Question 5
The academic preparation of a school district's students is represented by two of the
independent predictors:
The average pass rate on the 4th grade English and Math Regents (pass4, b = 0.254) is not
significant (p = 0.281). The coefficient is positive, which would indicate that a higher average
pass rate is related to a higher percentage of a district's graduates going on to a four-year
college in the year after graduation, which is what we would expect.
The average pass rate on the English, Math, and Social Studies Regents (passregents, b = 0.745)
is highly significant (p < 0.001). The coefficient is positive, which indicates that a higher
average pass rate on these Regents exams is related to a higher percentage of a district's
graduates going on to a four-year college in the year after graduation, which is what we would
expect. This result is consistent with students having received proper instructional
preparation and educational motivation for advancement to post-secondary education.
Research Question 6
The median teacher workforce experience in years (medtchexp, b = -1.315) is statistically
significant at the .05 level (p = 0.021). The coefficient is negative, which indicates that,
with the other predictors held constant, a more experienced teacher workforce is related to a
lower percentage of a district's graduates going on to a four-year college in the year after
graduation. By contrast, in the simple regression (Table 3) the coefficient is positive and not
significant (b = 0.945, p = 0.171). Taken together, these results suggest that longer teacher
experience does not raise college enrollment and may be negatively related to it once the other
predictors are accounted for; this is a somewhat unexpected result.
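The significance test behind a single coefficient can be checked by hand. The sketch below, using the medtchexp coefficient and standard error from Table 1 and the 83 residual degrees of freedom, recomputes the t statistic, the two-sided p-value, and a 95% confidence interval; the scalar names are illustrative.

```stata
* Coefficient and standard error for medtchexp, copied from Table 1
scalar b_medtchexp  = -1.3152
scalar se_medtchexp = 0.55888
scalar df_resid     = 83

* t statistic: coefficient divided by its standard error
scalar t_stat = b_medtchexp / se_medtchexp

* Two-sided p-value from the t distribution with 83 degrees of freedom
scalar p_two = 2 * ttail(df_resid, abs(t_stat))

* 95% confidence interval for the coefficient
scalar t_crit = invttail(df_resid, 0.025)
scalar ci_lo  = b_medtchexp - t_crit * se_medtchexp
scalar ci_hi  = b_medtchexp + t_crit * se_medtchexp

display "t = " %6.2f t_stat "   p = " %6.3f p_two
display "95% CI: " %7.3f ci_lo " to " %7.3f ci_hi
```

A p-value below .05 and a confidence interval that excludes zero are two views of the same result, so either can be used to judge significance at the .05 level.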
Research Question 7
The county in which a district is located (co, b = 1.073) is not significant (p = 0.678). The
coefficient is positive, which would indicate that the county with the higher code has a
slightly higher percentage of graduates going on to a four-year college in the year after
graduation. Any expectation about the county variable would rest on assuming that these two
counties are affluent relative to the other counties in New York State, and without that wider
comparison the assumption cannot be tested with the present data set. Once the other predictors
are held constant, the two counties appear to send similar percentages of their graduates on to
college in the year after graduation.
(Insert Graph 2 Approximately Here)
(Insert Graph 3 Approximately Here)
Conclusion
From these results, based on the data set presented, one would conclude that the most
statistically significant predictors were ppexp000 and passregents (both p < 0.001 in Table 1).
Both were positively related to the percentage of a district's high school graduates who enroll
in college within one year of graduation. Median teacher experience (medtchexp) was also
significant at the .05 level (p = 0.021), with a negative coefficient. The remaining predictors
(enroll000, csize, attend, pass4, and co) had positive coefficients but were not statistically
significant at the .05 level; with them in the model, the adjusted R-squared is approximately
62%, compared with an R-squared of approximately 65%. This still leaves approximately 35% of the
variance in the dependent variable unexplained, to be considered in future studies of the
percentage of high school graduates who enroll in college within a year after graduation. One
might look to the socio-economic backgrounds of the student population (for example, a family's
ability to pay for a post-secondary college education) for other predictors that might prove
statistically significant. It is clear that this area of educational data has much further
investigative analysis ahead.
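The adjusted R-squared and the unexplained share of variance mentioned above follow directly from the R-squared, the number of districts, and the number of predictors. A minimal sketch, assuming the rounded values reported in Table 1 (scalar names are illustrative):

```stata
* Figures reported in Table 1
scalar r2     = 0.6497
scalar n_obs  = 92
scalar k_pred = 8

* Adjusted R-squared penalizes R-squared for the number of predictors
scalar adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - k_pred - 1)

* Share of the variance in pct4col that the model leaves unexplained
scalar unexplained = 1 - r2

display "Adjusted R-squared = " %6.4f adj_r2
display "Unexplained share  = " %6.4f unexplained
```

Because the input R-squared is already rounded to four places, the result may differ from the printed Adj R-squared in the last digit.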
References
Table 1
Multiple Regression Percent High School Graduates Who Go to College Year after Graduation
Source      SS          df   MS           Number of obs = 92
                                          F ( 8, 83)    = 19.24
Model       15503.619    8   1937.9524    Prob > F      = 0.0000
Residual    8358.1203   83   100.70025    R-squared     = 0.6497
                                          Adj R-squared = 0.6160
Total       23861.739   91   262.21691    Root MSE      = 10.035

pct4col       Coef.        Std. Err.   t       P>|t|   Beta
enroll000     0.0302549    0.40376     0.07    0.940    0.0056507
ppexp000      2.082902     0.50556     4.12    0.000    0.4054772
csize         0.2630747    0.77253     0.34    0.734    0.0272997
attend        1.570267     0.91467     1.72    0.090    0.1369906
pass4         0.2538726    0.23373     1.09    0.281    0.1113964
passregents   0.7451234    0.12802     5.82    0.000    0.560986
medtchexp     -1.3152      0.55888     -2.35   0.021   -0.2003755
co            1.073157     2.57347     0.42    0.678    0.0330332
_cons         -200.811     83.3969     -2.41   0.018    .
Table 2
Correlation of Variables
              pct4col   enr~00    ppe~00    csize     attend    pass4     pas~nts   med~xp    co
pct4col       1.0000
enroll000     -0.2056   1.0000
ppexp000      0.3483    -0.4382   1.0000
csize         -0.1872   0.3458    -0.559    1.0000
attend        0.5157    -0.2114   0.1008    -0.0562   1.0000
pass4         0.5975    -0.161    0.1795    -0.1476   0.4946    1.0000
passregents   0.7121    -0.1154   0.0627    -0.0235   0.5254    0.6771    1.0000
medtchexp     0.144     -0.3701   0.6034    -0.2772   0.0659    0.2117    0.115     1.0000
co            0.2104    -0.1659   0.3786    -0.2116   0.1187    0.3951    0.0777    0.3656    1.0000
Table 3
Simple Linear Regression of All Independent Predictor Variables

Source      SS           df   MS            Number of obs = 92
Model       1008.3133     1   1008.31333    Prob > F      = 0.0493
Residual    22853.4258   90   253.926953    R-squared     = 0.0423
                                            Adj R-squared = 0.0316
Total       23861.7391   91   262.216914    Root MSE      = 15.935
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
enroll000     -1.100622    0.5523253    -1.99   0.049   -0.2055638
_cons         68.35374     3.04827      22.42   0.000    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 12.43
Model       2895.02137    1   2895.02137    Prob > F      = 0.0007
Residual    20966.7178   90   232.963531    R-squared     = 0.1213
                                            Adj R-squared = 0.1116
Total       23861.7391   91   262.216914    Root MSE      = 15.263
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
ppexp000      1.789275     0.5075692    3.53    0.001    0.3483171
_cons         32.97287     8.738007     3.77    0.000    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 3.27
Model       835.986169    1   835.986169    Prob > F      = 0.0740
Residual    23025.753    90   255.8417      R-squared     = 0.0350
                                            Adj R-squared = 0.0243
Total       23861.7391   91   262.216914    Root MSE      = 15.995
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
csize         -1.803723    0.9978284    -1.81   0.074   -0.1871752
_cons         102.0017     21.4964      4.75    0.000    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 32.61
Model       6346.29455    1   6346.29455    Prob > F      = 0.0000
Residual    17515.4446   90   194.616051    R-squared     = 0.2660
                                            Adj R-squared = 0.2578
Total       23861.7391   91   262.216914    Root MSE      = 13.95
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
attend        5.91142      1.035192     5.71    0.000    0.5157142
_cons         -497.9385    98.28651     -5.07   0.000    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 49.96
Model       8518.01997    1   8518.01997    Prob > F      = 0.0000
Residual    15343.7192   90   170.485768    R-squared     = 0.3570
                                            Adj R-squared = 0.3498
Total       23861.7391   91   262.216914    Root MSE      = 13.057
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
pass4         1.36142      0.1926361    7.07    0.000    0.597473
_cons         -57.31848    17.113       -3.35   0.001    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 92.56
Model       12098.4906    1   12098.4906    Prob > F      = 0.0000
Residual    11763.2485   90   130.702761    R-squared     = 0.5070
                                            Adj R-squared = 0.5015
Total       23861.7391   91   262.216914    Root MSE      = 11.433
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
passregents   0.9457814    0.0983032    9.62    0.000    0.7120567
_cons         -18.73084    8.605051     -2.18   0.032    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 1.90
Model       494.508029    1   494.508029    Prob > F      = 0.1710
Residual    23367.2311   90   259.635901    R-squared     = 0.0207
                                            Adj R-squared = 0.0098
Total       23861.7391   91   262.216914    Root MSE      = 16.113
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
medtchexp     0.9448936    0.6846658    1.38    0.171    0.1439579
_cons         52.72325     7.818131     6.74    0.000    .

Source      SS           df   MS            Number of obs = 92
                                            F ( 1, 90)    = 4.17
Model       1056.69105    1   1056.69105    Prob > F      = 0.0441
Residual    22805.0481   90   253.389423    R-squared     = 0.0443
                                            Adj R-squared = 0.0337
Total       23861.7391   91   262.216914    Root MSE      = 15.918
pct4col       Coef.        Std. Err.    t       P>|t|   Beta
co            6.836538     3.347777     2.04    0.044    0.2104374
_cons         53.45192     5.081951     10.52   0.000    .
Graph 1
Scatterplot Matrix for All Variables
(Graph: scatterplot matrix of % Grads Who Go to College in Year after Graduation; Enrollment in 1000s; Per Pupil Expenditures in $1000s; Average Class Size; Average Percent Attendance; Avg Pass Rate 4th Grade Eng & Math Regents; Avg Pass Rate Eng Math SocStud Regents; Median Teacher Workforce Experience in Years; County)
Graph 2
Scatterplots of Linearity Observations Enrollment, Expenditures, Class Size, & Attendance
(Graph: four scatterplot panels with a vertical axis from 0 to 100 and horizontal axes enroll000, ppexp000, csize, and attend. Title: Review of Linearity)
Graph 3
Scatterplots of Linearity Observations 4th Grade Regents Pass Rate, All Regents Pass Rate,
Median Teacher Workforce Experience, & County
(Graph: four scatterplot panels with a vertical axis from 0 to 100 and horizontal axes pass4, passregents, medtchexp, and co, the last labeled Suffolk Co. and Nassau Co. Title: Review of Linearity)