Project Group 3-Report
7/31/2019 Project Group 3-Report
1/44
UCSD Extension
(March 12, 2012)
Group Members:
Hernandez, Agnieszka; Mathew, Babu; De Leon, Luis; Yanamandra, Shanthi; Olin, Thomas.
Introduction
in into
account as a variable when making a product purchase decision, including the preference to
choose U.S. products as a way of supporting the national economy and national development. This
report investigates whether deciding to support U.S.-origin products, such as
cars, is in general a cost-effective decision. With that information, we also form a
hypothesis and determine whether there is a strong relation between car price and car fuel economy
(MPG).
Our team chose these analysis topics because they allowed us to perform all three types of
analysis required. Additionally, we judged that this is information the
audience of automobile magazines would consider current and valuable. It includes market trends
and the factors that matter to consumers when buying domestic or
foreign-manufactured cars.
In the first section we examine whether the populations of US-origin and non-US-origin
cars are homogeneous. This information is used later to interpret the results of the
analysis performed in the second section, where the two population means are compared. The outcome of
the homogeneity check helps us understand whether the two populations contain similar
proportions of each car type. This step is undertaken because
both populations consist of various types of cars, and car type affects the
dependent variable of the study, car price. This step would not be needed (from the topic selection
perspective) if we had a large enough sample within each car type segment, because
homogeneity would then not be in question.
In the second section we examine whether the average price of US-origin cars and non-US-origin
cars differs significantly, with the assumption to be verified that American cars
are less expensive than non-US cars. That information helps us determine whether buying a U.S.-origin
car is an effective decision from the price perspective alone, without taking other
important factors into account including: fuel-economy, reliability etc. Customer most likely
s.
The results of our analysis will help us, as well as future car customers, make informed
decisions based on
comparison
In the third section we examine other potential car price factors (besides car origin)
that might affect car prices in general. The selected explanatory variable is
city fuel economy (in the first regression model) and highway fuel economy
(in the second regression model). As mentioned earlier, there are possibly many more car
price factors to be taken into consideration, such as dependability (car reliability, customer
satisfaction), but the selection of these particular explanatory variables is based on data availability
in the given data set. We examine how strong a correlation exists between city fuel
economy and price, and between highway fuel economy and price. We do so by performing
regression analysis and verifying what percentage of the variation in car price the chosen variable
explains.
Data Discussion:
We did some data transformations before working on the different analyses. To maintain
consistency, we have used the same data for the whole report. Here is a detailed discussion as to
what was done, and the data is in the appendix.
Various types of cars that are based in US and Non-US are shown below.
Type       US   Non-US
Compact     7     9
Mid-Size   10    11
Small       7    14
Sporty      8     6
Van         5     4
Large      11     0
From the table it is clear that the Non-US sample contains no Large cars, so if the Large cars were kept the
expected outcome could be biased. The Large cars are therefore excluded.
Similarly, since the sample sizes are not the same, to make the study less biased we need to
make the sample sizes equal for each type of car. This can be accomplished by doing
the following:
Removing two (2) randomly selected Compact cars from the non-US sample,
Removing one (1) randomly selected Midsize car from the non-US sample,
Removing seven (7) randomly selected Small cars from the non-US sample,
Removing two (2) randomly selected Sporty cars from the US sample,
Removing one (1) randomly selected Van from the US sample.
Once these changes are made, the new sample distribution looks as follows:
Type       US   Non-US
Compact     7     7
Mid-Size   10    10
Small       7     7
Sporty      6     6
Van         4     4
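The balancing step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the car records are hypothetical placeholders (the real records are in the appendix), the function name `balance` and the seed are our own choices, and the Non-US Compact count of 9 is the one used in the Non-US frequency table.

```python
import random

# Counts per car type before balancing (Large omitted: no Non-US Large cars).
us_counts = {"Compact": 7, "Mid-Size": 10, "Small": 7, "Sporty": 8, "Van": 5}
non_us_counts = {"Compact": 9, "Mid-Size": 11, "Small": 14, "Sporty": 6, "Van": 4}


def balance(cars_a, cars_b, rng):
    """Randomly keep the same number of cars per type in both samples.

    cars_a and cars_b map a car type to a list of car records.
    """
    out_a, out_b = {}, {}
    for car_type in cars_a:
        keep = min(len(cars_a[car_type]), len(cars_b[car_type]))
        out_a[car_type] = rng.sample(cars_a[car_type], keep)
        out_b[car_type] = rng.sample(cars_b[car_type], keep)
    return out_a, out_b


# Hypothetical placeholder records standing in for the appendix data.
us = {t: [f"US-{t}-{i}" for i in range(n)] for t, n in us_counts.items()}
non_us = {t: [f"NonUS-{t}-{i}" for i in range(n)] for t, n in non_us_counts.items()}

us_bal, non_us_bal = balance(us, non_us, random.Random(0))
balanced_counts = {t: len(cars) for t, cars in us_bal.items()}
print(balanced_counts)
```

Each origin ends up with 34 cars, whichever individual cars the random draw removes.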
NOTE: A significance level of 0.05 has been chosen throughout the report, the reason
being that we thought the consequences of making a Type I error were not severe.
Q1: Chi-Square test of Homogeneity between types of cars for U.S. and Non-U.S. origin cars
Descriptive Statistics:
The study focuses on car types by American and Non-American origin. In the
descriptive statistics, we look at the distribution of the various car types by
origin. The table and chart below give the frequency distribution of the various types of
Non-U.S. origin cars. The sample data is given in the appendix (Table 1.1 & Table 2.1).
For the sake of analysis, descriptive information has been translated into a numerical equivalent
and the translation is given below.
(Table 1) Car Types and Descriptions
Car Type   Description
1          Compact
2          Small
3          Mid Size
4          Sporty
5          Van
6          Large

Non-U.S. origin cars:
Type of Car   Frequency   Cumulative %
1              9           20.45%
2             14           52.27%
3             11           77.27%
4              6           90.91%
5              4          100.00%
6              0          100.00%
From the frequency diagram we can see that the largest group of non-U.S. cars belongs to Type 2,
which is the Small type (from Table 1 above). It is also evident that about 77% of the cars fall
under the first three types (i.e., Compact, Small and Mid Size). From this we infer that
non-U.S. car manufacturers focus more on the small car segment.
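The cumulative percentages in the frequency table can be reproduced with a few lines of Python. This is a minimal sketch; the function name `cumulative_percent` is our own, and the counts are the Non-U.S. frequencies for car types 1 to 6.

```python
from itertools import accumulate

# Car type codes 1-6 from Table 1 and the Non-U.S. frequencies.
car_types = ["Compact", "Small", "Mid Size", "Sporty", "Van", "Large"]
non_us_freq = [9, 14, 11, 6, 4, 0]


def cumulative_percent(freq):
    """Return the running total of freq as a percentage of the grand total."""
    total = sum(freq)
    return [round(100 * running / total, 2) for running in accumulate(freq)]


non_us_cum = cumulative_percent(non_us_freq)
for name, count, cum in zip(car_types, non_us_freq, non_us_cum):
    print(f"{name:<9}{count:>3}{cum:>9.2f}%")
```

The third entry (77.27%) is the share of cars covered by the first three types.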
[Figure: Non US Originated Cars. Frequency and cumulative % by car type (1 to 6).]
The descriptive data for U.S. based cars is given below.
Car Type   Frequency   Cumulative %
1           7           14.58%
2           7           29.17%
3          10           50.00%
4           8           66.67%
5           5           77.08%
6          11          100.00%
From the frequency distribution above, the type of car with the highest
frequency is Type 6, which is the Large type (from
Table 1 above). From the cumulative frequency distribution, the first three types of cars
(Compact, Small and Mid Size) cover only 50% of the cars. The other 50% is distributed among the
Sporty, Van and Large types. This could mean that US-based manufacturers are
either covering the entire spectrum of the vehicle market or have no real focus on any given
segment.
Inferential Statistics: CHI-SQUARE TEST
Because we will be comparing the average price of American cars versus non-
American cars, we want to verify whether both data sets have the same proportion of elements with the
same characteristics, as described by car type. The car type proportion within each population is
important to consider, as car type characterizes the car by a specific set of features, such as car
size or engine size, that heavily affect car prices. For that purpose we employ the
chi-square test to verify the homogeneity of the American and Non-American car populations based
on car type.
[Figure: US based Cars. Frequency and cumulative % by car type (1 to 6).]
Based on the available sample and the subject of our study, we want to verify that the American
and Non-American populations are homogeneous, that is, that they have the same proportions of
each car type. The chi-square test of homogeneity is applied to the sampled
car types.
The test hypothesis is as follows:
H0: American cars and Non-American cars have the same proportion of cars types in their
populations.
H1: American cars and Non-American cars have different proportion of cars types in their populations.
The count of cars by type for each population is summarized in the contingency table below:
Car Type   American Cars   Non-American Cars   Total
Compact     7               9                  16
Large      11               0                  11
Midsize    10              11                  21
Small       7              14                  21
Sporty      8               6                  14
Van         5               4                   9
Total      48              44                  92
The table below gives the observed and expected frequencies used to calculate the test statistic:
           Observed                  Expected
Car Type   American   Non-American   American   Non-American
Compact     7          9              8.3        7.7
Large      11          0              5.7        5.3
Midsize    10         11             11.0       10.0
Small       7         14             11.0       10.0
Sporty      8          6              7.3        6.7
Van         5          4              4.7        4.3
where the expected count of observations in a given category for a given sample is calculated using the
following formula:
$E = \frac{(\text{row total}) \times (\text{column total})}{\text{table total}}$
Given the expected counts, all applicable test requirements are satisfied:
1. All expected counts are greater than or equal to 1, and
2. No more than 20% of the expected frequencies are less than 5 (2 of 12, about 17%).
Considering the seriousness of making a Type I error, we assume the level of significance $\alpha = 0.05$.
The reason being we thought that the consequences of
making a Type I error were not severe.
The test statistic is as follows:
$\chi^2_0 = \sum \frac{(O_i - E_i)^2}{E_i} \approx 13.88$
(Classical Approach)
There are r = 6 rows and c = 2 columns, so the test statistic has a $\chi^2$ distribution with (6-1)(2-1) = 5 degrees of freedom.
Considering the fact that this is a right-tailed test, the critical value at the assumed level of
significance $\alpha = 0.05$ with 5 degrees of freedom is 11.070.
Because the test statistic, 13.88, is greater than the critical value, 11.070, we reject the null
hypothesis.
(P-Value Approach):
The P-value is the area under the $\chi^2$ distribution with 5 degrees of freedom to the right of $\chi^2_0 = 13.88$.
Using the $\chi^2$ distribution table, in the row that corresponds to 5 degrees of freedom the
value 13.88 lies between 12.833 and 15.086. The area under the distribution to the right of 12.833 is 0.025.
The area under the distribution to the right of 15.086 is 0.01.
So 0.01 < P-value < 0.025, which is less than the level of significance $\alpha = 0.05$, and we reject the null hypothesis.
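As a cross-check, the chi-square statistic can be recomputed directly from the counts in the contingency table (with all 11 Large cars in the American column and none in the Non-American column). This is a sketch in plain Python, not the spreadsheet calculation used for the report: the function name is our own, and the critical value 11.070 for 5 degrees of freedom is hard-coded from a standard chi-square table rather than computed.

```python
# Observed counts per car type: (American, Non-American).
observed = {
    "Compact": (7, 9),
    "Large": (11, 0),
    "Midsize": (10, 11),
    "Small": (7, 14),
    "Sporty": (8, 6),
    "Van": (5, 4),
}


def chi_square_homogeneity(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    expected = {}
    for name, row in table.items():
        row_total = sum(row)
        # Expected count = row total * column total / table total.
        exp_row = [row_total * col / grand_total for col in col_totals]
        expected[name] = exp_row
        statistic += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


chi2, dof, expected = chi_square_homogeneity(observed)
CRITICAL_5_DF = 11.070  # chi-square table value, alpha = 0.05, 5 df (assumed constant)
print(f"chi-square = {chi2:.2f}, df = {dof}, reject H0: {chi2 > CRITICAL_5_DF}")
```

The same function can be reused on a table without the Large row to examine homogeneity among the remaining car types.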
Q2: Hypothesis test and CI for two Means
Test to see if Average price of U.S. car is less than the average cost of Non U.S. car
Description:
A sample data set was provided about cars of various origins; it
includes details about price, mileage and many other characteristics. We conduct a
test to check whether the average cost of Non US originated cars is higher than the average cost of
US originated cars.
We are testing the hypothesis regarding the difference of two means from independent samples.
To perform the test without knowing population standard deviation, we verified that the
following requirements were met:
1. The samples were obtained using simple random sampling;
2. The samples are independent;
3. Both samples sizes are greater than 30.
The hypothesis here is
H0: Average cost of Non US Originated cars = Average cost of US originated Cars
H1: Average cost of Non US Originated cars > Average cost of US originated Cars
Using Origin as a grouping variable, it is possible to distinguish between cars made in the US and
outside of the US. For the purpose of this study, we compare the
prices of the cars by Origin.
Descriptive Statistics:
The number of cars in each price range is shown below. For ease of calculation,
the midpoint of each price range was identified, and the distribution is given by the midpoint price of
each range.
Frequency & Histogram information for NON US cars
Mid-Price Point   Frequency   Cumulative %
 9                 3            8.82%
14                 7           29.41%
19                 5           44.12%
24                 9           70.59%
29                 3           79.41%
34                 4           91.18%
39                 2           97.06%
50                 1          100.00%
For Non-US cars, the highest number of cars (regardless of type, such as
compact, mid-size, etc.) have a price midpoint of 24, followed by 14.
[Figure: Non US Based Car Prices. Frequency and cumulative % by price bin (9 to 50).]
Frequency & Histogram information for US cars
Mid-Point Price   Frequency   Cumulative %
 9                 2            5.88%
14                11           38.24%
19                13           76.47%
24                 3           85.29%
29                 2           91.18%
34                 0           91.18%
39                 2           97.06%
50                 1          100.00%
In the case of U.S. based cars, most of the cars (regardless of car type, such as compact, mid-
size, etc.) fall into the 19 price point, followed by the 14 price point.
[Figure: US based Car price. Frequency and cumulative % by mid-point pricing (9 to 50).]
Summary statistics
Price Stats          Non-U.S. cars   U.S. cars
Minimum                8.30            7.30
First Quartile        12.85           11.60
Median                19.30           15.65
Third Quartile        27.53           18.88
Maximum               47.90           40.10
Mode                  19.10           11.10
IQR                   14.68            7.28
Lower Fence           -9.16            0.69
Upper Fence           49.54           29.79
Sample Size           34              34
Mean                  20.82           17.04
Standard Deviation     9.56            7.76
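The IQR and fence rows of the summary table follow from the quartiles. The sketch below recomputes them from the rounded quartiles shown in the table, so the results can differ from the tabulated fences by about 0.01; the function name `tukey_fences` is our own.

```python
def tukey_fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Quartiles taken from the summary statistics table.
non_us_low, non_us_high = tukey_fences(12.85, 27.53)
us_low, us_high = tukey_fences(11.60, 18.88)

# A maximum above the upper fence signals at least one high-price outlier.
non_us_has_high_outlier = 47.90 > non_us_high
us_has_high_outlier = 40.10 > us_high

print(f"Non-U.S. fences: {non_us_low:.2f} to {non_us_high:.2f}")
print(f"U.S. fences:     {us_low:.2f} to {us_high:.2f}")
```

With these quartiles, the U.S. maximum of 40.10 lies above its upper fence, while the Non-U.S. maximum of 47.90 does not.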
Inferential Statistics:
For the purpose of the analysis, the following statistics are used (the sample data is provided
in the appendix).
Step 1: Let $\mu_1$ represent the mean price of the Non US based cars and $\mu_2$ represent the mean price of
the US based cars. We are trying to show that $\mu_1 > \mu_2$.
With that:
H0: $\mu_1 = \mu_2$, or H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 > \mu_2$, or H1: $\mu_1 - \mu_2 > 0$
Step 2: The level of significance is $\alpha = 0.05$. The reason being we thought that the consequences
of making a Type I error were not severe.
Step 3: The sample statistics are given in the summary table above. The test statistic is
$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
$= \frac{(20.821 - 17.044) - 0}{\sqrt{9.56^2/34 + 7.756^2/34}}$
= 3.777 / 2.11
= 1.79
Step 4: (Classical Approach) This is a right-tailed test. Since the size of both samples is 34, we
use 34 - 1 = 33 degrees of freedom. With that we need to find $t_{0.05}$ with 33 degrees of
freedom. The critical value is 1.692.
Step 5: Since the test statistic, 1.79, is greater than the critical value, 1.692, we reject the null hypothesis.
Step 4: (P-Value Approach) Since this is a right-tailed test, the P-value is the area to the right
of $t_0 = 1.79$. Using the t-distribution table with 33 degrees of freedom, we find that 1.79 lies between
1.692 and 2.035.
Step 5: That is, 0.025 < P < 0.05.
Since the P-value is less than $\alpha = 0.05$, we reject the null hypothesis.
Step 6: There is sufficient evidence to conclude that the mean price of Non US originated cars
is greater than the mean price of US originated cars.
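The test statistic in Step 3 can be recomputed from the summary statistics alone. This is a minimal sketch: the function name `two_sample_t` is our own, and the critical value 1.692 (right tail, 33 degrees of freedom) is taken from the text rather than computed.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t statistic, standard error) for H0: mu1 - mu2 = 0,
    using unpooled sample standard deviations."""
    std_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / std_error, std_error


# Sample 1: Non US cars, sample 2: US cars (means, standard deviations, sizes).
t0, std_error = two_sample_t(20.821, 9.56, 34, 17.044, 7.756, 34)

T_CRITICAL = 1.692  # t table value for alpha = 0.05, 33 df, as used in Step 4
reject_h0 = t0 > T_CRITICAL

print(f"standard error = {std_error:.2f}, t0 = {t0:.2f}, reject H0: {reject_h0}")
```

Because the test is right-tailed, the null hypothesis is rejected only when the statistic exceeds the critical value.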
Computing Confidence Intervals
The goal is to find the confidence interval at the 95% level of confidence. With that we
use the following formula (Equation 3 from 11.2):
Lower Bound: $(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2} \sqrt{s_1^2/n_1 + s_2^2/n_2}$
Upper Bound: $(\bar{x}_1 - \bar{x}_2) + t_{\alpha/2} \sqrt{s_1^2/n_1 + s_2^2/n_2}$
Substituting the values in the equation:
Lower Bound: = (20.821 - 17.044) - 2.035 * sqrt((9.56 * 9.56 / 34) + (7.756 * 7.756 / 34))
= 3.777 - 2.035 * 2.11
= 3.777 - 4.294
= -0.517
Upper Bound: = (20.821 - 17.044) + 2.035 * sqrt((9.56 * 9.56 / 34) + (7.756 * 7.756 / 34))
= 3.777 + 2.035 * 2.11
= 3.777 + 4.294
= 8.071
Conclusion: We are 95% confident that the mean difference between the price of Non US cars
and that of US cars is between -0.517 and 8.071.
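The interval bounds can be checked the same way. In this sketch the function name `mean_diff_ci` is our own and the critical value 2.035 (two-sided 95%, 33 degrees of freedom) is taken from the text. The standard error is not rounded to 2.11 first, so the bounds differ from the hand calculation in the third decimal place.

```python
import math


def mean_diff_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Return (lower, upper) bounds of the confidence interval for mu1 - mu2."""
    std_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    margin = t_crit * std_error
    diff = mean1 - mean2
    return diff - margin, diff + margin


lower, upper = mean_diff_ci(20.821, 9.56, 34, 17.044, 7.756, 34, t_crit=2.035)
print(f"95% CI for the mean price difference: ({lower:.3f}, {upper:.3f})")
```

The two-sided interval contains zero even though the right-tailed test rejects at the 0.05 level; the two procedures place the 5% error differently (split between both tails versus entirely in one tail).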
Q3: Regression: Analysis of the Relationship between Mileage and Price of a car
The variables Price, City MPG and Highway MPG from the cars dataset have been analyzed to
see if there exists a linear relation between the price of a car and its mileage. The reason these
were picked is that if a customer understands this relationship, he will be able to make an
informed decision about the car choice based on its purchase price and associated fuel
expenses, and to plan for his total car expense. If he knows that expensive cars have good mileage, or
that expensive cars have poor mileage, or vice versa, he will be able to decide which car is a better
fit for him.
For consistency in the underlying data, we have used the dataset obtained by
removing the Large type cars and also a random sample of other cars. The modified dataset is as per the
explanation in the Data Discussion, and the raw data can be seen in the Appendix.
Least squares regression has been used to analyze the relationship between price and mileage.
The analysis is split into 2 cases: one with City MPG and the other with
Highway MPG. The explanatory variable in both cases is the mileage and the response variable
is the price. Depending on whether the car is used mostly in the city or on the highway, the customer can
consider the appropriate relationship.
The data used has been obtained by random sampling, and the analysis is performed with 68
observations. Below are the summary statistics:
Statistic            Price   City MPG   Highway MPG
Minimum               7.4    15         20
First Quartile       12.1    18         25
Median               16.4    22         28.5
Third Quartile       22.7    25         31.25
Maximum              47.9    46         50
Mode                 11.1    18         30
IQR                  10.6     7          6.25
Lower Fence          -3.8     8         16
Upper Fence          38.6    36         41
Sample Size          68      68         68
Mean                 18.93   22.41      29.07
Standard Deviation    8.85    5.38       5.44
The Outliers in Price are: 40.1, 47.9.
The Outliers in City MPG are: 39, 46.
The Outliers in Highway MPG are: 43, 50.
Distribution of the data - Price:
Price
Bin   Frequency   Cumulative %
 7     0            0.00%
14    23           33.82%
21    26           72.06%
28     8           83.82%
35     6           92.65%
42     4           98.53%
49     1          100.00%
[Figure: Histogram-Price. Frequency and cumulative % by price bin.]
City MPG:
Bin   Frequency   Cumulative %
14     0            0.00%
21    33           48.53%
28    26           86.76%
35     7           97.06%
42     1           98.53%
49     1          100.00%
[Figure: Histogram-City MPG. Frequency and cumulative % by City MPG bin.]
Highway MPG:
Bin   Frequency   Cumulative %
18     0            0.00%
22     5            7.35%
26    19           35.29%
30    21           66.18%
34    15           88.24%
38     5           95.59%
42     1           97.06%
46     1           98.53%
50     1          100.00%
[Figure: Histogram-Highway MPG. Frequency and cumulative % by Highway MPG bin.]
Inferential Statistics
Case 1: Least Squares Regression - Analyze relationship between City MPG and Price.
Below is the scatter plot with the line and the equation.
The equation of the line is:
y = -1.0538x + 42.55
Slope and Y-intercept:
The slope is -1.054, which means that if city mileage increases by 1 MPG, then the
price of the car decreases by 1.05 on average. The two variables are negatively associated.
The y-intercept is 42.55. As 0 MPG or values close to 0 for mileage do not make sense, there is
no interpretation for the y-intercept.
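To make the slope and intercept concrete, the sketch below shows how a least squares line is computed and how the fitted equation is used for prediction. The report's fit was done in Excel on the appendix data; the four data points here are hypothetical and were generated from the reported line only so the example is self-checking. The function names are our own.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_price(city_mpg, slope=-1.0538, intercept=42.55):
    """Predicted price from the reported line y = -1.0538x + 42.55."""
    return slope * city_mpg + intercept


# Hypothetical points lying exactly on the reported line (not the real data).
xs = [18, 22, 25, 30]
ys = [predict_price(x) for x in xs]
b1, b0 = least_squares(xs, ys)

print(f"slope = {b1:.4f}, intercept = {b0:.2f}")
print(f"predicted price at 20 MPG = {predict_price(20):.3f}")
```

Predictions are only meaningful inside the observed mileage range (15 to 46 City MPG), which is why the y-intercept at 0 MPG has no interpretation.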
Linear correlation coefficient and the coefficient of determination:
>> The linear correlation coefficient r is calculated (using Excel):
[Figure: Scatter Plot of Price against City MPG, with fitted line y = -1.0538x + 42.55 and R^2 = 0.4106.]
r = -0.64075
This implies a moderate negative linear relation between price and city mileage.
>> The coefficient of determination $R^2$ is also calculated in Excel:
$R^2 = 0.4106$
This implies that 41% of the variation in price is explained by the least squares regression
line.
Testing the Regression Model:
Step 1: Testing to make sure the requirements of the regression model are met.
Requirement 1: If the plot of residuals against the explanatory variable shows any
discernible pattern, then the linear model is not appropriate. Check the residuals plot below:
[Figure: Residuals Plot. Residuals against City MPG.]
The residuals do not follow any particular pattern, so our first requirement is not violated.
Requirement 2: The residuals have to be normally distributed. See the probability plot for the
residuals.
[Figure: Normal Probability Plot. Y against sample percentile.]
The residuals have a normal distribution.
Step 2: Hypothesis test to know if there exists a linear relation between price and city
mileage.
If $\beta_1$ is the slope of the regression line, we perform a two-tailed test to check whether $\beta_1$ is equal to
zero or not. If $\beta_1$ is equal to zero, then the assumption that a linear relation exists between price
and mileage is not valid:
Null Hypothesis
H0: $\beta_1 = 0$
Alternate Hypothesis
H1: $\beta_1 \neq 0$
Calculating the test statistic:
Test statistic: $t_0 = \frac{b_1 - \beta_1}{s_{b_1}}$
To compute the test statistic, we assume that the null hypothesis is true. Hence $\beta_1 = 0$.
Standard error: $s_e = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}} = 6.843$
$s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = 0.15545$
Test statistic: $t_0 = -6.78$
Classical Approach: The t-value with 0.05 level of significance for a two tail test at 66 degrees of
freedom is approximately 1.997.
Given that the test statistic is to the left of -1.997, we reject the null hypothesis.
P-value approach: the P-value for the test is 0.0000000039771193, which is less than the level of
significance 0.05; therefore we reject the null hypothesis.
Step 3: Conclusion
There is sufficient evidence at the $\alpha = 0.05$ level of significance to conclude that a linear relation
exists between price and city mileage.
Confidence Intervals for the Slope of the Least Squares Regression Line
The 95% confidence interval for the slope of the regression line is calculated in Excel:
The lower bound is -1.364, and the upper bound is -0.7435.
We are 95% confident that the mean change in price for each additional MPG of city mileage is
somewhere between -1.364 and -0.7435.
Case 2: Least Squares Regression - Analyze relationship between Highway MPG and Price.
Below is the scatter plot with the line and the equation.
[Figure: Scatter Plot of Price against Highway MPG, with fitted line y = -0.9928x + 47.796 and R^2 = 0.3723.]
The equation of the line is:
y = -0.9928x + 47.796
Slope and Y-intercept:
The slope is -0.9928, which means that if highway mileage increases by 1 MPG,
then the price of the car decreases by 0.99 on average. The two variables are negatively associated.
The y-intercept is 47.796. As 0 MPG or values close to 0 for mileage do not make sense, there is
no interpretation for the y-intercept.
Linear correlation coefficient and the coefficient of determination:
>> The linear correlation coefficient r is calculated (using Excel):
r = -0.6102
This implies a moderate negative linear relation between price and highway mileage.
>> The coefficient of determination $R^2$ is also calculated in Excel:
$R^2 = 0.3723$
This implies that 37% of the variation in price is explained by the least squares regression
line.
Testing the Regression Model:
Step 1: Testing to make sure the requirements of the regression model are met.
Requirement 1: If the plot of residuals against the explanatory variable shows any
discernible pattern, then the linear model is not appropriate. Check the residuals plot below:
[Figure: Residuals Plot. Residuals against Highway MPG.]
The residuals do not follow any particular pattern, so our first requirement is not violated.
Requirement 2: The residuals have to be normally distributed. See the probability plot for the
residuals.
[Figure: Normal Probability Plot. Y against sample percentile.]
The residuals have a normal distribution.
Step 2: Hypothesis test to know if there exists a linear relation between price and highway
mileage.
If $\beta_1$ is the slope of the regression line, we perform a two-tailed test to check whether $\beta_1$ is equal to
zero or not. If $\beta_1$ is equal to zero, then the assumption that a linear relation exists between price
and mileage is not valid:
Null Hypothesis
H0: $\beta_1 = 0$
Alternate Hypothesis
H1: $\beta_1 \neq 0$
>> The level of significance for this test is $\alpha = 0.05$.
Calculating the test statistic:
Test statistic: $t_0 = \frac{b_1 - \beta_1}{s_{b_1}}$
To compute the test statistic, we assume that the null hypothesis is true. Hence $\beta_1 = 0$.
Standard error: $s_e = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}} = 7.062$
$s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = 0.1587$
The test statistic is calculated as $t_0 = -6.26$
Classical Approach: The t-value with 0.05 level of significance for a two tail test at 66 degrees of
freedom is approximately 1.997.
Given that the test statistic is to the left of -1.997, we reject the null hypothesis.
P-value approach: the P-value for the test is 0.0000000331, which is less than the level of
significance 0.05; therefore we reject the null hypothesis.
Step 3: Conclusion
There is sufficient evidence at the $\alpha = 0.05$ level of significance to conclude that a linear relation
exists between price and highway mileage.
Confidence Intervals for the Slope of the Least Squares Regression Line
The 95% confidence interval for the slope of the regression line is calculated in Excel:
The lower bound is -1.31, and the upper bound is -0.676.
We are 95% confident that the mean change in price for each additional MPG of highway mileage
is somewhere between -1.31 and -0.676.
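The slope tests and confidence intervals for both cases can be recomputed from the reported slopes and their standard errors. This is a sketch rather than the Excel output: the function name `slope_inference` is our own, and the critical value 1.997 (two-tailed, 66 degrees of freedom) is taken from the text.

```python
def slope_inference(b1, s_b1, t_crit):
    """Return (t statistic for H0: beta1 = 0, (lower, upper) CI for the slope)."""
    t0 = b1 / s_b1
    margin = t_crit * s_b1
    return t0, (b1 - margin, b1 + margin)


T_CRITICAL = 1.997  # two-tailed, alpha = 0.05, 66 degrees of freedom

# Case 1: City MPG. Case 2: Highway MPG.
city_t, city_ci = slope_inference(-1.0538, 0.15545, T_CRITICAL)
highway_t, highway_ci = slope_inference(-0.9928, 0.1587, T_CRITICAL)

for label, t0, ci in [("City", city_t, city_ci), ("Highway", highway_t, highway_ci)]:
    print(f"{label}: t0 = {t0:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), "
          f"reject H0: {abs(t0) > T_CRITICAL}")
```

Neither interval contains zero, which agrees with rejecting the null hypothesis in both cases.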
Conclusion:
With the three analyses performed, we draw meaningful conclusions about what
affects car price, which every automobile magazine reader should find interesting and useful
to read, understand, and assess.
Based on the analysis performed at the 0.05 level of significance, we concluded in the first part
of the analysis that the populations of US and Non-US cars are not homogeneous. That implies
that the car types take different shares in the two populations. It is especially evident in
the case of the Large car type, which appears only in the US sample. This raises the
question of whether Non-US car developers manufacture this type of car at all. If not, then naturally
US customers with a preference for Large cars are limited to selecting only from US-origin cars despite
largely divers; therefore in further analysis it would be interesting to test whether the two
populations are homogeneous without the Large car type being taken into account. The fact
that the two populations are not homogeneous means that the populations of US and Non-US
cars should not be directly compared without car type being taken into consideration in the
analysis.
Based on the analysis performed at the 0.05 level of significance, we concluded in the second
part of the analysis that US cars are on average less expensive than Non-US origin cars. What this
finding communicates to the magazine readers is that it is worth considering the purchase of a US
car, as opposed to a Non-US car, because US cars offer better economic value and, additionally,
their purchase supports national economic growth and development.
Based on the analysis performed at the 0.05 level of significance, we concluded in the third part,
the regression analysis, that car fuel economy is related to car price. City fuel economy
describes price better than highway fuel economy. This conclusion was made by
comparing the coefficient of determination $R^2$ calculated for the two separate
regression models, each built with one explanatory variable: city fuel economy (in the first model) and highway
fuel economy (in the second model). In both cases, the coefficient of determination $R^2$ came out
low, 0.41 and 0.37, which suggests that there are other important factors that describe and
influence car price. Possible independent variables are car reliability and customer
satisfaction. For further analysis, we think that analyzing other
factors that affect car price would be interesting. If possible, formulating a
regression model that explains 90%-99% of the variation in car price could be useful. In the regression
analysis we concluded that car fuel economy is negatively associated with car price, implying that cars with lower
fuel economy have higher prices. This can be explained by the fact that larger car types
consume more fuel (have lower fuel economy) and at the same time cost more because of the
overall greater size and greater engine power. That is an expected outcome, as in the US market
economy the prices of more economical products are higher, due to the higher demand for those
products. Further analysis could be done for US and
Non-US cars separately.
While a linear relationship exists between Price and both highway and city Miles Per
Gallon, the decrease in MPG that accompanies an increase in cost may be due to factors such as
weight, engine size, vehicle type, engineering, etc. There is more variety among US-originated cars
than among Non-US originated cars. Foreign-manufactured cars were of smaller size than
domestic ones.
In conclusion, what we learned is that analyzing consumer information involves many variables
and requires a thoughtful process in selecting which data the public truly cares about, for
example whether a premium will be paid for a car that has increased gas
mileage (MPG).
Ideas for future tests would be to compare cost, mileage, and the number of cylinders, to
examine whether expensive, highly engineered cars get better gas mileage.
Another example: comparing weight, mileage, and country of origin, to examine whether foreign
cars of the same weight have higher gas mileage (MPG) and may be engineered for higher fuel
economy, given that most Non-US countries have significantly higher gas prices than
the US.
Lessons Learned:
Here is what the team members thought about the lessons learned in the project work:
Team member 1: I learned about the real-life application of statistics during this course. The
questions we tried to answer in the project were of real-life significance.
I learned about the importance of collecting the sample properly and doing the
descriptive analysis based on the sample size and the type of study. I also learned to correctly
interpret the result of a statistical analysis.
Working with the group was very helpful, even though it was an online course. We were able
to benefit greatly from the collective knowledge of the group. Since it was an online course,
getting everyone together at the same time for a meeting was a challenge. On the other hand, even
on-campus courses have similar challenges.
Team member 2: It was very difficult to decide on the project title without having a full
understanding of the theory. Because of the deadlines set for the project, the team first had to
decide on the topic and scope of the analysis. At that time the theory had not yet been
covered in the lesson plan. I found myself adjusting the analysis topic of my assigned project part
once the particular topic was covered in the lesson plan. I found the selection of the topic to be
one of the most difficult and most time-consuming tasks of the project, for the reason mentioned.
At the same time, the research for the topic helped me be better prepared for the upcoming
lessons in the class, and that was a great benefit of this experience.
- Regarding the subject of statistics, I learned that applying statistical instruments to real-
life data is not easy and is often not possible for the desired analysis approach. Many times the
population or sample structure does not meet the statistical requirements; therefore the selected
statistical analysis is not applicable. In conclusion, the applicability of statistical tools is limited and it is
very important to ensure that the statistical requirements are met before applying them. Otherwise
the results of the test are pointless and the time invested in the analysis is wasted.
- A team structure with an assigned project manager who tracked the status of the work
was very effective. Without this structure and the project manager's enforcement, the
team project might not have been completed.
performed by the project manager was definitely a crucial part of the assignment, especially
considering the fact that the whole course is offered online.
Team member 3: The project was challenging and was a good way to learn how to apply
theory to real-life situations. It improved my understanding of the coursework and definitely
added value to the course. I think it is very important to have the project in the coursework.
However, the team project was not a perfect scenario for me. Although I prefer working in teams
to working alone, the idea of taking an online course was to learn new concepts on my own
schedule. Given that the team members were in different parts of the country, in different time
zones, with different schedules and different ways of working, and that all team members had jobs,
teamwork in an online course did not work out well. This was the biggest challenge. We did
overcome those challenges, so a few suggestions would be:
1. It is not justified to have 50% of the grade depend on a group project.
2. If the team has 5 participants, then the group project should be designed so
that it can be easily divided into 5 equal parts, with each person having a chance to add
to his or her learning and also to work in a team.
3. An individual project is a better choice for an online course, in my opinion.
Overall, in spite of some complaints, I enjoyed working with my team and had a decent
experience with the project.
Team member 4: -It was very difficult to scale and filter all the data in isolation so that
our team could break it down into segments. Choosing the right data to compare and analyze meant
being decisive and strategic in order to make appropriate assessments for our project.
-I feel that such a real-life project is always useful, whether for future reference,
similar studies, or real-world application. The same theories can be applied to future buying
trends, knowledge of such a market, or opinions based on hypothesis testing.
-Teamwork is essential. Our team celebrates its wonderful diversity of international students,
ages, genders, and locations. It is very important to communicate properly through various
channels, including text messages, email, and phone calls. We had the
best project manager, the quarterback of our team. She was easy to understand and
communicate with, and her e-mails and phone calls were direct, clear, and well
planned, reflecting great operational management. Communication and operations were
essential to the success of our team.
Shanthi took on the role of our project leader; she helped lead the organization and mission.
Other team members had similar opinions.
The first table is the transformed data that was used for the entire report.
Origin | Type | No. within Type | Manufacturer | Model | Passengers | Price_Category | Price | City MPG | City MPG_Category | Price | Highway MPG | Highway MPG_Category | Engine Size | Engine size_Category | Horsepower | Horsepower_Category | Fuel Tank | Weight | Weight_Category
US | Small | 1 | Ford | Festiva | 4 | Cheap | 7.4 | 31 | Good | 7.4 | 33 | Good | 1.3 | Small | 63 | <100 | 10 | 1845 | <1.5 Ton
non-US | Small | 6 | Mazda | 323 | 4 | Cheap | 8.3 | 29 | Good | 8.3 | 37 | Good | 1.6 | Small | 82 | <100 | 13.2 | 2325 | <1.5 Ton
non-US | Small | 2 | Geo | Metro | 4 | Cheap | 8.4 | 46 | Good | 8.4 | 50 | Good | 1 | Small | 55 | <100 | 10.6 | 1695 | <1.5 Ton
non-US | Small | 3 | Suzuki | Swift | 4 | Cheap | 8.6 | 39 | Good | 8.6 | 43 | Good | 1.3 | Small | 70 | <100 | 10.6 | 1965 | <1.5 Ton
US | Small | 2 | Pontiac | LeMans | 4 | Cheap | 9 | 31 | Good | 9 | 41 | Good | 1.6 | Small | 74 | <100 | 13.2 | 2350 | <1.5 Ton
non-US | Small | 5 | Volkswagen | Fox | 4 | Cheap | 9.1 | 25 | Good | 9.1 | 33 | Good | 1.8 | Small | 81 | <100 | 12.4 | 2240 | <1.5 Ton
US | Small | 7 | Dodge | Colt | 5 | Cheap | 9.2 | 29 | Good | 9.2 | 33 | Good | 1.5 | Small | 92 | <100 | 13.2 | 2270 | <1.5 Ton
non-US | Sporty | 2 | Hyundai | Scoupe | 4 | Cheap | 10 | 26 | Good | 10 | 34 | Good | 1.5 | Small | 92 | <100 | 11.9 | 2285 | <1.5 Ton
US | Small | 5 | Ford | Escort | 5 | Cheap | 10.1 | 23 | Good | 10.1 | 30 | Good | 1.8 | Small | 127 | 100 to <150 | 13.2 | 2530 | <1.5 Ton
non-US | Small | 12 | Subaru | Loyale | 5 | Cheap | 10.9 | 25 | Good | 10.9 | 30 | Good | 1.8 | Small | 90 | <100 | 15.9 | 2490 | <1.5 Ton
US | Compact | 2 | Pontiac | Sunbird | 5 | Cheap | 11.1 | 23 | Good | 11.1 | 31 | Good | 2 | Medium | 110 | 100 to <150 | 15.2 | 2575 | <1.5 Ton
US | Small | 3 | Saturn | SL | 5 | Cheap | 11.1 | 28 | Good | 11.1 | 38 | Good | 1.9 | Small | 85 | <100 | 12.8 | 2495 | <1.5 Ton
US | Compact | 4 | Ford | Tempo | 5 | Cheap | 11.3 | 22 | Good | 11.3 | 27 | Poor | 2.3 | Medium | 96 | <100 | 15.9 | 2690 | <1.5 Ton
US | Small | 6 | Dodge | Shadow | 5 | Cheap | 11.3 | 23 | Good | 11.3 | 29 | Poor | 2.2 | Medium | 93 | <100 | 14 | 2670 | <1.5 Ton
US | Compact | 3 | Chevrolet | Corsica | 5 | Cheap | 11.4 | 25 | Good | 11.4 | 34 | Good | 2.2 | Medium | 110 | 100 to <150 | 15.6 | 2785 | <1.5 Ton
non-US | Small | 8 | Mazda | Protege | 5 | Cheap | 11.6 | 28 | Good | 11.6 | 36 | Good | 1.8 | Small | 103 | 100 to <150 | 14.5 | 2440 | <1.5 Ton
non-US | Small | 7 | Nissan | Sentra | 5 | Cheap | 11.8 | 29 | Good | 11.8 | 33 | Good | 1.6 | Small | 110 | 100 to <150 | 13.2 | 2545 | <1.5 Ton
US | Small | 4 | Eagle | Summit | 5 | Cheap | 12.2 | 29 | Good | 12.2 | 33 | Good | 1.5 | Small | 92 | <100 | 13.2 | 2295 | <1.5 Ton
non-US | Sporty | 4 | Geo | Storm | 4 | Cheap | 12.5 | 30 | Good | 12.5 | 36 | Good | 1.6 | Small | 90 | <100 | 12.4 | 2475 | <1.5 Ton
US | Compact | 7 | Dodge | Spirit | 6 | Cheap | 13.3 | 22 | Good | 13.3 | 27 | Poor | 2.5 | Medium | 100 | 100 to <150 | 16 | 2970 | <1.5 Ton
US | Compact | 5 | Chevrolet | Cavalier | 5 | Cheap | 13.4 | 25 | Good | 13.4 | 36 | Good | 2.2 | Medium | 110 | 100 to <150 | 15.2 | 2490 | <1.5 Ton
US | Compact | 1 | Oldsmobile | Achieva | 5 | Cheap | 13.5 | 24 | Good | 13.5 | 31 | Good | 2.3 | Medium | 155 | 150 to <200 | 15.2 | 2910 | <1.5 Ton
non-US | Midsize | 8 | Hyundai | Sonata | 5 | Cheap | 13.9 | 20 | Poor | 13.9 | 27 | Poor | 2 | Medium | 128 | 100 to <150 | 17.2 | 2885 | <1.5 Ton
US | Sporty | 7 | Plymouth | Laser | 4 | Cheap | 14.4 | 23 | Good | 14.4 | 30 | Good | 1.8 | Small | 92 | <100 | 15.9 | 2640 | <1.5 Ton
US | Midsize | 3 | Mercury | Cougar | 5 | Cheap | 14.9 | 19 | Poor | 14.9 | 26 | Poor | 3.8 | Large | 140 | 100 to <150 | 18 | 3610 | >1.5 Ton
US | Sporty | 8 | Chevrolet | Camaro | 4 | Moderate | 15.1 | 19 | Poor | 15.1 | 28 | Poor | 3.4 | Large | 160 | 150 to <200 | 15.5 | 3240 | >1.5 Ton
US | Midsize | 7 | Dodge | Dynasty | 6 | Moderate | 15.6 | 21 | Good | 15.6 | 27 | Poor | 2.5 | Medium | 100 | 100 to <150 | 16 | 3080 | >1.5 Ton
non-US | Compact | 2 | Nissan | Altima | 5 | Moderate | 15.7 | 24 | Good | 15.7 | 30 | Good | 2.4 | Medium | 150 | 150 to <200 | 15.9 | 3050 | >1.5 Ton
US | Midsize | 8 | Buick | Century | 6 | Moderate | 15.7 | 22 | Good | 15.7 | 31 | Good | 2.2 | Medium | 110 | 100 to <150 | 16.4 | 2880 | <1.5 Ton
US | Compact | 6 | Chrysler | LeBaron | 6 | Moderate | 15.8 | 23 | Good | 15.8 | 28 | Poor | 3 | Large | 141 | 100 to <150 | 16 | 3085 | >1.5 Ton
US | Midsize | 10 | Chevrolet | Lumina | 6 | Moderate | 15.9 | 21 | Good | 15.9 | 29 | Poor | 2.2 | Medium | 110 | 100 to <150 | 16.5 | 3195 | >1.5 Ton
US | Sporty | 3 | Ford | Mustang | 4 | Moderate | 15.9 | 22 | Good | 15.9 | 29 | Poor | 2.3 | Medium | 105 | 100 to <150 | 15.4 | 2850 | <1.5 Ton
US | Midsize | 1 | Oldsmobile | Cutlass_Ci | 5 | Moderate | 16.3 | 23 | Good | 16.3 | 31 | Good | 2.2 | Medium | 110 | 100 to <150 | 16.5 | 2890 | <1.5 Ton
US | Van | 4 | Chevrolet | Lumina_AP | 7 | Moderate | 16.3 | 18 | Poor | 16.3 | 23 | Poor | 3.8 | Large | 170 | 150 to <200 | 20 | 3715 | >1.5 Ton
non-US | Compact | 3 | Mazda | 626 | 5 | Moderate | 16.5 | 26 | Good | 16.5 | 34 | Good | 2.5 | Medium | 164 | 150 to <200 | 15.5 | 2970 | <1.5 Ton
non-US | Compact | 1 | Honda | Accord | 4 | Moderate | 17.5 | 24 | Good | 17.5 | 31 | Good | 2.2 | Medium | 140 | 100 to <150 | 17 | 3040 | >1.5 Ton
US | Sporty | 5 | Pontiac | Firebird | 4 | Moderate | 17.7 | 19 | Poor | 17.7 | 28 | Poor | 3.4 | Large | 160 | 150 to <200 | 15.5 | 3240 | >1.5 Ton
non-US | Midsize | 6 | Toyota | Camry | 5 | Moderate | 18.2 | 22 | Good | 18.2 | 29 | Poor | 2.2 | Medium | 130 | 100 to <150 | 18.5 | 3030 | >1.5 Ton
non-US | Sporty | 5 | Toyota | Celica | 4 | Moderate | 18.4 | 25 | Good | 18.4 | 32 | Good | 2.2 | Medium | 135 | 100 to <150 | 15.9 | 2950 | <1.5 Ton
US | Midsize | 5 | Pontiac | Grand_Prix | 5 | Moderate | 18.5 | 19 | Poor | 18.5 | 27 | Poor | 3.4 | Large | 200 | >=200 | 16.5 | 3450 | >1.5 Ton
US | Van | 1 | Dodge | Caravan | 7 | Moderate | 19 | 17 | Poor | 19 | 21 | Poor | 3 | Large | 142 | 100 to <150 | 20 | 3705 | >1.5 Ton
non-US | Van | 2 | Mazda | MPV | 7 | Moderate | 19.1 | 18 | Poor | 19.1 | 24 | Poor | 3 | Large | 155 | 150 to <200 | 19.6 | 3735 | >1.5 Ton
non-US | Van | 4 | Nissan | Quest | 7 | Moderate | 19.1 | 17 | Poor | 19.1 | 23 | Poor | 3 | Large | 151 | 150 to <200 | 20 | 4100 | >1.5 Ton
non-US | Compact | 9 | Subaru | Legacy | 5 | Moderate | 19.5 | 23 | Good | 19.5 | 30 | Good | 2.2 | Medium | 130 | 100 to <150 | 15.9 | 3085 | >1.5 Ton
US | Van | 2 | Oldsmobile | Silhouette | 7 | Moderate | 19.5 | 18 | Poor | 19.5 | 23 | Poor | 3.8 | Large | 170 | 150 to <200 | 20 | 3715 | >1.5 Ton
non-US | Van | 1 | Volkswagen | Eurovan | 7 | Moderate | 19.7 | 17 | Poor | 19.7 | 21 | Poor | 2.5 | Medium | 109 | 100 to <150 | 21.1 | 3960 | >1.5 Ton
non-US | Sporty | 3 | Honda | Prelude | 4 | Moderate | 19.8 | 24 | Good | 19.8 | 31 | Good | 2.3 | Medium | 160 | 150 to <200 | 15.9 | 2865 | <1.5 Ton
US | Van | 3 | Ford | Aerostar | 7 | Moderate | 19.9 | 15 | Poor | 19.9 | 20 | Poor | 3 | Large | 145 | 100 to <150 | 21 | 3735 | >1.5 Ton
US | Midsize | 6 | Ford | Taurus | 5 | Expensive | 20.2 | 21 | Good | 20.2 | 30 | Good | 3 | Large | 140 | 100 to <150 | 16 | 3325 | >1.5 Ton
non-US | Midsize | 3 | Nissan | Maxima | 5 | Expensive | 21.5 | 21 | Good | 21.5 | 26 | Poor | 3 | Large | 160 | 150 to <200 | 18.5 | 3200 | >1.5 Ton
non-US | Compact | 7 | Volvo | 240 | 5 | Expensive | 22.7 | 21 | Good | 22.7 | 28 | Poor | 2.3 | Medium | 114 | 100 to <150 | 15.8 | 2985 | <1.5 Ton
non-US | Van | 3 | Toyota | Previa | 7 | Expensive | 22.7 | 18 | Poor | 22.7 | 22 | Poor | 2.4 | Medium | 138 | 100 to <150 | 19.8 | 3785 | >1.5 Ton
non-US | Sporty | 6 | Volkswagen | Corrado | 4 | Expensive | 23.3 | 18 | Poor | 23.3 | 25 | Poor | 2.8 | Medium | 178 | 150 to <200 | 18.5 | 2810 | <1.5 Ton
US | Sporty | 2 | Dodge | Stealth | 4 | Expensive | 25.8 | 18 | Poor | 25.8 | 24 | Poor | 3 | Large | 300 | >=200 | 19.8 | 3805 | >1.5 Ton
non-US | Midsize | 4 | Mitsubishi | Diamante | 5 | Expensive | 26.1 | 18 | Poor | 26.1 | 24 | Poor | 3 | Large | 202 | >=200 | 19 | 3730 | >1.5 Ton
US | Midsize | 4 | Buick | Riviera | 5 | Expensive | 26.3 | 19 | Poor | 26.3 | 27 | Poor | 3.8 | Large | 170 | 150 to <200 | 18.8 | 3495 | >1.5 Ton
non-US | Midsize | 10 | Lexus | ES300 | 5 | Expensive | 28 | 18 | Poor | 28 | 24 | Poor | 3 | Large | 185 | 150 to <200 | 18.5 | 3510 | >1.5 Ton
non-US | Compact | 8 | Saab | 900 | 5 | Expensive | 28.7 | 20 | Poor | 28.7 | 26 | Poor | 2.1 | Medium | 140 | 100 to <150 | 18 | 2775 | <1.5 Ton
non-US | Compact | 5 | Audi | 90 | 5 | Expensive | 29.1 | 20 | Poor | 29.1 | 26 | Poor | 2.8 | Medium | 172 | 150 to <200 | 16.9 | 3375 | >1.5 Ton
non-US | Midsize | 2 | BMW | 535i | 4 | Expensive | 30 | 22 | Good | 30 | 30 | Good | 3.5 | Large | 208 | >=200 | 21.1 | 3640 | >1.5 Ton
non-US | Sporty | 1 | Mazda | RX-7 | 2 | Expensive | 32.5 | 17 | Poor | 32.5 | 25 | Poor | 1.3 | Small | 255 | >=200 | 20 | 2895 | <1.5 Ton
non-US | Midsize | 5 | Acura | Legend | 5 | Expensive | 33.9 | 18 | Poor | 33.9 | 25 | Poor | 3.2 | Large | 200 | >=200 | 18 | 3560 | >1.5 Ton
US | Midsize | 9 | Lincoln | Continenta | 6 | Expensive | 34.3 | 17 | Poor | 34.3 | 26 | Poor | 3.8 | Large | 160 | 150 to <200 | 18.4 | 3695 | >1.5 Ton
non-US | Midsize | 1 | Lexus | SC300 | 4 | Expensive | 35.2 | 18 | Poor | 35.2 | 23 | Poor | 3 | Large | 225 | >=200 | 20.6 | 3515 | >1.5 Ton
non-US | Midsize | 11 | Audi | 100 | 6 | Expensive | 37.7 | 19 | Poor | 37.7 | 26 | Poor | 2.8 | Medium | 172 | 150 to <200 | 21.1 | 3405 | >1.5 Ton
US | Sporty | 1 | Chevrolet | Corvette | 2 | Expensive | 38 | 17 | Poor | 38 | 25 | Poor | 5.7 | Large | 300 | >=200 | 20 | 3380 | >1.5 Ton
US | Midsize | 2 | Cadillac | Seville | 5 | Expensive | 40.1 | 16 | Poor | 40.1 | 25 | Poor | 4.6 | Large | 295 | >=200 | 20 | 3935 | >1.5 Ton
non-US | Midsize | 9 | Infiniti | Q45 | 5 | Expensive | 47.9 | 17 | Poor | 47.9 | 22 | Poor | 4.5 | Large | 278 | >=200 | 22.5 | 4000 | >1.5 Ton
For Question 1 Chi-Square.
Table 1 (Distribution of non-US-based cars by type)
Origin Type Car Type(Numeric)non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Compact 1non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Midsize 3non-US Small 2
non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Small 2non-US Sporty 4non-US Sporty 4non-US Sporty 4non-US Sporty 4non-US Sporty 4non-US Sporty 4non-US Van 5
non-US Van 5non-US Van 5non-US Van 5
Table 2 (Distribution of US-based cars by type)
Origin Type Car Type (Numeric)
US Compact 1US Compact 1US Compact 1US Compact 1US Compact 1US Compact 1US Compact 1US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Large 6US Midsize 3US Midsize 3
US Midsize 3US Midsize 3US Midsize 3US Midsize 3US Midsize 3US Midsize 3US Midsize 3US Midsize 3US Small 2US Small 2US Small 2US Small 2US Small 2US Small 2US Small 2US Sporty 4US Sporty 4US Sporty 4US Sporty 4US Sporty 4
US Sporty 4US Sporty 4US Sporty 4US Van 5US Van 5US Van 5US Van 5US Van 5
For Question 2: Hypothesis test for two means:
Table 1 (Sample population of non-US-originated cars)
Origin | Size | Passengers | Price
non-US | Compact | 4 | 17.5
non-US | Compact | 5 | 15.7
non-US | Compact | 5 | 16.5
non-US | Compact | 5 | 29.1
non-US | Compact | 5 | 22.7
non-US | Compact | 5 | 28.7
non-US | Compact | 5 | 19.5
non-US | Midsize | 4 | 35.2
non-US | Midsize | 4 | 30
non-US | Midsize | 5 | 21.5
non-US | Midsize | 5 | 26.1
non-US | Midsize | 5 | 33.9
non-US | Midsize | 5 | 18.2
non-US | Midsize | 5 | 13.9
non-US | Midsize | 5 | 47.9
non-US | Midsize | 5 | 28
non-US | Midsize | 6 | 37.7
non-US | Small | 4 | 8.4
non-US | Small | 4 | 8.6
non-US | Small | 4 | 9.1
non-US | Small | 4 | 8.3
non-US | Small | 5 | 11.8
non-US | Small | 5 | 11.6
non-US | Small | 5 | 10.9
non-US | Sporty | 2 | 32.5
non-US | Sporty | 4 | 10
non-US | Sporty | 4 | 19.8
non-US | Sporty | 4 | 12.5
non-US | Sporty | 4 | 18.4
non-US | Sporty | 4 | 23.3
non-US | Van | 7 | 19.7
non-US | Van | 7 | 19.1
non-US | Van | 7 | 22.7
non-US | Van | 7 | 19.1
Table 2 (Sample population of US-originated cars)
Origin | Size | Passengers | Price
US | Compact | 5 | 13.5
US | Compact | 5 | 11.1
US | Compact | 5 | 11.4
US | Compact | 5 | 11.3
US | Compact | 5 | 13.4
US | Compact | 6 | 15.8
US | Compact | 6 | 13.3
US | Midsize | 5 | 16.3
US | Midsize | 5 | 40.1
US | Midsize | 5 | 14.9
US | Midsize | 5 | 26.3
US | Midsize | 5 | 18.5
US | Midsize | 5 | 20.2
US | Midsize | 6 | 15.6
US | Midsize | 6 | 15.7
US | Midsize | 6 | 34.3
US | Midsize | 6 | 15.9
US | Small | 4 | 7.4
US | Small | 4 | 9
US | Small | 5 | 11.1
US | Small | 5 | 12.2
US | Small | 5 | 10.1
US | Small | 5 | 11.3
US | Small | 5 | 9.2
US | Sporty | 2 | 38
US | Sporty | 4 | 25.8
US | Sporty | 4 | 15.9
US | Sporty | 4 | 17.7
US | Sporty | 4 | 14.4
US | Sporty | 4 | 15.1
US | Van | 7 | 19
US | Van | 7 | 19.5
US | Van | 7 | 19.9
US | Van | 7 | 16.3
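As an illustration of how the two price samples for Question 2 could be compared, here is a minimal Python sketch that computes the sample means and a two-sample t statistic. It assumes Welch's unequal-variance form, which may differ from the test actually used in the report. The names `non_us`, `us`, and `welch_t` are hypothetical; the prices are copied from the two tables for Question 2.

```python
from math import sqrt
from statistics import mean, variance

# Prices copied from Table 1 (non-US) and Table 2 (US) for Question 2.
non_us = [17.5, 15.7, 16.5, 29.1, 22.7, 28.7, 19.5, 35.2, 30, 21.5, 26.1,
          33.9, 18.2, 13.9, 47.9, 28, 37.7, 8.4, 8.6, 9.1, 8.3, 11.8, 11.6,
          10.9, 32.5, 10, 19.8, 12.5, 18.4, 23.3, 19.7, 19.1, 22.7, 19.1]
us = [13.5, 11.1, 11.4, 11.3, 13.4, 15.8, 13.3, 16.3, 40.1, 14.9, 26.3,
      18.5, 20.2, 15.6, 15.7, 34.3, 15.9, 7.4, 9, 11.1, 12.2, 10.1, 11.3,
      9.2, 38, 25.8, 15.9, 17.7, 14.4, 15.1, 19, 19.5, 19.9, 16.3]


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    Assumption: unequal variances (Welch); the report may have used a
    pooled-variance t test or a z test instead.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(non_us, us)
print(f"n (non-US) = {len(non_us)}, mean price = {mean(non_us):.2f}")
print(f"n (US)     = {len(us)}, mean price = {mean(us):.2f}")
print(f"Welch t = {t_stat:.3f}, approx. df = {df:.1f}")
```

The resulting statistic would then be compared with the critical value for the chosen significance level and alternative hypothesis, neither of which is restated in this appendix.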
For Question 3: Regression Analysis
RESIDUAL OUTPUT-City MPG
Observation | Predicted Y | Residuals
1 | 17.25863325 | 0.241366754
2 | 17.25863325 | -1.558633246
3 | 15.15098622 | 1.349013777
4 | 21.47392729 | 7.626072707
5 | 20.42010378 | 2.279896219
6 | 21.47392729 | 7.226072707
7 | 18.31245676 | 1.187543242
8 | 23.58157432 | 11.61842568
9 | 19.36628027 | 10.63371973
10 | 20.42010378 | 1.079896219
11 | 23.58157432 | 2.518425684
12 | 23.58157432 | 10.31842568
13 | 19.36628027 | -1.166280269
14 | 21.47392729 | -7.573927293
15 | 24.63539783 | 23.26460217
16 | 23.58157432 | 4.418425684
17 | 22.5277508 | 15.1722492
18 | -5.925484008 | 14.32548401
19 | 1.451280573 | 7.148719427
20 | 16.20480973 | -7.104809735
21 | 11.98951569 | -3.689515689
22 | 11.98951569 | -0.189515689
23 | 13.0433392 | -1.4433392
24 | 16.20480973 | -5.304809735
25 | 24.63539783 | 7.864602173
26 | 15.15098622 | -5.150986223
27 | 17.25863325 | 2.541366754
28 | 10.93569218 | 1.564307823
29 | 16.20480973 | 2.195190265
30 | 23.58157432 | -0.281574316
31 | 24.63539783 | -4.935397827
32 | 23.58157432 | -4.481574316
33 | 23.58157432 | -0.881574316
34 | 24.63539783 | -5.535397827
35 | 17.25863325 | -3.758633246
36 | 18.31245676 | -7.212456758
37 | 16.20480973 | -4.804809735
38 | 19.36628027 | -8.066280269
39 | 16.20480973 | -2.804809735
40 | 18.31245676 | -2.512456758
41 | 19.36628027 | -6.066280269
42 | 18.31245676 | -2.012456758
43 | 25.68922134 | 14.41077866
44 | 22.5277508 | -7.627750804
45 | 22.5277508 | 3.772249196
46 | 22.5277508 | -4.027750804
47 | 20.42010378 | -0.220103781
48 | 20.42010378 | -4.820103781
49 | 19.36628027 | -3.666280269
50 | 24.63539783 | 9.664602173
51 | 20.42010378 | -4.520103781
52 | 9.881868665 | -2.481868665
53 | 9.881868665 | -0.881868665
54 | 13.0433392 | -1.9433392
55 | 11.98951569 | 0.210484311
56 | 18.31245676 | -8.212456758
57 | 18.31245676 | -7.012456758
58 | 11.98951569 | -2.789515689
59 | 24.63539783 | 13.36460217
60 | 23.58157432 | 2.218425684
61 | 19.36628027 | -3.466280269
62 | 22.5277508 | -4.827750804
63 | 18.31245676 | -3.912456758
64 | 22.5277508 | -7.427750804
65 | 24.63539783 | -5.635397827
66 | 23.58157432 | -4.081574316
67 | 26.74304485 | -6.84304485
68 | 23.58157432 | -7.281574316
RESIDUAL OUTPUT- Highway MPG
Observation | Predicted Y | Residuals
1 | 17.0197627 | 0.480237
2 | 18.01255764 | -2.31256
3 | 14.0413779 | 2.458622
4 | 21.98373737 | 7.116263
5 | 19.9981475 | 2.701852
6 | 21.98373737 | 6.716263
7 | 18.01255764 | 1.487442
8 | 24.96212217 | 10.23788
9 | 18.01255764 | 11.98744
10 | 21.98373737 | -0.48374
11 | 23.96932724 | 2.130673
12 | 22.9765323 | 10.92347
13 | 19.00535257 | -0.80535
14 | 20.99094244 | -7.09094
15 | 25.9549171 | 21.94508
16 | 23.96932724 | 4.030673
17 | 21.98373737 | 15.71626
18 | -1.84334103 | 10.24334
19 | 5.106223503 | 3.493776
20 | 15.03417284 | -5.93417
21 | 11.0629931 | -2.76299
22 | 15.03417284 | -3.23417
23 | 12.05578804 | -0.45579
24 | 18.01255764 | -7.11256
25 | 22.9765323 | 9.523468
26 | 14.0413779 | -4.04138
27 | 17.0197627 | 2.780237
28 | 12.05578804 | 0.444212
29 | 16.02696777 | 2.373032
30 | 22.9765323 | 0.323468
31 | 26.94771203 | -7.24771
32 | 23.96932724 | -4.86933
33 | 25.9549171 | -3.25492
34 | 24.96212217 | -5.86212
35 | 17.0197627 | -3.51976
36 | 17.0197627 | -5.91976
37 | 14.0413779 | -2.64138
38 | 20.99094244 | -9.69094
39 | 12.05578804 | 1.344212
40 | 19.9981475 | -4.19815
41 | 20.99094244 | -7.69094
42 | 17.0197627 | -0.71976
43 | 22.9765323 | 17.12347
44 | 21.98373737 | -7.08374
45 | 20.99094244 | 5.309058
46 | 20.99094244 | -2.49094
47 | 18.01255764 | 2.187442
48 | 20.99094244 | -5.39094
49 | 17.0197627 | -1.31976
50 | 21.98373737 | 12.31626
51 | 19.00535257 | -3.10535
52 | 15.03417284 | -7.63417
53 | 7.091813369 | 1.908187
54 | 10.07019817 | 1.029802
55 | 15.03417284 | -2.83417
56 | 18.01255764 | -7.91256
57 | 19.00535257 | -7.70535
58 | 15.03417284 | -5.83417
59 | 22.9765323 | 15.02347
60 | 23.96932724 | 1.830673
61 | 19.00535257 | -3.10535
62 | 19.9981475 | -2.29815
63 | 18.01255764 | -3.61256
64 | 19.9981475 | -4.89815
65 | 26.94771203 | -7.94771
66 | 24.96212217 | -5.46212
67 | 27.94050697 | -8.04051
68 | 24.96212217 | -8.66212
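To make the residual columns easier to read, the short Python sketch below checks the relationship residual = observed price - predicted price for observations 1 to 3 of the City MPG output. It assumes that the observations follow the row order of the Question 2 price tables (non-US cars first), an assumption that appears consistent with the printed values. The variable names are hypothetical.

```python
# (observed price, predicted Y, reported residual) for observations 1-3 of
# the City MPG residual output. Observed prices are the first three rows of
# the non-US price table (assumed to match the observation order).
rows = [
    (17.5, 17.25863325, 0.241366754),
    (15.7, 17.25863325, -1.558633246),
    (16.5, 15.15098622, 1.349013777),
]

# A residual is the observed value minus the value predicted by the
# fitted regression line.
residuals = [observed - predicted for observed, predicted, _ in rows]

for (observed, predicted, reported), residual in zip(rows, residuals):
    print(f"observed={observed:5.1f}  predicted={predicted:.8f}  "
          f"residual={residual:+.6f}  reported={reported:+.6f}")
```

A positive residual means the car costs more than the fitted line predicts for its MPG, and a negative residual means it costs less.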