Real Makoya



INTRODUCTION

This statistical project focuses on vehicles. The data are a sample of 93 vehicles drawn from a population of different car models made by different manufacturers in the USA. The 93 vehicles are categorized into six types: compact, small, mid-size, large, sporty and van. Two aims guide the analysis of the data set.

The first aim is to compare price and mpg across the six vehicle categories. To achieve this, the price variables (basic price and top price) and the mpg variables (mpg town and mpg best) are grouped according to the six car types. The Analysis ToolPak in Excel is then used to create a table of descriptive statistics for each variable, showing the mean, median, standard deviation, range, skew measure, standard error, sum and count, with the coefficient of variation added manually. Each table is followed by an interpretation that covers the differences in mean price or mean mpg across vehicle types and their possible causes, as well as the shape of the distribution for each vehicle type and what it means. Confidence intervals are used to make statements about prices in the population of cars. Difference-of-means tests are carried out on both prices and mpg for pairs of car types whose means differ only slightly. Probability is used to find the likelihood of a cheap car or an efficient car across categories, and the apparent pattern is tested using chi-squared tests.

The second aim is to predict mpg. This is done by using correlation to look for a linear association between either town or best mpg and all other variables in the data set. A frequency table and a chi-squared test are used. A correlation matrix, or tables of comparative correlations of the mpg measures against all other variables, is shown together with comments on the implications of the correlation values. Scatter diagrams of mpg against the variables useful for prediction are shown, along with the r² value and the equation of the line. Comments on what the r² value says about the model used to predict mpg, and an explanation of what the intercept and the slope mean in context, are essential.


PART 1, AIM: COMPARING PRICES AND MPG’S ACROSS VEHICLE CATEGORIES

1. (A) COMPARISONS OF VARIABLE DISTRIBUTIONS

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH BASIC PRICE

Type of measure           Compact       Small         Mid-size      Large         Sporty        Van
                          ($'000)       ($'000)       ($'000)       ($'000)       ($'000)       ($'000)
Mean                      15.69375      8.428517143   24.11363636   22.93636364   16.85714286   16.2
Standard Error            1.468288935   0.32580619    2.164484037   1.887676433   2.110119894   0.67597666
Median                    14.05         8.2           23.05         19.9          13.7          16.6
Standard Deviation        5.873155739   1.493031432   10.15233004   6.260714452   7.895345687   2.027929979
Coefficient of variation  0.374235331   0.177140463   0.421020284   0.273240545   0.468367964   0.125180862
Skewness                  1.052169706   1.595838501   0.686663723   1.144790306   1.538621069   0.513253608
Range                     20.5          6.2           33            16.9          25.5          5.9
Minimum                   8.5           6.7           12.4          17.5          9.1           13.6
Maximum                   29            12.9          45.4          34.4          34.6          19.5
Sum                       251.1         177           530.5         252.3         236           145.8
Count                     16            21            22            11            14            9

Fig.1 basic price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN

Fig. 1 above shows mean basic prices of $15 693.75 for compact cars, $8 428.52 for small cars, $24 113.64 for mid-size cars, $22 936.36 for large cars, $16 857.14 for sporty cars and $16 200 for vans. Mid-size cars have the highest mean price. This may be because they have an average engine size of 3.1 litres, second only to large cars, which have the highest average engine size of 4.1 litres. The high number of airbags in mid-size cars also adds to their cost. Small cars have the lowest mean price, $8 429. The difference between this and the mid-size mean is about $15 685, which is very large. This might be because the mean engine size of small cars is 1.6 litres against 3.1 litres for mid-size cars, a difference of about 1.5 litres (1.464 litres on the unrounded means). Another possible cause is that mid-size cars have a total of 25 airbags whilst small cars have only 5.


MEDIAN

Fig. 1 shows that small cars have the lowest median price, $8 200. They are followed by sporty cars with a median price of $13 700, compact cars with $14 050, vans with $16 600, large cars with $19 900 and lastly mid-size cars with $23 050. Unlike the mean, the median is not affected by extremely low or high prices.

SKEW MEASURE

Fig. 1 shows skewness of 1.0522 for compact cars, 1.5958 for small cars, 0.6867 for mid-size cars, 1.1448 for large cars, 1.5386 for sporty cars and 0.5133 for vans. All six car types therefore have positive skewness, meaning that each distribution has a longer tail towards higher prices. Vans are the least skewed. This might be because vans have the lowest number of airbags (3), whereas the other car types have an average of 14.4 airbags.

COEFFICIENT OF VARIATION

Sporty cars have the highest coefficient of variation (CV), 0.4684, followed by mid-size cars with 0.4210 and compact cars with 0.3742, so basic prices in these categories are widely spread around the mean. Large cars have a CV of 0.2732 and small cars 0.1771. Vans have the lowest CV, 0.1252, which shows that their prices lie quite close to the mean. In other words, every car type contains some prices that are far from its mean, except for vans, whose prices are clustered near the mean.
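The coefficient of variation and the standard error in Fig. 1 can be checked from the mean, standard deviation and count. The Python lines below are a minimal sketch of that arithmetic, not the Excel procedure used in the project; the function names are illustrative.

```python
import math

def coefficient_of_variation(std_dev, mean):
    """CV = standard deviation divided by the mean (no units)."""
    return std_dev / mean

def standard_error(std_dev, n):
    """Standard error of the mean = standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(n)

# Compact cars, basic price ($'000): values copied from Fig. 1
mean, std_dev, n = 15.69375, 5.873155739, 16

cv = coefficient_of_variation(std_dev, mean)
se = standard_error(std_dev, n)
print(round(cv, 4), round(se, 4))
```

Because the CV divides by the mean, it allows the spread of cheap and expensive car types to be compared on the same scale.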

RANGE

In Figure 1, compact cars have a range in basic price of $20 500, small cars $6 200, mid-size cars $33 000, large cars $16 900, sporty cars $25 500 and vans $5 900. Mid-size cars have the highest range, and the difference between it and the range for vans is $27 100. This is very large and could be because mid-size cars include some models that are far more expensive than any van. By contrast, the difference in range between mid-size and sporty cars is only $7 500, which could be because both car types cover a similarly wide spread of prices.


TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH TOP PRICE

Type of measure           Compact       Small         Mid-size      Large         Sporty        Van
                          ($'000)       ($'000)       ($'000)       ($'000)       ($'000)       ($'000)
Mean                      20.725        11.9047619    30.31363636   25.67272727   21.95714286   22.03333333
Standard Error            1.990236586   0.61172964    3.216248624   2.010702768   2.291257016   1.003050902
Median                    18.5          11.3          27.35         21.9          21.2          21.7
Standard Deviation        7.960946342   2.803297378   15.08554324   6.668746645   8.573098738   3.009152705
Coefficient of variation  0.384122863   0.235476979   0.497648749   0.259759961   0.390446917   0.136572774
Skewness                  0.94958771    0.918155113   1.816956584   0.912413115   1.029276406   0.124539188
Range                     25.7          10.9          65.1          19.4          30.5          8.6
Minimum                   11.4          7.9           14.9          18.4          11            18
Maximum                   37.1          18.8          80            37.8          41.5          26.6
Sum                       331.6         250           666.9         282.4         307.4         198.3
Count                     16            21            22            11            14            9

Fig.2 top price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN

Fig. 2 shows mean top prices of $20 725 for compact cars, $11 905 for small cars, $30 314 for mid-size cars, $25 673 for large cars, $21 957 for sporty cars and $22 033 for vans. Mid-size cars have the highest mean price, which makes them the most expensive cars on average. This may be because they have the highest average number of airbags of any car type, a feature that adds to their cost. Small cars have the lowest mean price, which may be because they have the smallest average engine size, just 1.6 litres.

MEDIAN

The median top price is $18 500 for compact cars, $11 300 for small cars and $27 350 for mid-size cars. For large cars it is $21 900, for sporty cars $21 200 and for vans $21 700. Mid-size cars have the highest median price and small cars the lowest. The difference between these two medians is $16 050, which could be because the specifications and extras of the two car types are different.


SKEWNESS

Fig. 2 shows skewness of 0.9496 for compact cars, 0.9182 for small cars, 1.8170 for mid-size cars, 0.9124 for large cars, 1.0293 for sporty cars and 0.1245 for vans. All car types therefore show positive skewness, strongest for mid-size cars and close to zero for vans. This might be because mid-size cars are the only car type with a total of 25 airbags, whereas the other car types have an average of 10 airbags.

COEFFICIENT OF VARIATION

Looking at Figure 2, the coefficient of variation values for compact, small, mid-size, large, sporty and van cars are 0.3841, 0.2355, 0.4976, 0.2598, 0.3904 and 0.1366 respectively. Mid-size, sporty and compact cars have the highest values, which means that their top prices are widely spread around the mean, with some prices very high and some very low. Large and small cars are moderately dispersed, and vans, with the lowest value, have top prices clustered most closely around the mean.

RANGE

Figure 2 shows that mid-size cars have the highest range in top price, $65 100, followed by sporty cars with $30 500 and compact cars with $25 700. Next come large cars with a range of $19 400 and small cars with $10 900. Vans show the lowest range in top price, $8 600. The difference between the highest range (mid-size cars) and the lowest (vans) is $56 500. These differences in range could be because some car types include a few extremely high top prices.


TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG TOWN

Type of measure           Compact       Small         Mid-size      Large         Sporty        Van
Mean                      22.6875       29.85714286   19.54545455   18.3636364    21.78571429   17
Standard Error            0.480613757   1.333248297   0.404130582   0.45272362    1.043970503   0.40824829
Median                    23            29            19            19            22.5          17
Standard Deviation        1.922455028   6.109711239   1.895540449   1.50151439    3.906179944   1.224744871
Coefficient of variation  0.084736309   0.204631476   0.096981139   0.08176563    0.179300063   0.072043815
Skewness                  -0.00578058   1.287889433   0.038136436   -0.5460437    0.476793571   -1.04978132
Range                     6             24            7             4             13            3
Minimum                   20            22            16            16            17            15
Maximum                   26            46            23            20            30            18
Sum                       363           627           430           202           305           153
Count                     16            21            22            11            14            9

Fig 3 MPG town

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN

Figure 3 shows a mean mpg town of 22.69 for compact cars, 29.86 for small cars, 19.55 for mid-size cars, 18.36 for large cars, 21.79 for sporty cars and 17.00 for vans. Small cars have the highest mean mpg because they have relatively small engines, so they travel more miles per gallon than vans, which have the lowest mean mpg and bigger engines. Small cars are therefore the most fuel efficient and can travel further than vans on the same amount of fuel.

MEDIAN

Figure 3 shows that the median values of mpg town, in ascending order, are 17 mpg for vans, 19 mpg for large and mid-size cars, 22.5 mpg for sporty cars, 23 mpg for compact cars and 29 mpg for small cars. The middle of these six medians is 20.75 mpg, with three car types above it and three below. The median is not affected by extremely high or low mpg values.


SKEWNESS

Figure 3 shows skew measures of -0.0058 for compact cars, 1.2879 for small cars, 0.0381 for mid-size cars, -0.5460 for large cars, 0.4768 for sporty cars and -1.0498 for vans. Compact and mid-size cars have values very close to zero, so their distributions are roughly symmetric and their mean and median mpg are nearly the same. Small cars show considerable positive skewness and sporty cars moderate positive skewness, which indicates a tail of models with unusually high mpg. Large cars and vans have negative skewness, which indicates a tail of models with unusually low mpg.

COEFFICIENT OF VARIATION

The coefficient of variation is 0.0847 for compact cars, 0.2046 for small cars, 0.0970 for mid-size cars, 0.0818 for large cars, 0.1793 for sporty cars and 0.0720 for vans. The coefficient of variation has no units because it divides the standard deviation by the mean. Small and sporty cars have the highest values, which means that they include some models with mpg well above or below the mean. For the other car types the values are below 0.10, so their mpg values lie close to the mean.

RANGE

Figure 3 indicates that compact cars have a range of 6 mpg, small cars 24 mpg, mid-size cars 7 mpg, large cars 4 mpg, sporty cars 13 mpg and vans 3 mpg. Vans display the smallest range among all the car types, so we can conclude that there is the least dispersion in their mpg. Small cars have the highest range, 24 mpg; this could be because some small cars have very small engines and a very low fuel consumption rate.


TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG BEST

Type of measure           Compact       Small         Mid-size      Large         Sporty        Van
Mean                      29.875        35.47619048   26.72727273   26.72727273   28.78571429   21.8888889
Standard Error            0.735272058   1.224004061   0.383545875   0.383545875   0.973148123   0.4843221
Median                    30            33            26.5          26            28.5          22
Standard Deviation        2.941088234   5.60909126    1.272077756   1.272077756   3.641186861   1.45296631
Coefficient of variation  0.098446468   0.158108614   0.047594745   0.047594745   0.126492843   0.06637917
Skewness                  0.589051528   1.184606075   0.12162795    -0.09127183   0.501810576   -0.0711534
Range                     10            21            9             3             12            4
Minimum                   26            29            22            25            24            20
Maximum                   36            50            31            28            36            24
Sum                       478           745           588           294           403           197
Count                     16            21            22            11            14            9

Fig. 4 MPG best

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN

Fig. 4 shows a mean mpg best of 29.875 for compact cars, 35.476 for small cars, 26.727 for mid-size cars, 26.727 for large cars, 28.786 for sporty cars and 21.889 for vans. Small cars have the highest mean mpg, 35.476. They have a relatively small engine size of 1.6 litres per car and can therefore travel a long distance on a gallon of fuel. In contrast, vans have the lowest mean mpg, 21.889. They have a large engine size of about 3.2 litres per car and cover a shorter distance on a gallon of fuel. Small cars therefore have a lower fuel consumption rate than vans.

MEDIAN

According to Fig. 4, vans have the smallest median mpg best, 22 mpg, followed by large cars with 26 mpg. Mid-size and sporty cars have medians of 26.5 mpg and 28.5 mpg respectively. Compact cars have a median of 30 mpg and lastly small cars have a median of 33 mpg. The median mpg values are unaffected by extremely low or high mpg values.

SKEWNESS

According to Fig. 4, small cars have the highest skew measure, 1.1846, followed by compact cars with 0.5891 and sporty cars with 0.5018. Mid-size cars (0.1216), large cars (-0.0913) and vans (-0.0712) have skew values close to zero, so their distributions are roughly symmetric. The positive skewness of small, compact and sporty cars indicates a tail of models with unusually high mpg. This might be related to differences in engine size within these car types.

COEFFICIENT OF VARIATION

The coefficient of variation is 0.0984 for compact cars, 0.1581 for small cars, 0.0476 for mid-size cars, 0.0476 for large cars, 0.1265 for sporty cars and 0.0664 for vans. According to Fig. 4, small and sporty cars have the most dispersed mpg relative to the mean, which means that they include some models with very high or very low mpg. The values for the other car types are below 0.10, so their mpg values lie close to the mean.

RANGE

The range is 10 mpg for compact cars, 21 mpg for small cars, 9 mpg for mid-size cars, 3 mpg for large cars, 12 mpg for sporty cars and 4 mpg for vans. Small cars have the highest range and large cars the lowest. This means that there is less dispersion in the mpg of large cars than of small cars: the mpg of large cars is clustered closely around the mean, whereas that of small cars is dispersed away from it.

(B) HYPOTHESIS TEST OF TWO MEANS

We undertook hypothesis testing to test the interesting differences between car types for prices and mpg’s. The t-test was used as n was less than 30.

BASIC PRICE

COMPACT AND VAN CAR TYPES

We tested the difference in mean basic prices because there was not so much difference between their prices.

1. H₀ : μ₁=μ₂

H₁: μ₁≠μ₂

2. α= 0.05

3. A t-test is used because n is less than 30 (n₁ = 16 and n₂ = 9) and the population standard deviation is not known.

4. Decision rule: reject H₀ if tc < -2.086 or tc > 2.086 (critical value = 2.086).

5. T-statistic = -0.313191922. Therefore, at the 5% level of significance, we do not reject the null hypothesis. There is insufficient evidence that the mean basic prices of compact and van cars differ, so the two samples may come from populations with the same mean.
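The reported t-statistic can be reproduced from the means and standard errors in Fig. 1. The sketch below assumes the unequal-variance (Welch) form of the two-sample t-test; the project does not name the variant, and this form is used here only because it matches the reported value. The function name is illustrative.

```python
import math

def welch_t(mean1, se1, n1, mean2, se2, n2):
    """Return (t statistic, approximate degrees of freedom) for two means
    without assuming equal population variances."""
    v1, v2 = se1 ** 2, se2 ** 2            # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Compact vs van, basic price ($'000): values copied from Fig. 1
t_stat, df = welch_t(15.69375, 1.468288935, 16, 16.2, 0.67597666, 9)

critical = 2.086                           # two-tailed 5% value quoted in the text
reject_null = abs(t_stat) > critical
print(round(t_stat, 4), round(df, 1), reject_null)
```

The approximate degrees of freedom come out just above 20, which agrees with the critical value of 2.086 quoted in step 4.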

SPORTY AND VAN CAR TYPES

There is only a slight difference between the mean basic prices of these vehicle types, so the difference was tested.

1. H₀ : μ₁=μ₂

H₁: μ₁≠μ₂

2. α = 0.05

3. A t-test is used because n is less than 30 (n₁ = 14 and n₂ = 9) and the population standard deviation is not known.

4. Critical value = 2.120, therefore reject H₀ if tc < -2.120 or tc > 2.120.

5. T-statistic = 0.296577999. Therefore, at the 5% level of significance, the null hypothesis is not rejected. There is insufficient evidence that the mean basic prices of sporty and van cars differ.

TOP PRICE

SPORTY AND VAN CAR TYPES

1. H₀ : μ₁=μ₂

H₁: μ₁≠μ₂

2. α = 0.05

3. A t-test is used because n is less than 30 (n₁ = 14 and n₂ = 9) and the population standard deviation is not known.

4. Critical value = 2.179. Reject H₀ if tc < -2.179 or tc > 2.179.

5. T-statistic = -0.030460311. At the 5% level of significance we do not reject the null hypothesis. There is insufficient evidence that the mean top prices of sporty and van cars differ.


SPORTY AND COMPACT CAR TYPES

1. H₀ : μ₁=μ₂

H₁: μ₁≠μ₂

2. α = 0.05

3. A t-test is used because n is less than 30 (n₁ = 16 and n₂ = 14) and the population standard deviation is not known.

4. Critical value = 2.052. Reject H₀ if tc < -2.052 or tc > 2.052.

5. T-statistic = -0.405985032. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean top prices of sporty and compact cars differ.

MPG TOWN

COMPACT AND SPORTY CAR TYPES

A hypothesis test is taken to test the slight difference in means between these two car types.

1. H₀ : μ₁=μ₂

H₁: μ₁≠μ₂

2. α = 0.05

3. A t-test is used because n is less than 30 (n₁ = 16 and n₂ = 14) and the population standard deviation is not known.

4. Critical value = 2.101. Reject H₀ if tc < -2.101 or tc > 2.101.

5. T-statistic = 0.7846646963. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg town of sporty and compact cars differ, so the two samples may come from populations with the same mean.


MID-SIZE AND LARGE CAR TYPES

The difference in means between these two vehicle types was so small that we carried out a hypothesis test.

1. H₀: μ₁=μ₂

H₁: μ₁≠μ₂

2. α= 0.05

3. t- test because n is less than 30 (n₁=22 and n₂=11) and the population standard deviation is not known.

4. Critical value = 2.060. Reject H₀ if tc < -2.060 or tc > 2.060.

5. T-statistic = 1.947428329. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg town of mid-size and large cars differ, so the two samples may come from populations with the same mean.

MPG BEST

MID-SIZE AND LARGE CAR TYPES

We carried out a hypothesis test as follows to test the difference in means for the above car types:

1. H₀: μ₁=μ₂

H₁: μ₁≠μ₂

2. α = 0.05

3. A t-test is used because n is less than 30 (n₁ = 22 and n₂ = 11) and the population standard deviation is not known.

4. Critical value = 2.086. Reject H₀ if tc < -2.086 or tc > 2.086.

5. T-statistic = 0. At the 5% level of significance the null hypothesis is not rejected. The two sample means are identical, so there is no evidence that the mean mpg best of mid-size and large cars differ.

COMPACT AND SPORTY CAR TYPES

A t-test hypothesis test is undertaken to test the difference in means between these two vehicle types.


1. H₀: μ₁=μ₂

H₁: μ₁≠μ₂

2. α= 0.05

3. t- test because n is less than 30 (n₁=16 and n₂=14) and the population standard deviation is not known.

4. Critical value = 2.060. Reject H₀ if tc < -2.060 or tc > 2.060.

5. T-statistic = 0.893084498. At the 5% level of significance the null hypothesis is not rejected. There is insufficient evidence that the mean mpg best of sporty and compact cars differ, so the two samples may come from populations with the same mean.

(C) CONFIDENCE INTERVAL

Here confidence intervals are used for prices and mpg in the population of cars. A 95% confidence level was used, so each interval extends 1.96 standard errors either side of the sample mean, rather than the 2.58 standard errors that a 99% level would require.


A TABLE SHOWING CONFIDENCE INTERVAL ACROSS CAR TYPES

BASIC PRICE

Car types Mean price($’000) Confidence interval($’000)

Compact 15.6937 12.81590369 to 18.57159631

Small 8.4285172539 7.789937051 to 9.067097233

Mid-size 24.11363636 19.87124765 to 28.35602507

Large 22.93636364 19.23651783 to 26.63620945

Sporty 16.85714286 12.72130787 to 20.99297785

Van 16.2 14.875085575 to 17.52491425

Figure 5

From Figure 5, we are 95% confident that the true population mean basic price lies between $12 816 and $18 572 for compact cars, $7 790 and $9 067 for small cars, $19 871 and $28 356 for mid-size cars, $19 237 and $26 636 for large cars, $12 721 and $20 993 for sporty cars and $14 875 and $17 525 for vans.
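The intervals in Figure 5 follow from the sample mean plus or minus 1.96 standard errors. The following sketch reproduces the compact-car interval; the function name is illustrative, and the standard error is the value reported in Fig. 1.

```python
def confidence_interval(mean, std_error, z=1.96):
    """Return (lower, upper) limits of mean +/- z standard errors."""
    margin = z * std_error
    return mean - margin, mean + margin

# Compact cars, basic price ($'000): mean and standard error from Fig. 1
low, high = confidence_interval(15.69375, 1.468288935)
print(round(low, 3), round(high, 3))
```

With samples this small, a t critical value in place of 1.96 would give somewhat wider intervals; the sketch keeps 1.96 because that is the multiplier the reported limits correspond to.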


TOP PRICE

Car types    Mean price ($)    Confidence interval ($)
Compact      20 725            16 824 to 24 626
Small        11 905            10 706 to 13 104
Mid-size     30 314            24 010 to 36 617
Large        25 673            21 732 to 29 614
Sporty       21 957            17 466 to 26 448
Van          22 033            20 067 to 23 999

Figure 6

Figure 6 shows that we are 95% confident that the population mean top price lies between $16 824 and $24 626 for compact cars, $10 706 and $13 104 for small cars, $24 010 and $36 617 for mid-size cars and $21 732 and $29 614 for large cars. For sporty cars it lies between $17 466 and $26 448 and for vans between $20 067 and $23 999.

MPG TOWN

Car types Mean mpg(miles) Confidence interval( miles)

Compact 22.7 21.7 to 23.6

Small 29.9 27.2 to 32.5

Mid-size 19.5 18.7 to 20.3

Large 18.4 17.5 to 19.2

Sporty 21.8 19.7 to 23.8

van 17.0 16.2 to 17.8

Figure 7

Figure 7 above shows that we are 95% confident that the true population mean mpg town lies between 21.7 and 23.6 mpg for compact cars, 27.2 and 32.5 mpg for small cars, 18.7 and 20.3 mpg for mid-size cars, 17.5 and 19.2 mpg for large cars and 19.7 and 23.8 mpg for sporty cars. Lastly, for vans it lies between 16.2 and 17.8 mpg.

MPG BEST

Car types Mean mpg(miles) Confidence Interval( miles)

Compact 29.9 28.4 to 31.3

Small 35.5 33.1 to 37.9

Mid-size 26.7 26.0 to 27.5

Large 26.7 26.0 to 27.5

Sporty 28.8 26.9 to 30.7

Van 21.9 20.9 to 22.8

Figure 8

Figure 8 reveals that we are 95% confident that the true population mean mpg best lies between 28.4 and 31.3 mpg for compact cars, 33.1 and 37.9 mpg for small cars, 26.0 and 27.5 mpg for both mid-size and large cars, and 26.9 and 30.7 mpg for sporty cars, whereas for vans it lies between 20.9 and 22.8 mpg.


(D) PROBABILITY

BASIC PRICE

(Average price $17 100)

(Number of cars)

Car types    High price    Low price    Total
Compact      5             11           16
Small        0             21           21
Mid-size     13            9            22
Large        11            0            11
Sporty       4             10           14
Van          2             7            9
Total        35            58           93

Figure 9

According to Figure 9, the probability that a car:

a) Is compact, given that it is low in basic price, is 11/58

b) Is large, given that it is high in basic price, is 11/35

c) Is a van is 9/93

d) Is of a low price is 58/93
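The difference between a marginal, a joint and a conditional probability can be read off the counts in Figure 9. The lines below are a small sketch of that distinction using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Counts copied from Figure 9: (high price, low price) for each car type
counts = {
    "Compact": (5, 11), "Small": (0, 21), "Mid-size": (13, 9),
    "Large": (11, 0), "Sporty": (4, 10), "Van": (2, 7),
}

total = sum(hi + lo for hi, lo in counts.values())      # all cars
total_low = sum(lo for _, lo in counts.values())        # low-priced cars

p_van = Fraction(sum(counts["Van"]), total)                      # marginal
p_compact_and_low = Fraction(counts["Compact"][1], total)        # joint
p_compact_given_low = Fraction(counts["Compact"][1], total_low)  # conditional

print(p_van, p_compact_and_low, p_compact_given_low)
```

A joint probability divides by all 93 cars, while a conditional probability divides only by the cars that satisfy the condition, here the 58 low-priced cars.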


TOP PRICE

(Average price $21 900)

(Number of cars)

Car types    High price    Low price    Total
Compact      6             10           16
Small        0             21           21
Mid-size     14            8            22
Large        6             5            11
Sporty       6             8            14
Van          4             5            9
Total        36            57           93

Figure 10

Figure 10 reveals that the probability that a car:

a) Is mid-sized is 22/93

b) Is low in top price is 57/93

c) Is low in top price, given that it is sporty, is 8/14

d) Is highly priced, given that it is compact, is 6/16


MPG TOWN

(Average MPG 22.5miles)

(Number of cars)

Car types    High MPG    Low MPG    Total
Compact      9           7          16
Small        20          1          21
Mid-size     1           21         22
Large        11          0          11
Sporty       7           7          14
Van          0           9          9
Total        48          45         93

Figure 11

From Figure 11, the probability that a car:

a) Is a van, given that it has a low mpg, is 9/45

b) Is sporty is 14/93

c) Is small, given that it has a high mpg, is 20/48

d) Is a van, given that it has a high mpg, is 0/48


MPG BEST

(Average MPG 29.1 miles)

(Number of cars)

Car types    High MPG    Low MPG    Total
Compact      9           7          16
Small        19          2          21
Mid-size     4           18         22
Large        0           11         11
Sporty       6           8          14
Van          0           9          9
Total        38          55         93

Figure 12

Figure 12 shows that the probability that a car:

a) Has a high mpg is 38/93

b) Has a low mpg, given that it is small, is 2/21

c) Is sporty, given that it has a high mpg, is 6/38

d) Is compact is 16/93


(E) CHI-SQUARED

Chi-squared was used across both prices and MPG’s to test the apparent pattern of probability.

BASIC PRICE

Car types    f₀ (high price)    fₑ      f₀ - fₑ    (f₀ - fₑ)²    (f₀ - fₑ)²/fₑ
Compact      5                  5.83    -0.83      0.69          0.12
Small        0                  5.83    -5.83      34.0          5.83
Mid-size     13                 5.83    7.17       51.4          8.82
Large        11                 5.83    5.17       26.7          4.58
Sporty       4                  5.83    -1.83      3.35          0.57
Van          2                  5.83    -3.83      14.67         2.52

Figure 13.

From Figure 13, the computed χ² value is 22.44. With six car types there are 5 degrees of freedom, so the critical value at the 0.05 level of significance is 11.1. The computed value lies in the rejection region beyond this critical value, so we reject the null hypothesis and accept the alternate hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.
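The χ² statistic for Figure 13 can be recomputed directly from the observed counts. This is a minimal sketch that assumes, as the table does, the same expected frequency for every car type; the variable names are illustrative.

```python
# Observed numbers of high-priced cars from Figure 13
# (compact, small, mid-size, large, sporty, van)
observed = [5, 0, 13, 11, 4, 2]

# Equal expected frequency for each of the six car types
expected = sum(observed) / len(observed)

chi_squared = sum((o - expected) ** 2 / expected for o in observed)

critical = 11.1                      # 5 degrees of freedom, 0.05 level (from the text)
reject_null = chi_squared > critical
print(round(chi_squared, 2), reject_null)
```

Working with unrounded terms gives a total of about 22.43, slightly below the 22.44 obtained by adding the rounded column entries; the conclusion is the same either way.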

TOP PRICE

Car types    f₀ (high price)    fₑ    f₀ - fₑ    (f₀ - fₑ)²    (f₀ - fₑ)²/fₑ
Compact      6                  6     0          0             0
Small        0                  6     -6         36            6
Mid-size     14                 6     8          64            10.67
Large        6                  6     0          0             0
Sporty       6                  6     0          0             0
Van          4                  6     -2         4             0.67

Figure 14

f₀: observed high price

fₑ: expected high price

From Figure 14, the computed χ² value is 17.34. It lies in the rejection region beyond the critical value of 11.1. At the 0.05 level of significance we reject the null hypothesis and accept the alternate hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.

MPG TOWN

Car types    f₀ (high mpg)    fₑ    f₀ - fₑ    (f₀ - fₑ)²    (f₀ - fₑ)²/fₑ
Compact      9                8     1          1             0.125
Small        20               8     12         144           18
Mid-size     1                8     -7         49            6.125
Large        11               8     3          9             1.125
Sporty       7                8     -1         1             0.125
Van          0                8     -8         64            8

Figure 15

Figure 15 reveals that the computed χ² value is 33.5. It is in the rejection region beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternate hypothesis.

MPG BEST

Car types    f₀ (high mpg)    fₑ      f₀ - fₑ    (f₀ - fₑ)²    (f₀ - fₑ)²/fₑ
Compact      9                6.33    2.67       7.13          1.13
Small        19               6.33    12.67      160.53        25.36
Mid-size     4                6.33    -2.33      5.43          0.86
Large        0                6.33    -6.33      40.01         6.32
Sporty       6                6.33    -0.33      0.11          0.02
Van          0                6.33    -6.33      40.01         6.32

Figure 16

Figure 16 reveals that the computed χ² value is 40.01. It is in the rejection region beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternate hypothesis.

PART 2


AIM: PREDICTING MPG

(2) a. For this part we examine the relationship between MPG (miles per gallon) and other variables in the data set. MPG town was chosen as the measure to predict because it seemed suitable and reasonable.

A table showing a correlation matrix across vehicle types and variables

MPG TOWN

               Compact    Small    Mid-size    Large    Sporty    Van
HP             .021       -.574    -.742       -.335    -.776     .069
Length         .307       -.441    -.581       -.488    -.362     .123
Engine size    -.019      -.791    -.718       -.960    -.578     -.257
RPM            .031       .054     -.153       .870     -.040     .637
Weight         -.282      -.738    -.811       -.670    -.703     -.378

Figure 17

From figure 17, variables which seem to be relevant and reasonable determinants of MPG town are horsepower (hp), engine size (litres), maximum revolutions of engine per minute (RPM) and weight (pounds).

HORSEPOWER (hp)

From figure 17, the coefficient of correlation r for MPG town and horsepower for car types are as follows, compact 0.021, small -0.574, mid-size -0.742, large cars -0.335, sporty cars -0.776 and van cars 0.069. Sporty cars have the strongest negative correlation therefore horsepower can be used to predict the MPG for sporty cars, while compact cars have the weakest positive correlation with an r value quite close to 0.

ENGINE SIZE (litres)

Figure 17 shows that the coefficient of correlation r for MPG town and engine size is -0.019 for compact cars, -0.791 for small cars, -0.718 for mid-size cars, -0.960 for large cars, -0.578 for sporty cars and -0.257 for vans. Large cars have the strongest negative correlation, so there is a strong relationship between engine size and MPG town for large cars, and engine size can be used as an MPG determinant for them.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)

Figure 17 shows that the coefficients of correlation r between MPG town and RPM by car type are: compact 0.031, small 0.054, mid-size -0.153, large 0.870, sporty -0.040 and van 0.637. Large cars have the strongest positive correlation and therefore a strong relationship between RPM and MPG town. RPM can thus be used as a predictor of MPG town for large cars, unlike compact cars, where r is close to 0.

WEIGHT (pounds)

From figure 17, the coefficients of correlation r between MPG town and weight by car type are: compact -0.282, small -0.738, mid-size -0.811, large -0.670, sporty -0.703 and van -0.378. Mid-size cars show the strongest negative correlation among the car types, which indicates a relationship between weight and MPG town for mid-size cars. Weight can therefore be used as an MPG determinant for mid-size cars.

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG TOWN AND RELEVANT VARIABLES

Graph 1

Graph 1 shows a coefficient of determination, r², of 0.1121 for the equation y = -0.023x + 22.495, where y = MPG town and x = horsepower. That is, 11.2% of the variation in MPG town is explained by changes in horsepower. The slope shows that each unit increase in horsepower is associated with a decrease of 0.023 in MPG town.
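The trend line from Graph 1 can be used to produce a predicted MPG town for a given horsepower. The following is a minimal sketch of that calculation. The function name and the horsepower values tried are illustrative assumptions, and only the slope and intercept come from the graph.

```python
def predict_mpg_town(horsepower, slope=-0.023, intercept=22.495):
    """Predict MPG town from horsepower using the Graph 1 trend line."""
    return slope * horsepower + intercept


# Illustrative horsepower values (not taken from the data set).
for hp in (100, 150, 200):
    print(hp, predict_mpg_town(hp))
```

Because r² is only 0.1121, such predictions are rough: most of the variation in MPG town is left unexplained by horsepower alone.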

Graph 2

Graph 2 shows a coefficient of determination, r², of 0.9224 for the equation y = -1.8184x + 26.018, where y = MPG town and x = engine size. That is, 92% of the variation in MPG town is explained by changes in engine size. The slope shows that each unit increase in engine size is associated with a decrease of 1.8184 in MPG town. Engine size is therefore a very strong predictor of MPG town for large cars.

Graph 3

Graph 3 shows a coefficient of determination, r², of 0.6572 for the equation y = -0.0048x + 35.803, where y = MPG town and x = weight. That is, 65.7% of the variation in MPG town is explained by changes in weight. The slope shows that each unit increase in weight is associated with a decrease of 0.0048 in MPG town. Weight is therefore a reasonably good predictor of MPG town for mid-size cars.
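For a simple linear regression the coefficient of determination is the square of the coefficient of correlation, so the r² values on the graphs can be checked against Figure 17. The sketch below does this. Pairing Graph 1 with large cars is an assumption based on the conclusion of this report, and the small differences are expected because Figure 17 is rounded to three decimals.

```python
def r_squared(r):
    """Coefficient of determination for a simple linear regression."""
    return r ** 2


# (r from Figure 17, r^2 reported on the graph)
checks = {
    "Graph 1: horsepower, large cars (assumed)": (-0.335, 0.1121),
    "Graph 2: engine size, large cars": (-0.960, 0.9224),
    "Graph 3: weight, mid-size cars": (-0.811, 0.6572),
}

for label, (r, reported) in checks.items():
    print(label, round(r_squared(r), 4), reported)
```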

A TABLE SHOWING A CORRELATION MATRIX ACROSS VEHICLE TYPES AND VARIABLES

MPG BEST

Variable Compact Small Mid-size Large Sporty Van
HP -.053 -.546 -.685 -.222 -.724 .355
Length .187 -.295 -.294 -.654 -.176 .296
Engine size -.317 -.628 -.548 -.780 -.453 -.041
RPM .149 -.014 -.254 .797 -.207 .549
Weight -.452 -.620 -.688 -.745 -.612 -.198

Figure 18

From figure 18, variables which seem to be relevant and reasonable determinants of MPG best are horsepower (hp), engine size (litres), maximum revolutions of engine per minute (RPM) and weight (pounds).

HORSEPOWER

From figure 18, the coefficients of correlation r between MPG best and horsepower by car type are: compact -0.053, small -0.546, mid-size -0.685, large -0.222, sporty -0.724 and van 0.355. Sporty cars have the strongest negative correlation, so horsepower can be used to predict MPG best for sporty cars. Compact cars have the weakest correlation, with an r value close to 0.

ENGINE SIZE (litres)

Figure 18 shows that the coefficients of correlation r between MPG best and engine size by car type are: compact -0.317, small -0.628, mid-size -0.548, large -0.780, sporty -0.453 and van -0.041. Large cars have the strongest negative correlation, which indicates a relationship between engine size and MPG best for large cars, so engine size can be used as an MPG determinant for large cars.

WEIGHT (pounds)

From figure 18, the coefficients of correlation r between MPG best and weight by car type are: compact -0.452, small -0.620, mid-size -0.688, large -0.745, sporty -0.612 and van -0.198. Large cars show the strongest negative correlation among the car types, which indicates a relationship between weight and MPG best for large cars. Weight can therefore be used as an MPG determinant for large cars, while vans show the weakest correlation.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)

Figure 18 shows that the coefficients of correlation r between MPG best and RPM by car type are: compact 0.149, small -0.014, mid-size -0.254, large 0.797, sporty -0.207 and van 0.549. Large cars have the strongest positive correlation and therefore a strong relationship between RPM and MPG best. RPM can thus be used as a predictor of MPG best for large cars, unlike compact cars.

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG BEST AND RELEVANT VARIABLES

Graph 4

Graph 4 shows a coefficient of determination, r², of 0.609 for the equation y = -1.2518x + 31.996, where y = MPG best and x = engine size. That is, 60.9% of the variation in MPG best is explained by changes in engine size. The slope shows that each unit increase in engine size is associated with a decrease of 1.2518 in MPG best. Engine size is therefore a reasonably good predictor of MPG best for large cars.

Graph 5

Graph 5 shows a coefficient of determination, r², of 0.0028 (consistent with r = -0.053 in figure 18) for the equation y = -0.068x + 30.768, where y = MPG best and x = horsepower. That is, only 0.28% of the variation in MPG best is explained by changes in horsepower. The intercept implies an MPG best of 30.768 when horsepower is 0, which is not physically meaningful. The slope shows that each unit increase in horsepower is associated with a decrease of 0.068 in MPG best. Horsepower therefore cannot be used to determine MPG best for compact cars.

Graph 6

Graph 6 shows a coefficient of determination, r², of 0.0392 for the equation y = -0.0019x + 29.036, where y = MPG best and x = weight. That is, 3.9% of the variation in MPG best is explained by changes in weight. The intercept implies an MPG best of 29.036 when weight is 0, which is not physically meaningful. The slope shows that each additional pound of weight is associated with a decrease of 0.0019 in MPG best. This clearly shows that weight is a poor predictor of MPG best for vans.

CONCLUSION

From the analysis of the vehicle data set, the following conclusions can be drawn for the first aim. Across all the car types, mid-size cars have the highest mean prices because they have the most features, such as a large number of airbags and large engine sizes. Both median prices are unaffected by extremely low or high prices. All the car types are positively skewed except vans, because these car types have a high average number of 14.4 airbags whereas vans have a low average number of 3 airbags. The coefficient of variation shows that all car types have prices spread away from the mean, with extreme values, except for vans and large cars, whose prices lie close to the mean.

Of all the car types, small cars have the highest mean mpg, mainly because they have small engine sizes. Both mpg medians are unaffected by extremely high or low mpg values. The mpg variables show all types of skewness (symmetric, positive and negative), which is attributed to the small, medium-sized and relatively larger engines respectively. The coefficient of variation shows that the mpg values for all the car types are spread away from the mean, that is, there are extremely high and low mpg values.

All hypothesis test results from the differences of means tests conducted show sufficient evidence that all these car types are from the same population. The confidence intervals show the range in which the true population mean is likely to lie. The chi-square tests, used to test the apparent pattern in the probabilities, show that the difference between the observed and expected frequencies is large enough to be considered significant.

For the second aim, the correlation matrices show that horsepower has its strongest negative correlation with mpg for sporty cars, so it can be used to predict mpg for sporty cars, whereas compact cars have the weakest positive correlation, with an r value close to zero. Engine size can be used as a determinant of mpg for large cars, since there is a strong negative correlation between engine size and mpg for large cars. Large cars also have a strong positive correlation between maximum revolutions of engine per minute and mpg town. Another significant relationship that can be used to predict mpg town is the strong negative correlation between weight and mpg town for mid-size cars.

Using the equations from the scatter diagrams showing the relationship between different variables and mpg town, it has been established that 11.2% of the variation in mpg town is explained by changes in horsepower for large cars, and 92% of the variation in mpg town is explained by engine size for large cars.

APPENDICES

BASIC PRICE

TOP PRICE

MPG TOWN

MPG BEST

The Excel Analysis ToolPak was used to obtain the descriptive statistics of mean, median, mode and skewness. The coefficient of variation was calculated by hand by applying the formula coefficient of variation = standard deviation / mean.
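The hand calculation of the coefficient of variation can be sketched in Python as follows. The function name is illustrative, and the inputs are the standard deviation and mean of the first basic-price block listed under CONFIDENCE INTERVAL in this appendix (5.87315574 and 15.69375).

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return std_dev / mean


cv = coefficient_of_variation(5.87315574, 15.69375)
print(f"{cv:.1%}")  # the ratio shown as a percentage
```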

CONFIDENCE INTERVAL

BASIC PRICE

compact

Confidence interval - mean
95% confidence level
15.69375 mean
5.87315574 std. dev.
16 n
1.960 z
2.8778 half-width
18.5715 upper confidence limit
12.8160 lower confidence limit
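Each block in this appendix follows the same arithmetic: half-width = z × std. dev. / √n, with the limits at mean ± half-width. The sketch below reproduces the first basic-price block (n = 16) under that assumption. The function name is illustrative, and the z value is taken from Python's standard normal distribution rather than typed in as 1.960.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, std_dev, n, level=0.95):
    """Return (lower, upper, half_width) of a z-based interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.960 for 95%
    half_width = z * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width, half_width


lower, upper, half_width = z_confidence_interval(15.69375, 5.87315574, 16)
print(round(half_width, 4), round(upper, 4), round(lower, 4))
```

The same function can be applied to the other blocks by substituting their mean, standard deviation and n.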

small

Confidence interval - mean
95% confidence level
11.9047619 mean
2.803297378 std. dev.
21 n
1.960 z
1.1990 half-width
13.1037 upper confidence limit
10.7058 lower confidence limit

mid-size

Confidence interval - mean
95% confidence level
24.9363636 mean
10.15233004 std. dev.
22 n
1.960 z
4.2423 half-width
29.1787 upper confidence limit
20.6941 lower confidence limit

large

Confidence interval - mean
95% confidence level
22.9363634 mean
6.260714452 std. dev.
11 n
1.960 z
3.6998 half-width
26.6361 upper confidence limit
19.2366 lower confidence limit

sporty

Confidence interval - mean
95% confidence level
16.85714286 mean
7.895345687 std. dev.
14 n
1.960 z
4.1358 half-width
20.9929 upper confidence limit
12.7214 lower confidence limit

van

Confidence interval - mean
95% confidence level
16.2 mean
2.027929979 std. dev.
9 n
1.960 z
1.325 half-width
17.525 upper confidence limit
14.875 lower confidence limit

TOP PRICE

compact

Confidence interval - mean
95% confidence level
20.725 mean
7.960947 std. dev.
16 n
1.960 z
3.901 half-width
24.626 upper confidence limit
16.824 lower confidence limit

small

Confidence interval - mean
95% confidence level
11.90476 mean
2.803297 std. dev.
21 n
1.960 z
1.1990 half-width
13.1037 upper confidence limit
10.7058 lower confidence limit

mid-size

Confidence interval - mean
95% confidence level
30.31364 mean
15.08554 std. dev.
22 n
1.960 z
6.3037 half-width
36.6174 upper confidence limit
24.0099 lower confidence limit

large

Confidence interval - mean
95% confidence level
25.67273 mean
6.668747 std. dev.
21 n
1.960 z
2.8522 half-width
28.5249 upper confidence limit
22.8205 lower confidence limit

sporty

Confidence interval - mean
95% confidence level
21.95714 mean
8.573099 std. dev.
14 n
1.960 z
4.4908 half-width
26.4479 upper confidence limit
17.4664 lower confidence limit

van

Confidence interval - mean
95% confidence level
22.03333 mean
3.009153 std. dev.
9 n
1.960 z
1.9659 half-width
23.9993 upper confidence limit
20.0674 lower confidence limit

MPG TOWN

compact

Confidence interval - mean
95% confidence level
22.6875 mean
1.922455028 std. dev.
16 n
1.960 z
0.9420 half-width
23.6295 upper confidence limit
21.7455 lower confidence limit

small

Confidence interval - mean
95% confidence level
29.85714286 mean
6.109711239 std. dev.
21 n
1.960 z
2.6131 half-width
32.4703 upper confidence limit
27.2440 lower confidence limit

mid-size

Confidence interval - mean
95% confidence level
19.545455 mean
1.895540449 std. dev.
22 n
1.960 z
0.7921 half-width
20.3375 upper confidence limit
18.7534 lower confidence limit

large

Confidence interval - mean
95% confidence level
18.36363636 mean
1.501514387 std. dev.
11 n
1.960 z
0.8873 half-width
19.2510 upper confidence limit
17.4763 lower confidence limit

sporty

Confidence interval - mean
95% confidence level
21.78571429 mean
3.906179944 std. dev.
14 n
1.960 z
2.0461 half-width
23.8319 upper confidence limit
19.7396 lower confidence limit

van

Confidence interval - mean
95% confidence level
17 mean
1.224744871 std. dev.
9 n
1.960 z
0.800 half-width
17.800 upper confidence limit
16.200 lower confidence limit

MPG BEST

compact

Confidence interval - mean
95% confidence level
29.875 mean
2.941088234 std. dev.
16 n
1.960 z
1.441 half-width
31.316 upper confidence limit
28.434 lower confidence limit

small

Confidence interval - mean
95% confidence level
35.47619048 mean
5.60909126 std. dev.
21 n
1.960 z
2.3990 half-width
37.8752 upper confidence limit
33.0772 lower confidence limit

mid-size

Confidence interval - mean
95% confidence level
26.72727273 mean
1.272077756 std. dev.
22 n
1.960 z
0.5316 half-width
27.2588 upper confidence limit
26.1957 lower confidence limit

large

Confidence interval - mean
95% confidence level
26.72727273 mean
1.272077756 std. dev.
11 n
1.960 z
0.7517 half-width
27.4790 upper confidence limit
25.9755 lower confidence limit

sporty

Confidence interval - mean
95% confidence level
28.78571429 mean
3.641186861 std. dev.
14 n
2.160 t (df = 13)
2.1024 half-width
30.8881 upper confidence limit
26.6834 lower confidence limit
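The sporty block above is the only one that uses a t critical value (2.160, df = 13) instead of z = 1.960. The sketch below compares the two half-widths for that block. The function name is illustrative, and both critical values are typed in as printed in this appendix rather than computed.

```python
import math


def half_width(critical_value, std_dev, n):
    """Half-width of a confidence interval for the mean."""
    return critical_value * std_dev / math.sqrt(n)


std_dev, n = 3.641186861, 14
t_half = half_width(2.160, std_dev, n)  # t critical value, df = 13
z_half = half_width(1.960, std_dev, n)  # z value used in the other blocks
print(round(t_half, 3), round(z_half, 3))
```

The t-based interval is wider, which reflects the extra uncertainty from estimating the standard deviation with a small sample.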

van

Confidence interval - mean
95% confidence level
21.88888889 mean
1.452966315 std. dev.
9 n
1.960 z
0.9493 half-width
22.8381 upper confidence limit
20.9396 lower confidence limit
