1 Stat 1510 Statistical Thinking & Concepts Describing Distributions with Numbers.
1.2 Describing Distributions with Numbers
AP Statistics
Fuel Economy for 2004 car models (miles per gallon)

2-seater cars                  Compact cars
Model          City   Hwy      Model          City   Hwy
Acura NSX        17    24      Ashton Mar       12    19
Audi             20    28      Audi TT          21    29
BMW Z4           20    28      BMW 325          19    27
Cadillac XLR     17    25      BMW 330          19    28
Corvette         18    25      BMW M3           16    23
Miata            22    28      Jaguar XK8       18    26
Viper            12    20      Jaguar XKR       16    23
Ferrari 360      11    16      Lexus SC         18    23
Ferrari M        10    16      Mini Cooper      25    32
Honda I          60    66      Mits Eclipse     23    31
Thunderbird      17    23      Porsche 911      14    22
Lotus            15    22      Mits Spyder      20    29
Mean and Median
Mean: The Average
Median: The Middle
Mean highway mileage for two-seaters
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{518}{21} \approx 24.7 \]
Caution!!!
The mean is sensitive to the influence of a few extreme observations. It is not a resistant measure of center.
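A minimal Python sketch of this caution. It assumes the 21 two-seater highway mileages are the 20 values listed under "Calculating Quartiles" plus the 66 mpg of the Honda from the fuel economy table (together they sum to 518); the variable names are illustrative only.

```python
from statistics import mean, median

# Assumed data: the 20 sorted highway mileages from the quartile example ...
without_hybrid = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
                  23, 24, 25, 25, 26, 28, 28, 28, 29, 32]
# ... plus the Honda's 66 mpg from the fuel economy table.
with_hybrid = without_hybrid + [66]

print(sum(with_hybrid), len(with_hybrid))            # 518 21
print(round(mean(with_hybrid), 1))                   # 24.7
print(round(mean(without_hybrid), 1))                # 22.6
print(median(with_hybrid), median(without_hybrid))   # 23 23.0
```

One extreme observation moves the mean by about 2 mpg, while the median stays at 23. That is what "resistant" means for the median.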
Measuring Spread
The simplest useful numerical description of a distribution includes a description of both the center and the spread.
Range: simplest measure of spread. The highest value minus the lowest value.
The pth percentile of a distribution is the value such that p percent of the observations fall at or below it.
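As a rough illustration, the range and the "percent at or below" idea can be checked in Python on the 20 two-seater highway mileages listed under "Calculating Quartiles". The helper name percent_at_or_below is hypothetical, and textbooks and software differ slightly in how they define percentiles; this sketch uses the "at or below" convention.

```python
# Assumed data: the 20 sorted highway mileages from the quartile example.
highway = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
           23, 24, 25, 25, 26, 28, 28, 28, 29, 32]

def percent_at_or_below(data, value):
    """Percent of the observations in data that are <= value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Range: highest value minus lowest value.
value_range = max(highway) - min(highway)
print(value_range)                        # 19

# 18 and 27 are the quartiles of this list under the median-of-halves rule.
print(percent_at_or_below(highway, 18))   # 25.0
print(percent_at_or_below(highway, 27))   # 75.0
```

So for this list, 18 behaves as the 25th percentile and 27 as the 75th percentile.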
The Quartiles
Calculating Quartiles
The highway mileages of 20 2-seaters, arranged in numerical order, are:
13 15 16 16 17 19 20 22 23 23 | 23 24 25 25 26 28 28 28 29 32
The median is marked by |.
The 1st Quartile (Q1) is the median of the 10 numbers to the left of the median.
The 3rd Quartile (Q3) is the median of the 10 numbers to the right of the median.
If there is an odd number of observations, the median is the middle number and it is excluded from the calculations of Q1 and Q3.
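The rule above can be sketched in Python as follows. The function name quartiles is hypothetical, and the sketch assumes at least two observations. Statistical software often uses other quartile conventions, so its answers can differ slightly from this hand rule.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # observations to the left of the median
    upper = ordered[half + n % 2:]    # skips the middle value when n is odd
    return median(lower), median(ordered), median(upper)

highway = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
           23, 24, 25, 25, 26, 28, 28, 28, 29, 32]
q1, m, q3 = quartiles(highway)
print(q1, m, q3)   # 18.0 23.0 27.0
```

Here Q1 is the average of 17 and 19, and Q3 is the average of 26 and 28, because each half has an even number of observations.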
The Five–Number Summary and Boxplots
Boxplots of the highway and city gas mileages for cars classified as two-seaters and minicompacts.
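A short Python sketch of the five-number summary, taken here in its usual sense of minimum, Q1, median, Q3, maximum. It reuses the 20 two-seater highway mileages from "Calculating Quartiles" and the median-of-halves rule; the function name five_number_summary is illustrative, not from the slides.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[:n // 2]          # left of the median
    upper = ordered[(n + 1) // 2:]    # right of the median
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

highway = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
           23, 24, 25, 25, 26, 28, 28, 28, 29, 32]
print(five_number_summary(highway))   # (13, 18.0, 23.0, 27.0, 32)
```

A boxplot draws these five numbers: the box spans Q1 to Q3, a line inside the box marks the median, and the whiskers reach toward the minimum and maximum.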
Outliers
IQR (interquartile range) = Q3 – Q1
Lower Bound: Q1 – 1.5(IQR)
• If an observation falls below the lower bound, it is considered an outlier.
Upper Bound: Q3 + 1.5(IQR)
• If an observation falls above the upper bound, it is considered an outlier.
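The 1.5(IQR) rule might be applied in Python like this. The sketch assumes the 21 two-seater highway mileages are the 20 values from "Calculating Quartiles" plus the Honda's 66 mpg from the table, takes IQR = Q3 - Q1, and uses the hypothetical helper name outlier_bounds.

```python
from statistics import median

def outlier_bounds(data):
    """Return (lower bound, upper bound) from the 1.5 * IQR rule."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = median(ordered[:n // 2])          # median of the lower half
    q3 = median(ordered[(n + 1) // 2:])    # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Assumed data: 20 listed mileages plus the Honda's 66 mpg.
highway = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
           23, 24, 25, 25, 26, 28, 28, 28, 29, 32, 66]
lower, upper = outlier_bounds(highway)
outliers = [x for x in highway if x < lower or x > upper]
print(lower, upper)   # 3.0 43.0
print(outliers)       # [66]
```

With 21 observations the median (23) is left out of both halves, so Q1 = 18 and Q3 = 28, giving IQR = 10. Only the 66 mpg value lies outside the bounds.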
Measuring Spread: Variance and Standard Deviation
Some FAQs
Why do we square the deviations?
If we did not, the positive and negative deviations would cancel and always add to zero.
Why do we emphasize the standard deviation rather than the variance?
The standard deviation measures spread about the mean in the original scale.
Why do we average by dividing by n-1 rather than n in calculating the variance?
Because the sum of the deviations is always zero, the last deviation can be found once we know the other n-1. Only n-1 of the deviations can vary freely, so we average by dividing by n-1. The number n-1 is called the degrees of freedom.
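These answers can be checked numerically. The sketch below uses the 20 two-seater highway mileages from "Calculating Quartiles" and the usual sample formulas: the variance is the sum of squared deviations divided by n-1, and the standard deviation is its square root. The variable names are illustrative.

```python
import math
import statistics

highway = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23,
           23, 24, 25, 25, 26, 28, 28, 28, 29, 32]
n = len(highway)
x_bar = sum(highway) / n
deviations = [x - x_bar for x in highway]

# The deviations cancel: their sum is zero up to floating-point rounding.
print(abs(sum(deviations)) < 1e-9)              # True

# Square the deviations, then average by dividing by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)
std_dev = math.sqrt(variance)                   # back on the original mpg scale
print(round(variance, 2), round(std_dev, 2))    # 27.94 5.29

# Cross-check against the standard library's sample variance.
print(math.isclose(variance, statistics.variance(highway)))   # True
```

The variance is in squared mpg, while the standard deviation of about 5.29 is in mpg, which is why the standard deviation is the one usually reported.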
Choosing Measures of Center and Spread