# Producing Data

Date posted: 25-Feb-2016


### Transcript of Producing Data


Chapter 3: Numerical Summary Measures

http://anengineersaspect.blogspot.com/2013_05_01_archive.html

#### Numerical Summary Measures: Goals

- Describe the center of a distribution by:
  - mean
  - median
  - mode
- Compare the mean and median
- Describe the measure of spread:
  - range
  - variance and standard deviation
  - quartiles
- Be able to determine which summary statistics are appropriate for a given situation
- Empirical Rule and introduction to the normal distribution
- Describe a distribution by a boxplot (five-number summary and outliers)

#### Definition

Measures of central tendency indicate where the majority of the data is centered, bunched or clustered.

#### Notation

- Lowercase letters x, y, z indicate the variables.
- $x_1, x_2, x_3, \ldots, x_n$ refers to a set of fixed observations of a variable.
- n is the number of observations in a data set, which is called the sample size.

#### Sample Mean

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

#### Sample Mean: Example

The following data give the time in months from hire to promotion to manager for a random sample of 20 software engineers from all software engineers employed by a large telecommunications firm.

a) What is the mean time for this sample?

b) Suppose that instead of $x_{20} = 69$, we had chosen another engineer who took 483 months to be promoted. What is the mean time for this new sample?
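As a rough check of parts (a) and (b), the sketch below computes both means with Python's standard `statistics` module. The variable names are illustrative, and the list holds the 20 promotion times given with this example.

```python
from statistics import fmean

# Promotion times in months for the 20 sampled engineers (from the example).
times = [5, 7, 12, 14, 18, 14, 14, 22, 21, 25,
         23, 24, 34, 37, 34, 49, 64, 47, 67, 69]

# Part (a): mean of the original sample.
mean_original = fmean(times)

# Part (b): swap x20 = 69 for the engineer who took 483 months.
times_modified = times[:-1] + [483]
mean_modified = fmean(times_modified)

print(mean_original)  # 30.0
print(mean_modified)  # 50.7
```

A single extreme observation moves the mean from 30 to 50.7 months, which is why the mean is described as not resistant to outliers.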

Data (months): 5, 7, 12, 14, 18, 14, 14, 22, 21, 25, 23, 24, 34, 37, 34, 49, 64, 47, 67, 69

#### Sample Median, $\tilde{x}$

Procedure:

1. Sort the n observations from smallest to largest.
2. If n is odd, $\tilde{x}$ is the center observation.
3. If n is even, $\tilde{x}$ is the average of the two center observations.

#### Sample Median: Example

The following data give the time in months from hire to promotion to manager for a random sample of 20 software engineers from all software engineers employed by a large telecommunications firm.

a) What is the median time for this sample?

b) Suppose that instead of $x_{20} = 69$, we had chosen another engineer who took 483 months to be promoted. What is the median time for this new sample?
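The median procedure can be sketched the same way. This is a minimal illustration using `statistics.median`, which sorts the data and averages the two center values when n is even. The variable names are assumptions made for the example.

```python
from statistics import median

# Promotion times in months for the 20 sampled engineers (from the example).
times = [5, 7, 12, 14, 18, 14, 14, 22, 21, 25,
         23, 24, 34, 37, 34, 49, 64, 47, 67, 69]

# Part (a): n = 20 is even, so the median averages the 10th and 11th sorted values.
median_original = median(times)

# Part (b): swap x20 = 69 for the engineer who took 483 months.
times_modified = times[:-1] + [483]
median_modified = median(times_modified)

print(median_original, median_modified)  # 23.5 23.5
```

The median does not change when the largest value is replaced by 483, because only the two center observations enter the calculation.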

Sorted data (months): 5, 7, 12, 14, 14, 14, 18, 21, 22, 23, 24, 25, 34, 34, 37, 47, 49, 64, 67, 69

#### Mean and Median

- Symmetric: Mean = Median
- Left skew: Mean < Median
- Right skew: Mean > Median

#### Mode, M

The value with the greatest frequency.

#### Sample Mode: Example

The following data give the time in months from hire to promotion to manager for a random sample of 20 software engineers from all software engineers employed by a large telecommunications firm.

a) What is the mode for this sample?
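One way to find the mode is to count how often each value occurs and keep the value or values with the highest count. The sketch below does this with `collections.Counter`. It returns a list so that ties would be visible, which is a design choice of this illustration and not something the slides specify.

```python
from collections import Counter

# Promotion times in months for the 20 sampled engineers (from the example).
times = [5, 7, 12, 14, 14, 14, 18, 21, 22, 23,
         24, 25, 34, 34, 37, 47, 49, 64, 67, 69]

counts = Counter(times)            # value -> frequency
top_count = max(counts.values())   # greatest frequency in the sample
modes = sorted(v for v, c in counts.items() if c == top_count)

print(modes, top_count)  # [14] 3
```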

Sorted data (months): 5, 7, 12, 14, 14, 14, 18, 21, 22, 23, 24, 25, 34, 34, 37, 47, 49, 64, 67, 69

#### Variability of Data

- Set 1: -15, -10, -5, 0, 5, 10, 15
- Set 2: -15, -5, -1, 0, 1, 5, 15
- Set 3: -3, -2, -1, 0, 1, 2, 3

#### Measures of Variability

- Sample range
- Sample variance (sample standard deviation)
- Interquartile Range (IQR)

#### Sample Variance

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, with sample standard deviation $s = \sqrt{s^2}$

#### Comments for Standard Deviation

- Variance is used to determine spread for comparisons.
- $s^2 = 0$ means that all of the observations are the same; normally $s > 0$.
- When n = 1, s is undefined because the divisor $n - 1$ is zero.
- s is not resistant to outliers.
- s has the same units of measurement as the original observations.

#### Sample Standard Deviation: Example

The following data give the time in months from hire to promotion to manager for a random sample of 20 software engineers from all software engineers employed by a large telecommunications firm.

a) What is the standard deviation for this sample?

b) Suppose that instead of $x_{20} = 69$, we had chosen another engineer who took 483 months to be promoted. What is the standard deviation for this new sample?
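A minimal sketch of the standard deviation calculation follows, assuming the usual sample formula with divisor n - 1. The helper `sample_std` is a hypothetical name written out to show the arithmetic, and `statistics.stdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import fmean, stdev

def sample_std(data):
    """Sample standard deviation: square root of the sum of squared deviations over n - 1."""
    xbar = fmean(data)
    squared_deviations = sum((x - xbar) ** 2 for x in data)
    return sqrt(squared_deviations / (len(data) - 1))

# Promotion times in months for the 20 sampled engineers (from the example).
times = [5, 7, 12, 14, 14, 14, 18, 21, 22, 23,
         24, 25, 34, 34, 37, 47, 49, 64, 67, 69]
times_modified = times[:-1] + [483]   # part (b): 69 replaced by 483

s_original = sample_std(times)
s_modified = sample_std(times_modified)

print(round(s_original, 1))  # 19.8
print(round(s_modified, 1))  # 103.2

# Cross-check against the standard library implementation.
assert abs(s_original - stdev(times)) < 1e-9
```

Replacing one observation raises s from roughly 19.8 to roughly 103.2 months, which illustrates the comment that s is not resistant to outliers.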

Sorted data (months): 5, 7, 12, 14, 14, 14, 18, 21, 22, 23, 24, 25, 34, 34, 37, 47, 49, 64, 67, 69

#### Quartiles

(Figure: Q1, Q2, Q3)

#### Quartiles - Procedure

#### Quartiles: Example

The following data give the time in months from hire to promotion to manager for a random sample of 19 software engineers from all software engineers employed by a large telecommunications firm.

- Find the median and the quartiles.
- What is the Interquartile Range?
- Are there any outliers in this data set?

Data (months): 7, 12, 14, 14, 14, 18, 21, 22, 23, 24, 25, 34, 34, 37, 47, 49, 64, 100, 150

#### Outliers

After finding the IQR, find the two inner fences (low and high) and the two outer fences (low and high).

- Inner fences (mild): $IFL = Q_1 - 1.5(IQR)$, $IFH = Q_3 + 1.5(IQR)$
- Outer fences (extreme): $OFL = Q_1 - 3(IQR)$, $OFH = Q_3 + 3(IQR)$
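The fence rules can be applied to the 19 promotion times as in the sketch below. The quartile procedure itself is not shown in the transcript, so this example assumes one common convention: Q1 and Q3 are the medians of the lower and upper halves, with the overall median left out of both halves when n is odd. Other conventions can give slightly different quartiles. The function name `quartiles` is hypothetical.

```python
from statistics import median

def quartiles(sorted_data):
    """Return (Q1, median, Q3) using medians of the lower and upper halves.

    Assumed convention: for odd n, the overall median is excluded from both halves.
    """
    n = len(sorted_data)
    half = n // 2
    lower = sorted_data[:half]
    upper = sorted_data[half + 1:] if n % 2 else sorted_data[half:]
    return median(lower), median(sorted_data), median(upper)

# Promotion times in months for the 19 sampled engineers (from the example).
times = [7, 12, 14, 14, 14, 18, 21, 22, 23, 24,
         25, 34, 34, 37, 47, 49, 64, 100, 150]

q1, q2, q3 = quartiles(sorted(times))
iqr = q3 - q1

# Inner and outer fences as defined above.
ifl, ifh = q1 - 1.5 * iqr, q3 + 1.5 * iqr
ofl, ofh = q1 - 3 * iqr, q3 + 3 * iqr

# Extreme outliers lie beyond an outer fence; mild ones lie between the fences.
extreme = [x for x in times if x < ofl or x > ofh]
mild = [x for x in times if (x < ifl or x > ifh) and x not in extreme]

print(q1, q2, q3, iqr)     # 14 24 47 33
print(ifl, ifh, ofl, ofh)  # -35.5 96.5 -85 146
print(mild, extreme)       # [100] [150]
```

Under this convention, 100 months is a mild outlier and 150 months is an extreme outlier.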

#### Boxplots

Procedure:

1. Find Q1, Q3, the median and the IQR.
2. Calculate IFL, IFH, OFL and OFH.
3. Draw a central box from Q1 to Q3. Draw a line for the median.
4. Extend lines (whiskers) from the box to the minimum and maximum values that are not outliers.
5. Put in closed circles for mild outliers and open circles for extreme outliers.

#### Boxplot: Example
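The boxplot procedure can be traced numerically before drawing anything. The sketch below lists the elements a boxplot of the 19 promotion times would show. It assumes that `statistics.quantiles` with `method="exclusive"` is an acceptable stand-in for the course's quartile rule, and the variable names are illustrative.

```python
from statistics import quantiles

# Promotion times in months for the 19 sampled engineers (from the example).
times = [7, 12, 14, 14, 14, 18, 21, 22, 23, 24,
         25, 34, 34, 37, 47, 49, 64, 100, 150]

# Step 1: quartiles and IQR (assumed quartile convention, see lead-in).
q1, med, q3 = quantiles(times, n=4, method="exclusive")
iqr = q3 - q1

# Step 2: inner and outer fences.
inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr

# Step 4: whiskers reach the smallest and largest values that are not outliers.
not_outliers = [x for x in times if inner_low <= x <= inner_high]
whiskers = (min(not_outliers), max(not_outliers))

# Step 5: closed circles for mild outliers, open circles for extreme outliers.
open_circles = [x for x in times if x < outer_low or x > outer_high]
closed_circles = [x for x in times
                  if (x < inner_low or x > inner_high) and x not in open_circles]

print(whiskers)                      # (7, 64)
print(closed_circles, open_circles)  # [100] [150]
```

The box spans Q1 to Q3 with a line at the median, the whiskers end at 7 and 64, and the two largest observations are plotted individually.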

#### Distributions and Boxplots

#### Side-by-side Boxplot: Example

#### Choosing Measures of Center and Spread

Choices:

- Mean and standard deviation
- Median and IQR

ALWAYS PLOT YOUR DATA!

http://freshspectrum.com/wp-content/uploads/2012/09/Hans-Rosling-Bubble-Plot-Cartoon.jpg

#### Empirical Rule

68-95-99.7 Rule

#### z-score
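The slide formulas for these two topics did not survive in the transcript, so the sketch below rests on standard definitions that are assumptions here: a sample z-score $z = (x - \bar{x})/s$, and the Empirical Rule read as the share of a normal distribution within 1, 2 and 3 standard deviations of the mean. It uses `statistics.NormalDist` from the standard library, and the data are the 20 promotion times from the earlier examples.

```python
from statistics import NormalDist, fmean, stdev

def z_scores(data):
    """Standardize each observation: subtract the sample mean, divide by the sample sd."""
    xbar, s = fmean(data), stdev(data)
    return [(x - xbar) / s for x in data]

# Promotion times in months for the 20 sampled engineers (from the earlier examples).
times = [5, 7, 12, 14, 14, 14, 18, 21, 22, 23,
         24, 25, 34, 34, 37, 47, 49, 64, 67, 69]

z = z_scores(times)
print(round(z[-1], 2))  # 1.97  (69 months is about two sd above the mean)

# Share of a standard normal distribution within k standard deviations of the mean.
std_normal = NormalDist()  # mean 0, standard deviation 1
shares = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(k, round(share, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The three shares round to 68%, 95% and 99.7%, which is where the rule gets its name.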