IB Math Studies – Topic 6


Transcript of IB Math Studies – Topic 6

Page 1: IB Math Studies – Topic 6

IB Math Studies – Topic 6

Statistics

Page 2: IB Math Studies – Topic 6

IB Course Guide Description

Page 3: IB Math Studies – Topic 6

IB Course Guide Description

Page 4: IB Math Studies – Topic 6

IB Course Guide Description

Page 5: IB Math Studies – Topic 6

Types of Data

• Categorical Data – Describes a particular quality or characteristic. It can be divided into categories.

– e.g. color of eyes or types of ice cream

• Quantitative Data – Contains a numerical value. The information collected is termed numerical data.

– Discrete – Takes exact number values and is often the result of counting.
  • e.g. number of TVs or number of houses on a street

– Continuous – Takes numerical values within a certain range and is often the result of measuring.
  • e.g. the height of seniors or the weight of freshmen

Describing Data

Page 6: IB Math Studies – Topic 6

Types of Distribution

• Symmetric Distribution

• Positively Skewed Distribution

• Negatively Skewed Distribution

Page 7: IB Math Studies – Topic 6

24 families were surveyed to find the number of people in the family. The results are:

5, 9, 4, 4, 4, 5, 3, 4, 6, 8, 8, 5, 7, 6, 6, 8, 6, 9, 10, 7, 3, 5, 6, 6

a) Is this data discrete or continuous?
b) Construct a frequency table for the data.
c) Display the data using a column graph.
d) Describe the shape of the distribution. Are there any outliers?
e) What percentage of families have 5 or fewer people in them?

Example 1: Describing Data
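
Parts (b), (c) and (e) of this example can be checked with a short Python sketch. It uses only the 24 survey values listed on the slide; the variable names are illustrative, and the row of # marks is just a rough text stand-in for a column graph.

```python
from collections import Counter

# Family sizes from the survey of 24 families (Example 1).
sizes = [5, 9, 4, 4, 4, 5, 3, 4, 6, 8, 8, 5,
         7, 6, 6, 8, 6, 9, 10, 7, 3, 5, 6, 6]

# (b) Frequency table: family size -> number of families.
freq = Counter(sizes)
for size in sorted(freq):
    # (c) The '#' marks act as a rough text version of a column graph.
    print(f"{size:>2} | {freq[size]:>2} | {'#' * freq[size]}")

# (e) Percentage of families with 5 or fewer people.
five_or_fewer = sum(count for size, count in freq.items() if size <= 5)
percent = 100 * five_or_fewer / len(sizes)
print(f"{percent:.1f}% of families have 5 or fewer people")
```

The data are counts of people, so they are discrete, and 10 of the 24 families have 5 or fewer members.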

Page 8: IB Math Studies – Topic 6

Standard Deviation Formula

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$

• $x$ is any score
• $\bar{x}$ is the mean
• $n$ is the number of scores

Page 9: IB Math Studies – Topic 6

Calculate the standard deviation

Values

2, 4, 5, 5, 6, 6, 7 (total: 35)

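• Calculate the mean
• Subtract the mean from each value
• Square these
• Add them
• Divide by n
• Take the square root

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$

Table column headings: $x - \bar{x}$ and $(x - \bar{x})^2$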
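
The six steps can be followed line by line in Python. This is a sketch under one assumption: the values column on the slide is read as the seven scores 2, 4, 5, 5, 6, 6, 7, with the trailing 35 taken to be their total rather than an eighth score.

```python
import math

# Assumed reading of the values column: seven scores that sum to 35.
values = [2, 4, 5, 5, 6, 6, 7]

n = len(values)
mean = sum(values) / n                     # calculate the mean
deviations = [x - mean for x in values]    # subtract the mean from each value
squares = [d ** 2 for d in deviations]     # square these
total = sum(squares)                       # add them
variance = total / n                       # divide by n
sigma = math.sqrt(variance)                # take the square root

print(f"mean = {mean}, sum of squares = {total}, sigma = {sigma:.3f}")
```

With this reading the mean is 5, the squared deviations add to 16, and the standard deviation is about 1.51.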

Page 10: IB Math Studies – Topic 6

Standard Deviation on the GDC

• Type data in List 1
• 1-Var Stats L1

$\bar{x}$ = mean

$\sigma_x$ = standard deviation

• On paper you’ll see ‘s’ used to stand for standard deviation.
• But you should use the $\sigma_x$ value from the calculator.
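
The difference between the two calculator values can be seen with Python's standard `statistics` module. As a sketch, the family sizes from Example 1 stand in for the data typed into List 1: `pstdev` divides by n, matching the formula on the earlier slide, while `stdev` divides by n − 1.

```python
import math
import statistics

# Family sizes from Example 1, standing in for the data in List 1.
data = [5, 9, 4, 4, 4, 5, 3, 4, 6, 8, 8, 5,
        7, 6, 6, 8, 6, 9, 10, 7, 3, 5, 6, 6]

mean = statistics.mean(data)
sigma_x = statistics.pstdev(data)  # divides by n: the value to use here
s_x = statistics.stdev(data)       # divides by n - 1: always a little larger

print(f"mean = {mean}, sigma_x = {sigma_x:.3f}, s_x = {s_x:.3f}")
```

The two values are related by s = σ × √(n / (n − 1)), so they get closer as the number of scores grows.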

Page 11: IB Math Studies – Topic 6

• The median is the second quartile, Q2, or 50th percentile.

• The lower quartile, Q1, is the median of the lower half of the data, or 25th percentile.

• The upper quartile, Q3, is the median of the upper half of the data, or 75th percentile.

• The inter-quartile range is the difference between the upper quartile and the lower quartile.

IQR = Q3 – Q1

Measuring the Spread of Data
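
The quartile definitions translate directly into a small Python helper. This is a sketch, not the only convention: the function name `quartiles` is illustrative, and for an odd number of values it assumes the median is left out of both halves (other tools may split the data differently). It needs at least two values.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half of the data
    upper = ordered[-half:]   # upper half; skips the middle value when n is odd
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Family sizes from Example 1.
sizes = [5, 9, 4, 4, 4, 5, 3, 4, 6, 8, 8, 5,
         7, 6, 6, 8, 6, 9, 10, 7, 3, 5, 6, 6]

q1, q2, q3 = quartiles(sizes)
iqr = q3 - q1
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```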

Page 12: IB Math Studies – Topic 6

Box Plots

• The inter-quartile range is the width of the box.

• The maximum length of each whisker is 1.5 times the inter-quartile range.

• Any data value that lies more than 1.5 × IQR above Q3 (or below Q1) is marked as an outlier.

Page 13: IB Math Studies – Topic 6

To Create a Box-and-Whisker Plot:

1) Make a number line.
2) Create the box between Q1 and Q3.
3) Draw in Q2.
4) Determine any outliers:
   • Upper boundary = Q3 + 1.5(IQR)
   • Lower boundary = Q1 – 1.5(IQR)
5) Plot any outliers.
6) Extend the whiskers to the maximum & minimum (provided they’re not outliers).

Page 14: IB Math Studies – Topic 6

A hospital is trialing a new anesthetic drug and has collected data on how long the new and old drugs take before the patient becomes unconscious. They wish to know which drug acts faster and which is more reliable.

Old drug times: 8, 12, 9, 8, 16, 10, 14, 7, 5, 21, 13, 10, 8, 10, 11, 8, 11, 9, 11, 14

New drug times: 8, 12, 7, 8, 12, 11, 9, 8, 10, 8, 10, 9, 12, 8, 8, 7, 10, 7, 9, 9

Prepare a parallel box plot for the data sets and use it to compare the two drugs for speed and reliability.

Example: Box and Whisker Plots
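
The numbers needed for the parallel box plot can be worked out with a short Python sketch. The helper name `box_plot_summary` is illustrative, quartiles are found as medians of the lower and upper halves, and the old-drug list assumes the run "10 11" on the slide is two separate readings.

```python
import statistics

def box_plot_summary(times):
    """Quartiles, outlier boundaries, outliers and whisker ends for one data set."""
    ordered = sorted(times)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])
    q2 = statistics.median(ordered)
    q3 = statistics.median(ordered[-half:])
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    inside = [t for t in ordered if lower_bound <= t <= upper_bound]
    outliers = [t for t in ordered if t < lower_bound or t > upper_bound]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr,
            "whiskers": (inside[0], inside[-1]), "outliers": outliers}

old = [8, 12, 9, 8, 16, 10, 14, 7, 5, 21, 13, 10, 8, 10, 11, 8, 11, 9, 11, 14]
new = [8, 12, 7, 8, 12, 11, 9, 8, 10, 8, 10, 9, 12, 8, 8, 7, 10, 7, 9, 9]

old_summary = box_plot_summary(old)
new_summary = box_plot_summary(new)
print("old:", old_summary)
print("new:", new_summary)
```

A lower median suggests the faster drug, and a smaller IQR with no outliers suggests the more reliable one.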

Page 15: IB Math Studies – Topic 6

FORMULA Pearson’s Correlation Coefficient: r
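
A sketch of the formula in the notation these slides use elsewhere ($s_{xy}$ for the covariance of X and Y, $s_x$ and $s_y$ for the standard deviations). That the slide showed exactly this form is an assumption:

```latex
r = \frac{s_{xy}}{s_x \, s_y}
```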

Page 16: IB Math Studies – Topic 6

Correlation Coefficient on the GDC

• Turn on your Diagnostics

• Enter the data in L1 and L2

• LinReg L1, L2

Page 17: IB Math Studies – Topic 6

In an experiment a vertical spring was fixed at its upper end. It was stretched by hanging different weights on its lower end. The length of the spring was then measured. The following readings were obtained.

Load (kg), x:     0     1     2     3     4     5     6     7     8
Length (cm), y:   23.5  25    26.5  27    28.5  31.5  34.5  36    37.5

It is given that the covariance Sxy is 12.17.

(b) (i) Write down the mean value of the load, $\bar{x}$.
    (ii) Write down the standard deviation of the load.
    (iii) Write down the mean value of the length, $\bar{y}$.
    (iv) Write down the standard deviation of the length.

(d) (i) Write down the correlation coefficient, r, for these readings.
    (ii) Comment on this result.

Example 1: Correlation Coefficient
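
The values asked for in this example can be checked in Python. This is a sketch that assumes r is computed as covariance divided by the product of the two standard deviations, each with a divisor of n; the variable names are illustrative.

```python
import statistics

# Readings from the spring experiment.
load = [0, 1, 2, 3, 4, 5, 6, 7, 8]                             # x, in kg
length = [23.5, 25, 26.5, 27, 28.5, 31.5, 34.5, 36, 37.5]      # y, in cm

n = len(load)
mean_x = statistics.mean(load)
mean_y = statistics.mean(length)
s_x = statistics.pstdev(load)     # standard deviation of the load (divides by n)
s_y = statistics.pstdev(length)   # standard deviation of the length (divides by n)

# Covariance: mean of the products of the deviations from each mean.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(load, length)) / n

r = s_xy / (s_x * s_y)
print(f"mean x = {mean_x}, mean y = {mean_y}")
print(f"s_x = {s_x:.3f}, s_y = {s_y:.3f}, s_xy = {s_xy:.2f}, r = {r:.3f}")
```

The computed covariance agrees with the 12.17 given in the question, and r comes out close to 1, which indicates a strong positive linear correlation between load and length.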

Page 18: IB Math Studies – Topic 6

Average speed in the metropolitan area and age of drivers

The r-value for this association is 0.027. Describe the association.

Example 2: Correlation Coefficient

Page 19: IB Math Studies – Topic 6

Drawing the Line of Best Fit

1. Calculate the mean of the x values, $\bar{x}$, and the mean of the y values, $\bar{y}$.
2. Mark the mean point $(\bar{x}, \bar{y})$ on the scatter plot.
3. Draw a line through the mean point that passes through the middle of the data
   – equal number of points above and below the line

Page 20: IB Math Studies – Topic 6

Least Squares Regression Line

• Consider the set of points below.
• Square the vertical distances from each point to the line and find their sum.
• We want that sum to be as small as possible.
• The regression line is used for prediction purposes.
• The regression line is less reliable when extended far beyond the region of the data.
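
The idea of making the sum of squared distances as small as possible can be sketched in Python. The example reuses the spring readings from the correlation example and assumes the standard least-squares results: slope = covariance ÷ variance of x, with the line passing through the mean point. The names `a`, `b`, `predict` and `sum_squared_distances` are illustrative.

```python
# Spring readings from the correlation example.
load = [0, 1, 2, 3, 4, 5, 6, 7, 8]                             # x, in kg
length = [23.5, 25, 26.5, 27, 28.5, 31.5, 34.5, 36, 37.5]      # y, in cm

n = len(load)
mean_x = sum(load) / n
mean_y = sum(length) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(load, length)) / n
var_x = sum((x - mean_x) ** 2 for x in load) / n

a = s_xy / var_x           # slope of the regression line y = ax + b
b = mean_y - a * mean_x    # intercept: the line passes through the mean point

def predict(x):
    """Predicted length for a given load; trust it only near the data."""
    return a * x + b

def sum_squared_distances(slope, intercept):
    """Sum of squared vertical distances from the points to a line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(load, length))

print(f"y = {a:.3f}x + {b:.3f}")
print("least squares line:", round(sum_squared_distances(a, b), 3))
print("slightly steeper line:", round(sum_squared_distances(a + 0.1, b), 3))
```

Any other line, such as the slightly steeper one tried here, gives a larger sum of squares than the regression line.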

Page 21: IB Math Studies – Topic 6

Line of Regression using GDC

• LinReg(ax+b) Test, L1, L2
• where L1 contains your independent data
• and L2 contains your dependent data

Page 22: IB Math Studies – Topic 6

The table shows the annual income and average weekly grocery bill for a selection of families.

a) Construct a scatter plot to illustrate the data.
b) Use technology to find the line of best fit.
c) Estimate the weekly grocery bill for a family with an annual income of £95000. Comment on whether this estimate is likely to be reliable.

Example 3: Line of Regression

Page 23: IB Math Studies – Topic 6

χ² Test of Independence

• The variables may be dependent:
  – Females may be more likely to exercise regularly than males.

• The variables may be independent:
  – Gender has no effect on whether they exercise regularly.

A chi-squared test is used to determine whether two variables from the same sample are independent.

Page 24: IB Math Studies – Topic 6

How to do it:
1) Write the null hypothesis (H0) and the alternative hypothesis (H1).
2) Create contingency tables for observed and expected values.
3) Calculate the chi-squared statistic and degrees of freedom.
4) Find the chi-squared critical value (booklet).
   • Depends on the level of significance (p) and the degrees of freedom (v).
5) Determine whether or not to accept the null hypothesis.

Page 25: IB Math Studies – Topic 6

Contingency Tables

Observed Frequencies

          Column1        Column2        Totals
Row1      a              b              sum row1
Row2      c              d              sum row2
Totals    sum column1    sum column2    total

Expected Frequencies

          Column1                            Column2                            Totals
Row1      (sum row1 × sum column1) / total   (sum row1 × sum column2) / total   sum row1
Row2      (sum row2 × sum column1) / total   (sum row2 × sum column2) / total   sum row2
Totals    sum column1                        sum column2                        total

Page 26: IB Math Studies – Topic 6

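$\chi^2_{calc} = \sum \dfrac{(f_o - f_e)^2}{f_e}$

where $f_o$ are the observed frequencies and $f_e$ are the expected frequencies.

On the calculator:
Put your contingency table in matrix A

STAT TESTS

C: χ² Test

Observed: [A]
Expected: [B] (this is where you want the expected values to go)
Calculate

Output:
χ² = χ² calculated value
df = degrees of freedom
in Matrix B = expected values

χ² Statistic on the GDC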
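
What the calculator does can be reproduced by hand in Python. This is a sketch: the observed table is made-up illustration data (gender by regular exercise), not taken from the slides, and the function names are illustrative. Expected values use (row sum × column sum) ÷ total, and the statistic sums (observed − expected)² ÷ expected over every cell.

```python
def expected_frequencies(observed):
    """Expected table: (row sum x column sum) / total for each cell."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]

def chi_squared(observed):
    """Chi-squared statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_frequencies(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical observed frequencies (matrix A on the calculator):
#              regular   not regular
observed = [[16, 24],    # male
            [28, 12]]    # female

expected = expected_frequencies(observed)   # what the calculator stores in matrix B
chi2_calc = chi_squared(observed)
print("expected:", expected)
print(f"chi-squared = {chi2_calc:.3f}")
```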

Page 27: IB Math Studies – Topic 6

Find the Critical Value

• Get this from the formula booklet.

• Significance level (p) is always given in the problem.

A 5% significance level = 95% confidence level

• Degrees of freedom: v = (c – 1)(r – 1), where c = number of columns in the table and r = number of rows in the table

Page 28: IB Math Studies – Topic 6

Accepting the Null Hypothesis

If χ²calc < Critical Value

ACCEPT the null hypothesis

If χ²calc > Critical Value

REJECT the null hypothesis
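
The last two steps, finding the degrees of freedom and comparing with the critical value, can be sketched in Python. The small lookup table holds the standard 5% critical values for 1 to 4 degrees of freedom; the function names and the sample χ² values are hypothetical. The slides do not cover the case where the two values are exactly equal, so treating it as a rejection is an assumption of this sketch.

```python
# Standard chi-squared critical values at the 5% significance level.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(rows, columns):
    """v = (c - 1)(r - 1) for a contingency table without the totals."""
    return (columns - 1) * (rows - 1)

def decide(chi2_calc, rows, columns):
    """Compare the calculated statistic with the 5% critical value."""
    v = degrees_of_freedom(rows, columns)
    critical = CRITICAL_5_PERCENT[v]
    if chi2_calc < critical:
        return "accept H0"
    # Assumption: the boundary case chi2_calc == critical is also rejected.
    return "reject H0"

# Hypothetical calculated values for a 2x2 table and a 2x3 table.
print(decide(7.27, rows=2, columns=2))   # 7.27 is above 3.841
print(decide(2.50, rows=2, columns=3))   # 2.50 is below 5.991
```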

Page 29: IB Math Studies – Topic 6

Important IB Notes:

• In examinations, the value of sxy will be given if required.
• sx represents the standard deviation of the variable X.
• sxy represents the covariance of the variables X and Y.
• A GDC can be used to calculate r when raw data is given.
• For the EXAM, students do NOT need to know how to find the covariance.
• But for their project, if they’re doing regression, they DO need to calculate the covariance by hand so they can calculate r by hand and earn points for using a sophisticated math process.