Linear Scaling of Regression Data © Christine Crisp “Teach A Level Maths” Statistics 1.
Demo Disc Teach A Level Maths Vol. 3: S1 © Christine Crisp.
Transcript of Demo Disc Teach A Level Maths Vol. 3: S1 © Christine Crisp.
Demo Disc
“Teach A Level Maths”
Vol. 3: S1
© Christine Crisp
Volume 3 of “Teach A Level Maths” covers the work on Probability and Statistics for the A/AS level Option Module S1.
All topics for the 4 specifications offered by the English examining bodies are covered. Where a topic relates to some specifications only, this is indicated in a contents file and also at the start of the presentation.
Explanation of Clip-art images
An important result, example or summary that students might want to note.
It would be a good idea for students to check they can use their calculators correctly to get the result shown.
An exercise for students to do without help.
The slides that follow are samples from 9 of the 40 presentations.
4: Box and Whisker Diagrams
6: Histograms
10: Introduction to Probability
14: Discrete Random Variables
16: Linear Functions of a Discrete Random Variable
23: Binomial Problems
26: Hypothesis Testing
28: Standardizing to Z
36: Calculating Residuals
4: Box and Whisker Diagrams
Demo version note: The S1 specifications require students to be familiar with topics covered in Data Handling at GCSE.
The first few presentations revise and extend the GCSE work.
By the time the students reach this 4th presentation they have been reminded about cumulative frequency diagrams and have met the age data referred to on the slide.
[Figure: box and whisker diagram, “The projected population of the U.K. for 2005 (by age)”. Labels: the box, one whisker, the other whisker, minimum age, lower quartile, median, upper quartile, maximum age.]
The box can be any depth.
Box and Whisker Diagrams
The diagram can easily be drawn using a cumulative frequency diagram.
I’ll use the age data that we met earlier.
We need a scale.
[Figure: “The projected population of the U.K. for 2005 (by age)”, with a scale for Age (years) marked 0, 50 and 100.]
6: Histograms
Demo version note: As well as explaining theory, the presentations show worked examples and set introductory exercises.
The 6th presentation reminds students about the rules for drawing Histograms. The exercise shown here reinforces these rules without the students needing to spend time drawing a diagram.
Exercise
95 components are tested until they fail. The table gives the times taken (hours) until failure.
Time to failure (hours)
0-19 20-29 30-39 40-44 45-49 50-59 60-89
Number of components
5 8 16 22 18 16 10
Find 3 things wrong with the histogram which represents the data in the table.
Answer:
• Frequency has been plotted instead of frequency density.
• There is no title.
• There are no units on the x-axis.
[Figure: two histograms titled “Time taken for 95 components to fail”, one labelled “Incorrect diagram” and one labelled “Correct diagram”.]
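The first fault in the answer, plotting frequency instead of frequency density, can be checked numerically. The following is a minimal sketch using the table from the exercise. It assumes the times are recorded in whole hours, so that a class such as 20-29 is treated as 10 hours wide; the slides do not state the class boundaries, and the helper name `class_width` is purely illustrative.

```python
# Class limits and frequencies copied from the exercise table.
classes = [(0, 19), (20, 29), (30, 39), (40, 44), (45, 49), (50, 59), (60, 89)]
frequencies = [5, 8, 16, 22, 18, 16, 10]


def class_width(lower, upper):
    # Assumption: whole-hour recording, so the class "20-29" spans 10 hours.
    return upper - lower + 1


# Frequency density = frequency / class width.
densities = [f / class_width(lo, hi) for (lo, hi), f in zip(classes, frequencies)]

for (lo, hi), f, d in zip(classes, frequencies, densities):
    print(f"{lo}-{hi}: frequency {f}, frequency density {d:.2f}")
```

Under this assumption the classes 0-19 and 60-89 have much lower densities than their frequencies suggest, which is why a plot of raw frequency misrepresents the data when the class widths differ.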
10: Introduction to Probability
Demo version note: This presentation covers the introductory ideas of probability and leads to a later one on conditional probability. Summaries are given from time to time which teachers may want students to note down. This slide shows an example of a summary.
SUMMARY
Outcomes are the results of trials or experiments.
An event is a particular result or set of results.
A possibility space is the set of all possible outcomes.
For equally likely outcomes, the probability of an event, E, is given by
$P(E) = \dfrac{\text{number of ways E can occur}}{\text{number of possible outcomes}}$
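As a small illustration of the formula for equally likely outcomes, here is a sketch that counts outcomes directly. The fair six-sided die and the event "an even number" are hypothetical examples that do not appear on the slides, and the function name `probability` is illustrative only.

```python
from fractions import Fraction


def probability(event, possibility_space):
    # Only valid when every outcome in the possibility space is equally likely.
    ways = sum(1 for outcome in possibility_space if outcome in event)
    return Fraction(ways, len(possibility_space))


# Hypothetical example: one throw of a fair six-sided die.
possibility_space = [1, 2, 3, 4, 5, 6]
even = {2, 4, 6}

p_even = probability(even, possibility_space)
print(p_even)
```

Using `Fraction` keeps the answer exact, in the same form a student would write it by hand.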
14: Discrete Random Variables
Demo version note: The presentations all contain worked examples of the straightforward type of questions found in exams. This is the first of the examples in the presentation on Discrete Random Variables.
e.g. 1. A random variable X has the probability distribution
x
1 5 10
P(X = x)
$\frac{1}{4}$ $\frac{1}{2}$ $p$
Find (a) the value of p and (b) the mean of X.
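Solution:
(a) Since X is a discrete r.v., $\sum P(X = x) = 1$
$\frac{1}{4} + \frac{1}{2} + p = 1$
$p = \frac{1}{4}$
(b) mean $= \sum x\,P(X = x) = 1 \times \frac{1}{4} + 5 \times \frac{1}{2} + 10 \times \frac{1}{4} = \frac{21}{4}$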
Tip: Always check that your value of the mean lies within the range of the given values of x. Here, $\frac{21}{4}$ or 5·25, does lie between 1 and 10.
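The working for this example can be checked with exact fractions. This is a minimal sketch, with the distribution read from the example as x = 1, 5, 10 and probabilities 1/4, 1/2 and p; the variable names are illustrative.

```python
from fractions import Fraction

xs = [1, 5, 10]
known_probs = [Fraction(1, 4), Fraction(1, 2)]

# (a) The probabilities of a discrete r.v. sum to 1, which fixes p.
p = 1 - sum(known_probs)
probs = known_probs + [p]

# (b) mean = sum of x * P(X = x)
mean = sum(x * q for x, q in zip(xs, probs))

print(p, mean, float(mean))

# The tip from the slide: the mean must lie within the range of the x values.
in_range = min(xs) <= mean <= max(xs)
print(in_range)
```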
16: Linear Functions of a Discrete Random Variable
Demo version note: Some topics are not required by all the specifications. The contents file shows which topics are needed by each of the specifications and contains hyperlinks to the files. The topic Linear Functions of a Discrete Random Variable is only required in S1 by Edexcel.
The results we have found can be generalised to give
$E(aX + b) = aE(X) + b$
e.g. The probability distribution for the r.v. X is given by
x
2 4 6 8 10
P(X = x)
$\frac{1}{12}$ $\frac{1}{4}$ $\frac{5}{12}$ $\frac{1}{6}$ $\frac{1}{12}$
Find (a) $E(X)$, (b) Hence find $E(2X - 3)$
Solution:
(a) $E(X) = \sum x\,P(X = x) = 2 \times \frac{1}{12} + 4 \times \frac{1}{4} + \dots + 10 \times \frac{1}{12} = \frac{35}{6} = 5\frac{5}{6}$
(b) $E(2X - 3) = 2E(X) - 3 = 2 \times \frac{35}{6} - 3 = \frac{26}{3} = 8\frac{2}{3}$
“Hence” in part (b) of the question means that we must use the answer to part (a) rather than using the values and probabilities of $2X - 3$.
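The result E(aX + b) = aE(X) + b can be checked on this example by computing the expectation both ways. The sketch below reads the table as x = 2, 4, 6, 8, 10 with probabilities 1/12, 1/4, 5/12, 1/6, 1/12, which is the reading consistent with the slide's answer of 35/6. The "direct" route is included only as a check, since the question's "Hence" asks for the first route.

```python
from fractions import Fraction

xs = [2, 4, 6, 8, 10]
probs = [Fraction(1, 12), Fraction(1, 4), Fraction(5, 12),
         Fraction(1, 6), Fraction(1, 12)]

# The probabilities must sum to 1 for a valid distribution.
total = sum(probs)

# (a) E(X) = sum of x * P(X = x)
e_x = sum(x * p for x, p in zip(xs, probs))

# (b) "Hence": use E(2X - 3) = 2E(X) - 3
e_hence = 2 * e_x - 3

# Check only: work from the values and probabilities of 2X - 3 directly.
e_direct = sum((2 * x - 3) * p for x, p in zip(xs, probs))

print(e_x, e_hence, e_direct)
```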
23: Binomial Problems
Demo version note: The Binomial Distribution is covered by AQA, MEI/OCR and OCR. Having learnt to carry out Binomial Calculations, students practise recognising the conditions for using the model and also learn the importance of defining a random variable and writing down its distribution.
e.g. 1. A factory produces a particular type of computer chip. Over a long period the proportion that are defective has been found to be 15%. What is the probability that in a sample of 20 taken at random, 19 are perfect?
Are the conditions met for using the Binomial model?
• A trial has 2 possible outcomes, success and failure.
Yes: Each chip is either defective or not.
• The probability of success in one trial is p and p is constant for all the trials.
Yes: We are given 15% (so $p = 0.15$) and we can assume it is constant.
• The trials are independent.
Yes: The probability of selecting a defective chip does not depend on whether one has already been selected.
• The trial is repeated n times.
Yes: 20 chips are selected so n = 20.
Solution:
Let X be the r.v. “number of defective chips”
We must never miss out this stage since it reminds us that
(i) X represents a number (that can be 0, 1, 2, . . . n), and
(ii) we have to make the decision as to whether to count the number of defective chips or perfect ones.
So, $X \sim B(20, 0.15)$
Writing the distribution of X in this way makes us check that we have the p that fits our definition of the r.v., defective rather than perfect.
We need to be very careful here and not use $P(X = 19)$ by mistake.
I had set up the Binomial for the number of defective chips, because I had the proportion for defective. However, the question asked for the probability of 19 perfect ones.
The solution is now straightforward. We want $P(X = 1)$.
If I had written
Let X be the r.v. “number of perfect chips”
Then $X \sim B(20, 0.85)$, and I would want $P(X = 19)$.
$P(X = 1) = {}^{20}C_1 (0.15)^1 (0.85)^{19} = 0.137$ (3 d.p.)
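The calculation, and the point about choosing between defective and perfect chips, can be checked with a short sketch. The function name `binomial_pmf` is illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb, isclose


def binomial_pmf(k, n, p):
    # P(X = k) for X ~ B(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)


# X = number of defective chips, X ~ B(20, 0.15): 19 perfect means 1 defective.
p_one_defective = binomial_pmf(1, 20, 0.15)

# Alternative definition: X = number of perfect chips, X ~ B(20, 0.85).
p_nineteen_perfect = binomial_pmf(19, 20, 0.85)

print(round(p_one_defective, 3))
print(isclose(p_one_defective, p_nineteen_perfect))
```

Both definitions of the random variable lead to the same probability, provided the value of p matches the definition.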
26: Hypothesis Testing
Demo version note: In the presentations extensive use is made of snapshots from the software package “Autograph”.
Here Autograph is used to illustrate an example on Hypothesis Testing in the presentation for the MEI/OCR specification.
e.g. 2. In a trial, 16 seeds are sown and only 11 germinate. Use a 10% significance level to test the supplier’s claim that 85% germinate. Find the critical region for the test.
Solution:
Let X be the random variable “the number of seeds that germinate”
$X \sim B(16, p)$
$H_0: p = 0.85$
$H_1: p < 0.85$
To test the supplier’s claim, the alternative hypothesis is that fewer than 85% germinate. This is again a 1-tailed test but this time we need to test the bottom end of the distribution.
Test at 10% level of significance.
$P(X \le 11) = 0.0791 < 0.10$
There is a probability of 0·0791 (less than 10%) that 11 or fewer seeds will germinate.
We reject the null hypothesis at the 10% level of significance and conclude that the germination rate is below 85%.
$X \sim B(16, 0.85)$
$P(X \le 11) = 0.0791 = 7.91\%$
The Autograph illustration is as follows:
The probability of 12 or fewer germinating is 0·2101 (21·01%), which is more than 10%, so the critical region for the test is 0, 1, 2, . . . 10, 11.
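For readers without Autograph, the tail probabilities and the critical region can be reproduced with a short sketch. It assumes the null distribution B(16, 0·85) and a lower-tail test at the 10% level, as in the example; the function name `binomial_cdf` is illustrative and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_cdf(k, n, p):
    # P(X <= k) for X ~ B(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


n, p0, alpha = 16, 0.85, 0.10

p_11 = binomial_cdf(11, n, p0)  # probability of 11 or fewer germinating
p_12 = binomial_cdf(12, n, p0)  # probability of 12 or fewer germinating

# Lower-tail critical region: every k with P(X <= k) at or below alpha.
critical_region = [k for k in range(n + 1) if binomial_cdf(k, n, p0) <= alpha]

print(round(p_11, 4), round(p_12, 4))
print(critical_region)
```

The observed value, 11, falls inside the critical region, which agrees with the decision to reject the null hypothesis.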
28: Standardizing to Z
Demo version note: Students are encouraged to use their Formulae and Statistical Tables even when worked examples are being developed. This presentation is part of a series to be used by AQA and Edexcel students on the Normal Distribution.
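e.g. 1 If X is a random variable with distribution $X \sim N(350, 110^2)$, find (a) $P(X < 400)$ (b) $P(250 < X < 400)$
Solution: (a) $P(X < 400)$
$z = \dfrac{x - \mu}{\sigma}$
x = 400, so $z = \dfrac{400 - 350}{110}$
$z = 0.45$
Tables only give 2 d.p. for z so this is all we need.
So, $P(X < 400) = P(Z < 0.45) = \Phi(0.45) = 0.6736$
[Figure: Normal curves for X (mean 350, with 400 marked) and Z (with 0·45 marked).]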
Solution: (b) $P(250 < X < 400)$
There are 2 values to convert so we use subscripts for z.
$z_1 = \dfrac{250 - 350}{110} = -0.91$
N.B. This is left of the mean so the z value will be negative.
$z_2 = \dfrac{400 - 350}{110} = 0.45$
So, $P(250 < X < 400) = P(-0.91 < Z < 0.45)$
$P(-0.91 < Z < 0.45) = \Phi(0.45) - \Phi(-0.91)$
$\Phi(-0.91) = 1 - \Phi(0.91) = 1 - 0.8186 = 0.1814$
$\Phi(0.45) - \Phi(-0.91) = 0.6736 - 0.1814 = 0.4922$
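The table look-ups in this example can be reproduced with Python's standard library. This sketch assumes X ~ N(350, 110²) as in the example and rounds each z value to 2 d.p. to mimic the tables; without that rounding the answers differ slightly in the third decimal place. `statistics.NormalDist` requires Python 3.8 or later, and the helper name `standardize` is illustrative.

```python
from statistics import NormalDist

mu, sigma = 350, 110
Z = NormalDist()  # standard Normal distribution, mean 0 and s.d. 1


def standardize(x):
    # z = (x - mu) / sigma, rounded to 2 d.p. as the tables require
    return round((x - mu) / sigma, 2)


z1 = standardize(250)  # left of the mean, so negative
z2 = standardize(400)

p_a = Z.cdf(z2)              # (a) P(X < 400)
p_b = Z.cdf(z2) - Z.cdf(z1)  # (b) P(250 < X < 400)

print(z1, z2)
print(round(p_a, 4), round(p_b, 4))
```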
36: Calculating Residuals
Demo version note: Throughout the module, students are encouraged to use their calculators efficiently and this is particularly important in the topic for AQA, Edexcel and OCR on Least Squares Regression. In the following slides, however, the emphasis is on the effect of outliers on the equation of a regression line rather than on calculating the line itself.
e.g. This is a scatter diagram of the data shown in the table.
x
1 2 3 4 5 6 7 8
y
5 18 12 14 12 11 7 3
If we were to draw the line “by eye”, the 1st point . . . would lie well away from the line we would want to draw.
However, the calculation of the regression line includes the 1st point and distorts the position of the line.
The diagram shows the y on x regression line for all the data. The residuals are shown by the red lines.
$y = 14.21 - 0.88x$
The left-hand end of the line is further down than it would be without the 1st point.
Removing the 1st point gives
$y = 21.36 - 2.07x$
in place of
$y = 14.21 - 0.88x$
The sum of the squares of the residuals with all the data, $\sum R^2 \approx 139$
The sum of the squares of the residuals without the 1st point, $\sum R^2 \approx 19.9$
Without the 1st point, we have a regression line that is a much better fit.
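The effect of the 1st point can be reproduced with a short least squares sketch. The data are read from the table as x = 1 to 8 and y = 5, 18, 12, 14, 12, 11, 7, 3, a reading that reproduces the regression lines quoted on the slides. The function names are illustrative, and this is a plain-Python calculation rather than the calculator method the presentation encourages.

```python
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [5, 18, 12, 14, 12, 11, 7, 3]


def regression_line(xs, ys):
    # y on x least squares line y = a + b*x, with b = Sxy / Sxx
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    # Each residual is the observed y minus the y predicted by the line.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a_all, b_all = regression_line(xs, ys)
sse_all = sum_squared_residuals(xs, ys, a_all, b_all)

# Remove the 1st point (1, 5) and refit.
a_trim, b_trim = regression_line(xs[1:], ys[1:])
sse_trim = sum_squared_residuals(xs[1:], ys[1:], a_trim, b_trim)

print(f"all data:          y = {a_all:.2f} + ({b_all:.2f})x, sum of R^2 = {sse_all:.1f}")
print(f"without 1st point: y = {a_trim:.2f} + ({b_trim:.2f})x, sum of R^2 = {sse_trim:.1f}")
```

The drop in the sum of squared residuals quantifies the "much better fit" described above, although the two sums are taken over 8 and 7 points respectively.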
Full version available from: Chartwell-Yorke Ltd.
114 High Street, Belmont Village,
Bolton, Lancashire,
BL7 8AL England
tel (+44) (0)1204 811001, fax (+44) (0)1204 811008