Post on 15-Sep-2015
Basic Statistical Methods
SHC790
J vd Westhuizen
Context
  Data Description
  Probability
  Inference
Data Description
How can we summarise small amounts of data?
Remember that the following is required before we continue:
  The data need to be a random sample.
  How accurately were the measurements made?
  Notice the difference between measurements and counts:
    Measurements: more accurate
    Counts: always whole numbers
Example Table: Bridge Column Support Height
The most important things about the numbers in this list:
  How low a typical bridge is.
  How variable the numbers are.
Number  Height (m)
1       8.78
2       8.07
3       8.15
4       8.98
5       9.02
6       7.99
7       8.52
8       8.27
9       8.89
Mean Height
If the observations are referred to as xs, the symbol for the mean is $\bar{x}$:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
For the nine bridge heights in the example table:
$$\text{Bridge mean height} = \frac{8.78+8.07+8.15+8.98+9.02+7.99+8.52+8.27+8.89}{9} = \frac{76.67}{9} = 8.519\ \text{m}$$
Median Height
Sort the observations and take the middle one. If there is an even number of observations we take the average of the two middle ones.
Number  Height (m)
5       9.02
4       8.98
9       8.89
1       8.78
7       8.52   (middle observation)
8       8.27
3       8.15
2       8.07
6       7.99
Bridge median height = 8.52 m
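The mean and median of the nine bridge heights can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median

# Bridge column support heights (m), observations 1 to 9 from the example table
heights = [8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89]

mean_height = mean(heights)      # sum of the observations divided by n
median_height = median(heights)  # middle value of the sorted observations

print(f"mean   = {mean_height:.3f} m")
print(f"median = {median_height:.2f} m")
```

For this small sample the two measures of location nearly coincide, which is what one would expect when there are no outliers.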
The Mode
The mode is the observation that occurs most frequently. In our previous example there is no mode.
The mode is more useful for counted data, for example the number of vehicles per hour over a 24-hour period.
Guidelines for mean, median and mode
  If there are no outliers, use the mean.
  If there are outliers:
    Measurements: use the median.
    Counts: use the mode.
  Example: vehicles per hour, where peak hours have a substantial influence.
Other measures of location (See 10)
  Root mean square
  Geometric mean of n positive numbers = nth root of their product
  Harmonic mean
Measure of variation: Range
The range of a set of numbers is the largest minus the smallest.
The range of the bridge heights is 9.02 - 7.99 = 1.03 m.
It is very sensitive to unusually big or small observations.
IQR and SIQR (Inter-quartile Range and Semi-Inter-quartile Range)
The IQR is the distance between the quartiles.
The quartiles separate the ordered data into 4 equal groups of 25% each:
  1st quartile (lower)
  2nd quartile (median)
  3rd quartile (upper)
To determine the lower and upper quartiles we recommend the following positions in the ordered data (n = number of observations):
  Lower quartile: 0.25n + 0.5
  Upper quartile: 0.75n + 0.5
Example
n = 9
LQ position = 0.25(9) + 0.5 = 2.75
UQ position = 0.75(9) + 0.5 = 7.25
Ordered heights (m), smallest first: 7.99, 8.07, 8.15, 8.27, 8.52, 8.78, 8.89, 8.98, 9.02
LQ = 8.07 + 0.75(8.15 - 8.07) = 8.13
UQ = 8.89 + 0.25(8.98 - 8.89) = 8.91
IQR = 8.91 - 8.13 = 0.78
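The quartile positions 0.25n + 0.5 and 0.75n + 0.5 are generally fractional, so the quartile has to be interpolated between two neighbouring ordered observations. The sketch below assumes linear interpolation, which reproduces the values 8.13 and 8.91; the helper name `value_at_position` is hypothetical.

```python
def value_at_position(sorted_data, position):
    """Linearly interpolate the value at a 1-based, possibly fractional position."""
    lower_index = int(position)            # whole part of the position (1-based)
    fraction = position - lower_index      # how far towards the next observation
    lower_value = sorted_data[lower_index - 1]
    if fraction == 0:
        return lower_value
    upper_value = sorted_data[lower_index]
    return lower_value + fraction * (upper_value - lower_value)

heights = sorted([8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89])
n = len(heights)

lq = value_at_position(heights, 0.25 * n + 0.5)  # position 2.75
uq = value_at_position(heights, 0.75 * n + 0.5)  # position 7.25
iqr = uq - lq

print(f"LQ = {lq:.2f}, UQ = {uq:.2f}, IQR = {iqr:.2f}")
```

Library routines for quartiles may use a different position rule, so their results can differ slightly from this one.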
Mean Absolute Deviation (MAD)
$$\text{MAD} = \frac{\sum |x - \bar{x}|}{n}$$
Observation   Deviation from mean   Absolute deviation from mean
Height (m)    x - x̄                 |x - x̄|
8.78           0.26                 0.26
8.07          -0.45                 0.45
8.15          -0.37                 0.37
8.98           0.46                 0.46
9.02           0.50                 0.50
7.99          -0.53                 0.53
8.52           0.00                 0.00
8.27          -0.25                 0.25
8.89           0.37                 0.37
Sum            0.00                 3.19
$$\text{MAD} = \frac{3.19}{9} = 0.354$$
Standard Deviation
  SD is approximately the average of the difference between each value and the mean.
  SD is what is most commonly used as a measure of variation.
  The square of the SD is called the variance.
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}, \qquad \text{var} = S^2$$
Standard Deviation: Example
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}$$
Observation   Deviation from mean   Squared deviation from mean
Height (m)    x - x̄                 (x - x̄)^2
8.78           0.26                 0.0676
8.07          -0.45                 0.2025
8.15          -0.37                 0.1369
8.98           0.46                 0.2116
9.02           0.50                 0.2500
7.99          -0.53                 0.2809
8.52           0.00                 0.0000
8.27          -0.25                 0.0625
8.89           0.37                 0.1369
Sum            0.00                 1.3489
$$S = \sqrt{\frac{1.3489}{9-1}} = 0.411$$
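The mean absolute deviation and the standard deviation of the bridge heights can be recomputed as a check. The sketch below uses the unrounded mean, so the MAD comes out at about 0.355 rather than the 0.354 obtained from deviations rounded to two decimals; variable names are illustrative.

```python
from math import sqrt

heights = [8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89]
n = len(heights)
mean_height = sum(heights) / n

deviations = [x - mean_height for x in heights]

# Mean absolute deviation: average size of the deviations, divided by n
mad = sum(abs(d) for d in deviations) / n

# Sample variance and standard deviation: divide by n - 1
variance = sum(d * d for d in deviations) / (n - 1)
sd = sqrt(variance)

print(f"MAD = {mad:.3f}, variance = {variance:.4f}, S = {sd:.3f}")
```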
Standard Deviation: Typical Observations
Description            Dataset size         Typical observations
Small dataset          10 observations      Few observations further than 1 SD from the mean
Medium-sized dataset   100 observations     Few observations further than 2 SD from the mean
Large dataset          1000 observations    Few observations further than 3 SD from the mean
The box-and-whisker plot
[Figure: box-and-whisker plot on an axis from 10 to 70. The box runs from LQ to UQ with the median marked inside it, the whiskers extend from the box to the extremes, and the values 20 and 55 are marked.]
Cumulative frequency plot
The vertical axis represents the number or percentage of observations that are less than or equal to the x value on the horizontal axis.
Flood Example
[Figure: cumulative frequency plot. Horizontal axis: size of flood (0, 12, 13, 16, 17, 20, 26, 33, 38, 50, 61, 70). Vertical axis: frequency, 0 to 1.]
Flood Example: IQR and the box-and-whisker plot
[Figure: the same cumulative frequency plot (size of flood against frequency, 0 to 1) with a box-and-whisker plot marking the values 12, 16, 23, 38 and 61.]
Large Datasets
See the arm length measurements (mm) below.
723 800 725 784 708 795 758 757 791 761
737 777 780 763 765 860 812 754 776 788
787 835 777 808 761 727 811 765 751 759
771 830 767 811 767 774 802 706 796 801
817 829 805 834 802 801 811 831 817 801
803 776 796 806 786 757 833 811 791 796
801 809 727 804 781 763 774 747 783 764
816 720 802 789 830 854 830 795 785 759
809 740 798 775 806 754 754 758 694 765
810 786 767 782 833 846 845 846 817 741
Tabulate this data
Choosing classes:
  Use convenient round numbers.
  Use sufficiently many classes: avoid wide grouping.
  Use sufficiently few classes: between 5 and 15 classes.
  Each observation must go into only one class.
Example 1
Range          Observations   Cumulative count
Up to 699       1               1
700-719         2               3
720-739         6               9
740-759        13              22
760-769        12              34
770-779         8              42
780-789        11              53
790-799         7              60
800-809        16              76
810-829        11              87
830-849        11              98
850-869         2             100
870 and over    0             100
Stem-and-leaf plots
69 | 4
70 | 6 6
71 |
72 | 0 7 7 5 3
73 | 7
74 | 2 0 7
75 | 8 4 4 9 7 9 1 4 7 8
76 | 7 5 4 3 9 7 7 5 1 5 3 1
77 | 5 4 6 4 1 7 6 7
78 | 2 6 5 9 3 1 6 7 8 0 3
79 | 8 5 6 1 6 1 5
80 | 6 9 2 4 9 1 6 3 1 1 2 5 1 2 8 0
81 | 7 0 6 1 7 1 7 1 1 2
82 | 9
83 | 3 0 0 3 1 4 0 5
84 | 6 5 6
85 | 4
86 | 0
Histogram presentation
[Figure: histogram. Horizontal axis: arm length (mm), 650 to 900. Vertical axis: proportion of observations per class width, 0 to 16.]
Note the change in range size: the plotted frequency is halved where a range is double the size of the selected standard range. Rather keep the range size constant.
Cumulative frequency plot
[Figure: cumulative frequency plot. Horizontal axis: arm length (mm), 680 to 880. Vertical axis: % of observations.]
Temperature-Rainfall example (Class Example)
Temperature   Rainfall
3.3           2.2
3.2           2.9
3.1           4.7
3.2           3.9
3.0           3.4
3.1           3.6
[Figure: scatter plot of rainfall (hundreds of mm, 1.5 to 5.0) against temperature (thousands of degrees, 2.9 to 3.3).]
Temperature-Rainfall example (Class Example)
Temperature   Rainfall   (Temperature)^2   (Rainfall)^2   Temperature x Rainfall
3.1           3.6         9.61             12.96          11.16
3.0           3.4         9.00             11.56          10.20
3.2           3.9        10.24             15.21          12.48
3.1           4.7         9.61             22.09          14.57
3.2           2.9        10.24              8.41           9.28
3.3           2.2        10.89              4.84           7.26
Sum: 18.9     20.7       59.59             75.07          64.95
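Temperature-Rainfall example (Class Example)
With x = temperature, y = rainfall and n = 6:
$$S_{xx} = \sum x^2 - \tfrac{1}{n}\left(\sum x\right)^2 = 59.59 - \tfrac{1}{6}(18.9)^2 = 0.055$$
$$S_{yy} = \sum y^2 - \tfrac{1}{n}\left(\sum y\right)^2 = 75.07 - \tfrac{1}{6}(20.7)^2 = 3.655$$
$$S_{xy} = \sum xy - \tfrac{1}{n}\left(\sum x\right)\left(\sum y\right) = 64.95 - \tfrac{1}{6}(18.9)(20.7) = -0.255$$
Correlation coefficient:
$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} = \frac{-0.255}{\sqrt{0.055 \times 3.655}} = -0.57$$
Regression line y = a + bx:
$$b = \frac{S_{xy}}{S_{xx}} = \frac{-0.255}{0.055} = -4.64$$
$$a = \bar{y} - b\,\bar{x} = 3.45 - (-4.64)(0.15) = 4.15$$
The value 0.15 appears to be the mean temperature measured from 3.0 (the mean of the tabulated values is 18.9/6 = 3.15), so a = 4.15 is the fitted rainfall at a temperature of 3.0.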
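The correlation and regression figures for the temperature-rainfall data can be reproduced from the six pairs directly. The sketch below works in the tabulated units, so its intercept is the value of the fitted line at temperature 0; it also evaluates the line at 3.0, which is an assumption about how an intercept of 4.15 would arise.

```python
from math import sqrt

temperature = [3.3, 3.2, 3.1, 3.2, 3.0, 3.1]  # x
rainfall = [2.2, 2.9, 4.7, 3.9, 3.4, 3.6]     # y
n = len(temperature)

sum_x = sum(temperature)
sum_y = sum(rainfall)

# Corrected sums of squares and products
s_xx = sum(x * x for x in temperature) - sum_x ** 2 / n
s_yy = sum(y * y for y in rainfall) - sum_y ** 2 / n
s_xy = sum(x * y for x, y in zip(temperature, rainfall)) - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)           # correlation coefficient
b = s_xy / s_xx                        # slope
a = sum_y / n - b * sum_x / n          # intercept in the tabulated units
fitted_at_3 = a + b * 3.0              # fitted rainfall at temperature 3.0

print(f"Sxx={s_xx:.3f} Syy={s_yy:.3f} Sxy={s_xy:.3f}")
print(f"r={r:.2f} b={b:.2f} a={a:.2f} fitted at 3.0={fitted_at_3:.2f}")
```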
Probability
What is meant by probability?
The probability of an event is the chance that it will occur.
A priori approach: sometimes the experimental set-up is so clear that we know the probabilities in advance of collecting any data.
A priori approach: examples
  Coin: P[x = head] = P[x = tail] = 0.5
  Dice: P[x = 1] = P[x = 2] = P[x = 3] = P[x = 4] = P[x = 5] = P[x = 6] = 1/6
  Cards: P[x = 4 of hearts] = 1/52; P[x = ace] = 4/52 = 1/13
Empirical approach: by having enough experimental data.
Example: break 100 concrete cubes. The strength of 30 cubes is more than 50 MPa, so
P[x > 50] = 30/100 = 0.3
Subjective assessments
What is the probability that it will rain on the 27th of July in Pretoria?
  Argument: during winter it does not rain regularly in Pretoria.
  No experimental information.
  Intuition: P[x = rain] = 0.05
Probability Scale
Probability   Event
1.0           Dying
0.9           Passing Statistics
0.5           Coin
0.167         Dice
0             Swimming across the Atlantic Ocean
Mutually-exclusive events
This means they cannot occur together. Consequently, the probability of one or the other of two mutually-exclusive events occurring is the sum of their individual probabilities.
Example: Mutually-exclusive events
Number of students on campus: 25000, of whom B.Eng = 5000 and B.Com = 8000.
Choosing a student:
  P[B.Eng] = 5000/25000 = 0.2
  P[B.Com] = 8000/25000 = 0.32
  P[B.Eng or B.Com] = 0.2 + 0.32 = 0.52
  P[B.Eng and B.Com] = 0
Venn Diagram: Mutually-exclusive events
[Figure: Venn diagram with non-overlapping regions B.Eng, B.Com and Other.]
  P[B.Eng] = 0.2
  P[B.Com] = 0.32
  P[Other] = 0.48
  Sum of Ps = 1
Independent events
Two experiments are independent where they do not influence each other. The probability of both happening is the product of their individual probabilities.
Example:
  What sex is a person? Man: A, Woman: B
  Does the person own a vehicle? Yes: C, No: D
  P[A] = 0.45, P[B] = 0.55, P[C] = 0.8, P[D] = 0.2
Example
                Man (0.45)   Woman (0.55)
Vehicle: Yes    0.36         0.44
Vehicle: No     0.09         0.11
P[A and C] = 0.45 x 0.8 = 0.36
P[A and D] = 0.45 x 0.2 = 0.09
P[B and C] = 0.55 x 0.8 = 0.44
P[B and D] = 0.55 x 0.2 = 0.11
General Addition Rule
[Figure: Venn diagram of two overlapping events A and C.]
A is made up of {A but not C} plus {A and C}.
C is made up of {C but not A} plus {A and C}.
Therefore {A or C} is {A} plus {C}, minus {A and C}.
Example
                Man (0.45)   Woman (0.55)
Vehicle: Yes    0.36         0.44
Vehicle: No     0.09         0.11
P[A and C] = 0.45 x 0.8 = 0.36
P[A and D] = 0.45 x 0.2 = 0.09
P[B and C] = 0.55 x 0.8 = 0.44
P[B and D] = 0.55 x 0.2 = 0.11
P[A or C] = P[A] + P[C] - P[A and C] = 0.45 + 0.8 - 0.36 = 0.89
P[B or C] = P[B] + P[C] - P[B and C] = 0.55 + 0.8 - 0.44 = 0.91
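The joint probabilities and the general addition rule for the sex and vehicle-ownership example can be sketched in Python as follows. The dictionary layout and names are illustrative choices; the independence of the two characteristics is the assumption stated in the example.

```python
# Marginal probabilities from the example
p_sex = {"man": 0.45, "woman": 0.55}
p_vehicle = {"yes": 0.8, "no": 0.2}

# Independence: each joint probability is the product of the marginals
joint = {
    (sex, owns): p_s * p_v
    for sex, p_s in p_sex.items()
    for owns, p_v in p_vehicle.items()
}

# General addition rule: P[A or C] = P[A] + P[C] - P[A and C]
p_man_or_vehicle = p_sex["man"] + p_vehicle["yes"] - joint[("man", "yes")]
p_woman_or_vehicle = p_sex["woman"] + p_vehicle["yes"] - joint[("woman", "yes")]

print(joint)
print(f"P[man or vehicle]   = {p_man_or_vehicle:.2f}")
print(f"P[woman or vehicle] = {p_woman_or_vehicle:.2f}")
```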
Same Example: Probability Tree
Sex              Vehicle       Joint probability
Man (P=0.45)     Yes (P=0.8)   0.36
Man (P=0.45)     No (P=0.2)    0.09
Woman (P=0.55)   Yes (P=0.8)   0.44
Woman (P=0.55)   No (P=0.2)    0.11
General Multiplication Rule
[Figure: Venn diagram of two overlapping events A and C.]
$$P\{A \text{ and } C\} = P\{A\}\,P\{C \mid A\}$$
or
$$P\{A \text{ and } C\} = P\{C\}\,P\{A \mid C\}$$
where the vertical line means "given that"; P{C | A} is referred to as a conditional probability.
Example
12 people: 9 native born, 3 foreign born.
If we select 2 people, what is the probability that both are foreign born?
  P[F1] = 3/12 = 0.25
  Once F1 has occurred we know that there are 11 people remaining, of whom 2 are foreign born.
  P[F2 | F1] = 2/11 = 0.1818
  P[F1 and F2] = 0.25 x 0.1818 = 0.04545 = 1/22
Example
200 students: 77 study Accounting, 64 study Law, 92 study neither.
The other 3 numbers must total 200 - 92 = 108.
  Only Accounting = 108 - 64 = 44
  Only Law = 108 - 77 = 31
  Both = 77 - 44 (or 64 - 31) = 33
[Figure: Venn diagram with A only = 44, A and L = 33, L only = 31, neither = 92.]
$$P[A \text{ and } L] = \frac{33}{200} = 0.165$$
$$P[\text{Only } A \mid A] = \frac{44}{77} = \frac{4}{7}$$
Class Example
A concrete beam will fail if the concrete is too weak or the load is too high.
  P[Weak] = 0.2
  P[High] = 0.3
  P[Failure] = ?
P[Weak] = 0.2, P[High] = 0.3, P[Failure] = ?
Treating the concrete strength and the load as independent:
P[Failure] = P[Weak or High]
           = P[Weak] + P[High] - P[Weak and High]
           = P[Weak] + P[High] - P[Weak] P[High]
           = 0.2 + 0.3 - (0.2)(0.3)
           = 0.5 - 0.06 = 0.44
                          Concrete weak: Yes (0.2)   No (0.8)
Load too high: Yes (0.3)  0.06                       0.24
Load too high: No (0.7)   0.14                       0.56
P[Failure] is the sum of the three cells with at least one "Yes": 0.06 + 0.14 + 0.24 = 0.44.
P[Weak] = 0.2, P[High] = 0.3, P[Failure] = ? (probability tree)
Concrete         Load             Joint probability
Weak (P = 0.2)   High (P = 0.3)   0.06
Weak (P = 0.2)   OK (P = 0.7)     0.14
OK (P = 0.8)     High (P = 0.3)   0.24
OK (P = 0.8)     OK (P = 0.7)     0.56
P[Failure] = 0.06 + 0.14 + 0.24 = 0.44
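The beam failure probability can be checked two ways: with the general addition rule, and as the complement of "concrete OK and load OK". A minimal sketch, assuming (as the worked solution does) that weak concrete and high load are independent; variable names are illustrative.

```python
p_weak = 0.2   # P[concrete too weak]
p_high = 0.3   # P[load too high]

# Independence assumption: P[weak and high] = P[weak] * P[high]
p_weak_and_high = p_weak * p_high

# Addition rule: failure if the concrete is weak OR the load is high
p_failure = p_weak + p_high - p_weak_and_high

# Complement check: no failure only if the concrete is OK AND the load is OK
p_failure_by_complement = 1 - (1 - p_weak) * (1 - p_high)

print(f"P[Failure] = {p_failure:.2f} (complement check: {p_failure_by_complement:.2f})")
```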
Class Example
A low-water bridge is designed to allow for a flood occurring once every 10 years. Damage occurs during each flood. The bridge is also located in an active seismic region and the probability of a destructive earthquake occurring in a year is 30%. Determine the probability of damage during any given year, assuming that floods and earthquakes are statistically independent.
Class example: Solution
P[Flood] = 0.1, P[No flood] = 0.9, P[Quake] = 0.3, P[No quake] = 0.7
P[Quake and Flood] = 0.3 x 0.1 = 0.03
P[Quake or Flood] = P[Quake] + P[Flood] - P[Quake and Flood]
                  = 0.3 + 0.1 - 0.03 = 0.37
Class example
Vehicles are classified into light (10 ton). Vehicle counts show that 60% are light, 30% medium and 10% heavy vehicles. Vehicles are weighed regularly. It is known that the probability that a vehicle is light and overloaded is 0.12. Calculate the probability that the next vehicle is heavy or that the next vehicle is overloaded. Assume statistical independence.
Class example: solution
P[light] = 0.6
P[light and overloaded] = 0.12
P[light and overloaded] = P[light] P[overloaded]
0.12 = 0.6 x P[overloaded]
P[overloaded] = 0.12/0.6 = 0.2
Example continued
Vehicle        Overloaded   Joint probability
Light (0.6)    Yes (0.2)    0.12
Light (0.6)    No (0.8)     0.48
Medium (0.3)   Yes (0.2)    0.06
Medium (0.3)   No (0.8)     0.24
Heavy (0.1)    Yes (0.2)    0.02
Heavy (0.1)    No (0.8)     0.08
P[Overloaded or Heavy] = P[overloaded] + P[heavy] - P[Overloaded and Heavy]
                       = 0.2 + 0.1 - 0.02 = 0.28
Exclusive and Independent
Exclusive: events that never occur together.
Independent: the proportion of times A occurs is the same whether or not B occurs.
Using a Venn diagram
[Figure: Venn diagram of events A and B with w = A only, x = A and B, y = B only, z = outside both.]
  A & B mutually exclusive: x = 0
  A & B independent: w/x = z/y
  w & x refer to events in A; y & z refer to events outside A.
  See paragraph 66. NB!
Example: Independent features
There are 200 students in total. 80 study biology, 90 study neither biology nor geography, and the choice of whether a student does or does not study biology is independent of their choice of studying geography.
Let us determine all the required numbers in the Venn diagram.
Example solution
[Figure: Venn diagram of B (biology) and G (geography) with w = B only, x = B and G, y = G only, z = neither.]
z = 90 (given)
w + x = 80
Total outside B = 200 - 80 = 120
Number inside G but outside B = 120 - 90 = 30, so y = 30
w/x = z/y = 90/30, so w = 3x
w + x = 4x = 80, so x = 20 and w = 60
Conditional Probability
N tests consist of: N = n11 + n22 + n12 + n21
[Figure: Venn diagram of events A and B with n12 = A only, n11 = A and B, n21 = B only, n22 = outside both.]
$$P[A \text{ and } B] = \frac{n_{11}}{N}, \qquad P[A] = \frac{n_{12}+n_{11}}{N}, \qquad P[B] = \frac{n_{21}+n_{11}}{N}$$
$$P[A \mid B] = \frac{n_{11}}{n_{11}+n_{21}} = \frac{n_{11}/N}{(n_{11}+n_{21})/N} = \frac{P[A \text{ and } B]}{P[B]}$$
$$P[B \mid A] = \frac{P[A \text{ and } B]}{P[A]}$$
Class Example: van As & Joubert
Area 1: 400 vehicles/hour, 1% buses
Area 2: 600 vehicles/hour, 3% buses
Calculate the probability that the next vehicle will be a bus.
Example: Solution
P[V from A1] = 0.4, P[V from A2] = 0.6
P[B | A1] = 0.01, P[B | A2] = 0.03
P[B | A1] = P[B and A1]/P[A1], so P[B and A1] = P[B | A1] P[A1] = 0.01 x 0.4 = 0.004
P[B | A2] = P[B and A2]/P[A2], so P[B and A2] = P[B | A2] P[A2] = 0.03 x 0.6 = 0.018
P[B] = 0.004 + 0.018 = 0.022
P[M] = 1 - 0.022 = 0.978
Here V is a vehicle, B a bus and M a motor vehicle. In the area-by-vehicle-type table, the bus cells are 0.004 (Area 1) and 0.018 (Area 2).
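The bus calculation is an instance of summing P[B | area] x P[area] over the areas. The sketch below spells that out; the dictionary keys are illustrative names, and the probabilities of a vehicle coming from each area are taken as proportional to the hourly flows, as in the solution.

```python
# Hourly vehicle flows and bus shares per area (from the example)
flows = {"area_1": 400, "area_2": 600}
p_bus_given_area = {"area_1": 0.01, "area_2": 0.03}

total_flow = sum(flows.values())
p_area = {area: flow / total_flow for area, flow in flows.items()}

# P[bus and area] = P[bus | area] * P[area], then add over the areas
p_bus_and_area = {area: p_bus_given_area[area] * p_area[area] for area in flows}
p_bus = sum(p_bus_and_area.values())
p_motor_vehicle = 1 - p_bus

print(p_bus_and_area)
print(f"P[bus] = {p_bus:.3f}, P[motor vehicle] = {p_motor_vehicle:.3f}")
```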
Bayes' Theorem
$$P[A \mid B] = \frac{P[A \text{ and } B]}{P[B]}$$
$$P[B \mid A] = \frac{P[A \text{ and } B]}{P[A]} \quad\Rightarrow\quad P[A \text{ and } B] = P[B \mid A]\,P[A]$$
$$P[A \mid B] = \frac{P[B \mid A]\,P[A]}{P[B]}$$
Class Example
Ready-mix concrete is delivered at a site. This concrete is used to cast columns.
  B0: weak concrete
  B1: good concrete
  K0: column fails
  K1: column does not fail
  P[B0] = 0.1
  P[B1] = 0.9
  P[K0 | B0] = 0.6
  P[K1 | B1] = 0.98
  P[K0] = ?
  P[B0 | K0] = ?
Example Solution
P[K0 | B0] = P[K0 and B0]/P[B0], so P[K0 and B0] = P[K0 | B0] P[B0] = 0.6 x 0.1 = 0.06
P[K1 | B1] = P[K1 and B1]/P[B1], so P[K1 and B1] = P[K1 | B1] P[B1] = 0.98 x 0.9 = 0.882
             P[K1]    P[K0]
P[B1] = 0.9  0.882    0.018
P[B0] = 0.1  0.04     0.06
P[K0] = 0.018 + 0.060 = 0.078
Bayes:
$$P[B_0 \mid K_0] = \frac{P[K_0 \mid B_0]\,P[B_0]}{P[K_0]} = \frac{0.6 \times 0.1}{0.078} = 0.77$$
Class Example: van As & Joubert
Gravel comes from a quarry. 70% of the material is classified as good material. 20% of the good material fails during testing and 10% of the weak material does not fail during testing. When testing the material, calculate the probability that the material will pass the test. Then calculate the probability that the material is good, given that it passes the test.
Example: Solution
Material     Test         Joint probability
Good (0.7)   Fail (0.2)   0.14
Good (0.7)   Pass (0.8)   0.56
Weak (0.3)   Fail (0.9)   0.27
Weak (0.3)   Pass (0.1)   0.03
P[Pass] = 0.56 + 0.03 = 0.59
$$P[\text{Good} \mid \text{Pass}] = \frac{P[\text{Pass} \mid \text{Good}]\,P[\text{Good}]}{P[\text{Pass}]} = \frac{0.8 \times 0.7}{0.56 + 0.03} = 0.949$$
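The gravel calculation follows the usual two-step pattern: first the overall probability of passing, then Bayes' theorem for the probability of good material given a pass. A small sketch with illustrative variable names:

```python
p_good = 0.7
p_weak = 1 - p_good

p_pass_given_good = 1 - 0.2   # 20% of good material fails the test
p_pass_given_weak = 0.1       # 10% of weak material does not fail

# Overall probability of a pass: add the two "pass" branches of the tree
p_pass = p_pass_given_good * p_good + p_pass_given_weak * p_weak

# Bayes: P[good | pass] = P[pass | good] P[good] / P[pass]
p_good_given_pass = p_pass_given_good * p_good / p_pass

print(f"P[Pass] = {p_pass:.2f}")
print(f"P[Good | Pass] = {p_good_given_pass:.3f}")
```

The same pattern, with B0, B1, K0 and K1 in place of good, weak, pass and fail, gives P[B0 | K0] for the concrete column example.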
Combinations and Permutations
Let's start with an example: how many 6-letter words can we make with the letters of the word GROUND?
  1st letter: 6 choices
  2nd letter: 5 choices
  3rd letter: 4 choices
  4th letter: 3 choices
  5th letter: 2 choices
  6th letter: 1 choice
  Total = 6 x 5 x 4 x 3 x 2 x 1 = 6!
How many 3-letter words can we make?
  1st letter: 6 choices
  2nd letter: 5 choices
  3rd letter: 4 choices
  Total = 6 x 5 x 4 = 6!/3!
Permutations
nPx is the number of permutations consisting of x elements from a set of n elements.
$${}^{n}P_{x} = \frac{n!}{(n-x)!}$$
For permutations the order is important, therefore GRO ≠ ORG.
For combinations the order is not important, therefore GRO = ORG.
Combinations
nCx is the number of combinations that consist of x elements from a set of n elements.
If we can make nCx combinations, then for each combination we can find x! permutations:
$${}^{n}C_{x}\; x! = {}^{n}P_{x} = \frac{n!}{(n-x)!}$$
Therefore
$${}^{n}C_{x} = \frac{n!}{(n-x)!\,x!}$$
Example: National Lottery
Choose 6 balls out of 49; the order is not important (number of combinations).
$${}^{49}C_{6} = \frac{n!}{(n-x)!\,x!} = \frac{49!}{(49-6)!\,6!} \approx 14 \times 10^{6}$$
If you buy 1 ticket every week you will, on average, win the lotto once in 269 000 years.
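The permutation and combination counts can be checked with the standard library (`math.perm` and `math.comb`, available from Python 3.8). The 52 draws per year used for the waiting time is an assumption that matches "1 ticket every week".

```python
from math import comb, factorial, perm

# 6-letter and 3-letter "words" from the six distinct letters of GROUND
six_letter_words = factorial(6)   # 6!
three_letter_words = perm(6, 3)   # 6!/(6-3)! = 6 x 5 x 4

# Lottery: choose 6 balls out of 49, order not important
lottery_combinations = comb(49, 6)

# One ticket per week: expected number of years between wins
years_per_win = lottery_combinations / 52

print(six_letter_words, three_letter_words)
print(f"{lottery_combinations:,} combinations, about {years_per_win:,.0f} years per win")
```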
Binomial Distribution
To introduce it, suppose we have a six-sided die with four sides painted blue and two sides painted red. At any roll, there is a one-third chance of getting a red face and a two-thirds chance of getting a blue face.
Determine the probability of getting 3 red faces consecutively.
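Binomial Distribution
The binomial distribution enables us to calculate the probability of obtaining exactly x successes in n trials, when the probability of success at any one trial is p:
$$P[X = x] = {}^{n}C_{x}\; p^{x}\,(1-p)^{n-x}$$
For the die (n = 3, p = 1/3, X = number of red faces):
$$P[X=0] = {}^{3}C_{0}\left(\tfrac{1}{3}\right)^{0}\left(1-\tfrac{1}{3}\right)^{3} = 1\left(\tfrac{2}{3}\right)^{3} = 0.296$$
$$P[X=1] = {}^{3}C_{1}\left(\tfrac{1}{3}\right)^{1}\left(1-\tfrac{1}{3}\right)^{2} = 3\left(\tfrac{1}{3}\right)\left(\tfrac{2}{3}\right)^{2} = 0.444$$
$$P[X=2] = {}^{3}C_{2}\left(\tfrac{1}{3}\right)^{2}\left(1-\tfrac{1}{3}\right)^{1} = 3\left(\tfrac{1}{3}\right)^{2}\left(\tfrac{2}{3}\right) = 0.222$$
$$P[X=3] = {}^{3}C_{3}\left(\tfrac{1}{3}\right)^{3}\left(1-\tfrac{1}{3}\right)^{0} = 1\left(\tfrac{1}{3}\right)^{3} = 0.037$$
[Figure: probability tree for three rolls of the die, each roll branching into R (1/3) and B (2/3); the branch RRR has probability 0.037.]
[Figure: bar chart of probability (0 to 0.5) against the number of red faces (0, 1, 2, 3).]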
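The four probabilities for the red-and-blue die follow directly from the binomial formula. A minimal sketch; `binomial_pmf` is a hypothetical helper name, not something defined in the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n_rolls = 3
p_red = 1 / 3

probabilities = [binomial_pmf(x, n_rolls, p_red) for x in range(n_rolls + 1)]

for x, prob in enumerate(probabilities):
    print(f"P[X = {x}] = {prob:.3f}")
print(f"sum = {sum(probabilities):.3f}")
```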
Class Example
A company manufactures water pipes that can resist 500 kPa water pressure. It is known that the probability that a pipe will burst at 500 kPa is 0.05. If 4 water pipes are selected out of a group of pipes, what is the probability that:
  1. Exactly 1 pipe will burst
  2. More than 2 pipes burst
  3. All 4 pipes will burst
  4. Any pipe bursts
Example Solution
$$P[X = x] = {}^{n}C_{x}\; p^{x}\,(1-p)^{n-x}, \qquad n = 4,\ p = 0.05$$
$$P[X=0] = {}^{4}C_{0}\,(0.05)^{0}(1-0.05)^{4} = 1\,(0.95)^{4} = 0.8145$$
$$P[X=1] = {}^{4}C_{1}\,(0.05)^{1}(1-0.05)^{3} = 4\,(0.05)(0.95)^{3} = 0.1715$$
$$P[X=2] = {}^{4}C_{2}\,(0.05)^{2}(1-0.05)^{2} = 6\,(0.05)^{2}(0.95)^{2} = 0.0135$$
$$P[X=3] = {}^{4}C_{3}\,(0.05)^{3}(1-0.05)^{1} = 4\,(0.05)^{3}(0.95) = 0.00048$$
$$P[X=4] = {}^{4}C_{4}\,(0.05)^{4}(1-0.05)^{0} = 1\,(0.05)^{4} = 0.00001$$
1. P[X = 1] = 0.1715
2. P[X > 2] = 0.00048 + 0.00001 = 0.00049
3. P[X = 4] = 0.00001
4. Any pipe bursts = 1 - P[X = 0] = 1 - 0.8145 = 0.1855
Note that the sum of P[X = x] over all x is 1.
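The pipe-burst answers can be recomputed without intermediate rounding. In the sketch below (helper name hypothetical), "more than 2" adds the probabilities for 3 and 4 bursts and "any pipe" uses the complement of no bursts; the unrounded P[X > 2] is about 0.00048, slightly below the 0.00049 obtained by adding rounded terms.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n_pipes = 4
p_burst = 0.05
pmf = [binomial_pmf(x, n_pipes, p_burst) for x in range(n_pipes + 1)]

p_exactly_one = pmf[1]
p_more_than_two = pmf[3] + pmf[4]
p_all_four = pmf[4]
p_any = 1 - pmf[0]   # complement of "no pipe bursts"

print(f"exactly 1: {p_exactly_one:.4f}")
print(f"more than 2: {p_more_than_two:.5f}")
print(f"all 4: {p_all_four:.6f}")
print(f"any: {p_any:.4f}")
```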
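Poisson Distribution
Binomial distribution:
$$P[X = x] = \frac{n!}{x!\,(n-x)!}\; p^{x}\,(1-p)^{n-x}$$
For large n and small p, where np is constant, let $\lambda = np$ (the expected value), so that $p = \lambda/n$:
$$P[X = x] = \frac{n!}{x!\,(n-x)!}\left(\frac{\lambda}{n}\right)^{x}\left(1-\frac{\lambda}{n}\right)^{n-x}
= \frac{n(n-1)\cdots(n-x+1)}{n^{x}}\;\frac{\lambda^{x}}{x!}\left(1-\frac{\lambda}{n}\right)^{n}\left(1-\frac{\lambda}{n}\right)^{-x}$$
For large n:
$$\frac{n(n-1)\cdots(n-x+1)}{n^{x}} \to 1, \qquad \left(1-\frac{\lambda}{n}\right)^{-x} \to 1, \qquad \left(1-\frac{\lambda}{n}\right)^{n} \to e^{-\lambda}$$
Therefore
$$P[X = x] = \frac{\lambda^{x}\,e^{-\lambda}}{x!}$$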
Class Example
Pretoria experiences on average 2.8 large rain storms (> 50 mm/hour) per year. Calculate the probability that next year there will be:
  1. 1
  2. 2
  3. 3
  4. More than 2
large rain storms.
Class Example: Solution
$$P[X = x] = \frac{\lambda^{x}\,e^{-\lambda}}{x!}, \qquad \lambda = 2.8$$
$$P[X=1] = \frac{2.8^{1}\,e^{-2.8}}{1!} = 0.17$$
$$P[X=2] = \frac{2.8^{2}\,e^{-2.8}}{2!} = 0.238$$
$$P[X=3] = \frac{2.8^{3}\,e^{-2.8}}{3!} = 0.222$$
$$P[X>2] = 1 - \left(P[X=0] + P[X=1] + P[X=2]\right) = 1 - \left(\frac{2.8^{0}\,e^{-2.8}}{0!} + 0.17 + 0.238\right) = 1 - 0.469 = 0.531$$
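The rain-storm probabilities can be evaluated directly from the Poisson formula with an average of 2.8 storms per year. A minimal sketch; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the expected number is lam."""
    return lam ** x * exp(-lam) / factorial(x)

storms_per_year = 2.8

p_one = poisson_pmf(1, storms_per_year)
p_two = poisson_pmf(2, storms_per_year)
p_three = poisson_pmf(3, storms_per_year)

# More than 2 storms: complement of 0, 1 or 2 storms
p_more_than_two = 1 - sum(poisson_pmf(x, storms_per_year) for x in range(3))

print(f"P[X=1] = {p_one:.3f}, P[X=2] = {p_two:.3f}, P[X=3] = {p_three:.3f}")
print(f"P[X>2] = {p_more_than_two:.3f}")
```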
Poisson Distribution vs. Binomial
[Figure: bar chart comparing the Poisson and binomial distributions. Vertical axis: P[X = x], 0 to 0.2. Horizontal axis: 0 to 20.]
Class Example
Vehicle counts at a shopping centre show that an average of 120 vehicles arrive per hour. First using the Poisson distribution, and then using the binomial distribution with intervals of 2 seconds, calculate the probability that in the next minute:
  1. No vehicles arrive
  2. 2 vehicles arrive
  3. At least 2 vehicles arrive
Class Example: Poisson
$$\lambda = \frac{120}{60} = 2\ \text{vehicles/min}, \qquad P[X = x] = \frac{\lambda^{x}\,e^{-\lambda}}{x!}$$
P[X = 0] = 0.135
P[X = 2] = 0.271
P[X ≥ 2] = 1 - (P[X = 0] + P[X = 1]) = 0.594
Class Example: Binomial
Take 2-second intervals, therefore n = 30 experiments/min.
$$p = \frac{2 \times 120}{3600} = 0.0667, \qquad np = 2\ \text{vehicles/min}$$
$$P[X = x] = \frac{n!}{x!\,(n-x)!}\; p^{x}\,(1-p)^{n-x}$$
P[X = 0] = 0.126
P[X = 1] = 0.271
P[X = 2] = 0.280
P[X = 3] = 0.187
P[X = 4] = 0.090
P[X = 5] = 0.034
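The two models for the vehicle arrivals can be put side by side. The sketch below assumes, as the example does, a Poisson mean of 2 vehicles per minute and a binomial model with 30 two-second intervals; it uses the unrounded p = 2/30, so the last digit of some binomial values may differ from figures computed with p rounded to 0.0667.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    return lam ** x * exp(-lam) / factorial(x)

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

lam = 120 / 60          # vehicles per minute
n_intervals = 30        # 2-second intervals in one minute
p_arrival = lam / n_intervals

poisson = [poisson_pmf(x, lam) for x in range(6)]
binomial = [binomial_pmf(x, n_intervals, p_arrival) for x in range(6)]

poisson_at_least_two = 1 - poisson[0] - poisson[1]
binomial_at_least_two = 1 - binomial[0] - binomial[1]

for x in range(6):
    print(f"x={x}: Poisson {poisson[x]:.3f}  Binomial {binomial[x]:.3f}")
print(f"at least 2: Poisson {poisson_at_least_two:.3f}  Binomial {binomial_at_least_two:.3f}")
```

With n = 30 the two sets of probabilities are already close, which is the point of the comparison between the distributions.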
Normal Distribution
[Figure: bell-shaped normal distribution curve, y against x.]
$$y(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
Standard Normal Distribution
$$y(z) = \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}z^{2}}, \qquad z = \frac{x-\mu}{\sigma}$$
[Figure: standard normal curve, y(z) against z, centred on 0 with -1 and 1 marked and a peak value of $1/\sqrt{2\pi}$.]
Properties of the normal distribution
  Continuous function
  y(x) is the probability density
  Described completely by 2 parameters, $\mu$ and $\sigma$
  Probability density tends to 0 where $x \to \pm\infty$
  dy/dx = 0 at the maximum, where $x = \mu$ and $y(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$
  Area under the function = 1
Standard Normal Distribution
  $\mu = 0$
  SD = 1
  Function symmetric
  The maximum is then $y(0) = \frac{1}{\sqrt{2\pi}}$
Normalising/Transforming the Normal Distribution: Concrete Normal Distribution
[Figure: normal curve against x (0, 50, 100 marked) with $\mu = 50$ and $\sigma = 25$.]
$$z = \frac{x - 50}{25}$$
[Figure: the transformed curve against z (-1 and 1 marked).]
$$\frac{1}{\sigma\sqrt{2\pi}} = 0.016$$
Probability
Normalising/Transforming the Normal Distribution
Steel Normal Distribution
$$\mu = 200 \qquad \sigma = 10$$
$$\frac{1}{\sigma\sqrt{2\pi}} = 0.040$$
$$z = \frac{x-\mu}{\sigma} = \frac{x-200}{10}$$
[Figure: normal curve against the X-axis with ticks at 180, 200 and 250, and the transformed curve against the Z-axis with ticks at -1 and 1]
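The transformation for the concrete and steel distributions can be sketched as below. The function names `standardise` and `peak_density` are hypothetical, and the sample x values of 75 and 190 are chosen only to illustrate one standard deviation above and below the mean.

```python
import math

def standardise(x: float, mu: float, sigma: float) -> float:
    """Transform x to the standard normal variable z."""
    return (x - mu) / sigma

def peak_density(sigma: float) -> float:
    """Height of the normal curve at its mean."""
    return 1 / (sigma * math.sqrt(2 * math.pi))

concrete_peak = peak_density(25)   # concrete: mu = 50, sigma = 25
steel_peak = peak_density(10)      # steel: mu = 200, sigma = 10

z_concrete = standardise(75, 50, 25)    # one SD above the concrete mean
z_steel = standardise(190, 200, 10)     # one SD below the steel mean

print(f"concrete peak = {concrete_peak:.3f}, z(75)  = {z_concrete}")
print(f"steel peak    = {steel_peak:.3f}, z(190) = {z_steel}")
```

A smaller standard deviation gives a taller, narrower curve, yet both map onto the same standard normal curve once expressed in z.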
Probability
Class Example
Make use of the standard normal distribution table.
Calculate the probability that z is:
1. Smaller than 0
2. Smaller than 0.8
3. Greater than 0.8
4. Greater than 1.2
5. Between 0.8 and 1.2
6. Between -0.3 and 1.2
For what value of z is the probability of getting a value smaller than z equal to:
1. 0.9
2. 0.2
What is the probability that a value lies between:
1. -1 and 1
2. -2 and 2
3. -3 and 3 SD of the mean
Probability
Example Solution
Probabilities
1. y(0) = 0.5
2. y(0.8) = 0.78814
3. 1 - y(0.8) = 1 - 0.78814 = 0.21186
4. 1 - y(1.2) = 1 - 0.88493 = 0.11507
5. y(1.2) - y(0.8) = 0.09679
6. y(1.2) - (1 - y(0.3)) = 0.50284
Z Values
1. y(z) = 0.9, therefore z = 1.28
2. y(z) = 0.8; z = 0.85, therefore z = -0.85
Probabilities
1. y(0
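Where no printed table is at hand, the same lookups can be done with Python's `statistics.NormalDist`. This is a sketch rather than the method used in class; the helper name `table` is an assumption standing in for the z-table. The inverse lookup for 0.2 gives about -0.84, close to the table reading above.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, SD 1

def table(z: float) -> float:
    """Cumulative probability P[Z < z], the value read from the z-table."""
    return std.cdf(z)

probabilities = {
    "smaller than 0": table(0),
    "smaller than 0.8": table(0.8),
    "greater than 0.8": 1 - table(0.8),
    "greater than 1.2": 1 - table(1.2),
    "between 0.8 and 1.2": table(1.2) - table(0.8),
    "between -0.3 and 1.2": table(1.2) - table(-0.3),
}

# Inverse lookups: the z below which a given proportion falls.
z_for_090 = std.inv_cdf(0.9)
z_for_020 = std.inv_cdf(0.2)

# Proportion within k standard deviations of the mean.
within_k_sd = {k: table(k) - table(-k) for k in (1, 2, 3)}

for label, p in probabilities.items():
    print(f"P[z {label}] = {p:.5f}")
print(f"z for 0.9 = {z_for_090:.2f}, z for 0.2 = {z_for_020:.2f}")
for k, p in within_k_sd.items():
    print(f"P[-{k} < z < {k}] = {p:.4f}")
```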
Probability
Class Example
A large number of concrete cubes were tested. The data set has a normal distribution with $\mu = 50$ MPa and $\sigma = 6$ MPa.
Calculate the probability that a cube's strength is:
1. Larger than 65 MPa
2. Less than 38 MPa
3. Between 45 and 60 MPa
4. Between 59 and 60 MPa (without table)
5. Between 1 standard deviation more than the average and the 3rd quartile
Also calculate:
6. The strength that ensures only 5% of all cubes will be weaker than this strength.
7. The borders between which 80% of the strengths will fall.
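Probability
Class Example - Solution
$$\mu = 50 \text{ MPa} \qquad \sigma = 6 \text{ MPa}$$
$$\frac{1}{\sigma\sqrt{2\pi}} = 0.066$$
$$z = \frac{x-\mu}{\sigma} = \frac{x-50}{6}$$
[Figure: Normal Distribution of strength in MPa against the X-axis with ticks at 0, 44, 50, 56 and 100; Standard Normal Distribution against the Z-axis with ticks at -1 and 1]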
Probability
Class Example - Solution
1.
$$z = \frac{65-50}{6} = 2.5 \qquad y(2.5) = 0.99379$$
$$P[x > 65] = 1 - y(2.5) = 1 - 0.99379 = 0.00621$$
[Figure: standard normal curve against the Z-axis; the required area is the tail to the right of z]
(cont.)
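Probability
Class Example - Solution
2.
$$z = \frac{38-50}{6} = -2$$
$$P[x < 38] = 1 - y(2) = 1 - 0.97725 = 0.02275$$
3.
$$P[x < 60]: \quad z = \frac{60-50}{6} = 1.667 \qquad y(1.667) = 0.95254$$
$$P[x < 45]: \quad z = \frac{45-50}{6} = -0.833 \qquad y(-0.833) = 1 - y(0.833) = 0.20327$$
$$P[45 < x < 60] = P[-0.833 < z < 1.67] = 0.95254 - 0.20327 = 0.74927$$
4. Without the table, take x = 59.5 MPa (average of 59 and 60):
$$y(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} = 0.018984$$
$$P[59 < x < 60] = (60-59)\, y(59.5) = 0.018984$$
5. For the 3rd quartile y(z) = 0.75, so z = 0.67. One standard deviation above the average gives y(1) = 0.84134.
$$P = 0.84134 - 0.75 = 0.09134$$
6. For 5%, y(z) = 0.95 gives z = 1.65, but we are looking at less than, so z = -1.65.
$$z = \frac{x-\mu}{\sigma} \quad\Rightarrow\quad x = \mu + z\sigma = 50 - 1.65(6) = 40.1 \text{ MPa}$$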
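Probability
7. 10% of the strengths fall in each tail, so y(z) = 0.90 for z = 1.28.
$$z = \frac{x-\mu}{\sigma} \quad\Rightarrow\quad x = \mu \pm z\sigma = 50 \pm 1.28(6) = 50 \pm 7.68$$
$$\text{Borders: } 42.32 \le x \le 57.68 \text{ MPa}$$
[Figure: standard normal curve against the Z-axis with a 10% tail marked]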
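The seven cube answers can be cross-checked with `statistics.NormalDist`. This is a sketch under the stated $\mu = 50$ MPa and $\sigma = 6$ MPa; the variable names are illustrative. Small differences from the hand calculation (for example in part 3, or 40.13 against 40.1 in part 6) come from rounding z before the table lookup.

```python
from statistics import NormalDist

cube = NormalDist(mu=50, sigma=6)  # cube strength in MPa

p1 = 1 - cube.cdf(65)                    # 1. larger than 65 MPa
p2 = cube.cdf(38)                        # 2. less than 38 MPa
p3 = cube.cdf(60) - cube.cdf(45)         # 3. between 45 and 60 MPa
p4_strip = (60 - 59) * cube.pdf(59.5)    # 4. width x density at the midpoint
p4_exact = cube.cdf(60) - cube.cdf(59)   #    same interval from the CDF
p5 = cube.cdf(50 + 6) - 0.75             # 5. from the 3rd quartile to mean + 1 SD
x6 = cube.inv_cdf(0.05)                  # 6. strength exceeded by 95% of cubes
low, high = cube.inv_cdf(0.10), cube.inv_cdf(0.90)  # 7. central 80%

print(f"1. {p1:.5f}")
print(f"2. {p2:.5f}")
print(f"3. {p3:.5f}")
print(f"4. strip {p4_strip:.6f}, exact {p4_exact:.6f}")
print(f"5. {p5:.5f}")
print(f"6. {x6:.2f} MPa")
print(f"7. {low:.2f} to {high:.2f} MPa")
```

Part 4 shows why the strip approximation works: over a 1 MPa interval the density is nearly linear, so the rectangle at the midpoint is very close to the exact area.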
Probability
Using the cumulative frequency plot to graphically judge whether data is normally distributed
[Figure: z corresponding to the proportion of observations less than x (Y-axis, -2.5 to 2.5) against arm length in mm (X-axis, 690 to 850). A straight line indicates a normal distribution.]
Probability
Expectations of discrete random variables.
Let the probability that a random variable X takes the value x be P[X=x]. The expectation of X is:
$$E(X) = \sum x\, P[X=x]$$
The summation takes place over all the possible values of X. That is, the expectation is in effect the mean.
Probability
Expectation - Example
x        0     1     2     3     4
P[X=x]   0.1   0.2   0.4   0.2   0.1
$$E(X) = 0(0.1) + 1(0.2) + 2(0.4) + 3(0.2) + 4(0.1)$$
$$E(X) = 0 + 0.2 + 0.8 + 0.6 + 0.4$$
$$E(X) = 2$$
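The expectation sum translates directly into code. A minimal sketch using the table values above; the function name `expectation` is introduced here for illustration.

```python
def expectation(values, probs):
    """E(X): sum of each value weighted by its probability."""
    return sum(x * p for x, p in zip(values, probs))

xs = [0, 1, 2, 3, 4]
ps = [0.1, 0.2, 0.4, 0.2, 0.1]

total_probability = sum(ps)   # should be 1 for a valid distribution
mean = expectation(xs, ps)

print(f"sum of P[X=x] = {total_probability:.1f}")
print(f"E(X) = {mean:.1f}")
```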
Probability
Expectations of continuous random variables.
$$E(X) = \int x f(x)\, dx$$
Example: $f(x) = 1 - \frac{1}{2}x$ for x between 0 and 2; for other x it is 0. Then
$$E(X) = \int_{0}^{2} x\left(1 - \tfrac{1}{2}x\right) dx$$
$$E(X) = \left[\tfrac{1}{2}x^{2} - \tfrac{1}{6}x^{3}\right]_{0}^{2}$$
$$E(X) = \tfrac{2}{3}$$
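The integral can also be approximated numerically, which is useful when f(x) has no convenient antiderivative. The sketch below uses Simpson's rule, a technique swapped in here for illustration and not taken from the slides; `simpson` and `density` are hypothetical names.

```python
def simpson(f, a: float, b: float, n: int = 1000) -> float:
    """Approximate the integral of f from a to b with Simpson's rule (n even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def density(x: float) -> float:
    """f(x) = 1 - x/2 between 0 and 2, and 0 elsewhere."""
    return 1 - x / 2 if 0 <= x <= 2 else 0.0

area = simpson(density, 0, 2)                     # total probability
mean = simpson(lambda x: x * density(x), 0, 2)    # E(X)

print(f"area = {area:.6f}")
print(f"E(X) = {mean:.6f}")
```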
Probability
Variances and standard deviations of discrete random variables.
The variance of a random variable X is the expectation of $[X - E(X)]^{2}$, that is:
$$V(X) = \sum [x - E(X)]^{2}\, P[X=x]$$
Example from previous data:
$$V(X) = (-2)^{2}(0.1) + (-1)^{2}(0.2) + (0)^{2}(0.4) + (1)^{2}(0.2) + (2)^{2}(0.1)$$
$$V(X) = 0.4 + 0.2 + 0 + 0.2 + 0.4$$
$$V(X) = 1.2$$
$$\sigma = \sqrt{V(X)} = 1.1$$
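The variance of the same discrete distribution can be computed as follows. A minimal sketch with the example values x = 0 to 4; the variable names are illustrative.

```python
import math

xs = [0, 1, 2, 3, 4]
ps = [0.1, 0.2, 0.4, 0.2, 0.1]

mean = sum(x * p for x, p in zip(xs, ps))
# Variance: expected squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
std_dev = math.sqrt(variance)

print(f"E(X) = {mean:.1f}")
print(f"V(X) = {variance:.1f}")
print(f"SD   = {std_dev:.1f}")
```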
Probability
Variances and standard deviations of continuous random variables.
$$V(X) = \int [x - E(X)]^{2} f(x)\, dx$$
For our example, where $f(x) = 1 - \frac{1}{2}x$:
$$V(X) = \int_{0}^{2} \left(x - \tfrac{2}{3}\right)^{2} \left(1 - \tfrac{1}{2}x\right) dx$$
$$V(X) = \left[-\tfrac{1}{8}x^{4} + \tfrac{5}{9}x^{3} - \tfrac{7}{9}x^{2} + \tfrac{4}{9}x\right]_{0}^{2}$$
$$V(X) = \tfrac{2}{9}$$
$$\sigma = \sqrt{V(X)} = 0.47$$
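Because f(x) is a polynomial, the integrals can be evaluated exactly with rational arithmetic. The sketch below represents polynomials as coefficient lists (index = power of x); `poly_mul` and `poly_integral` are hypothetical helpers written for this example only.

```python
from fractions import Fraction
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_integral(coeffs, lo, hi):
    """Exact definite integral of sum(c_k * x**k) from lo to hi."""
    def antiderivative(x):
        return sum(c * Fraction(x) ** (k + 1) / (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

f = [Fraction(1), Fraction(-1, 2)]       # f(x) = 1 - x/2 on [0, 2]
x_poly = [Fraction(0), Fraction(1)]      # the polynomial "x"

mean = poly_integral(poly_mul(x_poly, f), 0, 2)            # E(X)
deviation = [-mean, Fraction(1)]                           # x - E(X)
variance = poly_integral(poly_mul(poly_mul(deviation, deviation), f), 0, 2)
std_dev = math.sqrt(variance)

print(f"E(X) = {mean}")
print(f"V(X) = {variance}")
print(f"SD   = {std_dev:.2f}")
```

Using `Fraction` keeps the thirds and ninths exact, so the result can be compared term by term with the hand integration.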
Probability
Example: IQ
The following table gives the IQ levels of a group of students and the probability of each level.
IQ Level   P[X=x]
60         0.01
70         0.02
80         0.09
90         0.22
100        0.33
110        0.22
120        0.08
130        0.02
140        0.01
Probability
IQ Test - Example
IQ Level   P[X=x]   x P[X=x]   V(X)       y(x)     Probability   Cumulative Probability   z
60         0.01     0.6        15.8404    0.0004   0.004         0.004                    -2.65
70         0.02     1.4        17.7608    0.0025   0.025         0.029                    -1.9
80         0.09     7.2        35.2836    0.0100   0.100         0.129                    -1.13
90         0.22     19.8       21.1288    0.0228   0.228         0.357                    -0.37
100        0.33     33         0.0132     0.0297   0.298         0.654                    0.4
110        0.22     24.2       22.8888    0.0223   0.223         0.877                    1.16
120        0.08     9.6        32.6432    0.0096   0.096         0.973                    1.93
130        0.02     2.6        18.2408    0.0024   0.024         0.997                    2.65
140        0.01     1.4        16.1604    0.0003   0.003         1.000
Total               99.8       179.96     0.0999
E(X) = mean = 99.80
V(X) = 179.96
Std Deviation = 13.41
$$y(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
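The summary figures for the IQ table can be recomputed from the two input columns. This is a sketch under the assumption that the y(x) column is the normal density evaluated with the table's own mean and standard deviation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

iq = [60, 70, 80, 90, 100, 110, 120, 130, 140]
prob = [0.01, 0.02, 0.09, 0.22, 0.33, 0.22, 0.08, 0.02, 0.01]

mean = sum(x * p for x, p in zip(iq, prob))
variance = sum((x - mean) ** 2 * p for x, p in zip(iq, prob))
std_dev = math.sqrt(variance)

# Normal curve with the same mean and SD, evaluated at each IQ level.
fitted = NormalDist(mean, std_dev)
density = [fitted.pdf(x) for x in iq]

print(f"E(X) = {mean:.2f}")
print(f"V(X) = {variance:.2f}")
print(f"SD   = {std_dev:.2f}")
for x, p, y in zip(iq, prob, density):
    print(f"{x:>4}  P[X=x] = {p:.2f}  y(x) = {y:.4f}")
```

Comparing `prob` with the fitted density shows how closely the discrete IQ probabilities follow a normal curve.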
Probability
[Figure: P[X=x] (Y-axis, 0.00 to 0.35) against IQ (X-axis, 40 to 160)]
Probability
[Figure: P[X=x] (Y-axis, 0.00 to 0.35) against IQ (X-axis, 60 to 140)]
Probability
IQ
40 60 80 100 120 140 160
P[X