FPP 13-15 Probability 1. What statisticians hang their hat on Provides a formal framework from which...


  • Slide 1
  • FPP 13-15 Probability 1
  • Slide 2
  • What statisticians hang their hat on. Provides a formal framework from which uncertainty can be quantified. Why study probability in an intro stat course? It lays foundations for statistical inference. It trains your brain to think in a way that it is not hardwired to do. It's quite enjoyable and relaxing.
  • Slide 3
  • Types of probability: What exactly is probability? There are any number of notions of probability, indicating that probability isn't a thing but a concept. We could spend a semester philosophizing about probability; if you are interested, I can direct you to some books. A non-exhaustive list: Laplacian probability, hypothetical limiting relative frequency probability, nomic probability, fiducial probability, epistemic probability. In this class we will focus on two of these.
  • Slide 4
  • Terminology. Sample space: the set (collection) of all possible outcomes that can happen. Event: a single outcome or set of outcomes from a sample space. Probability model: a consistent assignment of a probability to each event in the sample space. Disjoint events: two events that have no outcomes in common and, thus, cannot both occur simultaneously. Venn diagrams help visualize the above.
  • Slide 5
  • Heads up: Mathematical notation will become a little more prevalent here. You'll need to put forth effort wrapping your brain around it.
  • Slide 6
  • Limiting relative frequency: Most folks call this the frequentist approach. 1. Operations: observation, measurement, or selection that can at least hypothetically be repeated an infinite number of times. 2. Sample space: set of possible outcomes of an operation. 3. Events: subsets of elements in the sample space. Elements of the sample space (basic outcomes) are equally likely. Calculation: 1. Let S denote the sample space, E ⊆ S denote an event, and |A| denote the size of any set A. 2. Pr(E) = |E|/|S|. Upshot: the percentage of times an event occurs in repeated realizations of a random process.
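The calculation Pr(E) = |E|/|S| and the "repeated realizations" reading can be compared side by side. The sketch below is only an illustration: the fair six-sided die, the event "roll an even number", and the function names are assumptions chosen for the example, not part of the slides.

```python
import random
from fractions import Fraction

def classical_probability(event, sample_space):
    """Pr(E) = |E| / |S|, valid when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def relative_frequency(event, sample_space, trials, seed=0):
    """Fraction of simulated repetitions in which the event occurs."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    outcomes = sorted(sample_space)
    hits = sum(rng.choice(outcomes) in event for _ in range(trials))
    return hits / trials

S = {1, 2, 3, 4, 5, 6}                 # assumed example: one roll of a fair die
E = {2, 4, 6}                          # event: the roll is even

exact = classical_probability(E, S)
approx = relative_frequency(E, S, trials=100_000)
print(exact, approx)
```

With many repetitions the simulated relative frequency should settle close to the exact ratio, which is the sense in which the frequency is "limiting".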
  • Slide 7
  • Epistemic probability: Often called subjective probability. This term is a bit loaded, as it can be argued that objective probability doesn't really exist. Here probability is the degree of belief in the likelihood of an event. Belief is updated or modified in the light of observed information.
  • Slide 8
  • Probability: Why consider two probabilities? Each allows a different approach to incorporating probability in an analysis. Each one leads to different types of inference statements. Is one preferable to the other? This really depends on who you ask. There have been (heated) discussions on the appropriateness of both.
  • Slide 9
  • Notation we will regularly use
  • Slide 10
  • Frequency probability: We focus first on how to use frequency probability in an analysis and will cover epistemic probability later. Simple motivating example: There are 3 red balls and 9 white balls in a hat. Pick one ball at random out of the hat. Once picked, the ball is not replaced. Then pick another ball at random out of the hat.
  • Slide 11
  • Shorthand for probability: Define R1 = pick a red ball on the 1st try. Define R2 = pick a red ball on the 2nd try. Define W1 = pick a white ball on the 1st try. Define W2 = pick a white ball on the 2nd try. The probability of picking a red ball on the 1st try is Pr(R1) = 3/12. The probability of picking two red balls in two picks without replacing the 1st ball is Pr(R1 and R2) = (3/12)(2/11) = 6/132.
  • Slide 12
  • Marginal and joint probability: The probability of a single event is called a marginal probability. Example: Pr(R1). The probability of the intersection of two events (both events happening) is called a joint probability. Example: Pr(R1 and R2).
  • Slide 13
  • Conditional probability: Say we pick a red ball on the 1st try. The chance we pick a red ball on the 2nd try equals 2/11. The probability that an event occurs given that another event occurs is called a conditional probability. Shorthand: Pr(R2|R1) = 2/11, the probability that R2 occurs given that R1 occurs.
  • Slide 14
  • Relating these probabilities: Pr(R1 and R2) = Pr(R1)Pr(R2|R1), that is, 6/132 = (3/12)(2/11). Joint prob. = marginal prob. times conditional prob. This is always true.
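One way to check the product rule on the hat example is to count outcomes directly. This is a minimal sketch; labelling the twelve balls by position and the variable names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

# 3 red and 9 white balls, as in the hat example
balls = ["R"] * 3 + ["W"] * 9

# Every ordered pair of distinct balls is an equally likely two-pick outcome
# when drawing without replacement: 12 * 11 = 132 pairs.
draws = list(permutations(range(len(balls)), 2))
both_red = sum(balls[i] == "R" and balls[j] == "R" for i, j in draws)

pr_joint_by_counting = Fraction(both_red, len(draws))

pr_r1 = Fraction(3, 12)            # marginal: red on the 1st try
pr_r2_given_r1 = Fraction(2, 11)   # conditional: red on the 2nd given red on the 1st
pr_joint_by_rule = pr_r1 * pr_r2_given_r1

print(pr_joint_by_counting, pr_joint_by_rule)
```

Both routes give 6/132, which `Fraction` prints in lowest terms as 1/22.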
  • Slide 15
  • Independent events: Replace the 1st ball before picking the 2nd. Then Pr(R1) = 3/12 and Pr(R2|R1) = Pr(R2) = 3/12. R1 and R2 are called independent events: the occurrence of R1 does not affect the probability of R2.
  • Slide 16
  • Independent events: When events are independent, calculating joint probabilities is fairly easy. Let events A, B, C, etc. be independent. Then Pr(A and B and C and etc.) = Pr(A)Pr(B)Pr(C)Pr(etc.). To get joint probabilities you can simply multiply the marginal probabilities. Why does this work?
  • Slide 17
  • Dependent events: Notice that when sampling without replacement, Pr(R2|R1) = 2/11 ≠ 3/12 = Pr(R2). When the conditional prob. is not equal to the marginal prob., the events are said to be dependent. The occurrence of R1 affects the probability of R2. Here R1 and R2 are dependent events.
  • Slide 18
  • Dependent events: When events are dependent, joint probabilities are harder to compute. Let A, B, C, etc. be dependent events. Then Pr(A and B and C and etc.) = Pr(A|B,C,etc.)Pr(B|C,etc.)Pr(C|etc.)Pr(etc.). To get joint probabilities, you multiply all the conditional probabilities.
  • Slide 19
  • Independence in sports: Baseball announcers sometimes say, "The batter has not gotten a base hit in the last four times he's batted. He's due for a hit now." What is this statement assuming?
  • Slide 20
  • "or" rule: This is an inclusive or. That is, A or B = A or B or both. Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B).
  • Slide 21
  • "or" rule: Pr(R1 or R2) = Pr(R1) + Pr(R2) - Pr(R1 and R2) = Pr(R1) + Pr(R2) - Pr(R2|R1)Pr(R1) = 3/12 + Pr(R2) - (2/11)(3/12). We will come back to Pr(R2) in a couple of slides.
  • Slide 22
  • "or" rule: If events are disjoint (i.e. they cannot happen simultaneously), then we can split "or" probabilities into sums of individual probabilities. These are also referred to as mutually exclusive events. Pr(W1 or R1) = Pr(W1) + Pr(R1) - Pr(W1 and R1) = Pr(W1) + Pr(R1) - 0.
  • Slide 23
  • Law of total probability: For any event B, P(A) = P(A and B) + P(A and not B). Example: P(Brown eyes) = P(Brown eyes and Male) + P(Brown eyes and Female). R2 occurs in two ways: 1. red picked 1st and red picked 2nd, OR 2. white picked 1st and red picked 2nd. So Pr(R2) = Pr(R2 and R1) + Pr(R2 and W1). Notice that W1 implies not R1 (if a white is drawn on the first draw then you can't get a red on the first draw).
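The law of total probability supplies the Pr(R2) that the "or" rule slide left open. The sketch below works it out for the hat example, drawing without replacement; the variable names are assumptions, and Pr(R2|W1) = 3/11 follows from 3 red balls remaining among 11 after a white is removed.

```python
from fractions import Fraction

pr_r1 = Fraction(3, 12)            # red on the 1st try
pr_w1 = Fraction(9, 12)            # white on the 1st try
pr_r2_given_r1 = Fraction(2, 11)   # 2 reds left among 11 balls
pr_r2_given_w1 = Fraction(3, 11)   # 3 reds left among 11 balls

# Law of total probability: Pr(R2) = Pr(R2 and R1) + Pr(R2 and W1)
pr_r2 = pr_r1 * pr_r2_given_r1 + pr_w1 * pr_r2_given_w1

# "or" rule: Pr(R1 or R2) = Pr(R1) + Pr(R2) - Pr(R1 and R2)
pr_r1_or_r2 = pr_r1 + pr_r2 - pr_r1 * pr_r2_given_r1

print(pr_r2, pr_r1_or_r2)
```

Pr(R2) comes out to 3/12, the same as Pr(R1), even though the draws are dependent; and Pr(R1 or R2) is 60/132 (5/11 in lowest terms).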
  • Slide 24
  • "or": Can we compute Pr(drawing at least one red ball)? Pr(R1 and W2 or W1 and R2 or R1 and R2) = Pr(R1 and W2) + Pr(W1 and R2) + Pr(R1 and R2) = (3/12)(9/11) + (9/12)(3/11) + (3/12)(2/11) = 60/132. Sometimes it is easier to compute the probability of complements of events: Pr(drawing at least one red ball) = 1 - Pr(no red balls) = 1 - Pr(W1 and W2) = 1 - (9/12)(8/11) = 60/132.
  • Slide 25
  • A common confusion: What's the difference between mutually exclusive and independent? When do I add and when do I multiply? Two events are mutually exclusive if the occurrence of one prevents the other from happening. Under mutual exclusivity, Pr(A or B) = Pr(A) + Pr(B). Two events are independent if the occurrence of one does not change the chances of the other. Under independence, Pr(A and B) = Pr(A)Pr(B) and Pr(A|B) = Pr(A).
  • Slide 26
  • Coin toss: A coin is tossed six times. Two possible sequences are Sequence 1: H T T H T H and Sequence 2: H H H H H H. Which of the following is correct? (a) Sequence 1 is more likely. (b) Sequence 2 is more likely. (c) Both sequences are equally likely.
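A quick enumeration can help with the coin toss question. The sketch below assumes a fair coin and independent tosses (the slide does not say so explicitly), so that all 2^6 ordered sequences are equally likely.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of six tosses; assumed equally likely (fair coin).
sequences = ["".join(s) for s in product("HT", repeat=6)]
pr_sequence = {s: Fraction(1, len(sequences)) for s in sequences}

print(pr_sequence["HTTHTH"], pr_sequence["HHHHHH"])

# Counting sequences by number of heads shows where intuition comes from:
# many sequences have three heads, but only one has six.
three_heads = sum(s.count("H") == 3 for s in sequences)
six_heads = sum(s.count("H") == 6 for s in sequences)
print(three_heads, six_heads)
```

Any one specific sequence has probability 1/64; "three heads in some order" is more likely than "six heads" only because it covers more sequences.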
  • Slide 27
  • Example: Box A has 30 red and 20 blue marbles. Box B has 3 red and 2 blue marbles. Which box, if either, offers the better chance of winning in each of the three scenarios below? 1. Pick one marble. You win if it is red. 2. Pick two marbles (without replacement). You win if at least one is red. 3. Pick three marbles (without replacement). You win if at least one is red.
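The three marble scenarios can be worked with the complement trick: Pr(at least one red) = 1 - Pr(all picks blue). The helper below is a hypothetical function written for this illustration, assuming each remaining marble is equally likely at every pick.

```python
from fractions import Fraction

def pr_at_least_one_red(red, blue, picks):
    """1 - Pr(every pick is blue), drawing without replacement."""
    total = red + blue
    pr_all_blue = Fraction(1)
    for k in range(picks):
        # k blue marbles are already gone; none left means the factor is 0
        pr_all_blue *= Fraction(max(blue - k, 0), total - k)
    return 1 - pr_all_blue

for picks in (1, 2, 3):
    box_a = pr_at_least_one_red(30, 20, picks)
    box_b = pr_at_least_one_red(3, 2, picks)
    print(picks, box_a, box_b)
```

Under these assumptions the boxes tie with one pick, and Box B is better with two or three picks, because removing a blue marble from a small box changes the mix far more than removing one from a large box.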
  • Slide 28
  • Example: One ticket will be drawn at random from each of the two boxes shown below. Find the chance that: the number drawn from the left box is larger than the one from the right; the number drawn from the left equals the one on the right. [Figure: two boxes of numbered tickets; transcribed digits: 4312321]
  • Slide 29
  • Example: Three cards are dealt from a standard 52-card deck. What is the chance that the first card is a King? What is the chance that the second card is a Queen? What is the chance that the third card is a Jack? What is the chance that the first card is a King and the second card is a Queen and the third a Jack? Five cards are dealt from a standard deck. What is the chance that the first four cards are aces and the fifth card a king?
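The three-card questions separate marginal from joint probabilities. A minimal sketch, assuming a well-shuffled deck so that every ordering is equally likely; the variable names are illustrative.

```python
from fractions import Fraction

# Marginal: with nothing known about the first card, the second card is a
# Queen by the law of total probability over "first card Queen / not Queen".
pr_second_queen = (Fraction(4, 52) * Fraction(3, 51)
                   + Fraction(48, 52) * Fraction(4, 51))

# Joint: King, then Queen, then Jack, multiplying conditional probabilities.
pr_king_queen_jack = Fraction(4, 52) * Fraction(4, 51) * Fraction(4, 50)

print(pr_second_queen, pr_king_queen_jack)
```

The marginal chance for any single position is 4/52; only the joint question requires the shrinking denominators 52, 51, 50.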
  • Slide 30
  • Example: A 10-sided die is rolled three times. What is the chance of getting at least one roll with a number bigger than seven?
  • Slide 31
  • Example: True or false? A fair 6-sided die is rolled three times. The chance of getting at least one ace equals 1/6 + 1/6 + 1/6 = 1/2. If a coin is tossed twice, the chance of getting at least one head is 50%.
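The "at least one" questions on this slide and the ten-sided die example can be checked with the complement rule for independent trials. The sketch assumes fair dice and a fair coin, with the ten faces numbered 1 to 10; the function name is made up for the illustration.

```python
from fractions import Fraction

def pr_at_least_once(p, n):
    """Chance an event with per-trial probability p occurs at least once
    in n independent trials: 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n

# Ten-sided die, three rolls, "bigger than seven" means 8, 9, or 10.
pr_big_roll = pr_at_least_once(Fraction(3, 10), 3)

# Six-sided die, three rolls, at least one ace (a one).
pr_ace = pr_at_least_once(Fraction(1, 6), 3)

# Coin tossed twice, at least one head.
pr_head = pr_at_least_once(Fraction(1, 2), 2)

print(pr_big_roll, pr_ace, pr_head)
```

Adding 1/6 three times overcounts rolls with more than one ace, which is why the complement gives 91/216 rather than 1/2; likewise the two-toss answer is 3/4, not 50%.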
  • Slide 32
  • Let's make a deal: Game show host Monty Hall presents 3 doors. Behind one door is a fabulous prize; behind the other two doors are a sweet pig and a goat. Monty knows what is behind each door. You pick a door. Monty opens a door with the goat or pig (but not the door you picked). Monty then asks if you want to switch. Should you switch?
  • Slide 33
  • Slide 34
  • Let's make a deal, revisited: In general, one can't answer the question "What is the probability of winning if I switch, given that I have been shown a goat behind door three?" One must be very explicit about the assumptions being made about Monty Hall's strategies. Will he ever reveal the door hiding the prize? Two solutions: when interested in the unconditional probability, you always want to switch; when interested in the conditional probability, you can do no better than switching, with the exact value depending on Monty's strategy.
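The unconditional case can be explored by simulation. The sketch below is one possible model, not the only one: it assumes the prize is placed uniformly at random, that Monty never opens the chosen door or the prize door, and that he picks uniformly when he has two doors to choose from. Different assumptions about his strategy would change the conditional answer.

```python
import random

def play(switch, rng):
    """Play one game; return True if the final pick hides the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed strategy: Monty opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)          # fixed seed for a repeatable run
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

Under these assumptions switching wins about two thirds of the time and staying about one third.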
  • Slide 35
  • Birthday problem: What is the chance that at least two people in your stats 101 lab section share the same birthday?
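The birthday question is another complement calculation: 1 - Pr(all birthdays distinct). The sketch assumes 365 equally likely birthdays and independence between people (no leap days, no twins); the function name and the section sizes tried are illustrative.

```python
def pr_shared_birthday(n_people, days=365):
    """Chance that at least two of n_people share a birthday,
    assuming equally likely, independent birthdays."""
    pr_all_distinct = 1.0
    for k in range(n_people):
        # the (k+1)-th person must avoid the k birthdays already taken
        pr_all_distinct *= max(days - k, 0) / days
    return 1 - pr_all_distinct

for n in (10, 23, 30, 44):
    print(n, round(pr_shared_birthday(n), 3))
```

Under these assumptions the chance passes one half at 23 people, far fewer than most people guess.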
  • Slide 36
  • Birthday problem, case study: There have been 44 U.S. Presidents. Common birth dates: Nov. 2 (Harding and Polk). Common death dates: July 4 (Adams, Jefferson, and Monroe), March 8 (Fillmore and Taft), Dec. 26 (Truman and Ford).
  • Slide 37
  • Pistols at dawn: Tom Cruise, Nicole Kidman, and Penelope Cruz have gotten into a disagreement about who has the best hair and decide to settle their dispute the only way hair disputes really can be settled: with a three-cornered pistol duel. Of the three, Tom is the worst shot, hitting his target only 30% of the time. Nicole is a markswoman; she never misses her target. Penelope spent a lot of time near the gun shop on Hillsborough Road, so she's had some practice and can hit targets 50% of the time. The rules of the duel are simple: they are to fire at targets of their choice in succession, and cyclically, in the order Tom, Nicole, Penelope, Tom, Nicole, Penelope, and so on until only one of them is left standing. On each turn, they get only one shot. If a combatant is hit, he or she no longer participates, either as a shooter or as a target. For example, one possible outcome of the duel is for Tom to shoot at Nicole and hit; then for Penelope to shoot at Tom and miss; then for Tom to shoot at Penelope and miss; then for Penelope to shoot at Tom and hit. Then, Penelope wins. Assume that each person is trying to maximize his or her chance of survival. For example, if Nicole has to choose between shooting at Tom and shooting at Penelope, Nicole will shoot at Penelope because Penelope is the more accurate shooter. Put yourself in Tom's shoes. You have three strategies to choose from: 1) shoot at Nicole with the intention of hitting her; 2) shoot at Penelope with the intention of hitting her; and 3) shoot at no one so that you have no chance of hitting anyone. Which of these three strategies maximizes Tom's probability of survival?
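One way to compare Tom's three opening strategies is to condition on whether his first shot hits. The sketch below rests on several assumptions beyond the slide: shots are independent, Nicole targets Penelope whenever both opponents are standing (as the slide states), and once only two duelists remain each simply shoots at the other. The variable names are illustrative.

```python
from fractions import Fraction

TOM, PENELOPE = Fraction(3, 10), Fraction(1, 2)   # hit rates; Nicole never misses

# Two-person duel, Tom vs Penelope, with Tom shooting first:
#   x = TOM + (1 - TOM) * (1 - PENELOPE) * x   (both miss, then start over)
tom_first_vs_penelope = TOM / (1 - (1 - TOM) * (1 - PENELOPE))
# If Penelope shoots first, Tom must survive her shot before his turn comes.
penelope_first_vs_tom = (1 - PENELOPE) * tom_first_vs_penelope

# If all three are standing after Tom's shot, Nicole eliminates Penelope,
# and Tom gets exactly one shot at Nicole before she shoots him.
all_three_after_toms_shot = TOM

survival = {
    # hit Nicole -> duel with Penelope, who shoots next; miss -> all three remain
    "shoot at Nicole": TOM * penelope_first_vs_tom + (1 - TOM) * all_three_after_toms_shot,
    # hit Penelope -> Nicole shoots Tom and never misses; miss -> all three remain
    "shoot at Penelope": TOM * 0 + (1 - TOM) * all_three_after_toms_shot,
    "shoot at no one": all_three_after_toms_shot,
}
best = max(survival, key=survival.get)
for strategy, pr in survival.items():
    print(strategy, pr, float(pr))
print("best:", best)
```

Under these assumptions, deliberately missing gives Tom the highest survival chance, because it guarantees that the two better shots deal with each other first.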
  • Slide 38
  • Slide 39
  • Sports playoffs: In playoffs in many sports, the first of two teams to win four games is the winner. Wins do not have to be consecutive. The series ends after one team wins four games. How likely is it that a series will last four games? Five games? Six games? Seven games?
  • Slide 40
  • Sports playoffs: For context, say the two teams are the Celtics and the Lakers in the 2009 NBA finals. Assume that the outcome of each game is independent of prior games (is this reasonable?). For all games, Pr(C) = 0.52 and Pr(L) = 0.48.
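The series-length question can be answered by counting: for a series to end in exactly n games, the eventual winner must win game n and exactly three of the first n - 1. The sketch below uses the slide's Pr(C) = 0.52 and its independence assumption; the function name is made up for the illustration.

```python
from math import comb

def pr_series_length(n_games, p, wins_needed=4):
    """Chance a first-to-wins_needed series lasts exactly n_games,
    with independent games and per-game win probability p for one team."""
    q = 1 - p
    # ways to place the winner's earlier wins among the first n_games - 1 games
    ways = comb(n_games - 1, wins_needed - 1)
    losses = n_games - wins_needed
    # either team can be the series winner
    return ways * (p ** wins_needed * q ** losses + q ** wins_needed * p ** losses)

lengths = {n: pr_series_length(n, 0.52) for n in range(4, 8)}
for n, pr in lengths.items():
    print(n, round(pr, 4))
```

The four probabilities sum to one, since a best-of-seven series must end in four to seven games.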
  • Slide 41
  • Bayes rule (not in book): Recall that P(B and A) = P(B)P(A|B) = P(A)P(B|A) = P(A and B). After some rearranging we get P(A|B) = P(B|A)P(A)/P(B).
  • Slide 42
  • Bayes rule: A common blood test for AIDS is the EIA. When AIDS antibodies are present, EIA reports AIDS 99.85% of the time. When AIDS antibodies are not present, EIA reports no AIDS 99.4% of the time. It is estimated that about 900,000 out of the 280,000,000 people in the U.S. have AIDS. A person takes the EIA test and it reports that he has AIDS. How likely is he to have AIDS?
  • Slide 43
  • Slide 44
  • EIA example: A convenient way to find Pr(A|P) is to make a table of a hypothetical population and fill in the cells of the table using the given information. Assume 10,000 people in the population. Number with AIDS: (900,000/280,000,000) × 10,000 = 32.143. Number without AIDS: 10,000 - 32.143 = 9,967.857.
  • Slide 45
  • Hypothetical table for EIA
                     AIDS      Not AIDS    Total
    Positive
    Not Positive
    Total
  • Slide 46
  • EIA test: Final table
                     AIDS      Not AIDS    Total
    Positive
    Not Positive
    Total            32.143    9,967.857   10,000
  • Slide 47
  • EIA test: Final table
                     AIDS      Not AIDS    Total
    Positive         32.095
    Not Positive     0.048
    Total            32.143    9,967.857   10,000
  • Slide 48
  • EIA test: Final table
                     AIDS      Not AIDS    Total
    Positive         32.095    59.807      91.902
    Not Positive     0.048     9,908.050   9,908.098
    Total            32.143    9,967.857   10,000
  • Slide 49
  • EIA test: Final answer. Pr(A and P) = 32.095/10,000. Pr(P) = 91.902/10,000. Pr(A|P) = Pr(A and P)/Pr(P) = 32.095/91.902 = 0.3492. There is a 34.92% chance the person has AIDS given that he tested positive on the EIA.
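The hypothetical-population table can be rebuilt in a few lines. This sketch uses only the figures given on the slides (prevalence 900,000 out of 280,000,000, 99.85% and 99.4% accuracy, a population of 10,000); the dictionary layout and variable names are assumptions made for the illustration.

```python
population = 10_000
prevalence = 900_000 / 280_000_000
sensitivity = 0.9985     # Pr(positive | AIDS)
specificity = 0.994      # Pr(not positive | not AIDS)

with_aids = population * prevalence
without_aids = population - with_aids

# Cells of the 2x2 table, keyed by (test result, true status)
table = {
    ("Positive", "AIDS"): with_aids * sensitivity,
    ("Positive", "Not AIDS"): without_aids * (1 - specificity),
    ("Not Positive", "AIDS"): with_aids * (1 - sensitivity),
    ("Not Positive", "Not AIDS"): without_aids * specificity,
}

positives = table[("Positive", "AIDS")] + table[("Positive", "Not AIDS")]
pr_aids_given_positive = table[("Positive", "AIDS")] / positives

for cell, count in table.items():
    print(cell, round(count, 3))
print(round(pr_aids_given_positive, 4))
```

The low answer comes from the "Not AIDS" column: even a 0.6% false positive rate applied to nearly the whole population produces more positives than the small group that actually has the disease.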
  • Slide 50
  • Sensitivity to initial marginal probability: What if 1% of people have AIDS? Then Pr(A|P) = 0.627. What if 10% of people have AIDS? Then Pr(A|P) = 0.949. The probability is very sensitive to the incidence rate of AIDS.
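The sensitivity to the starting prevalence can be seen by wrapping Bayes rule in a function and varying one input. The function name and default arguments are assumptions for this sketch; the test accuracies are the ones given for the EIA.

```python
def pr_condition_given_positive(prevalence, sensitivity=0.9985, specificity=0.994):
    """Bayes rule: Pr(A|P) = Pr(P|A)Pr(A) / Pr(P), with Pr(P) expanded
    by the law of total probability."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)

for prevalence in (900_000 / 280_000_000, 0.01, 0.10):
    posterior = pr_condition_given_positive(prevalence)
    print(f"prevalence {prevalence:.4f} -> Pr(A|P) = {posterior:.3f}")
```

Holding the test fixed, moving the prevalence from about 0.3% to 10% moves the answer from roughly one in three to roughly nineteen in twenty.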
  • Slide 51
  • Binomial distribution: Consider a sequence of n binary-outcome trials (e.g. success/fail). Let p be the probability of success on each trial. What is the probability of getting x successes out of n trials? This is a binomial probability if p remains constant for all trials and n (the number of trials) is fixed. Pr(x) = choose(n,x) p^x (1-p)^(n-x), where choose(n,x) = n!/(x!(n-x)!).
  • Slide 52
  • Binomial distribution: Roll a standard die ten times and count the number of sixes. Consider the outcome of getting a six as a success and anything else as a failure. The number of trials is fixed at 10. The probability of success (1/6) is the same for all trials. Let X denote the number of sixes. Pr(X=1) = choose(10,1)(1/6)^1 (5/6)^(10-1) = 0.32. Pr(X=5) = choose(10,5)(1/6)^5 (5/6)^(10-5) = 0.013. The binomial distribution is useful for computing probabilities under the sampling with replacement scheme.
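The binomial formula translates directly into code. A minimal sketch using Python's `math.comb` for choose(n,x); the function name `binomial_pmf` is an assumption for the illustration, and the trials are taken to be independent.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) = choose(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 1 / 6   # ten rolls of a fair die, success = rolling a six

print(round(binomial_pmf(1, n, p), 3))
print(round(binomial_pmf(5, n, p), 3))

# The probabilities over all possible counts 0..n should add up to one.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(total)
```

The two printed values agree with the slide's 0.32 and 0.013 after rounding.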