Ps 1 Solutions


Solutions ECON 131: Econometrics and Data Analysis I Problem Set #1

Department of Economics Econ 131a

Yale University Econometrics & Data Analysis I

Fall 2010

Answers to Assignment #1

1. (Airline Safety). We will solve this problem using probability tables. The attributes for the probability table are “cause of emergency landing” and “occurrence of fatalities”, and the table is as follows:

                                 Cause of emergency landing
Occurrence of      Equipment               Not Equipment
fatalities         Malfunction             Malfunction          Total
Fatalities         0.1 * 0.15 = 0.015      0.085                0.1
No fatalities      0.585                   0.315                0.9
Total              0.6                     0.4

Please note that the third bullet-point statement in the problem is a conditional probability, which we use to calculate the joint probability 0.015 in the upper-left cell.

(a) The quotient rule states that

Prob(one or more fatalities | not equipment malfunction)
  = Prob(one or more fatalities and not equipment malfunction) / Prob(not equipment malfunction)
  = 0.085 / 0.4 = 0.2125

(b) Here we must check whether the joint probability for “equipment malfunction” and “one or more fatalities” is equal to the product of the marginal probabilities. From the table, Prob(equipment malfunction and one or more fatalities) = 0.015, while the product of the marginal probabilities is Prob(equipment malfunction) * Prob(one or more fatalities) = 0.6 * 0.1 = 0.06. So, Prob(equipment malfunction and one or more fatalities) ≠ Prob(equipment malfunction) * Prob(one or more fatalities), and hence the two events are not independent.


(c) From the above table: Prob(equipment malfunction and one or more fatalities) = 0.015 (it is the probability in the upper-left cell).

(d) From the table: Prob(not equipment malfunction and no fatalities) = 0.315 (it is the probability in the lower-right cell).
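The table lookups in parts (a) through (d) can be checked with a short Python sketch. The dictionary keys and the helper name `marginal` are illustrative choices, not part of the problem statement; the four joint probabilities are the ones in the table above.

```python
# Joint probabilities from the airline-safety table, keyed by (cause, outcome).
# The key labels are hypothetical names chosen for this sketch.
joint = {
    ("equipment", "fatalities"): 0.015,
    ("not_equipment", "fatalities"): 0.085,
    ("equipment", "no_fatalities"): 0.585,
    ("not_equipment", "no_fatalities"): 0.315,
}


def marginal(table, index, value):
    """Sum the joint probabilities whose key has `value` at position `index`."""
    return sum(p for key, p in table.items() if key[index] == value)


p_equipment = marginal(joint, 0, "equipment")
p_not_equipment = marginal(joint, 0, "not_equipment")
p_fatalities = marginal(joint, 1, "fatalities")

# (a) Quotient rule: joint probability divided by the marginal of the condition.
p_fatal_given_not_equipment = joint[("not_equipment", "fatalities")] / p_not_equipment

# (b) Independence would require joint == product of the marginals.
product_of_marginals = p_equipment * p_fatalities
independent = abs(joint[("equipment", "fatalities")] - product_of_marginals) < 1e-9

print(round(p_fatal_given_not_equipment, 4), round(product_of_marginals, 4), independent)
```

The marginals are recomputed from the joint cells rather than typed in, so the sketch also confirms that the row and column totals in the table are consistent.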

2. (Business Week Poll) In this problem we are given a conditional probability and a marginal probability and we are asked to find a joint probability. Specifically, we are given that Prob(female executive has been harassed) = 0.27 and that Prob(female executive reports harassment | female executive has been harassed) = 0.25. We are asked to find Prob(female executive reports harassment and female executive has been harassed). By the product rule:

Prob(female executive reports harassment and female executive has been harassed)
  = Prob(female executive has been harassed) × Prob(female executive reports harassment | female executive has been harassed)
  = 0.27 × 0.25 = 0.0675.

Remark. This is called the reporting bias problem; if people who have been the victims of a crime tend not to report it, the reported incidence (in this case about 6.75%) may seriously understate the true incidence (in this case 27%).

3. (Student Background). The best way to approach this problem is using the following probability tree:

First chance node            Second chance node       Event                                                Probability
GSB Student (0.18)           Engineering (0.37)       Biz School Student with Engineering Background       0.0666
                             Social Science (0.24)    Biz School Student with Social Science Background    0.0432
                             Other (0.39)             Biz School Student with Other Background             0.0702
Other Grad Student (0.82)                             Other Graduate Student                               0.82

There are two types of uncertainties for a randomly picked grad student: (a) whether or not he or she is a business school student, and (b) for a business school student, whether he or she is from an engineering background, a social science background, or other. So in this tree we represent the first form of uncertainty using the first chance node, and the second using the second chance node, and the probabilities placed on the tree are extracted from the statement of the problem.

(a) Here we are asked to determine the Prob(social science background and biz school student). From the tree it follows that the event social science background and biz school student corresponds to the middle upper branch of the tree. Therefore, Prob(social science background and biz school student) = 0.0432.

(b) Here we want to find the probability that a grad student is neither a biz school student with a social science background nor a biz school student with an engineering background. From the tree, it follows that this event corresponds to the branches other graduate student (lowest branch) and other biz school student (second lowest branch). So, Prob(neither biz school with social science background nor biz school with engineering background) = 0.0702 + 0.82 = 0.8902.

Note. For the math jocks among you, here is a solution using the five probability rules directly: Let A1 denote engineering background, A2 denote social science background, and let B be the event biz school student. We are given Prob(A1|B) = 0.37, Prob(A2|B) = 0.24, Prob(B) = 0.18.

(a) Prob(A2 and B) = Prob(A2|B) * Prob(B) = 0.24 * 0.18 = 0.0432.

(b) Prob(neither (A1 and B) nor (A2 and B)) = 1 - Prob[(A1 and B) or (A2 and B)] = 1 - [Prob(A1 and B) + Prob(A2 and B)] = 1 - 0.0666 - 0.0432 = 0.8902. The addition rule applies here because the two events are mutually exclusive.

Note that Prob(A1 and B) is calculated in the same manner that we used in (a) to calculate Prob(A2 and B): Prob(A1 and B) = Prob(A1|B) * Prob(B) = 0.37 * 0.18 = 0.0666.
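As a cross-check, the tree and the complement-rule argument can be sketched in a few lines of Python. The variable names and leaf labels are illustrative; the inputs are the three probabilities given in the problem, plus the 0.39 for "other" background implied by the tree.

```python
# Probabilities taken from the problem statement and the tree.
p_biz = 0.18
background_given_biz = {"engineering": 0.37, "social_science": 0.24, "other": 0.39}

# Product rule along each branch gives the leaf (joint) probabilities.
leaves = {f"biz_{name}": p_biz * p for name, p in background_given_biz.items()}
leaves["other_grad"] = 1 - p_biz

# The leaves of a probability tree must sum to one.
total = sum(leaves.values())

# (a) Joint probability of "social science background and biz school student".
p_a = leaves["biz_social_science"]

# (b) Two routes to the same answer: add the remaining leaves, or use the complement.
p_b_direct = leaves["biz_other"] + leaves["other_grad"]
p_b_complement = 1 - leaves["biz_engineering"] - leaves["biz_social_science"]

print(round(p_a, 4), round(p_b_direct, 4), round(p_b_complement, 4))
```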

4. We will solve this problem using probability trees.

First note that there are two uncertainties here: (a) whether the customer buys a 2-door or a 4-door; and (b) whether the customer buys on credit or pays cash up front. We are also given the marginal probability of buying a 2-door, and the conditional probability of buying on credit given that the consumer is buying a 2-door. Moreover we are given the conditional probability of paying up front given that the consumer is buying a sedan. So we create a tree with two sets of nodes: the first node represents whether the consumer buys a 2-door or a 4-door, and the second set of nodes represents whether the consumer buys on credit or pays cash up front. The car selection node comes first because we know the marginal probabilities.


The tree looks like this:

Car type         Payment             Event                  Probability
2-door (0.4)     Credit (0.75)       2-door, on credit      0.3
                 Up Front (0.25)     2-door, up front       0.1
4-door (0.6)     Credit (0.2)        4-door, on credit      0.12
                 Up Front (0.8)      4-door, up front       0.48

(a) From the probability tree: Prob(buy on credit) = 0.3 + 0.12 = 0.42.

(b) Here we are asked to find Prob(buy 2-door | pay up front). By the quotient rule:

Prob(buy 2-door | pay up front)
  = Prob(buy 2-door and pay up front) / Prob(pay up front)
  = 0.1 / 0.58 = 0.172,

where Prob(pay up front) = 0.1 + 0.48 = 0.58 is the sum of the two up-front branches of the tree.

Remark: You can also do this problem using probability tables. We used a probability tree for continuity with problem 5.
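The tree calculation for this problem can be written as a small Python sketch. The dictionary layout and labels are assumptions made for illustration; the probabilities are those on the tree.

```python
# Marginal probabilities of the car type and conditional probabilities of credit.
p_car = {"2-door": 0.4, "4-door": 0.6}
p_credit_given_car = {"2-door": 0.75, "4-door": 0.2}

# Product rule: build the joint probability of each (car, payment) leaf.
joint = {}
for car, p in p_car.items():
    joint[(car, "credit")] = p * p_credit_given_car[car]
    joint[(car, "up_front")] = p * (1 - p_credit_given_car[car])

# (a) Addition rule over the two credit leaves.
p_credit = sum(p for (car, payment), p in joint.items() if payment == "credit")

# (b) Quotient rule: condition on paying up front.
p_up_front = sum(p for (car, payment), p in joint.items() if payment == "up_front")
p_2door_given_up_front = joint[("2-door", "up_front")] / p_up_front

print(round(p_credit, 2), round(p_up_front, 2), round(p_2door_given_up_front, 3))
```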


5. In order to solve this problem we expand the probability tree from question 4 to include one more set of branches, which represent whether the consumers who buy on credit will default or not. The tree looks as follows:

Car type        Payment            Default status       Event                             Probability
2-door (0.4)    Credit (0.75)      Default (0.6)        2-door, on credit, default        0.18
                                   No default (0.4)     2-door, on credit, no default     0.12
                Up front (0.25)                         2-door, up front                  0.1
4-door (0.6)    Credit (0.2)       Default (0.3)        4-door, on credit, default        0.036
                                   No default (0.7)     4-door, on credit, no default     0.084
                Up front (0.8)                          4-door, up front                  0.48

(a) From the probability tree: Prob(buy on credit and default) = 0.18 + 0.036 = 0.216.

(b) The tricky part here is to recognize that the question asks for the probability Prob(default | buy on credit). Now using the quotient rule and the probability tree we get: Prob(default | buy on credit) = Prob(default and buy on credit) / Prob(buy on credit) = 0.216 / 0.42 = 0.514; the marginal probability 0.42 is from question 4(a).
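The three-level tree can be checked with a minimal Python sketch. The tuple layout is an illustrative assumption; each row holds the probabilities along one credit path of the tree above.

```python
# Each row: (car type, P(car), P(credit | car), P(default | car, credit)).
branches = [
    ("2-door", 0.4, 0.75, 0.6),
    ("4-door", 0.6, 0.20, 0.3),
]

# Marginal probability of buying on credit (same value as in question 4(a)).
p_credit = sum(p_car * p_cr for _, p_car, p_cr, _ in branches)

# (a) Multiply along each path and add the two "default" leaves.
p_credit_and_default = sum(p_car * p_cr * p_def for _, p_car, p_cr, p_def in branches)

# (b) Quotient rule.
p_default_given_credit = p_credit_and_default / p_credit

print(round(p_credit_and_default, 3), round(p_default_given_credit, 3))
```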

6. (HIV Testing). The statement of the problem gives the following probabilities:

Prob(Infected Donor) = 0.0001,

Prob(Negative Test| Infected Donor) = 0.01,

Prob(Positive Test| Not-Infected Donor) = 0.001,

and it asks to determine the Prob(Positive Test) and the Prob(Not-Infected | Positive Test). This can be done using a probability table with two attributes: infection status (infected or not-infected) and test outcomes (positive or negative). The table is as follows:

                                    Infection Status
Test Outcomes     Infected                     Not-Infected                   Total
Positive          0.0001*0.99 = 0.000099       0.9999*0.001 = 0.0009999       0.0010989
Negative          0.000001                     0.9989001                      0.9989011
Total             0.0001                       0.9999                         1


(a) It follows from this Table that Prob(Positive Test) = 0.0010989.

(b) To determine the Prob(Not-Infected | Positive Test), we use the quotient rule and the results in the above table:

Prob(Not-Infected | Positive Test)
  = Prob(Not-Infected and Positive Test) / Prob(Positive Test)
  = 0.0009999 / 0.0010989 = 0.90991;

this answer is correct to 5 decimal places.

Note: This problem was motivated by a paper on HIV testing (Operations Research, Vol. 44, pp. 543-569). The numbers reflect the tests used in the US in the late 1990s, and they were provided by Centers for Disease Control. Newer HIV tests have higher accuracy.
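The same calculation can be sketched as a small Python function, which makes it easy to see how the answer changes with the prevalence or the error rates. The function name and parameter names are hypothetical; the inputs are the three probabilities stated in the problem.

```python
def positive_test_summary(prevalence, false_negative_rate, false_positive_rate):
    """Return (Prob(Positive Test), Prob(Not-Infected | Positive Test))."""
    # Joint probabilities in the "Positive" row of the table.
    p_pos_and_infected = prevalence * (1 - false_negative_rate)
    p_pos_and_not_infected = (1 - prevalence) * false_positive_rate
    # Marginal probability of a positive test (row total).
    p_positive = p_pos_and_infected + p_pos_and_not_infected
    # Quotient rule.
    return p_positive, p_pos_and_not_infected / p_positive


p_positive, p_not_infected_given_positive = positive_test_summary(
    prevalence=0.0001, false_negative_rate=0.01, false_positive_rate=0.001
)

print(round(p_positive, 7), round(p_not_infected_given_positive, 5))
```

Because infection is so rare among donors, the false positives from the large not-infected group outnumber the true positives, which is why roughly 91% of positive tests come from donors who are not infected.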


Problem 7:

Department of Economics Econ 131a

Yale University Econometrics & Data Analysis I

Fall 2010

Answers to Assignment #2

1. This problem is very similar to the problems solved in the Variance Associates case and it involves a direct application of the mechanics of random variables.

(a) We can compute the expected value for the total number of orders as

E(Total Orders) = E(Orders from Client 1 +Orders from Client 2)

= E(Orders from Client 1) + E(Orders from Client 2).

Now,

E(Orders from Client 1) = 0.1(0) + 0.2(50) + 0.3(200) + 0.4(300) = 190,

and

E(Orders from Client 2) = 0.2(0) + 0.2(100) + 0.6(150) = 110.

Therefore

E(Total Orders) = 190 + 110 = 300.

We cannot compute the standard deviation because we don’t know whether the two orders are independent.

(b) Since we are told that the orders from the two clients are independent, we can now compute the variance (and standard deviation) of the total number of orders using the formula:

Var(Total Orders) = Var(Orders from Client 1) + Var(Orders from Client 2)

Using the formula from class,

Var(Orders from Client 1)
  = 0.1(0 - 190)^2 + 0.2(50 - 190)^2 + 0.3(200 - 190)^2 + 0.4(300 - 190)^2
  = 12400,

and

Var(Orders from Client 2)
  = 0.2(0 - 110)^2 + 0.2(100 - 110)^2 + 0.6(150 - 110)^2
  = 3400.


Therefore,

Var(Total Orders) = 12400 + 3400 = 15800,

and

SD(Total Orders) = sqrt(15800) = 125.70.

The expected value is the same as before; we don’t need independence to argue E(X + Y) = E(X) + E(Y).

In this part, we must also find Prob(Total Orders ≤ 100). Because we assume that the orders from the two clients are independent, we can compute the joint probabilities by multiplying the marginal probabilities. This gives the following probability table:

                             Client 2
                      0        100      150      Total
Client 1       0      0.02     0.02     0.06     0.10
              50      0.04     0.04     0.12     0.20
             200      0.06     0.06     0.18     0.30
             300      0.08     0.08     0.24     0.40
           Total      0.2      0.2      0.6

By the addition rule, summing the cells whose combined orders do not exceed 100 (namely (0, 0), (0, 100) and (50, 0)):

Prob(Total Orders ≤ 100) = 0.02 + 0.02 + 0.04 = 0.08
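The calculations for the independent case can be reproduced with a short Python sketch. The helper names `mean` and `variance` are illustrative; the two distributions are the ones used in the expected-value formulas above.

```python
from itertools import product
from math import sqrt

# Marginal distributions of the two clients' orders: {order size: probability}.
client1 = {0: 0.1, 50: 0.2, 200: 0.3, 300: 0.4}
client2 = {0: 0.2, 100: 0.2, 150: 0.6}


def mean(dist):
    """Expected value of a discrete random variable."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Probability-weighted squared deviation from the mean."""
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())


expected_total = mean(client1) + mean(client2)      # holds with or without independence
var_total = variance(client1) + variance(client2)   # valid only under independence
sd_total = sqrt(var_total)

# Under independence each joint probability is the product of the marginals.
p_total_at_most_100 = sum(
    p1 * p2
    for (x1, p1), (x2, p2) in product(client1.items(), client2.items())
    if x1 + x2 <= 100
)

print(round(expected_total), round(var_total), round(sd_total, 2), round(p_total_at_most_100, 2))
```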

(c) Using the formula for the covariance we can write

Cov(Orders from Client 1, Orders from Client 2)
  = 0.05(0 - 190)(0 - 110) + 0.04(0 - 190)(100 - 110) + 0.01(0 - 190)(150 - 110)
  + 0.06(50 - 190)(0 - 110) + 0.06(50 - 190)(100 - 110) + 0.08(50 - 190)(150 - 110)
  + 0.06(200 - 190)(0 - 110) + 0.09(200 - 190)(100 - 110) + 0.15(200 - 190)(150 - 110)
  + 0.03(300 - 190)(0 - 110) + 0.01(300 - 190)(100 - 110) + 0.36(300 - 190)(150 - 110)
  = 2800.

The expected value of the Total Orders is 300 (as before), and

Var(Total Orders)
  = Var(Orders from Client 1 + Orders from Client 2)
  = Var(Orders from Client 1) + Var(Orders from Client 2) + 2 Cov(Orders from Client 1, Orders from Client 2)
  = 12400 + 3400 + 2(2800) = 21,400,

and the standard deviation is sqrt(21,400) = 146.29.

Finally, Prob(Total Orders ≤ 200) = 0.05 + 0.04 + 0.01 + 0.06 + 0.06 + 0.08 + 0.06 = 0.36.
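For the dependent case, the covariance and the variance of the total can be computed directly from the joint distribution. The sketch below is illustrative: the joint table is read off the twelve probabilities that appear in the covariance formula above, and the variable names are assumptions.

```python
from math import sqrt

# Joint distribution {(client 1 orders, client 2 orders): probability},
# taken from the coefficients in the covariance formula.
joint = {
    (0, 0): 0.05, (0, 100): 0.04, (0, 150): 0.01,
    (50, 0): 0.06, (50, 100): 0.06, (50, 150): 0.08,
    (200, 0): 0.06, (200, 100): 0.09, (200, 150): 0.15,
    (300, 0): 0.03, (300, 100): 0.01, (300, 150): 0.36,
}

mean1 = sum(p * x for (x, y), p in joint.items())
mean2 = sum(p * y for (x, y), p in joint.items())

var1 = sum(p * (x - mean1) ** 2 for (x, y), p in joint.items())
var2 = sum(p * (y - mean2) ** 2 for (x, y), p in joint.items())
cov = sum(p * (x - mean1) * (y - mean2) for (x, y), p in joint.items())

# Variance of a sum of two dependent random variables.
var_total = var1 + var2 + 2 * cov
sd_total = sqrt(var_total)

# Addition rule over the cells whose combined orders do not exceed 200.
p_total_at_most_200 = sum(p for (x, y), p in joint.items() if x + y <= 200)

print(round(cov), round(var_total), round(sd_total, 2), round(p_total_at_most_200, 2))
```

The marginals implied by this joint table match the marginals used in parts (a) and (b), which is why the means and individual variances are unchanged and only the covariance term is new.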

2. This problem is a simple application of binomial distributions, expected values and variances.

Let X denote the total number of callers that either got a busy signal or hung up. Then X is a binomial random variable with n = 2000 and p = 0.032; because X is Binomial, it follows that E[X] = np and Var(X) = np(1 - p).

(a) The expected number of callers who got a busy signal or hung up is: E(X) = np = 2000(0.032) = 64. Similarly, the standard deviation is

sqrt(Var(X)) = sqrt(2000 * 0.032 * (1 - 0.032))
  = sqrt(61.952)
  = 7.871

(b) Let W be the total goodwill loss. Then W=$25X, and the expected total loss is

E[W ] = E[25X] = 25E[X] = 25(64) = $1600.

Also,

SD(W ) = SD(25X) = 25SD(X)

= 25(7.871)

= $196.78
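A minimal Python sketch of the binomial mean and standard deviation, and of how they scale when X is multiplied by a constant, is given below. The variable names are illustrative; n, p and the $25 loss per caller come from the solution above.

```python
from math import sqrt

n, p = 2000, 0.032          # number of callers and probability of a lost call
loss_per_caller = 25        # goodwill loss in dollars per lost caller

# Binomial mean and standard deviation.
mean_x = n * p
sd_x = sqrt(n * p * (1 - p))

# W = 25 X, so both the mean and the standard deviation scale by 25.
mean_w = loss_per_caller * mean_x
sd_w = loss_per_caller * sd_x

print(round(mean_x, 3), round(sd_x, 3), round(mean_w, 2), round(sd_w, 2))
```

Carrying full precision gives SD(W) of about $196.77; the $196.78 above comes from rounding SD(X) to 7.871 before multiplying by 25.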

3. (Acceptance Sampling). We want to find the Prob(Acceptance) under both rules when the defective rate is 20%. To do that we will use the binomial distribution.

Rule 1: Let X denote the number of defectives in a random sample of 10. Then X is a binomial random variable with n = 10 and p = 0.20. Therefore,

Prob(Acceptance under rule 1 | 20% defective rate)
  = Prob(X = 0 | n = 10, p = 0.20)
  = [10! / (0! 10!)] (0.20)^0 (0.80)^(10 - 0)
  = 0.1074.
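The Rule 1 probability can be checked with a short Python sketch of the binomial probability formula. The function name `binomial_pmf` is a hypothetical helper; the sketch only reproduces the Rule 1 case, where acceptance corresponds to X = 0 as in the calculation above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Rule 1: accept when a sample of 10 contains no defectives (20% defective rate).
p_accept_rule1 = binomial_pmf(0, 10, 0.20)

print(round(p_accept_rule1, 4))
```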

Rule 2: Let Y denote the number of defectives in a random sample of 20. Then Y


8. (Calculating Portfolio Variances.) In this problem, we are asked to find the expected value and variance of a portfolio of stocks. To do that we will repeatedly use the formula for the expected value of a sum of random variables, and the formula for the variance of the sum of random variables.

(a) To compute the expected value, we express the value of the portfolio in terms of the value of the Apple shares, the value of the Google shares, and the value of the Facebook shares. Mathematically, let P_AAPL be the price of one share of Apple one month from now, P_GOOG be the price of one share of Google one month from now, and P_FB be the price of one share of Facebook one month from now. Then, the total value of Marge’s portfolio in one month is:

Total Value of Portfolio = 200 × P_AAPL + 100 × P_GOOG + 50 × P_FB.

This now implies that:

E(Total Value of Portfolio)
  = E(200 × P_AAPL + 100 × P_GOOG + 50 × P_FB)
  = 200 × E(P_AAPL) + 100 × E(P_GOOG) + 50 × E(P_FB)
  = 200 × $120 + 100 × $60 + 50 × $60
  = $24,000 + $6,000 + $3,000 = $33,000.

(b) To calculate the standard deviation, we will first need to calculate the variance in the total value of the portfolio. In the process we will use the following fact (which generalizes Fact 2 in the Sums of Random Variables note): the variance of a sum of three random variables equals the sum of the three variances plus 2 times the covariance of each pair of random variables. The trick in this problem was to derive this formula by applying the formula for the variance of a sum of two random variables twice (in succession). So, to get the variance of Marge’s portfolio, we have:

Var(Value of Portfolio)
  = Var(200 × P_AAPL + 100 × P_GOOG + 50 × P_FB)
  = Var(200 × P_AAPL) + Var(100 × P_GOOG) + Var(50 × P_FB)
    + 2 × Cov(200 P_AAPL, 100 P_GOOG) + 2 × Cov(200 P_AAPL, 50 P_FB) + 2 × Cov(100 P_GOOG, 50 P_FB)
  = 40000 Var(P_AAPL) + 10000 Var(P_GOOG) + 2500 Var(P_FB)
    + 40000 Cov(P_AAPL, P_GOOG) + 20000 Cov(P_AAPL, P_FB) + 10000 Cov(P_GOOG, P_FB)
  = 40000 × 100 + 10000 × 64 + 2500 × 64 + 40000 × (−36) + 20000 × 24 + 10000 × 19
  = 4,030,000 (dollars)^2

and the standard deviation = sqrt(4,030,000) = 2007.49 dollars.

Remark. You should go over this problem carefully and make sure that you understand how and why the constants pull out from the expectations, variances and covariances. Also, it is useful to remember that variances can never be negative, while covariances can.
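The portfolio calculation can be sketched in Python as a double sum over all pairs of holdings, which is equivalent to the expanded formula above. The dictionary names and the helper `cov` are illustrative; the share counts, expected prices, variances and covariances are the values substituted in parts (a) and (b).

```python
from math import sqrt

# Marge's holdings and the price moments used in the solution.
shares = {"AAPL": 200, "GOOG": 100, "FB": 50}
expected_price = {"AAPL": 120, "GOOG": 60, "FB": 60}
price_variance = {"AAPL": 100, "GOOG": 64, "FB": 64}
price_covariance = {("AAPL", "GOOG"): -36, ("AAPL", "FB"): 24, ("GOOG", "FB"): 19}


def cov(a, b):
    """Covariance of two prices; the covariance of a price with itself is its variance."""
    if a == b:
        return price_variance[a]
    return price_covariance[(a, b)] if (a, b) in price_covariance else price_covariance[(b, a)]


# Linearity of expectation.
expected_value = sum(shares[s] * expected_price[s] for s in shares)

# Summing over all ordered pairs counts each covariance twice, giving the 2 x Cov terms.
portfolio_variance = sum(shares[a] * shares[b] * cov(a, b) for a in shares for b in shares)
portfolio_sd = sqrt(portfolio_variance)

print(expected_value, portfolio_variance, round(portfolio_sd, 2))
```

Changing the `shares` dictionary is enough to evaluate a different mix of the same three stocks, since the constants enter the variance as products of share counts.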


(c) For Marge’s sister, Maggie, the calculations are pretty much the same. It should come as no surprise that the expected value of Maggie’s portfolio, one month from now, is exactly the same as that of Marge’s portfolio: $33,000. The variance of Maggie’s portfolio, one month from now, is a little higher, 5,230,000 (dollars)^2, and the corresponding standard deviation is $2,286.92.

(d) There isn’t very much difference between the performances of the two portfolios. However, Marge’s portfolio gives the same expected return with a slightly lower standard deviation; that is, she exposes herself to less risk. In that sense, her portfolio performs better than Maggie’s. You should be able to see why this is so.