2009-11-14_BR_Assigment
Academic year 2009 / 2010
Business research
Prof. Herbert Hamers
Assignment
Submission deadline: 22.10.2009
Study Group BO9D:
Student : 09027165
Business research Assignment, Student 09027165, November 2009
Table of Contents
1) Exercise 1: normal distribution
   a) Descriptive statistics
   b) Probability that the farm will be profitable next summer
   c) Probability that the farm will not be profitable next summer
   d) Mean with fertilizer
2) Exercise 2: descriptive statistics and data patterns
   a) Graphical way presentation
      i) Weight
      ii) Education level
      iii) Wage
      iv) Food expenses
   b) Sample mean, standard deviation and median
   c) 2s, 4s and 6s intervals
   d) 95% confidence interval
   e) Scatter plots
   f) Equation of the regression line
      i) Food and housing income/family income
      ii) Clothing and recreation/family income
   g) Interpretation
      i) Questions a to c: distributions
      ii) Question d: 95% confidence interval
      iii) Questions e and f: linear relationship
3) Exercise 3: Portfolio expectation, standard deviation and co-variance
   a) Individual shares
      i) H5N1
      ii) Thunderbirds
      iii) Correlation
   b) All on H5N1
   c) All on Thunderbirds
   d) Half-Half scenario
   e) Risk lover portfolio
   f) Risk averse scenario
4) Exercise 4: Markowitz portfolio theorem
5) Exercise 5: Confidence interval and tests
   a) 95% confidence interval for μ
   b) Opinion on the announced μ
   c) Formal hypothesis test
   d) Minimal size of the sample
6) Exercise 6: Linear regression and confidence intervals
   a) Excesses of the company and the world stock exchange
   b) Scatter plot
   c) Regression
   d) Reliability of the slope
   e) Constant term
   f) Performance
   g) Percentage explained by the model
   h) Prediction interval
1) Exercise 1: normal distribution
Basis: file crop.xls
a) Descriptive statistics
The usage of the data analysis function of excel provides the following results:
Descriptive statistics (Crop)
Mean                 1502,941176
Standard Error       15,06687062
Median               1500
Mode                 1500
Standard Deviation   87,8541978
Sample Variance      7718,360071
Kurtosis             0,16548812
Skewness             -0,379772489
Range                350
Minimum              1300
Maximum              1650
Sum                  51100
Count                34

Table 1 (descriptive statistics) and Figure 1 (histogram)
The crop in kilograms produced per hectare in the sample seems symmetrically distributed around the mean (about 1503). The mode and the median, both exactly 1500, seem to confirm this hypothesis. The histogram is not clearly bell-shaped, so we can only apply Chebyshev's rule:
At least 75% of the measurements fall into the interval (1326;1678); actually 94,11%.
At least 89% of the measurements fall into the interval (1238;1766); actually 100%.
That the crop yield is normally distributed cannot be concluded from the graphical analysis alone; however, this is the hypothesis taken for the next questions.
[Histogram "Crop yield distribution"; horizontal axis: bin (1350 to 1650); vertical axis: frequency; bar counts: 2, 4, 5, 10, 6, 4, 3]
b) Probability that the farm will be profitable next summer
Hypothesis given: the crop yield is normally distributed with μ = 1500 and σ = 200.
We are looking for the probability P(X>1600) = 1 - P(X≤1600) = 1-P(X
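As a cross-check on a value read from a normal table, the probability can also be computed directly. The following is a minimal sketch in Python (the assignment itself uses Excel); `NormalDist` comes from the Python standard library and the variable names are illustrative only.

```python
from statistics import NormalDist

# Hypothesis from the text: crop yield ~ Normal(mu = 1500, sigma = 200)
crop = NormalDist(mu=1500, sigma=200)

# P(X > 1600) = 1 - P(X <= 1600)
p_profitable = 1 - crop.cdf(1600)
print(round(p_profitable, 4))
```

The result is close to 0,31, in line with the area of approximately 0,3 quoted in d).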
d) Mean with fertilizer
To determine the mean such that the probability that the farm will be profitable next summer equals 0,4, we need to understand the effect of the fertiliser.
The effect of the fertiliser is to increase the mean leaving the standard deviation unchanged. This effect is shown in the Figure 4 hereunder.
[Figure 4: normal curves for a mean that increases and for a mean that decreases]
With the help of fertiliser we try to move the mean to the right, so that the area detailed in Figure 2 (now approximately 0,3) becomes 0,4.
[Figure 5: normal curve with the threshold 1600 marked; we are looking for a mean μ1 such that the area to the right of 1600 is 0,4]
First we will look for the point x1 so that P(X>x1)=0,4
[Figure 6: normal curve without fertiliser; we search for an x1 such that the area to the right of x1 is 0,4]
To say that P(X>x1)=0,4 is the same as saying P(X1600)= 1-P(X fertiliser
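The two-step reasoning described here (first find x1 without fertiliser, then shift the mean) can be sketched as follows. This is a hedged illustration in Python, not the author's Excel calculation: the end of the derivation is cut off in this transcript, so the numbers below are simply what the described method yields under the stated hypothesis (σ = 200 unchanged, threshold 1600).

```python
from statistics import NormalDist

SIGMA = 200        # standard deviation, assumed unchanged by the fertiliser
THRESHOLD = 1600   # yield above which the farm is profitable
TARGET = 0.4       # desired P(X > 1600)

# Step 1: without fertiliser (mean 1500), find x1 with P(X > x1) = 0.4,
# i.e. the 60th percentile of the current distribution.
x1 = NormalDist(mu=1500, sigma=SIGMA).inv_cdf(1 - TARGET)

# Step 2: shift the mean by the distance between x1 and the threshold,
# so that the 60th percentile lands exactly on 1600.
mu_fertiliser = 1500 + (THRESHOLD - x1)

# Check: with the new mean, P(X > 1600) should be 0.4.
check = 1 - NormalDist(mu=mu_fertiliser, sigma=SIGMA).cdf(THRESHOLD)
print(round(x1, 2), round(mu_fertiliser, 2), round(check, 4))
```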
2) Exercise 2: descriptive statistics and data patterns
My home address is 13 5F Luruper Haupstrasse, 22547 Hamburg. Come and visit me sometimes! I consequently chose the file family5.xls.
a) Graphical way presentation
i) Weight
The variable to present is measured on an interval scale, so a histogram has been chosen.
The number of categories (bins) has been calculated using the formula k = 1 + 3,3 log(n), where n is the number of measurements.
k=1+3,3*log(300)= 9,1745
For the weight and the others we rounded this value to k=10 categories.
The width of the classes is then calculated using the formula: w = (largest observation - smallest observation) / number of classes.
w=(109,2-40,4)/10= 6,88
For the width we rounded the value to w=7 .
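The class construction above can be reproduced with a few lines of Python. This is only a sketch: starting the first class at the smallest observation rounded down (40) is an assumption read off the class table, and the variable names are illustrative.

```python
import math

n = 300                           # number of observations
smallest, largest = 40.4, 109.2   # extreme weights from the data

k_raw = 1 + 3.3 * math.log10(n)   # formula used in the text
k = math.ceil(k_raw)              # rounded up to 10 classes

w_raw = (largest - smallest) / k  # 6.88
w = math.ceil(w_raw)              # rounded up to 7

# Assumption: the first class starts at the smallest observation rounded down.
start = math.floor(smallest)
bounds = [(start + i * w, start + (i + 1) * w) for i in range(k)]
print(round(k_raw, 4), k, w, bounds)
```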
We consequently considered the following classes:

n     300
k     10
w     7
min   40,4
max   109,2

Class   Lower bound   Upper bound
1       40            47
2       47            54
3       54            61
4       61            68
5       68            75
6       75            82
7       82            89
8       89            96
9       96            103
10      103           110

Table 2
Using excel histogram function we get the following result.
[Figure 7: histogram "Weight (in kg)"; horizontal axis: upper bound of the interval (width 7 kg), from 47 to 110; vertical axis: frequency; bar counts: 4, 8, 30, 46, 63, 64, 49, 26, 8, 2]
ii) Education level
For this variable the number of possible values is limited (nominal variable), so the previous reasoning on the classes was set aside and all possible categories were displayed. In this case a pie chart also seems a suitable way to present the distribution of the variable.
[Figure 8: bar chart "EDU"; horizontal axis: level of education (1 to 5); vertical axis: frequency]
[Figure 9: pie chart "Level of education"]

Values shown in both figures:
Level   Count   Share
1       69      23%
2       65      22%
3       111     36%
4       41      14%
5       14      5%
iii) Wage
Classes considered: see explanation in 2)a)i).

n     300
k     10
w     2,7
min   0
max   26,39

Class   Lower bound   Upper bound
1       0             2,7
2       2,7           5,4
3       5,4           8,1
4       8,1           10,8
5       10,8          13,5
6       13,5          16,2
7       16,2          18,9
8       18,9          21,6
9       21,6          24,3
10      24,3          27

Table 3

[Figure 10: histogram "Wage (hourly rate in €)"; horizontal axis: upper bound of the intervals (width 2,7 €), from 2,7 to 27; vertical axis: frequency; bar counts: 69, 64, 74, 48, 28, 7, 5, 1, 2, 2]

iv) Food expenses

Classes considered: see explanation in 2)a)i).

n     300
k     10
w     1,8
min   3,05
max   20,88

Class   Lower bound   Upper bound
1       3             4,8
2       4,8           6,6
3       6,6           8,4
4       8,4           10,2
5       10,2          12
6       12            13,8
7       13,8          15,6
8       15,6          17,4
9       17,4          19,2
10      19,2          21

Table 4

[Histogram "Food expenses (in 1000 €)"; horizontal axis: upper bound of the interval (width 1800 €), from 4,8 to 21; vertical axis: frequency; bar counts: 35, 62, 73, 58, 29, 16, 13, 5, 5, 4]
b) Sample mean, standard deviation and median
The data analysis/descriptive statistics function of Excel is used repeatedly and summarized in this table, using only the values requested.

                     WEIGHT   WAGE   FOODEXP
Mean                 74,59    6,24   8,49
Standard deviation   11,96    4,83   3,48
Median               74,85    5,95   7,85

Table 5
c) 2s, 4s and 6s intervals
The table is computed from the one presented in 2)b), with x̄ being the sample mean and s the standard deviation. The share of data actually in each interval is computed using a formula based on the Excel functions COUNT and COUNTIF.

                               WEIGHT    WAGE      FOODEXP
x̄ - s                          62,62     1,41      5,01
x̄ + s                          86,55     11,08     11,97
Empirical rule                 68,00%    68,00%    68,00%
Actual data in this interval   68,33%    65,00%    71,00%

x̄ - 2s                         50,66     -3,43     1,52
x̄ + 2s                         98,51     15,91     15,45
Empirical rule                 95,00%    95,00%    95,00%
Chebyshev                      75,00%    75,00%    75,00%
Actual data in this interval   95,67%    96,33%    95,00%

x̄ - 3s                         38,70     -8,26     -1,96
x̄ + 3s                         110,48    20,75     18,93
Empirical rule                 99,70%    99,70%    99,70%
Chebyshev                      88,89%    88,89%    88,89%
Actual data in this interval   100,00%   98,67%    98,67%

Table 6

Note: For the interval (x̄ - s; x̄ + s) Chebyshev is not presented because it is applicable only from the (x̄ - 2s; x̄ + 2s) interval onwards.
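The comparison made in Table 6 (actual share inside x̄ ± k·s against Chebyshev's lower bound) can be sketched in Python as an alternative to the COUNT/COUNTIF formula. The sample below is hypothetical and is not the family5.xls data; the function names are illustrative.

```python
from statistics import mean, stdev

def share_within(data, k):
    """Share of observations strictly inside (mean - k*s ; mean + k*s)."""
    m, s = mean(data), stdev(data)   # stdev is the sample standard deviation
    inside = sum(1 for x in data if m - k * s < x < m + k * s)
    return inside / len(data)

def chebyshev_bound(k):
    """Minimum share guaranteed by Chebyshev's rule (informative for k > 1)."""
    return 1 - 1 / k**2

# Hypothetical sample of weights, for illustration only
sample = [62, 70, 71, 74, 75, 75, 76, 79, 81, 95]
for k in (2, 3):
    print(k, share_within(sample, k), round(chebyshev_bound(k), 4))
```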
d) 95% confidence interval
The data analysis/descriptive statistics function of Excel is used repeatedly on the 3 variables considered. We then have the mean and the standard deviation of each sample, which we need to compute our confidence intervals. We use the Student distribution and the standard deviation of the sample.
The value of the Student distribution t(299; 0,025) is needed to compute the interval. It is calculated in Excel with the formula TINV(0,05;299). Note the value 0,05: Excel uses the two-tailed calculation.
The boundaries of the confidence interval are then computed using the formula

$\left( \bar{x} - t_{299;\,0{,}025}\,\frac{s}{\sqrt{n}} \;;\; \bar{x} + t_{299;\,0{,}025}\,\frac{s}{\sqrt{n}} \right)$

Result:

                             FINC    TOTEXP1   TOTEXP2
Mean (x̄)                     30,43   19,71     5,09
Standard deviation (s)       13,94   8,63      2,35
t(299; 0,025)                1,97    1,97      1,97
n                            300     300       300
Lower bound 95% interval     28,84   18,72     4,82
Upper bound 95% interval     32,01   20,69     5,36

Table 7
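The interval formula can be checked with a short Python sketch. The Python standard library has no Student quantile function, so the critical value is typed in by hand here: 1,968 is assumed as the rounded result of TINV(0,05;299), and the function name is illustrative.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Assumption: rounded value of Excel's TINV(0,05;299), entered manually
T_299 = 1.968

# FINC column of Table 7 (rounded inputs, so the last digit may differ)
lb, ub = t_interval(30.43, 13.94, 300, T_299)
print(round(lb, 2), round(ub, 2))
```

Because the inputs are the rounded values from the table, the bounds can differ from Table 7 in the second decimal.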
e) Scatter plots
The scatter diagrams are made with the basic chart wizard of Excel, asking for the display of the equation of the regression line and of the R² coefficient. Note that R² is not the correlation coefficient but its square. We consequently used the Excel function CORREL to compute the coefficient of correlation. Both scatter diagrams show a clear positive linear relation, hence the coefficient of correlation is positive and $r = +\sqrt{R^2}$. We compute in Table 8 the square of r_xy to check our result.
[Figure 11: scatter plot "Total expenses for food and housing on income"; horizontal axis: family income (0 to 90); vertical axis: total vital expenses (0 to 60); trend line y = 0,6067x + 1,2469, R² = 0,9602]

[Figure 12: scatter plot "Total expenses for clothing and recreation on income"; horizontal axis: family income (0 to 90); vertical axis: total non-vital expenses (0 to 16); trend line y = 0,1677x - 0,013, R² = 0,9905]

Coefficient of correlation:
                               r_xy          r_xy²
Correl vital exp/income        0,979879344   0,960163529
Correl non-vital exp/income    0,995243075   0,990508779

Table 8
f) Equation of the regression line
Note: Excel has done all the work, and the equations are given in Figure 11 and Figure 12 above.
However, we provide here an alternative calculation as a check. There are normally some assumptions we should check before applying a linear regression. From the scatter diagrams we can only believe that homoscedasticity is respected; the other 3 assumptions are not checked and should be checked if we want to draw real conclusions.
We use the function data analysis/regression, which gives us b0 and b1 such that $\hat{y} = b_0 + b_1 x$.
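For readers without Excel, the least-squares coefficients b0 and b1 and the correlation coefficient can be computed by hand. The sketch below is in Python with hypothetical income/expense figures (not the family5.xls data); `fit_line` is an illustrative helper, not an Excel function.

```python
import math

def fit_line(xs, ys):
    """Return (b0, b1, r) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                   # slope
    b0 = y_bar - b1 * x_bar          # intercept
    r = sxy / math.sqrt(sxx * syy)   # coefficient of correlation
    return b0, b1, r

# Hypothetical data, for illustration only
income = [10, 20, 30, 40, 50]
expenses = [7.5, 13.1, 19.6, 25.4, 31.7]
b0, b1, r = fit_line(income, expenses)
print(round(b0, 3), round(b1, 3), round(r ** 2, 4))   # r squared is R²
```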
i) Food and housing income/family income
SUMMARY OUTPUT

Regression Statistics
Multiple R          0,979879344
R Square            0,960163529
Adjusted R Square   0,96002985
Standard Error      1,725643349
Observations        300

ANOVA
             df
Regression   1
Residual     298
Total        299

             Coefficients
Intercept    1,246850796
FINC         0,606667955

Table 9

The equation in Figure 11 is confirmed: $\hat{y} = 1{,}246850796 + 0{,}606667955\,x$
ii) Clothing and recreation/family income
SUMMARY OUTPUT

Regression Statistics
Multiple R          0,995243075
R Square            0,990508779
Adjusted R Square   0,99047693
Standard Error      0,229240545
Observations        300

ANOVA
             df
Regression   1
Residual     298
Total        299

                  Coefficients
Intercept (b0)    -0,01295752
FINC (b1)         0,167697813

Table 10

The equation in Figure 12 is confirmed: $\hat{y} = -0{,}01295752 + 0{,}167697813\,x$
g) Interpretation
i) Questions a to c: distributions
When considering the graphs produced in question a, we note that only the variable Weight has a bell-shaped distribution; the others have mostly skewed distributions. Consequently it is no surprise that this variable respects the empirical rule in the table presented in question c. For the 3 other variables only Chebyshev's rule is applicable, as they are not bell-shaped.

ii) Question d: 95% confidence interval

There is not much to conclude from this table, except that the population mean of each of these variables lies, with 95% confidence, between the lower bound and the upper bound of the interval presented. For example, I can say with 95% confidence that the average net family income is between 28840 and 32010 Euros.
iii) Questions e and f: linear relationship
We can conclude that there are strong linear relationships between the variables FINC and TOTEXP1 and between FINC and TOTEXP2. The strength of these relationships is shown by the scatter plots, in which the points are closely aligned and concentrated around the trend line. This is also confirmed by the coefficients of determination, which show that respectively 96% and 99% of the variability of TOTEXP1 and TOTEXP2 can be explained by FINC and the linear model.

Note: as explained in f), we have not fully demonstrated the validity of our model, as we have not checked the 4 assumptions necessary for concluding that the model is applicable. The conclusion above is therefore true only under the condition that the 4 assumptions are verified.
3) Exercise 3: Portfolio expectation, standard deviation and co-variance

a) Individual shares

At first we are interested in the probability distribution of each share, regardless of the value of the other one.
To obtain these probability distributions we sum each row (giving P(H)) and each column (giving P(T)) of the joint table, which yields the probability associated with each possible value.

P(H,T)    T=1,9   T=2    T=2,1   P(H)
H=0       0       0,1    0       0,1
H=1       0,15    0,2    0,15    0,5
H=1,1     0       0      0,1     0,1
H=1,3     0,2     0      0,1     0,3
P(T)      0,35    0,3    0,35

Table 11

These are the values we use as the basis for our calculations.
i) H5N1
The table prepared in Excel below applies the following formulas:

$E(X) = \mu = \sum_{\text{all } x} x\,P(X = x)$, $V(X) = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2\,P(X = x)$ and $\sigma = \sqrt{V(X)}$

H     P(H)   H*P(H)   H-E(H)   (H-E(H))²   (H-E(H))²*P(H)
0     0,1    0        -1       1           0,1
1     0,5    0,5      0        0           0
1,1   0,1    0,11     0,1      0,01        0,001
1,3   0,3    0,39     0,3      0,09        0,027

E(H) = 1; V(H) = 0,128; σ_H = 0,357771

Table 12
ii) Thunderbirds

The same approach is applied a second time for T.

T     P(T)   T*P(T)   T-E(T)   (T-E(T))²   (T-E(T))²*P(T)
1,9   0,35   0,665    -0,1     0,01        0,0035
2     0,3    0,6      0        0           0
2,1   0,35   0,735    0,1      0,01        0,0035

E(T) = 2; V(T) = 0,007; σ_T = 0,083666

Table 13
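Tables 12 and 13 can be reproduced from the joint distribution of Table 11 with a short Python sketch. The helper names `marginal` and `moments` are illustrative; the probabilities are copied from Table 11.

```python
import math

# Joint probabilities P(H = h, T = t) copied from Table 11
joint = {
    (0.0, 1.9): 0.00, (0.0, 2.0): 0.10, (0.0, 2.1): 0.00,
    (1.0, 1.9): 0.15, (1.0, 2.0): 0.20, (1.0, 2.1): 0.15,
    (1.1, 1.9): 0.00, (1.1, 2.0): 0.00, (1.1, 2.1): 0.10,
    (1.3, 1.9): 0.20, (1.3, 2.0): 0.00, (1.3, 2.1): 0.10,
}

def marginal(joint_probs, index):
    """Sum the joint table over the other variable (index 0 = H, 1 = T)."""
    out = {}
    for key, p in joint_probs.items():
        out[key[index]] = out.get(key[index], 0.0) + p
    return out

def moments(dist):
    """Return (expectation, variance, standard deviation) of a distribution."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var, math.sqrt(var)

e_h, v_h, sd_h = moments(marginal(joint, 0))
e_t, v_t, sd_t = moments(marginal(joint, 1))
print(round(e_h, 4), round(v_h, 4), round(sd_h, 6))
print(round(e_t, 4), round(v_t, 4), round(sd_t, 6))
```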
iii) Correlation
We first calculate the covariance using the formula:

$\sigma_{xy} = \mathrm{cov}(X, Y) = \sum_{a}\sum_{b} (a - \mu_x)(b - \mu_y)\,P(X = a, Y = b)$

The calculation was done in Excel and is not presented, as the detail would be hard to follow.

$\sigma_{xy} = -0{,}002$

The coefficient of correlation is then

$\rho = \frac{\sigma_{xy}}{\sigma_x\,\sigma_y} = -0{,}06682$
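Since the Excel detail is not shown, here is a hedged Python sketch of the same covariance and correlation calculation, using the joint probabilities of Table 11. The layout of the lists is an illustrative choice.

```python
import math

hs = [0.0, 1.0, 1.1, 1.3]          # possible values of H (rows of Table 11)
ts = [1.9, 2.0, 2.1]               # possible values of T (columns of Table 11)
probs = [[0.00, 0.10, 0.00],
         [0.15, 0.20, 0.15],
         [0.00, 0.00, 0.10],
         [0.20, 0.00, 0.10]]

# Flatten the table into (h, t, probability) triples
cells = [(h, t, probs[i][j]) for i, h in enumerate(hs) for j, t in enumerate(ts)]

mu_h = sum(h * p for h, t, p in cells)
mu_t = sum(t * p for h, t, p in cells)
sd_h = math.sqrt(sum((h - mu_h) ** 2 * p for h, t, p in cells))
sd_t = math.sqrt(sum((t - mu_t) ** 2 * p for h, t, p in cells))

# Covariance: sum over all cells of (h - mu_h)(t - mu_t) P(H = h, T = t)
cov = sum((h - mu_h) * (t - mu_t) * p for h, t, p in cells)
rho = cov / (sd_h * sd_t)
print(round(cov, 6), round(rho, 5))
```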
b) All on H5N1
Instead of building a separate spreadsheet for each of the exercises below, I decided to build a few formulas in Excel using one parameter, which I call w: the weight of H in the portfolio considered.

If the investor has 1000 Euros to invest, the portfolio is then represented by:
M = w*1000*H + (1-w)*1000/2*T = w*1000*H + (1-w)*500*T

Note that the share price of T, 2 Euros, is what gives the 500 in front of T.

We then use the basic formulas for the expectation and the variance of the portfolio.

$E(M) = w \cdot 1000 \cdot E(H) + (1-w) \cdot 500 \cdot E(T)$

$V(M) = w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 \cdot w \cdot (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T)$

and $\sigma_M = \sqrt{V(M)}$

All parameters were calculated in the previous exercise. The formulas are entered in Excel, and we can then solve the current question and the next two by simply changing w.
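The one-parameter approach described here can be mirrored in a small Python function instead of Excel formulas. This is only a sketch: the function name `portfolio` is illustrative, and the inputs are the values computed in a) (E(H) = 1, V(H) = 0,128, E(T) = 2, V(T) = 0,007, COV(H,T) = -0,002).

```python
import math

E_H, V_H = 1.0, 0.128     # expectation and variance of H
E_T, V_T = 2.0, 0.007     # expectation and variance of T
COV_HT = -0.002           # covariance of H and T

def portfolio(w):
    """Return (E(M), V(M), sigma_M) for M = w*1000*H + (1-w)*500*T."""
    a = w * 1000          # number of H shares
    b = (1 - w) * 500     # number of T shares
    e_m = a * E_H + b * E_T
    v_m = a**2 * V_H + b**2 * V_T + 2 * a * b * COV_HT
    return e_m, v_m, math.sqrt(v_m)

for w in (1, 0, 0.5):     # all on H5N1, all on Thunderbirds, half-half
    print(w, portfolio(w))
```

Changing w in the loop plays the same role as changing the parameter cell in the spreadsheet.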
Let's come back to the case where we invest everything in H5N1, which means w=1.

Excel delivers the following results:

H  w       1
T  (1-w)   0
E(M)       1000
V(M)       128000
σ_M        357,7708764

Table 14
c) All on Thunderbirds
The same calculation is applied as in the previous question, using w=0.

H  w       0
T  (1-w)   1
E(M)       1000
V(M)       1750
σ_M        41,83300133

Table 15
d) Half-Half scenario
The same calculation is applied as in the previous question, using w=1/2.

H  w       0,5
T  (1-w)   0,5
E(M)       1000
V(M)       31937,5
σ_M        178,71066

Table 16
e) Risk lover portfolio
Let's consider again the formula in b) and replace E(H) and E(T) by their actual values.

$E(M) = w \cdot 1000 \cdot E(H) + (1-w) \cdot 500 \cdot E(T) = w \cdot 1000 + (1-w) \cdot 500 \cdot 2 = 1000$

We discover that the mean of the mix does not depend on w; consequently, whichever scenario is chosen, the expectation will be the same.

In reality the investor will certainly take the risk into account, since the expectation does not discriminate between the portfolios; this question is addressed in the next paragraph.
[Plot of h(w): horizontal axis w from -0,3 to 1,3; vertical axis from 0 to 250000]
f) Risk averse scenario.
We need to find the minimum of the function:

$f(w) = \sigma_M = \sqrt{w^2 \cdot 1000^2 \cdot V(H) + (1-w)^2 \cdot 500^2 \cdot V(T) + 2 \cdot w \cdot (1-w) \cdot 1000 \cdot 500 \cdot COV(H,T)}$

We then replace the values we know from a):

$f(w) = \sqrt{128000\,w^2 + 1750\,(1-w)^2 - 2000\,w\,(1-w)}$

$f(w) = \sqrt{128000\,w^2 + 1750\,(1 - 2w + w^2) - 2000\,(w - w^2)}$

$f(w) = \sqrt{(128000 + 1750 + 2000)\,w^2 - (3500 + 2000)\,w + 1750}$

$f(w) = \sqrt{131750\,w^2 - 5500\,w + 1750}$

Let's consider what is under the square root, which we will call h(w).

$h(w) = 131750\,w^2 - 5500\,w + 1750$

The function h is always positive on the range considered, [0;1]; we checked this only graphically, in Figure 13.

Notes: This demonstration was not really necessary, as we are dealing with a variance, which by definition is positive, but I wanted to take no mathematical risk.

The graph also shows a minimum between 0 and 0,1.

Figure 13

Now that we have shown that h(w)>0 whatever the value of w, we see that minimizing the functions h and f is the same thing. As the square root function is such that if x1
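The location of the minimum seen on the graph can be checked numerically. The sketch below, in Python, uses the vertex of the parabola h(w) = 131750 w² - 5500 w + 1750; the numerical result is our own computation from that formula, since the end of this section is cut off in the transcript and the author's reported value is not available.

```python
import math

A, B, C = 131750, -5500, 1750     # coefficients of h(w) = A*w^2 + B*w + C

def h(w):
    """Variance of the portfolio as a function of the weight w of H."""
    return A * w**2 + B * w + C

# A parabola with A > 0 reaches its minimum at the vertex w = -B / (2A)
w_star = -B / (2 * A)
sigma_min = math.sqrt(h(w_star))  # minimum standard deviation of the portfolio
print(round(w_star, 6), round(sigma_min, 4))
```

The vertex lies between 0 and 0,1, consistent with what the graph shows.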
4) Exercise 4: Markowitz portfolio theorem
This exercise uses the same formulas as the previous one.

If we call p the weight of the asset X, then the weight of the asset Y is (1-p), which gives for M = pX + (1-p)Y:

$E(M) = p \cdot E(X) + (1-p) \cdot E(Y)$

$V(M) = p^2 \cdot V(X) + (1-p)^2 \cdot V(Y) + 2 \cdot p \cdot (1-p) \cdot COV(X,Y)$

and $\sigma_M = \sqrt{V(M)}$

We first look for the answer to question b, and then present all results together graphically.

Here we assume that V(M) is always positive, so minimizing V and minimizing σ is the same thing.

We consider the function

$f(p) = V(M) = p^2 \cdot V(X) + (1-p)^2 \cdot V(Y) + 2 \cdot p \cdot (1-p) \cdot COV(X,Y)$

$f(p) = p^2 \cdot \left(V(X) + V(Y) - 2 \cdot COV(X,Y)\right) + p \cdot \left(-2V(Y) + 2 \cdot COV(X,Y)\right) + V(Y)$

We look for the minimum point of this function, which we find by differentiation.

$f'(p) = \frac{df}{dp} = 2 \cdot p \cdot \left(V(X) + V(Y) - 2 \cdot COV(X,Y)\right) + \left(-2V(Y) + 2 \cdot COV(X,Y)\right) = 0$

$p = \frac{-\left(-2V(Y) + 2 \cdot COV(X,Y)\right)}{2 \cdot \left(V(X) + V(Y) - 2 \cdot COV(X,Y)\right)} = 0{,}395712$

If we then put p back into the first equations we find:
E(M) = 0,221871
σ_M = 0,0849272
This is the minimum variance point.
Verification: we checked a few values around this supposed minimum.

p (weight of X)   1-p        E(M)         σ(M)
0,395709          0,604291   0,22187127   0,084927178076
0,39571           0,60429    0,2218713    0,084927178074
0,395711          0,604289   0,22187133   0,084927178072
0,395712          0,604288   0,22187136   0,084927178071
0,395713          0,604287   0,22187139   0,084927178072
0,395714          0,604286   0,22187142   0,084927178073
0,395715          0,604285   0,22187145   0,084927178075

Table 17
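The closed-form minimum can be reproduced in Python. The inputs below are assumptions, because the exercise data are not restated in this report: E(X) = 0,24, σ_X = 0,19, E(Y) = 0,21 and σ_Y = 0,14 are read from the end points (p = 1 and p = 0) of the results table in this exercise, and the correlation of -0,44 is inferred from the tabulated values rather than quoted from the assignment.

```python
import math

# Assumed inputs (see lead-in): read or inferred from the tables of this exercise
E_X, SD_X = 0.24, 0.19
E_Y, SD_Y = 0.21, 0.14
RHO = -0.44                       # inferred correlation, not stated in the text

V_X, V_Y = SD_X**2, SD_Y**2
COV = RHO * SD_X * SD_Y

def mix(p):
    """Return (E(M), sigma_M) for M = p*X + (1-p)*Y."""
    e_m = p * E_X + (1 - p) * E_Y
    v_m = p**2 * V_X + (1 - p)**2 * V_Y + 2 * p * (1 - p) * COV
    return e_m, math.sqrt(v_m)

# Setting the derivative of V(M) to zero gives the minimum variance weight
p_star = (2 * V_Y - 2 * COV) / (2 * (V_X + V_Y - 2 * COV))
e_min, sd_min = mix(p_star)
print(round(p_star, 6), round(e_min, 6), round(sd_min, 7))
```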
The curve is obtained using Excel and the formulas mentioned above. We then get Table 18, with several values of the parameter p, which is depicted with a scatter diagram in Figure 14.

p (weight of X)   1-p    E(M)     σ(M)
0                 1      0,21     0,14
0,05              0,95   0,2115   0,1291022
0,1               0,9    0,213    0,1188709
0,15              0,85   0,2145   0,1094931
0,2               0,8    0,216    0,1012063
0,25              0,75   0,2175   0,0942987
0,3               0,7    0,219    0,0890916
0,35              0,65   0,2205   0,0858949
0,4               0,6    0,222    0,0849357
0,45              0,55   0,2235   0,0862889
0,5               0,5    0,225    0,0898499
0,55              0,45   0,2265   0,0953717
0,6               0,4    0,228    0,1025382
0,65              0,35   0,2295   0,1110312
0,7               0,3    0,231    0,1205708
0,75              0,25   0,2325   0,1309284
0,8               0,2    0,234    0,1419251
0,85              0,15   0,2355   0,1534234
0,9               0,1    0,237    0,1653187
0,95              0,05   0,2385   0,1775313
1                 0      0,24     0,19

Table 18
[Figure 14: "Efficient curve of the Mix"; horizontal axis: σ(M), from 0 to 0,2; vertical axis: E(M), from 0,195 to 0,245. Annotations: market line y = 0,2793549x + 0,2 (approximation only); efficiency curve; tangent at minimum sigma, x = 0,0849272; minimum variance point x = 0,0849272, y = 0,221871, at the intersection of the tangent and the curve, which is where the efficiency curve starts]

Figure 14
5) Exercise 5: Confidence interval and tests
a) 95% confidence interval for μ

Hypothesis: throughout this exercise we trust the value σ = 6 given by the company for the standard deviation.

We use the following formulas to draw the 95% interval around x̄:

$lb = \bar{x} - z_{0{,}025}\,\frac{\sigma}{\sqrt{n}}$

$ub = \bar{x} + z_{0{,}025}\,\frac{\sigma}{\sqrt{n}}$

Excel helps us to get the results presented in the table below.

x̄         90
z0,025    1,96
σ         6
n         9
lb        86,08
ub        93,92

Table 19
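Table 19 can be reproduced without Excel. The following Python sketch takes the critical value from the standard library's `NormalDist` instead of a table, so z is about 1,95996 rather than the rounded 1,96; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 90, 6, 9

# z such that 2.5% of the standard normal distribution lies above it
z = NormalDist().inv_cdf(0.975)

half_width = z * sigma / math.sqrt(n)
lb, ub = x_bar - half_width, x_bar + half_width
print(round(z, 2), round(lb, 2), round(ub, 2))
```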
Table 19
b) Opinion on the announced μ

The announced population mean μ = 100 is outside the 95% confidence interval. Based on this interval we can conclude that this value is not accurate and that it should be lower. This is what we will formally test in the next question.

c) Formal hypothesis test

We consider the following hypotheses:

H0: μ = 100
H1:
d) Minimal size of the sample
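We are looking for n0 such that

ub - lb = 0,5

$\left(\bar{x} + z_{0{,}025}\,\frac{\sigma}{\sqrt{n_0}}\right) - \left(\bar{x} - z_{0{,}025}\,\frac{\sigma}{\sqrt{n_0}}\right) = 0{,}5$

$2 \cdot z_{0{,}025}\,\frac{\sigma}{\sqrt{n_0}} = 0{,}5$

$2 \cdot z_{0{,}025} \cdot \sigma = 0{,}5 \cdot \sqrt{n_0}$

$n_0 = \left(\frac{2 \cdot z_{0{,}025} \cdot \sigma}{0{,}5}\right)^2 \approx 2213$

The minimum sample size such that the width of the confidence interval is at most 0,5 is 2213.

Verification: with Excel we recomputed the confidence interval for n=2212 and n=2213.

x̄         90         90
z0,975    1,96       1,96
σ         6          6
n         2212       2213
lb        89,74996   89,75002
ub        90,25004   90,24998
ub-lb     0,50008    0,49996

Table 20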
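The sample-size calculation and its verification can be sketched in Python as follows. It uses the rounded z = 1,96 from the table; the function name `ci_width` is illustrative.

```python
import math

Z = 1.96        # rounded critical value used in the table
SIGMA = 6       # standard deviation given by the company
WIDTH = 0.5     # maximum width of the confidence interval

# n0 = (2 * z * sigma / width)^2, rounded up to the next whole observation
n_min = math.ceil((2 * Z * SIGMA / WIDTH) ** 2)

def ci_width(n):
    """Width ub - lb of the confidence interval for a sample of size n."""
    return 2 * Z * SIGMA / math.sqrt(n)

print(n_min, round(ci_width(n_min - 1), 5), round(ci_width(n_min), 5))
```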
6) Exercise 6: Linear regression and confidence intervals
a) Excesses of the company and the world stock exchange
The excess is calculated by subtracting the risk-free interest rate (0,01) from the return of the company (respectively of the world stock).

As the table is quite long, we give here only its first lines as an example.

company   world stock   Risk_free   Excess_company   Excess_world_stock
0,02      -0,01         0,01        0,01             -0,02
0,01      -0,01         0,01        0,00             -0,02
0,12      0,08          0,01        0,11             0,07
0,08      0,06          0,01        0,07             0,05
0,04      0,02          0,01        0,03             0,01
-0,01     0,02          0,01        -0,02            0,01
0,00      -0,03         0,01        -0,01            -0,04

Table 21
b) Scatter plot
[Figure 15: scatter plot; horizontal axis: excess world stock (-0,08 to 0,12); vertical axis: excess company (-0,10 to 0,15)]

Figure 15
c) Regression
We use the function data analysis/linear regression from Excel to produce the following results.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0,769632194
R Square            0,592333715
Adjusted R Square   0,587039347
Standard Error      0,025262736
Observations        79

ANOVA
             df   SS            MS            F             Significance F
Regression   1    0,071402455   0,071402455   111,8799805   1,15886E-16
Residual     77   0,049141849   0,000638206
Total        78   0,120544304

                 Intercept      Excess_world_stock
Coefficients     -0,000960921   0,819299331
Standard Error   0,002870567    0,077458023
t Stat           -0,334749365   10,57733333
P-value          0,738724329    1,15886E-16
Lower 95%        -0,00667695    0,665060704
Upper 95%        0,004755109    0,973537958
Lower 96,0%      -0,006957893   0,657479881
Upper 96,0%      0,005036052    0,981118782

Table 22

The "Coefficients" row gives the coefficients of the linear regression equation; the "Lower/Upper 95%" and "Lower/Upper 96,0%" rows give the 95% and 96% confidence intervals.

According to this table, the linear regression equation between the company excess and the world stock excess is:

$\hat{y} = 0{,}819299331\,x - 0{,}000960921$

Note: as for 2)f), before trusting this equation we should verify the 4 assumptions necessary to apply a linear model. This is not rigorously done here, and the results given by Excel are trusted as they are.
Verification: we ask Excel to draw the regression line on the scatter diagram and to display the equation.

Figure 16
d) Reliability of the slope.
We base our reasoning on the Table 22.
It indicates that the 96% confidence interval for the slope is [0,657479881 ; 0,981118782]. If we rely on this interval, the relation between the two variables is positive and definitely exists, because 0 is not in the interval.
e) Constant term
We base our reasoning on the Table 22.
It indicates that the 95% confidence interval (5% significance level) for the constant term is [-0,00667695 ; 0,004755109]. The constant term can therefore be positive or negative. We cannot conclude that the company offers a guaranteed advantage or disadvantage compared to the world stock.
f) Performance
If the market is performing well, the company performs somewhat worse than the world stock exchange. Explanation: according to both the 95% and the 96% intervals the slope is definitely less than 1, and the value given by the equation is approximately 0,82. This means that if the excess of the world stock increases by 1, the company excess increases by only about 0,82.

On the contrary, if the market is falling, the company is falling less.

The constant term being around 0, we can conclude that:

An investment in the company will not bring the investor a real advantage over the average market, but will lower his risk.
[Figure 16 content: scatter plot of excess company (-0,10 to 0,15) against excess world stock (-0,08 to 0,12), with the trend line y = 0,8193x - 0,001]
g) Percentage explained by the model
In Table 22, R² is given as 0,592333715. This means that 59,23% of the variability of the return of the company is explained by the linear regression model.
Note: this question could have also been explained by displaying on the excel scatter plot theR2 value associated to the trend line.
h) Prediction interval
The prediction interval could be computed manually with the following formula:

$\hat{y} \pm t_{0{,}04/2;\,n-2}\; s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}}$ where $x_g = \frac{y - b_0}{b_1}$ and $\hat{y} = b_1 x + b_0$

But this manual calculation is quite laborious, as we need to calculate ŷ for each value of x, then the error and its standard deviation. Due to the high probability of errors I preferred to use the data analysis plus/prediction interval function in Excel (in the tools provided on the CD with the book Managerial Statistics by Gerald Keller).
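For illustration, the manual formula can be written as a small Python function. This is a sketch under assumptions: the data below are hypothetical (not the 79 observations of the exercise), the function name is illustrative, and the Student critical value must be supplied by the caller (for example from Excel's TINV), since the Python standard library has no Student quantile.

```python
import math

def prediction_interval(xs, ys, x_g, t_crit):
    """Return (lower, predicted, upper) for a new observation at x = x_g."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)            # equals (n - 1) * s_x^2
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_eps = math.sqrt(sse / (n - 2))                   # standard error of estimate
    y_hat = b0 + b1 * x_g
    half = t_crit * s_eps * math.sqrt(1 + 1 / n + (x_g - x_bar) ** 2 / sxx)
    return y_hat - half, y_hat, y_hat + half

# Hypothetical data; 3.182 is the two-sided 95% Student value for 3 degrees of freedom
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
lower, predicted, upper = prediction_interval(xs, ys, 3.5, 3.182)
print(round(lower, 3), round(predicted, 3), round(upper, 3))
```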
The result is displayed in Table 23. We then corrected the values, because the question refers to the return and not to the excess (we added back the 0,01 of the risk-free investment).

Prediction interval, excess
Predicted value                       0,015425066
Prediction Interval
  Lower limit                         -0,037738959
  Upper limit                         0,068589091
Interval Estimate of Expected Value
  Lower limit                         0,009021793
  Upper limit                         0,021828339

Prediction interval, return of the company
Predicted value                       0,025425066
Lower limit                           -0,027738959
Upper limit                           0,078589091

Table 23

The prediction interval is quite wide: based on a 96% confidence level, the return can take values between -2,77% and 7,86%. This is not a narrow prediction.