
TUTORIAL: BASIC OPERATIONS

[1] AN EXERCISE ON HOW TO USE GAUSS

• On your computer, create a folder called “seminar”, and download the two files exer.txt and ols.prg from Dr. Ahn’s website. Save the two files in the “seminar” folder. Then, follow the steps described below. The pictures shown below may differ slightly from what you actually see on your screen.

1. Double click on the GAUSS 6.0 icon. Then, you will see:

2. Type “chdir c:\seminar” (or a: if you use a floppy disk) in the command window and press the return key. This changes your working directory to “seminar”.


3. Click on File/Open.

Then, you will see:

Click on the Open button. Then,


You can edit the program if you want. Once you finish editing, save the program using the Save button on the screen. If you would like to create your own program, type “edit johndoe.prg” (or another name) in the command window and press the return key. You will then get an empty editor window named johndoe.prg. Type your own code in it!

4. To run the program, click on Action/Run Active File.

• To see the output file, click on File/Open. When the list of files appears, click on “ols.out”, which is the output file created by running ols.prg. Then, you will see:


[2] SOME USEFUL COMMANDS

(1) Reading data from a text file:

• The form of exer.txt:

• This data set contains 100 observations on 5 variables. To read this data set:

load data[100,5] = exer.txt;
y = data[.,1];    @ all the entries in the first column @
x2 = data[.,2];   @ all the entries in the second column @
x3 = data[.,3];
x4 = data[.,4];
x5 = data[.,5];

The last 5 lines define variable names.
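If you want to verify that the data were read in correctly, a minimal check (a sketch using GAUSS's built-in rows and cols functions) is:

rows(data);   @ prints the number of rows; should be 100 @
cols(data);   @ prints the number of columns; should be 5 @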


(2) Making Comments

GAUSS does not execute any code between @ and @ or between /* and */. Thus, you can put comments in a program using @ ... @ or /* ... */.

(3) Matrix Operations

• Let A = [aij] and B = [bij] be conformable matrices.

• A*B: Product of A and B.
• A': Transpose of A.
• A'B: Product of A' and B.
• A[.,1]: The first column of A.
• A[1,.]: The first row of A.
• A[1,2]: The (1,2)th element of A.
• A[1:5,.]: The 1st, 2nd, 3rd, 4th and 5th rows of A.
• A[1 3,.]: The 1st and 3rd rows of A.
• inv(A): Inverse of A.
• invpd(A): Inverse of a positive definite matrix A.
• diag(A): n×1 vector of diagonal elements of an n×n matrix A.
• sqrt(A): [√aij] (element-by-element square root).
• A^2: [aij²] (element-by-element square).
• A|B: Merge A and B vertically.
• A~B: Merge A and B horizontally.
• A.*.B: Kronecker Product of A and B.
• sumc(A): n×1 vector of sums of individual columns for an m×n matrix A.

EX: A = [1 2; 3 4; 5 6] (3×2) → sumc(A) = (9, 12)′.

• meanc(A): n×1 vector of means of individual columns for an m×n matrix A.

EX: A = [1 2; 3 4; 5 6] (3×2) → meanc(A) = (3, 4)′.

• stdc(A): n×1 vector of standard deviations of individual columns for an m×n matrix A.
• Suppose A = [aij]m×n; B = [bij]m×n; C = [ci]m×1; and d is a scalar.

• A./B: [aij/bij] (element-by-element operation).


• A./C: [aij/ci] (element-by-element operation).
• d-A: [d-aij].
• d*A: [d·aij].
• A/d: [aij/d].
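• To make these operators concrete, the following short sketch can be typed in the command window (the matrices a and b here are illustrative, not from exer.txt; an expression statement in GAUSS prints its value):

a = {1 2, 3 4};   @ a 2x2 matrix @
b = {5 6, 7 8};
a*b;              @ matrix product @
a';               @ transpose of a @
a~b;              @ horizontal merge: 2x4 @
a|b;              @ vertical merge: 4x2 @
a./b;             @ element-by-element division @
a^2;              @ element-by-element square @
invpd(a'a);       @ inverse of the positive definite matrix a'a @
sumc(a);          @ column sums: (4, 6)' @
meanc(a);         @ column means: (2, 3)' @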

• Generating a matrix of standard normally distributed random variables:

• rndns(t,k,dd): random matrix of t rows and k columns, generated using dd as the seed number.

• For uniformly distributed (between zero and one) random variables, use rndus(t,k,dd).

(4) Do Loop

• When you need to do the same job repeatedly, the following commands are useful:

i = 1;
do while i <= 100;
   ...
   i = i + 1;
endo;

If you run the above commands, the code written in place of (...) is executed repeatedly (here, 100 times).

(5) Drawing Histogram

• Run the following command lines:

library pgraph;
graphset;
{a1,a2,a3} = hist(storb,50);

If you execute the three lines, GAUSS draws a histogram for the variable named “storb”, counting the frequencies in each of 50 categories. The variables named a1, a2 and a3 will contain some numeric information about the histogram (see the GAUSS manual for details).


• Or, you can run the following command lines:

library pgraph;
graphset;
v = seqa(0,0.01,200);
{a1,a2,a3} = hist(storb,v);

Here v is a 200×1 vector containing an additive sequence. The first argument of seqa is the starting value, the second argument is the increment, and the third argument is the number of elements in the sequence. Thus, v = (0, 0.01, 0.02, ..., 1.99)′. If you execute the above four lines, GAUSS draws a histogram for the variable named “storb”, using the entries of v as the breakpoints for computing the frequencies.
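• Putting rndns, the do loop, and hist together, here is a minimal sketch (the sample size and seed number are illustrative, not from the original files) that fills a storage vector one draw at a time and then plots its histogram:

library pgraph;
graphset;
seed = 123;              @ illustrative seed number @
n = 1000;                @ illustrative number of draws @
storb = zeros(n,1);      @ storage vector @
i = 1;
do while i <= n;
   storb[i,1] = rndns(1,1,seed);   @ one standard normal draw per pass @
   i = i + 1;
endo;
{a1,a2,a3} = hist(storb,50);       @ histogram with 50 categories @

The histogram should look like the standard normal density. (Of course, storb = rndns(n,1,seed); would generate all the draws in one line; the loop is used only to illustrate do while.)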


[3] OLS EXERCISE

Program file: ols.prg

/*
** OLS Program
*/

@ Data loading @
load data[100,5] = exer.txt;

@ Define # of observations, # of regressors, X and Y @
tt = rows(data);   @ # of observations @
kk = 5;            @ # of regressors @
y  = data[.,1];
x2 = data[.,2];
x3 = data[.,3];
x4 = data[.,4];
x5 = data[.,5];
vny = {"y"};       @ name of the dependent variable @

@ Dependent variable @
yy = y;
vny = {"y"};

@ Regressors @
xx = ones(tt,1)~x2~x3~x4~x5;
vnx = {"cons", "x2", "x3", "x4", "x5"};

@ Do not change below @
@ OLS using yy and xx @
b = invpd(xx'xx)*(xx'yy);
e = y - xx*b;
s2 = (e'e)/(tt-kk);
v = s2*invpd(xx'xx);
econ = b~sqrt(diag(v))~(b./sqrt(diag(v)));
econ = vnx~econ;
sst = yy'yy - tt*meanc(yy)^2;
sse = e'e;
r2 = 1 - sse/sst;

@ Printing out OLS results @
output file = ols.out reset;
let mask[1,4] = 0 1 1 1;
let fmt[4,3] = "-*.*s" 8 8 "*.*lf" 10 4 "*.*lf" 10 4 "*.*lf" 10 4;
format /rd 10,4;
"";
"OLS Regression Result";
"------------------------";
" dependent variable: " $vny;
"";
" R-squared " r2;
"";
"variable     coeff.   std. err.      t-st ";
yyprin = printfm(econ,mask,fmt);
"";
output off;

Output file: ols.out

OLS Regression Result
------------------------
 dependent variable:  y

 R-squared      0.8960

variable     coeff.   std. err.      t-st
cons        -0.0009     0.2225     -0.0042
x2           0.9617     0.2040      4.7145
x3           1.7759     0.2659      6.6790
x4           3.0003     0.1903     15.7669
x5           3.8823     0.2035     19.0790

[4] OLS MONTE CARLO EXPERIMENTS

• Theory says that OLS estimators are normally distributed under ideal conditions. What does this mean?

• Consider the data in exer.txt. The data were generated by the following equation:

yt = 0 + 1·xt2 + 2·xt3 + 3·xt4 + 4·xt5 + εt,  t = 1, ..., 100,  with var(εt) = 4.

Observe that β2 = 1.
• We now generate a new data set (say, Data Set 1) as follows:

1. For each t = 1, ..., 100, keep the values of xt2, ..., xt5, but draw a new εt from N(0,4). Then, generate a new yt following the above equation.

2. Estimate β2 by OLS. Name this estimate b2[1].

3. Generate another data set (say, Data Set 2) in the same way you generated Data Set 1. Then, get b2[2].

4. Repeat 5,000 (or more) times and get b2[1], ..., b2[5000].
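• More formally, under the classical assumptions the OLS estimator is b = (X'X)^(-1)X'y with b | X ~ N(β, σ²(X'X)^(-1)), so each b2[i] is a draw from a normal distribution centered at the true value β2 = 1.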

• The econometric theory implies that the estimates will be normally distributed.
• Let's check whether this is true.

Program file: olsmonte.prg

/*
** Monte Carlo Program
*/

@ Load Data @
load data[100,5] = exer.txt;

@ Data generation under Classical Linear Regression Assumptions @
seed = 1;
tt = 100;      @ # of observations @
kk = 5;        @ # of betas @
iter = 5000;   @ # of sets of different data @
xx = ones(tt,1)~data[.,2:5];   @ Regressors are fixed @
tb = {0,1,2,3,4};   @ y = x(2)*1 + x(3)*2 + x(4)*3 + x(5)*4 + e @
storb = zeros(iter,1);
storse = zeros(iter,1);

i = 1;
do while i <= iter;

   @ Generating y @
   yy = xx*tb + 2*rndns(tt,1,seed);

   @ OLS using yy and xx @
   b = invpd(xx'xx)*(xx'yy);
   e = yy - xx*b;
   s2 = (e'e)/(tt-kk);
   v = s2*invpd(xx'xx);
   se = sqrt(diag(v));
   storb[i,1] = b[2,1];
   storse[i,1] = se[2,1];

   i = i + 1;
endo;

@ Reporting Monte Carlo results @
output file = olsmonte.out reset;
format /rd 12,3;
"Monte Carlo results";
"-----------";
"Mean of OLS b(2) =" meanc(storb);
"s.e. of OLS b(2) =" stdc(storb);
"mean of estimated s.e. of OLS b(2) =" meanc(storse);
library pgraph;
graphset;
{a1,a2,a3} = hist(storb,50);
output off;

Output file: olsmonte.out

Monte Carlo results
-----------
Mean of OLS b(2) =        0.998
s.e. of OLS b(2) =        0.184
mean of estimated s.e. of OLS b(2) =        0.185

Graphic Outcome: [histogram of the 5,000 Monte Carlo estimates of b(2), drawn by hist(storb,50)]


[5] OLS UNDER HETEROSKEDASTICITY

Program file: hetmonte.prg

/*
** Monte Carlo Program for OLS, OLS with White Correction
** and GLS under Heteroskedasticity
*/

@ Data generating process @
seed = 1;      @ seed number @
tt = 500;      @ # of observations @
kk = 2;        @ # of betas @
iter = 4000;   @ # of sets of different data @
x = ones(tt,1)~rndns(tt,kk-1,seed)^2;   @ Nonstochastic Regressor @
tb = {1,2};    @ y = x(1)*1 + x(2)*2 + e @
rho = 0.1;     @ degree of heteroskedasticity @

storob = zeros(iter,1); storose = zeros(iter,1); storosec = zeros(iter,1);
storot = zeros(iter,1); storotc = zeros(iter,1);
storgb = zeros(iter,1); storgse = zeros(iter,1); storgt = zeros(iter,1);
storfb = zeros(iter,1); storfse = zeros(iter,1); storft = zeros(iter,1);

i = 1;
do while i <= iter;

   @ Generating y with heteroskedasticity @
   y = x*tb + sqrt(rho*x[.,2]+(1-rho)*1).*rndns(tt,1,seed);

   @ OLS using y and x @
   b = invpd(x'x)*(x'y);
   e = y - x*b;
   ee = e^2;
   s2 = (e'e)/(tt-kk);
   v = s2*invpd(x'x);
   vc = (x.*e)'(x.*e);
   vc = invpd(x'x)*vc*invpd(x'x);
   se = sqrt(diag(v));
   sec = sqrt(diag(vc));
   storob[i,1] = b[2,1];
   storose[i,1] = se[2,1];
   storosec[i,1] = sec[2,1];

   olst = (b[2,1]-tb[2,1])/se[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(olst),df);
   if pval >= 0.05;
      storot[i,1] = 0;
   else;
      storot[i,1] = 1;
   endif;

   olstc = (b[2,1]-tb[2,1])/sec[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(olstc),df);
   if pval >= 0.05;
      storotc[i,1] = 0;
   else;
      storotc[i,1] = 1;
   endif;

   @ infeasible GLS using y and x @
   ys = y./sqrt(rho*x[.,2]+(1-rho)*1);
   xs = x./sqrt(rho*x[.,2]+(1-rho)*1);
   b = invpd(xs'xs)*(xs'ys);
   e = ys - xs*b;
   s2 = (e'e)/(tt-kk);
   v = s2*invpd(xs'xs);
   se = sqrt(diag(v));
   storgb[i,1] = b[2,1];
   storgse[i,1] = se[2,1];

   glst = (b[2,1]-tb[2,1])/se[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(glst),df);
   if pval >= 0.05;
      storgt[i,1] = 0;
   else;
      storgt[i,1] = 1;
   endif;

   @ feasible GLS using y and x @
   fh = x*invpd(x'x)*(x'ee);
   ys = y./sqrt(abs(fh));
   xs = x./sqrt(abs(fh));
   b = invpd(xs'xs)*(xs'ys);
   e = ys - xs*b;
   s2 = (e'e)/(tt-kk);
   v = s2*invpd(xs'xs);
   se = sqrt(diag(v));
   storfb[i,1] = b[2,1];
   storfse[i,1] = se[2,1];

   glst = (b[2,1]-tb[2,1])/se[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(glst),df);
   if pval >= 0.05;
      storft[i,1] = 0;
   else;
      storft[i,1] = 1;
   endif;

   i = i + 1;
endo;

@ Reporting Monte Carlo results @
output file = hetmonte.out reset;
format /rd 12,3;
"Monte Carlo results";
"";
"-----------";
"Checking Unbiasedness";
"-----";
"Mean of OLS b(2) =" meanc(storob);
"s.e. of OLS b(2) =" stdc(storob);
"mean of estimated s.e. of OLS b(2) =" meanc(storose);
"mean of White-corrected s.e. of OLS b(2) =" meanc(storosec);
"-----";
"Mean of infeasible GLS b(2) =" meanc(storgb);
"s.e. of infeasible GLS b(2) =" stdc(storgb);
"mean of estimated s.e. of infeasible GLS b(2) =" meanc(storgse);
"-----";
"Mean of feasible GLS b(2) =" meanc(storfb);
"s.e. of feasible GLS b(2) =" stdc(storfb);
"mean of estimated s.e. of feasible GLS b(2) =" meanc(storfse);
"";
"Checking sizes of t-tests at 5% sig. level";
"-----";
"Rejection Rate of the t-test with usual OLS =" meanc(storot);
"Rejection Rate of the t-test with White-corrected OLS =" meanc(storotc);
"Rejection Rate of the t-test with infeasible GLS =" meanc(storgt);
"Rejection Rate of the t-test with feasible GLS =" meanc(storft);
output off;

Output file: hetmonte.out

Monte Carlo results
-----------
Checking Unbiasedness
-----
Mean of OLS b(2) =        2.000
s.e. of OLS b(2) =        0.033
mean of estimated s.e. of OLS b(2) =        0.027
mean of White-corrected s.e. of OLS b(2) =        0.032
-----
Mean of infeasible GLS b(2) =        2.000
s.e. of infeasible GLS b(2) =        0.032
mean of estimated s.e. of infeasible GLS b(2) =        0.032
-----
Mean of feasible GLS b(2) =        2.000
s.e. of feasible GLS b(2) =        0.032
mean of estimated s.e. of feasible GLS b(2) =        0.031

Checking sizes of t-tests at 5% sig. level
-----
Rejection Rate of the t-test with usual OLS =        0.118
Rejection Rate of the t-test with White-corrected OLS =        0.071
Rejection Rate of the t-test with infeasible GLS =        0.051
Rejection Rate of the t-test with feasible GLS =        0.062
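• For reference, the two lines computing vc in the program implement White's heteroskedasticity-consistent covariance estimator, V = (X'X)^(-1)·(Σt et²·xt·xt')·(X'X)^(-1), since (x.*e)'(x.*e) = Σt et²·xt·xt'. Note from the output that the usual OLS t-test rejects the true hypothesis about 12% of the time at the 5% level, while the White-corrected and (especially) the GLS tests stay close to the nominal size.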


[6] OLS UNDER AUTOCORRELATION

Program file: automonte.prg

/*
** Monte Carlo Program
** for OLS and GLS
** under Autocorrelation: AR(1)
*/

seed = 1;
tt = 100;      @ # of observations @
kk = 2;        @ # of betas @
iter = 1000;   @ # of sets of different data @
x = ones(tt,1)~rndns(tt,kk-1,seed)^2;   @ Nonstochastic Regressor @
tb = {1,2};    @ y = x(1)*1 + x(2)*2 + e @
rho = .9;      @ e(t) = rho*e(t-1) + v(t) @
vvar = 4;
bandw = 5;

storb = zeros(iter,1); storse = zeros(iter,1); storsec = zeros(iter,1);
stort = zeros(iter,1); stortc = zeros(iter,1);
storpwgb = zeros(iter,1); storpwgse = zeros(iter,1);
storpwgsec = zeros(iter,1); storpwgt = zeros(iter,1);
storcogb = zeros(iter,1); storcogse = zeros(iter,1);
storcogsec = zeros(iter,1); storcogt = zeros(iter,1);

i = 1;
do while i <= iter;

   @ Generating y with autocorrelation @
   ep = sqrt(vvar/(1-rho^2))*rndns(tt,1,seed);
   j = 2;
   do while j <= tt;
      ep[j,1] = rho*ep[j-1,1] + ep[j,1];
      j = j + 1;
   endo;
   y = x*tb + 2*ep;

   @ OLS using y and x @
   b = invpd(x'x)*(x'y);
   e = y - x*b;
   s2 = (e'e)/(tt-kk);
   v = s2*invpd(x'x);

   proc nw(uv,l);
      local j,omeo,n,w,ome;
      n = rows(uv);
      omeo = uv'uv/n;
      j = 1;
      do while j <= l;
         w = 1 - j/(l+1);
         ome = uv[j+1:n,.]'uv[1:n-j,.]/n;
         omeo = omeo + w*(ome+ome');
         j = j + 1;
      endo;
      retp(omeo);
   endp;

   vc = invpd(x'x)*(tt*nw(x.*e,bandw))*invpd(x'x);
   se = sqrt(diag(v));
   sec = sqrt(diag(vc));
   storb[i,1] = b[2,1];
   storse[i,1] = se[2,1];
   storsec[i,1] = sec[2,1];

   olst = (b[2,1]-tb[2,1])/se[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(olst),df);
   if pval >= 0.05;
      stort[i,1] = 0;
   else;
      stort[i,1] = 1;
   endif;

   olst = (b[2,1]-tb[2,1])/sec[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(olst),df);
   if pval >= 0.05;
      stortc[i,1] = 0;
   else;
      stortc[i,1] = 1;
   endif;

   @ GLS using y and x @
   gxx = y[1:rows(y)-1,1]~x[2:rows(x),.]~x[1:rows(x)-1,2];
   gyy = y[2:rows(y),1];
   durb = invpd(gxx'gxx)*(gxx'gyy);
   drho = durb[1,1];
   gyy = (sqrt(1-drho^2)*y[1,.])|(y[2:rows(y),.]-drho*y[1:rows(y)-1,.]);
   gxx = (sqrt(1-drho^2)*x[1,.])|(x[2:rows(y),.]-drho*x[1:rows(y)-1,.]);
   pwglsb = invpd(gxx'gxx)*(gxx'gyy);
   pwe = gyy - gxx*pwglsb;
   pws2 = (pwe'pwe)/(tt-kk);
   pwv = pws2*invpd(gxx'gxx);
   pwse = sqrt(diag(pwv));
   storpwgb[i,1] = pwglsb[2,1];
   storpwgse[i,1] = pwse[2,1];

   glst = (pwglsb[2,1]-tb[2,1])/pwse[2,1];
   df = tt-2;
   pval = 2*cdftc(abs(glst),df);
   if pval >= 0.05;
      storpwgt[i,1] = 0;
   else;
      storpwgt[i,1] = 1;
   endif;

   gyy = gyy[2:rows(gyy),.];
   gxx = gxx[2:rows(gxx),.];
   coglsb = invpd(gxx'gxx)*(gxx'gyy);
   coe = gyy - gxx*coglsb;
   cos2 = (coe'coe)/(tt-kk-1);
   cov = cos2*invpd(gxx'gxx);
   cose = sqrt(diag(cov));
   storcogb[i,1] = coglsb[2,1];
   storcogse[i,1] = cose[2,1];

   glst = (coglsb[2,1]-tb[2,1])/cose[2,1];
   df = tt-2-1;
   pval = 2*cdftc(abs(glst),df);
   if pval >= 0.05;
      storcogt[i,1] = 0;
   else;
      storcogt[i,1] = 1;
   endif;

   i = i + 1;
endo;

@ Reporting Monte Carlo results @
output file = automonte.out reset;
format /rd 12,3;
"Monte Carlo results";
"";
"-----------";
"Checking Unbiasedness";
"-----";
"Mean of OLS b(2) =" meanc(storb);
"s.e. of OLS b(2) =" stdc(storb);
"mean of estimated s.e. of OLS b(2) =" meanc(storse);
"mean of NW-corrected s.e. of OLS b(2) =" meanc(storsec);
"----------";
"Mean of P-W feasible GLS b(2) =" meanc(storpwgb);
"s.e. of P-W feasible GLS b(2) =" stdc(storpwgb);
"mean of P-W estimated s.e. of feasible GLS b(2) =" meanc(storpwgse);
"----------";
"Mean of C-O feasible GLS b(2) =" meanc(storcogb);
"s.e. of C-O feasible GLS b(2) =" stdc(storcogb);
"mean of C-O estimated s.e. of feasible GLS b(2) =" meanc(storcogse);
"";
"Checking sizes of t-tests at 5% sig. level";
"----------";
"Rejection Rate of the t-test with usual OLS =" meanc(stort);
"Rejection Rate of the t-test with NW-corrected OLS =" meanc(stortc);
"Rejection Rate of the t-test with PW feasible GLS =" meanc(storpwgt);
"Rejection Rate of the t-test with CO feasible GLS =" meanc(storcogt);
output off;

Output file: automonte.out

Monte Carlo results
-----------
Checking Unbiasedness
-----
Mean of OLS b(2) =        1.997
s.e. of OLS b(2) =        0.971
mean of estimated s.e. of OLS b(2) =        1.346
mean of NW-corrected s.e. of OLS b(2) =        1.117
----------
Mean of P-W feasible GLS b(2) =        2.021
s.e. of P-W feasible GLS b(2) =        0.476
mean of P-W estimated s.e. of feasible GLS b(2) =        0.481
----------
Mean of C-O feasible GLS b(2) =        2.021
s.e. of C-O feasible GLS b(2) =        0.476
mean of C-O estimated s.e. of feasible GLS b(2) =        0.483

Checking sizes of t-tests at 5% sig. level
----------
Rejection Rate of the t-test with usual OLS =        0.011
Rejection Rate of the t-test with NW-corrected OLS =        0.040
Rejection Rate of the t-test with PW feasible GLS =        0.046
Rejection Rate of the t-test with CO feasible GLS =        0.045
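• For reference, the procedure nw in the program computes the Newey-West (Bartlett-kernel) estimate of the long-run covariance, with weights w = 1 - j/(L+1) for lags j = 1, ..., L (here L = bandw = 5). The P-W (Prais-Winsten) GLS keeps the transformed first observation, scaled by sqrt(1-drho²), while the C-O (Cochrane-Orcutt) GLS drops it. Note from the output that with rho = .9 the usual OLS standard errors are unreliable, while both feasible GLS estimators are much more precise (s.e. 0.476 versus 0.971 for OLS).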