SAS NLN

Chapter 14The MODEL Procedure

Chapter Table of Contents

OVERVIEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675

GETTING STARTED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679Nonlinear Regression Analysis . . .. . . . . . . . . . . . . . . . . . . . . . 679Nonlinear Systems Regression . . .. . . . . . . . . . . . . . . . . . . . . . 683General Form Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684Solving Simultaneous Nonlinear Equation Systems . . . . . . . . . . . . . . 687Monte Carlo Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690

SYNTAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693Functional Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694PROC MODEL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 700BOUNDS Statement . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 704BY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705CONTROL Statement . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 706ENDOGENOUS Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . 706ESTIMATE Statement . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 706EXOGENOUS Statement .. . . . . . . . . . . . . . . . . . . . . . . . . . . 708FIT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708ID Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715INCLUDE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715INSTRUMENTS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . 715LABEL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716OUTVARS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717PARAMETERS Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . 717RANGE Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717RESET Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718RESTRICT Statement . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 718SOLVE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719TEST Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724VAR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725WEIGHT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725

ESTIMATION DETAILS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726Estimation Methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726Minimization Methods . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 738

671

Part 2. General Information

Convergence Criteria . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 739Troubleshooting Convergence Problems. . . . . . . . . . . . . . . . . . . . 742Iteration History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754Computer Resource Requirements . . . . . . . . . . . . . . . . . . . . . . . 756Testing for Normality . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 760Heteroscedasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762Transformation of Error Terms . . . . . . . . . . . . . . . . . . . . . . . . . 766Error Covariance Structure Specification . . . . . . . . . . . . . . . . . . . . 769Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . 773Restrictions and Bounds on Parameters. . . . . . . . . . . . . . . . . . . . 783Tests on Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784Hausman Specification Test . . . . . . . . . . . . . . . . . . . . . . . . . . . 786Chow Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 787Profile Likelihood Confidence Intervals. . . . . . . . . . . . . . . . . . . . 789Choice of Instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791Autoregressive Moving Average Error Processes. . . . . . . . . . . . . . . 794Distributed Lag Models and the %PDL Macro .. . . . . . . . . . . . . . . . 808Input Data Sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 810Output Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815ODS Table Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 819

SIMULATION DETAILS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 821Solution Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 821Goal Seeking: Solving for Right-Hand-Side Variables . . . . . . . . . . . . . 835Numerical Solution Methods. . . . . . . . . . . . . . . . . . . . . . . . . . 837Numerical Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845SOLVE Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846

PROGRAMMING LANGUAGE OVERVIEW . . . . . . . . . . . . . . . . 849Variables in the Model Program . . . . . . . . . . . . . . . . . . . . . . . . 849Equation Translations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 853Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 855Mathematical Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 856Functions Across Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857Language Differences . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 861Storing Programs in Model Files . . . . . . . . . . . . . . . . . . . . . . . . 863Diagnostics and Debugging. . . . . . . . . . . . . . . . . . . . . . . . . . . 864Analyzing the Structure of Large Models . . . . . . . . . . . . . . . . . . . 869

EXAMPLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 873Example 14.1 OLS Single Nonlinear Equation . . . . . . . . . . . . . . . . . 873Example 14.2 A Consumer Demand Model . . . . . . . . . . . . . . . . . . 875Example 14.3 Vector AR(1) Estimation . . . . . . . . . . . . . . . . . . . . 879Example 14.4 MA(1) Estimation . . . . . . . . . . . . . . . . . . . . . . . . 883Example 14.5 Polynomial Distributed Lags Using %PDL. . . . . . . . . . . 887Example 14.6 General-Form Equations . . . . . . . . . . . . . . . . . . . . 891Example 14.7 Spring and Damper Continuous System .. . . . . . . . . . . 894Example 14.8 Nonlinear FIML Estimation . . . . . . . . . . . . . . . . . . . 900

SAS OnlineDoc: Version 8672

Chapter 14. The MODEL Procedure

Example 14.9 Circuit Estimation . . . . . . . . . . . . . . . . . . . . . . . . 902Example 14.10 Systems of Differential Equations . . . . . . . . . . . . . . . 903Example 14.11 Monte Carlo Simulation . . . . . . . . . . . . . . . . . . . . 906

REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 908

673SAS OnlineDoc: Version 8

Chapter 14The MODEL Procedure

Overview

The MODEL procedure analyzes models in which the relationships among the vari-ables comprise a system of one or more nonlinear equations. Primary uses of theMODEL procedure are estimation, simulation, and forecasting of nonlinear simulta-neous equation models.

PROC MODEL features include

� SAS programming statements to define simultaneous systems of nonlinearequations

� tools to analyze the structure of the simultaneous equation system

� ARIMA, PDL, and other dynamic modeling capabilities

� tools to specify and estimate the error covariance structure

� tools to estimate and solve ordinary differential equations

� the following methods for parameter estimation:

– Ordinary Least Squares (OLS)

– Two-Stage Least Squares (2SLS)

– Seemingly Unrelated Regression (SUR) and iterative SUR (ITSUR)

– Three-Stage Least Squares (3SLS) and iterative 3SLS (IT3SLS)

– Generalized Method of Moments (GMM)

– Full Information Maximum Likelihood (FIML)

� simulation and forecasting capabilities

� Monte Carlo simulation

� goal seeking solutions

A system of equations can be nonlinear in the parameters, nonlinear in the observedvariables, or nonlinear in both the parameters and the variables.Nonlinear in theparameters means that the mathematical relationship between the variables and pa-rameters is not required to have a linear form. (A linear model is a special case of anonlinear model.) A general nonlinear system of equations can be written as

q1 (y1;t; y2;t; : : :; yg;t; x1;t; x2;t; : : :; xm;t; �1; �2; : : :; �p) = �1;tq2 (y1;t; y2;t; : : :; yg;t; x1;t; x2;t; : : :; xm;t; �1; �2; : : :; �p) = �2;t...qg (y1;t; y2;t; : : :; yg;t; x1;t; x2;t; : : :; xm;t; �1; �2; : : :; �p) = �g;t

675


whereyi;t is an endogenous variable,xi;t is an exogenous variable,�i is a parameter,and�i is the unknown error. The subscriptt represents time or some index to the data.In econometrics literature, the observed variables are eitherendogenous(dependent)variables orexogenous(independent) variables. This system can be written moresuccinctly in vector form as

q(yt;xt; �) = �t

This system of equations is ingeneral formbecause the error term is by itself onone side of the equality. Systems can also be written innormalized formby placingthe endogenous variable on one side of the equality, with each equation defininga predicted value for a unique endogenous variable. A normalized form equationsystem can be written in vector notation as

yt = f(yt;xt; �) + �t:

PROC MODEL handles equations written in both forms.

Econometric models often explain the current values of the endogenous variables asfunctions of past values of exogenous and endogenous variables. These past valuesare referred to aslaggedvalues, and the variablext�i is called lagi of the variablext. Using lagged variables, you can create adynamic, or time dependent, model. Inthe preceding model systems, the lagged exogenous and endogenous variables areincluded as part of the exogenous variables.

If the data are time series, so thatt indexes time (see Chapter 2, “Working with TimeSeries Data,” for more information on time series), it is possible that�t depends on�t�i or, more generally, the�t’s are not identically and independently distributed. Ifthe errors of a model system are autocorrelated, the standard error of the estimates ofthe parameters of the system will be inflated.

Sometimes the�i’s are not identically distributed because the variance of� is notconstant. This is known asheteroscedasticity. Heteroscedasticity in an estimatedmodel can also inflate the standard error of the estimates of the parameters. Using aweighted estimation can sometimes eliminate this problem. Alternately, a variancemodel such as GARCH or EGARCH can be estimated to correct for heteroscedas-ticity. If the proper weighting scheme and the form of the error model is difficultto determine, generalized methods of moments (GMM) estimation can be used todetermine parameter estimates that are asymptotically more efficient than the OLSparameter estimates.

Other problems may also arise when estimating systems of equations. Consider thesystem of equations:

y1;t = �1 + (�2 + �3�t4)�1 + �5y2;t + �1;t

y2;t = �6 + (�7 + �8�t9)�1 + �10y1;t + �2;t

which is nonlinear in its parameters and cannot be estimated with linear regression.This system of equations represents a rudimentary predator-prey process withy1 as


Chapter 14. Overview

the prey andy2 as the predator (the second term in both equations is a logistics curve).The two equations must be estimated simultaneously because of the cross dependencyof y’s. Nonlinear ordinary least-squares estimation of these equations will producebiased and inconsistent parameter estimates. This is calledsimultaneous equationbias.

One method to remove simultaneous equation bias, in the linear case, is to replace theendogenous variables on the right-hand side of the equations with predicted valuesthat are uncorrelated with the error terms. These predicted values can be obtainedthrough a preliminary, or "first stage,"instrumental variable regression. Instrumentalvariables, which are uncorrelated with the error term, are used as regressors to modelthe predicted values. The parameter estimates are obtained by a second regressionusing the predicted values of the regressors. This process is calledtwo-stage leastsquares.

In the nonlinear case, nonlinear ordinary least-squares estimation is performed itera-tively using a linearization of the model with respect to the parameters. The instru-mental solution to simultaneous equation bias in the nonlinear case is the same asthe linear case except the linearization of the model with respect to the parameters ispredicted by the instrumental regression. Nonlinear two-stage least squares is one ofseveral instrumental variables methods available in the MODEL procedure to handlesimultaneous equation bias.

When you have a system of several regression equations, the random errors of theequations can be correlated. In this case, the large-sample efficiency of the estimationcan be improved by using a joint generalized least-squares method that takes thecross-equation correlations into account. If the equations are not simultaneous (nodependent regressors), thenseemingly unrelated regression(SUR) can be used. TheSUR method requires an estimate of the cross-equation error covariance matrix,�.The usual approach is to first fit the equations using OLS, compute an estimate� fromthe OLS residuals, and then perform the SUR estimation based on�. The MODELprocedure estimates� by default, or you can supply your own estimate of�.

If the equation system is simultaneous, you can combine the 2SLS and SUR methodsto take into account both simultaneous equation bias and cross-equation correlationof the errors. This is calledthree-stage least squaresor 3SLS.

A different approach to the simultaneous equation bias problem is the full informationmaximum likelihood, or FIML, estimation method. FIML does not require instru-mental variables, but it assumes that the equation errors have a multivariate normaldistribution. 2SLS and 3SLS estimation do not assume a particular distribution forthe errors.

Once a nonlinear model has been estimated, it can be used to obtain forecasts. Ifthe model is linear in the variables you want to forecast, a simple linear solve cangenerate the forecasts. If the system is nonlinear, an iterative procedure must be used.The preceding example system is linear in its endogenous variables. The MODELprocedure’s SOLVE statement is used to forecast nonlinear models.

One of the main purposes of creating models is to obtain an understanding of therelationship among the variables. There are usually only a few variables in a model



you can control (for example, the amount of money spent on advertising). Often youwant to determine how to change the variables under your control to obtain sometarget goal. This process is calledgoal seeking. PROC MODEL allows you to solvefor any subset of the variables in a system of equations given values for the remainingvariables.

The nonlinearity of a model creates two problems with the forecasts: the forecasterrors are not normally distributed with zero mean, and no formula exits to calculatethe forecast confidence intervals. PROC MODEL provides Monte Carlo techniques,which, when used with the covariance of the parameters and error covariance matrix,can produce approximate error bounds on the forecasts.


Chapter 14. Getting Started

Getting Started

This section introduces the MODEL procedure and shows how to use PROC MODELfor several kinds of nonlinear regression analysis and nonlinear systems simulationproblems.

Nonlinear Regression Analysis

One of the most important uses of PROC MODEL is to estimate unknown parametersin a nonlinear model. A simple nonlinear model has the form:

y = f(x; �) + �

wherex is a vector of exogenous variables. To estimate unknown parameters usingPROC MODEL, do the following:

1. Use the DATA= option in a PROC MODEL statement to specify the input SASdata set containingy andx, the observed values of the variables.

2. Write the equation for the model using SAS programming statements, includ-ing all parameters and arithmetic operators but leaving off the unobserved errorcomponent,�.

3. Use a FIT statement to fit the model equation to the input data to determine theunknown parameters,�.

An ExampleThe SASHELP library contains the data set CITIMON, which contains the variableLHUR, the monthly unemployment figures, and the variable IP, the monthly industrialproduction index. You suspect that the unemployment rates are inversely proportionalto the industrial production index. Assume that these variables are related by thefollowing nonlinear equation:

lhur =1

a � ip + b+ c + �

In this equationa, b, andc are unknown coefficients and� is an unobserved randomerror.

The following statements illustrate how to use PROC MODEL to estimate values fora, b, andc from the data in SASHELP.CITIMON.

proc model data=sashelp.citimon;lhur = 1/(a * ip + b) + c;fit lhur;

run;

Notice that the model equation is written as a SAS assignment statement. The vari-able LHUR is assumed to be the dependent variable because it is named in the FITstatement and is on the left-hand side of the assignment.



PROC MODEL determines that LHUR and IP are observed variables because theyare in the input data set. A, B, and C are treated as unknown parameters to be esti-mated from the data because they are not in the input data set. If the data set containeda variable named A, B, or C, you would need to explicitly declare the parameters witha PARMS statement.

In response to the FIT statement, PROC MODEL estimates values for A, B, and Cusing nonlinear least squares and prints the results. The first part of the output is a"Model Summary" table, shown in Figure 14.1.

The MODEL Procedure

Model Summary

Model Variables 1Parameters 3Equations 1Number of Statements 1

Model Variables LHURParameters a b c

Equations LHUR

Figure 14.1. Model Summary Report

This table details the size of the model, including the number of programming state-ments defining the model, and lists the dependent variables (LHUR in this case), theunknown parameters (A, B, and C), and the model equations. In this case the equationis named for the dependent variable, LHUR.

PROC MODEL then prints a summary of the estimation problem, as shown in Figure14.2.

The MODEL Procedure

The Equation to Estimate is

LHUR = F(a, b, c(1))

Figure 14.2. Estimation Problem Report

The notation used in the summary of the estimation problem indicates that LHUR isa function of A, B, and C, which are to be estimated by fitting the function to thedata. If the partial derivative of the equation with respect to a parameter is a simplevariable or constant, the derivative is shown in parentheses after the parameter name.In this case, the derivative with respect to the intercept C is 1. The derivatives withrespect to A and B are complex expressions and so are not shown.

Next, PROC MODEL prints an estimation summary as shown in Figure 14.3.



The MODEL ProcedureOLS Estimation Summary

Data Set Options

DATA= SASHELP.CITIMON

Minimization Summary

Parameters Estimated 3Method GaussIterations 10

Final Convergence Criteria

R 0.000737PPC(b) 0.003943RPC(b) 0.00968Object 4.784E-6Trace(S) 0.533325Objective Value 0.522214

Observations Processed

Read 145Solved 145Used 144Missing 1

Figure 14.3. Estimation Summary Report

The estimation summary provides information on the iterative process used to com-pute the estimates. The heading "OLS Estimation Summary" indicates that the non-linear ordinary least-squares (OLS) estimation method is used. This table indicatesthat all 3 parameters were estimated successfully using 144 nonmissing observationsfrom the data set SASHELP.CITIMON. Calculating the estimates required 10 iter-ations of the GAUSS method. Various measures of how well the iterative processconverged are also shown. For example, the "RPC(B)" value 0.00968 means that onthe final iteration the largest relative change in any estimate was for parameter B,which changed by .968 percent. See the section "Convergence Criteria" later in thischapter for details.

PROC MODEL then prints the estimation results. The first part of this table is thesummary of residual errors, shown in Figure 14.4.

The MODEL Procedure

Nonlinear OLS Summary of Residual Errors

DF DF AdjEquation Model Error SSE MSE R-Square R-Sq

LHUR 3 141 75.1989 0.5333 0.7472 0.7436

Figure 14.4. Summary of Residual Errors Report



This table lists the sum of squared errors (SSE), the mean square error (MSE), theroot mean square error (Root MSE), and the R2 and adjusted R2 statistics. The R2

value of .7472 means that the estimated model explains approximately 75 percentmore of the variability in LHUR than a mean model explains.

Following the summary of residual errors is the parameter estimates table, shown inFigure 14.5.

The MODEL Procedure

Nonlinear OLS Parameter Estimates

Approx ApproxParameter Estimate Std Err t Value Pr > |t|

a 0.009046 0.00343 2.63 0.0094b -0.57059 0.2617 -2.18 0.0309c 3.337151 0.7297 4.57 <.0001

Figure 14.5. Parameter Estimates

Because the model is nonlinear, the standard error of the estimate, the t value, and itssignificance level are only approximate. These values are computed using asymptoticformulas that are correct for large sample sizes but only approximately correct forsmaller samples. Thus, you should use caution in interpreting these statistics fornonlinear models, especially for small sample sizes. For linear models, these resultsare exact and are the same as standard linear regression.

The last part of the output produced by the FIT statement is shown in Figure 14.6.

The MODEL Procedure

Number of Observations Statistics for System

Used 144 Objective 0.5222Missing 1 Objective*N 75.1989

Figure 14.6. System Summary Statistics

This table lists the objective value for the estimation of the nonlinear system, whichis a weighted system mean square error. This statistic can be used for testing cross-equation restrictions in multi-equation regression problems. See the section "Restric-tions and Bounds on Parameters" for details. Since there is only a single equation inthis case, the objective value is the same as the residual MSE for LHUR except thatthe objective value does not include a degrees of freedom correction. This can beseen in the fact that "Objective*N" equals the residual SSE, 75.1989. N is 144, thenumber of observations used.

Convergence and Starting ValuesComputing parameter estimates for nonlinear equations requires an iterative process.Starting with an initial guess for the parameter values, PROC MODEL tries differentparameter values until the objective function of the estimation method is minimized.(The objective function of the estimation method is sometimes called thefitting func-tion.) This process does not always succeed, and whether it does succeed depends



greatly on the starting values used. By default, PROC MODEL uses the starting value.0001 for all parameters.

Consequently, in order to use PROC MODEL to achieve convergence of parameterestimates, you need to know two things: how to recognize convergence failure byinterpreting diagnostic output, and how to specify reasonable starting values. TheMODEL procedure includes alternate iterative techniques and grid search capabilitiesto aid in finding estimates. See the section "Troubleshooting Convergence Problems"for more details.

Nonlinear Systems Regression

If a model has more than one endogenous variable, several facts need to be consid-ered in the choice of an estimation method. If the model has endogenous regressors,then an instrumental variables method such as 2SLS or 3SLS can be used to avoidsimultaneous equation bias. Instrumental variables must be provided to use thesemethods. A discussion of possible choices for instrumental variables is provided inthe "Choice of Instruments" section in this chapter.

The following is an example of the use of 2SLS and the INSTRUMENTS statement:

proc model data=test2 ;exogenous x1 x2;parms a1 a2 b2 2.5 c2 55 d1;

y1 = a1 * y2 + b2 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;

fit y1 y2 / 2sls;instruments b2 c2 _exog_;

run;

The estimation method selected is added after the slash (/) on the FIT statement. TheINSTRUMENTS statement follows the FIT statement and in this case selects all theexogenous variables as instruments with the–EXOG– keyword. The parameters B2and C2 on the instruments list request that the derivatives with respect to B2 and C2be additional instruments.

Full information maximum likelihood (FIML) can also be used to avoid simultane-ous equation bias. FIML is computationally more expensive than an instrumentalvariables method and assumes that the errors are normally distributed. On the otherhand, FIML does not require the specification of instruments. FIML is selected withthe FIML option on the FIT statement.



The preceding example is estimated with FIML using the following statements:

proc model data=test2 ;exogenous x1 x2;parms a1 a2 b2 2.5 c2 55 d1;

y1 = a1 * y2 + b2 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;

fit y1 y2 / fiml;run;

General Form Models

The single equation example shown in the preceding section was written in normal-ized form and specified as an assignment of the regression function to the dependentvariable LHUR. However, sometimes it is impossible or inconvenient to write a non-linear model in normalized form.

To write a general form equation, give the equation a name with the prefix "EQ." .This EQ.-prefixed variable represents the equation error. Write the equation as anassignment to this variable.

For example, suppose you have the following nonlinear model relating the variablesx andy:

� = a+ b ln(cy + dx)

Naming this equation ‘one’, you can fit this model with the following statements:

proc model data=xydata;eq.one = a + b * log( c * y + d * x );fit one;

run;

The use of the EQ. prefix tells PROC MODEL that the variable is an error term andthat it should not expect actual values for the variable ONE in the input data set.

Demand and Supply ModelsGeneral form specifications are often useful when you have several equations for thesame dependent variable. This is common in demand and supply models, where boththe demand equation and the supply equation are written as predictions for quantityas functions of price.

For example, consider the following demand and supply system:

(demand) quantity = �1 + �2 price + �3 income + �1

(supply) quantity = �1 + �2 price + �2



Assume thequantityof interest is the amount of energy consumed in the U.S.; theprice is the price of gasoline, and theincomevariable is the consumer debt. Whenthe market is at equilibrium, these equations determine the market price and the equi-librium quantity. These equations are written in general form as

�1 = quantity � (�1 + �2 price+ �3 income)

�2 = quantity � (�1 + �2 price)

Note that the endogenous variablesquantityandprice depend on two error terms sothat OLS should not be used. The following example uses three-stage least-squaresestimation.

Data for this model is obtained from the SASHELP.CITIMON data set.

title1 ’Supply-Demand Model using General-form Equations’;proc model data=sashelp.citimon;

endogenous eegp eec;exogenous exvus cciutc;parameters a1 a2 a3 b1 b2 ;label eegp = ’Gasoline Retail Price’

eec = ’Energy Consumption’cciutc = ’Consumer Debt’;

/* -------- Supply equation ------------- */eq.supply = eec - (a1 + a2 * eegp + a3 * cciutc);

/* -------- Demand equation ------------- */eq.demand = eec - (b1 + b2 * eegp );

/* -------- Instrumental variables -------*/lageegp = lag(eegp); lag2eegp=lag2(eegp);

/* -------- Estimate parameters --------- */fit supply demand / n3sls fsrsq;instruments _EXOG_ lageegp lag2eegp;

run;

The FIT statement specifies the two equations to estimate and the method of estima-tion, N3SLS. Note that ‘3SLS’ is an alias for N3SLS. The option FSRSQ is selectedto get a report of the first stage R2 to determine the acceptability of the selected in-struments.

Since three-stage least squares is an instrumental variables method, instruments arespecified with the INSTRUMENTS statement. The instruments selected are all theexogenous variables, selected with the–EXOG– option, and two lags of the variableEEGP, LAGEEGP and LAG2EEGP.

The data set CITIMON has four observations that generate missing values becausevalues for either EEGP, EEC, or CCIUTC are missing. This is revealed in the "Obser-vations Processed" output shown in Figure 14.7. Missing values are also generated



when the equations cannot be computed for a given observation. Missing observa-tions are not used in the estimation.

Supply-Demand Model using General-form Equations

The MODEL Procedure3SLS Estimation Summary


Read 145Solved 143First 3Last 145Used 139Missing 4Lagged 2

Figure 14.7. Supply-Demand Observations Processed

The lags used to create the instruments also reduce the number of observations used.In this case, the first 2 observations were used to fill the lags of EEGP.

The data set has a total of 145 observations, of which 4 generated missing valuesand 2 were used to fill lags, which left 139 observations for the estimation. In theestimation summary, in Figure 14.8, the total degrees of freedom for the model anderror is 139.


The MODEL Procedure

Nonlinear 3SLS Summary of Residual Errors

DF DF AdjEquation Model Error SSE MSE Root MSE R-Square R-Sq

supply 3 136 39.5791 0.2910 0.5395demand 2 137 43.2677 0.3158 0.5620

Nonlinear 3SLS Parameter Estimates

1stApprox Approx Stage

Parameter Estimate Std Err t Value Pr > |t| R-Square

a1 6.82196 0.3788 18.01 <.0001 1.0000a2 -0.00614 0.00303 -2.02 0.0450 0.9617a3 9E-7 3.165E-7 2.84 0.0051 1.0000b1 7.30952 0.3799 19.24 <.0001 1.0000b2 -0.00853 0.00328 -2.60 0.0103 0.9617

Figure 14.8. Supply-Demand Parameter Estimates

One disadvantage of specifying equations in general form is that there are no actualvalues associated with the equation, so the R2 statistic cannot be computed.



Solving Simultaneous Nonlinear Equation Systems

You can use a SOLVE statement to solve the nonlinear equation system for somevariables when the values of other variables are given.

Consider the demand and supply model shown in the preceding example. The fol-lowing statement computes equilibrium price (EEGP) and quantity (EEC) values forgiven observed cost (CCIUTC) values and stores them in the output data set EQUI-LIB.

title1 ’Supply-Demand Model using General-form Equations’;proc model data=sashelp.citimon;

endogenous eegp eec;exogenous exvus cciutc;parameters a1 a2 a3 b1 b2 ;label eegp = ’Gasoline Retail Price’

eec = ’Energy Consumption’cciutc = ’Consumer Debt’;

/* -------- Supply equation ------------- */eq.supply = eec - (a1 + a2 * eegp + a3 * cciutc);

/* -------- Demand equation ------------- */eq.demand = eec - (b1 + b2 * eegp );

/* -------- Instrumental variables -------*/lageegp = lag(eegp); lag2eegp=lag2(eegp);

/* -------- Estimate parameters --------- */instruments _EXOG_ lageegp lag2eegp;fit supply demand / n3sls ;solve eegp eec / out=equilib;

run;

As a second example, suppose you want to compute points of intersection between thesquare root function and hyperbolas of the forma+ b=x. That is, solve the system:

(square root) y =px

(hyperbola) y = a+b

x

The following statements read parameters for several hyperbolas in the input data setTEST and solve the nonlinear equations. The SOLVEPRINT option on the SOLVEstatement prints the solution values. The ID statement is used to include the valuesof A and B in the output of the SOLVEPRINT option.



data test;input a b @@;datalines;0 1 1 1 1 2

;

proc model data=test;eq.sqrt = sqrt(x) - y;eq.hyperbola = a + b / x - y;solve x y / solveprint;id a b;

run;

The printed output produced by this example consists of a model summary report,a listing of the solution values for each observation, and a solution summary report.The model summary for this example is shown in Figure 14.9.


The MODEL Procedure

Model Summary

Model Variables 2ID Variables 2Equations 2Number of Statements 2

Model Variables x yEquations sqrt hyperbola

Figure 14.9. Model Summary Report

The output produced by the SOLVEPRINT option is shown in Figure 14.10.



The MODEL ProcedureSimultaneous Simulation

Observation 1 a 0 b 1.0000 eq.hyperbola 0.000000Iterations 17 CC 0.000000

Solution Values

x y

1.000000 1.000000

Observation 2 a 1.0000 b 1.0000 eq.hyperbola 0.000000Iterations 5 CC 0.000000

Solution Values

x y

2.147899 1.465571

Observation 3 a 1.0000 b 2.0000 eq.hyperbola 0.000000Iterations 4 CC 0.000000

Solution Values

x y

2.875130 1.695621

Figure 14.10. Solution Values for Each Observation

For each observation, a heading line is printed that lists the values of the ID vari-ables for the observation and information on the iterative process used to computethe solution. Following the heading line for the observation, the solution values areprinted.

The heading line shows the solution method used (Newton’s method by default),the number of iterations required, and the convergence measure, labeled CC=. Thisconvergence measure indicates the maximum error by which solution values fail tosatisfy the equations. When this error is small enough (as determined by the CON-VERGE= option), the iterations terminate. The equation with the largest error isindicated in parentheses. For example, for observation 3 the HYPERBOLA equationhas an error of4:42�10�13 while the error of the SQRT equation is even smaller.

The last part of the SOLVE statement output is the solution summary report shownin Figure 14.11. This report summarizes the iteration history and the model solved.




Data Set Options

DATA= TEST

Solution Summary

Variables Solved 2Implicit Equations 2Solution Method NEWTONCONVERGE= 1E-8Maximum CC 9.176E-9Maximum Iterations 17Total Iterations 26Average Iterations 8.666667


Read 3Solved 3

Variables Solved For x yEquations Solved sqrt hyperbola

Figure 14.11. Solution Summary Report

Monte Carlo Simulation

The RANDOM= option is used to request Monte Carlo (or stochastic) simulation togenerate confidence intervals for a forecast. The confidence intervals are implied bythe model’s relationship to the the implicit random error term� and the parameters.

The Monte Carlo simulation generates a random set of additive error values, one foreach observation and each equation, and computes one set of perturbations of theparameters. These new parameters, along with the additive error terms, are then usedto compute a new forecast that satisfies this new simultaneous system. Then a newset of additive error values and parameter perturbations is computed, and the processis repeated the requested number of times.

Consider the following exchange rate model for the U.S. dollar with the German markand the Japanese yen:

rate–jp = a1 + b1im–jp+ c1di–jp;

rate–wg = a2 + b2im–wg + c1di–wg;

whererate–jp andrate–wgare the exchange rate of the Japanese yen and the Germanmark versus the U.S. dollar respectively;im–jp andim–wgare the imports from Japanand Germany in 1984 dollars respectively; anddi–jp anddi–wg are the differencesin inflation rate of Japan and the U.S., and Germany and the U.S. respectively. The



Monte Carlo capabilities of the MODEL procedure are used to generate error boundson a forecast using this model.

proc model data=exchange;endo im_jp im_wg;exo di_jp di_wg;parms a1 a2 b1 b2 c1 c2;label rate_jp = ’Exchange Rate of Yen/$’

rate_wg = ’Exchange Rate of Gm/$’im_jp = ’Imports to US from Japan in 1984 $’im_wg = ’Imports to US from WG in 1984 $’di_jp = ’Difference in Inflation Rates US-JP’di_wg = ’Difference in Inflation Rates US-WG’;

rate_jp = a1 + b1*im_jp + c1*di_jp;rate_wg = a2 + b2*im_wg + c2*di_wg;

/* Fit the EXCHANGE data */fit rate_jp rate_wg / sur outest=xch_est outcov outs=s;

/* Solve using the WHATIF data set */solve rate_jp rate_wg / data=whatif estdata=xch_est sdata=s

random=100 seed=123 out=monte forecast;id yr;range yr=1986;

run;

Data for the EXCHANGE data set was obtained from the Department of Commerceand the yearly "Economic Report of the President."

First, the parameters are estimated using SUR selected by the SUR option on the FITstatement. The OUTEST= option is used to create the XCH–EST data set whichcontains the estimates of the parameters. The OUTCOV option adds the covariancematrix of the parameters to the XCH–EST data set. The OUTS= option is used tosave the covariance of the equation error in the data set S.

Next, Monte Carlo simulation is requested using the RANDOM= option on theSOLVE statement. The data set WHATIF, shown below, is used to drive the fore-casts. The ESTDATA= option reads in the XCH–EST data set which contains theparameter estimates and covariance matrix. Because the parameter covariance ma-trix is included, perturbations of the parameters are performed. The SDATA= optioncauses the Monte Carlo simulation to use the equation error covariance in the S dataset to perturb the equation errors. The SEED= option selects the number 123 as seedvalue for the random number generator. The output of the Monte Carlo simulation iswritten to the data set MONTE selected by the OUT= option.

/* data for simulation */data whatif;

input yr rate_jp rate_wg imn_jp imn_wg emp_us emp_jpemp_wg prod_us prod_jp prod_wg cpi_us cpi_jp cpi_wg;

label cpi_us = ’US CPI 1982-1984 = 100’



cpi_jp = ’JP CPI 1982-1984 = 100’cpi_wg = ’WG CPI 1982-1984 = 100’;

im_jp = imn_jp/cpi_us;im_wg = imn_wg/cpi_us;ius = 100*(cpi_us-(lag(cpi_us)))/(lag(cpi_us));ijp = 100*(cpi_jp-(lag(cpi_jp)))/(lag(cpi_jp));iwg = 100*(cpi_wg-(lag(cpi_wg)))/(lag(cpi_wg));di_jp = ius - ijp;di_wg = ius - iwg;

datalines;1980 226.63 1.8175 30714 11693 103.3 101.3 100.4 101.7

125.4 109.8 .824 .909 .8681981 220.63 2.2631 35000 11000 102.8 102.2 97.9 104.6

126.3 112.8 .909 .954 .9221982 249.06 2.4280 40000 12000 95.8 101.4 95.0 107.1

146.8 113.3 .965 .980 .9701983 237.55 2.5539 45000 13100 94.4 103.4 91.1 111.6

152.8 116.8 .996 .999 1.0031984 237.45 2.8454 50000 14300 99.0 105.8 90.4 118.5

152.2 124.7 1.039 1.021 1.0271985 238.47 2.9419 55000 15600 98.1 107.6 91.3 124.2

161.1 128.5 1.076 1.042 1.0481986 . . 60000 17000 96.8 107.3 92.7 128.8

163.8 130.7 1.096 1.049 1.0471987 . . 65000 18500 97.1 106.1 92.8 132.0

176.5 129.9 1.136 1.050 1.0491988 . . 70000 20000 99.6 108.8 92.7 136.2

190.0 135.9 1.183 1.057 1.063;

To generate a confidence interval plot for the forecast, use PROC UNIVARIATE togenerate percentile bounds and use PROC GPLOT to plot the graph. The followingSAS statements produce the graph in Figure 14.12.

proc sort data=monte;by yr;

run;

proc univariate data=monte noprint;by yr;var rate_jp rate_wg;output out=bounds mean=mean p5=p5 p95=p95;

run;

title "Monte Carlo Generated ConfidenceIntervals on a Forecast";

proc gplot data=bounds;plot mean*yr p5*yr p95*yr /overlay;symbol1 i=join value=triangle;symbol2 i=join value=square l=4;symbol3 i=join value=square l=4;

run;


Chapter 14. Syntax

Figure 14.12. Monte Carlo Confidence Interval Plot

Syntax

The following statements can be used with the MODEL procedure:

PROC MODEL options;ABORT ;ARRAY arrayname variables : : : ;ATTRIB variable-list attribute-list [variable-list attribute-list];BOUNDS bound1, bound2 : : : ;BY variables;CALL name [( expression [, expression : : : ] ) ] ;CONTROL variable [ value ] : : : ;DELETE ;DO [variable = expression [ TO expression ] [ BY expression ]

[, expression TO expression ] [ BY expression ] : : : ][ WHILE expression ] [ UNTIL expression ] ;

END ;DROP variable : : : ;ENDOGENOUS variable [ initial values ] : : : ;ESTIMATE item [ , item : : : ] [ ,/ options ] ;EXOGENOUS variable [ initial values ] : : : ;FIT equations [ PARMS=(parameter values : : : ) ]

START=(parameter values : : : )[ DROP=(parameters)] [ / options ];

FORMAT variables [ format ] [ DEFAULT = default-format ];GOTO statement–label ;



ID variables;IF expression ;IF expression THEN programming–statement ;

ELSE programming–statement ;variable = expression ;variable + expression ;INCLUDE model files : : : ;INSTRUMENTS [ instruments ] [–EXOG– ]

[EXCLUDE=(parameters) ] [/ options ] ;KEEP variable : : : ;LABEL variable =’label’ : : : ;LENGTH variables [$ ] length : : : [DEFAULT=length ];LINK statement–label ;OUTVARS variable : : : ;PARAMETERS variable [ value ] variable [ value ] : : : ;PUT print–item : : : [ @ ] [ @@ ] ;RANGE variable [ = first ] [TO last ];RENAME old-name =new-name : : : [ old-name=new-name ];RESET options;RESTRICT restriction1 [ , restriction2 : : : ] ;RETAIN variables values [ variables values: : :] ;RETURN ;SOLVE variables [SATISFY=(equations) ] [/ options ] ;SUBSTR( variable, index, length ) = expression ;SELECT [ ( expression ) ] ;

OTHERWISE programming–statement ;STOP ;TEST [ "name" ] test1 [, test2 : : : ] [,/ options ] ;VAR variable [ initial values ] : : : ;WEIGHT variable;WHEN ( expression ) programming–statement ;

Functional Summary

The statements and options in the MODEL procedure are summarized in the follow-ing table.

Description Statement Option

Data Set Optionsspecify the input data set for the variables FIT, SOLVE DATA=specify the input data set for parameters FIT, SOLVE ESTDATA=specify the method for handling missingvalues

FIT MISSING=

specify the input data set for parameters MODEL PARMSDATA=


Chapter 14. Syntax


specify the output data set for residual, pre-dicted, or actual values

FIT OUT=

specify the output data set for solution moderesults

SOLVE OUT=

write the actual values to OUT= data set FIT OUTACTUALselect all output options FIT OUTALLwrite the covariance matrix of the estimates FIT OUTCOVwrite the parameter estimates to a data set FIT OUTEST=write the parameter estimates to a data set MODEL OUTPARMS=write the observations used to start the lags SOLVE OUTLAGSwrite the predicted values to the OUT= dataset

FIT OUTPREDICT

write the residual values to the OUT= data set FIT OUTRESIDwrite the covariance matrix of the equation er-rors to a data set

FIT OUTS=

write theS matrix used in the objective func-tion definition to a data set

FIT OUTSUSED=

write the estimate of the variance matrix of themoment generating function

FIT OUTV=

read the covariance matrix of the equationerrors

FIT, SOLVE SDATA=

read the covariance matrix for GMM andITGMM

FIT VDATA=

specify the name of the time variable FIT, SOLVE,MODEL

TIME=

select the estimation type to read FIT, SOLVE TYPE=

General ESTIMATE Statement Optionsspecify the name of the data set in which theestimate of the functions of the parameters areto be written

ESTIMATE OUTEST=

write the covariance matrix of the functions ofthe parameters to the OUTEST= data set

ESTIMATE OUTCOV

print the covariance matrix of the functions ofthe parameters

ESTIMATE COVB

print the correlation matrix of the functions ofthe parameters

ESTIMATE CORRB

Printing Options for FIT Tasksprint the modified Breusch-Pagan test forheteroscedasticity

FIT BREUSCH

print collinearity diagnostics FIT COLLINprint the correlation matrices FIT CORRprint the correlation matrix of the parameters FIT CORRB




print the correlation matrix of the residuals FIT CORRSprint the covariance matrices FIT COVprint the covariance matrix of the parameters FIT COVBprint the covariance matrix of the residuals FIT COVSprint Durbin-Watsond statistics FIT DWprint first-stage R2 statistics FIT FSRSQprint Godfrey’s tests for autocorrelated residu-als for each equation

FIT GODFREY

print tests of normality of the model residuals FIT NORMALspecify all the printing options FIT PRINTALLprint White’s test for heteroscedasticity FIT WHITE

Options to Control FIT Iteration Outputprint the inverse of the crossproducts Jacobianmatrix

FIT I

print a summary iteration listing FIT ITPRINTprint a detailed iteration listing FIT ITDETAILSprint the crossproduct Jacobian matrix FIT XPXspecify all the iteration printing-controloptions

FIT ITALL

Options to Control the MinimizationProcessspecify the convergence criteria FIT CONVERGE=select the Hessian approximation used forFIML

FIT HESSIAN=

specifies the local truncation error bound forthe integration

FIT, SOLVE,MODEL

LTEBOUND=

specify the maximum number of iterationsallowed

FIT MAXITER=

specify the maximum number of subiterationsallowed

FIT MAXSUBITER=

select the iterative minimization method to use FIT METHOD=specifies the smallest allowed time step to beused in the integration

FIT, SOLVE,MODEL

MINTIMESTEP=

modify the iterations for estimation methodsthat iterate theSmatrix or theV matrix

FIT NESTIT

specify the smallest pivot value MODEL, FIT,SOLVE

SINGULAR

specify the number of minimization iterationsto perform at each grid point

FIT STARTITER=

specify a weight variable WEIGHT


Chapter 14. Syntax


Options to Read and Write Model Filesread a model from one or more input modelfiles

INCLUDE MODEL=

suppress the default output of the model file MODEL, RESET NOSTOREspecify the name of an output model file MODEL, RESET OUTMODEL=delete the current model RESET PURGE

Options to List or Analyze the Struc-ture of the Modelprint a dependency structure of a model MODEL BLOCKprint a graph of the dependency structure of amodel

MODEL GRAPH

print the model program and variable lists MODEL LISTprint the derivative tables and compiled modelprogram code

MODEL LISTCODE

print a dependency list MODEL LISTDEPprint a table of derivatives MODEL LISTDERprint a cross-reference of the variables MODEL XREF

General Printing Control Optionsexpand parts of the printed output FIT, SOLVE DETAILSprint a message for each statement as it isexecuted

FIT, SOLVE FLOW

select the maximum number of execution er-rors that can be printed

FIT, SOLVE MAXERRORS=

select the number of decimal places shown inthe printed output

FIT, SOLVE NDEC=

suppress the normal printed output FIT, SOLVE NOPRINTspecify all the noniteration printing options FIT, SOLVE PRINTALLprint the result of each operation as it isexecuted

FIT, SOLVE TRACE

request a comprehensive memory usagesummary

FIT, SOLVE,MODEL, RESET

MEMORYUSE

turns off the NOPRINT option RESET PRINT

Statements that Declare Variablesassociate a name with a list of variables andconstants

ARRAY

declare a variable to have a fixed value CONTROLdeclare a variable to be a dependent or endoge-nous variable

ENDOGENOUS

declare a variable to be an independent or ex-ogenous variable

EXOGENOUS

specify identifying variables ID




assign a label to a variable LABELselect additional variables to be output OUTVARSdeclare a variable to be a parameter PARAMETERSforce a variable to hold its value from a previ-ous observation

RETAIN

declare a model variable VARdeclare an instrumental variable INSTRUMENTSomit the default intercept term in the instru-ments list

INSTRUMENTS NOINT

General FIT Statement Optionsomit parameters from the estimation FIT DROP=associate avariable with an initial value as aparameter or a constant

FIT INITIAL=

bypass OLS to get initial parameter estimatesfor GMM, ITGMM, or FIML

FIT NOOLS

bypass 2SLS to get initial parameter estimatesfor GMM, ITGMM, or FIML

FIT NO2SLS

specify the parameters to estimate FIT PARMS=request confidence intervals on estimatedparameters

FIT PRL=

select a grid search FIT START=

Options to Control the EstimationMethod Usedspecify nonlinear ordinary least squares FIT OLSspecify iterated nonlinear ordinary leastsquares

FIT ITOLS

specify seemingly unrelated regression FIT SURspecify iterated seemingly unrelatedregression

FIT ITSUR

specify two-stage least squares FIT 2SLSspecify iterated two-stage least squares FIT IT2SLSspecify three-stage least squares FIT 3SLSspecify iterated three-stage least squares FIT IT3SLSspecify full information maximum likelihood FIT FIMLselect the variance-covariance estimator usedfor FIML

FIT COVBEST=

specify generalized method of moments FIT GMMspecify the kernel for GMM and ITGMM FIT KERNEL=specify iterated generalized method ofmoments

FIT ITGMM

specify the denominator for computing vari-ances and covariances

FIT VARDEF=


Chapter 14. Syntax


Solution Mode Optionsselect a subset of the model equations SOLVE SATISFY=solve only for missing variables SOLVE FORECASTsolve for all solution variables SOLVE SIMULATE

Solution Mode Options: LagProcessinguse solved values in the lag functions SOLVE DYNAMICuse actual values in the lag functions SOLVE STATICproduce successive forecasts to a fixed forecasthorizon

SOLVE NAHEAD=

select the observation to start dynamicsolutions

SOLVE START=

Solution Mode Options: NumericalMethodsspecify the maximum number of iterationsallowed

SOLVE MAXITER=

specify the maximum number of subiterationsallowed

SOLVE MAXSUBITER=

specify the convergence criteria SOLVE CONVERGE=compute a simultaneous solution using aJacobi-like iteration

SOLVE JACOBI

compute a simultaneous solution using aGauss-Seidel-like iteration

SOLVE SEIDEL

compute a simultaneous solution using New-ton’s method

SOLVE NEWTON

compute a nonsimultaneous solution SOLVE SINGLE

Monte Carlo Simulation Optionsspecify psuedo or quasi-random numbergenerator

SOLVE QUASI=

repeat the solution multiple times SOLVE RANDOM=initialize the pseudo-random numbergenerator

SOLVE SEED=

Solution Mode Printing Optionsprint between data points integration valuesfor the DERT. variables and the auxiliaryvariables

FIT, SOLVE,MODEL

INTGPRINT

print the solution approximation and equationerrors

SOLVE ITPRINT




print the solution values and residuals at eachobservation

SOLVE SOLVEPRINT

print various summary statistics SOLVE STATSprint tables of Theil inequality coefficients SOLVE THEILspecify all printing control options SOLVE PRINTALL

General TEST Statement Optionsspecify that a Wald test be computed TEST WALDspecify that a Lagrange multiplier test becomputed

TEST LM

specify that a likelihood ratio test be computed TEST LRrequests all three types of tests TEST ALLspecify the name of an output SAS data set thatcontains the test results

TEST OUT=

Miscellaneous Statementsspecify the range of observations to be used RANGEsubset the data set withbyvariables BY

PROC MODEL Statement

PROC MODEL options;

The following options can be specified in the PROC MODEL statement. All of thenonassignment options (the options that do not accept a value after an equal sign) canhave NO prefixed to the option name in the RESET statement to turn the option off.The default case is not explicitly indicated in the discussion that follows. Thus, forexample, the option DETAILS is documented in the following, but NODETAILS isnot documented since it is the default. Also, the NOSTORE option is documentedbecause STORE is the default.

Data Set OptionsDATA= SAS-data-set

names the input data set. Variables in the model program are looked up in the DATA=data set and, if found, their attributes (type, length, label, format) are set to be thesame as those in the input data set (if not previously defined otherwise). The valuesfor the variables in the program are read from the input data set when the model isestimated or simulated by FIT and SOLVE statements.


Chapter 14. Syntax

OUTPARMS= SAS-data-setwrites the parameter estimates to a SAS data set. See "Output Data Sets" for details.

PARMSDATA= SAS-data-setnames the SAS data set that contains the parameter estimates. See "Input Data Sets"for details.

Options to Read and Write Model FilesMODEL= model-nameMODEL= (model-list)

reads the model from one or more input model files created by previous PROCMODEL executions. Model files are written by the OUTMODEL= option.

NOSTOREsuppresses the default output of the model file. This option is only applicable whenFIT or SOLVE statements are not used, the MODEL= option is not used, and when amodel is specified.

OUTMODEL= model-namespecifies the name of an output model file to which the model is to be written. Modelfiles are stored as members of a SAS catalog, with the type MODEL.

V5MODEL= model-namereads model files written by Version 5 of SAS/ETS software.

Options to List or Analyze the Structure of the ModelThese options produce reports on the structure of the model or list the programmingstatements defining the models. These options are automatically reset (turned off)after the reports are printed. To turn these options back on after a RUN statement hasbeen entered, use the RESET statement or specify the options on a FIT or SOLVEstatement.

BLOCKprints an analysis of the structure of the model given by the assignments to modelvariables appearing in the model program. This analysis includes a classification ofmodel variables into endogenous (dependent) and exogenous (independent) groupsbased on the presence of the variable on the left-hand side of an assignment state-ment. The endogenous variables are grouped into simultaneously determined blocks.The dependency structure of the simultaneous blocks and exogenous variables is alsoprinted. The BLOCK option cannot analyze dependencies implied by general formequations.

GRAPHprints the graph of the dependency structure of the model. The GRAPH option alsoinvokes the BLOCK option and produces a graphical display of the information listedby the BLOCK option.

LISTprints the model program and variable lists, including the statements added by PROCMODEL and macros.



LISTALLselects the LIST, LISTDEP, LISTDER, and LISTCODE options.

LISTCODEprints the derivative tables and compiled model program code. LISTCODE is a de-bugging feature and is not normally needed.

LISTDEPprints a report that lists for each variable in the model program the variables thatdepend on it and that it depends on. These lists are given separately for current-periodvalues and for lagged values of the variables.

The information displayed is the same as that used to construct the BLOCK report butdiffers in that the information is listed for all variables (including parameters, controlvariables, and program variables), not just the model variables. Classification intoendogenous and exogenous groups and analysis of simultaneous structure is not doneby the LISTDEP report.

LISTDERprints a table of derivatives for FIT and SOLVE tasks. (The LISTDER option isonly applicable for the default NEWTON method for SOLVE tasks.) The derivativestable shows each nonzero derivative computed for the problem. The derivative listedcan be a constant, a variable in the model program, or a special derivative variablecreated to hold the result of the derivative expression. This option is turned on by theLISTCODE and PRINTALL options.

XREFprints a cross-reference of the variables in the model program showing where eachvariable was referenced or given a value. The XREF option is normally used inconjunction with the LIST option. A more detailed description is given in the "Diag-nostics and Debugging" section.

General Printing Control OptionsDETAILS

specifies the detailed printout. Parts of the printed output are expanded when theDETAILS option is specified.

FLOWprints a message for each statement in the model program as it is executed. Thisdebugging option is needed very rarely and produces voluminous output.

MAXERRORS= nspecifies the maximum number of execution errors that can be printed. The default isMAXERRORS=50.

NDEC= nspecifies the precision of the format that PROC MODEL uses when printing variousnumbers. The default is NDEC=3, which means that PROC MODEL attempts toprint values using the D format but ensures that at least three significant digits areshown. If the NDEC= value is greater than nine, the BEST. format is used. Thesmallest value allowed is NDEC=2.


Chapter 14. Syntax

The NDEC= option affects the format of most, but not all, of the floating point num-bers that PROC MODEL can print. For some values (such as parameter estimates), aprecision limit one or two digits greater than the NDEC= value is used. This optiondoes not apply to the precision of the variables in the output data set.

NOPRINTsuppresses the normal printed output but does not suppress error listings. Using anyother print option turns the NOPRINT option off. The PRINT option can be usedwith the RESET statement to turn off NOPRINT.

PRINTALLturns on all the printing-control options. The options set by PRINTALL are DE-TAILS; the model information options LIST, LISTDEP, LISTDER, XREF, BLOCK,and GRAPH; the FIT task printing options FSRSQ, COVB, CORRB, COVS,CORRS, DW, and COLLIN; and the SOLVE task printing options STATS, THEIL,SOLVEPRINT, and ITPRINT.

TRACEprints the result of each operation in each statement in the model program as it isexecuted, in addition to the information printed by the FLOW option. This debuggingoption is needed very rarely and produces voluminous output.

MEMORYUSEprints a report of the memory required for the various parts of the analysis.

FIT Task OptionsThe following options are used in the FIT statement (parameter estimation) andcan also be used in the PROC MODEL statement: COLLIN, CONVERGE=,CORR, CORRB, CORRS, COVB, COVBEST=, COVS, DW, FIML, FSRSQ,GMM, HESSIAN=, I, INTGPRINT, ITALL, ITDETAILS, ITGMM, ITPRINT,ITOLS, ITSUR, IT2SLS, IT3SLS, KERNEL=, LTEBOUND=, MAXITER=, MAX-SUBITER=, METHOD=, MINTIMESTEP=, NESTIT, N2SLS, N3SLS, OLS, OUT-PREDICT, OUTRESID, OUTACTUAL, OUTLAGS, OUTERRORS, OUTALL,OUTCOV, SINGULAR=, STARTITER=, SUR, TIME=, VARDEF, and XPX. See"FIT Statement Syntax" later in this chapter for a description of these options.

When used in the PROC MODEL or RESET statement, these are default options forsubsequent FIT statements. For example, the statement

proc model n2sls ... ;

makes two-stage least squares the default parameter estimation method for FIT state-ments that do not specify an estimation method.

SOLVE Task OptionsThe following options used in the SOLVE statement can also be used in thePROC MODEL statement: CONVERGE=, DYNAMIC, FORECAST, INTG-PRINT, ITPRINT, JACOBI, LTEBOUND=, MAXITER=, MAXSUBITER=,MINTIMESTEP=, NAHEAD=, NEWTON, OUTPREDICT, OUTRESID, OUTAC-TUAL, OUTLAGS, OUTERRORS, OUTALL, SEED=, SEIDEL, SIMULATE, SIN-GLE, SINGULAR=, SOLVEPRINT, START=, STATIC, STATS, THEIL, TIME=,



and TYPE=. See "SOLVE Statement Syntax" later in this chapter for a descriptionof these options.

When used in the PROC MODEL or RESET statement, these options provide defaultvalues for subsequent SOLVE statements.

BOUNDS Statement

BOUNDS bound1 [, bound2 ... ] ;

The BOUNDS statement imposes simple boundary constraints on the parameter es-timates. BOUNDS statement constraints refer to the parameters estimated by theassociated FIT statement (that is, to either the preceding FIT statement or, in the ab-sence of a preceding FIT statement, to the following FIT statement). You can specifyany number of BOUNDS statements.

Eachboundis composed of parameters and constants and inequality operators:

item operator item [ operator item [ operator item : : : ] ]

Eachitem is a constant, the name of an estimated parameter, or a list of parameternames. Eachoperatoris ’<’, ’>’, ’<=’, or ’>=’.

You can use both the BOUNDS statement and the RESTRICT statement to imposeboundary constraints; however, the BOUNDS statement provides a simpler syntaxfor specifying these kinds of constraints. See "The RESTRICT Statement" for infor-mation on the computational details of estimation with inequality restrictions.

Lagrange multipliers are reported for all the active boundary constraints. In theprinted output and in the OUTEST= data set, the Lagrange multiplier estimates areidentified with the names BOUND1, BOUND2, and so forth. The probability of theLagrange multipliers are computed using a beta distribution (LaMotte 1994). To givethe constraints more descriptive names, use the RESTRICT statement instead of theBOUNDS statement.

The following BOUNDS statement constrains the estimates of the parameters A andB and the ten parameters P1 through P10 to be between zero and one. This exampleillustrates the use of parameter lists to specify boundary constraints.

bounds 0 < a b p1-p10 < 1;

The following is an example of the use of the BOUNDS statement:

title ’Holzman Function (1969), Himmelblau No. 21, N=3’;data zero;

do i = 1 to 99;output;

end;run;


Chapter 14. Syntax

proc model data=zero ;parms x1= 100 x2= 12.5 x3= 3;bounds .1 <= x1 <= 100,

0 <= x2 <= 25.6,0 <= x3 <= 5;

t = 2 / 3;u = 25 + (-50 * log(0.01 * i )) ** t;v = (u - x2) ** x3;w = exp(-v / x1);eq.foo = -.01 * i + w;

fit foo / method=marquardt;run;

Holzman Function (1969), Himmelblau No. 21, N=3

The MODEL Procedure



x1 49.99999 0 . .x2 25 0 . .x3 1.5 0 . .


Used 99 Objective 5.455E-18Missing 0 Objective*N 5.4E-16

Figure 14.13. Output from Bounded Estimation

BY Statement

BY variables;

A BY statement is used with the FIT statement to obtain separate estimates for ob-servations in groups defined by the BY variables. Note that if an output model file iswritten, using the OUTMODEL= option, the parameter values stored are those fromthe last BY group processed. To save parameter estimates for each BY group, use theOUTEST= option in the FIT statement.

A BY statement is used with the SOLVE statement to obtain solutions for observa-tions in groups defined by the BY variables. The BY statement only applies to theDATA= data set.

BY group processing is done separately for the FIT and the SOLVE tasks. It is notpossible to use the BY statement to estimate and solve a model for each instance of



a BY variable. If BY group processing is done for the FIT and the SOLVE tasks, theparameters obtained from the last BY group processed by the FIT statement are usedby the SOLVE statement for all of the BY groups.

CONTROL Statement

CONTROL variable [ value ] ... ;

The CONTROL statement declares control variables and specifies their values. Acontrol variable is like a parameter except that it has a fixed value and is not estimatedfrom the data. You can use control variables for constants in model equations thatyou may want to change in different solution cases. You can use control variables tovary the program logic. Unlike the retained variables, these values are fixed acrossiterations.

ENDOGENOUS Statement

ENDOGENOUS variable [ initial-values ] ... ;

The ENDOGENOUS statement declares model variables and identifies them as en-dogenous. You can declare model variables with an ENDOGENOUS statement in-stead of with a VAR statement to help document the model or to indicate the defaultsolution variables. The variables declared endogenous are solved when a SOLVEstatement does not indicate which variables to solve. Valid abbreviations for the EN-DOGENOUS statement are ENDOG and ENDO.

The ENDOGENOUS statement optionally provides initial values for lagged depen-dent variables. See "Lag Logic" in the "Functions Across Time" section for moreinformation.

ESTIMATE Statement

ESTIMATE item [ , item ... ] [ ,/ options ] ;

The ESTIMATE statement computes estimates of functions of the parameters.

The ESTIMATE statement refers to the parameters estimated by the associated FITstatement (that is, to either the preceding FIT statement or, in the absence of a pre-ceding FIT statement, to the following FIT statement). You can use any number ofESTIMATE statements.

If you specify options on the ESTIMATE statement, a comma is required before the"/" character separating the test expressions from the options, since the "/" charactercan also be used within test expressions to indicate division. Eachitem is written asan optional name followed by an expression,


Chapter 14. Syntax

[ "name" ] expression

where"name"is a string used to identify the estimate in the printed output and in theOUTEST= data set.

Expressions can be composed of parameter names, arithmetic operators, functions,and constants. Comparison operators (such as "=" or "<") and logical operators (suchas “&”) cannot be used in ESTIMATE statement expressions. Parameters named inESTIMATE expressions must be among the parameters estimated by the associatedFIT statement.

You can use the following options in the ESTIMATE statement:

OUTEST=specifies the name of the data set in which the estimate of the functions of the param-eters are to be written. The format for this data set is identical to the OUTEST= dataset for the FIT statement.

If you specify anamein the ESTIMATE statement, that name is used as the parametername for the estimate in the OUTEST= data set. If nonameis provided and theexpression is just a symbol, the symbol name is used; otherwise, the string "–Estimate#" is used, where "#" is the variable number in the OUTEST= data set.

OUTCOVwrites the covariance matrix of the functions of the parameters to the OUTEST= dataset in addition to the parameter estimates.

COVBprints the covariance matrix of the functions of the parameters.

CORRBprints the correlation matrix of the functions of the parameters.

The following is an example of the use of the ESTIMATE statement in a segmentedmodel:

data a;input y x @@;datalines;.46 1 .47 2 .57 3 .61 4 .62 5 .68 6 .69 7.78 8 .70 9 .74 10 .77 11 .78 12 .74 13 .80 13.80 15 .78 16;

title ’Segmented Model -- Quadratic with Plateau’;proc model data=a;

x0 = -.5 * b / c;

if x < x0 then y = a + b*x + c*x*x;else y = a + b*x0 + c*x0*x0;

fit y start=( a .45 b .5 c -.0025 );



estimate ’Join point’ x0 ,’plateau’ a + b*x0 + c*x0**2 ;

run;

Segmented Model -- Quadratic with Plateau

The MODEL Procedure

Nonlinear OLS Estimates

Approx ApproxTerm Estimate Std Err t Value Pr > |t| Label

Join point 12.7504 1.2785 9.97 <.0001 x0plateau 0.777516 0.0123 63.10 <.0001 a + b*x0 + c*x0**2

Figure 14.14. ESTIMATE Statement Output

EXOGENOUS Statement

EXOGENOUS variable [initial-values] ... ;

The EXOGENOUS statement declares model variables and identifies them as exoge-nous. You can declare model variables with an EXOGENOUS statement instead ofwith a VAR statement to help document the model or to indicate the default instru-mental variables. The variables declared exogenous are used as instruments whenan instrumental variables estimation method is requested (such as N2SLS or N3SLS)and an INSTRUMENTS statement is not used. Valid abbreviations for the EXOGE-NOUS statement are EXOG and EXO.

The EXOGENOUS statement optionally provides initial values for lagged exogenousvariables. See "Lag Logic" in the "Functions Across Time" section for more infor-mation.

FIT Statement

FIT [ equations ] [ PARMS=( parameter [values] ... ) ][ START=( parameter values ... ) ][ DROP=( parameter ... ) ][ INITIAL= ( variable = [ parameter | constant ] ... )[ / options ] ;

The FIT statement estimates model parameters by fitting the model equations to inputdata and optionally selects the equations to be fit. If the list of equations is omitted,all model equations containing parameters are fit.

The following options can be used in the FIT statement.


Chapter 14. Syntax

DROP= ( parameters ... )specifies that the named parameters not be estimated. All the parameters in the equa-tions fit are estimated except those listed in the DROP= option. The dropped param-eters retain their previous values and are not changed by the estimation.

INITIAL= ( variable = [parameter | constant ] ... )associates avariablewith an initial value as aparameteror aconstant.

NOOLSNO2SLS

specifies bipassing OLS or 2SLS to get initial parameter estimates for GMM, IT-GMM, or FIML. This is important for certian models that are poorly defined in OLSor 2SLS or if good initial parameter values are already provided. Note that for GMM,the V matrix is created using the initial values specified and this may not be consis-tently estimated.

PARMS= ( parameters [values] ... )selects a subset of the parameters for estimation. When the PARMS= option isused, only the named parameters are estimated. Any parameters not specified inthe PARMS= list retain their previous values and are not changed by the estimation.

PRL= WALD | LR | BOTHrequests confidence intervals on estimated parameters. By default the PRL optionproduces 95% likelihood ratio confidence limits. The coverage of the confidenceinterval is controlled by the ALPHA= option in the FIT statement.

START= ( parameter values ... )supplies starting values for the parameter estimates. If the START= option specifiesmore than one starting value for one or more parameters, a grid search is performedover all combinations of the values, and the best combination is used to start theiterations. For more information, see the STARTITER= option.

Options to Control the Estimation Method UsedCOVBEST=GLS | CROSS | FDA

specifies the variance-covariance estimator used for FIML. COVBEST=GLS selectsthe generalized least-squares estimator. COVBEST=CROSS selects the crossprod-ucts estimator. COVBEST=FDA selects the inverse of the finite difference approxi-mation to the Hessian. The default is COVBEST=CROSS.

FIMLspecifies full information maximum likelihood estimation.

GMMspecifies generalized method of moments estimation.

ITGMMspecifies iterated generalized method of moments estimation.

ITOLSspecifies iterated ordinary least-squares estimation. This is the same as OLS unlessthere are cross-equation parameter restrictions.



ITSURspecifies iterated seemingly unrelated regression estimation

IT2SLSspecifies iterated two-stage least-squares estimation. This is the same as 2SLS unlessthere are cross-equation parameter restrictions.

IT3SLSspecifies iterated three-stage least-squares estimation.

KERNEL=(PARZEN | BART | QS, [c], [e] )KERNEL=PARZEN | BART | QS

specifies the kernel to be used for GMM and ITGMM. PARZEN selects the Parzenkernel, BART selects the Bartlett kernel, and QS selects the Quadratic Spectral ker-nel. e � 0 andc � 0 are used to compute the bandwidth parameter. The default isKERNEL=(PARZEN, 1, 0.2). See the "Estimation Methods" section for more details.

N2SLS | 2SLSspecifies nonlinear two-stage least-squares estimation. This is the default when anINSTRUMENTS statement is used.

N3SLS | 3SLSspecifies nonlinear three-stage least-squares estimation.

OLSspecifies ordinary least-squares estimation. This is the default when no INSTRU-MENTS statement is used.

SURspecifies seemingly unrelated regression estimation.

VARDEF=N | WGT | DF | WDFspecifies the denominator to be used in computing variances and covariances.VARDEF=N specifies that the number of nonmissing observations be used.VARDEF=WGT specifies that the sum of the weights be used. VARDEF=DFspecifies that the number of nonmissing observations minus the model degreesof freedom (number of parameters) be used. VARDEF=WDF specifies that thesum of the weights minus the model degrees of freedom be used. The default isVARDEF=DF. VARDEF=N is used for FIML estimation.


specifies the input data set. Values for the variables in the program are read fromthis data set. If the DATA= option is not specified on the FIT statement, the data setspecified by the DATA= option on the PROC MODEL statement is used.

ESTDATA= SAS-data-setspecifies a data set whose first observation provides initial values for some or all ofthe parameters.


Chapter 14. Syntax

MISSING= PAIRWISE | DELETEThe option MISSING=PAIRWISE specifies that missing values are tracked on anequation-by-equation basis. The MISSING=DELETE option specifies that the entireobservation is omitted from the analysis when any equation has a missing predictedor actual value for the equation. The default is MISSING=DELETE.

OUT= SAS-data-setnames the SAS data set to contain the residuals, predicted values, or actual valuesfrom each estimation. Only the residuals are output by default.

OUTACTUALwrites the actual values of the endogenous variables of the estimation to the OUT=data set. This option is applicable only if the OUT= option is specified.

OUTALLselects the OUTACTUAL, OUTERRORS, OUTLAGS, OUTPREDICT, and OUT-RESID options.

OUTCOVCOVOUT

writes the covariance matrix of the estimates to the OUTEST= data set in addition tothe parameter estimates. The OUTCOV option is applicable only if the OUTEST=option is also specified.

OUTEST= SAS-data-setnames the SAS data set to contain the parameter estimates and optionally the covari-ance of the estimates.

OUTLAGSwrites the observations used to start the lags to the OUT= data set. This option isapplicable only if the OUT= option is specified.

OUTPREDICTwrites the predicted values to the OUT= data set. This option is applicable only ifOUT= is specified.

OUTRESIDwrites the residual values computed from the parameter estimates to the OUT= dataset. The OUTRESID option is the default if neither OUTPREDICT nor OUTAC-TUAL is specified. This option is applicable only if the OUT= option is specified.

OUTS= SAS-data-setnames the SAS data set to contain the estimated covariance matrix of the equationerrors. This is the covariance of the residuals computed from the parameter estimates.

OUTSUSED= SAS-data-setnames the SAS data set to contain the S matrix used in the objective function defini-tion. The OUTSUSED= data set is the same as the OUTS= data set for the methodsthat iterate the S matrix.

OUTV= SAS-data-setnames the SAS data set to contain the estimate of the variance matrix for GMM andITGMM.



SDATA= SAS-data-setspecifies a data set that provides the covariance matrix of the equation errors. Thematrix read from the SDATA= data set is used for the equation covariance matrix (Smatrix) in the estimation. (The SDATA= S matrix is used to provide only the initialestimate ofS for the methods that iterate the S matrix.)

TIME= namespecifies the name of the time variable. This variable must be in the data set.

TYPE= namespecifies the estimation type to read from the SDATA= and ESTDATA= data sets.The name specified in the TYPE= option is compared to the–TYPE– variable in theESTDATA= and SDATA= data sets to select observations to use in constructing thecovariance matrices. When the TYPE= option is omitted, the last estimation type inthe data set is used. Valid values are the estimation methods used in PROC MODEL.

VDATA= SAS-data-setspecifies a data set containing a variance matrix for GMM and ITGMM estimation.

Printing Options for FIT TasksBREUSCH= ( variable-list )

specifies the modified Breusch-Pagan test, wherevariable-list is a list of variablesused to model the error variance.

COLLINprints collinearity diagnostics for the Jacobian crossproducts matrix (XPX) after theparameters have converged. Collinearity diagnostics are also automatically printed ifthe estimation fails to converge.

CORRprints the correlation matrices of the residuals and parameters. Using CORR is thesame as using both CORRB and CORRS.

CORRBprints the correlation matrix of the parameter estimates.

CORRSprints the correlation matrix of the residuals.

COVprints the covariance matrices of the residuals and parameters. Specifying COV isthe same as specifying both COVB and COVS.

COVBprints the covariance matrix of the parameter estimates.

COVSprints the covariance matrix of the residuals.

DWprints Durbin-Watsond statistics, which measure autocorrelation of the residuals.When the residual series is interrupted by missing observations, the Durbin-Watsonstatistic calculated isdprimesym as suggested by Savin and White (1978). This is the


Chapter 14. Syntax

usual Durbin-Watson computed by ignoring the gaps. Savin and White show that ithas the same null distribution as the DW with no gaps in the series and can be used totest for autocorrelation using the standard tables. The Durbin-Watson statistic is notvalid for models containing lagged endogenous variables.

FSRSQprints the first-stage R2 statistics for instrumental estimation methods. These R2smeasure the proportion of the variance retained when the Jacobian columns associ-ated with the parameters are projected through the instruments space.

GODFREYGODFREY= n

performs Godfrey’s tests for autocorrelated residuals for each equation, wheren isthe maximum autoregressive order, and specifies that Godfrey’s tests be computedfor lags 1 throughn. The default number of lags is one.

NORMALperforms tests of normality of the model residuals.

PRINTALLspecifies the printing options COLLIN, CORRB, CORRS, COVB, COVS, DETAILS,DW, and FSRSQ.

WHITEspecifies White’s test.

Options to control iteration outputDetails of the output produced are discussed in the section "Iteration History".

Iprints the inverse of the crossproducts Jacobian matrix at each iteration.

ITALLspecifies all iteration printing-control options (I, ITDETAILS, ITPRINT, and XPX).ITALL also prints the crossproducts matrix (labeled CROSS), the parameter changevector, and the estimate of the cross-equation covariance of residuals matrix at eachiteration.

ITDETAILSprints a detailed iteration listing. This includes the ITPRINT information and addi-tional statistics.

ITPRINTprints the parameter estimates, objective function value, and convergence criteria ateach iteration.

XPXprints the crossproducts Jacobian matrix at each iteration.



Options to Control the Minimization ProcessThe following options may be helpful when you experience a convergence problem:

CONVERGE= value1CONVERGE= (value1, value2)

specifies the convergence criteria. The convergence measure must be less thanvalue1before convergence is assumed.value2is the convergence criterion for theS andVmatrices forS andV iterated methods.value2defaults tovalue1. See "The Conver-gence Criteria" for details. The default value is CONVERGE=.001.

HESSIAN= CROSS | GLS | FDAspecifies the Hessian approximation used for FIML. HESSIAN=CROSS selects thecrossproducts approximation to the Hessian, HESSIAN=GLS selects the generalizedleast-squares approximation to the Hessian, and HESSIAN=FDA selects the finitedifference approximation to the Hessian. HESSIAN=GLS is the default.

LTEBOUND= nspecifies the local truncation error bound for the integration. This option is ignored ifno ODE’s are specified.

MAXITER= nspecifies the maximum number of iterations allowed. The default is MAXITER=100.

MAXSUBITER= nspecifies the maximum number of subiterations allowed for an iteration. For theGAUSS method, the MAXSUBITER= option limits the number of step halvings.For the MARQUARDT method, the MAXSUBITER= option limits the number oftimes� can be increased. The default is MAXSUBITER=30. See "MinimizationMethods" for details.

METHOD= GAUSS | MARQUARDTspecifies the iterative minimization method to use. METHOD=GAUSS specifiesthe Gauss-Newton method, and METHOD=MARQUARDT specifies the Marquardt-Levenberg method. The default is METHOD=GAUSS. See "Minimization Methods"for details.

MINTIMESTEP= nspecifies the smallest allowed time step to be used in the integration. This option isignored if no ODE’s are specified.

NESTITchanges the way the iterations are performed for estimation methods that iterate theestimate of the equation covariance (S matrix). The NESTIT option is relevant onlyfor the methods that iterate the estimate of the covariance matrix (ITGMM, ITOLS,ITSUR, IT2SLS, IT3SLS). See "Details on the Covariance of Equation Errors" for anexplanation of NESTIT.

SINGULAR= valuespecifies the smallest pivot value allowed. The default 1.0E-12.


Chapter 14. Syntax

STARTITER= nspecifies the number of minimization iterations to perform at each grid point. Thedefault is STARTITER=0, which implies that no minimization is performed at thegrid points. See "Using the STARTITER option" for more details.

Other OptionsOther options that can be used on the FIT statement include the following that list andanalyze the model: BLOCK, GRAPH, LIST, LISTCODE, LISTDEP, LISTDER, andXREF. The following printing control options are also available: DETAILS, FLOW,INTGPRINT, MAXERRORS=, NOPRINT, PRINTALL, and TRACE. For completedescriptions of these options, see the discussion of the PROC MODEL statementoptions earlier in this chapter.

ID Statement

ID variables;

The ID statement specifies variables to identify observations in error messages orother listings and in the OUT= data set. The ID variables are normally SAS date ordatetime variables. If more than one ID variable is used, the first variable is used toidentify the observations; the remaining variables are added to the OUT= data set.

INCLUDE Statement

INCLUDE model-names ... ;

The INCLUDE statement reads model files and inserts their contents into the currentmodel. However, instead of replacing the current model as the RESET MODEL=option does, the contents of included model files are inserted into the model programat the position that the INCLUDE statement appears.

INSTRUMENTS Statement

The INSTRUMENTS statement specifies the instrumental variables to be used in theN2SLS, N3SLS, IT2SLS, IT3SLS, GMM, and ITGMM estimation methods. Thereare two forms of the INSTRUMENTS statement:

INSTRUMENTS variables [ –EXOG– ] ;INSTRUMENTS [instruments] [ –EXOG– ]

[ EXCLUDE=( parameters ) ] [ / options ] ;



The first form of the INSTRUMENTS statement is used only before a FIT statementand defines the default instruments list. The items specified as instruments can bevariables or the special keyword–EXOG–. –EXOG– indicates that all the modelvariables declared EXOGENOUS are to be added to the instruments list.

The second form of the INSTRUMENTS statement is used only after the FIT state-ment and before the next RUN statement. The items specified as instruments forthe second form can be variables, names of parameters to be estimated, or the specialkeyword–EXOG–. If you specify the name of a parameter in the instruments list, thepartial derivatives of the equations with respect to the parameter (that is, the columnsof the Jacobian matrix associated with the parameter) are used as instruments. Theparameter itself is not used as an instrument. These partial derivatives should notdepend on any of the parameters to be estimated. Only the names of parameters to beestimated can be specified.

EXCLUDE= (parameters)specifies that the derivatives of the equations with respect to all of the parameters to beestimated, except the parameters listed in the EXCLUDE list, be used as instruments,in addition to the other instruments specified. If you use the EXCLUDE= option,you should be sure that the derivatives with respect to the non-excluded parametersin the estimation are independent of the endogenous variables and not functions ofthe parameters estimated.

The following option is specified on the INSTRUMENTS statement following a slash(/):

NOINTERCEPTNOINT

excludes the constant of 1.0 (intercept) from the instruments list. An intercept isalways included as an instrument unless NOINTERCEPT is specified.

When a FIT statement specifies an instrumental variables estimation method and noINSTRUMENTS statement accompanies the FIT statement, the default instrumentsare used. If no default instruments list has been specified, all the model variablesdeclared EXOGENOUS are used as instruments.

See "Choice of Instruments" for more details.

LABEL Statement

LABEL variable=’label’ ... ;

The LABEL statement specifies a label of up to 255 characters for parameters andother variables used in the model program. Labels are used to identify parts of theprintout of FIT and SOLVE tasks. The labels will be displayed in the output if theLINESIZE= option is large enough.


Chapter 14. Syntax

OUTVARS Statement

OUTVARS variables;

The OUTVARS statement specifies additional variables defined in the model programto be output to the OUT= data sets. The OUTVARS statement is not needed unlessthe variables to be added to the output data set are not referred to by the model, orunless you wish to include parameters or other special variables in the OUT= data set.The OUTVARS statement includes additional variables, whereas the KEEP statementexcludes variables.

PARAMETERS Statement

PARAMETERS variable [value] [variable [value]] ... ;

The PARAMETERS statement declares the parameters of a model and optionally setstheir values. Valid abbreviations are PARMS and PARM.

Each parameter has a single value associated with it, which is the same for all ob-servations. Lagging is not relevant for parameters. If a value is not specified in thePARMS statement (or by the PARMS= option of a FIT statement), the value defaultsto 0.0001 for FIT tasks and to a missing value for SOLVE tasks.

RANGE Statement

RANGE variable [= first] [TO last ];

The RANGE statement specifies the range of observations to be read from the DATA=data set. For FIT tasks, the RANGE statement controls the period of fit for the esti-mation. For SOLVE tasks, the RANGE statement controls the simulation period orforecast horizon.

The RANGE variable must be a numeric variable in the DATA= data set that identifiesthe observations, and the data set must be sorted by the RANGE variable. The firstobservation in the range is identified byfirst, and the last observation is identified bylast.

PROC MODEL uses the firstl observations prior tofirst to initialize the lags, wherel is the maximum number of lags needed to evaluate any of the equations to be fit orsolved, or the maximum number of lags needed to compute any of the instrumentswhen an instrumental variables estimation method is used. There should be at leastl observations in the data set beforefirst. If last is not specified, all the nonmissingobservations starting withfirst are used.

If first is omitted, the firstl observations are used to initialize the lags, and the rest ofthe data, untillast, is used. If a RANGE statement is used but bothfirst andlast are



omitted, the RANGE statement variable is used to report the range of observationsprocessed.

The RANGE variable should be nonmissing for all observations. Observations con-taining missing RANGE values are deleted.

The following are examples of RANGE statements:

range year = 1971 to 1988; /* yearly data */range date = ’1feb73’d to ’1nov82’d; /* monthly data */range time = 60.5; /* time in years */range year to 1977; /* use all years through 1977 */range date; /* use values of date to report period-of-fit */

RESET Statement

RESET options;

All of the options of the PROC MODEL statement can be reset by the RESET state-ment. In addition, the RESET statement supports one additional option:

PURGEdeletes the current model so that a new model can be defined.

When the MODEL= option is used in the RESET statement, the current model isdeleted before the new model is read.

RESTRICT Statement

RESTRICT restriction1 [, restriction2 ... ] ;

The RESTRICT statement is used to impose linear and nonlinear restrictions on theparameter estimates.

RESTRICT statements refer to the parameters estimated by the associated FIT state-ment (that is, to either the preceding FIT statement or, in the absence of a precedingFIT statement, to the following FIT statement). You can specify any number of RE-STRICT statements.

Eachrestriction is written as an optional name, followed by an expression, followedby an equality operator (=) or an inequality operator (<, >, <=, >=), followed by asecond expression:

["name"] expression operator expression

The optional"name" is a string used to identify the restriction in the printed outputand in the OUTEST= data set. Theoperatorcan be =, <, >, <= , or >=. The operatorand second expression are optional, as in the TEST statement (=0).


Chapter 14. Syntax

Restriction expressions can be composed of parameter names, arithmetic operators,functions, and constants. Comparison operators (such as "=" or "<") and logical oper-ators (such as “&”) cannot be used in RESTRICT statement expressions. Parametersnamed in restriction expressions must be among the parameters estimated by the as-sociated FIT statement. Expressions can refer to variables defined in the program.

The restriction expressions can be linear or nonlinear functions of the parameters.

The following is an example of the use of the RESTRICT statement:

proc model data=one;endogenous y1 y2;exogenous x1 x2;parms a b c;restrict b*(b+c) <= a;

eq.one = -y1/c + a/x2 + b * x1**2 + c * x2**2;eq.two = -y2 * y1 + b * x2**2 - c/(2 * x1);

fit one two / fiml;run;

SOLVE Statement

SOLVE [variables] [SATISFY= equations] [INITIAL= (variable=[parameter]][/options];

The SOLVE statement specifies that the model be simulated or forecast for input datavalues and, optionally, selects the variables to be solved. If the list of variables isomitted, all of the model variables declared ENDOGENOUS are solved. If no modelvariables are declared ENDOGENOUS, then all model variables are solved.

The following specification can be used in the SOLVE statement:

SATISFY= equationSATISFY= ( equations )

specifies a subset of the model equations that the solution values are to satisfy. Ifthe SATISFY= option is not used, the solution is computed to satisfy all the modelequations. Note that the number of equations must equal the number of variablessolved.


names the input data set. The model is solved for each observation read from theDATA= data set. If the DATA= option is not specified on the SOLVE statement, thedata set specified by the DATA= option on the PROC MODEL statement is used.

ESTDATA= SAS-data-setnames a data set whose first observation provides values for some or all of the pa-rameters and whose additional observations (if any) give the covariance matrix of the



parameter estimates. The covariance matrix read from the ESTDATA= data set isused to generate multivariate normal pseudo-random shocks to the model parameterswhen the RANDOM= option requests Monte Carlo simulation.

OUT= SAS-data-setoutputs the predicted (solution) values, residual values, actual values, or equationerrors from the solution to a data set. Only the solution values are output by default.

OUTACTUALoutputs the actual values of the solved variables read from the input data set to theOUT= data set. This option is applicable only if the OUT= option is specified.

OUTALLspecifies the OUTACTUAL, OUTERRORS, OUTLAGS, OUTPREDICT, and OUT-RESID options

OUTERRORSwrites the equation errors to the OUT= data set. These values are normally very closeto zero when a simultaneous solution is computed; they can be used to double-checkthe accuracy of the solution process. It is applicable only if the OUT= option isspecified.

OUTLAGSwrites the observations used to start the lags to the OUT= data set. This option isapplicable only if the OUT= option is specified.

OUTPREDICTwrites the solution values to the OUT= data set. This option is relevant only if theOUT= option is specified. The OUTPREDICT option is the default unless one of theother output options is used.

OUTRESIDwrites the residual values computed as the difference of the solution values and thevalues for the solution variables read from the input data set to the OUT= data set.This option is applicable only if the OUT= option is specified.

PARMSDATA= SAS-data-setspecifies a data set that contains the parameter estimates. See the "Input Data Sets"section for more details.

SDATA= SAS-data-setspecifies a data set that provides the covariance matrix of the equation errors. Thecovariance matrix read from the SDATA= data set is used to generate multivariatenormal pseudo-random shocks to the equations when the RANDOM= option requestsMonte Carlo simulation.

TYPE= namespecifies the estimation type. The name specified in the TYPE= option is compared tothe–TYPE– variable in the ESTDATA= and SDATA= data sets to select observationsto use in constructing the covariance matrices. When TYPE= is omitted, the lastestimation type in the data set is used.


Chapter 14. Syntax

Solution Mode Options: Lag ProcessingDYNAMIC

specifies a dynamic solution. In the dynamic solution mode, solved values are usedby the lagging functions. DYNAMIC is the default.

NAHEAD= nspecifies a simulation ofn-period-ahead dynamic forecasting. The NAHEAD= optionis used to simulate the process of using the model to produce successive forecasts toa fixed forecast horizon, with each forecast using the historical data available at thetime the forecast is made.

Note that NAHEAD=1 produces a static (one-step-ahead) solution. NAHEAD=2 pro-duces a solution using one-step-ahead solutions for the first lag (LAG1 functions re-turn static predicted values) and actual values for longer lags. NAHEAD=3 producesa solution using NAHEAD=2 solutions for the first lags, NAHEAD=1 solutions forthe second lags, and actual values for longer lags. In general, NAHEAD=n solutionsuse NAHEAD=n-1 solutions for LAG1, NAHEAD=n-2 solutions for LAG2, and soforth.

START= sspecifies static solutions until thesth observation and then changes to dynamic solu-tions. If the START=soption is specified, the first observation in the range in whichLAGn delivers solved predicted values iss+n, while LAGn returns actual values forearlier observations.

STATICspecifies a static solution. In static solution mode, actual values of the solved vari-ables from the input data set are used by the lagging functions.

Solution Mode Options: Use of Available DataFORECAST

specifies that the actual value of a solved variable is used as the solution value (in-stead of the predicted value from the model equations) whenever nonmissing data areavailable in the input data set. That is, in FORECAST mode, PROC MODEL solvesonly for those variables that are missing in the input data set.

SIMULATEspecifies that PROC MODEL always solves for all solution variables as a function ofthe input values of the other variables, even when actual data for some of the solutionvariables are available in the input data set. SIMULATE is the default.

Solution Mode Options: Numerical Solution MethodJACOBI

computes a simultaneous solution using a Jacobi iteration.

NEWTONcomputes a simultaneous solution using Newton’s method. When the NEWTONoption is selected, the analytic derivatives of the equation errors with respect to thesolution variables are computed and memory-efficient sparse matrix techniques areused for factoring the Jacobian matrix.



The NEWTON option can be used to solve both normalized-form and general-formequations and can compute goal-seeking solutions. NEWTON is the default.

SEIDELcomputes a simultaneous solution using a Gauss-Seidel method.

SINGLEONEPASS

specifies a single-equation (nonsimultaneous) solution. The model is executed onceto compute predicted values for the variables from the actual values of the otherendogenous variables. The SINGLE option can only be used for normalized-formequations and cannot be used for goal-seeking solutions.

For more information on these options, see the "Solution Modes" section later in thischapter.

Monte Carlo Simulation OptionsQUASI= NONE|SOBOL|FAURE

specifies a psuedo or quasi-random number generator. Two Quasi-randomnumber generators supported by the MODEL procedure, the Sobol sequence(QUASI=SOBOL) and the Faure sequence (QUASI=FAURE). The default isQUASI=NONE which is the psuedo random number generator.

RANDOM= nrepeats the solutionn times for each BY group, with different random perturbationsof the equation errors if the SDATA= option is used; with different random perturba-tions of the parameters if the ESTDATA= option is used and the ESTDATA= data setcontains a parameter covariance matrix; and with different values returned from therandom-number generator functions, if any are used in the model program. If RAN-DOM=0, the random-number generator functions always return zero. See "MonteCarlo Simulation" for details. The default is RANDOM=0.

SEED= nspecifies an integer to use as the seed in generating pseudo-random numbers to shockthe parameters and equations when the ESTDATA= or the SDATA= options are spec-ified. If n is negative or zero, the time of day from the computer’s clock is used asthe seed. The SEED= option is only relevant if the RANDOM= option is used. Thedefault is SEED=0.

Options for Controlling the Numerical Solution ProcessThe following options are useful when you have difficulty converging to the simulta-neous solution.

CONVERGE= valuespecifies the convergence criterion for the simultaneous solution. Convergence of thesolution is judged by comparing the CONVERGE= value to the maximum over theequations of

j�ijjyij+ 1E � 6


Chapter 14. Syntax

if it is computable, otherwise

j�ij

where�i represents the equation error andyi represents the solution variable cor-responding to theith equation for normalized-form equations. The default isCONVERGE=1E-8.

MAXITER= nspecifies the maximum number of iterations allowed for computing the simultaneoussolution for any observation. The default is MAXITER=50.

INITIAL= (variable= [parameter])specifies starting values for the parameters

MAXSUBITER= nspecifies the maximum number of damping subiterations that are performed in solv-ing a nonlinear system when using the NEWTON solution method. Damping is dis-abled by setting MAXSUBITER=0. The default is MAXSUBITER=10.

Printing OptionsINTGPRINT

prints between data points integration values for the DERT. variables and the auxiliaryvariables. If you specify the DETAILS option, the integrated derivative variables areprinted as well.

ITPRINTprints the solution approximation and equation errors at each iteration for each obser-vation. This option can produce voluminous output.

PRINTALLspecifies the printing control options DETAILS, ITPRINT, SOLVEPRINT, STATS,and THEIL.

SOLVEPRINTprints the solution values and residuals at each observation

STATSprints various summary statistics for the solution values

THEILprints tables of Theil inequality coefficients and Theil relative change forecast errormeasures for the solution values. See "Summary Statistics" in the "Details" sectionfor more information.

Other OptionsOther options that can be used on the SOLVE statement include the following that listand analyze the model: BLOCK, GRAPH, LIST, LISTCODE, LISTDEP, LISTDER,and XREF. The LTEBOUND= and MINTIMESTEP= options can be used to controlthe integration process. The following printing-control options are also available:DETAILS, FLOW, MAXERRORS=, NOPRINT, and TRACE. For complete descrip-tions of these options, see the PROC MODEL and FIT statement options describedearlier in this chapter.



TEST Statement

TEST ["name"] test1 [, test2 ... ] [,/ options ] ;

The TEST statement performs tests of nonlinear hypotheses on the model parameters.

The TEST statement applies to the parameters estimated by the associated FIT state-ment (that is, either the preceding FIT statement or, in the absence of a precedingFIT statement, the following FIT statement). You can specify any number of TESTstatements.

If you specify options on the TEST statement, a comma is required before the "/"character separating the test expressions from the options, because the "/" charactercan also be used within test expressions to indicate division.

Each test is written as an expression optionally followed by an equal sign (=) and asecond expression:

[expression] [= expression ]

Test expressions can be composed of parameter names, arithmetic operators, func-tions, and constants. Comparison operators (such as "=") and logical operators (suchas “&”) cannot be used in TEST statement expressions. Parameters named in test ex-pressions must be among the parameters estimated by the associated FIT statement.

If you specify only one expression in a test, that expression is tested against zero. Forexample, the following two TEST statements are equivalent:

test a + b;

test a + b = 0;

When you specify multiple tests on the same TEST statement, a joint test is per-formed. For example, the following TEST statement tests the joint hypothesis thatboth A and B are equal to zero.

test a, b;

To perform separate tests rather than a joint test, use separate TEST statements. Forexample, the following TEST statements test the two separate hypotheses that A isequal to zero and that B is equal to zero.

test a;test b;

You can use the following options in the TEST statement.


Chapter 14. Syntax

WALDspecifies that a Wald test be computed. WALD is the default.

LMRAOLAGRANGE

specifies that a Lagrange multiplier test be computed.

LRLIKE

specifies that a likelihood ratio test be computed.

ALLrequests all three types of tests.

OUT=specifies the name of an output SAS data set that contains the test results. The for-mat of the OUT= data set produced by the TEST statement is similar to that of theOUTEST= data set produced by the FIT statement.

VAR Statement

VAR variables [initial–values] ... ;

The VAR statement declares model variables and optionally provides initial valuesfor the variables’ lags. See the "Lag Logic" section for more information.

WEIGHT Statement

WEIGHT variable;

The WEIGHT statement specifies a variable to supply weighting values to use foreach observation in estimating parameters.

If the weight of an observation is nonpositive, that observation is not used for theestimation.variablemust be a numeric variable in the input data set.

An alternative weighting method is to use an assignment statement to give values tothe special variable–WEIGHT–. The–WEIGHT– variable must not depend on theparameters being estimated. If both weighting specifications are given, the weightsare multiplied together.



Estimation Details

Estimation Methods

Consider the general nonlinear model:

�t = q(yt ;xt ; �)zt = Z(xt)

whereq2Rg is a real vector valued function, ofyt2Rg, xt2Rl, �2Rp, g is the numberof equations,l is the number of exogenous variables (lagged endogenous variables areconsidered exogenous here),p is the number of parameters andt ranges from 1 ton.zt2Rk is a vector of instruments.�t is an unobservable disturbance vector with thefollowing properties:

E(�t) = 0

E(�t�0

t) = �

All of the methods implemented in PROC MODEL aim to minimize anobjectivefunction. The following table summarizes the objective functions defining the esti-mators and the corresponding estimator of the covariance of the parameter estimatesfor each method.

Table 14.1. Summary of PROC MODEL Estimation Methods

Method Instruments Objective Function Covariance of�OLS no r0r=n (X0(diag(S)�1I)X)�1

ITOLS no r0(diag(S)�1I)r=n (X0(diag(S)�1I)X)�1

SUR no r0(S�1OLSI)r=n (X0(S�1I)X)�1

ITSUR no r0(S�1I)r=n (X0(S�1I)X)�1

N2SLS yes r0(IW)r=n (X0(diag(S)�1W)X)�1

IT2SLS yes r0(diag(S)�1W)r=n (X0(diag(S)�1W)X)�1

N3SLS yes r0(S�1N2SLSW)r=n (X0(S�1W)X)�1

IT3SLS yes r0(S�1W)r=n (X0(S�1W)X)�1

GMM yes [nmn(�)]0V�1

N2SLS[nmn(�)]=n [(YX)0V�1(YX)]�1

ITGMM yes [nmn(�)]0V�1[nmn(�)]=n [(YX)0V�1(YX)]�1

FIML no constant+ n2 ln(det(S)) [Z0(S�1I)Z]�1

�Pn1 lnj(Jt)j

The column labeled "Instruments" identifies the estimation methods that require in-struments. The variables used in this table and the remainder of this chapter aredefined as follows:

n = is the number of nonmissing observations.

g = is the number of equations.

k = is the number of instrumental variables.


Chapter 14. Estimation Details

r =

2664r1r2...rg

3775 is theng � 1 vector of residuals for theg equations stacked together.

ri =

2664qi(y1 ;x1 ; �)qi(y2 ;x2 ; �)

...qi(yn ;xn ; �)

3775 is then� 1 column vector of residuals for theith equation.

S is a g � g matrix that estimates�, the covariances of the errorsacross equations (referred to as theS matrix).

X is anng � p matrix of partial derivatives of the residual with re-spect to the parameters.

W is ann� n matrix,Z(Z0Z)�1Z0.

Z is ann� k matrix of instruments.

Y is agk � ng matrix of instruments.Y = IgZ0.Z Z = (Z1; Z2; : : :; Zp) is anng�p matrix. Zi is a ng�1 column

vector obtained from stacking the columns of

U1

n

nXt=1

�@q(yt ;xt ; �)

0

@yt

��1 @2q(yt ;xt ; �)0@yt@�i

�Qi

U is ann�g matrix of residual errors.U = �1; �2; : : :; �n0

Q is then�g matrixq(y1 ;x1 ; �);q(y2 ;x2 ; �); : : :;q(yn ; n; �.

Qi is ann�g matrix @Q@�i

.

I is ann� n identity matrix.

Jt is @q(yt ;xt ;�)

@y0

t

which is ag � g Jacobian matrix.

mn is first moment of the crossproductq(yt;xt; �)zt.mn = 1

n

Pnt=1 q(yt ;xt ; �)zt

zt is ak column vector of instruments for observationt. z0t is also thetth row of Z.

V is the gk � gk matrix representing the variance of the momentfunctions.

k is the number of instrumental variables used.

constant is the constantng2 (1 + ln(2�)).

is the notation for a Kronecker product.

All vectors are column vectors unless otherwise noted. Other estimates of the covari-ance matrix for FIML are also available.

Dependent Regressors and Two-Stage Least SquaresOrdinary regression analysis is based on several assumptions. A key assumption isthat the independent variables are in fact statistically independent of the unobserved



error component of the model. If this assumption is not true–if the regressor variessystematically with the error–then ordinary regression produces inconsistent results.The parameter estimates arebiased.

Regressors might fail to be independent variables because they are dependent vari-ables in a larger simultaneous system. For this reason, the problem of dependentregressors is often calledsimultaneous equation bias. For example, consider the fol-lowing two-equation system.

y1 = a1 + b1y2 + c1x1 + �1

y2 = a2 + b2y1 + c2x2 + �2

In the first equation, y2 is a dependent, orendogenous, variable. As shown by thesecond equation, y2 is a function of y1, which by the first equation is a function of�1, and therefore y2 depends on�1. Likewise, y1 depends on�2 and is a dependentregressor in the second equation. This is an example of asimultaneous equationsystem; y1 and y2 are a function of all the variables in the system.

Using the ordinary least squares (OLS) estimation method to estimate these equationsproduces biased estimates. One solution to this problem is to replace y1 and y2 on theright-hand side of the equations with predicted values, thus changing the regressionproblem to the following:

y1 = a1 + b1y2 + c1x1 + �1

y2 = a2 + b2y1 + c2x2 + �2

This method requires estimating the predicted valuesy1 andy2 through a preliminary,or "first stage,"instrumental regression. An instrumental regression is a regressionof the dependent regressors on a set ofinstrumental variables, which can be anyindependent variables useful for predicting the dependent regressors. In this example,the equations are linear and the exogenous variables for the whole system are known.Thus, the best choice for instruments (of the variables in the model) are the variablesx1 and x2.

This method is known astwo-stage least squaresor 2SLS, or more generally as theinstrumental variables method. The 2SLS method for linear models is discussed inPindyck (1981, p. 191-192). For nonlinear models this situation is more complex,but the idea is the same. In nonlinear 2SLS, the derivatives of the model with respectto the parameters are replaced with predicted values. See the section "Choice ofInstruments" for further discussion of the use of instrumental variables in nonlinearregression.

To perform nonlinear 2SLS estimation with PROC MODEL, specify the instrumen-tal variables with an INSTRUMENTS statement and specify the 2SLS or N2SLSoption on the FIT statement. The following statements show how to estimate the firstequation in the preceding example with PROC MODEL.



proc model data=in;y1 = a1 + b1 * y2 + c1 * x1;fit y1 / 2sls;instruments x1 x2;

run;

The 2SLS or instrumental variables estimator can be computed using a first-stageregression on the instrumental variables as described previously. However, PROCMODEL actually uses the equivalent but computationally more appropriate techniqueof projecting the regression problem into the linear space defined by the instruments.Thus PROC MODEL does not produce any "first stage" results when you use 2SLS.If you specify the FSRSQ option on the FIT statement, PROC MODEL prints "first-stage R2" statistic for each parameter estimate.

Formally, the� that minimizes

Sn =1

n

nX

t=1

(q(yt ;xt ; �)zt)!0 nX

t=1

Iztz0t!�1 nX

t=1

(q(yt ;xt ; �)zt)!

is the N2SLS estimator of the parameters. The estimate of� at the final iterationis used in the covariance of the parameters given in Table 14.1. Refer to Amemiya(1985, p. 250) for details on the properties of nonlinear two-stage least squares.

Seemingly Unrelated RegressionIf the regression equations are not simultaneous, so there are no dependent regressors,seemingly unrelated regression(SUR) can be used to estimate systems of equationswith correlated random errors. The large-sample efficiency of an estimation can beimproved if these cross-equation correlations are taken into account. SUR is alsoknown asjoint generalized least squaresor Zellner regression. Formally, the� thatminimizes

Sn =1

n

nXt=1

q(yt ;xt ; �)0��1q(yt ;xt ; �)

is the SUR estimator of the parameters.

The SUR method requires an estimate of the cross-equation covariance matrix,�.PROC MODEL first performs an OLS estimation, computes an estimate,�, from theOLS residuals, and then performs the SUR estimation based on�. The OLS resultsare not printed unless you specify the OLS option in addition to the SUR option.

You can specify the� to use for SUR by storing the matrix in a SAS data set andnaming that data set in the SDATA= option. You can also feed the� computed fromthe SUR residuals back into the SUR estimation process by specifying the ITSURoption. You can print the estimated covariance matrix� using the COVS option onthe FIT statement.

The SUR method requires estimation of the� matrix, and this increases the samplingvariability of the estimator for small sample sizes. The efficiency gain SUR has overOLS is a large sample property, and you must have a reasonable amount of data to



realize this gain. For a more detailed discussion of SUR, refer to Pindyck (1981, p.331-333).

Three-Stage Least-Squares EstimationIf the equation system is simultaneous, you can combine the 2SLS and SUR methodsto take into account both dependent regressors and cross-equation correlation of theerrors. This is calledthree-stage least squares(3SLS).

Formally, the� that minimizes

Sn =1

n

nX

t=1

(q(yt ;xt ; �)zt)!0

nXt=1

(�ztz0t)!�1 nX

t=1

(q(yt ;xt ; �)zt)!

is the 3SLS estimator of the parameters. For more details on 3SLS, refer to Gallant(1987, p. 435).

Residuals from the 2SLS method are used to estimate the�matrix required for 3SLS.The results of the preliminary 2SLS step are not printed unless the 2SLS option is alsospecified.

To use the three-stage least-squares method, specify an INSTRUMENTS statementand use the 3SLS or N3SLS option on either the PROC MODEL statement or a FITstatement.

Generalized Method of Moments - GMMFor systems of equations with heteroscedastic errors, generalized method of moments(GMM) can be used to obtain efficient estimates of the parameters. See the "Het-eroscedasticity" section for alternatives to GMM.

Consider the nonlinear model

�t = q(yt ;xt ; �)zt = Z(xt)

wherezt is a vector of instruments and�t is an unobservable disturbance vector thatcan be serially correlated and nonstationary.

In general, the following orthogonality condition is desired:

E(�tzt) = 0

which states that the expected crossproducts of the unobservable disturbances,�t , andfunctions of the observable variables are set to 0. The first moment of the crossprod-ucts is

mn =1

n

nXt=1

m(yt ;xt ; �)

m(yt ;xt ; �) = q(yt ;xt ; �)zt

wherem(yt ;xt ; �)2Rgk.



The case wheregk > p is considered here, wherep is the number of parameters.

Estimate the true parameter vector�0 by the value of� that minimizes

S(�; V ) = [nmn(�)]0V �1[nmn(�)]=n

where

V = Cov�[nmn(�

0)]; [nmn(�0)]0�

The parameter vector that minimizes this objective function is the GMM estimator.GMM estimation is requested on the FIT statement with the GMM option.

The variance of the moment functions,V , can be expressed as

V = E

nX

t=1

�tzt!

nXs=1

�szs!0

=

nXt=1

nXs=1

E�(�tzt)(�szs)0

�= nS0n

whereS0n is estimated as

Sn =1

n

nXt=1

nXs=1

(q(yt ;xt ; �)zt)(q(ys ;xs ; �)zs)0

Note thatSn is agk�gk matrix. Because Var(Sn) will not decrease with increasingn we consider estimators ofS0n of the form:

Sn(l(n)) =

n�1X�=�n+1

w

��

l(n)

�DSn;�D

Sn;� =

8<:

nPt=1+�

[q(yt;xt; �#)zt][q(yt�� ;xt�� ; �#)zt�� ]0 � � 0

(Sn;�� )0 � < 0

where l(n) is a scalar function that computes the bandwidth parameter,w(�) is ascalar valued kernel, and the diagonal matrixD is used for a small sample degreesof freedom correction (Gallant 1987). The initial�# used for the estimation ofSn isobtained from a 2SLS estimation of the system. The degrees of freedom correctionis handled by the VARDEF= option as for theSmatrix estimation.

The following kernels are supported by PROC MODEL. They are listed with theirdefault bandwidth functions.



Bartlett: KERNEL=BART

w(x) =

�1� jxj jxj <= 10 otherwise

l(n) =1

2n1=3

Parzen: KERNEL=PARZEN

w(x) =

8<:1� 6jxj2 + 6jxj3 0 <= jxj <= 1

22(1 � jxj)3 1

2 <= jxj <= 10 otherwise

l(n) = n1=5

Quadratic Spectral: KERNEL=QS

w(x) =25

12�2x2

�sin(6�x=5)

6�x=5� cos(6�x=5)

�

l(n) =1

2n1=5

-l(m)

Quadratic Spectral

1

-l(m) l(m)

Parzen

1

l(m)-l(m)

Bartlett

1

l(m)

Figure 14.15. Kernels for Smoothing

Details of the properties of these and other kernels are given in Andrews (1991).Kernels are selected with the KERNEL= option; KERNEL=PARZEN is the default.The general form of the KERNEL= option is

KERNEL=( PARZEN | QS | BART, c, e )



where thee � 0 andc � 0 are used to compute the bandwidth parameter as

l(n) = cne

The bias of the standard error estimates increases for large bandwidth parameters.A warning message is produced for bandwidth parameters greater thann

1

3 . For adiscussion of the computation of the optimall(n), refer to Andrews (1991).

The "Newey-West" kernel (Newey (1987)) corresponds to the Bartlett kernel withbandwith parameterl(n) = L+ 1. That is, if the "lag length" for the Newey-Westkernel isL then the corresponding Model procedure syntax is KERNEL=( bart, L+1,0).

Andrews (1992) has shown that using prewhitening in combination with GMM canimprove confidence interval coverage and reduce over rejection oft-statistics at thecost of inflating the variance and MSE of the estimator. Prewhitening can be per-formed using the %AR macros.

For the special case that the errors are not serially correlated, that is

E(etzt)(eszs) = 0 t6=s

the estimate forS0n reduces to

Sn =1

n

nXt=1

[q(yt ;xt ; �)zt][q(yt ;xt ; �)zt]0

The option KERNEL=(kernel,0,) is used to select this type of estimation when usingGMM.

Testing Over-Identifying RestrictionsLet r be the number of unique instruments times the number of equations. The valuer represents the number of orthogonality conditions imposed by the GMM method.Under the assumptions of the GMM method,r � p linearly independent combina-tions of the orthogonality should be close to zero. The GMM estimates are computedby setting these combinations to zero. Whenr exceeds the number of parameters tobe estimated, the OBJECTIVE*N, reported at the end of the estimation, is an asymp-toticly valid statistic to test the null hypothesis that the over-identifying restrictionsof the model are valid. The OBJECTIVE*N is distributed as a chi-square withr � pdegrees of freedom (Hansen 1982, p. 1049).

Iterated Generalized Method of Moments - ITGMMIterated generalized method of moments is similar to the iterated versions of 2SLS,SUR, and 3SLS. The variance matrix for GMM estimation is re-estimatedg at eachiteration with the parameters determined by the GMM estimation. The iteration ter-minates when the variance matrix for the equation errors change less than the CON-VERGE= value. Iterated generalized method of moments is selected by the ITGMMoption on the FIT statement. For some indication of the small sample properties ofITGMM, refer to Ferson (1993).



Full Information Maximum Likelihood Estimation - FIMLA different approach to the simultaneous equation bias problem is the full informationmaximum likelihood (FIML) estimation method (Amemiya 1977).

Compared to the instrumental variables methods (2SLS and 3SLS), the FIML methodhas these advantages and disadvantages:

� FIML does not require instrumental variables.

� FIML requires that the model include the full equation system, with as manyequations as there are endogenous variables. With 2SLS or 3SLS you canestimate some of the equations without specifying the complete system.

� FIML assumes that the equations errors have a multivariate normal distribution.If the errors are not normally distributed, the FIML method may produce poorresults. 2SLS and 3SLS do not assume a specific distribution for the errors.

� The FIML method is computationally expensive.

The full information maximum likelihood estimators of� and� are the� and� thatminimize the negative log likelihood function:

ln(�; �) =ng2 ln(2�) �

nXt=1

ln

��@q(yt ;xt ; �)@y0

t

��+n

2ln (j�(�)j)

+1

2tr

�(�)�1

nXt=1

q(yt ;xt ; �)q0(yt ;xt ; �)

!

The option FIML requests full information maximum likelihood estimation. If theerrors are distributed normally, FIML produces efficient estimators of the parameters.If instrumental variables are not provided the starting values for the estimation areobtained from a SUR estimation. If instrumental variables are provided, then thestarting values are obtained from a 3SLS estimation. The negative log likelihoodvalue and the l2 norm of the gradient of the negative log likelihood function are shownin the estimation summary.

FIML DetailsTo compute the minimum ofln(�; �), this function isconcentratedusing the relation

�(�) =1

n

nXt=1

q(yt ;xt ; �)q0(yt ;xt ; �)

This results in the concentrated negative log likelihood function:

ln(�) =ng

2(1 + ln(2�)) �

nXt=1

ln

�� @@y0tq(yt ;xt ; �)��+ n

2lnj�(�j)

The gradient of the negative log likelihood function is

@

@�iln(�) =

nXt=1

ri(t)



ri(t) = �tr �

@q(yt ;xt ; �)

@y0

t

��1 @2q(yt ;xt ; �)@y

0

t@�i

!

+1

2tr

��(�)�1

@�(�)

@�i�I ��(�)�1q(yt ;xt ; �)q(yt ;xt ; �)

0��

+ q(yt ;xt ; �0)�(�)�1

@q(yt ;xt ; �)

@�i

where

@�(�)

@�i=

2

n

nXt=1

q(yt ;xt ; �)@q(yt ;xt ; �)

0

@�i

The estimator of the variance-covariance of� (COVB) for FIML can be selected withthe COVBEST= option with the following arguments:

CROSS selects the crossproducts estimator of the covariance matrix (de-fault) (Gallant 1987, p. 473):

C =

1

n

nXt=1

r(t)r0(t)

!�1

wherer(t) = [r1(t);r2(t); : : :;rp(t)]0

GLS selects the generalized least-squares estimator of the covariancematrix. This is computed as (Dagenais 1978)

C = [Z 0(�(�)�1I)Z]�1

whereZ = (Z1; Z2; : : :; Zp) is ng � p and eachZi column vectoris obtained from stacking the columns of

U1

n

nXt=1

�@q(yt ;xt ; �)

0

@y

��1 @2q(yt ;xt ; �)0@y

0

n@�i�Qi

U is ann� g matrix of residuals andqi is ann� g matrix @Q@�i

.

FDA selects the inverse of concentrated likelihood Hessian as an estima-tor of the covariance matrix. The Hessian is computed numerically,so for a large problem this is computationally expensive.

The HESSIAN= option controls which approximation to the Hessian is used in theminimization procedure. Alternate approximations are used to improve convergenceand execution time. The choices are as follows.



CROSS The crossproducts approximation is used.

GLS The generalized least-squares approximation is used (default).

FDA The Hessian is computed numerically by finite differences.

HESSIAN=GLS has better convergence properties in general, but COVBEST=CROSSproduces the most pessimistic standard error bounds. When the HESSIAN= optionis used, the default estimator of the variance-covariance of� is the inverse of theHessian selected.

Properties of the EstimatesAll of the methods are consistent. Small sample properties may not be good fornonlinear models. The tests and standard errors reported are based on the convergenceof the distribution of the estimates to a normal distribution in large samples.

These nonlinear estimation methods reduce to the corresponding linear systems re-gression methods if the model is linear. If this is the case, PROC MODEL producesthe same estimates as PROC SYSLIN.

Except for GMM, the estimation methods assume that the equation errors for eachobservation are identically and independently distributed with a 0 mean vector andpositive definite covariance matrix� consistently estimated byS. For FIML, the er-rors need to be normally distributed. There are no other assumptions concerning thedistribution of the errors for the other estimation methods.

The consistency of the parameter estimates relies on the assumption that theSmatrixis a consistent estimate of�. These standard error estimates are asymptotically valid,but for nonlinear models they may not be reliable for small samples.

TheS matrix used for the calculation of the covariance of the parameter estimates isthe best estimate available for the estimation method selected. ForS-iterated methodsthis is the most recent estimation of�. For OLS and 2SLS, an estimate of theSmatrix is computed from OLS or 2SLS residuals and used for the calculation of thecovariance matrix. For a complete list of theS matrix used for the calculation of thecovariance of the parameter estimates, see Table 14.1.

Missing ValuesAn observation is excluded from the estimation if any variable used for FIT tasks ismissing, if the weight for the observation is not greater than 0 when weights are used,or if a DELETE statement is executed by the model program. Variables used for FITtasks include the equation errors for each equation, the instruments, if any, and thederivatives of the equation errors with respect to the parameters estimated. Note thatvariables can become missing as a result of computational errors or calculations withmissing values.

The number of usable observations can change when different parameter values areused; some parameter values can be invalid and cause execution errors for some ob-servations. PROC MODEL keeps track of the number of usable and missing observa-tions at each pass through the data, and if the number of missing observations countedduring a pass exceeds the number that was obtained using the previous parametervector, the pass is terminated and the new parameter vector is considered infeasible.



PROC MODEL never takes a step that produces more missing observations than thecurrent estimate does.

The values used to compute the Durbin-Watson, R2, and other statistics of fit arefrom the observations used in calculating the objective function and do not includeany observation for which any needed variable was missing (residuals, derivatives,and instruments).

Details on the Covariance of Equation ErrorsThere are severalS matrices that can be involved in the various estimation methodsand in forming the estimate of the covariance of parameter estimates. TheseS ma-trices are estimates of�, the true covariance of the equation errors. Apart from thechoice of instrumental or noninstrumental methods, many of the methods providedby PROC MODEL differ in the way the variousSmatrices are formed and used.

All of the estimation methods result in a final estimate of�, which is included in theoutput if the COVS option is specified. The finalS matrix of each method providesthe initial Smatrix for any subsequent estimation.

This estimate of the covariance of equation errors is defined as

S = D(R0R)D

whereR = (r1; : : : ; rg) is composed of the equation residuals computed from thecurrent parameter estimates in ann� g matrix andD is a diagonal matrix that de-pends on the VARDEF= option.

For VARDEF=N, the diagonal elements ofD are1=pn, wheren is the number of

nonmissing observations. For VARDEF=WGT,n is replaced with the sum of theweights. For VARDEF=WDF,n is replaced with the sum of the weights minus themodel degrees of freedom. For the default VARDEF=DF, theith diagonal elementof D is 1=

pn� dfi, wheredfi is the degrees of freedom (number of parameters) for

the ith equation. Binkley and Nelson (1984) show the importance of using a degrees-of-freedom correction in estimating�. Their results indicate that the DF methodproduces more accurate confidence intervals for N3SLS parameter estimates in thelinear case than the alternative approach they tested. VARDEF=N is always used forthe computation of the FIML estimates.

For the fixedS methods, the OUTSUSED= option writes theS matrix used in theestimation to a data set. ThisS matrix is either the estimate of the covariance ofequation errors matrix from the preceding estimation, or a prior� estimate read infrom a data set when the SDATA= option is specified. For the diagonalS methods,all of the off-diagonal elements of theS matrix are set to 0 for the estimation of theparameters and for the OUTSUSED= data set, but the output data set produced bythe OUTS= option will contain the off-diagonal elements. For the OLS and N2SLSmethods, there is no previous estimate of the covariance of equation errors matrix,and the option OUTSUSED= will save an identity matrix unless a prior� estimateis supplied by the SDATA= option. For FIML the OUTSUSED= data set containstheSmatrix computed with VARDEF=N. The OUTS= data set contains theSmatrixcomputed with the selected VARDEF= option.



If the COVS option is used, the method is notS-iterated, andS is not an identity, theOUTSUSED= matrix is included in the printed output.

For the methods that iterate the covariance of equation errors matrix, theS matrixis iteratively re-estimated from the residuals produced by the current parameter es-timates. ThisS matrix estimate iteratively replaces the previous estimate until boththe parameter estimates and the estimate of the covariance of equation errors matrixconverge. The final OUTS= matrix and OUTSUSED= matrix are thus identical fortheS-iterated methods.

Nested IterationsBy default, forS-iterated methods, theS matrix is held constant until the parametersconverge once. Then theS matrix is re-estimated. One iteration of the parameterestimation algorithm is performed, and theSmatrix is again re-estimated. This latterprocess is repeated until convergence of both the parameters and theS matrix. Sincethe objective of the minimization depends on theS matrix, this has the effect ofchasing a moving target.

When the NESTIT option is specified, iterations are performed to convergence forthe structural parameters with a fixedSmatrix. TheSmatrix is then re-estimated, theparameter iterations are repeated to convergence, and so on until both the parametersand theSmatrix converge. This has the effect of fixing the objective function for theinner parameter iterations. It is more reliable, but usually more expensive, to nest theiterations.

R2

For unrestricted linear models with an intercept successfully estimated by OLS, R2 isalways between 0 and 1. However, nonlinear models do not necessarily encompassthe dependent mean as a special case and can produce negative R2 statistics. NegativeR2’s can also be produced even for linear models when an estimation method otherthan OLS is used and no intercept term is in the model.

R2 is defined for normalized equations as

R2 = 1� SSE

SSA� �y2 � n

where SSA is the sum of the squares of the actualy’s and�y are the actual means. R2

cannot be computed for models in general form because of the need for an actual Y.

Minimization Methods

PROC MODEL currently supports two methods for minimizing the objective func-tion. These methods are described in the following sections.

GAUSSThe Gauss-Newton parameter-change vector for a system withg equations,n non-missing observations, andp unknown parameters is

� = (X0X)�1X0r



where� is the change vector,X is the stackedng � p Jacobian matrix of partialderivatives of the residuals with respect to the parameters, andr is anng � 1vectorof the stacked residuals. The components ofX andr are weighted by theS�1 matrix.When instrumental methods are used,X and r are the projections of the Jacobianmatrix and residuals vector in the instruments space and not the Jacobian and resid-uals themselves. In the preceding formula,S and W are suppressed. If instrumentalvariables are used, then the change vector becomes:

� = (X0(S�1W)X)�1X0(S�1W)r

This vector is computed at the end of each iteration. The objective function is thencomputed at the changed parameter values at the start of the next iteration. If theobjective function is not improved by the change, the� vector is reduced by one-halfand the objective function is re-evaluated. The change vector will be halved up toMAXSUBITER= times until the objective function is improved.

For FIML theX0Xmatrix is substituted with one of three choices for approximationsto the Hessian. See the "FIML Estimation" section in this chapter.

MARQUARDTThe Marquardt-Levenberg parameter change vector is

� = (X0X+ �diag(X0X))�1X0r

where� is the change vector, andX and r are the same as for the Gauss-Newtonmethod, described in the preceding section. Before the iterations start,� is set toa small value (1E-6). At each iteration, the objective function is evaluated at theparameters changed by�. If the objective function is not improved,� is increased to10� and the step is tried again.� can be increased up to MAXSUBITER= times to amaximum of 1E15 (whichever comes first) until the objective function is improved.For the start of the next iteration,� is reduced to max(�/10,1E-10).

Convergence Criteria

There are a number of measures that could be used as convergence or stopping cri-teria. PROC MODEL computes five convergence measures labeled R, S, PPC, RPC,and OBJECT.

When an estimation technique that iterates estimates of� is used (that is, IT3SLS),two convergence criteria are used. The termination values can be specified with theCONVERGE=(p,s) option on the FIT statement. If the second value,s, is not spec-ified, it defaults top. The criterion labeled S (given in the following) controls theconvergence of theS matrix. When S is less thans, theS matrix has converged. Thecriterion labeled R is compared to thep value to test convergence of the parameters.

The R convergence measure cannot be computed accurately in the special case ofsingular residuals (when all the residuals are close to 0) or in the case of a 0 objectivevalue. When either the trace of theS matrix computed from the current residuals(trace(S)) or the objective value is less than the value of the SINGULAR= option,convergence is assumed.



The various convergence measures are explained in the following:

R is the primary convergence measure for the parameters. It mea-sures the degree to which the residuals are orthogonal to the Jaco-bian columns, and it approaches 0 as the gradient of the objectivefunction becomes small. R is defined as the square root of

(r0(S�1W )X(X 0(S�1W )X)�1X 0(S�1W )r)

(r0(S�1W )r)

whereX is the Jacobian matrix andr is the residuals vector. Ris similar to the relative offset orthogonality convergence criterionproposed by Bates and Watts (1981).

In the univariate case, the R measure has several equivalent inter-pretations:

� the cosine of the angle between the residuals vector and thecolumn space of the Jacobian matrix. When this cosine is0, the residuals are orthogonal to the partial derivatives ofthe predicted values with respect to the parameters, and thegradient of the objective function is 0.

� the square root of the R2 for the current linear pseudo-modelin the residuals.

� a norm of the gradient of the objective function, where thenorming matrix is proportional to the current estimate of thecovariance of the parameter estimates. Thus, using R, con-vergence is judged when the gradient becomes small in thisnorm.

� the prospective relative change in the objective function valueexpected from the next GAUSS step, assuming that the cur-rent linearization of the model is a good local approximation.

In the multivariate case, R is somewhat more complicated but is de-signed to go to 0 as the gradient of the objective becomes small andcan still be given the previous interpretations for the aggregation ofthe equations weighted byS�1.

PPC is the prospective parameter change measure. PPC measuresthe maximum relative change in the parameters implied by theparameter-change vector computed for the next iteration. At thekth iteration, PPC is the maximum over the parameters

j�k+1i � �ki jj�jki + 1:0e�6

where�ki is the current value of theith parameter and�k+1i is theprospective value of this parameter after adding the change vectorcomputed for the next iteration. The parameter with the maximumprospective relative change is printed with the value of PPC, unlessthe PPC is nearly 0.



RPC is the retrospective parameter change measure. RPC measures themaximum relative change in the parameters from the previous iter-ation. At thekth iteration, RPC is the maximum overi of

j�ki � �k�1i jj�k�1i + 1:0e�6j

where�ki is the current value of theith parameter and�k�1i is theprevious value of this parameter. The name of the parameter withthe maximum retrospective relative change is printed with the valueof RPC, unless the RPC is nearly 0.

OBJECT measures the relative change in the objective function value be-tween iterations:

j(Ok �Ok�1jjOk�1 + 1:0e�6j

whereOk�1 is the value of the objective function (Ok) from theprevious iteration.

S measures the relative change in theS matrix. S is computed as themaximum overi, j of

jSkij � Sk�1

ij jjSk�1

ij + 1:0e�6j

whereSk�1 is the previousS matrix. The S measure is relevantonly for estimation methods that iterate theS matrix.

An example of the convergence criteria output is as follows:

The MODEL ProcedureIT3SLS Estimation Summary


Parameters Estimated 5Method GaussIterations 35


R 0.000883PPC(d1) 0.000644RPC(d1) 0.000815Object 0.00004Trace(S) 3599.982Objective Value 0.435683S 0.000052

Figure 14.16. Convergence Criteria Output



This output indicates the total number of iterations required by the Gauss minimiza-tion for all theSmatrices was 35. The "Trace(S)" is the trace (the sum of the diagonalelements) of theS matrix computed from the current residuals. This row is labeledMSE if there is only one equation.

Troubleshooting Convergence Problems

As with any nonlinear estimation routine, there is no guarantee that the estimationwill be successful for a given model and data. If the equations are linear with respectto the parameters, the parameter estimates always converge in one iteration. Themethods that iterate theS matrix must iterate further for theS matrix to converge.Nonlinear models may not necessarily converge.

Convergence can be expected only with fully identified parameters, adequate data,and starting values sufficiently close to solution estimates.

Convergence and the rate of convergence may depend primarily on the choice ofstarting values for the estimates. This does not mean that a great deal of effort shouldbe invested in choosing starting values. First, try the default values. If the estimationfails with these starting values, examine the model and data and re-run the estimationusing reasonable starting values. It is usually not necessary that the starting values bevery good, just that they not be very bad; choose values that seem plausible for themodel and data.

An Example of Requiring Starting ValuesSuppose you want to regress a variable Y on a variable X assuming that the variablesare related by the following nonlinear equation:

y = a+ bxc + �

In this equation, Y is linearly related to a power transformation of X. The unknownparameters area, b, andc. � is an unobserved random error. Some simulated data wasgenerated using the following SAS statements. In this simulation,a = 10, b = 2, andthe use of the SQRT function corresponds toc = :5.

data test;do i = 1 to 20;

x = 5 * ranuni(1234);y = 10 + 2 * sqrt(x) + .5 * rannor(2345);output;end;

run;

The following statements specify the model and give descriptive labels to the modelparameters. Then the FIT statement attempts to estimatea, b, andc using the defaultstarting value .0001.

proc model data=test;y = a + b * x ** c;label a = "Intercept"



b = "Coefficient of Transformed X"c = "Power Transformation Parameter";

fit y;run;

PROC MODEL prints model summary and estimation problem summary reports andthen prints the output shown in Figure 14.17.

The MODEL ProcedureOLS Estimation

NOTE: The iteration limit is exceeded for OLS.

ERROR: The parameter estimates failed to converge for OLS after100 iterations using CONVERGE=0.001 as the convergence criteria.


NIteration N Obs R Objective Subit a b c

OLS 100 20 0.9627 3.9678 2 137.3844 -126.536 -0.00213

Gauss Method Parameter Change Vector

a b c

-69367.57 69366.51 -1.16NOTE: The parameter estimation is abandoned. Check your model and data. If the

model is correct and the input data are appropriate, try rerunning theparameter estimation using different starting values for the parameterestimates.

PROC MODEL continues as if the parameter estimates had converged.

Figure 14.17. Diagnostics for Convergence Failure

By using the default starting values, PROC MODEL was unable to take even the firststep in iterating to the solution. The change in the parameters that the Gauss-Newtonmethod computes is very extreme and makes the objective values worse instead ofbetter. Even when this step is shortened by a factor of a million, the objective functionis still worse, and PROC MODEL is unable to estimate the model parameters.

The problem is caused by the starting value of C. Using the default starting valueC=.0001, the first iteration attempts to compute better values of A and B by what is,in effect, a linear regression of Y on the 10,000th root of X, which is almost the sameas the constant 1. Thus the matrix that is inverted to compute the changes is nearlysingular and affects the accuracy of the computed parameter changes.

This is also illustrated by the next part of the output, which displays collinearitydiagnostics for the crossproducts matrix of the partial derivatives with respect to theparameters, shown in Figure 14.18.




Collinearity Diagnostics

Condition -----Proportion of Variation----Number Eigenvalue Numbe r a b c

1 2.376793 1.0000 0.0000 0.0000 0.00002 0.623207 1.9529 0.0000 0.0000 0.00003 1.684616E-12 1187805 1.0000 1.0000 1.0000

Figure 14.18. Collinearity Diagnostics

This output shows that the matrix is singular and that the partials of A, B, and Cwith respect to the residual are collinear at the point(0:0001; 0:0001; 0:0001) in theparameter space. See the section "Linear Dependencies" for a full explanation of thecollinearity diagnostics.

The MODEL procedure next prints the note shown in Figure 14.19, which suggeststhat you try different starting values.


NOTE: The parameter estimation is abandoned. Check your model and data. If themodel is correct and the input data are appropriate, try rerunning theparameter estimation using different starting values for the parameterestimates.

PROC MODEL continues as if the parameter estimates had converged.

Figure 14.19. Estimation Failure Note

PROC MODEL then produces the usual printout of results for the nonconverged pa-rameter values. The estimation summary is shown in Figure 14.20. The headingincludes the reminder "(Not Converged)."





Condition -----Proportion of Variation----Number Eigenvalue Numbe r a b c

1 2.376793 1.0000 0.0000 0.0000 0.00002 0.623207 1.9529 0.0000 0.0000 0.00003 1.684616E-12 1187805 1.0000 1.0000 1.0000

The MODEL ProcedureOLS Estimation Summary (Not Converged)


Parameters Estimated 3Method GaussIterations 100Subiterations 239Average Subiterations 2.39


R 0.962666PPC(b) 548.1977RPC(b) 540.4224Object 2.633E-6Trace(S) 4.667947Objective Value 3.967755


Read 20Solved 20

Figure 14.20. Nonconverged Estimation Summary

The nonconverged estimation results are shown in Figure 14.21.



The MODEL Procedure

Nonlinear OLS Summary of Residual Errors(Not Converged)


y 3 17 79.3551 4.6679 2.1605 -1.6812 -1.9966

Nonlinear OLS Parameter Estimates (Not Converged)

Approx ApproxParameter Estimate Std Err t Value Pr > |t| Label

a 137.3844 263342 0.00 0.9996 Interceptb -126.536 263342 -0.00 0.9996 Coefficient of

Transformed Xc -0.00213 4.4371 -0.00 0.9996 Power Transformation

Parameter

Figure 14.21. Nonconverged Results

Note that theR2 statistic is negative. AnR2 < 0 results when the residual mean squareerror for the model is larger than the variance of the dependent variable. NegativeR2 statistics may be produced when either the parameter estimates fail to convergecorrectly, as in this case, or when the correctly estimated model fits the data verypoorly.

Controlling Starting ValuesTo fit the preceding model you must specify a better starting value for C. Avoid start-ing values of C that are either very large or close to 0. For starting values of A andB, you can either specify values, use the default, or have PROC MODEL fit startingvalues for them conditional on the starting value for C.

Starting values are specified with the START= option of the FIT statement or on aPARMS statement. For example, the following statements estimate the model param-eters using the starting values A=.0001, B=.0001, and C=5.

proc model data=test;y = a + b * x ** c;label a = "Intercept"


fit y start=(c=5);run;

Using these starting values, the estimates converge in 16 iterations. The results areshown in Figure 14.22. Note that since the START= option explicitly declares pa-rameters, the parameter C is placed first in the table.



The MODEL Procedure



y 3 17 5.7359 0.3374 0.5809 0.8062 0.7834



c 0.327079 0.2892 1.13 0.2738 Power TransformationParameter

a 8.384311 3.3775 2.48 0.0238 Interceptb 3.505391 3.4858 1.01 0.3287 Coefficient of

Transformed X

Figure 14.22. Converged Results

Using the STARTITER OptionPROC MODEL can compute starting values for some parameters conditional on start-ing values you specify for the other parameters. You supply starting values for someparameters and specify the STARTITER option on the FIT statement.

For example, the following statements set C to 1 and compute starting values for Aand B by estimating these parameters conditional on the fixed value of C. With C=1this is equivalent to computing A and B by linear regression on X. A PARMS state-ment is used to declare the parameters in alphabetical order. The ITPRINT option isused to print the parameter values at each iteration.

proc model data=test;parms a b c;y = a + b * x ** c;label a = "Intercept"


fit y start=(c=1) / startiter itprint;run;

With better starting values, the estimates converge in only 5 iterations. Counting the 2iterations required to compute the starting values for A and B, this is 5 fewer than the12 iterations required without the STARTITER option. The iteration history listing isshown in Figure 14.23.





GRID 0 20 0.9970 161.9 0 0.00010 0.00010 5.00000GRID 1 20 0.0000 0.9675 0 12.29508 0.00108 5.00000


OLS 0 20 0.6551 0.9675 0 12.29508 0.00108 5.00000OLS 1 20 0.6882 0.9558 4 12.26426 0.00201 4.44013OLS 2 20 0.6960 0.9490 4 12.25554 0.00251 4.28262OLS 3 20 0.7058 0.9428 2 12.24487 0.00323 4.09977OLS 4 20 0.7177 0.9380 2 12.23186 0.00430 3.89040OLS 5 20 0.7317 0.9354 2 12.21610 0.00592 3.65450OLS 6 20 0.7376 0.9289 3 12.20663 0.00715 3.52417OLS 7 20 0.7445 0.9223 2 12.19502 0.00887 3.37407OLS 8 20 0.7524 0.9162 2 12.18085 0.01130 3.20393OLS 9 20 0.7613 0.9106 2 12.16366 0.01477 3.01460OLS 10 20 0.7705 0.9058 2 12.14298 0.01975 2.80839OLS 11 20 0.7797 0.9015 2 12.11827 0.02690 2.58933OLS 12 20 0.7880 0.8971 2 12.08900 0.03712 2.36306OLS 13 20 0.7947 0.8916 2 12.05460 0.05152 2.13650OLS 14 20 0.7993 0.8835 2 12.01449 0.07139 1.91695OLS 15 20 0.8015 0.8717 2 11.96803 0.09808 1.71101OLS 16 20 0.8013 0.8551 2 11.91459 0.13284 1.52361OLS 17 20 0.7987 0.8335 2 11.85359 0.17666 1.35745OLS 18 20 0.8026 0.8311 1 11.71551 0.28373 1.06872OLS 19 20 0.7945 0.7935 2 11.57666 0.40366 0.89662OLS 20 20 0.7872 0.7607 1 11.29346 0.65999 0.67059OLS 21 20 0.7632 0.6885 1 10.81372 1.11483 0.48842OLS 22 20 0.6976 0.5587 0 9.54889 2.34556 0.30461OLS 23 20 0.0108 0.2868 0 8.44333 3.44826 0.33232OLS 24 20 0.0008 0.2868 0 8.39438 3.49500 0.32790

NOTE: At OLS Iteration 24 CONVERGE=0.001 Criteria Met.

Figure 14.23. ITPRINT Listing

The results produced in this case are almost the same as the results shown in Figure14.22, except that the PARMS statement causes the Parameter Estimates table to beordered A, B, C instead of C, A, B. They are not exactly the same because the differentstarting values caused the iterations to converge at a slightly different place. Thiseffect is controlled by changing the convergence criterion with the CONVERGE=option.

By default, the STARTITER option performs one iteration to find starting values forthe parameters not given values. In this case the model is linear in A and B, so onlyone iteration is needed. If A or B were nonlinear, you could specify more than one"starting values" iteration by specifying a number for the STARTITER= option.

Finding Starting Values by Grid SearchPROC MODEL can try various combinations of parameter values and use the com-bination producing the smallest objective function value as starting values. (For OLSthe objective function is the residual mean square.) This is known as a preliminarygrid search. You can combine the STARTITER option with a grid search.



For example, the following statements try 5 different starting values for C: 10, 5, 2.5,-2.5, -5. For each value of C, values for A and B are estimated. The combination ofA, B, and C values producing the smallest residual mean square is then used to startthe iterative process.

proc model data=test;parms a b c;y = a + b * x ** c;label a = "Intercept"


fit y start=(c=10 5 2.5 -2.5 -5) / startiter itprint;run;

The iteration history listing is shown in Figure 14.24. Using the best starting valuesfound by the grid search, the OLS estimation only requires 2 iterations. However,since the grid search required 10 iterations, the total iterations in this case is 12.



GRID 0 20 1.0000 26815.5 0 0.00010 0.00010 10.00000GRID 1 20 0.0000 1.2193 0 12.51792 0.00000 10.00000GRID 0 20 0.6012 1.5151 0 12.51792 0.00000 5.00000GRID 1 20 0.0000 0.9675 0 12.29508 0.00108 5.00000GRID 0 20 0.7804 1.6091 0 12.29508 0.00108 2.50000GRID 1 20 0.0000 0.6290 0 11.87327 0.06372 2.50000GRID 0 20 0.8779 4.1604 0 11.87327 0.06372 -2.50000GRID 1 20 0.0000 0.9542 0 12.92455 -0.04700 -2.50000GRID 0 20 0.9998 2776.1 0 12.92455 -0.04700 -5.00000GRID 1 20 0.0000 1.0450 0 12.86129 -0.00060 -5.00000


OLS 0 20 0.6685 0.6290 0 11.87327 0.06372 2.50000OLS 1 20 0.6649 0.5871 3 11.79268 0.10083 2.11710OLS 2 20 0.6713 0.5740 2 11.71445 0.14901 1.81658OLS 3 20 0.6726 0.5621 2 11.63772 0.20595 1.58705OLS 4 20 0.6678 0.5471 2 11.56098 0.26987 1.40903OLS 5 20 0.6587 0.5295 2 11.48317 0.33953 1.26760OLS 6 20 0.6605 0.5235 1 11.32436 0.48846 1.03784OLS 7 20 0.6434 0.4997 2 11.18704 0.62475 0.90793OLS 8 20 0.6294 0.4805 1 10.93520 0.87965 0.73319OLS 9 20 0.6031 0.4530 1 10.55670 1.26879 0.57385OLS 10 20 0.6052 0.4526 0 9.62442 2.23114 0.36146OLS 11 20 0.1652 0.2948 0 8.56683 3.31774 0.32417OLS 12 20 0.0008 0.2868 0 8.38015 3.50974 0.32664

NOTE: At OLS Iteration 12 CONVERGE=0.001 Criteria Met.

Figure 14.24. ITPRINT Listing

Because no initial values for A or B were provided in the PARAMETERS statementor were read in with a PARMSDATA= or ESTDATA= option, A and B were given thedefault value of 0.0001 for the first iteration. At the second grid point, C=5, the valuesof A and B obtained from the previous iterations are used for the initial iteration. If



initial values are provided for parameters, the parameters start at those initial valuesat each grid point.

Guessing Starting Values from the Logic of the ModelExample 14.1 of the logistic growth curve model of the U.S. population illustratesthe need for reasonable starting values. This model can be written

pop =a

1 + exp(b� c(t� 1790))

wheret is time in years. The model is estimated using decennial census data of theU.S. population in millions. If this simple but highly nonlinear model is estimatedusing the default starting values, the estimation fails to converge.

To find reasonable starting values, first consider the meaning ofa andc. Taking thelimit as time increases,a is the limiting or maximum possible population. So, as astarting value fora, several times the most recent population known can be used, forexample, one billion (1000 million).

Dividing the time derivative by the function to find the growth rate and taking thelimit as t moves into the past, you can determine thatc is the initial growth rate. Youcan examine the data and compute an estimate of the growth rate for the first fewdecades, or you can pick a number that sounds like a plausible population growthrate figure, such as 2%.

To find a starting value forb, let t equal the base year used, 1790, which causesc to drop out of the formula for that year, and then solve for the value ofb thatis consistent with the known population in 1790 and with the starting value ofa.This yieldsb = ln(a=3:9 � 1) or about 5.5, wherea is 1000 and 3.9 is roughly thepopulation for 1790 given in the data. The estimates converge using these startingvalues.

Convergence ProblemsWhen estimating nonlinear models, you may encounter some of the following con-vergence problems.

Unable to ImproveThe optimization algorithm may be unable to find a step that improves the objectivefunction. If this happens in the Gauss-Newton method, the step size is halved tofind a change vector for which the objective improves. In the Marquardt method,�will be increased to find a change vector for which the objective improves. If, afterMAXSUBITER= step-size halvings or increases in�, the change vector still does notproduce a better objective value, the iterations are stopped and an error message isprinted.

Failure of the algorithm to improve the objective value can be caused by a CON-VERGE= value that is too small. Look at the convergence measures reported at thepoint of failure. If the estimates appear to be approximately converged, you can ac-cept the NOT CONVERGED results reported, or you can try re-running the FIT taskwith a larger CONVERGE= value.



If the procedure fails to converge because it is unable to find a change vector thatimproves the objective value, check your model and data to ensure that all param-eters are identified and data values are reasonably scaled. Then, re-run the modelwith different starting values. Also, consider using the Marquardt method if Gauss-Newton fails; the Gauss-Newton method can get into trouble if the Jacobian matrix isnearly singular or ill-conditioned. Keep in mind that a nonlinear model may be well-identified and well-conditioned for parameter values close to the solution values butunidentified or numerically ill-conditioned for other parameter values. The choice ofstarting values can make a big difference.

NonconvergenceThe estimates may diverge into areas where the program overflows or the estimatesmay go into areas where function values are illegal or too badly scaled for accuratecalculation. The estimation may also take steps that are too small or that make onlymarginal improvement in the objective function and, thus, fail to converge within theiteration limit.

When the estimates fail to converge, collinearity diagnostics for the Jacobiancrossproducts matrix are printed if there are 20 or fewer parameters estimated. See"Linear Dependencies" later in this section for an explanation of these diagnostics.

Inadequate Convergence CriterionIf convergence is obtained, the resulting estimates will only approximate a minimumpoint of the objective function. The statistical validity of the results is based on theexact minimization of the objective function, and for nonlinear models the quality ofthe results depends on the accuracy of the approximation of the minimum. This iscontrolled by the convergence criterion used.

There are many nonlinear functions for which the objective function is quite flat in alarge region around the minimum point so that many quite different parameter vectorsmay satisfy a weak convergence criterion. By using different starting values, differ-ent convergence criteria, or different minimization methods, you can produce verydifferent estimates for such models.

You can guard against this by running the estimation with different starting values anddifferent convergence criteria and checking that the estimates produced are essentiallythe same. If they are not, use a smaller CONVERGE= value.

Local MinimumYou may have converged to a local minimum rather than a global one. This problemis difficult to detect because the procedure will appear to have succeeded. You canguard against this by running the estimation with different starting values or with adifferent minimization technique. The START= option can be used to automaticallyperform a grid search to aid in the search for a global minimum.

DiscontinuitiesThe computational methods assume that the model is a continuous and smooth func-tion of the parameters. If this is not the case, the methods may not work.

If the model equations or their derivatives contain discontinuities, the estimation willusually succeed, provided that the final parameter estimates lie in a continuous inter-



val and that the iterations do not produce parameter values at points of discontinuityor parameter values that try to cross asymptotes.

One common case of discontinuities causing estimation failure is that of an asymp-totic discontinuity between the final estimates and the initial values. For example,consider the following model, which is basically linear but is written with one pa-rameter in reciprocal form:

y = a + b * x1 + x2 / c;

By placing the parameter C in the denominator, a singularity is introduced into theparameter space at C=0. This is not necessarily a problem, but if the correct estimateof C is negative while the starting value is positive (or vice versa), the asymptoticdiscontinuity at 0 will lie between the estimate and the starting value. This meansthat the iterations have to pass through the singularity to get to the correct estimates.The situation is shown in Figure 14.25.

1/C

Starting

C

estimate

Correct

value

Figure 14.25. Asymptotic Discontinuity

Because of the incorrect sign of the starting value, the C estimate goes off towardspositive infinity in a vain effort to get past the asymptote and onto the correct arm ofthe hyperbola. As the computer is required to work with ever closer approximationsto infinity, the numerical calculations break down and an "objective function wasnot improved" convergence failure message is printed. At this point, the iterationsterminate with an extremely large positive value for C. When the sign of the startingvalue for C is changed, the estimates converge quickly to the correct values.

Linear DependenciesIn some cases, the Jacobian matrix may not be of full rank; parameters may not befully identified for the current parameter values with the current data. When linear de-pendencies occur among the derivatives of the model, some parameters appear witha standard error of 0 and with the word BIASED printed in place of thet statistic.When this happens, collinearity diagnostics for the Jacobian crossproducts matrixare printed if the DETAILS option is specified and there are twenty or fewer param-eters estimated. Collinearity diagnostics are also printed out automatically when aminimization method fails, or when the COLLIN option is specified.



For each parameter, the proportion of the variance of the estimate accounted for byeachprincipal componentis printed. The principal components are constructed fromthe eigenvalues and eigenvectors of the correlation matrix (scaled covariance matrix).When collinearity exists, a principal component is associated with proportion of thevariance of more than one parameter. The numbers reported are proportions so theywill remain between 0 and 1. If two or more parameters have large proportion valuesassociated with the same principle component, then two problems can occur: thecomputation of the parameter estimates are slow or nonconvergent; and the parameterestimates have inflated variances (Belsley 1980, p. 105-117).

For example, the following cubic model is fit to a quadratic data set:

proc model data=test3;exogenous x1 ;parms b1 a1 c1 ;y1 = a1 * x1 + b1 * x1 * x1 + c1 * x1 * x1 *x1;fit y1/ collin ;

run;

The collinearity diagnostics are shown in Figure 14.26.

The MODEL Procedure


Condition -----Proportion of Variation----Number Eigenvalue Number b1 a1 c1

1 2.942920 1.0000 0.0001 0.0004 0.00022 0.056638 7.2084 0.0001 0.0357 0.01483 0.000442 81.5801 0.9999 0.9639 0.9850

Figure 14.26. Collinearity Diagnostics

Notice that the proportions associated with the smallest eigenvalue are almost 1. Forthis model, removing any of the parameters will decrease the variances of the remain-ing parameters.

In many models the collinearity might not be clear cut. Collinearity is not neces-sarily something you remove. A model may need to be reformulated to remove theredundant parameterization or the limitations on the estimatability of the model canbe accepted.

Collinearity diagnostics are also useful when an estimation does not converge. Thediagnostics provide insight into the numerical problems and can suggest which pa-rameters need better starting values. These diagnostics are based on the approach ofBelsley, Kuh, and Welsch (1980).



Iteration History

The options ITPRINT, ITDETAILS, XPX, I, and ITALL specify a detailed listing ofeach iteration of the minimization process.

ITPRINT OptionThe ITPRINT information is selected whenever any iteration information is re-quested.

The following information is displayed for each iteration:

N the number of usable observations

Objective the corrected objective function value

Trace(S) the trace of theS matrix

subit the number of subiterations required to find a� or a damping factorthat reduces the objective function

R the R convergence measure

The estimates for the parameters at each iteration are also printed.

ITDETAILS OptionThe additional values printed for the ITDETAILS option are:

Theta is the angle in degrees between�, the parameter change vector,and the negative gradient of the objective function.

Phi is the directional derivative of the objective function in the� di-rection scaled by the objective function.

Stepsize is the value of the damping factor used to reduce� if the Gauss-Newton method is used.

Lambda is the value of� if the Marquardt method is used.

Rank(XPX) If the projected Jacobian crossproducts matrix is singular, the rankof theX0X matrix is output.

The definitions of PPC and R are explained in the section "Convergence Criteria."When the values of PPC are large, the parameter associated with the criteria is dis-played in parentheses after the value.

XPX and I OptionsThe XPX and the I options select the printing of the augmentedX0X matrix andthe augmentedX0X matrix after asweepoperation (Goodnight 1979) has been per-formed on it. An example of the output from the following statements is shown inFigure 14.27.

proc model data=test2 ;y1 = a1 * x2 * x2 - exp( d1*x1);y2 = a2 * x1 * x1 + b2 * exp( d2*x2);fit y1 y2 / XPX I ;

run;




Cross Products for System At OLS Iteration 0

a1 d1 a2 b2 d2 Residual

a1 1839468 -33818.35 0.0 0.00 0.000000 3879959d1 -33818 1276.45 0.0 0.00 0.000000 -76928a2 0 0.00 42925.0 1275.15 0.154739 470686b2 0 0.00 1275.2 50.01 0.003867 16055d2 0 0.00 0.2 0.00 0.000064 2Residual 3879959 -76928.14 470686.3 16055.07 2.329718 24576144

XPX Inverse for System At OLS Iteration 0

a1 d1 a2 b2 d2 Residual

a1 0.000001 0.000028 0.000000 0.0000 0.00 2d1 0.000028 0.001527 0.000000 0.0000 0.00 -9a2 0.000000 0.000000 0.000097 -0.0025 -0.08 6b2 0.000000 0.000000 -0.002455 0.0825 0.95 172d2 0.000000 0.000000 -0.084915 0.9476 15746.71 11931Residual 1.952150 -8.546875 5.823969 171.6234 11930.89 10819902

Figure 14.27. XPX and I Options Output

The first matrix, labeled "Cross Products," for OLS estimation is

�X0X X0r

r0X r0r

�

The column labeled "Residual" in the output is the vectorX0r, which is the gradientof the objective function. The diagonal scalar valuer0r is the objective functionuncorrected for degrees of freedom. The second matrix, labeled "XPX Inverse," iscreated through a sweep operation on the augmentedX0X matrix to get:

�(X0X)�1 (X0X)�1X0r

(X0r)0(X0X)�1 r0r� (X0r)0(X0X)�1X0r

�

Note that the residual column is the change vector used to update the parameter esti-mates at each iteration. The corner scalar element is used to compute the R conver-gence criteria.

ITALL OptionThe ITALL option, in addition to causing the output of all of the preceding options,outputs theS matrix, the inverse of theS matrix, the CROSS matrix, and the sweptCROSS matrix. An example of a portion of the CROSS matrix for the precedingexample is shown in Figure 14.28.




Crossproducts Matrix At OLS Iteration 0

1 @PRED.y1/@a1 @PRED.y1/@d1 @PRED.y2/@a2

1 50.00 6409 -239.16 [email protected]/@a1 6409.08 1839468 -33818.35 [email protected]/@d1 -239.16 -33818 1276.45 [email protected]/@a2 1275.00 187766 -7253.00 [email protected]/@b2 50.00 6410 -239.19 [email protected]/@d2 0.00 1 -0.03 0.2RESID.y1 14699.97 3879959 -76928.14 420582.9RESID.y2 16052.76 4065028 -85083.68 470686.3

Crossproducts Matrix At OLS Iteration 0

@PRED.y2/@b2 @PRED.y2/@d2 RESID.y1 RESID.y2

1 50.00 0.003803 14700 [email protected]/@a1 6409.88 0.813934 3879959 [email protected]/@d1 -239.19 -0.026177 -76928 [email protected]/@a2 1275.15 0.154739 420583 [email protected]/@b2 50.01 0.003867 14702 [email protected]/@d2 0.00 0.000064 2 2RESID.y1 14701.77 1.820356 11827102 12234106RESID.y2 16055.07 2.329718 12234106 12749042

Figure 14.28. ITALL Option Cross-Products Matrix Output

Computer Resource Requirements

If you are estimating large systems, you need to be aware of how PROC MODELuses computer resources such as memory and the CPU so they can be used mostefficiently.

Saving Time with Large Data SetsIf your input data set has many observations, the FIT statement does a large numberof model program executions. A pass through the data is made at least once for eachiteration and the model program is executed once for each observation in each pass.If you refine the starting estimates by using a smaller data set, the final estimationwith the full data set may require fewer iterations.

For example, you could use

proc model;/* Model goes here */fit / data=a(obs=25);fit / data=a;

where OBS=25 selects the first 25 observations in A. The second FIT statement pro-duces the final estimates using the full data set and starting values from the first run.

Fitting the Model in Sections to Save Space and TimeIf you have a very large model (with several hundred parameters, for example), theprocedure uses considerable space and time. You may be able to save resources by



breaking the estimation process into several steps and estimating the parameters insubsets.

You can use the FIT statement to select for estimation only the parameters for selectedequations. Do not break the estimation into too many small steps; the total computertime required is minimized by compromising between the number of FIT statementsthat are executed and the size of the crossproducts matrices that must be processed.

When the parameters are estimated for selected equations, the entire model programmust be executed even though only a part of the model program may be needed tocompute the residuals for the equations selected for estimation. If the model itselfcan be broken into sections for estimation (and later combined for simulation andforecasting), then more resources can be saved.

For example, to estimate the following four equation model in two steps, you coulduse

proc model data=a outmodel=part1;parms a0-a2 b0-b2 c0-c3 d0-d3;y1 = a0 + a1*y2 + a2*x1;y2 = b0 + b1*y1 + b2*x2;y3 = c0 + c1*y1 + c2*y4 + c3*x3;y4 = d0 + d1*y1 + d2*y3 + d3*x4;fit y1 y2;fit y3 y4;fit y1 y2 y3 y4;

run;

You should try estimating the model in pieces to save time only if there are more than14 parameters; the preceding example takes more time, not less, and the difference inmemory required is trivial.

Memory Requirements for Parameter EstimationPROC MODEL is a large program, and it requires much memory. Memory is alsorequired for the SAS System, various data areas, the model program and associatedtables and data vectors, and a few crossproducts matrices. For most models, thememory required for PROC MODEL itself is much larger than that required for themodel program, and the memory required for the model program is larger than thatrequired for the crossproducts matrices.

The number of bytes needed for two crossproducts matrices, fourS matrices, andthree parameter covariance matrices is

8� (2 + k +m+ g)2 + 16� g2 + 12� (p+ 1)2

plus lower-order terms.m is the number of unique nonzero derivatives of each resid-ual with respect to each parameter,g is the number of equations,k is the number ofinstruments, andp is the number of parameters. This formula is for the memory re-quired for 3SLS. If you are using OLS, a reasonable estimate of the memory requiredfor large problems (greater than 100 parameters) is to divide the value obtained fromthe formula in half.



Consider the following model program.

proc model data=test2 details;exogenous x1 x2;parms b1 100 a1 a2 b2 2.5 c2 55;y1 = a1 * y2 + b1 * x1 * x1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2;fit y1 y2 / n3sls;inst b1 b2 c2 x1 ;

run;

The DETAILS option prints the storage requirements information shown in Figure14.29.

The MODEL Procedure

Storage Requirements for this Problem

Order of XPX Matrix 6Order of S Matrix 2Order of Cross Matrix 13Total Nonzero Derivatives 5Distinct Variable Derivatives 5Size of Cross matrix 728

Figure 14.29. Storage Requirements Information

The matrixX0X augmented by the residual vector is called the XPX matrix in theoutput, and it has the sizem+ 1. The order of theS matrix, 2 for this example, isthe value ofg. The CROSS matrix is made up of thek unique instruments, a constantcolumn representing the intercept terms, followed by them unique Jacobian vari-ables plus a constant column representing the parameters with constant derivatives,followed by theg residuals.

The size of two CROSS matrices in bytes is

8� (2 + k +m+ g)2 + 2 + k +m+ g

Note that the CROSS matrix is symmetric, so only the diagonal and the upper trian-gular part of the matrix is stored. For examples of the CROSS and XPX matrices see"Iteration History" in this section.

The MEMORYUSE OptionThe MEMORYUSE option on the FIT, SOLVE, MODEL, or RESET statement maybe used to request a comprehensive memory usage summary.

Figure 14.30 shows an example of the output produced by the MEMORYUSE option.



The MODEL Procedure

Memory Usage Summary (in bytes)

Symbols 5368Strings 1057Lists 1472Arrays 84Statements 704Opcodes 800Parsing 640Executable 220Block option 0Cross reference 0Flow analysis 1024Derivatives 9406Data vector 240Cross matrix 728X’X matrix 392S matrix 96GMM memory 0Jacobian 0Work vectors 692Overhead 1906----------------------- --------------Total 24829

Figure 14.30. MEMORYUSE Option Output for SOLVE Task

Definitions of the memory components follows:

symbols memory used to store information about variables in the modelstrings memory used to store the variable names and labelslists space used to hold lists of variablesarrays memory used by ARRAY statementsstatements memory used for the list of programming statements in the modelopcodes memory used to store the code compiled to evaluate the

expression in the model programparsing memory used in parsing the SAS statementsexecutable the compiled model program size (not correct yet)block option memory used by the BLOCK optioncross ref. memory used by the XREF optionflow analysis memory used to compute the interdependencies of the variablesderivatives memory used to compute and store the analytical derivativesdata vector memory used for the program data vectorcross matrix memory used for one or more copies of the Cross matrixX0X matrix memory used for one or more copies of theX0X matrixS matrix memory used for the covariance matrixGMM memory additional memory used for the GMM and ITGMM methodsJacobian memory used for the Jacobian matrix for SOLVE and FIMLwork vectors memory used for miscellaneous work vectorsoverhead other miscellaneous memory



Testing for Normality

The NORMAL option on the FIT statement performs multivariate and univariate testsof normality.

The three multivariate tests provided are Mardia’s skewness test and kurtosis test(Mardia 1980) and the Henze-ZirklerTn;� test (Henze and Zirkler 1990). The twounivariate tests provided are the Shapiro-Wilk W test and the Kolmogorov-Smirnovtest. (For details on the univariate tests, refer to "Tests for Normality" in "The UNI-VARIATE Procedure" chapter in theSAS Procedures Guide.) The null hypothesis forall these tests is that the residuals are normally distributed.

For a random sampleX1; : : :;Xn, Xi2Rd, whered is the dimension ofXi andn isthe number of observations, a measure of multivariate skewness is

b1;d =1

n2

nXi=1

nXj=1

[(Xi � �)0S�1(Xj � �)]3

whereS is the sample covariance matrix ofX. For weighted regression, bothS and(Xi � �) are computed using the weights supplied by the WEIGHT statement or the

–WEIGHT– variable.

Mardia showed that under the null hypothesisn6 b1;d is asymptotically distributed as

�2(d(d + 1)(d+ 2)=6) .

A measure of multivariate kurtosis is given by

b2;d =1

n

nXi=1

[(Xi � �)0

S�1(Xi � �)]2

Mardia showed that under the null hypothesisb2;dis asymptotically normally dis-tributed with meand(d+ 2) and variance8d(d+ 2)=n.

The Henze-Zirkler test is based on a nonnegative functionalD(:; :)that measures thedistance between two distribution functions and has the property that

D(Nd(0; Id);Q) = 0

if and only if

Q = Nd(0; Id)

whereNd(�;�d) is ad-dimensional normal distribution.

The distance measureD(:; :) can be written as

D�(P;Q) =

ZRd

jP (t)� Q(t)j2'�(t)dt



whereP (t) andQ(t) are the Fourier transforms of P and Q, and'�(t) is a weight ora kernel function. The density of the normal distributionNd(0; �

2Id) is used as'�(t)

'�(t) = (2��2)�d2 exp(

�jtj22�2

); t 2 Rd

wherejtj = (t0

t)0:5.

The parameter� depends onn as

�d(n) =1p2(2d+ 1

4)1=(d+4)n1=(d+4)

The test statistic computed is calledT�(d) and is approximately distributed as a lognormal. The log normal distribution is used to compute the null hypothesis probabil-ity.

T�(d) =1n2

nXj=1

nXk=1

exp(��2

2jYj � Ykj2)

� 2(1 + �2)�d=21

n

nXj=1

exp(� �2

2(1 + �2)jYjj2) + (1 + 2�2)�d=2

where

jYj � Ykj2 = (Xj �Xk)0S�1(Xj �Xk)

jYj j2 = (Xj � �X)0S�1(Xj � �X)

Monte Carlo simulations suggest thatT�(d) has good power against distributionswith heavy tails.

The Shapiro-Wilk W test is computed only when the number of observations (n) isless than2000.

The following is an example of the output produced by the NORMAL option.

The MODEL Procedure

Normality TestEquation Test Statistic Value Prob

y1 Shapiro-Wilk W 0.37 <.0001y2 Shapiro-Wilk W 0.84 <.0001System Mardia Skewness 286.4 <.0001

Mardia Kurtosis 31.28 <.0001Henze-Zirkler T 7.09 <.0001

Figure 14.31. Normality Test Output



Heteroscedasticity

One of the key assumptions of regression is that the variance of the errors is con-stant across observations. If the errors have constant variance, the errors are calledhomoscedastic. Typically, residuals are plotted to assess this assumption. Standardestimation methods are inefficient when the errors areheteroscedasticor have non-constant variance.

Heteroscedasticity TestsThe MODEL procedure now provides two tests for heteroscedasticity of the errors:White’s test and the modified Breusch-Pagan test.

Both White’s test and the Breusch-Pagan are based on the residuals of the fittedmodel. For systems of equations, these tests are computed separately for the residualsof each equation.

The residuals of an estimation are used to investigate the heteroscedasticity of thetrue disturbances.

The WHITE option tests the null hypothesis

H0 : �2i = �2 for all i

White’s test is general because it makes no assumptions about the form of the het-eroscedasticity (White 1980). Because of its generality, White’s test may identifyspecification errors other than heteroscedasticity (Thursby 1982). Thus White’s testmay be significant when the errors are homoscedastic but the model is misspecifiedin other ways.

White’s test is equivalent to obtaining the error sum of squares for the regression ofthe squared residuals on a constant and all the unique variables inJJ, where thematrix J is composed of the partial derivatives of the equation residual with respectto the estimated parameters.

Note that White’s test in the MODEL procedure is different than White’s test in theREG procedure requested by the SPEC option. The SPEC option produces the testfrom Theorem 2 on page 823 of White (1980). The WHITE option, on the otherhand, produces the statistic from Corollary 1 on page 825 of White (1980).

The modified Breusch-Pagan test assumes that the error variance varies with a set ofregressors, which are listed in the BREUSCH= option.

Define the matrixZ to be composed of the values of the variables listed in theBREUSCH= option, such thatzi;j is the value of thejth variable in the BREUSCH=option for theith observation. The null hypothesis of the Breusch-Pagan test is

H0 : �2i = �2(�0 + �

0

zi)

where�2i is the error variance for theith observation, and�0 and� are regressioncoefficients.



The test statistic for the Breusch-Pagan test is

bp =1

v(u� �ui)

0

Z(Z0

Z)�1Z0

(u� �ui)

whereu = (e21; e22; : : :; e

2n), i is an� 1 vector of ones, and

v =1

n

nXi=1

(e2i �e0

e

n)2

This is a modified version of the Breusch-Pagan test, which is less sensitive to theassumption of normality than the original test (Greene 1993, p. 395).

The statements in the following example produce the output in Figure 14.32:

proc model data=schools;parms const inc inc2;

exp = const + inc * income + inc2 * income * income;incsq = income * income;

fit exp / white breusch=(1 income incsq);run;

The MODEL Procedure

Heteroscedasticity TestEquation Test Statistic DF Pr > ChiSq Variables

exp White’s Test 21.16 4 0.0003 Cross of all varsBreusch-Pagan 15.83 2 0.0004 1, income, incsq

Figure 14.32. Output for Heteroscedasticity Tests

Correcting for HeteroscedasticityThere are two methods for improving the efficiency of the parameter estimation inthe presence of heteroscedastic errors. If the error variance relationships are known,weighted regression can be used or an error model can be estimated. For details onerror model estimation see section “Error Covariance Stucture Specification”. If theerror variance relationship is unknown, GMM estimation can be used.

Weighted RegressionThe WEIGHT statement can be used to correct for the heteroscedasticity. Considerthe following model, which has a heteroscedastic error term:

yt = 250(e�0:2t � e�0:8t) +p(9=t)�t

The data for this model is generated with the following SAS statements.



data test;do t=1 to 25;

y = 250 * (exp( -0.2 * t ) - exp( -0.8 * t )) +sqrt( 9 / t ) * rannor(1);

output;end;

run;

If this model is estimated with OLS,

proc model data=test;parms b1 0.1 b2 0.9;y = 250 * ( exp( -b1 * t ) - exp( -b2 * t ) );fit y;

run;

the estimates shown in Figure 14.33 are obtained for the parameters.

The MODEL Procedure



b1 0.200977 0.00101 198.60 <.0001b2 0.826236 0.00853 96.82 <.0001

Figure 14.33. Unweighted OLS Estimates

If both sides of the model equation are multiplied bypt, the model will have a

homoscedastic error term. This multiplication or weighting is done through theWEIGHT statement. The WEIGHT statement variable operates on the squared resid-uals as

�0

t�t = weight � q0tqt

so that the WEIGHT statement variable represents the square of the model multi-plier. The following PROC MODEL statements corrects the heteroscedasticity witha WEIGHT statement

proc model data=test;parms b1 0.1 b2 0.9;y = 250 * ( exp( -b1 * t ) - exp( -b2 * t ) );fit y;weight t;

run;

Note that the WEIGHT statement follows the FIT statement. The weighted estimatesare shown in Figure 14.34.



The MODEL Procedure



b1 0.200503 0.000844 237.53 <.0001b2 0.816701 0.0139 58.71 <.0001

Figure 14.34. Weighted OLS Estimates

The weighted OLS estimates are identical to the output produced by the followingPROC MODEL example:

proc model data=test;parms b1 0.1 b2 0.9;y = 250 * ( exp( -b1 * t ) - exp( -b2 * t ) );_weight_ = t;fit y;

run;

If the WEIGHT statement is used in conjunction with the–WEIGHT– variable, thetwo values are multiplied together to obtain the weight used.

The WEIGHT statement and the–WEIGHT– variable operate on all the residualsin a system of equations. If a subset of the equations needs to be weighted, theresiduals for each equation can be modified through the RESID. variable for eachequation. The following example demonstrates the use of the RESID. variable tomake a homoscedastic error term:

proc model data=test;parms b1 0.1 b2 0.9;y = 250 * ( exp( -b1 * t ) - exp( -b2 * t ) );resid.y = resid.y * sqrt(t);fit y;

run;

These statements produce estimates of the parameters and standard errors that areidentical to the weighted OLS estimates. The reassignment of the RESID.Y variablemust be done after Y is assigned, otherwise it would have no effect. Also, note thatthe residual (RESID.Y) is multiplied by

pt. Here the multiplier is acting on the

residual before it is squared.

GMM EstimationIf the form of the heteroscedasticity is unknown, generalized method of momentsestimation (GMM) can be used. The following PROC MODEL statements use GMMto estimate the example model used in the preceding section:

proc model data=test;parms b1 0.1 b2 0.9;y = 250 * ( exp( -b1 * t ) - exp( -b2 * t ) );



fit y / gmm;instruments b1 b2;

run;

GMM is an instrumental method, so instrument variables must be provided.

GMM estimation generates estimates for the parameters shown in Figure 14.35.

The MODEL Procedure

Nonlinear GMM Parameter Estimates


b1 0.200487 0.000807 248.38 <.0001b2 0.822148 0.0142 57.95 <.0001

Figure 14.35. GMM Estimation for Heteroscedasticity

Transformation of Error Terms

In PROC MODEL you can control the form of the error term. By default the errorterm is assumed to be additive. This section demonstrates how to specify nonadditiveerror terms and discusses the effects of these transformations.

Models with Nonadditive ErrorsThe estimation methods used by PROC MODEL assume that the error terms of theequations are independently and identically distributed with zero means and finitevariances. Furthermore, the methods assume that the RESID.nameequation variablefor normalized form equations or the EQ.nameequation variable for general formequations contains an estimate of the error term of the true stochastic model whoseparameters are being estimated. Details on RESID.name and EQ.nameequationvariables are in the section "Model Translations."

To illustrate these points, consider the common loglinear model

y = �x� (1)

lny = a+ bln(x) (2)

wherea=log(�) and b=�. Equation (2) is called thelog form of the equation incontrast to equation (1), which is called thelevel formof the equation. Using theSYSLIN procedure, you can estimate equation (2) by specifying

proc syslin data=in;model logy=logx;

run;

where LOGY and LOGX are the logs of Y and X computed in a preceding DATAstep. The resulting values for INTERCEPT and LOGX correspond toa and b inequation (2).



Using the MODEL procedure, you can try to estimate the parameters in the levelform (and avoid the DATA step) by specifying

proc model data=in;parms alpha beta;y = alpha * x ** beta;fit y;

run;

where ALPHA and BETA are the parameters in equation (1).

Unfortunately, at least one of the preceding is wrong; an ambiguity results becauseequations (1) and (2) contain no explicit error term. The SYSLIN and MODEL pro-cedures both deal with additive errors; the residual used (the estimate of the errorterm in the equation) is the difference between the predicted and actual values (ofLOGY for PROC SYSLIN and of Y for PROC MODEL in this example). If youperform the regressions discussed previously, PROC SYSLIN estimates equation (3)while PROC MODEL estimates equation (4).

lny = a+ bln(x) + � (3)

y = �x� + � (4)

These are different statistical models. Equation (3) is the log form of equation (5)

y = �x�� (5)

where�=e�. Equation (4), on the other hand, cannot be linearized because the errorterm� (different from�) is additive in the level form.

You must decide whether your model is equation (4) or (5). If the model is equation(4), you should use PROC MODEL. If you linearize equation (1) without consideringthe error term and apply SYSLIN to MODEL LOGY=LOGX, the results will bewrong. On the other hand, if your model is equation (5) (in practice it usually is),and you want to use PROC MODEL to estimate the parameters in thelevelform, youmust do something to account for the multiplicative error.

PROC MODEL estimates parameters by minimizing an objective function. The ob-jective function is computed using either the RESID.-prefixed equation variable orthe EQ.-prefixed equation variable. You must make sure that these prefixed equationvariables are assigned an appropriate error term. If the model has additive errors thatsatisfy the assumptions, nothing needs to be done. In the case of equation (5), theerror is nonadditive and the equation is in normalized form, so you must alter thevalue of RESID.Y.

The following assigns a valid estimate of� to RESID.Y:

y = alpha * x ** beta;resid.y = actual.y / pred.y;

However,�=e� and, therefore,� cannot have a mean of zero and you cannot con-sistently estimate� and� by minimizing the sum of squares of an estimate of�.Instead, you use� = ln�.



proc model data=in;parms alpha beta;y = alpha * x ** beta;resid.y = log( actual.y / pred.y );fit y;

run;

If the model was expressed in general form, this transformation becomes

proc model data=in;parms alpha beta;EQ.trans = log( y / (alpha * x ** beta));fit trans;

run;

Both examples produce estimates of� and � of the level form that match theestimates ofa and b of the log form. That is, ALPHA=exp(INTERCEPT) andBETA=LOGX, where INTERCEPT and LOGX are the PROC SYSLIN parameterestimates from the MODEL LOGY=LOGX. The standard error reported for ALPHAis different from that for the INTERCEPT in the log form.

The preceding example is not intended to suggest that loglinear models should beestimated in level form but, rather, to make the following points:

� Nonlinear transformations of equations involve the error term of the equation,and this should be taken into account when transforming models.

� The RESID.-prefixed and the EQ.-prefixed equation variables for models es-timated by the MODEL procedure must represent additive errors with zeromeans.

� You can use assignments to RESID.-prefixed and EQ.-prefixed equation vari-ables to transform error terms.

� Some models do not have additive errors or zero means, and many such mod-els can be estimated using the MODEL procedure. The preceding approachapplies not only to multiplicative models but to any model that can be manipu-lated to isolate the error term.

Predicted Values of Transformed ModelsNonadditive or transformed errors affect the distribution of the predicted values, aswell as the estimates. For the preceding loglinear example, the MODEL procedureproduces consistent parameter estimates. However, the predicted values for Y com-puted by PROC MODEL are not unbiased estimates of the expected values of Y,although they do estimate the conditional median Y values.

In general, the predicted values produced for a model with nonadditive errors are notunbiased estimates of the conditional means of the endogenous value. If the modelcan be transformed to a model with additive errors by using amonotonictransforma-tion, the predicted values estimate the conditional medians of the endogenous vari-able.



For transformed models in which the biasing factor is known, you can use program-ming statements to correct for the bias in the predicted values as estimates of theendogenous means. In the preceding loglinear case, the predicted values will be bi-ased by the factor exp(�2/2). You can produce approximately unbiased predictedvalues in this case by writing the model as

proc model data=in;parms alpha beta;y=alpha * x ** beta;resid.y = log( actual.y / pred.y );

fit y;run;

Refer to Miller (1984) for a discussion of bias factors for predicted values of trans-formed models.

Note that models with transformed errors are not appropriate for Monte Carlo simu-lation using the SDATA= option. PROC MODEL computes the OUTS= matrix fromthe transformed RESID.-prefixed equation variables, while it uses the SDATA= ma-trix to generate multivariate normal errors, which are added to the predicted values.This method of computing errors is inconsistent when the equation variables havebeen transformed.

Error Covariance Structure Specification

One of the key assumptions of regression is that the variance of the errors is constantacross observations. Correcting for heteroscedasticity improves the efficiency of theestimates.

Consider the following general form for models:

q(yt;xt; �) = "t

"t = Ht � �t

Ht =

26664pht;1 0 : : : 0

0pht;2 : : : 0

. . .0 0 : : :

pht;g

37775

ht = g(yt;xt; �)

where�t � N(0;�).

For models that are homoscedastic,

ht = 1

If you had a model which was heteroscedastic with known form you can improve theefficiency of the estimates by performing a weighted regression. The weight variable,using this notation, would be1=

pht.



If the errors for a model are heteroscedastic and the functional form of the varianceis known, the model for the variance can now be estimated along with the regressionfunction.

To specify a functional form for the variance, assign the function to an H.var variablewherevar is the equation variable. For example, if you wanted to estimate the scaleparameter for the variance of a simple regression model

y = a � x+ b

you can specify

proc model data=s;y = a * x + b;h.y = sigma**2;

fit y;

Consider the same model with the following functional form for the variance:

ht = �2 � x2��

This would be written as

proc model data=s;y = a * x + b;h.y = sigma**2 * x**(2*alpha);

fit y;

There are three ways to model the variance in the MODEL procedure; Feasable gen-eralized least squares; Generalized method of moments; and Full information maxi-mum likelihood.

Feasable GLSA simple approach to estimating a variance function is to estimate the mean param-eters� using some auxilary method, such as OLS, and then use the residuals of thatestimation to estimate the parameters� of the variance function. This scheme iscalledfeasable GLS. It is posible to use the residuals from an auxilary method for thepurpose of estimating� because in many cases the residuals consistently estimate theerror terms.

This scheme can be done by hand by performing OLS estimation ofq(yt;xt; �),the mean function, then by regressing the residuals squared onht, and finally byre-estimating the the mean function using a weight of1=

pht.

For all estimation methods except GMM and FIML, using the H.var syntax specifiesthat feasable GLS will be used in the estimation. For feasable GLS the mean functionis estimated by the usual method. The variance function is then estimated usingpseudolikelihood (PL) function of the generated residuals. The objective function forthe PL estimation is



pn(�; �) =nX

i=1

(yi � f(xi; �))

2

�2h(zi; �)+ log[�2h(zi; �)]

!

Once the variance function has been estimated the mean function is re-estimated us-ing the variance function as weights. If an S-iterated method is selected, this processis repeated until convergence (iterated feasable GLS).

Note, feasable GLS will not yield consistent estimates when one of the following istrue:

� The variance is unbounded.

� There is too much serial dependence in the errors (the dependence does notfade with time).

� A combination of serial dependence and lag dependent variables.

The first two cases are unusual but the third is much more common. Whether iter-ated feasable GLS avoids consistency problems with the last case is an unansweredresearch question. For more information see (Davidson and MacKinnon 1993) pages298-301 or (Gallant 1987) pages 124-125 and (Amemiya 1985) pages 202-203.

One limitation is that parameters can not be shared between the mean equation andthe variance equation. This implies that certian GARCH models, cross equation re-strictions of parameters, or testing of combinations of parameters in the mean andvariance component are not allowed.

Generalized Method of MomentsIn GMM, normally the first moment of the mean function is used in the objectivefunction.

q(yt;xt; �) = �t

E(�t) = 0

To add the second moment conditions to the estimation, add the equation

E("t � "t � ht) = 0

to the model. For example if you wanted to estimate� for linear example above, youcan write

proc model data=s;y = a * x + b;eq.two = resid.y**2 - sigma**2;

fit y two/ gmm;instruments x;run;



This is a popular way to estimate a continuous-time interest rate processes (see (Chan,et al 1992)). The H.var syntax will automatically generate this system of equations.

To further take advantage of the information obtained about the variance, the momentequations can be modified to

E("t=pht) = 0

E("t � "t � ht) = 0

For the above example, this can be written as

proc model data=s;y = a * x + b;eq.two = resid.y**2 - sigma**2;resid.y = resid.y / sigma;

fit y two/ gmm;instruments x;run;

Note that, if the error model is misspecified in this form of the GMM model, theparameter estimates may be inconsistent.

Full Information Maximum LikelihoodFor FIML estimation of variance functions, the concentrated likelihood below is usedas the objective function. That is, the mean function will be coupled with the variancefunction and the system will be solved simultaneously.

ln(�) =ng

2(1 + ln(2�)) �

nXt=1

ln

��@q(yt;xt; �)@yt

��

+1

2

nXt=1

gXi=1

�ln(ht;i) + qi(yt;xt; �)

2=ht;i�

whereg is the number of equations in the system.

The HESSIAN=GLS option is not available for FIML estimation involving variancefunctions. The matrix used when HESSIAN=CROSS is specified is a cross productsmatrix which has been enhanced by the dual quasi-newton approximation.

ExamplesYou can specify a GARCH(1,1) model as follows:

proc model data=modloc.usd_jpy;

/* Mean model --------*/jpyret = intercept ;

/* Variance model ----------------*/h.jpyret = arch0 + arch1 * zlag( resid.jpyret * resid.jpyret )

+ garch1 * zlag(h.jpyret) ;



bounds arch0 arch1 garch1 >= 0;

fit jpyret/method=marquardt fiml;run;

Note that the BOUNDS statement was used to ensure that the parameters were posi-tive, a requirement for GARCH models.

EGARCH models are used because there is no restrictions on the parameters. Youcan specify a EGARCH(1,1) model as follows:

proc model data=sasuser.usd_dem ;

/* Mean model ----------*/demret = intercept ;

/* Variance model ----------------*/if ( _OBS_ =1 ) then

h.demret = exp( earch0/ (1. - egarch1) );else

h.demret = exp( earch0 + earch1 * zlag( g)+ egarch1 * log(zlag(h.demret)));

g = theta * nresid.demret + abs( nresid.demret ) - sqrt(2/3.1415);

/* Fit and save the model */fit demret/method=marquardt fiml maxiter=100run;

Ordinary Differential Equations

Ordinary differential equations (ODEs) are also calledinitial value problemsbecausea time zero value for each first-order differential equation is needed. The followingis an example of a first-order system of ODEs:

y0 = �0:1y + 2:5z2

z0 = �zy0 = 0z0 = 1

Note that you must provide an initial value for each ODE.

As a reminder, anyn-order differential equation can be modeled as a system of first-order differential equations. For example, consider the differential equation

y00

= by0

+ cyy0 = 0y0

0 = 1

which can be written as the system of differential equations

y0

= z



z0

= by0

+ cyy0 = 0z0 = 1

This differential system can be simulated as follows:

data t;time=0; output;time=1; output;time=2; output;

run;

proc model data=t ;dependent y 0 z 1;parm b -2 c -4;

/* Solve y’’=b y’ + c y --------------*/

dert.y = z;dert.z = b * dert.y + c * y;

solve y z / dynamic solveprint;run;

The preceding statements produce the following output. These statements produceadditional output, which is not shown.


Observation 1 Missing 2 CC -1.000000Iterations 0

Solution Values

y z

0.000000 1.000000

Observation 2 Iterations 0 CC 0.000000 ERROR.y 0.000000

Solution Values

y z

0.2096398 -.2687053

Observation 3 Iterations 0 CC 9.464802 ERROR.y -0.234405

Solution Values

y z

-.0247649 -.1035929



The differential variables are distinguished by the derivative with respect to time(DERT.) prefix. Once you define the DERT. variable, you can use it on the right-handside of another equation. The differential equations must be expressed in normalform; implicit differential equations are not allowed, and other terms on the left-handside are not allowed.

The TIME variable is theimplied with respect tovariable for all DERT. variables.The TIME variable is also the only variable that must be in the input data set.

You can provide initial values for the differential equations in the data set, in thedeclaration statement (as in the previous example), or in statements in the code. Usingthe previous example, you can specify the initial values as

proc model data=t ;dependent y z ;parm b -2 c -4;

/* Solve y’’=b y’ + c y --------------*/if ( time=0 ) then

do;y=0;z=1;

end;else

do;dert.y = z;dert.z = b * dert.y + c * y;

end;end;solve y z / dynamic solveprint;

run;

If you do not provide an initial value, 0 is used.

DYNAMIC and STATIC SimulationNote that, in the previous example, the DYNAMIC option was specified in theSOLVE statement. The DYNAMIC and STATIC options work the same for differen-tial equations as they do for dynamic systems. In the differential equation case, theDYNAMIC option makes the initial value needed at each observation the computedvalue from the previous iteration. For a static simulation, the data set must containvalues for the integrated variables. For example, if DERT.Y and DERT.Z are the dif-ferential variables, you must include Y and Z in the input data set in order to do astatic simulation of the model.

If the simulation is dynamic, the initial values for the differential equations are ob-tained from the data set, if they are available. If the variable is not in the data set, youcan specify the initial value in a declaration statement. If you do not specify an initialvalue, the value of 0.0 is used.



A dynamic solution is obtained by solving one initial value problem for all the data.A graph of a simple dynamic simulation is shown in Figure 14.36. If the time variablefor the current observation is less than the time variable for the previous observation,the integration is restarted from this point. This allows for multiple samples in onedata file.

Figure 14.36. Dynamic Solution

ACTUAL

TIME

Y

INTEGRATION

In a static solution, n-1 initial value problems are solved using the first n-1 data valuesas initial values. The equations are integrated using theith data value as an initialvalue to the i+1 data value. Figure 14.37 displays a static simulation of noisy datafrom a simple differential equation. The static solution does not propagate errors ininitial values as the dynamic solution does.

Figure 14.37. Static Solution

TIME

Y

ACTUAL

INTEGRATION



For estimation, the DYNAMIC and STATIC options in the FIT statement performthe same functions as they do in the SOLVE statement. Components of differentialsystems that have missing values or are not in the data set are simulated dynamically.For example, often in multiple compartment kinetic models, only one compartmentis monitored. The differential equations describing the unmonitored compartmentsare simulated dynamically.

For estimation, it is important to have accurate initial values for ODEs that are not inthe data set. If an accurate initial value is not known, the initial value can be madean unknown parameter and estimated. This allows for errors in the initial values butincreases the number of parameters to estimate by the number of equations.

Estimation of Differential EquationsConsider the kinetic model for the accumulation of mercury (Hg) in mosquito fish(Matis, Miller, and Allen 1991, p. 177). The model for this process is the one-compartment constant infusion model shown in Figure 14.38.

KuFish

Ke

Figure 14.38. One-Compartment Constant Infusion Model

The differential equation that models this process is

dconc

dt= ku � keconc

conc0 = 0

The analytical solution to the model is

conc = (ku=ke)(1 � exp(�ket))

The data for the model are

data fish;input day conc;datalines;

0.0 0.01.0 0.152.0 0.23.0 0.264.0 0.326.0 0.33;run;



To fit this model in differential form, use the following statements:

proc model data=fish;parm ku ke;

dert.conc = ku - ke * conc;

fit conc / time=day;run;

The results from this estimation are shown in Figure 14.39.

The MODEL Procedure



ku 0.180159 0.0312 5.78 0.0044ke 0.524661 0.1181 4.44 0.0113

Figure 14.39. Static Estimation Results for Fish Model

To perform a dynamic estimation of the differential equation, add the DYNAMICoption to the FIT statement.

proc model data=fish;parm ku .3 ke .3;


fit conc / time = day dynamic;run;

The equation DERT.CONC is integrated fromconc(0) = 0. The results from thisestimation are shown in Figure 14.40.

The MODEL Procedure



ku 0.167109 0.0170 9.84 0.0006ke 0.469033 0.0731 6.42 0.0030

Figure 14.40. Dynamic Estimation Results for Fish Model

To perform a dynamic estimation of the differential equation and estimate the initialvalue, use the following statements:

proc model data=fish;parm ku .3 ke .3 conc0 0;




fit conc initial=(conc = conc0) / time = day dynamic;run;

The INITIAL= option in the FIT statement is used to associate the initial value of adifferential equation with a parameter. The results from this estimation are shown inFigure 14.41.

The MODEL Procedure



ku 0.164408 0.0230 7.14 0.0057ke 0.45949 0.0943 4.87 0.0165conc0 0.003798 0.0174 0.22 0.8414

Figure 14.41. Dynamic Estimation with Initial Value for Fish Model

Finally, to estimate the fish model using the analytical solution, use the followingstatements:

proc model data=fish;parm ku .3 ke .3;

conc = (ku/ ke)*( 1 -exp(-ke * day));

fit conc;run;

The results from this estimation are shown in Figure 14.42.

The MODEL Procedure



ku 0.167109 0.0170 9.84 0.0006ke 0.469033 0.0731 6.42 0.0030

Figure 14.42. Analytical Estimation Results for Fish Model

A comparison of the results among the four estimations reveals that the two dynamicestimations and the analytical estimation give nearly identical results (identical to thedefault precision). The two dynamic estimations are identical because the estimatedinitial value (0.00013071) is very close to the initial value used in the first dynamicestimation (0). Note also that the static model did not require an initial guess for theparameter values. Static estimation, in general, is more forgiving of bad initial values.

The form of the estimation that is preferred depends mostly on the model and data.



If a very accurate initial value is known, then a dynamic estimation makes sense.If, additionally, the model can be written analytically, then the analytical estimationis computationally simpler. If only an approximate initial value is known and notmodeled as an unknown parameter, the static estimation is less sensitive to errors inthe initial value.

The form of the error in the model is also an important factor in choosing the formof the estimation. If the error term is additive and independent of previous error, thenthe dynamic mode is appropriate. If, on the other hand, the errors are cumulative, astatic estimation is more appropriate. See the section "Monte Carlo Simulation" foran example.

Auxiliary EquationsAuxiliary equations can be used with differential equations. These are equations thatneed to be satisfied with the differential equations at each point between each datavalue. They are automatically added to the system, so you do not need to specifythem in the SOLVE or FIT statement.

Consider the following example.

The Michaelis-Menten Equations describe the kinetics of an enzyme-catalyzed reac-tion. The enzyme is E, and S is called thesubstrate. The enzyme first reacts withthe substrate to form the enzyme-substrate complex ES, which then breaks down in asecond step to form enzyme and products P.

The reaction rates are described by the following system of differential equations:

d[ES ]

dt= k1([E ]� [ES ])[S ]� k2[ES ]� k3[ES ]

d[S ]

dt= �k1([E ]� [ES ])[S ] + k2[ES ]

[E ] = [E ]tot � [ES ]

The first equation describes the rate of formation of ES from E + S. The rate offormation of ES from E + P is very small and can be ignored. The enzyme is ineither the complexed or the uncomplexed form. So if the total ([E ]tot) concentrationof enzyme and the amount bound to the substrate is known,[E ] can be obtained byconservation.

In this example, the conservation equation is an auxiliary equation and is coupledwith the differential equations for integration.

Time VariableYou must provide a time variable in the data set. The name of the time variabledefaults to TIME. You can use other variables as the time variable by specifying theTIME= option in the FIT or SOLVE statement. The time intervals need not be evenlyspaced. If the time variable for the current observation is less than the time variablefor the previous observation, the integration is restarted.



Differential Equations and Goal SeekingConsider the following differential equation

y0

= a�x

and the data set

data t2;y=0; time=0; output;y=2; time=1; output;y=3; time=2; output;

run;

The problem is to find values for X that satisfy the differential equation and the data inthe data set. Problems of this kind are sometimes referred to asgoal seeking problemsbecause they require you to search for values of X that will satisfy the goal of Y.

This problem is solved with the following statements:

proc model data=t2 ;dependent x 0;independent y;parm a 5;dert.y = a * x;solve x / out=foo;

run;

proc print data=foo; run;

The output from the PROC PRINT statement is shown in Figure 14.43.

Obs _TYPE_ _MODE_ _ERRORS_ x y time

1 PREDICT SIMULATE 0 0.00000 0.00000 02 PREDICT SIMULATE 0 0.80000 2.00000 13 PREDICT SIMULATE 0 -0.40000 3.00000 2

Figure 14.43. Dynamic Solution

Note that an initial value of 0 is provided for the X variable because it is undeterminedat TIME = 0.

In the preceding goal seeking example, X is treated as a linear function between eachset of data points (see Figure 14.44).



To T

X

Xf

f

o

Figure 14.44. Form of X Used for Integration in Goal Seeking

If you integratey0

= ax manually, you have

x(t) =tf � t

tf � t0x0 +

�T0tf � t0

xf

y =

Z tf

to

ax(t)dt

= a1

tf � t0(t(tfx0 � t0xf ) +

1

2t2(xf � x0))jtft0

For observation 2, this reduces to

y =1

2a�xf

2 = 2:5�xf

Sox = 0:8 for this observation.

Goal seeking for the TIME variable is not allowed.



Restrictions and Bounds on Parameters

Using the BOUNDS and RESTRICT statements, PROC MODEL can compute opti-mal estimates subject to equality or inequality constraints on the parameter estimates.

Equality restrictions can be written as a vector function

h(�) = 0

Inequality restrictions are either active or inactive. When an inequality restriction isactive, it is treated as an equality restriction. All inactive inequality restrictions canbe written as a vector function

F (�) � 0

Strict inequalities, such as(f(�) > 0), are transformed into inequalities asf(�) �(1� �)� � � 0, where the tolerance� is controlled by the EPSILON= option on theFIT statement and defaults to10�8. The ith inequality restriction becomes active ifFi < 0 and remains active until its lagrange multiplier becomes negative. Lagrangemultipliers are computed for all the nonredundant equality restrictions and all theactive inequality restrictions.

For the following, assume the vectorh(�) contains all the current active restrictions.The constraint matrix A is

A(�) =@h(�)

@�

The covariance matrix for the restricted parameter estimates is computed as

Z(Z 0HZ)�1Z 0

where H is Hessian or approximation to the Hessian of the objective function((X 0(diag(S)�1I)X) for OLS), and Z is the last(np� nc) columns of Q. Q is froman LQ factorization of the constraint matrix,nc is the number of active constraints,andnp is the number of parameters. Refer to Gill, Murray, and Wright (1981) formore details on LQ factorization. The covariance column in Table 14.1 summarizesthe Hessian approximation used for each estimation method.

The covariance matrix for the Lagrange multipliers is computed as

(AH�1A0)�1

The p-value reported for a restriction is computed from a beta distribution ratherthan at-distribution because the numerator and the denominator of thet-ratio for anestimated Lagrange multiplier are not independent.



The Lagrange multipliers for the active restrictions are printed with the parameterestimates. The Lagrange multiplier estimates are computed using the relationship

A0

� = g

where the dimension of the constraint matrixA is the number of constraints by thenumber of parameters,� is the vector of Lagrange multipliers, andg is the gradientof the objective function at the final estimates.

The final gradient includes the effects of the estimated S matrix. For example, forOLS the final gradient would be:

g = X 0(diag(S)�1I)r

wherer is the residual vector. Note that when nonlinear restrictions are imposed, theconvergence measure R may have values greater than one for some iterations.

Tests on Parameters

In general, the hypothesis tested can be written as

H0 : h(�) = 0

whereh(�) is a vector valued function of the parameters� given by ther expressionsspecified on the TEST statement.

Let V be the estimate of the covariance matrix of�. Let � be the unconstrainedestimate of� and~� be the constrained estimate of� such thath(~�) = 0. Let

A(�) = @h(�)=@� j�

Let r be the dimension ofh(�) and n be the number of observations. Using thisnotation, the test statistics for the three kinds of tests are computed as follows.

The Wald test statistic is defined as

W = h0

(�)8:A(�)V A0

(�)9;�1

h(�)

The Wald test is not invariant to reparameterization of the model (Gregory 1985,Gallant 1987, p. 219). For more information on the theoretical properties of the Waldtest see Phillips and Park 1988.

The Lagrange multiplier test statistic is

R = �0

A(~�) ~V �1A0

(~�)�



where� is the vector of Lagrange multipliers from the computation of the restrictedestimate~�.

The Lagrange multiplier test statistic is equivalent to Rao’s efficient score test statis-tic:

R = (@L(~�)=@�)0 ~V �1(@L(~�)=@�)

whereL is the log likelihood function for the estimation method used. For OLS andSUR the Lagrange multiplier test statistic is computed as:

R = [(@S(~�)=@�)0 ~V �1(@S(~�)=@�)]=S(~�)

whereS(~�) is the corresponding objective function value at the constrained estimate.

The likelihood ratio test statistic is

T = 2�L(~�)� L(�)

�

where~� represents the constrained estimate of� andL is the concentrated log likeli-hood value.

For OLS and SUR, the likelihood ratio test statistic is computed as:

T = 0:5 � df � (n� nparms)� (S(~�)� S(�))=S(�)

wheredf is the difference in degrees of freedom for the full and restricted models,andnparms is the number of parameters in the full system.

The Likelihood ratio test is not appropriate for models with nonstationary seriallycorrelated errors (Gallant 1987, p. 139). The likelihood ratio test should not be usedfor dynamic systems, for systems with lagged dependent variables, or with the FIMLestimation method unless certain conditions are met (see Gallant 1987, p. 479).

For each kind of test, under the null hypothesis the test statistic is asymptotically dis-tributed as a�2 random variable withr degrees of freedom, wherer is the number ofexpressions on the TEST statement. Thep-values reported for the tests are computedfrom the�2(r) distribution and are only asymptotically valid.

Monte Carlo simulations suggest that the asymptotic distribution of the Wald testis a poorer approximation to its small sample distribution than the other two tests.However, the Wald test has the least computational cost, since it does not requirecomputation of the constrained estimate~�.

The following is an example of using the TEST statement to perform a likelihoodratio test for a compound hypothesis.

test a*exp(-k) = 1-k, d = 0 ,/ lr;



It is important to keep in mind that although individualt tests for each parameter areprinted by default into the parameter estimates table, they are only asymptoticallyvalid for nonlinear models. You should be cautious in drawing any inferences fromtheset tests for small samples.

Hausman Specification Test

Hausman’s specification test, orm-statistic, can be used to test hypotheses in termsof bias or inconsistency of an estimator. This test was also proposed by Wu (1973).Hausman’sm-statistic is as follows.

Given two estimators,�0 and�1, where under the null hypothesis both estimators areconsistent but only�0 is asymptotically efficient and under the alternative hypothesisonly �1 is consistent, them-statistic is

m = q0(V1 � V0)�q

whereV1 andV0 represent consistent estimates of the asymptotic covariance matri-ces of�1 and�0, and

q = �1 � �0

Them-statistic is then distributed�2 with k degrees of freedom, wherek is the rank ofthe matrix(V1 � V0). A generalized inverse is used, as recommended by Hausman(1982).

In the MODEL procedure, Hausman’sm-statistic can be used to determine if it isnecessary to use an instrumental variables method rather than a more efficient OLSestimation. Hausman’sm-statistic can also be used to compare 2SLS with 3SLS for aclass of estimators for which 3SLS is asymptotically efficient (similarly for OLS andSUR).

Hausman’sm-statistic can also be used, in principle, to test the null hypothesis ofnormality when comparing 3SLS to FIML. Because of the poor performance of thisform of the test, it is not offered in the MODEL procedure. Refer to R.C. Fair (1984,pp. 246-247) for a discussion of why Hausman’s test fails for common econometricmodels.

To perform a Hausman’s specification test, specify the HAUSMAN option in theFIT statement. The selected estimation methods are compared using Hausman’sm-statistic.

In the following example, OLS, SUR, 2SLS, 3SLS, and FIML are used to estimate amodel, and Hausman’s test is requested.

proc model data=one out=fiml2;endogenous y1 y2;

y1 = py2 * y2 + px1 * x1 + interc;



y2 = py1* y1 + pz1 * z1 + d2;

fit y1 y2 / ols sur 2sls 3sls fiml hausman;instruments x1 z1;

run;

The output specified by the HAUSMAN option produces the following results.

The MODEL Procedure

Hausman’s Specification Test ResultsComparing To DF Statistic Pr > ChiSq

OLS SUR 6 32.47 <.0001OLS 2SLS 6 13.86 0.0313OLS 3SLS 6 -0.07 .2SLS 3SLS 6 0.00 1.0000

Figure 14.45. Hausman’s Specification Test Results

Figure 14.45 indicates that 2SLS, a system estimation method, is preferred over OLS.The model needs an IV estimator but not a full error covariance matrix. Note that theFIML estimation results are not compared.

Chow Tests

The Chow test is used to test for break points or structural changes in a model. Theproblem is posed as a partitioning of the data into two parts of sizen1 andn2. Thenull hypothesis to be tested is

Ho : �1 = �2 = �

where�1 is estimated using the first part of the data and�2 is estimated using thesecond part.

The test is performed as follows (refer to Davidson and MacKinnon 1993, p. 380).

1. Thep parameters of the model are estimated.

2. A second linear regression is performed on the residuals,u, from the nonlinearestimation in step one.

u = X b+ residuals

whereX is Jacobian columns that are evaluated at the parameter estimates.If the estimation is an instrumental variables estimation with matrix of instru-ments W, then the following regression is performed:

u = PW �X b+ residuals

wherePW � is the projection matrix.



3. The restricted SSE (RSSE) from this regression is obtained. An SSE for eachsubsample is then obtained using the same linear regression.

4. TheF statistic is then

f =(RSSE � SSE1 � SSE 2)=p

(SSE 1 + SSE 2)=(n� 2p)

This test hasp andn� 2p degrees of freedom.

Chow’s test is not applicable ifmin(n1; n2) < p, since one of the two subsamplesdoes not contain enough data to estimate�. In this instance, thepredictive Chow testcan be used. The predictive Chow test is defined as

f =(RSSE � SSE 1)�(n1 � p)

SSE1�n2

wheren1 > p. This test can be derived from the Chow test by noting that theSSE 2 = 0 whenn2 <= p and by adjusting the degrees of freedom appropriately.

You can select the Chow test and the predictive Chow test by specifying theCHOW=arg and the PCHOW=arg options in the FIT statement, wherearg is eitherthe number of observations in the first sample or a parenthesized list of first samplesizes. If the sizes for the second or the first group are less than the number of parame-ters, then a PCHOW test is automatically used. These tests statistics are not producedfor GMM and FIML estimations.

The following is an example of the use of the Chow test.

data exp;x=0;do time=1 to 100;

if time=50 then x=1;y = 35 * exp( 0.01 * time ) + rannor( 123 ) + x * 5;output;

end;run;

proc model data=exp;parm zo 35 b;

dert.z = b * z;y=z;

fit y init=(z=zo) / chow =(40 50 60) pchow=90;run;

The data set introduced an artificial structural change into the model (the structuralchange effects the intercept parameter). The output from the requested Chow testsare shown in Figure 14.46.



The MODEL Procedure

Structural Change Test

BreakTest Point Num DF Den DF F Value Pr > F

Chow 40 2 96 12.95 <.0001Chow 50 2 96 101.37 <.0001Chow 60 2 96 26.43 <.0001Predictive Chow 90 11 87 1.86 0.0566

Figure 14.46. Chow’s Test Results

Profile Likelihood Confidence Intervals

Wald-based and likelihood ratio-based confidence intervals are available in theMODEL procedure for computing a confidence interval on an estimated parameter.A confidence interval on a parameter� can be constructed by inverting a Wald-basedor a likelihood ratio-based test.

The approximate100(1 � �) % Wald confidence interval for a parameter� is

��z1��=2�

wherezp is the100pth percentile of the standard normal distribution,� is the maxi-mum likelihood estimate of�, and� is the standard error estimate of�.

A likelihood ratio-based confidence interval is derived from the�2 distribution of thegeneralized likelihood ratio test. The approximate1� � confidence interval for aparameter� is

� : 2[l(�)� l(�)]�q1;1�� = 2l�

whereq1;1�� is the(1� �) quantile of the�2 with one degree of freedom, andl(�)is the log likelihood as a function of one parameter. The endpoints of a confidenceinterval are the zeros of the functionl(�)� l�. Computing a likelihood ratio-basedconfidence interval is an iterative process. This process must be performed twice foreach parameter, so the computational cost is considerable. Using a modified form ofthe algorithm recommended by Venzon and Moolgavkar (1988), you can determinethat the cost of each endpoint computation is approximately the cost of estimating theoriginal system.

To request confidence intervals on estimated parameters, specify the following optionin the FIT statement:

PRL= WALD | LR | BOTHBy default the PRL option produces 95% likelihood ratio confidence limits. Thecoverage of the confidence interval is controlled by the ALPHA= option in the FITstatement.

The following is an example of the use of the confidence interval options.



data exp;do time = 1 to 20;

y = 35 * exp( 0.01 * time ) + 5*rannor( 123 );output;end;

run;

proc model data=exp;parm zo 35 b;

dert.z = b * z;y=z;

fit y init=(z=zo) / prl=both;test zo = 40.475437 ,/lr;

run;

The output from the requested confidence intervals and the TEST statement areshown in Figure 14.47

The MODEL Procedure



zo 36.58933 1.9471 18.79 <.0001b 0.006497 0.00464 1.40 0.1780

Test Results

Test Type Statistic Pr > ChiSq Label

Test0 L.R. 3.81 0.0509 zo = 40.475437

Parameter Wald95% Confidence Intervals

Parameter Value Lower Upper

zo 36.5893 32.7730 40.4056b 0.00650 -0.00259 0.0156

Parameter Likelihood Ratio95% Confidence Intervals

Parameter Value Lower Upper

zo 36.5893 32.8381 40.4921b 0.00650 -0.00264 0.0157

Figure 14.47. Confidence Interval Estimation

Note that the likelihood ratio test reported the probability thatzo = 40:47543 is5% but zo = 40:47543 is the upper bound of a 95% confidence interval. To un-derstand this conundrum, note that the TEST statement is using the likelihood ra-tio statistic to test the null hypothesisH0 : zo = 40:47543 with the alternate thatHa : zo 6=40:47543. The upper confidence interval can be viewed as a test with thenull hypothesisH0 : zo <= 40:47543.



Choice of Instruments

Several of the estimation methods supported by PROC MODEL are instrumentalvariables methods. There is no standard method for choosing instruments for nonlin-ear regression. Few econometric textbooks discuss the selection of instruments fornonlinear models. Refer to Bowden, R.J. and Turkington, D.A. (1984, p. 180-182)for more information.

The purpose of the instrumental projection is to purge the regressors of their corre-lation with the residual. For nonlinear systems, the regressors are the partials of theresiduals with respect to the parameters.

Possible instrumental variables include

� any variable in the model that is independent of the errors

� lags of variables in the system

� derivatives with respect to the parameters, if the derivatives are independent ofthe errors

� low degree polynomials in the exogenous variables

� variables from the data set or functions of variables from the data set.

Selected instruments must not

� depend on any variable endogenous with respect to the equations estimated

� depend on any of the parameters estimated

� be lags of endogenous variables if there is serial correlation of the errors.

If the preceding rules are satisfied and there are enough observations to support thenumber of instruments used, the results should be consistent and the efficiency lossheld to a minimum.

You need at least as many instruments as the maximum number of parameters inany equation, or some of the parameters cannot be estimated. Note thatnumber ofinstrumentsmeans linearly independent instruments. If you add an instrument that isa linear combination of other instruments, it has no effect and does not increase theeffective number of instruments.

You can, however, use too many instruments. In order to get the benefit of instru-mental variables, you must have more observations than instruments. Thus, there isa trade-off; the instrumental variables technique completely eliminates the simulta-neous equation bias only in large samples. In finite samples, the larger the excessof observations over instruments, the more the bias is reduced. Adding more instru-ments may improve the efficiency, but after some point efficiency declines as theexcess of observations over instruments becomes smaller and the bias grows.

The instruments used in an estimation are printed out at the beginning of the estima-tion. For example, the following statements produce the instruments list shown inFigure 14.48.



proc model data=test2;exogenous x1 x2;parms b1 a1 a2 b2 2.5 c2 55;y1 = a1 * y2 + b1 * exp(x1);y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2;fit y1 y2 / n2sls;inst b1 b2 c2 x1 ;

run;

The MODEL Procedure

The 2 Equations to Estimate

y1 = F(b1, a1(y2))y2 = F(a2(y1), b2, c2)

Instruments 1 x1 @y1/@b1 @y2/@b2 @y2/@c2

Figure 14.48. Instruments Used Message

This states that an intercept term, the exogenous variable X1, and the partial deriva-tives of the equations with respect to B1, B2, and C2, were used as instruments forthe estimation.

ExamplesSuppose that Y1 and Y2 are endogenous variables, that X1 and X2 are exogenousvariables, and that A, B, C, D, E, F, and G are parameters. Consider the followingmodel:

y1 = a + b * x1 + c * y2 + d * lag(y1);y2 = e + f * x2 + g * y1;fit y1 y2;instruments exclude=(c g);

The INSTRUMENTS statement produces X1, X2, LAG(Y1), and an intercept asinstruments.

In order to estimate the Y1 equation by itself, it is necessary to include X2 explicitlyin the instruments since F, in this case, is not included in the estimation

y1 = a + b * x1 + c * y2 + d * lag(y1);y2 = e + f * x2 + g * y1;fit y1;instruments x2 exclude=(c);

This produces the same instruments as before. You can list the parameter associatedwith the lagged variable as an instrument instead of using the EXCLUDE= option.Thus, the following is equivalent to the previous example:

y1 = a + b * x1 + c * y2 + d * lag(y1);y2 = e + f * x2 + g * y1;fit y1;instruments x1 x2 d;



For an example of declaring instruments when estimating a model involving identi-ties, consider Klein’s Model I

proc model data=klien;endogenous c p w i x wsum k y;exogenous wp g t year;parms c0-c3 i0-i3 w0-w3;a: c = c0 + c1 * p + c2 * lag(p) + c3 * wsum;b: i = i0 + i1 * p + i2 * lag(p) + i3 * lag(k);c: w = w0 + w1 * x + w2 * lag(x) + w3 * year;x = c + i + g;y = c + i + g-t;p = x-w-t;k = lag(k) + i;wsum = w + wp;

The three equations to estimate are identified by the labels A, B, and C. The param-eters associated with the predetermined terms are C2, I2, I3, W2, and W3 (and theintercepts, which are automatically added to the instruments). In addition, the systemincludes five identities that contain the predetermined variables G, T, LAG(K), andWP. Thus, the INSTRUMENTS statement can be written as

lagk = lag(k);instruments c2 i2 i3 w2 w3 g t wp lagk;

where LAGK is a program variable used to hold LAG(K). However, this is more com-plicated than it needs to be. Except for LAG(K), all the predetermined terms in theidentities are exogenous variables, and LAG(K) is already included as the coefficientof I3. There are also more parameters for predetermined terms than for endogenousterms, so you might prefer to use the EXCLUDE= option. Thus, you can specify thesame instruments list with the simpler statement

instruments _exog_ exclude=(c1 c3 i1 w1);

To illustrate the use of polynomial terms as instrumental variables, consider the fol-lowing model:

y1 = a + b * exp( c * x1 ) + d * log( x2 ) + e * exp( f * y2 );

The parameters are A, B, C, D, E, and F, and the right-hand-side variables are X1, X2,and Y2. Assume that X1 and X2 are exogenous (independent of the error), while Y2is endogenous. The equation for Y2 is not specified, but assume that it includes thevariables X1, X3, and Y1, with X3 exogenous, so the exogenous variables of the fullsystem are X1, X2, and X3. Using as instruments quadratic terms in the exogenousvariables, the model is specified to PROC MODEL as follows.



proc model;parms a b c d e f;y1 = a + b * exp( c * x1 ) + d * log( x2 ) + e * exp( f * y2 );instruments inst1-inst9;inst1 = x1; inst2 = x2; inst3 = x3;inst4 = x1 * x1; inst5 = x1 * x2; inst6 = x1 * x3;inst7 = x2 * x2; inst8 = x2 * x3; inst9 = x3 * x3;fit y1 / 2sls;

run;

It is not clear what degree polynomial should be used. There is no way to know howgood the approximation is for any degree chosen, although the first-stageR2s mayhelp the assessment.

First-Stage R2sWhen the FSRSQ option is used on the FIT statement, the MODEL procedure prints acolumn of first-stageR2 (FSRSQ) statistics along with the parameter estimates. TheFSRSQ measures the fraction of the variation of the derivative column associatedwith the parameter that remains after projection through the instruments.

Ideally, the FSRSQ should be very close to 1.00 for exogenous derivatives. If theFSRSQ is small for an endogenous derivative, it is unclear whether this reflects a poorchoice of instruments or a large influence of the errors in the endogenous right-hand-side variables. When the FSRSQ for one or more parameters is small, the standarderrors of the parameter estimates are likely to be large.

Note that you can make all the FSRSQs larger (or 1.00) by including more instru-ments, because of the disadvantage discussed previously. The FSRSQ statistics re-ported are unadjustedR2s and do not include a degrees-of-freedom correction.

Autoregressive Moving Average Error Processes

Autoregressive moving average error processes (ARMA errors) and other modelsinvolving lags of error terms can be estimated using FIT statements and simulated orforecast using SOLVE statements. ARMA models for the error process are often usedfor models with autocorrelated residuals. The %AR macro can be used to specifymodels with autoregressive error processes. The %MA macro can be used to specifymodels with moving average error processes.

Autoregressive ErrorsA model with first-order autoregressive errors, AR(1), has the form

yt = f(xt; �) + �t

�t = ��t�1 + �t

while an AR(2) error process has the form

�t = �1�t�1 + �2�t�2 + �t



and so forth for higher-order processes. Note that the�t’s are independent and iden-tically distributed and have an expected value of 0.

An example of a model with an AR(2) component is

y = �+ �x1 + �t

�t = �1�t�1 + �2�t�2 + �t

You would write this model as follows:

proc model data=in;parms a b p1 p2;y = a + b * x1 + p1 * zlag1(y - (a + b * x1)) +

p2 * zlag2(y - (a + b * x1));fit y;

run;

or equivalently using the %AR macro as

proc model data=in;parms a b;y = a + b * x1;%ar( y, 2 );fit y;

run;

Moving Average ModelsA model with first-order moving average errors, MA(1), has the form

yt = f(xt) + �t

�t = �t � �1�t�1

where�t is identically and independently distributed with mean zero. An MA(2) errorprocess has the form

�t = �t � �1�t�1 � �2�t�2

and so forth for higher-order processes.

For example, you can write a simple linear regression model with MA(2) movingaverage errors as

proc model data=inma2;parms a b ma1 ma2;y = a + b * x + ma1 * zlag1( resid.y ) +

ma2 * zlag2( resid.y );fit;

run;

where MA1 and MA2 are the moving average parameters.



Note that RESID.Y is automatically defined by PROC MODEL as

pred.y = a + b * x + ma1 * zlag1( resid.y ) +ma2 * zlag2( resid.y );

resid.y = actual.y - pred.y;

Note that RESID.Y is�t.

The ZLAG function must be used for MA models to truncate the recursion of the lags.This ensures that the lagged errors start at zero in the lag-priming phase and do notpropagate missing values when lag-priming period variables are missing, and ensuresthat the future errors are zero rather than missing during simulation or forecasting.For details on the lag functions, see the section "Lag Logic."

This model written using the %MA macro is

proc model data=inma2;parms a b;y = a + b * x;%ma(y, 2);fit;

run;

General Form for ARMA ModelsThe general ARMA(p,q) process has the following form

�t = �1�t�1 + : : :+ �p�t�p + �t � �1�t�1 � : : :� �q�t�q

An ARMA(p,q) model can be specified as follows

yhat = ... compute structural predicted value here ... ;yarma = ar1 * zlag1( y - yhat ) + ... /* ar part */

+ ar(p) * zlag(p)( y - yhat )+ ma1 * zlag1( resid.y ) + ... /* ma part */

+ ma(q) * zlag(q)( resid.y );y = yhat + yarma;

where ARi and MAj represent the autoregressive and moving average parameters forthe various lags. You can use any names you want for these variables, and there aremany equivalent ways that the specification could be written.

Vector ARMA processes can also be estimated with PROC MODEL. For example,a two-variable AR(1) process for the errors of the two endogenous variables Y1 andY2 can be specified as follows

y1hat = ... compute structural predicted value here ... ;

y1 = y1hat + ar1_1 * zlag1( y1 - y1hat ) /* ar part y1,y1 */+ ar1_2 * zlag1( y2 - y2hat ); /* ar part y1,y2 */

y21hat = ... compute structural predicted value here ... ;



y2 = y2hat + ar2_2 * zlag1( y2 - y2hat ) /* ar part y2,y2 */+ ar2_1 * zlag1( y1 - y1hat ); /* ar part y2,y1 */

Convergence Problems with ARMA ModelsARMA models can be difficult to estimate. If the parameter estimates are not withinthe appropriate range, a moving average model’s residual terms will grow exponen-tially. The calculated residuals for later observations can be very large or can over-flow. This can happen either because improper starting values were used or becausethe iterations moved away from reasonable values.

Care should be used in choosing starting values for ARMA parameters. Starting val-ues of .001 for ARMA parameters usually work if the model fits the data well and theproblem is well-conditioned. Note that an MA model can often be approximated bya high order AR model, and vice versa. This may result in high collinearity in mixedARMA models, which in turn can cause serious ill-conditioning in the calculationsand instability of the parameter estimates.

If you have convergence problems while estimating a model with ARMA error pro-cesses, try to estimate in steps. First, use a FIT statement to estimate only the struc-tural parameters with the ARMA parameters held to zero (or to reasonable prior esti-mates if available). Next, use another FIT statement to estimate the ARMA parame-ters only, using the structural parameter values from the first run. Since the values ofthe structural parameters are likely to be close to their final estimates, the ARMA pa-rameter estimates may now converge. Finally, use another FIT statement to producesimultaneous estimates of all the parameters. Since the initial values of the parame-ters are now likely to be quite close to their final joint estimates, the estimates shouldconverge quickly if the model is appropriate for the data.

AR Initial ConditionsThe initial lags of the error terms of AR(p) models can be modeled in different ways.The autoregressive error startup methods supported by SAS/ETS procedures are thefollowing:

CLS conditional least squares (ARIMA and MODEL procedures)

ULS unconditional least squares (AUTOREG, ARIMA, and MODELprocedures)

ML maximum likelihood (AUTOREG, ARIMA, and MODEL proce-dures)

YW Yule-Walker (AUTOREG procedure only)

HL Hildreth-Lu, which deletes the firstp observations (MODEL pro-cedure only)

See Chapter 8, for an explanation and discussion of the merits of various AR(p)startup methods.

The CLS, ULS, ML, and HL initializations can be performed by PROC MODEL.For AR(1) errors, these initializations can be produced as shown in Table 14.2. Thesemethods are equivalent in large samples.



Table 14.2. Initializations Performed by PROC MODEL: AR(1) ERRORS

Method Formulaconditional least Y=YHAT+AR1*ZLAG1(Y-YHAT);squaresunconditional least Y=YHAT+AR1*ZLAG1(Y-YHAT);squares IF –OBS–=1 THEN

RESID.Y=SQRT(1-AR1**2)*RESID.Y;maximum likelihood Y=YHAT+AR1*ZLAG1(Y-YHAT);

W=(1-AR1**2)**(-1/(2* –NUSED–));IF –OBS–=1 THEN W=W*SQRT(1-AR1**2);RESID.Y=W*RESID.Y;

Hildreth-Lu Y=YHAT+AR1*LAG1(Y-YHAT);

MA Initial ConditionsThe initial lags of the error terms of MA(q) models can also be modeled in differentways. The following moving average error startup paradigms are supported by theARIMA and MODEL procedures:

ULS unconditional least squares

CLS conditional least squares

ML maximum likelihood

The conditional least-squares method of estimating moving average error terms isnot optimal because it ignores the startup problem. This reduces the efficiency ofthe estimates, although they remain unbiased. The initial lagged residuals, extend-ing before the start of the data, are assumed to be 0, their unconditional expectedvalue. This introduces a difference between these residuals and the generalized least-squares residuals for the moving average covariance, which, unlike the autoregressivemodel, persists through the data set. Usually this difference converges quickly to 0,but for nearly noninvertible moving average processes the convergence is quite slow.To minimize this problem, you should have plenty of data, and the moving averageparameter estimates should be well within the invertible range.

This problem can be corrected at the expense of writing a more complex program.Unconditional least-squares estimates for the MA(1) process can be produced byspecifying the model as follows:

yhat = ... compute structural predicted value here ... ;if _obs_ = 1 then do;

h = sqrt( 1 + ma1 ** 2 );y = yhat;resid.y = ( y - yhat ) / h;end;

else do;g = ma1 / zlag1( h );h = sqrt( 1 + ma1 ** 2 - g ** 2 );y = yhat + g * zlag1( resid.y );resid.y = ( ( y - yhat) - g * zlag1( resid.y ) ) / h;end;



Moving-average errors can be difficult to estimate. You should consider using anAR(p) approximation to the moving average process. A moving average process canusually be well-approximated by an autoregressive process if the data have not beensmoothed or differenced.

The %AR MacroThe SAS macro %AR generates programming statements for PROC MODEL forautoregressive models. The %AR macro is part of SAS/ETS software and no specialoptions need to be set to use the macro. The autoregressive process can be applied tothe structural equation errors or to the endogenous series themselves.

The %AR macro can be used for

� univariate autoregression

� unrestricted vector autoregression

� restricted vector autoregression.

Univariate AutoregressionTo model the error term of an equation as an autoregressive process, use the followingstatement after the equation:

%ar( varname, nlags )

For example, suppose that Y is a linear function of X1 and X2, and an AR(2) error.You would write this model as follows:

proc model data=in;parms a b c;y = a + b * x1 + c * x2;%ar( y, 2 )fit y / list;

run;

The calls to %AR must comeafter all of the equations that the process applies to.

The proceding macro invocation, %AR(y,2), produces the statements shown in theLIST output in Figure 14.49.

The MODEL Procedure

Listing of Compiled Program CodeStmt Line:Col Statement as Parsed

1 5738:50 PRED.y = a + b * x1 + c * x2;1 5738:50 RESID.y = PRED.y - ACTUAL.y;1 5738:50 ERROR.y = PRED.y - y;2 7987:23 _PRED__y = PRED.y;3 8003:15 #OLD_PRED.y = PRED.y + y_l1

* ZLAG1( y - _PRED__y ) + y_l2* ZLAG2( y - _PRED__y );

3 8003:15 PRED.y = #OLD_PRED.y;3 8003:15 RESID.y = PRED.y - ACTUAL.y;3 8003:15 ERROR.y = PRED.y - y;

Figure 14.49. LIST Option Output for an AR(2) Model



The–PRED–– prefixed variables are temporary program variables used so that thelags of the residuals are the correct residuals and not the ones redefined by this equa-tion. Note that this is equivalent to the statements explicitly written in the "GeneralForm for ARMA Models" earlier in this section.

You can also restrict the autoregressive parameters to zero at selected lags. For ex-ample, if you wanted autoregressive parameters at lags 1, 12, and 13, you can use thefollowing statements:

proc model data=in;parms a b c;y = a + b * x1 + c * x2;%ar( y, 13, , 1 12 13 )fit y / list;

run;

These statements generate the output shown in Figure 14.50.

The MODEL Procedure


1 8182:50 PRED.y = a + b * x1 + c * x2;1 8182:50 RESID.y = PRED.y - ACTUAL.y;1 8182:50 ERROR.y = PRED.y - y;2 8631:23 _PRED__y = PRED.y;3 8647:15 #OLD_PRED.y = PRED.y + y_l1 * ZLAG1( y -

_PRED__y ) + y_l12 * ZLAG12( y -_PRED__y ) + y_l13 * ZLAG13(y - _PRED__y );


Figure 14.50. LIST Option Output for an AR Model with Lags at 1, 12, and 13

There are variations on the conditional least-squares method, depending on whetherobservations at the start of the series are used to "warm up" the AR process. Bydefault, the %AR conditional least-squares method uses all the observations and as-sumes zeros for the initial lags of autoregressive terms. By using the M= option,you can request that %AR use the unconditional least-squares (ULS) or maximum-likelihood (ML) method instead. For example,

proc model data=in;y = a + b * x1 + c * x2;%ar( y, 2, m=uls )fit y;

run;

Discussions of these methods is provided in the "AR Initial Conditions" earlier in thissection.



By using the M=CLSn option, you can request that the firstn observations be used tocompute estimates of the initial autoregressive lags. In this case, the analysis startswith observationn+1. For example:

proc model data=in;y = a + b * x1 + c * x2;%ar( y, 2, m=cls2 )fit y;

run;

You can use the %AR macro to apply an autoregressive model to the endogenousvariable, instead of to the error term, by using the TYPE=V option. For example, ifyou want to add the five past lags of Y to the equation in the previous example, youcould use %AR to generate the parameters and lags using the following statements:

proc model data=in;parms a b c;y = a + b * x1 + c * x2;%ar( y, 5, type=v )fit y / list;

run;

The preceding statements generate the output shown in Figure 14.51.

The MODEL Procedure


1 8892:50 PRED.y = a + b * x1 + c * x2;1 8892:50 RESID.y = PRED.y - ACTUAL.y;1 8892:50 ERROR.y = PRED.y - y;2 9301:15 #OLD_PRED.y = PRED.y + y_l1 * ZLAG1( y )

+ y_l2 * ZLAG2( y ) + y_l3 * ZLAG3( y )+ y_l4 * ZLAG4( y ) + y_l5 * ZLAG5( y );


Figure 14.51. LIST Option Output for an AR model of Y

This model predicts Y as a linear combination of X1, X2, an intercept, and the valuesof Y in the most recent five periods.

Unrestricted Vector AutoregressionTo model the error terms of a set of equations as a vector autoregressive process, usethe following form of the %AR macro after the equations:

%ar( process_name, nlags, variable_list )

The process–namevalue is any name that you supply for %AR to use in makingnames for the autoregressive parameters. You can use the %AR macro to model



several different AR processes for different sets of equations by using different pro-cess names for each set. The process name ensures that the variable names used areunique. Use a shortprocess–namevalue for the process if parameter estimates are tobe written to an output data set. The %AR macro tries to construct parameter namesless than or equal to eight characters, but this is limited by the length ofname, whichis used as a prefix for the AR parameter names.

Thevariable–list value is the list of endogenous variables for the equations.

For example, suppose that errors for equations Y1, Y2, and Y3 are generated by asecond-order vector autoregressive process. You can use the following statements:

proc model data=in;y1 = ... equation for y1 ...;y2 = ... equation for y2 ...;y3 = ... equation for y3 ...;%ar( name, 2, y1 y2 y3 )fit y1 y2 y3;

run;

which generates the following for Y1 and similar code for Y2 and Y3:

y1 = pred.y1 + name1_1_1*zlag1(y1-name_y1) +name1_1_2*zlag1(y2-name_y2) +name1_1_3*zlag1(y3-name_y3) +name2_1_1*zlag2(y1-name_y1) +name2_1_2*zlag2(y2-name_y2) +name2_1_3*zlag2(y3-name_y3) ;

Only the conditional least-squares (M=CLS or M=CLSn) method can be used forvector processes.

You can also use the same form with restrictions that the coefficient matrix be 0 atselected lags. For example, the statements

proc model data=in;y1 = ... equation for y1 ...;y2 = ... equation for y2 ...;y3 = ... equation for y3 ...;%ar( name, 3, y1 y2 y3, 1 3 )fit y1 y2 y3;

apply a third-order vector process to the equation errors with all the coefficients atlag 2 restricted to 0 and with the coefficients at lags 1 and 3 unrestricted.

You can model the three series Y1-Y3 as a vector autoregressive process in the vari-ables instead of in the errors by using the TYPE=V option. If you want to model Y1-Y3 as a function of past values of Y1-Y3 and some exogenous variables or constants,you can use %AR to generate the statements for the lag terms. Write an equation foreach variable for the nonautoregressive part of the model, and then call %AR withthe TYPE=V option. For example,



proc model data=in;parms a1-a3 b1-b3;y1 = a1 + b1 * x;y2 = a2 + b2 * x;y3 = a3 + b3 * x;%ar( name, 2, y1 y2 y3, type=v )fit y1 y2 y3;

run;

The nonautoregressive part of the model can be a function of exogenous variables, orit may be intercept parameters. If there are no exogenous components to the vectorautoregression model, including no intercepts, then assign zero to each of the vari-ables. There must be an assignment to each of the variables before %AR is called.

proc model data=in;y1=0;y2=0;y3=0;%ar( name, 2, y1 y2 y3, type=v )fit y1 y2 y3;

This example models the vector Y=(Y1 Y2 Y3)0 as a linear function only of its valuein the previous two periods and a white noise error vector. The model has 18=(3� 3+ 3� 3) parameters.

Syntax of the %AR MacroThere are two cases of the syntax of the %AR macro. The first has the general form

%AR (name, nlag [,endolist[,laglist]] [,M=method] [,TYPE=V])where

name specifies a prefix for %AR to use in constructing names of variablesneeded to define the AR process. If theendolistis not specified, theendogenous list defaults toname, which must be the name of theequation to which the AR error process is to be applied. Thenamevalue cannot exceed eight characters.

nlag is the order of the AR process.

endolist specifies the list of equations to which the AR process is to beapplied. If more than one name is given, an unrestricted vectorprocess is created with the structural residuals of all the equationsincluded as regressors in each of the equations. If not specified,endolistdefaults toname.

laglist specifies the list of lags at which the AR terms are to be added. Thecoefficients of the terms at lags not listed are set to 0. All of thelisted lags must be less than or equal tonlag, and there must be noduplicates. If not specified, thelaglist defaults to all lags 1 throughnlag.



M=method specifies the estimation method to implement. Valid values of M=are CLS (conditional least-squares estimates), ULS (unconditionalleast-squares estimates), and ML (maximum-likelihood estimates).M=CLS is the default. Only M=CLS is allowed when more thanone equation is specified. The ULS and ML methods are not sup-ported for vector AR models by %AR.

TYPE=V specifies that the AR process is to be applied to the endogenousvariables themselves instead of to the structural residuals of theequations.

Restricted Vector AutoregressionYou can control which parameters are included in the process, restricting those pa-rameters that you do not include to 0. First, use %AR with the DEFER option todeclare the variable list and define the dimension of the process. Then, use additional%AR calls to generate terms for selected equations with selected variables at selectedlags. For example,

proc model data=d;y1 = ... equation for y1 ...;y2 = ... equation for y2 ...;y3 = ... equation for y3 ...;%ar( name, 2, y1 y2 y3, defer )%ar( name, y1, y1 y2 )%ar( name, y2 y3, , 1 )fit y1 y2 y3;

run;

The error equations produced are

y1 = pred.y1 + name1_1_1*zlag1(y1-name_y1) +name1_1_2*zlag1(y2-name_y2) + name2_1_1*zlag2(y1-name_y1) +name2_1_2*zlag2(y2-name_y2) ;

y2 = pred.y2 + name1_2_1*zlag1(y1-name_y1) +name1_2_2*zlag1(y2-name_y2) + name1_2_3*zlag1(y3-name_y3) ;

y3 = pred.y3 + name1_3_1*zlag1(y1-name_y1) +name1_3_2*zlag1(y2-name_y2) + name1_3_3*zlag1(y3-name_y3) ;

This model states that the errors for Y1 depend on the errors of both Y1 and Y2 (butnot Y3) at both lags 1 and 2, and that the errors for Y2 and Y3 depend on the previouserrors for all three variables, but only at lag 1.

%AR Macro Syntax for Restricted Vector ARAn alternative use of %AR is allowed to impose restrictions on a vector AR processby calling %AR several times to specify different AR terms and lags for differentequations.

The first call has the general form

%AR( name, nlag, endolist, DEFER )



where

name specifies a prefix for %AR to use in constructing names of variablesneeded to define the vector AR process.

nlag specifies the order of the AR process.

endolist specifies the list of equations to which the AR process is to beapplied.

DEFER specifies that %AR is not to generate the AR process but is to waitfor further information specified in later %AR calls for the samenamevalue.

The subsequent calls have the general form

%AR( name, eqlist, varlist, laglist,TYPE= )

where

name is the same as in the first call.

eqlist specifies the list of equations to which the specifications in this%AR call are to be applied. Only names specified in theendolistvalue of the first call for thenamevalue can appear in the list ofequations ineqlist.

varlist specifies the list of equations whose lagged structural residuals areto be included as regressors in the equations ineqlist. Only namesin the endolistof the first call for thenamevalue can appear invarlist. If not specified,varlist defaults toendolist.

laglist specifies the list of lags at which the AR terms are to be added. Thecoefficients of the terms at lags not listed are set to 0. All of thelisted lags must be less than or equal to the value ofnlag, and theremust be no duplicates. If not specified,laglist defaults to all lags 1throughnlag.

The %MA MacroThe SAS macro %MA generates programming statements for PROC MODEL formoving average models. The %MA macro is part of SAS/ETS software and no spe-cial options are needed to use the macro. The moving average error process can beapplied to the structural equation errors. The syntax of the %MA macro is the sameas the %AR macro except there is no TYPE= argument.

When you are using the %MA and %AR macros combined, the %MA macro mustfollow the %AR macro. The following SAS/IML statements produce an ARMA(1,(1 3)) error process and save it in the data set MADAT2.

/* use IML module to simulate a MA process */proc iml;

phi={1 .2};



theta={ 1 .3 0 .5};y=armasim(phi, theta, 0,.1, 200,32565);create madat2 from y[colname=’y’];append;

quit;

The following PROC MODEL statements are used to estimate the parameters of thismodel using maximum likelihood error structure:

title1 ’Maximum Likelihood ARMA(1, (1 3))’;proc model data=madat2;

y=0;%ar(y,1,, M=ml)%ma(y,3,,1 3, M=ml) /* %MA always after %AR */fit y;

run;

The estimates of the parameters produced by this run are shown in Figure 14.52.

Maximum Likelihood ARMA(1, (1 3))

The MODEL Procedure



y 3 197 2.6383 0.0134 0.1157 -0.0067 -0.0169RESID.y 197 1.9957 0.0101 0.1007



y_l1 -0.10067 0.1187 -0.85 0.3973 AR(y) y lag1parameter

y_m1 -0.1934 0.0939 -2.06 0.0408 MA(y) y lag1parameter

y_m3 -0.59384 0.0601 -9.88 <.0001 MA(y) y lag3parameter

Figure 14.52. Estimates from an ARMA(1, (1 3)) Process

Syntax of the %MA MacroThere are two cases of the syntax for the %MA macro. The first has the general form

%MA ( name, nlag [,endolist[,laglist]] [,M=method] )

where

name specifies a prefix for %MA to use in constructing names of vari-ables needed to define the MA process and is the defaultendolist.

nlag is the order of the MA process.



endolist specifies the equations to which the MA process is to be applied. Ifmore than one name is given, CLS estimation is used for the vectorprocess.

laglist specifies the lags at which the MA terms are to be added. All of thelisted lags must be less than or equal tonlag, and there must be noduplicates. If not specified, thelaglist defaults to all lags 1 throughnlag.

M=method specifies the estimation method to implement. Valid values of M=are CLS (conditional least-squares estimates), ULS (unconditionalleast-squares estimates), and ML (maximum-likelihood estimates).M=CLS is the default. Only M=CLS is allowed when more thanone equation is specified on theendolist.

%MA Macro Syntax for Restricted Vector Moving AverageAn alternative use of %MA is allowed to impose restrictions on a vector MA processby calling %MA several times to specify different MA terms and lags for differentequations.

The first call has the general form

%MA( name, nlag, endolist, DEFER )

where

name specifies a prefix for %MA to use in constructing names of vari-ables needed to define the vector MA process.

nlag specifies the order of the MA process.

endolist specifies the list of equations to which the MA process is to beapplied.

DEFER specifies that %MA is not to generate the MA process but is to waitfor further information specified in later %MA calls for the samenamevalue.

The subsequent calls have the general form

%MA( name, eqlist, varlist, laglist )

where

name is the same as in the first call.

eqlist specifies the list of equations to which the specifications in this%MA call are to be applied.

varlist specifies the list of equations whose lagged structural residuals areto be included as regressors in the equations ineqlist.

laglist specifies the list of lags at which the MA terms are to be added.



Distributed Lag Models and the %PDL Macro

In the following example, the variabley is modeled as a linear function ofx, the firstlag ofx, the second lag ofx, and so forth:

yt = a+ b0xt + b1xt�1 + b2xt�2 + b3xt�3 + : : :+ bnxt�l

Models of this sort can introduce a great many parameters for the lags, and there maynot be enough data to compute accurate independent estimates for them all. Often, thenumber of parameters is reduced by assuming that the lag coefficients follow somepattern. One common assumption is that the lag coefficients follow a polynomial inthe lag length

bi =dX

j=0

�j(i)j

whered is the degree of the polynomial used. Models of this kind are calledAlmonlag models, polynomial distributed lag models, or PDLsfor short. For example, Fig-ure 14.53 shows the lag distribution that can be modeled with a low order polynomial.Endpoint restrictions can be imposed on a PDL to require that the lag coefficients be0 at the 0th lag, or at the final lag, or at both.

Weight

1 2 3 4 5 6 7

Lag

Lag

Figure 14.53. Polynomial Distributed Lags

For linear single-equation models, SAS/ETS software includes the PDLREG proce-dure for estimating PDL models. See Chapter 15, “The PDLREG Procedure,” for amore detailed discussion of polynomial distributed lags and an explanation of end-point restrictions.



Polynomial and other distributed lag models can be estimated and simulated or fore-cast with PROC MODEL. For polynomial distributed lags, the %PDL macro cangenerate the needed programming statements automatically.

The %PDL MacroThe SAS macro %PDL generates the programming statements to compute the lagcoefficients of polynomial distributed lag models and to apply them to the lags ofvariables or expressions.

To use the %PDL macro in a model program, you first call it to declare the lag distri-bution; later, you call it again to apply the PDL to a variable or expression. The firstcall generates a PARMS statement for the polynomial parameters and assignmentstatements to compute the lag coefficients. The second call generates an expressionthat applies the lag coefficients to the lags of the specified variable or expression. APDL can be declared only once, but it can be used any number of times (that is, thesecond call can be repeated).

The initial declaratory call has the general form

%PDL( pdlname, nlags, degree, R=code, OUTEST=dataset)

wherepdlnameis a name (up to eight characters) that you give to identify the PDL,nlagsis the lag length, anddegreeis the degree of the polynomial for the distribution.The R=codeis optional for endpoint restrictions. The value ofcodecan be FIRST(for upper), LAST (for lower), or BOTH (for both upper and lower endpoints). Seechapter pdlreg, "The PDLREG Procedure," for a discussion of endpoint restrictions.The option OUTEST=datasetcreates a data set containing the estimates of the pa-rameters and their covariance matrix.

The later calls to apply the PDL have the general form

%PDL( pdlname, expression )

wherepdlnameis the name of the PDL andexpressionis the variable or expressionto which the PDL is to be applied. Thepdlnamegiven must be the same as the nameused to declare the PDL.

The following statements produce the output in Figure 14.54:

proc model data=in list;parms int pz;%pdl(xpdl,5,2);y = int + pz * z + %pdl(xpdl,x);%ar(y,2,M=ULS);id i;

fit y / out=model1 outresid converge=1e-6;run;



The MODEL Procedure

Nonlinear OLS Estimates

Approx ApproxTerm Estimate Std Err t Value Pr > |t| Label

XPDL_L0 1.568788 0.0992 15.81 <.0001 PDL(XPDL,5,2)coefficient for lag0

XPDL_L1 0.564917 0.0348 16.24 <.0001 PDL(XPDL,5,2)coefficient for lag1

XPDL_L2 -0.05063 0.0629 -0.80 0.4442 PDL(XPDL,5,2)coefficient for lag2



XPDL_L5 0.43267 0.1445 2.99 0.0172 PDL(XPDL,5,2)coefficient for lag5

Figure 14.54. %PDL Macro ESTIMATE Statement Output

This second example models two variables, Y1 and Y2, and uses two PDLs:

proc model data=in;parms int1 int2;%pdl( logxpdl, 5, 3 )%pdl( zpdl, 6, 4 )y1 = int1 + %pdl( logxpdl, log(x) ) + %pdl( zpdl, z );y2 = int2 + %pdl( zpdl, z );fit y1 y2;

run;

A (5,3) PDL of the log of X is used in the equation for Y1. A (6,4) PDL of Zis used in the equations for both Y1 and Y2. Since the same ZPDL is used in bothequations, the lag coefficients for Z are the same for the Y1 and Y2 equations, and thepolynomial parameters for ZPDL are shared by the two equations. See Example 14.5for a complete example and comparison with PDLREG.

Input Data Sets

DATA= Input Data SetFor FIT tasks, the DATA= option specifies which input data set to use in estimatingparameters. Variables in the model program are looked up in the DATA= data setand, if found, their attributes (type, length, label, and format) are set to be the sameas those in the DATA= data set (if not defined otherwise within PROC MODEL), andvalues for the variables in the program are read from the data set.

ESTDATA= Input Data SetThe ESTDATA= option specifies an input data set that contains an observation giv-ing values for some or all of the model parameters. The data set can also containobservations giving the rows of a covariance matrix for the parameters.

Parameter values read from the ESTDATA= data set provide initial starting values forparameters estimated. Observations providing covariance values, if any are presentin the ESTDATA= data set, are ignored.



The ESTDATA= data set is usually created by the OUTEST= option in a previousFIT statement. You can also create an ESTDATA= data set with a SAS DATA stepprogram. The data set must contain a numeric variable for each parameter to be givena value or covariance column. The name of the variable in the ESTDATA= data setmust match the name of the parameter in the model. Parameters with names longerthan eight characters cannot be set from an ESTDATA= data set. The data set mustalso contain a character variable–NAME– of length 8.–NAME– has a blank valuefor the observation that gives values to the parameters.–NAME– contains the nameof a parameter for observations defining rows of the covariance matrix.

More than one set of parameter estimates and covariances can be stored in the EST-DATA= data set if the observations for the different estimates are identified by thevariable–TYPE–. –TYPE– must be a character variable of length 8. The TYPE=option is used to select for input the part of the ESTDATA= data set for which the

–TYPE– value matches the value of the TYPE= option.

The following SAS statements generate the ESTDATA= data set shown in Figure14.55. The second FIT statement uses the TYPE= option to select the estimates fromthe GMM estimation as starting values for the FIML estimation.

/* Generate test data */data gmm2;

do t=1 to 50;x1 = sqrt(t) ;x2 = rannor(10) * 10;y1 = -.002 * x2 * x2 - .05 / x2 - 0.001 * x1 * x1;y2 = 0.002* y1 + 2 * x2 * x2 + 50 / x2 + 5 * rannor(1);y1 = y1 + 5 * rannor(1);z1 = 1; z2 = x1 * x1; z3 = x2 * x2; z4 = 1.0/x2;output;

end;run;

proc model data=gmm2 ;exogenous x1 x2;parms a1 a2 b1 2.5 b2 c2 55 d1;inst b1 b2 c2 x1 x2;y1 = a1 * y2 + b1 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;

fit y1 y2 / 3sls gmm kernel=(qs,1,0.2) outest=gmmest;

fit y1 y2 / fiml type=gmm estdata=gmmest;run;

proc print data=gmmest;run;



_S _

_ _ T NN T A UA Y T S

O M P U Eb E E S D a a b b c ds _ _ _ _ 1 2 1 2 2 1

1 3SLS 0 Converged 50 -.002229607 -1.25002 0.025827 1.99609 49.8119 -0.445332 GMM 0 Converged 50 -.002013073 -1.53882 0.014908 1.99419 49.8035 -0.64933

Figure 14.55. ESTDATA= Data Set

MISSING= PAIRWISE | DELETEWhen missing values are encountered for any one of the equations in a system ofequations, the default action is to drop that observation for all of the equations. Thenew MISSING=PAIRWISE option on the FIT statement provides a different methodof handling missing values that avoids losing data for nonmissing equations for theobservation. This is especially useful for SUR estimation on equations with unequalnumbers of observations.

The option MISSING=PAIRWISE specifies that missing values are tracked on anequation-by-equation basis. The MISSING=DELETE option specifies that the entireobservation is omitted from the analysis when any equation has a missing predictedor actual value for the equation. The default is MISSING=DELETE.

When you specify the MISSING=PAIRWISE option, the S matrix is computed as

S = D(R0R)D

where D is a diagonal matrix that depends on the VARDEF= option, the matrixR is(r1; : : : ; rg), andri is the vector of residuals for theith equation withrij replacedwith zero whenrij is missing.

For MISSING=PAIRWISE, the calculation of the diagonal elementdi;i of D is basedonni, the number of nonmissing observations for theith equation, instead of onn or,for VARDEF=WGT or WDF, on the sum of the weights for the nonmissing observa-tions for theith equation instead of on the sum of the weights for all observations.Refer to the description of the VARDEF= option for the definition ofD.

The degrees of freedom correction for a shared parameter is computed using theaverage number of observations used in its estimation.

The MISSING=PAIRWISE option is not valid for the GMM and FIML estimationmethods.

For the instrumental variables estimation methods (2SLS, 3SLS), when an instrumentis missing for an observation, that observation is dropped for all equations, regardlessof the MISSING= option.



PARMSDATA= Input Data SetThe option PARMSDATA= reads values for all parameters whose names match thenames of variables in the PARMSDATA= data set. Values for any or all of the param-eters in the model can be reset using the PARMSDATA= option. The PARMSDATA=option goes on the PROC MODEL statement, and the data set is read before any FITor SOLVE statements are executed.

Together, the OUTPARMS= and PARMSDATA= options allow you to change partof a model and recompile the new model program without the need to reestimateequations that were not changed.

Suppose you have a large model with parameters estimated and you now want toreplace one equation, Y, with a new specification. Although the model program mustbe recompiled with the new equation, you don’t need to reestimate all the equations,just the one that changed.

Using the OUTPARMS= and PARMSDATA= options, you could do the following:

proc model model=oldmod outparms=temp; run;proc model outmodel=newmod parmsdata=temp data=in;

... include new model definition with changed y eq. here ...fit y;

run;

The model file NEWMOD will then contain the new model and its estimated param-eters plus the old models with their original parameter values.

SDATA= Input Data SetThe SDATA= option allows a cross-equation covariance matrix to be input from adata set. TheS matrix read from the SDATA= data set, specified in the FIT state-ment, is used to define the objective function for the OLS, N2SLS, SUR, and N3SLSestimation methods and is used as the initialS for the methods that iterate theS ma-trix.

Most often, the SDATA= data set has been created by the OUTS= or OUTSUSED=option on a previous FIT statement. The OUTS= and OUTSUSED= data sets froma FIT statement can be read back in by a FIT statement in the same PROC MODELstep.

You can create an input SDATA= data set using the DATA step. PROC MODELexpects to find a character variable–NAME– in the SDATA= data set as well asvariables for the equations in the estimation or solution. For each observation witha –NAME– value matching the name of an equation, PROC MODEL fills the cor-responding row of theS matrix with the values of the names of equations found inthe data set. If a row or column is omitted from the data set, a 1 is placed on thediagonal for the row or column. Missing values are ignored, and since theS matrixis symmetric, you can include only a triangular part of theS matrix in the SDATA=data set with the omitted part indicated by missing values. If the SDATA= data setcontains multiple observations with the same–NAME–, the last values supplied forthe–NAME– are used. The structure of the expected data set is further described inthe "OUTS=Data Set" section.



Use the TYPE= option on the PROC MODEL or FIT statement to specify the type ofestimation method used to produce theSmatrix you want to input.

The following SAS statements are used to generate anS matrix from a GMM and a3SLS estimation and to store that estimate in the data set GMMS:

proc model data=gmm2 ;exogenous x1 x2;parms a1 a2 b1 2.5 b2 c2 55 d1;inst b1 b2 c2 x1 x2;y1 = a1 * y2 + b1 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;

fit y1 y2 / 3sls gmm kernel=(qs,1,0.2) outest=gmmest outs=gmms;run;

The data set GMMS is shown in Figure 14.56.

Obs _NAME_ _TYPE_ _NUSED_ y1 y2

1 y1 3SLS 50 27.1032 38.15992 y2 3SLS 50 38.1599 74.62533 y1 GMM 50 27.4205 46.40284 y2 GMM 50 46.4028 99.4656

Figure 14.56. SDATA= Data Set

VDATA= Input data setThe VDATA= option allows a variance matrix for GMM estimation to be input from adata set. When the VDATA= option is used on the PROC MODEL or FIT statement,the matrix that is input is used to define the objective function and is used as the initialV for the methods that iterate the V matrix.

Normally the VDATA= matrix is created from the OUTV= option on a previous FITstatement. Alternately an input VDATA= data set can be created using the DATAstep. Each row and column of the V matrix is associated with an equation and aninstrument. The position of each element in the V matrix can then be indicated byan equation name and an instrument name for the row of the element and an equa-tion name and an instrument name for the column. Each observation in the VDATA=data set is an element in the V matrix. The row and column of the element are in-dicated by four variables EQ–ROW, INST–ROW, EQ–COL, and INST–COL whichcontain the equation name or instrument name. The variable name for an element isVALUE. Missing values are set to 0. Because the variance matrix is symmetric, onlya triangular part of the matrix needs to be input.

The following SAS statements are used to generate aV matrix estimation from GMMand to store that estimate in the data set GMMV:

proc model data=gmm2 ;exogenous x1 x2;parms a1 a2 b2 b1 2.5 c2 55 d1;inst b1 b2 c2 x1 x2;y1 = a1 * y2 + b1 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;



fit y1 y2 / gmm outv=gmmv;run;

The data set GMM2 was generated by the example in the preceding ESTDATA=section. TheV matrix stored in GMMV is selected for use in an additional GMMestimation by the following FIT statement:

fit y1 y2 / gmm vdata=gmmv;run;

proc print data=gmmv(obs=15);run;

A partial listing of the GMMV data set is shown in Figure 14.57. There are a total of78 observations in this data set. TheV matrix is 12 by 12 for this example.

Obs _TYPE_ EQ_ROW EQ_COL INST_ROW INST_COL VALUE

1 GMM Y1 Y1 1 1 1509.592 GMM Y1 Y1 X1 1 8257.413 GMM Y1 Y1 X1 X1 47956.084 GMM Y1 Y1 X2 1 7136.275 GMM Y1 Y1 X2 X1 44494.706 GMM Y1 Y1 X2 X2 153135.597 GMM Y1 Y1 @PRED.Y1/@B1 1 47957.108 GMM Y1 Y1 @PRED.Y1/@B1 X1 289178.689 GMM Y1 Y1 @PRED.Y1/@B1 X2 275074.36

10 GMM Y1 Y1 @PRED.Y1/@B1 @PRED.Y1/@B1 1789176.5611 GMM Y1 Y1 @PRED.Y2/@B2 1 152885.9112 GMM Y1 Y1 @PRED.Y2/@B2 X1 816886.4913 GMM Y1 Y1 @PRED.Y2/@B2 X2 1121114.9614 GMM Y1 Y1 @PRED.Y2/@B2 @PRED.Y1/@B1 4576643.5715 GMM Y1 Y1 @PRED.Y2/@B2 @PRED.Y2/@B2 28818318.24

Figure 14.57. The First 15 Observations in the VDATA= Data Set

Output Data Sets

OUT= Data SetFor normalized form equations, the OUT= data set specified on the FIT statementcontains residuals, actuals, and predicted values of the dependent variables computedfrom the parameter estimates. For general form equations, actual values of the en-dogenous variables are copied for the residual and predicted values.

The variables in the data set are as follows:

� BY variables

� RANGE variable

� ID variables

� –ESTYPE–, a character variable of length 8 identifying the estimation method:OLS, SUR, N2SLS, N3SLS, ITOLS, ITSUR, IT2SLS, IT3SLS, GMM, IT-GMM, or FIML



� –TYPE–, a character variable of length 8 identifying the type of observation:RESIDUAL, PREDICT, or ACTUAL

� –WEIGHT–, the weight of the observation in the estimation. The–WEIGHT–value is 0 if the observation was not used. It is equal to the product of the

–WEIGHT– model program variable and the variable named in the WEIGHTstatement, if any, or 1 if weights were not used.

� the WEIGHT statement variable if used

� the model variables. The dependent variables for the normalized-form equa-tions in the estimation contain residuals, actuals, or predicted values, dependingon the–TYPE– variable, whereas the model variables that are not associatedwith estimated equations always contain actual values from the input data set.

� any other variables named in the OUTVARS statement. These can be programvariables computed by the model program, CONTROL variables, parameters,or special variables in the model program.

The following SAS statements are used to generate and print an OUT= data set:

proc model data=gmm2;exogenous x1 x2;parms a1 a2 b2 b1 2.5 c2 55 d1;inst b1 b2 c2 x1 x2;y1 = a1 * y2 + b1 * x1 * x1 + d1;y2 = a2 * y1 + b2 * x2 * x2 + c2 / x2 + d1;

fit y1 y2 / 3sls gmm out=resid outall ;run;

proc print data=resid(obs=20);run;

The data set GMM2 was generated by the example in the preceding ESTDATA=section above. A partial listing of the RESID data set is shown in Figure 14.58.



Obs _ESTYPE_ _TYPE_ _WEIGHT_ x1 x2 y1 y2

1 3SLS ACTUAL 1 1.00000 -1.7339 -3.05812 -23.0712 3SLS PREDICT 1 1.00000 -1.7339 -0.36806 -19.3513 3SLS RESIDUAL 1 1.00000 -1.7339 -2.69006 -3.7204 3SLS ACTUAL 1 1.41421 -5.3046 0.59405 43.8665 3SLS PREDICT 1 1.41421 -5.3046 -0.49148 45.5886 3SLS RESIDUAL 1 1.41421 -5.3046 1.08553 -1.7227 3SLS ACTUAL 1 1.73205 -5.2826 3.17651 51.5638 3SLS PREDICT 1 1.73205 -5.2826 -0.48281 41.8579 3SLS RESIDUAL 1 1.73205 -5.2826 3.65933 9.707

10 3SLS ACTUAL 1 2.00000 -0.6878 3.66208 -70.01111 3SLS PREDICT 1 2.00000 -0.6878 -0.18592 -76.50212 3SLS RESIDUAL 1 2.00000 -0.6878 3.84800 6.49113 3SLS ACTUAL 1 2.23607 -7.0797 0.29210 99.17714 3SLS PREDICT 1 2.23607 -7.0797 -0.53732 92.20115 3SLS RESIDUAL 1 2.23607 -7.0797 0.82942 6.97616 3SLS ACTUAL 1 2.44949 14.5284 1.86898 423.63417 3SLS PREDICT 1 2.44949 14.5284 -1.23490 421.96918 3SLS RESIDUAL 1 2.44949 14.5284 3.10388 1.66519 3SLS ACTUAL 1 2.64575 -0.6968 -1.03003 -72.21420 3SLS PREDICT 1 2.64575 -0.6968 -0.10353 -69.680

Figure 14.58. The OUT= Data Set

OUTEST= Data SetThe OUTEST= data set contains parameter estimates and, if requested, estimates ofthe covariance of the parameter estimates.

The variables in the data set are as follows:

� BY variables

� –NAME–, a character variable of length 8, blank for observations containingparameter estimates or a parameter name for observations containing covari-ances

� –TYPE–, a character variable of length 8 identifying the estimation method:OLS, SUR, N2SLS, N3SLS, ITOLS, ITSUR, IT2SLS, IT3SLS, GMM, IT-GMM, or FIML

� the parameters estimated.

If the COVOUT option is specified, an additional observation is written for each rowof the estimate of the covariance matrix of parameter estimates, with the–NAME–values containing the parameter names for the rows. Parameter names longer thaneight characters are truncated.

OUTPARMS= Data SetThe option OUTPARMS= writes all the parameter estimates to an output data set.This output data set contains one observation and is similar to the OUTEST= data set,but it contains all the parameters, is not associated with any FIT task, and contains nocovariances. The OUTPARMS= option is used on the PROC MODEL statement, andthe data set is written at the end, after any FIT or SOLVE steps have been performed.



OUTS= Data SetThe OUTS= SAS data set contains the estimate of the covariance matrix of the resid-uals across equations. This matrix is formed from the residuals that are computedusing the parameter estimates.

The variables in the OUTS= data set are as follows:

� BY variables

� –NAME–, a character variable containing the name of the equation

� –TYPE–, a character variable of length 8 identifying the estimation method:OLS, SUR, N2SLS, N3SLS, ITOLS, ITSUR, IT2SLS, IT3SLS, GMM, IT-GMM, or FIML

� variables with the names of the equations in the estimation.

Each observation contains a row of the covariance matrix. The data set is suitable foruse with the SDATA= option on a subsequent FIT or SOLVE statement. (See "Testson Parameters" in this chapter for an example of the SDATA= option.)

OUTSUSED= Data SetThe OUTSUSED= SAS data set contains the covariance matrix of the residuals acrossequations that is used to define the objective function. The form of the OUTSUSED=data set is the same as that for the OUTS= data set.

Note that OUTSUSED= is the same as OUTS= for the estimation methods that it-erate theS matrix (ITOLS, IT2SLS, ITSUR, and IT3SLS). If the SDATA= option isspecified in the FIT statement, OUTSUSED= is the same as the SDATA= matrix readin for the methods that do not iterate theSmatrix (OLS, SUR, N2SLS, and N3SLS).

OUTV= Data SetThe OUTV= data set contains the estimate of the variance matrix, V. This matrixis formed from the instruments and the residuals that are computed using the pa-rameter estimates obtained from the initial 2SLS estimation when GMM estimationis selected. If an estimation method other than GMM or ITGMM is requested andOUTV= is specified, a V matrix is created using computed estimates. In the casethat a VDATA= data set is used, this becomes the OUTV= data set. For ITGMM, theOUTV= data set is the matrix formed from the instruments and the residuals com-puted using the final parameter estimates.



ODS Table Names

PROC MODEL assigns a name to each table it creates. You can use these names toreference the table when using the Output Delivery System (ODS) to select tablesand create output data sets. These names are listed in the following table. For moreinformation on ODS, see Chapter 6, “Using the Output Delivery System.”

Table 14.3. ODS Tables Produced in PROC MODEL

ODS Table Name Description Option

ODS Tables Created by the FIT Statement

AugGMMCovariance Cross products matrix GMMChowTest Structural change test CHOW=CollinDiagnostics Collinearity DiagnosticsConfInterval Profile likelihood Confidence Intervals PRL=ConvCrit Convergence criteria for estimation defaultConvergenceStatus Convergence status defaultCorrB Correlations of parameters COVB/CORRBCorrResiduals Correlations of residuals CORRS/COVSCovB Covariance of parameters COVB/CORRBCovResiduals Covariance of residuals CORRS/COVSCrossproducts Cross products matrix ITALL/ITPRINTDatasetOptions Data sets used defaultDetResidCov Determinant of the Residuals DETAILSDWTest Durbin Watson Test DW=Equations Listing of equations to estimate defaultEstSummaryMiss Model Summary Statistics for PAIRWISE MISSING=EstSummaryStats Objective, Objective * N defaultGMMCovariance Cross products matrix GMMGodfrey Godfrey’s Serial Correlation Test GF=HausmanTest Hausman’s test table HAUSMANHeteroTest Heteroscedasticity test tables BREUSCH/PAGENInvXPXMat X’X inverse for System IIterInfo Iteration printing ITALL/ITPRINTLagLength Model lag length defaultMinSummary Number of parameters, estimation kind defaultMissingValues Missing values generated by the program defaultModSummary Listing of all categorized variables defaultModVars Listing of Model variables and parameters defaultNormalityTest Normality test table NORMALObsSummary Identifies observations with errors defaultObsUsed Observations read, used, and missing. defaultParameterEstimates Parameter Estimates defaultParmChange Parameter Change VectorResidSummary Summary of the SSE, MSE for the equations defaultSizeInfo Storage Requirement for estimation DETAILS



Table 14.3. (continued)

ODS Table Name Description OptionTermEstimates Nonlinear OLS and ITOLS Estimates OLS/ITOLSTestResults Test statement tableWgtVar The name of the weight variableXPXMat X’X for System XPX

ODS Tables Created by the SOLVE Statement

DatasetOptions Data sets used defaultDescriptiveStatistics Descriptive Statistics STATSFitStatistics Fit statistics for simulation STATSLagLength Model lag length defaultModSummary Listing of all categorized variables defaultObsSummary Simulation trace output SOLVEPRINTObsUsed Observations read, used, and missing. defaultSimulationSummary Number of variables solved for defaultSolutionVarList Solution Variable Lists defaultTheilRelStats Theil Relative Change Error Statistics THEILTheilStats Theil Forecast Error Statistics THEIL

ODS Tables Created by the FIT and SOLVE Statements

AdjacencyMatrix Adjacency Graph GRAPHBlockAnalysis Block analysis BLOCKBlockStructure Block structure BLOCKCodeDependency Variable cross reference LISTDEPCodeList Listing of programs statements LISTCODECrossReference Cross Reference Listing For ProgramDepStructure Dependency Structure of the System BLOCKDerList Derivative variables LISTDERFirstDerivatives First derivative table LISTDERInterIntg Integration Iteration Output INTGPRINTMemUsage Memory usage statistics MEMORYUSEParmReadIn Parameter estimates read in ESTDATA=ProgList Listing of Compiled Program CodeRangeInfo RANGE statement specificationSortAdjacencyMatrix Sorted adjacency Graph GRAPHTransitiveClosure Transitive closure Graph GRAPH


Chapter 14. Simulation Details

Simulation Details

Thesolutiongiven the vectork, of the following nonlinear system of equations is thevectoru which satisfies this equation:

q(u;k; �) = 0

A simulationis a set of solutionsut for a specific sequence of vectorskt.

Model simulation can be performed to

� check how well the model predicts the actual values over the historical period

� investigate the sensitivity of the solution to changes in the input values or pa-rameters

� examine the dynamic characteristics of the model

� check the stability of the simultaneous solution

� estimate the statistical distribution of the predicted values of the nonlinearmodel using Monte Carlo methods

By combining the various solution modes with different input data sets, model sim-ulation can answer many different questions about the model. This section presentsdetails of model simulation and solution.

Solution Modes

The following solution modes are commonly used:

� Dynamic simultaneous forecastmode is used for forecasting with the model.Collect the historical data on the model variables, the future assumptions ofthe exogenous variables, and any prior information on the future endogenousvalues, and combine them in a SAS data set. Use the FORECAST option onthe SOLVE statement.

� Dynamic simultaneous simulationmode is often calledex-post simulation, his-torical simulation, or ex-post forecasting. Use the DYNAMIC option. Thismode is the default.

� Static simultaneous simulationmode can be used to examine the within-periodperformance of the model without the complications of previous period errors.Use the STATIC option.

� NAHEAD=n dynamic simultaneous simulationmode can be used to see howwell n-period-ahead forecasting would have performed over the historical pe-riod. Use the NAHEAD=n option.

The different solution modes are explained in detail in the following sections.

Dynamic and Static SimulationsIn model simulation, either solved values or actual values from the data set can beused to supply lagged values of an endogenous variable. Adynamicsolution refers



to a solution obtained by using only solved values for the lagged values. Dynamicmode is used both for forecasting and for simulating the dynamic properties of themodel.

A staticsolution refers to a solution obtained by using the actual values when avail-able for the lagged endogenous values. Static mode is used to simulate the behaviorof the model without the complication of previous period errors. Dynamic simulationis the default.

If you wish to use static values for lags only for the firstn observations, and dynamicvalues thereafter, specify the START=n option. For example, if you want a dynamicsimulation to start after observation twenty-four, specify START=24 on the SOLVEstatement. If the model being simulated had a value lagged for four time periods, thenthis value would start using dynamic values when the simulation reached observationnumber 28.

n-Period-Ahead ForecastingSuppose you want to regularly forecast 12 months ahead and produce a new forecasteach month as more data becomes available.n-period-ahead forecasting allows youto test how well you would have done over time had you been using your model toforecast 1 year ahead.

To see how well a model predictsn time periods in the future, perform ann-period-ahead forecast on real data and compare the forecast values with the actual values.

n-period-ahead forecasting refers to using dynamic values for the lagged endogenousvariables only for lags1 throughn-1. For example, 1-period-ahead forecasting, spec-ified by the NAHEAD=1 option on the SOLVE statement, is the same as if a staticsolution had been requested. Specifying NAHEAD=2 produces a solution that usesdynamic values for lag one and static, actual, values for longer lags.

The following example is a 2-year-ahead dynamic simulation. The output is shownin Figure 14.59.

data yearly;input year x1 x2 x3 y1 y2 y3;datalines;

84 4 9 0 7 4 585 5 6 1 1 27 486 3 8 2 5 8 287 2 10 3 0 10 1088 4 7 6 20 60 4089 5 4 8 40 40 4090 3 2 10 50 60 6091 2 5 11 40 50 60;run;

proc model data=yearly outmodel=foo;endogenous y1 y2 y3;exogenous x1 x2 x3;

y1 = 2 + 3*x1 - 2*x2 + 4*x3;



y2 = 4 + lag2( y3 ) + 2*y1 + x1;y3 = lag3( y1 ) + y2 - x2;

solve y1 y2 y3 / nahead=2 out=c;run;

proc print data=c;run;

The MODEL ProcedureDynamic Simultaneous 2-Periods-Ahead Forecasting Simulation

Data Set Options

DATA= YEARLYOUT= C

Solution Summary

Variables Solved 3Simulation Lag Length 3Solution Method NEWTONCONVERGE= 1E-8Maximum CC 0Maximum Iterations 1Total Iterations 8Average Iterations 1


Read 20Lagged 12Solved 8First 5Last 8

Variables Solved For y1 y2 y3

Figure 14.59. NAHEAD Summary Report

Obs _TYPE_ _MODE_ _LAG_ _ERRORS_ y1 y2 y3 x1 x2 x3

1 PREDICT SIMULATE 0 0 0 10 7 2 10 32 PREDICT SIMULATE 1 0 24 58 52 4 7 63 PREDICT SIMULATE 1 0 41 101 102 5 4 84 PREDICT SIMULATE 1 0 47 141 139 3 2 105 PREDICT SIMULATE 1 0 42 130 145 2 5 11

Figure 14.60. C Data Set

The proceding 2-year-ahead simulation can be emulated without using the NA-HEAD= option by the following PROC MODEL statements:

proc model data=test model=foo;range year = 87 to 88;solve y1 y2 y3 / dynamic solveprint;

run;



range year = 88 to 89;solve y1 y2 y3 / dynamic solveprint;

run;


run;


The totals shown under "Observations Processed" in Figure 14.59 are equal to thesum of the four individual runs.

Simulation and ForecastingYou can perform a simulation of your model or use the model to produce forecasts.Simulationrefers to the determination of the endogenous or dependent variables asa function of the input values of the other variables, even when actual data for someof the solution variables are available in the input data set. The simulation mode isuseful for verifying the fit of the model parameters. Simulation is selected by theSIMULATE option on the SOLVE statement. Simulation mode is the default.

In forecast mode, PROC MODEL solves only for those endogenous variables thatare missing in the data set. The actual value of an endogenous variable is used asthe solution value whenever nonmissing data for it are available in the input data set.Forecasting is selected by the FORECAST option on the SOLVE statement.

For example, an econometric forecasting model can contain an equation to predictfuture tax rates, but tax rates are usually set in advance by law. Thus, for the first yearor so of the forecast, the predicted tax rate should really be exogenous. Or, you maywant to use a prior forecast of a certain variable from a short-run forecasting modelto provide the predicted values for the earlier periods of a longer-range forecast of along-run model. A common situation in forecasting is when historical data neededto fill the initial lags of a dynamic model are available for some of the variables buthave not yet been obtained for others. In this case, the forecast must start in the pastto supply the missing initial lags. Clearly, you should use the actual data that areavailable for the lags. In all the preceding cases, the forecast should be produced byrunning the model in the FORECAST mode; simulating the model over the futureperiods would not be appropriate.

Monte Carlo SimulationThe accuracy of the forecasts produced by PROC MODEL depends on four sourcesof error (Pindyck 1981, 405-406):

� The system of equations contains an implicit random error term�

g(y;x; �) = �

wherey, x, g, �, and� are vector valued.

� The estimated values of the parameters,�, are themselves random variables.



� The exogenous variables may have been forecast themselves and therefore maycontain errors.

� The system of equations may be incorrectly specified; the model only approx-imates the process modeled.

The RANDOM= option is used to request Monte Carlo (or stochastic) simulations togenerate confidence intervals for errors arising from the first two sources. The MonteCarlo simulations can be performed with�, �, or both vectors represented as randomvariables. The SEED= option is used to control the random number generator for thesimulations. SEED=0 forces the random number generator to use the system clockas its seed value.

In Monte Carlo simulations, repeated simulations are performed on the model forrandom perturbations of the parameters and the additive error term. The randomperturbations follow a multivariate normal distribution with expected value of 0 andcovariance described by a covariance matrix of the parameter estimates in the case of�, or a covariance matrix of the equation residuals for the case of�. PROC MODELcan generate both covariance matrices or you can provide them.

The ESTDATA= option specifies a data set containing an estimate of the covariancematrix of the parameter estimates to use for computing perturbations of the parame-ters. The ESTDATA= data set is usually created by the FIT statement with the OUT-EST= and OUTCOV options. When the ESTDATA= option is specified, the matrixread from the ESTDATA= data set is used to compute vectors of random shocks orperturbations for the parameters. These random perturbations are computed at thestart of each repetition of the solution and added to the parameter values. The per-turbed parameters are fixed throughout the solution range. If the covariance matrixof the parameter estimates is not provided, the parameters are not perturbed.

The SDATA= option specifies a data set containing the covariance matrix of the resid-uals to use for computing perturbations of the equations. The SDATA= data set isusually created by the FIT statement with the OUTS= option. When SDATA= is spec-ified, the matrix read from the SDATA= data set is used to compute vectors of randomshocks or perturbations for the equations. These random perturbations are computedat each observation. The simultaneous solution satisfies the model equations plus therandom shocks. That is, the solution is not a perturbation of a simultaneous solutionof the structural equations; rather, it is a simultaneous solution of the stochastic equa-tions using the simulated errors. If the SDATA= option is not specified, the randomshocks are not used.

The different random solutions are identified by the–REP– variable in the OUT=data set. An unperturbed solution with–REP–=0 is also computed when the RAN-DOM= option is used. RANDOM=n producesn+1 solution observations for eachinput observation in the solution range. If the RANDOM= option is not specified,the SDATA= and ESTDATA= options are ignored, and no Monte Carlo simulation isperformed.

PROC MODEL does not have an automatic way of modeling the exogenous variablesas random variables for Monte Carlo simulation. If the exogenous variables have beenforecast, the error bounds for these variables should be included in the error bounds



generated for the endogenous variables. If the models for the exogenous variablesare included in PROC MODEL, then the error bounds created from a Monte Carlosimulation will contain the uncertainty due to the exogenous variables.

Alternatively, if the distribution of the exogenous variables is known, the built-in ran-dom number generator functions can be used to perturb these variables appropriatelyfor the Monte Carlo simulation. For example, if you knew the forecast of an exoge-nous variable, X, had a standard error of 5.2 and the error was normally distributed,then the following statements could be used to generate random values for X:

x_new = x + 5.2 * rannor(456);

During a Monte Carlo simulation the random number generator functions produceone value at each observation. It is important to use a different seed value for all therandom number generator functions in the model program; otherwise, the perturba-tions will be correlated. For the unperturbed solution,–REP–=0, the random numbergenerator functions return 0.

PROC UNIVARIATE can be used to create confidence intervals for the simulation(see the Monte Carlo simulation example in the "Getting Started" section).

Quasi-Random Number GeneratorsTraditionally high discrepancy psuedo-random number generators are used to gener-ate innovations in Monte Carlo simulations. Loosely translated, a high discrepancypsuedo-random number generator is one in which there is very little correlation be-tween the current number generated and the past numbers generated. This property isideal if indeed independance of the innovations is required. If, on the other hand, theefficient spanning of a multi-dimensional space is desired, a low discrepancy, quasi-random number generator can be used. A quasi-random number generator producesnumbers which have no random component.

A simple one-dimensional quasi-random sequence is the van der Corput sequence.Given a prime number r (r � 2 ) any integer has a unique representation in termsof base r. A number in the interval [0,1) can be created by inverting the representionbase power by base power. For example, consider r=3 and n=1. 1 in base 3 is

110 = 1 � 30 = 13

When the powers of 3 are inverted,

�(1) =1

3

Also 11 in base 3 is

1110 = 1 � 32 + 2 � 30 = 1023

When the powers of 3 are inverted,

�(11) =1

9+ 2 � 1

3=

7

9



The first 10 numbers in this squence�(1) : : : �(10) are provided below

0;1

3;2

3;1

9;4

9;7

9;2

9;5

9;8

9;1

27

As the sequence proceeds it fills in the gaps in a uniform fashion.

Several authors have expanded this idea to many dimensions. Two versions supportedby the MODEL procedure are the Sobol sequence (QUASI=SOBOL) and the Fauresequence (QUASI=FAURE). The Sobol sequence is based on binary numbers an isgenerally computationaly faster than the Faure sequence. The Faure sequence usesthe dimensionality of the problem to determine the number base to use to generatethe sequence. The Faure sequence has better distributional properties than the Sobolsequence for dimensions greater than 8.

As an example of the difference between a pseudo random number and a quasi ran-dom number consider simulating a bivariate normal with 100 draws.

Figure 14.61. A Bivariate Normal using 100 pseudo random draws



Figure 14.62. A Bivariate Normal using 100 Faure random draws

Solution Mode OutputThe following SAS statements dynamically forecast the solution to a nonlinear equa-tion:

proc model data=sashelp.citimon;parameters a 0.010708 b -0.478849 c 0.929304;lhur = 1/(a * ip) + b + c * lag(lhur);solve lhur / out=sim forecast dynamic;

run;

The first page of output produced by the SOLVE step is shown in Figure 14.63. Thisis the summary description of the model. The error message states that the simulationwas aborted at observation 144 because of missing input values.



The MODEL Procedure

Model Summary

Model Variables 1Parameters 3Equations 1Number of Statements 1Program Lag Length 1

Model Variables LHURParameters a(0.010708) b(-0.478849) c(0.929304)

Equations LHUR

The MODEL ProcedureDynamic Single-Equation Forecast

ERROR: Solution values are missing because of missing input values forobservation 144 at NEWTON iteration 0.

NOTE: Additional information on the values of the variables at thisobservation, which may be helpful in determining the cause of the failureof the solution process, is printed below.

Iteration Errors - Missing.NOTE: Simulation aborted.

Figure 14.63. Solve Step Summary Output

The second page of output, shown in Figure 14.64, gives more information on thefailed observation.


ERROR: Solution values are missing because of missing input values forobservation 144 at NEWTON iteration 0.


Observation 144 Iteration 0 CC -1.000000Missing 1

Iteration Errors - Missing.

--- Listing of Program Data Vector ---_N_: 144 ACTUAL.LHUR: . ERROR.LHUR: .IP: . LHUR: 7.10000 PRED.LHUR: .RESID.LHUR: . a: 0.01071 b: -0.47885c: 0.92930

NOTE: Simulation aborted.

Figure 14.64. Solve Step Error Message

From the program data vector you can see the variable IP is missing for observation144. LHUR could not be computed so the simulation aborted.

The solution summary table is shown in Figure 14.65.




Data Set Options

DATA= SASHELP.CITIMONOUT= SIM

Solution Summary

Variables Solved 1Forecast Lag Length 1Solution Method NEWTONCONVERGE= 1E-8Maximum CC 0Maximum Iterations 1Total Iterations 143Average Iterations 1


Read 145Lagged 1Solved 143First 2Last 145Failed 1

Variables Solved For LHUR

Figure 14.65. Solution Summary Report

This solution summary table includes the names of the input data set and the outputdata set followed by a description of the model. The table also indicates the solutionmethod defaulted to Newton’s method. The remaining output is defined as follows.



Maximum CC is the maximum convergence value accepted by the Newtonprocedure. This number is always less than the valuefor "CONVERGE=."

Maximum Iterations is the maximum number of Newton iterations performedat each observation and each replication of MonteCarlo simulations.

Total Iterations is the sum of the number of iterations required for eachobservation and each Monte Carlo simulation.

Average Iterations is the average number of Newton iterations required tosolve the system at each step.

Solved is the number of observations used times the number ofrandom replications selected plus one, for Monte Carlosimulations. The one additional simulation is the originalunperturbed solution. For simulations not involving MonteCarlo, this number is the number of observations used.

Summary StatisticsThe STATS and THEIL options are used to select goodness of fit statistics. Ac-tual values must be provided in the input data set for these statistics to be printed.When the RANDOM= option is specified, the statistics do not include the unper-turbed (–REP–=0) solution.

STATS Option OutputIf the STATS and THEIL options are added to the model in the previous section

proc model data=sashelp.citimon;parameters a 0.010708 b -0.478849 c 0.929304;lhur= 1/(a * ip) + b + c * lag(lhur) ;solve lhur / out=sim dynamic stats theil;range date to ’01nov91’d;

run;

the STATS output in Figure 14.66 and the THEIL output in Figure 14.67 are gener-ated.



The MODEL ProcedureDynamic Single-Equation Simulation

Solution Range DATE = FEB1980 To NOV1991

Descriptive Statistics

Actual PredictedVariable N Obs N Mean Std Dev Mean Std Dev

LHUR 142 142 7.0887 1.4509 7.2473 1.1465

Statistics of fit

Mean Mean % Mean Abs Mean Abs RMS RMS %Variable N Error Error Error % Error Error Error

LHUR 142 0.1585 3.5289 0.6937 10.0001 0.7854 11.2452

Statistics of fit

Variable R-Square Label

LHUR 0.7049 UNEMPLOYMENT RATE:ALL WORKERS,16 YEARS

Figure 14.66. STATS Output

The number of observations (Nobs), the number of observations with both predictedand actual values nonmissing (N), and the mean and standard deviation of the actualand predicted values of the determined variables are printed first. The next set ofcolumns in the output are defined as follows.



Mean Error 1N

PNj=1 (yj � yj)

Mean % Error 100N

PNj=1 (yj � yj)=yj

Mean Abs Error 1N

PNj=1 jyj � yjj

Mean Abs % Error 100N

PNj=1 j(yj � yj)=yj j

RMS Errorq

1N

PNj=1 (yj � yj)2

RMS % Error 100q

1N

PNj=1 ((yj � yj)=yj)2

R-square 1� SSE=CSSA

SSEPN

j=1 (yj � yj)2

SSAPN

j=1 (yj)2

CSSA SSA��PN

j=1 yj

�2y predicted value

y actual value

When the RANDOM= option is specified, the statistics do not include the unper-turbed (–REP–=0) solution.

THEIL Option OutputThe THEIL option specifies that Theil forecast error statistics be computed for theactual and predicted values and for the relative changes from lagged values. Mathe-matically, the quantities are

yc = (y � lag(y))=lag(y)

yc = (y � lag(y))=lag(y)

whereyc is the relative change for the predicted value andyc is the relative changefor the actual value.



The MODEL ProcedureDynamic Single-Equation Simulation

Solution Range DATE = FEB1980 To NOV1991

Theil Forecast Error Statistics

MSE Decomposition ProportionsCorr Bias Reg Dist Var Covar

Variable N MSE (R) (UM) (UR) (UD) (US) (UC)

LHUR 142.0 0.6168 0.85 0.04 0.01 0.95 0.15 0.81

Theil Forecast Error Statistics

Inequality CoefVariable U1 U Label

LHUR 0.1086 0.0539 UNEMPLOYMENT RATE:ALL WORKERS,16 YEARS

Theil Relative Change Forecast Error Statistics

Relative Change MSE Decomposition ProportionsCorr Bias Reg Dist Var Covar

Variable N MSE (R) (UM) (UR) (UD) (US) (UC)

LHUR 142.0 0.0126 -0.08 0.09 0.85 0.06 0.43 0.47

Theil Relative Change Forecast Error Statistics

Inequality CoefVariable U1 U Label

LHUR 4.1226 0.8348 UNEMPLOYMENT RATE:ALL WORKERS,16 YEARS

Figure 14.67. THEIL Output

The columns have the following meaning:

Corr (R) is the correlation coefficient,�, between the actual and predictedvalues.

� =cov(y; y)

�a�p

where�p and�a are the standard deviations of the predicted andactual values.

Bias (UM) is an indication of systematic error and measures the extent towhich the average values of the actual and predicted deviate fromeach other.

(E(y)� E(y))2

1N

PNt=1 (yt � yt)2



Reg (UR) is defined as(�p � � � �a)2=MSE . Consider the regression

y = �+ �y

If � = 1, UR will equal zero.

Dist (UD) is defined as(1� �2)�a�a=MSE and represents the variance ofthe residuals obtained by regressingyc on yc.

Var (US) is the variance proportion. US indicates the ability of the model toreplicate the degree of variability in the endogenous variable.

US =(�p � �a)

2

MSE

Covar (UC) represents the remaining error after deviations from average valuesand average variabilities have been accounted for.

UC =2(1 � �)�p�a

MSE

U1 is a statistic measuring the accuracy of a forecast.

U1 =MSEq

1N

PNt=1 (yt)

2

U is the Theil’s inequality coefficient defined as follows:

U =MSEq

1N

PNt=1 (yt)

2 +q

1N

PNt=1 (yt)

2

MSE is the mean square error

MSE =1

N

NXt=1

(yc� yc)2

More information on these statistics can be found in the references Maddala (1977,344–347) and Pindyck and Rubinfeld (1981, 364–365).

Goal Seeking: Solving for Right-Hand-Side Variables

The process of computing input values needed to produce target results is often calledgoal seeking. To compute a goal-seeking solution, use a SOLVE statement that liststhe variables you want to solve for and provide a data set containing values for theremaining variables.

Consider the following demand model for packaged rice

quantity demanded = �1 + �2price2=3 + �3income



whereprice is the price of the package andincomeis disposable personal income.The only variable the company has control over is the price it charges for rice. Thismodel is estimated using the following simulated data and PROC MODEL state-ments:

data demand;do t=1 to 40;

price = (rannor(10) +5) * 10;income = 8000 * t ** (1/8);demand = 7200 - 1054 * price ** (2/3) +

7 * income + 100 * rannor(1);output;

end;run;

data goal;demand = 85000;income = 12686;

run;

The goal is to find the price the company would have to charge to meet a sales targetof 85,000 units. To do this, a data set is created with a DEMAND variable set to85000 and with an INCOME variable set to 12686, the last income value.

proc model data=demand ;demand = a1 - a2 * price ** (2/3) + a3 * income;fit demand / outest=demest;

run;

The desired price is then determined using the following PROC MODEL statement:

solve price / estdata=demest data=goal solveprint;run;

The SOLVEPRINT option prints the solution values, number of iterations, and finalresiduals at each observation. The SOLVEPRINT output from this solve is shown inFigure 14.68.

The MODEL ProcedureSingle-Equation Simulation

Observation 1 Iterations 6 CC 0.000000 ERROR.demand 0.000000

Solution Values

price

33.59016

Figure 14.68. Goal Seeking, SOLVEPRINT Output

The output indicates that it took 6 Newton iterations to determine the PRICE of33.5902, which makes the DEMAND value within 16E-11 of the goal of 85,000units.



Consider a more ambitious goal of 100,000 units. The output shown in Figure 14.69indicates that the sales target of 100,000 units is not attainable according to thismodel.


NOTE: 3 parameter estimates were read from the ESTDATA=DEMEST data set.


ERROR: Could not reduce norm of residuals in 10 subiterations.

ERROR: The solution failed because 1 equations are missing or have extremevalues for observation 1 at NEWTON iteration 1.





ERROR: 2 execution errors for this observationNOTE: Check for missing input data or uninitialized lags.

(Note that the LAG and DIF functions return missing values for theinitial lag starting observations. This is a change from the 1982 and earlierversions of SAS/ETS which returned zero for uninitialized lags.)NOTE: Simulation aborted.

Figure 14.69. Goal Seeking, Convergence Failure

The program data vector indicates that even with PRICE nearly 0 (4.462312E-22) thedemand is still 4,164 less than the goal. You may need to reformulate your model orcollect more data to more accurately reflect the market response.

Numerical Solution Methods

If the SINGLE option is not used, PROC MODEL computes values that simultane-ously satisfy the model equations for the variables named in the SOLVE statement.PROC MODEL provides three iterative methods, Newton, Jacobi, and Seidel, forcomputing a simultaneous solution of the system of nonlinear equations.

Single-Equation SolutionFor normalized-form equation systems, the solution can either simultaneously satisfyall the equations or can be computed for each equation separately, using the actualvalues of the solution variables in the current period to compute each predicted value.By default, PROC MODEL computes a simultaneous solution. The SINGLE optionon the SOLVE statement selects single-equation solutions.

Single-equation simulations are often made to produce residuals (which estimate therandom terms of the stochastic equations) rather than the predicted values themselves.



If the input data and range are the same as that used for parameter estimation, a staticsingle-equation simulation will reproduce the residuals of the estimation.

Newton’s MethodThe NEWTON option on the SOLVE statement requests Newton’s method to simul-taneously solve the equations for each observation. Newton’s method is the defaultsolution method. Newton’s method is an iterative scheme that uses the derivatives ofthe equations with respect to the solution variables,J, to compute a change vector as

�yi = J�1q(yi;x; �)

PROC MODEL builds and solvesJ using efficient sparse matrix techniques. Thesolution variablesyi at theith iteration are then updated as

yi+1 = yi + d��yi

d is a damping factor between 0 and 1 chosen iteratively so that

kq(yi+1;x; �)k < kq(yi;x; �)k

The number of subiterations allowed for finding a suitabled is controlled by theMAXSUBITER= option. The number of iterations of Newton’s method allowed foreach observation is controlled by MAXITER= option. Refer to Ortega and Rheinbolt(1970) for more details.

Jacobi MethodThe JACOBI option on the SOLVE statement selects a matrix-free alternative to New-ton’s method. This method is the traditional nonlinear Jacobi method found in the lit-erature. The Jacobi method as implemented in PROC MODEL substitutes predictedvalues for the endogenous variables and iterates until a fixed point is reached. Thennecessary derivatives are computed only for the diagonal elements of the jacobian,J.

If the normalized-form equation is

y = f(y;x; �)

the Jacobi iteration has the form

yi+1 = f(yi;x; �)

Seidel MethodThe Seidel method is an order-dependent alternative to the Jacobi method. The Seidelmethod is selected by the SEIDEL option on the SOLVE statement and is applicableonly to normalized-form equations. The Seidel method is like the Jacobi method ex-cept that in the Seidel method the model is further edited to substitute the predictedvalues into the solution variables immediately after they are computed. Seidel thusdiffers from the other methods in that the values of the solution variables are notfixed within an iteration. With the other methods, the order of the equations in the



model program makes no difference, but the Seidel method may work much differ-ently when the equations are specified in a different sequence. Note that this fixedpoint method is the traditional nonlinear Seidel method found in the literature.

The iteration has the form

yi+1j = f(yi;x; �)

whereyi+1j is thejth equation variable at theith iteration and

yi = (yi+11 ; yi+12 ; yi+13 ; : : :; yi+1j�1; yij ; y

ij+1; : : :; y

ig)0

If the model is recursive, and if the equations are in recursive order, the Seidel methodwill converge at once. If the model is block-recursive, the Seidel method may con-verge faster if the equations are grouped by block and the blocks are placed in block-recursive order. The BLOCK option can be used to determine the block-recursiveform.

Comparison of MethodsNewton’s method is the default and should work better than the others for most small-to medium-sized models. The Seidel method is always faster than the Jacobi forrecursive models with equations in recursive order. For very large models and somehighly nonlinear smaller models, the Jacobi or Seidel methods can sometimes befaster. Newton’s method uses more memory than the Jacobi or Seidel methods.

Both the Newton’s method and the Jacobi method are order-invariant in the sense thatthe order in which equations are specified in the model program has no effect on theoperation of the iterative solution process. In order-invariant methods, the values ofthe solution variables are fixed for the entire execution of the model program. Assign-ments to model variables are automatically changed to assignments to correspondingequation variables. Only after the model program has completed execution are theresults used to compute the new solution values for the next iteration.

Troubleshooting ProblemsIn solving a simultaneous nonlinear dynamic model you may encounter some of thefollowing problems.

Missing ValuesFor SOLVE tasks, there can be no missing parameter values. If there are missingright-hand-side variables, this will result in a missing left-hand-side variable for thatobservation.

Unstable SolutionsA solution may exist but be unstable. An unstable system can cause the Jacobi andSeidel methods to diverge.

Explosive Dynamic SystemsA model may have well-behaved solutions at each observation but be dynamicallyunstable. The solution may oscillate wildly or grow rapidly with time.



Propagation of ErrorsDuring the solution process, solution variables can take on values that cause computa-tional errors. For example, a solution variable that appears in a LOG function may bepositive at the solution but may be given a negative value during one of the iterations.When computational errors occur, missing values are generated and propagated, andthe solution process may collapse.

Convergence ProblemsThe following items can cause convergence problems:

� illegal function values ( that isp�1 )

� local minima in the model equation

� no solution exists

� multiple solutions exist

� initial values too far from the solution

� the CONVERGE= value too small.

When PROC MODEL fails to find a solution to the system, the current iteration infor-mation and the program data vector are printed. The simulation halts if actual valuesare not available for the simulation to proceed. Consider the following program:

data test1;do t=1 to 50;

x1 = sqrt(t) ;y = .;output;

end;

proc model data=test1;exogenous x1 ;control a1 -1 b1 -29 c1 -4 ;y = a1 * sqrt(y) + b1 * x1 * x1 + c1 * lag(x1);solve y / out=sim forecast dynamic ;

run;

which produces the output shown in Figure 14.70.




ERROR: Could not reduce norm of residuals in 10 subiterations.

ERROR: The solution failed because 1 equations are missing or have extremevalues for observation 1 at NEWTON iteration 1.




--- Listing of Program Data Vector ---_N_: 12 ACTUAL.x1: 1.41421 ACTUAL.y: .ERROR.y: . PRED.y: . RESID.y: .a1: -1 b1: -29 c1: -4x1: 1.41421 y: [email protected]/@y: . @ERROR.y/@y: .


ERROR: 1 execution errors for this observationNOTE: Check for missing input data or uninitialized lags.

(Note that the LAG and DIF functions return missing values for theinitial lag starting observations. This is a change from the 1982 and earlierversions of SAS/ETS which returned zero for uninitialized lags.)NOTE: Simulation aborted.

Figure 14.70. SOLVE Convergence Problems

At the first observation the following equation is attempted to be solved:

y = �py � 62

There is no solution to this problem. The iterative solution process got as close as itcould to making Y negative while still being able to evaluate the model. This problemcan be avoided in this case by altering the equation.

In other models, the problem of missing values can be avoided by either altering thedata set to provide better starting values for the solution variables or by altering theequations.

You should be aware that, in general, a nonlinear system can have any number ofsolutions, and the solution found may not be the one that you want. When multiplesolutions exist, the solution that is found is usually determined by the starting valuesfor the iterations. If the value from the input data set for a solution variable is missing,the starting value for it is taken from the solution of the last period (if nonmissing) orelse the solution estimate is started at 0.



Iteration OutputThe iteration output, produced by the ITPRINT option, is useful in determining thecause of a convergence problem. The ITPRINT option forces the printing of thesolution approximation and equation errors at each iteration for each observation.A portion of the ITPRINT output from the following statement is shown in Figure14.71.

proc model data=test1;exogenous x1 ;control a1 -1 b1 -29 c1 -4 ;y = a1 * sqrt(abs(y)) + b1 * x1 * x1 + c1 * lag(x1);solve y / out=sim forecast dynamic itprint;

run;

For each iteration, the equation with the largest error is listed in parentheses after theNewton convergence criteria measure. From this output you can determine whichequation or equations in the system are not converging well.




Observation 1 Iteration 0 CC 613961.39 ERROR.y -62.01010

Predicted Values

y

0.0001000

Iteration Errors

y

-62.01010

Observation 1 Iteration 1 CC 50.902771 ERROR.y -61.88684

Predicted Values

y

-1.215784

Iteration Errors

y

-61.88684

Observation 1 Iteration 2 CC 0.364806 ERROR.y 41.752112

Predicted Values

y

-114.4503

Iteration Errors

y

41.75211

Figure 14.71. SOLVE, ITPRINT Output

Numerical Integration

The differential equation system is numerically integrated to obtain a solution for thederivative variables at each data point. The integration is performed by evaluating theprovided model at multiple points between each data point. The integration methodused is a variable order, variable step-size backward difference scheme; for more de-tailed information, refer to Aiken (1985) and Byrne (1975). The step size or time step



is chosen to satisfy alocal truncation errorrequirement. The termtruncation errorcomes from the fact that the integration scheme uses a truncated series expansionof the integrated function to do the integration. Because the series is truncated, theintegration scheme is within the truncation error of the true value.

To further improve the accuracy of the integration, the total integration time is brokenup into small intervals (time steps or step sizes), and the integration scheme is appliedto those intervals. The integration at each time step uses the values computed at theprevious time step so that the truncation error tends to accumulate. It is usually notpossible to estimate the global error with much precision. The best that can be doneis to monitor and to control the local truncation error, which is the truncation errorcommitted at each time step relative to

d = max0�t�T

(ky(t)k1; 1)

wherey(t) is the integrated variable. Furthermore, they(t)s are dynamically scaledto within two orders of magnitude one to keep the error monitoring well behaved.

The local truncation error requirement defaults to1:0E � 9. You can specify theLTEBOUND= option to modify that requirement. The LTEBOUND= option is a rel-ative measure of accuracy, so a value smaller than1:0E � 10 is usually not practical.A larger bound increases the speed of the simulation and estimation but decreasesthe accuracy of the results. If the LTEBOUND= option is set too small, the integra-tor is not able to take time steps small enough to satisfy the local truncation errorrequirement and still have enough machine precision to compute the results. Sincethe integrations are scaled to within1:0E � 2 of one, the simulated values should becorrect to at least seven decimal places.

There is a default minimum time step of1:0E � 14. This minimum time step iscontrolled by the MINTIMESTEP= option and the machine epsilon. If the minimumtime step is smaller than the machine epsilon times the final time value, the minimumtime step is increased automatically.

For the points between each observation in the data set, the values for nonintegratedvariables in the data set are obtained from a linear interpolation from the two closestpoints. Lagged variables can be used with integrations, but their values are discreteand are not interpolated between points. Lagging, therefore, can then be used to inputstep functions into the integration.

The derivatives necessary for estimation (the gradient with respect to the parameters)and goal seeking (the Jacobian) are computed by numerically integrating analyticalderivatives. The accuracy of the derivatives is controlled by the same integrationtechniques mentioned previously.



Limitations

There are limitations to the types of differential equations that can be solved or es-timated. One type is an explosive differential equation (finite escape velocity) forwhich the following differential equation is an example:

y0

= a�y; a > 0

If this differential equation is integrated too far in time,y exceeds the maximum valueallowed on the computer, and the integration terminates.

Likewise, differential systems that are singular cannot be solved or estimated in gen-eral. For example, consider the following differential system:

x0

= �y0 + 2x+ 4y + exp(t)

y0

= �x0 + y + exp(4�t)

This system has an analytical solution, but an accurate numerical solution is verydifficult to obtain. The reason is thaty

0

andx0

cannot be isolated on the left-handside of the equation. If the equation is modified slightly to

x0

= �y0 + 2x+ 4y + exp(t)

y0

= x0

+ y + exp(4t)

the system is nonsingular, but the integration process could still fail or be extremelyslow. If the MODEL procedure encounters either system, a warning message is is-sued.

This system can be rewritten as the following recursive system,

x0

= 0:5y + 0:5exp(4t) + x+ 1:5y � 0:5exp(t)

y0

= x0

+ y + exp(4t)

which can be estimated and simulated successfully with the MODEL procedure.

Petzold (1982) mentions a class of differential algebraic equations that, when inte-grated numerically, could produce incorrect or misleading results. An example ofsuch a system mentioned in Petzold (1982) is

y0

2(t) = y1(t) + g1(t)

0 = y2(t) + g2(t)



The analytical solution to this system depends ong and its derivatives at the currenttime only and not on its initial value or past history. You should avoid systems of thisand other similar forms mentioned in Petzold (1982).

SOLVE Data Sets

SDATA= Input Data SetThe SDATA= option reads a cross-equation covariance matrix from a data set. Thecovariance matrix read from the SDATA= data set specified on the SOLVE statementis used to generate random equation errors when the RANDOM= option specifiesMonte Carlo simulation.

Typically, the SDATA= data set is created by the OUTS= on a previous FIT statement.(The OUTS= data set from a FIT statement can be read back in by a SOLVE statementin the same PROC MODEL step.)

You can create an input SDATA= data set using the DATA step. PROC MODELexpects to find a character variable–NAME– in the SDATA= data set as well asvariables for the equations in the estimation or solution. For each observation witha –NAME– value matching the name of an equation, PROC MODEL fills the cor-responding row of the S matrix with the values of the names of equations found inthe data set. If a row or column is omitted from the data set, an identity matrix rowor column is assumed. Missing values are ignored. Since the S matrix is symmetric,you can include only a triangular part of the S matrix in the SDATA= data set withthe omitted part indicated by missing values. If the SDATA= data set contains multi-ple observations with the same–NAME–, the last values supplied for the–NAME–variable are used. The "OUTS= Data Set" section contains more details on the formatof this data set.

Use the TYPE= option to specify the type of estimation method used to produce theS matrix you want to input.

ESTDATA= Input Data SetThe ESTDATA= option specifies an input data set that contains an observation withvalues for some or all of the model parameters. It can also contain observations withthe rows of a covariance matrix for the parameters.

When the ESTDATA= option is used, parameter values are set from the first obser-vation. If the RANDOM= option is used and the ESTDATA= data set contains acovariance matrix, the covariance matrix of the parameter estimates is read and usedto generate pseudo-random shocks to the model parameters for Monte Carlo simu-lation. These random perturbations have a multivariate normal distribution with thecovariance matrix read from the ESTDATA= data set.

The ESTDATA= data set is usually created by the OUTEST= option in a FIT state-ment. The OUTEST= data set contains the parameter estimates produced by the FITstatement and also contains the estimated covariance of the parameter estimates if theOUTCOV option is used. This OUTEST= data set can be read in by the ESTDATA=option in a SOLVE statement.



You can also create an ESTDATA= data set with a SAS DATA step program. Thedata set must contain a numeric variable for each parameter to be given a value orcovariance column. The name of the variable in the ESTDATA= data set must matchthe name of the parameter in the model. Parameters with names longer than eightcharacters cannot be set from an ESTDATA= data set. The data set must also containa character variable–NAME– of length eight.–NAME– has a blank value for theobservation that gives values to the parameters.–NAME– contains the name of aparameter for observations defining rows of the covariance matrix.

More than one set of parameter estimates and covariances can be stored in the EST-DATA= data set if the observations for the different estimates are identified by thevariable–TYPE–. –TYPE– must be a character variable of length eight. The TYPE=option is used to select for input the part of the ESTDATA= data set for which thevalue of the–TYPE– variable matches the value of the TYPE= option.

OUT= Data SetThe OUT= data set contains solution values, residual values, and actual values of thesolution variables.

The OUT= data set contains the following variables:

� BY variables

� RANGE variable

� ID variables

� –TYPE–, a character variable of length eight identifying the type of obser-vation. The–TYPE– variable can be PREDICT, RESIDUAL, ACTUAL, orERROR.

� –MODE–, a character variable of length eight identifying the solution mode.

–MODE– takes the value FORECAST or SIMULATE.

� if lags are used, a numeric variable,–LAG–, containing the number of dy-namic lags that contribute to the solution. The value of–LAG– is always zerofor STATIC mode solutions.–LAG– is set to a missing value for lag-startingobservations.

� –REP–, a numeric variable containing the replication number, if the RAN-DOM= option is used. For example, if RANDOM=10, each input observationresults in eleven output observations with–REP– values 0 through 10. Theobservations with–REP–=0 are from the unperturbed solution. (The random-number generator functions are suppressed, and the parameter and endogenousperturbations are zero when–REP–=0.)

� –ERRORS–, a numeric variable containing the number of errors that occurredduring the execution of the program for the last iteration for the observation. Ifthe solution failed to converge, this is counted as one error, and the–ERRORS–variable is made negative.



� solution and other variables. The solution variables contain solutionor predicted values for–TYPE–=PREDICT observations, residuals for

–TYPE–=RESIDUAL observations, or actual values for–TYPE–=ACTUALobservations. The other model variables, and any other variables read from theinput data set, are always actual values from the input data set.

� any other variables named in the OUTVARS statement. These can be programvariables computed by the model program, CONTROL variables, parameters,or special variables in the model program. Compound variable names longerthan eight characters are truncated in the OUT= data set.

By default only the predicted values are written to the OUT= data set. The OUT-RESID, OUTACTUAL, and OUTERROR options are used to add the residual, actual,and ERROR. values to the data set.

For examples of the OUT= data set, see Example 14.6 at the end of this chapter.

DATA= Input Data SetThe input data set should contain all of the exogenous variables and should supplynonmissing values for them for each period to be solved.

Solution variables can be supplied in the input data set and are used as follows:

� to supply initial lags. For example, if the lag length of the model is three, threeobservations are read in to feed the lags before any solutions are computed.

� to evaluate the goodness of fit. Goodness-of-fit measures are computed basedon the difference between the solved values and the actual values supplied fromthe data set.

� to supply starting values for the iterative solution. If the value from the inputdata set for a solution variable is missing, the starting value for it is taken fromthe solution of the last period (if nonmissing) or else the solution estimate isstarted at zero.

� For STATIC mode solutions, actual values from the data set are used by thelagging functions for the solution variables.

� for FORECAST mode solutions, actual values from the data set are used as thesolution values when nonmissing.


Chapter 14. Programming Language Overview

Programming Language Overview

Variables in the Model Program

Variable names are alphanumeric but must start with a letter. The length of a variablename is limited to thirty-two characters for non-SAS data set variables

PROC MODEL uses several classes of variables, and different variable classes aretreated differently. Variable class is controlled bydeclaration statements. These arethe VAR, ENDOGENOUS, and EXOGENOUS statements for model variables, thePARAMETERS statement for parameters, and the CONTROL statement for controlclass variables. These declaration statements have several valid abbreviations. Var-ious internal variablesare also made available to the model program to allow com-munication between the model program and the procedure. RANGE, ID, and BYvariables are also available to the model program. Those variables not declared asany of the preceding classes areprogram variables.

Some classes of variables can be lagged; that is, their value at each observation isremembered, and previous values can be referred to by the lagging functions. Otherclasses have only a single value and are not affected by lagging functions. For ex-ample, parameters have only one value and are not affected by lagging functions;therefore, if P is a parameter, DIFn(P) is always 0, and LAGn(P) is always the sameas P for all values ofn.

The different variable classes and their roles in the model are described in the follow-ing.

Model VariablesModel variables are declared by VAR, ENDOGENOUS, or EXOGENOUS state-ments, or by FIT and SOLVE statements. The model variables are the variables thatthe model is intended to explain or predict.

PROC MODEL allows you to use expressions on the left-hand side of the equal signto define model equations. For example, a log linear model for Y can now be writtenas

log( y ) = a + b * x;

Previously, only a variable name was allowed on the left-hand side of the equal sign.

The text on the left hand side of the equation serves as the equation name used toidentify the equation in printed output, in the OUT= data sets, and in FIT or SOLVEstatements. To refer to equations specified using left-hand side expressions (on theFIT statement, for example), place the left-hand side expression in quotes. For exam-ple, the following statements fit a log linear model to the dependent variable Y:

proc model data=in;log( y ) = a + b * x;fit "log(y)";

run;



The estimation and simulation is performed by transforming the models into generalform equations. No actual or predicted value is available for general form equationsso noR2 or adjustedR2 will be computed.

Equation VariablesAn equation variable is one of several special variables used by PROC MODEL tocontrol the evaluation of model equations. An equation variable name consists of oneof the prefixes EQ, RESID, ERROR, PRED, or ACTUAL, followed by a period andthe name of a model equation.

Equation variable names can appear on parts of the PROC MODEL printed output,and they can be used in the model program. For example, RESID-prefixed vari-ables can be used in LAG functions to define equations with moving-average errorterms. See the "Autoregressive Moving-Average Error Processes" section earlier inthis chapter for details.

The meaning of these prefixes is detailed in the "Equation Translations" section.

ParametersParameters are variables that have the same value for each observation. Parameterscan be given values or can be estimated by fitting the model to data. During theSOLVE stage, parameters are treated as constants. If no estimation is performed, theSOLVE stage uses the initial value provided in either the ESTDATA= data set, theMODEL= file, or on the PARAMETER statement, as the value of the parameter.

The PARAMETERS statement declares the parameters of the model. Parameters arenot lagged, and they cannot be changed by the model program.

Control VariablesControl variables supply constant values to the model program that can be used tocontrol the model in various ways. The CONTROL statement declares control vari-ables and specifies their values. A control variable is like a parameter except that ithas a fixed value and is not estimated from the data.

Control variables are not reinitialized before each pass through the data and can thusbe used to retain values between passes. You can use control variables to vary theprogram logic. Control variables are not affected by lagging functions.

For example, if you have two versions of an equation for a variable Y, you could putboth versions in the model and, using a CONTROL statement to select one of them,produce two different solutions to explore the effect the choice of equation has on themodel:

select (case);when (1) y = ...first version of equation... ;when (2) y = ...second version of equation... ;

end;control case 1;solve / out=case1;

run;control case 2;solve / out=case2;

run;



RANGE, ID, and BY VariablesThe RANGE statement controls the range of observations in the input data set that isprocessed by PROC MODEL. The ID statement lists variables in the input data setthat are used to identify observations on the printout and in the output data set. TheBY statement can be used to make PROC MODEL perform a separate analysis foreach BY group. The variable in the RANGE statement, the ID variables, and the BYvariables are available for the model program to examine, but their values should notbe changed by the program. The BY variables are not affected by lagging functions.

BY Processing ImprovementsPrior to version 6.11, the BY processing in the SOLVE statement was performedonly for the DATA= data set. The last values in the ESTDATA= and SDATA= datasets were used regardless of the existence of BY variables in those two data sets.This constraint is now removed. If the BY variables are identical in the DATA=data set and the ESTDATA= data set, then the two data sets are syncronized and thesimulations are performed using the data and parameters for each BY group. Thisholds for BY variables in the SDATA= data set as well. If, at some point, the BYvariables don’t match, BY processing is abandoned in either the ESTDATA= data setor the SDATA= data set, whichever has the missing BY value. If the DATA= data setdoes not contain BY variables and the ESTDATA= data set or the SDATA= data setdoes, then BY processing is performed for the ESTDATA= data set and the SDATA=data set by reusing the data in the DATA= data set for each BY group.

Internal VariablesYou can use several internal variables in the model program to communicate with theprocedure. For example, if you wanted PROC MODEL to list the values of all thevariables when more than 10 iterations are performed and the procedure is past the20th observation, you can write

if _obs_ > 20 then if _iter_ > 10 then _list_ = 1;

Internal variables are not affected by lagging functions, and they cannot be changedby the model program except as noted. The following internal variables are available.The variables are all numeric except where noted.

–ERRORS– a flag that is set to 0 at the start of program execution and is set to anonzero value whenever an error occurs. The program can also setthe–ERRORS– variable.

–ITER– the iteration number. For FIT tasks, the value of–ITER– is nega-tive for preliminary grid-search passes. The iterative phase of theestimation starts with iteration 0. After the estimates have con-verged, a final pass is made to collect statistics with–ITER– set toa missing value. Note that at least one pass, and perhaps severalsubiteration passes as well, is made for each iteration. For SOLVEtasks,–ITER– counts the iterations used to compute the simulta-neous solution of the system.

–LAG– the number of dynamic lags that contribute to the solution at thecurrent observation.–LAG– is always 0 for FIT tasks and for



STATIC solutions. –LAG– is set to a missing value during thelag starting phase.

–LIST– list flag that is set to 0 at the start of program execution. The pro-gram can set–LIST– to a nonzero value to request a listing of thevalues of all the variables in the program after the program hasfinished executing.

–METHOD– is the solution method in use for SOLVE tasks.–METHOD– is setto a blank value for FIT tasks.–METHOD– is a character-valuedvariable. Values are NEWTON, JACOBI, SIEDEL, or ONEPASS.

–MODE– takes the value ESTIMATE for FIT tasks and the value SIMULATEor FORECAST for SOLVE tasks.–MODE– is a character-valuedvariable.

–NMISS– the number of missing or otherwise unusable observations duringthe model estimation. For FIT tasks,–NMISS– is initially set to0; at the start of each iteration,–NMISS– is set to the number ofunusable observations for the previous iteration. For SOLVE tasks,

–NMISS– is set to a missing value.

–NUSED– the number of nonmissing observations used in the estimation. ForFIT tasks, PROC MODEL initially sets–NUSED– to the numberof parameters; at the start of each iteration,–NUSED– is resetto the number of observations used in the previous iteration. ForSOLVE tasks,–NUSED– is set to a missing value.

–OBS– counts the observations being processed.–OBS– is negative or 0for observations in the lag starting phase.

–REP– the replication number for Monte Carlo simulation when the RAN-DOM= option is specified in the SOLVE statement.–REP– is 0when the RANDOM= option is not used and for FIT tasks. When

–REP–=0, the random-number generator functions always return0.

–WEIGHT– the weight of the observation. For FIT tasks,–WEIGHT– providesa weight for the observation in the estimation.–WEIGHT– is ini-tialized to 1.0 at the start of execution for FIT tasks. For SOLVEtasks,–WEIGHT– is ignored.

Program VariablesVariables not in any of the other classes are called program variables. Program vari-ables are used to hold intermediate results of calculations. Program variables arereinitialized to missing values before each observation is processed. Program vari-ables can be lagged. The RETAIN statement can be used to give program variablesinitial values and enable them to keep their values between observations.

Character VariablesPROC MODEL supports both numeric and character variables. Character variablesare not involved in the model specification but can be used to label observations, towrite debugging messages, or for documentation purposes. All variables are numericunless they are the following.



� character variables in a DATA= SAS data set

� program variables assigned a character value

� declared to be character by a LENGTH or ATTRIB statement.

Equation Translations

Equations written in normalized form are always automatically converted to generalform equations. For example, when a normalized-form equation such as

y = a + b*x;

is encountered, it is translated into the equations

PRED.y = a + b*x;RESID.y = PRED.y - ACTUAL.y;ERROR.y = PRED.y - y;

If the same system is expressed as the following general-form equation, then thisequation is used unchanged.

EQ.y = y - a + b*x;

This makes it easy to solve for arbitrary variables and to modify the error terms forautoregressive or moving average models.

Use the LIST option to see how this transformation is performed. For example, thefollowing statements produce the listing shown in Figure 14.72.

proc model data=line list;y = a1 + b1*x1 + c1*x2;fit y;

run;

The MODEL Procedure


1 15820:39 PRED.y = a1 + b1 * x1 + c1 * x2;1 15820:39 RESID.y = PRED.y - ACTUAL.y;1 15820:39 ERROR.y = PRED.y - y;

Figure 14.72. LIST Output

PRED.Y is the predicted value of Y, and ACTUAL.Y is the value of Y in the dataset. The predicted value minus the actual value, RESID.Y, is then the error term,�,for the original Y equation. ACTUAL.Y and Y have the same value for parameterestimation. For solve tasks, ACTUAL.Y is still the value of Y in the data set but Ybecomes the solved value; the value that satisfies PRED.Y - Y = 0.

The following are the equation variable definitions.



EQ. The value of an EQ-prefixed equation variable (normally used todefine a general-form equation) represents the failure of the equa-tion to hold. When the EQ.namevariable is 0, thenameequationis satisfied.

RESID. The RESID.namevariables represent the stochastic parts of theequations and are used to define the objective function for the es-timation process. A RESID.-prefixed equation variable is like anEQ-prefixed variable but makes it possible to use or transform thestochastic part of the equation. The RESID. equation is used inplace of the ERROR. equation for model solutions if it has beenreassigned or used in the equation.

ERROR. An ERROR.namevariable is like an EQ-prefixed variable, exceptthat it is used only for model solution and does not affect parameterestimation.

PRED. For a normalized-form equation (specified by assignment to amodel variable), the PRED.nameequation variable holds the pre-dicted value, wherenameis the name of both the model variableand the corresponding equation. (PRED-prefixed variables are notcreated for general-form equations.)

ACTUAL. For a normalized-form equation (specified by assignment to amodel variable), the ACTUAL.nameequation variable holds thevalue of thenamemodel variable read from the input data set.

DERT. The DERT.namevariable defines a differential equation. Once de-fined, it may be used on the right-hand side of another equation.

H. The H.namevariable specifies the functional form for the varianceof the named equation.

GMM–H. This is created for H.varsand is the moment equation for the vari-ance for GMM. This variable is used only for GMM.

GMM_H.name = RESID.name**2 - H.name;

MSE. The MSE.y variable contains the value of the mean square errorfor y at each iteration. An MSE. variable is created for each de-pendent/endogenous variable in the model. These variables can beused to specify the lagged values in the estimation and simulationof GARCH type models.

demret = intercept ;if ( _OBS_ = 1 ) then

h.demret = arch0 + arch1 * mse.demret +garch1 * mse.demret;

elseh.demret = arch0 +

arch1 * zlag( resid.demret ** 2) +garch1 * zlag(h.demret) ;



NRESID. This is created for H.vars and is the normalized residual of thevariable <name>. The formula is

NRESID.name = RESID.name/ sqrt(H.name);

The three equation variable prefixes, RESID., ERROR., and EQ. allow for controlover the objective function for the FIT, the SOLVE, or both the FIT and the SOLVEstages. For FIT tasks, PROC MODEL looks first for a RESID.namevariable foreach equation. If defined, the RESID-prefixed equation variable is used to define theobjective function for the parameter estimation process. Otherwise, PROC MODELlooks for an EQ-prefixed variable for the equation and uses it instead.

For SOLVE tasks, PROC MODEL looks first for an ERROR.name variable for eachequation. If defined, the ERROR-prefixed equation variable is used for the solutionprocess. Otherwise, PROC MODEL looks for an EQ-prefixed variable for the equa-tion and uses it instead. To solve the simultaneous equation system, PROC MODELcomputes values of the solution variables (the model variables being solved for) thatmake all of the ERROR.name and EQ.namevariables close to 0.

Derivatives

Nonlinear modeling techniques require the calculation of derivatives of certain vari-ables with respect to other variables. The MODEL procedure includes an analyticdifferentiator that determines the model derivatives and generates program code tocompute these derivatives. When parameters are estimated, the MODEL proceduretakes the derivatives of the equation with respect to the parameters. When the modelis solved, Newton’s method requires the derivatives of the equations with respect tothe variables solved for.

PROC MODEL uses exact mathematical formulas for derivatives of non-user-definedfunctions. For other functions, numerical derivatives are computed and used.

The differentiator differentiates the entire model program, including conditional logicand flow of control statements. Delayed definitions, as when the LAG of a programvariable is referred to before the variable is assigned a value, are also differentiatedcorrectly.

The differentiator includes optimization features that produce efficient code for thecalculation of derivatives. However, when flow of control statements such as GOTOstatements are used, the optimization process is impeded, and less efficient code forderivatives may be produced. Optimization is also reduced by conditional statements,iterative DO loops, and multiple assignments to the same variable.

The table of derivatives is printed with the LISTDER option. The code generated forthe computation of the derivatives is printed with the LISTCODE option.

Derivative VariablesWhen the differentiator needs to generate code to evaluate the expression for thederivative of a variable, the result is stored in a special derivative variable. Derivativevariables are not created when the derivative expression reduces to a previously com-puted result, a variable, or a constant. The names of derivative variables, which may



sometimes appear in the printed output, have the form @obj/@wrt, whereobj is thevariable whose derivative is being taken andwrt is the variable that the differentiationis with respect to. For example, the derivative variable for the derivative ofY withrespect toX is named@Y/@X.

The derivative variables cannot be accessed or used as part of the model program.

Mathematical Functions

The following is a brief summary of SAS functions useful for defining models. Addi-tional functions and details are inSAS Language: Reference. Information on creatingnew functions can be found inSAS/TOOLKIT Software: Usage and Reference, chap-ter 15, "Writing a SAS Function or Call Routine."

ABS(x) the absolute value ofx

ARCOS(x) the arccosine in radians ofx. x should be between�1 and1.

ARSIN(x) the arcsine in radians ofx. x should be between�1 and1.

ATAN(x) the arctangent in radians ofx

COS(x) the cosine ofx. x is in radians.

COSH(x) the hyperbolic cosine ofx

EXP(x) ex

LOG(x) the natural logarithm ofx

LOG10(x) the log base ten ofx

LOG2(x) the log base two ofx

SIN(x) the sine ofx. x is in radians.

SINH(x) the hyperbolic sine ofx

SQRT(x) the square root ofx

TAN(x) the tangent ofx. x is in radians and is not an odd multiple of�=2.

TANH(x) the hyperbolic tangent ofx

Random-Number FunctionsThe MODEL procedure provides several functions for generating random numbersfor Monte Carlo simulation. These functions use the same generators as the corre-sponding SAS DATA step functions.

The following random-number functions are supported: RANBIN, RANCAU, RAN-EXP, RANGAM, RANNOR, RANPOI, RANTBL, RANTRI, and RANUNI. Formore information, refer toSAS Language: Reference.

Each reference to a random-number function sets up a separate pseudo-random se-quence. Note that this means that two calls to the same random function with the sameseed produce identical results. This is different from the behavior of the random-number functions used in the SAS DATA step. For example, the statements

x=rannor(123);y=rannor(123);z=rannor(567);



produce identical values for X and Y, but Z is from an independent pseudo-randomsequence.

For FIT tasks, all random-number functions always return 0. For SOLVE tasks, whenMonte Carlo simulation is requested, a random-number function computes a new ran-dom number on the first iteration for an observation (if it is executed on that iteration)and returns that same value for all later iterations of that observation. When MonteCarlo simulation is not requested, random-number functions always return 0.

Functions Across Time

PROC MODEL provides four types of special built-in functions that refer to the val-ues of variables and expressions in previous time periods. These functions have theform

LAGn( i , x ) returns theith lag ofx, wheren is the maximum lag;

DIFn(x) difference ofx at lagn

ZLAGn( i , x ) returns theith lag ofx, wheren is the maximum lag, with missinglags replaced with zero;

ZDIFn(x) difference with lag length truncated and missing values convertedto zero;

MOVAVGn( x ) the width of the moving average isn, andx is the variable or ex-pression to compute the moving average of. Missing values ofxare omitted in computing the average.

wheren represents the number of periods, andx is any expression. The argumenti isa variable or expression giving the lag length (0 <= i <= n), if the index valuei isomitted, the maximum lag lengthn is used.

If you do not specifyn, the number of periods is assumed to be one. For example,LAG(X) is the same as LAG1(X). No more than four digits can be used with a laggingfunction; that is, LAG9999 is the greatest LAG function, ZDIF9999 is the greatestZDIF function, and so on.

The LAG functions get values from previous observations and make them availableto the program. For example, LAG(X) returns the value of the variable X as it wascomputed in the execution of the program for the preceding observation. The expres-sion LAG2(X+2*Y) returns the value of the expression X+2*Y, computed using thevalues of the variables X and Y that were computed by the execution of the programfor the observation two periods ago.

The DIF functions return the difference between the current value of a variable orexpression and the value of its LAG. For example, DIF2(X) is a short way of writ-ing X-LAG2(X), and DIF15(SQRT(2*Z)) is a short way of writing SQRT(2*Z)-LAG15(SQRT(2*Z)).

The ZLAG and ZDIF functions are like the LAG and DIF functions, but they arenot counted in the determination of the program lag length, and they replace missing



values with 0s. The ZLAG function returns the lagged value if the lagged value isnonmissing, or 0 if the lagged value is missing. The ZDIF function returns the differ-enced value if the differenced value is nonmissing, or 0 if the value of the differencedvalue is missing. The ZLAG function is especially useful for models with ARMAerror processes. See "Lag Logic", which follows for details.

Lag LogicThe LAG and DIF lagging functions in the MODEL procedure are different from thequeuing functions with the same names in the DATA step. Lags are determined bythe final values that are set for the program variables by the execution of the modelprogram for the observation. This can have upsetting consequences for programs thattake lags of program variables that are given different values at various places in theprogram, for example,

temp = x + w;t = lag( temp );temp = q - r;s = lag( temp );

The expression LAG(TEMP) always refers to LAG(Q-R), never to LAG(X+W),since Q-R is the final value assigned to the variable TEMP by the model program.If LAG(X+W) is wanted for T, it should be computed as T=LAG(X+W) and notT=LAG(TEMP), as in the preceding example.

Care should also be exercised in using the DIF functions with program variables thatmay be reassigned later in the program. For example, the program

temp = x ;s = dif( temp );temp = 3 * y;

computes values for S equivalent to

s = x - lag( 3 * y );

Note that in the preceding examples, TEMP is a program variable,not a model vari-able. If it were a model variable, the assignments to it would be changed to assign-ments to a corresponding equation variable.

Note that whereas LAG1(LAG1(X)) is the same as LAG2(X), DIF1(DIF1(X)) isnotthe same as DIF2(X). The DIF2 function is the difference between the current pe-riod value at the point in the program where the function is executed and the finalvalue at the end of execution two periods ago; DIF2 is not the second difference.In contrast, DIF1(DIF1(X)) is equal to DIF1(X)-LAG1(DIF1(X)), which equals X-2*LAG1(X)+LAG2(X), which is the second difference of X.

More information on the differences between PROC MODEL and the DATA stepLAG and DIF functions is found in Chapter 2.



Lag LengthsThe lag length of the model program is the number of lags needed for any relevantequation. The program lag length controls the number of observations used to initial-ize the lags.

PROC MODEL keeps track of the use of lags in the model program and automati-cally determines the lag length of each equation and of the model as a whole. PROCMODEL sets the program lag length to the maximum number of lags needed to com-pute any equation to be estimated, solved, or needed to compute any instrument vari-able used.

In determining the lag length, the ZLAG and ZDIF functions are treated as alwayshaving a lag length of 0. For example, if Y is computed as

y = lag2( x + zdif3( temp ) );

then Y has a lag length of 2 (regardless of how TEMP is defined). If Y is computedas

y = zlag2( x + dif3( temp ) );

then Y has a lag length of 0.

This is so that ARMA errors can be specified without causing the loss of additionalobservations to the lag starting phase and so that recursive lag specifications, suchas moving-average error terms, can be used. Recursive lags are not permitted unlessthe ZLAG or ZDIF functions are used to truncate the lag length. For example, thefollowing statement produces an error message:

t = a + b * lag( t );

The program variable T depends recursively on its own lag, and the lag length of T istherefore undefined.

In the following equation RESID.Y depends on the predicted value for the Y equationbut the predicted value for the Y equation depends on the LAG of RESID.Y, and, thus,the predicted value for the Y equation depends recursively on its own lag.

y = yhat + ma * lag( resid.y );

The lag length is infinite, and PROC MODEL prints an error message and stops.Since this kind of specification is allowed, the recursion must be truncated at somepoint. The ZLAG and ZDIF functions do this.

The following equation is legal and results in a lag length for the Y equation equal tothe lag length of YHAT:

y = yhat + ma * zlag( resid.y );



Initially, the lags of RESID.Y are missing, and the ZLAG function replaces the miss-ing residuals with 0s, their unconditional expected values.

The ZLAG0 function can be used to zero out the lag length of an expression.ZLAG0(x) returns the current period value of the expressionx, if nonmissing, orelse returns 0, and prevents the lag length ofx from contributing to the lag length ofthe current statement.

Initializing LagsAt the start of each pass through the data set or BY group, the lag variables are set tomissing values and an initialization is performed to fill the lags. During this phase,observations are read from the data set, and the model variables are given values fromthe data. If necessary, the model is executed to assign values to program variables thatare used in lagging functions. The results for variables used in lag functions are saved.These observations are not included in the estimation or solution.

If, during the execution of the program for the lag starting phase, a lag functionrefers to lags that are missing, the lag function returns missing. Execution errors thatoccur while starting the lags are not reported unless requested. The modeling systemautomatically determines whether the program needs to be executed during the lagstarting phase.

If L is the maximum lag length of any equation being fit or solved, then the first Lobservations are used to prime the lags. If a BY statement is used, the first L observa-tions in the BY group are used to prime the lags. If a RANGE statement is used, thefirst L observations prior to the first observation requested in the RANGE statementare used to prime the lags. Therefore, there should be at least L observations in thedata set.

Initial values for the lags of model variables can also be supplied in VAR, ENDOGE-NOUS, and EXOGENOUS statements. This feature provides initial lags of solutionvariables for dynamic solution when initial values for the solution variable are notavailable in the input data set. For example, the statement

var x 2 3 y 4 5 z 1;

feeds the initial lags exactly like these values in an input data set:

Lag X Y Z2 3 5 .1 2 4 1

If initial values for lags are available in the input data set and initial lag values arealso given in a declaration statement, the values in the VAR, ENDOGENOUS, orEXOGENOUS statements take priority.

The RANGE statement is used to control the range of observations in the input dataset that are processed by PROC MODEL. In the statement

range date = ’01jan1924’d to ’01dec1943’d;



‘01jan1924’ specifies the starting period of the range, and ‘01dec1943’ specifies theending period. The observations in the data set immediately prior to the start of therange are used to initialize the lags.

Language Differences

For the most part, PROC MODEL programming statements work the same as theydo in the DATA step as documented inSAS Language: Reference. However, thereare several differences that should be noted.

DO Statement DifferencesThe DO statement in PROC MODEL does not allow a character index variable. Thus,the following DO statement is not valid in PROC MODEL, although it is supportedin the DATA step:

do i = ’A’, ’B’, ’C’; /* invalid PROC MODEL code */

IF Statement DifferencesThe IF statement in PROC MODEL does not allow a character-valued condition. Forexample, the following IF statement is not supported by PROC MODEL:

if ’this’ then statement;

Comparisons of character values are supported in IF statements, so the following IFstatement is acceptable:

if ’this’ < ’that’ then statement};

PROC MODEL allows for embedded conditionals in expressions. For example thefollowing two statements are equivalent:

flag = if time = 1 or time = 2 then conc+30/5 + dose*timeelse if time > 5 then (0=1) else (patient * flag);

if time = 1 or time = 2 then flag= conc+30/5 + dose*time;else if time > 5 then flag=(0=1); else flag=patient*flag;

Note that the ELSE operator only involves the first object or token after it so that thefollowing assignments are not equivalent:

total = if sum > 0 then sum else sum + reserve;total = if sum > 0 then sum else (sum + reserve);

The first assignment makes TOTAL always equal to SUM plus RESERVE.

PUT Statement DifferencesThe PUT statement, mostly used in PROC MODEL for program debugging, onlysupports some of the features of the DATA step PUT statement. It also has some newfeatures that the DATA step PUT statement does not support.



The PROC MODEL PUT statement does not support line pointers, factored lists,iteration factors, overprinting, the–INFILE– option, or the colon (:) format modifier.

The PROC MODEL PUT statement does support expressions but an expression mustbe enclosed in parentheses. For example, the following statement prints the squareroot of x:

put (sqrt(x));

Subscripted array names must be enclosed in parentheses. For example, the followingstatement prints theith element of the array A:

put (a i);

However, the following statement is an error:

put a i;

The PROC MODEL PUT statement supports the print item–PDV– to print a format-ted listing of all the variables in the program. For example, the following statementprints a much more readable listing of the variables than does the–ALL – print item:

put _pdv_;

To print all the elements of the array A, use the following statement:

put a;

To print all the elements of A with each value labeled by the name of the elementvariable, use the statement

put a=;

ABORT Statement DifferenceIn the MODEL procedure, the ABORT statement does not allow any arguments.

SELECT/WHEN/OTHERWISE Statement DifferencesThe WHEN and OTHERWISE statements allow more than one target statement. Thatis, DO groups are not necessary for multiple statement WHENs. For example inPROC MODEL, the following syntax is valid:

select;when(exp1)

stmt1;stmt2;

when(exp2)stmt3;stmt4;

end;



The ARRAY Statement

ARRAYarrayname [{dimensions}] [$ [length]] [ variables and constants];

The ARRAY statement is used to associate a name with a list of variables and con-stants. The array name can then be used with subscripts in the model program to referto the items in the list.

In PROC MODEL, the ARRAY statement does not support all the features of theDATA step ARRAY statement. Implicit indexing cannot be used; all array referencesmust have explicit subscript expressions. Only exact array dimensions are allowed;lower-bound specifications are not supported. A maximum of six dimensions is al-lowed.

On the other hand, the ARRAY statement supported by PROC MODEL does allowboth variables and constants to be used as array elements. You cannot make as-signments to constant array elements. Both dimension specification and the list ofelements are optional, but at least one must be supplied. When the list of elements isnot given or fewer elements than the size of the array are listed, array variables arecreated by suffixing element numbers to the array name to complete the element list.

The following are valid PROC MODEL array statements:

array x[120]; /* array X of length 120 */array q[2,2]; /* Two dimensional array Q */array b[4] va vb vc vd; /* B[2] = VB, B[4] = VD */array x x1-x30; /* array X of length 30, X[7] = X7 */array a[5] (1 2 3 4 5); /* array A initialized to 1,2,3,4,5 */

RETAIN Statement

RETAIN variables initial-values ;The RETAIN statement causes a program variable to hold its value from a previousobservation until the variable is reassigned. The RETAIN statement can be used toinitialize program variables.

The RETAIN statement does not work for model variables, parameters, or con-trol variables because the values of these variables are under the control of PROCMODEL and not programming statements. Use the PARMS and CONTROL state-ments to initialize parameters and control variables. Use the VAR, ENDOGENOUS,or EXOGENOUS statement to initialize model variables.

Storing Programs in Model Files

Models can be saved and recalled from SAS catalog files. SAS catalogs are specialfiles that can store many kinds of data structures as separate units in one SAS file.Each separate unit is called an entry, and each entry has an entry type that identifiesits structure to the SAS system.

In general, to save a model, use the OUTMODEL=nameoption on the PROCMODEL statement, wherenameis specified aslibref.catalog.entry, libref.entry, orentry. The libref, catalog, andentry names must be valid SAS names no more than



eight characters long. Thecatalogname is restricted to seven characters on the CMSoperating system. If not given, thecatalogname defaults to MODELS, and thelibrefdefaults to WORK. The entry type is always MODEL. Thus, OUTMODEL=X writesthe model to the file WORK.MODELS.X.MODEL.

The MODEL= option is used to read in a model. A list of model files can be specifiedin the MODEL= option, and a range of names with numeric suffixes can be given, asin MODEL=(MODEL1-MODEL10). When more than one model file is given, thelist must be placed in parentheses, as in MODEL=(A B C), except in the case of asingle name. If more than one model file is specified, the files are combined in theorder listed in the MODEL= option.

When the MODEL= option is specified in the PROC MODEL statement and modeldefinition statements are also given later in the PROC MODEL step, the model filesare read in first, in the order listed, and the model program specified in the PROCMODEL step is appended after the model program read from the MODEL= files. Theclass assigned to a variable, when multiple model files are used, is the last declarationof that variable. For example, if Y1 was declared endogenous in the model file M1and exogenous in the model file M2, the following statement will cause Y1 to bedeclared exogenous.

proc model model=(m1 m2);

The INCLUDE statement can be used to append model code to the current modelcode. In contrast, when the MODEL= option is used on the RESET statement, thecurrent model is deleted before the new model is read.

No model file is output by default if the PROC MODEL step performs any FIT orSOLVE tasks, or if the MODEL= option or the NOSTORE option is used. How-ever, to ensure compatibility with previous versions of SAS/ETS software, when thePROC MODEL step does nothing but compile the model program, no input model fileis read, and the NOSTORE option is not used, a model file is written. This model fileis the default input file for a later PROC SYSNLIN or PROC SIMNLIN step. The de-fault output model filename in this case is WORK.MODELS.–MODEL–.MODEL.

If FIT statements are used to estimate model parameters, the parameter estimateswritten to the output model file are the estimates from the last estimation performedfor each parameter.

Diagnostics and Debugging

PROC MODEL provides several features to aid in finding errors in the model pro-gram. These debugging features are not usually needed; most models can be devel-oped without them.

The example model program that follows will be used in the following sections toillustrate the diagnostic and debugging capabilities. This example is the estimationof a segmented model.



*---------Fitting a Segmented Model using MODEL----*| | || y | quadratic plateau || | y=a+b*x+c*x*x y=p || | ..................... || | . : || | . : || | . : || | . : || | . : || +-----------------------------------------X || x0 || || continuity restriction: p=a+b*x0+c*x0**2 || smoothness restriction: 0=b+2*c*x0 so x0=-b/(2*c)|*--------------------------------------------------*;title ’QUADRATIC MODEL WITH PLATEAU’;data a;

input y x @@;datalines;

.46 1 .47 2 .57 3 .61 4 .62 5 .68 6 .69 7

.78 8 .70 9 .74 10 .77 11 .78 12 .74 13 .80 13

.80 15 .78 16;proc model data=a;parms a 0.45 b 0.5 c -0.0025;

x0 = -.5*b / c; /* join point */if x < x0 then /* Quadratic part of model */

y = a + b*x + c*x*x;else /* Plateau part of model */

y = a + b*x0 + c*x0*x0;

fit y;run;

Program ListingThe LIST option produces a listing of the model program. The statements are printedone per line with the original line number and column position of the statement.

The program listing from the example program is shown in Figure 14.73.



QUADRATIC MODEL WITH PLATEAU

The MODEL Procedure


1 15888:74 x0 = (-0.5 * b) / c;2 15888:96 if x < x0 then3 15888:124 PRED.y = a + b * x + c * x * x;3 15888:124 RESID.y = PRED.y - ACTUAL.y;3 15888:124 ERROR.y = PRED.y - y;4 15888:148 else5 15888:176 PRED.y = a + b * x0 + c * x0 * x0;5 15888:176 RESID.y = PRED.y - ACTUAL.y;5 15888:176 ERROR.y = PRED.y - y;

Figure 14.73. LIST Output for Segmented Model

The LIST option also shows the model translations that PROC MODEL performs.LIST output is useful for understanding the code generated by the %AR and the%MA macros.

Cross-ReferenceThe XREF option produces a cross-reference listing of the variables in the modelprogram. The XREF listing is usually used in conjunction with the LIST option. TheXREF listing does not include derivative (@-prefixed) variables. The XREF listingdoes not include generated assignments to equation variables, PRED, RESID, andERROR-prefixed variables, unless the DETAILS option is used.

The cross-reference from the example program is shown in Figure 14.74.

The MODEL Procedure

Cross Reference Listing For ProgramSymbol----------- Kind Type References (statement)/(line):(col)

a Var Num Used: 3/15913:130 5/15913:182b Var Num Used: 1/15913:82 3/15913:133 5/15913:185c Var Num Used: 1/15913:85 3/15913:139 5/15913:192x0 Var Num Assigned: 1/15913:85

Used: 2/15913:103 5/15913:1855/15913:192 5/15913:195

x Var Num Used: 2/15913:103 3/15913:1333/15913:139 3/15913:141

PRED.y Var Num Assigned: 3/15913:136 5/15913:189

Figure 14.74. XREF Output for Segmented Model



Compiler ListingThe LISTCODE option lists the model code and derivatives tables produced by thecompiler. This listing is useful only for debugging and should not normally beneeded.

LISTCODE prints the operator and operands of each operation generated by the com-piler for each model program statement. Many of the operands are temporary vari-ables generated by the compiler and given names such as #temp1. When derivativesare taken, the code listing includes the operations generated for the derivatives calcu-lations. The derivatives tables are also listed.

A LISTCODE option prints the transformed equations from the example shown inFigure 14.75 and Figure 14.76.

The MODEL Procedure


1 16459:83 x0 = (-0.5 * b) / c;1 16459:83 @x0/@b = -0.5 / c;1 16459:83 @x0/@c = (0 - x0) / c;2 16459:105 if x < x0 then3 16459:133 PRED.y = a + b * x + c * x * x;3 16459:133 @PRED.y/@a = 1;3 16459:133 @PRED.y/@b = x;3 16459:133 @PRED.y/@c = x * x;3 16459:133 RESID.y = PRED.y - ACTUAL.y;3 16459:133 @RESID.y/@a = @PRED.y/@a;3 16459:133 @RESID.y/@b = @PRED.y/@b;3 16459:133 @RESID.y/@c = @PRED.y/@c;3 16459:133 ERROR.y = PRED.y - y;4 16459:157 else5 16459:185 PRED.y = a + b * x0 + c * x0 * x0;5 16459:185 @PRED.y/@a = 1;5 16459:185 @PRED.y/@b = x0 + b * @x0/@b + (c

* @x0/@b * x0 + c * x0 * @x0/@b);5 16459:185 @PRED.y/@c = b * @x0/@c + ((x0 + c

* @x0/@c) * x0 + c * x0 * @x0/@c);5 16459:185 RESID.y = PRED.y - ACTUAL.y;5 16459:185 @RESID.y/@a = @PRED.y/@a;5 16459:185 @RESID.y/@b = @PRED.y/@b;5 16459:185 @RESID.y/@c = @PRED.y/@c;5 16459:185 ERROR.y = PRED.y - y;

Figure 14.75. LISTCODE Output for Segmented Model - Statements as Parsed



The MODEL Procedure

1 Stmt ASSIGN line 5619 column83. (1) arg=x0argsave=x0Source Text: x0 = -.5*b / c;

Oper * at 5619:91 (30,0,2). * : #temp1 <- -0.5 bOper / at 5619:94 (31,0,2). / : x0 <- #temp1 cOper eeocf at 5619:94 (18,0,1). eeocf : _DER_ <- _DER_Oper / at 5619:94 (31,0,2). / : @x0/@b <- -0.5 cOper - at 5619:94 (33,0,2). - : @1dt1_2 <- 0 x0Oper / at 5619:94 (31,0,2). / : @x0/@c <- @1dt1_2 c

2 Stmt IF line 5619 column ref.st=ASSIGN stmt105. (2) arg=#temp1 number 5 at 5619:185argsave=#temp1Source Text: if x < x0 then

Oper < at 5619:112 < : #temp1 <- x x0(36,0,2).

3 Stmt ASSIGN line 5619 column133. (1) arg=PRED.yargsave=ySource Text: y = a + b*x + c*x*x;

Oper * at 5619:142 * : #temp1 <- b x(30,0,2).

Oper + at 5619:139 + : #temp2 <- a #temp1(32,0,2).

Oper * at 5619:148 * : #temp3 <- c x(30,0,2).

Oper * at 5619:150 * : #temp4 <- #temp3 x(30,0,2).

Oper + at 5619:145 + : PRED.y <- #temp2 #temp4(32,0,2).

Oper eeocf at 5619:150 eeocf : _DER_ <- _DER_(18,0,1).

Oper * at 5619:150 * : @1dt1_1 <- x x(30,0,2).

Oper = at 5619:145 (1,0,1). = : @PRED.y/@a <- 1Oper = at 5619:145 (1,0,1). = : @PRED.y/@b <- xOper = at 5619:145 (1,0,1). = : @PRED.y/@c <- @1dt1_1

3 Stmt Assign line 5619 column133. (1) arg=RESID.yargsave=y

Oper - at 5619:133 - : RESID.y <- PRED.y ACTUAL.y(33,0,2).

Oper eeocf at 5619:133 eeocf : _DER_ <- _DER_(18,0,1).

Oper = at 5619:133 (1,0,1). = : @RESID.y/@a <- @PRED.y/@aOper = at 5619:133 (1,0,1). = : @RESID.y/@b <- @PRED.y/@bOper = at 5619:133 (1,0,1). = : @RESID.y/@c <- @PRED.y/@c

3 Stmt Assign line 5619 column133. (1) arg=ERROR.yargsave=y

Oper - at 5619:133 - : ERROR.y <- PRED.y y(33,0,2).

4 Stmt ELSE line 5619 column157. (9)Source Text: else

Figure 14.76. LISTCODE Output for Segmented Model - Compiled Code



Analyzing the Structure of Large Models

PROC MODEL provides several features to aid in analyzing the structure of themodel program. These features summarize properties of the model in various forms.

The following Klein’s model program is used to introduce the LISTDEP, BLOCK,and GRAPH options.

proc model out=m data=klein listdep graph block;endogenous c p w i x wsum k y;exogenous wp g t year;parms c0-c3 i0-i3 w0-w3;a: c = c0 + c1 * p + c2 * lag(p) + c3 * wsum;b: i = i0 + i1 * p + i2 * lag(p) + i3 * lag(k);c: w = w0 + w1 * x + w2 * lag(x) + w3 * year;x = c + i + g;y = c + i + g-t;p = x-w-t;k = lag(k) + i;wsum = w + wp;id year;

run;

Dependency ListThe LISTDEP option produces a dependency list for each variable in the model pro-gram. For each variable, a list of variables that depend on it and a list of variables itdepends on is given. The dependency list produced by the example program is shownin Figure 14.77.



The MODEL Procedure

Dependency Listing For ProgramSymbol----------- Dependencies

c Current values affect: ERROR.c PRED.xRESID.x ERROR.x PRED.y RESID.y ERROR.y

p Current values affect: PRED.c RESID.cERROR.c PRED.i RESID.i ERROR.i ERROR.pLagged values affect: PRED.c PRED.i

w Current values affect: ERROR.wPRED.p RESID.p ERROR.p PRED.wsumRESID.wsum ERROR.wsum

i Current values affect: ERROR.i PRED.xRESID.x ERROR.x PRED.y RESID.yERROR.y PRED.k RESID.k ERROR.k

x Current values affect: PRED.w RESID.wERROR.w ERROR.x PRED.p RESID.p ERROR.pLagged values affect: PRED.w

wsum Current values affect: PRED.cRESID.c ERROR.c ERROR.wsum

k Current values affect: ERROR.kLagged values affect: PRED.i RESID.iERROR.i PRED.k RESID.k

Figure 14.77. A Portion of the LISTDEP Output for Klein’s Model

BLOCK ListingThe BLOCK option prints an analysis of the program variables based on the assign-ments in the model program. The output produced by the example is shown in Figure14.78.

The MODEL ProcedureModel Structure Analysis

(Based on Assignments to Endogenous Model Variables)

Exogenous Variables wp g t yearEndogenous Variables c p w i x wsum k y

NOTE: The System Consists of 2 Recursive Equations and 1 Simultaneous Blocks.

Block Structure of the System

Block 1 c p w i x wsum

Dependency Structure of the System

Block 1 Depends On All_Exogenousk Depends On Block 1 All_Exogenousy Depends On Block 1 All_Exogenous

Figure 14.78. The BLOCK Output for Klein’s Model

One use for the block output is to put a model in recursive form. Simulations of themodel can be done with the SEIDEL method, which is efficient if the model is recur-sive and if the equations are in recursive order. By examining the block output, youcan determine how to reorder the model equations for the most efficient simulation.



Adjacency GraphThe GRAPH option displays the same information as the BLOCK option with theaddition of an adjacency graph. An X in a column in an adjacency graph indicatesthat the variable associated with the row depends on the variable associated with thecolumn. The output produced by the example is shown in Figure 14.79.



The MODEL Procedure

Adjacency Matrix for Graph of Systemw ys eu w a

Variable c p w i x m k y p g t r

* * * *c X X . . . X . . . . . .p . X X . X . . . . . X .w . . X . X . . . . . . Xi . X . X . . . . . . . .x X . . X X . . . . X . .wsum . . X . . X . . X . . .k . . . X . . X . . . . .y X . . X . . . X . X X .wp * . . . . . . . . X . . .g * . . . . . . . . . X . .t * . . . . . . . . . . X .year * . . . . . . . . . . . X

(Note: * = Exogenous Variable.)

Transitive Closure Matrix of Sorted Systemwsu

Block Variable c p w i x m k y

1 c X X X X X X . .1 p X X X X X X . .1 w X X X X X X . .1 i X X X X X X . .1 x X X X X X X . .1 wsum X X X X X X . .

k X X X X X X X .y X X X X X X . X

Adjacency Matrix for Graph of SystemIncluding Lagged Impacts

w ys eu w a

Block Variable c p w i x m k y p g t r

* * * *1 c X L . . . X . . . . . .1 p . X X . X . . . . . X .1 w . . X . L . . . . . . X1 i . L . X . . L . . . . .1 x X . . X X . . . . X . .1 wsum . . X . . X . . X . . .

k . . . X . . L . . . . .y X . . X . . . X . X X .wp * . . . . . . . . X . . .g * . . . . . . . . . X . .t * . . . . . . . . . . X .year * . . . . . . . . . . . X

(Note: * = Exogenous Variable.)

Figure 14.79. The GRAPH Output for Klein’s Model


Chapter 14. Examples

The first and last graphs are straightforward. The middle graph represents the depen-dencies of the nonexogenous variables after transitive closure has been performed(that is, A depends on B, and B depends on C, so A depends on C). The precedingtransitive closure matrix indicates that K and Y do not directly or indirectly dependon each other.

Examples

Example 14.1. OLS Single Nonlinear Equation

This example illustrates the use of the MODEL procedure for nonlinear ordinaryleast-squares (OLS) regression. The model is a logistic growth curve for the pop-ulation of the United States. The data is the population in millions recorded at tenyear intervals starting in 1790 and ending in 1990. For an explanation of the startingvalues given by the START= option, see "Troubleshooting Convergence Problems"earlier in this chapter. Portions of the output from the following code are shown inOutput 14.1.1 and Output 14.1.2.

title ’Logistic Growth Curve Model of U.S. Population’;data uspop;

input pop :6.3 @@;retain year 1780;year=year+10;label pop=’U.S. Population in Millions’;datalines;

3929 5308 7239 9638 12866 17069 23191 31443 39818 5015562947 75994 91972 105710 122775 131669 151325 179323 203211226542 248710;

proc model data=uspop;label a = ’Maximum Population’

b = ’Location Parameter’c = ’Initial Growth Rate’;

pop = a / ( 1 + exp( b - c * (year-1790) ) );fit pop start=(a 1000 b 5.5 c .02)/ out=resid outresid;

run;



Output 14.1.1. Logistic Growth Curve Model Summary

Logistic Growth Curve Model of U.S. Population

The MODEL Procedure

Model Summary


Model Variables popParameters a(1000) b(5.5) c(0.02)

Equations pop


The MODEL Procedure

The Equation to Estimate is

pop = F(a, b, c)

Output 14.1.2. Logistic Growth Curve Estimation Summary


The MODEL Procedure



pop 3 18 345.6 19.2020 0.9972 0.9969



a 387.9307 30.0404 12.91 <.0001 Maximum Populationb 3.990385 0.0695 57.44 <.0001 Location Parameterc 0.022703 0.00107 21.22 <.0001 Initial Growth Rate

The adjustedR2 value indicates the model fits the data well. There are only 21observations and the model is nonlinear, so significance tests on the parameters areonly approximate. The significance tests and associated approximate probabilitiesindicate that all the parameters are significantly different from 0.

The FIT statement included the options OUT=RESID and OUTRESID so that theresiduals from the estimation are saved to the data set RESID. The residuals are plot-ted to check for heteroscedasticity using PROC GPLOT as follows.



proc gplot data=resid;plot pop*year / vref=0;title "Residual";symbol1 v=plus;

run;

The plot is shown in Output 14.1.3.

Output 14.1.3. Residual for Population Model (Actual - Predicted)

The residuals do not appear to be independent, and the model could be modified toexplain the remaining nonrandom errors.

Example 14.2. A Consumer Demand Model

This example shows the estimation of a system of nonlinear consumer demand equa-tions based on the translog functional form using seemingly unrelated regression(SUR). Expenditure shares and corresponding normalized prices are given for threegoods.

Since the shares add up to one, the system is singular; therefore, one equation is omit-ted from the estimation process. The choice of which equation to omit is arbitrary.The parameter estimates of the omitted equation (share3) can be recovered from theother estimated parameters. The nonlinear system is first estimated in unrestrictedform.

title1 ’Consumer Demand--Translog Functional Form’;title2 ’Nonsymmetric Model’;proc model data=tlog1;



var share1 share2 p1 p2 p3;parms a1 a2 b11 b12 b13 b21 b22 b23 b31 b32 b33;bm1 = b11 + b21 + b31;bm2 = b12 + b22 + b32;bm3 = b13 + b23 + b33;lp1 = log(p1);lp2 = log(p2);lp3 = log(p3);share1 = ( a1 + b11 * lp1 + b12 * lp2 + b13 * lp3 ) /

( -1 + bm1 * lp1 + bm2 * lp2 + bm3 * lp3 );share2 = ( a2 + b21 * lp1 + b22 * lp2 + b23 * lp3 ) /

( -1 + bm1 * lp1 + bm2 * lp2 + bm3 * lp3 );fit share1 share2

start=( a1 -.14 a2 -.45 b11 .03 b12 .47 b22 .98 b31 .20b32 1.11 b33 .71 ) / outsused = smatrix sur;

run;

A portion of the printed output produced in the preceding example is shown in Out-put 14.2.1 .

Output 14.2.1. Estimation Results from the Unrestricted Model

Consumer Demand--Translog Functional FormNonsymmetric Model

The MODEL Procedure

Model Summary


Model Variables share1 share2 p1 p2 p3Parameters a1(-0.14) a2(-0.45) b11(0.03) b12(0.47) b13 b21

b22(0.98) b23 b31(0.2) b32(1.11) b33(0.71)Equations share1 share2


The MODEL Procedure


share1 = F(a1, b11, b12, b13, b21, b22, b23, b31, b32, b33)share2 = F(a2, b11, b12, b13, b21, b22, b23, b31, b32, b33)

NOTE: At SUR Iteration 2 CONVERGE=0.001 Criteria Met.




The MODEL Procedure

Nonlinear SUR Summary of Residual Errors


share1 5.5 38.5 0.00166 0.000043 0.00656 0.8067 0.7841share2 5.5 38.5 0.00135 0.000035 0.00592 0.9445 0.9380

Nonlinear SUR Parameter Estimates


a1 -0.14881 0.00225 -66.08 <.0001a2 -0.45776 0.00297 -154.29 <.0001b11 0.048382 0.0498 0.97 0.3379b12 0.43655 0.0502 8.70 <.0001b13 0.248588 0.0516 4.82 <.0001b21 0.586326 0.2089 2.81 0.0079b22 0.759776 0.2565 2.96 0.0052b23 1.303821 0.2328 5.60 <.0001b31 0.297808 0.1504 1.98 0.0550b32 0.961551 0.1633 5.89 <.0001b33 0.8291 0.1556 5.33 <.0001



The model is then estimated under the restriction of symmetry (bij=bji).

Hypothesis testing requires that theSmatrix from the unrestricted model be imposedon the restricted model, as explained in "Tests on Parameters" in this chapter. TheSmatrix saved in the data set SMATRIX is requested by the SDATA= option.

A portion of the printed output produced in the following example is shown in Out-put 14.2.2.

title2 ’Symmetric Model’;proc model data=tlog1;

var share1 share2 p1 p2 p3;parms a1 a2 b11 b12 b22 b31 b32 b33;bm1 = b11 + b12 + b31;bm2 = b12 + b22 + b32;bm3 = b31 + b32 + b33;lp1 = log(p1);lp2 = log(p2);lp3 = log(p3);share1 = ( a1 + b11 * lp1 + b12 * lp2 + b31 * lp3 ) /

( -1 + bm1 * lp1 + bm2 * lp2 + bm3 * lp3 );share2 = ( a2 + b12 * lp1 + b22 * lp2 + b32 * lp3 ) /



( -1 + bm1 * lp1 + bm2 * lp2 + bm3 * lp3 );fit share1 share2

start=( a1 -.14 a2 -.45 b11 .03 b12 .47 b22 .98 b31 .20b32 1.11 b33 .71 ) / sdata=smatrix sur;

run;

A chi-square test is used to see if the hypothesis of symmetry is accepted or rejected.(Oc-Ou) has a chi-square distribution asymptotically, whereOc is the constrainedOBJECTIVE*N andOu is the unconstrained OBJECTIVE*N. The degrees of free-dom is equal to the difference in the number of free parameters in the two models.

In this example, Ou is 76.9697 and Oc is 78.4097, resulting in a difference of 1.44with 3 degrees of freedom. You can obtain the probability value by using the follow-ing statements:

data _null_;/* reduced-full, nrestrictions */

p = 1-probchi( 1.44, 3 );put p=;

run;

The output from this DATA step run is ‘P=0.6961858724’. With this probability youcannot reject the hypothesis of symmetry. This test is asymptotically valid.

Output 14.2.2. Estimation Results from the Restricted Model

Consumer Demand--Translog Functional FormSymmetric Model

The MODEL Procedure


share1 = F(a1, b11, b12, b22, b31, b32, b33)share2 = F(a2, b11, b12, b22, b31, b32, b33)



Consumer Demand--Translog Functional FormSymmetric Model

The MODEL Procedure



share1 4 40 0.00166 0.000041 0.00644 0.8066 0.7920share2 4 40 0.00139 0.000035 0.00590 0.9428 0.9385



a1 -0.14684 0.00135 -108.99 <.0001a2 -0.4597 0.00167 -275.34 <.0001b11 0.02886 0.00741 3.89 0.0004b12 0.467827 0.0115 40.57 <.0001b22 0.970079 0.0177 54.87 <.0001b31 0.208143 0.00614 33.88 <.0001b32 1.102415 0.0127 86.51 <.0001b33 0.694245 0.0168 41.38 <.0001



Example 14.3. Vector AR(1) Estimation

This example shows the estimation of a two-variable vector AR(1) error process forthe Grunfeld model (Grunfeld 1960) using the %AR macro. First, the full modelis estimated. Second, the model is estimated with the restriction that the errors areunivariate AR(1) instead of a vector process. The following produces Output 14.3.1and Output 14.3.2.

data grunfeld;input year gei gef gec whi whf whc;label gei = ’Gross Investment GE’

gec = ’Capital Stock Lagged GE’gef = ’Value of Outstanding Shares GE Lagged’whi = ’Gross Investment WH’whc = ’Capital Stock Lagged WH’whf = ’Value of Outstanding Shares Lagged WH’;

datalines;1935 33.1 1170.6 97.8 12.93 191.5 1.81936 45.0 2015.8 104.4 25.90 516.0 .81937 77.2 2803.3 118.0 35.05 729.0 7.41938 44.6 2039.7 156.2 22.89 560.4 18.11939 48.1 2256.2 172.6 18.84 519.9 23.51940 74.4 2132.2 186.6 28.57 628.5 26.51941 113.0 1834.1 220.9 48.51 537.1 36.21942 91.9 1588.0 287.8 43.34 561.2 60.81943 61.3 1749.4 319.9 37.02 617.2 84.4



1944 56.8 1687.2 321.3 37.81 626.7 91.21945 93.6 2007.7 319.6 39.27 737.2 92.41946 159.9 2208.3 346.0 53.46 760.5 86.01947 147.2 1656.7 456.4 55.56 581.4 111.11948 146.3 1604.4 543.4 49.56 662.3 130.61949 98.3 1431.8 618.3 32.04 583.8 141.81950 93.5 1610.5 647.4 32.24 635.2 136.71951 135.2 1819.4 671.3 54.38 723.8 129.71952 157.3 2079.7 726.1 71.78 864.1 145.51953 179.5 2371.6 800.3 90.08 1193.5 174.81954 189.6 2759.9 888.9 68.60 1188.9 213.5;

title1 ’Example of Vector AR(1) Error ProcessUsing Grunfeld’’s Model’;

/* Note: GE stands for General Electricand WH for Westinghouse */

proc model outmodel=grunmod;var gei whi gef gec whf whc;parms ge_int ge_f ge_c wh_int wh_f wh_c;label ge_int = ’GE Intercept’

ge_f = ’GE Lagged Share Value Coef’ge_c = ’GE Lagged Capital Stock Coef’wh_int = ’WH Intercept’wh_f = ’WH Lagged Share Value Coef’wh_c = ’WH Lagged Capital Stock Coef’;

gei = ge_int + ge_f * gef + ge_c * gec;whi = wh_int + wh_f * whf + wh_c * whc;

run;

The preceding PROC MODEL step defines the structural model and stores it in themodel file named GRUNMOD.

The following PROC MODEL step reads in the model, adds the vector autoregressiveterms using %AR, and requests SUR estimation using the FIT statement.

title2 ’With Unrestricted Vector AR(1) Error Process’;proc model data=grunfeld model=grunmod;

%ar( ar, 1, gei whi )fit gei whi / sur;

run;

The final PROC MODEL step estimates the restricted model.

title2 ’With restricted AR(1) Error Process’;proc model data=grunfeld model=grunmod;

%ar( gei, 1 )%ar( whi, 1)fit gei whi / sur;

run;



Output 14.3.1. Results for the Unrestricted Model (Partial Output)

Example of Vector AR(1) Error Process Using Grunfeld’s ModelWith Unrestricted Vector AR(1) Error Process

The MODEL Procedure

Model Summary


Model Variables gei whi gef gec whf whcParameters ge_int ge_f ge_c wh_int wh_f wh_c ar_l1_1_1(0)

ar_l1_1_2(0) ar_l1_2_1(0) ar_l1_2_2(0)Equations gei whi


The MODEL Procedure


gei = F(ge_int, ge_f, ge_c, wh_int, wh_f, wh_c, ar_l1_1_1, ar_l1_1_2)whi = F(ge_int, ge_f, ge_c, wh_int, wh_f, wh_c, ar_l1_2_1, ar_l1_2_2)

NOTE: At SUR Iteration 9 CONVERGE=0.001 Criteria Met.




The MODEL Procedure



gei 5 15 9374.5 625.0 0.7910 0.7352whi 5 15 1429.2 95.2807 0.7940 0.7391



ge_int -42.2858 30.5284 -1.39 0.1863 GE Interceptge_f 0.049894 0.0153 3.27 0.0051 GE Lagged Share

Value Coefge_c 0.123946 0.0458 2.70 0.0163 GE Lagged Capital

Stock Coefwh_int -4.68931 8.9678 -0.52 0.6087 WH Interceptwh_f 0.068979 0.0182 3.80 0.0018 WH Lagged Share

Value Coefwh_c 0.019308 0.0754 0.26 0.8015 WH Lagged Capital

Stock Coefar_l1_1_1 0.990902 0.3923 2.53 0.0233 AR(ar) gei: LAG1

parameter for geiar_l1_1_2 -1.56252 1.0882 -1.44 0.1716 AR(ar) gei: LAG1

parameter for whiar_l1_2_1 0.244161 0.1783 1.37 0.1910 AR(ar) whi: LAG1

parameter for geiar_l1_2_2 -0.23864 0.4957 -0.48 0.6372 AR(ar) whi: LAG1

parameter for whi

Output 14.3.2. Results for the Restricted Model (Partial Output)

Example of Vector AR(1) Error Process Using Grunfeld’s ModelWith Restricted AR(1) Error Process

The MODEL Procedure

Model Summary


Model Variables gei whi gef gec whf whcParameters ge_int ge_f ge_c wh_int wh_f wh_c gei_l1(0) whi_l1(0)

Equations gei whi



Example of Vector AR(1) Error Process Using Grunfeld’s ModelWith Restricted AR(1) Error Process

The MODEL Procedure



gei 4 16 10558.8 659.9 0.7646 0.7204whi 4 16 1669.8 104.4 0.7594 0.7142





Stock Coefwh_int 3.112671 9.2765 0.34 0.7416 WH Interceptwh_f 0.053932 0.0154 3.50 0.0029 WH Lagged Share


Stock Coefgei_l1 0.482397 0.2149 2.24 0.0393 AR(gei) gei lag1

parameterwhi_l1 0.455711 0.2424 1.88 0.0784 AR(whi) whi lag1

parameter

Example 14.4. MA(1) Estimation

This example estimates parameters for an MA(1) error process for the Grunfeldmodel, using both the unconditional least-squares and the maximum-likelihood meth-ods. The ARIMA procedure estimates for Westinghouse equation are shown for com-parison. The output of the following code is summarized in Output 14.4.1:

title1 ’Example of MA(1) Error Process Using Grunfeld’’s Model’;title2 ’MA(1) Error Process Using Unconditional Least Squares’;proc model data=grunfeld model=grunmod;

%ma(gei,1, m=uls);%ma(whi,1, m=uls);fit whi gei start=( gei_m1 0.8 -0.8) / startiter=2;

run;



Output 14.4.1. PROC MODEL Results Using ULS Estimation

Example of MA(1) Error Process Using Grunfeld’s ModelMA(1) Error Process Using Unconditional Least Squares

The MODEL Procedure



whi 4 16 1874.0 117.1 0.7299 0.6793resid.whi 16 1295.6 80.9754gei 4 16 13835.0 864.7 0.6915 0.6337resid.gei 16 7646.2 477.9







Stock Coefgei_m1 -0.87615 0.1614 -5.43 <.0001 MA(gei) gei lag1

parameterwhi_m1 -0.75001 0.2368 -3.17 0.0060 MA(whi) whi lag1

parameter

The estimation summary from the following PROC ARIMA statements is shown inOutput 14.4.2.

title2 ’PROC ARIMA Using Unconditional Least Squares’;

proc arima data=grunfeld;identify var=whi cross=(whf whc ) noprint;estimate q=1 input=(whf whc) method=uls maxiter=40;

run;



Output 14.4.2. PROC ARIMA Results Using ULS Estimation

Example of MA(1) Error Process Using Grunfeld’s ModelPROC ARIMA Using Unconditional Least Squares

The ARIMA Procedure

Unconditional Least Squares Estimation

Approx StdParameter Estimate Error t Value Pr > |t| Lag Variable Shift

MU 3.68608 9.54425 0.39 0.7044 0 whi 0MA1,1 -0.75005 0.23704 -3.16 0.0060 1 whi 0NUM1 0.04914 0.01723 2.85 0.0115 0 whf 0NUM2 0.06731 0.07077 0.95 0.3557 0 whc 0

Constant Estimate 3.686077Variance Estimate 80.97535Std Error Estimate 8.998631AIC 149.0044SBC 152.9873Number of Residuals 20

The model stored in Example 14.3 is read in using the MODEL= option and themoving average terms are added using the %MA macro.

The MA(1) model using maximum likelihood is estimated using the following:

title2 ’MA(1) Error Process Using Maximum Likelihood ’;proc model data=grunfeld model=grunmod;

%ma(gei,1, m=ml);%ma(whi,1, m=ml);fit whi gei;

run;

For comparison, the model is estimated using PROC ARIMA as follows:

title2 ’PROC ARIMA Using Maximum Likelihood ’;proc arima data=grunfeld;

identify var=whi cross=(whf whc) noprint;estimate q=1 input=(whf whc) method=ml;

run;

PROC ARIMA does not estimate systems so only one equation is evaluated.

The estimation results are shown in Output 14.4.3 and Output 14.4.4. The smalldifferences in the parameter values between PROC MODEL and PROC ARIMA canbe eliminated by tightening the convergence criteria for both procedures.



Output 14.4.3. PROC MODEL Results Using ML Estimation

Example of MA(1) Error Process Using Grunfeld’s ModelMA(1) Error Process Using Maximum Likelihood

The MODEL Procedure



whi 4 16 1857.5 116.1 0.7323 0.6821resid.whi 16 1344.0 84.0012gei 4 16 13742.5 858.9 0.6936 0.6361resid.gei 16 8095.3 506.0







Stock Coefgei_m1 -0.78516 0.1942 -4.04 0.0009 MA(gei) gei lag1

parameterwhi_m1 -0.69389 0.2540 -2.73 0.0148 MA(whi) whi lag1

parameter

Output 14.4.4. PROC ARIMA Results Using ML Estimation

Example of MA(1) Error Process Using Grunfeld’s ModelPROC ARIMA Using Maximum Likelihood

The ARIMA Procedure

Maximum Likelihood Estimation

Approx StdParameter Estimate Error t Value Pr > |t| Lag Variable Shift

MU 2.95645 9.20752 0.32 0.7481 0 whi 0MA1,1 -0.69305 0.25307 -2.74 0.0062 1 whi 0NUM1 0.05036 0.01686 2.99 0.0028 0 whf 0NUM2 0.06672 0.06939 0.96 0.3363 0 whc 0

Constant Estimate 2.956449Variance Estimate 81.29645Std Error Estimate 9.016455AIC 148.9113SBC 152.8942Number of Residuals 20



Example 14.5. Polynomial Distributed Lags Using %PDL

This example shows the use of the %PDL macro for polynomial distributed lag mod-els. Simulated data is generated so that Y is a linear function of six lags of X, withthe lag coefficients following a quadratic polynomial. The model is estimated using afourth-degree polynomial, both with and without endpoint constraints. The exampleuses simulated data generated from the following model:

yt = 10 +

6Xz=0

f(z)xt�z + �

f(z) = �5z2 + 1:5z

The LIST option prints the model statements added by the %PDL macro.

/*--------------------------------------------------------------*//* Generate Simulated Data for a Linear Model with a PDL on X *//* y = 10 + x(6,2) + e *//* pdl(x) = -5.*(lg)**2 + 1.5*(lg) + 0. *//*--------------------------------------------------------------*/data pdl;

pdl2=-5.; pdl1=1.5; pdl0=0;array zz(i) z0-z6;do i=1 to 7;

z=i-1;zz=pdl2*z**2 + pdl1*z + pdl0;end;

do n=-11 to 30;x =10*ranuni(1234567)-5;pdl=z0*x + z1*xl1 + z2*xl2 + z3*xl3 + z4*xl4 + z5*xl5 + z6*xl6;e =10*rannor(123);y =10+pdl+e;if n>=1 then output;xl6=xl5; xl5=xl4; xl4=xl3; xl3=xl2; xl2=xl1; xl1=x;end;

run;

title1 ’Polynomial Distributed Lag Example’;

title3 ’Estimation of PDL(6,4) Model-- No Endpoint Restrictions’;proc model data=pdl;

parms int; /* declare the intercept parameter */%pdl( xpdl, 6, 4 ) /* declare the lag distribution */y = int + %pdl( xpdl, x ); /* define the model equation */fit y / list; /* estimate the parameters */

run;



Output 14.5.1. PROC MODEL Listing of Generated Program

Polynomial Distributed Lag Example

Estimation of PDL(6,4) Model-- No Endpoint Restrictions

The MODEL Procedure


1 25242:14 XPDL_L0 = XPDL_0;2 25254:14 XPDL_L1 = XPDL_0 + XPDL_1 +

XPDL_2 + XPDL_3 + XPDL_4;3 25283:14 XPDL_L2 = XPDL_0 + XPDL_1 *

2 + XPDL_2 * 2 ** 2 + XPDL_3* 2 ** 3 + XPDL_4 * 2 ** 4;

4 25331:14 XPDL_L3 = XPDL_0 + XPDL_1 *3 + XPDL_2 * 3 ** 2 + XPDL_3* 3 ** 3 + XPDL_4 * 3 ** 4;




8 25121:204 PRED.y = int + XPDL_L0 * x + XPDL_L1 *LAG1( x ) + XPDL_L2 * LAG2( x ) +XPDL_L3 * LAG3( x ) + XPDL_L4* LAG4( x ) + XPDL_L5 * LAG5(x ) + XPDL_L6 * LAG6( x );

8 25121:204 RESID.y = PRED.y - ACTUAL.y;8 25121:204 ERROR.y = PRED.y - y;9 25218:15 ESTIMATE XPDL_L0, XPDL_L1, XPDL_L2,

XPDL_L3, XPDL_L4, XPDL_L5, XPDL_L6;10 25218:15 _est0 = XPDL_L0;11 25221:15 _est1 = XPDL_L1;12 25224:15 _est2 = XPDL_L2;13 25227:15 _est3 = XPDL_L3;14 25230:15 _est4 = XPDL_L4;15 25233:15 _est5 = XPDL_L5;16 25238:14 _est6 = XPDL_L6;



Output 14.5.2. PROC MODEL Results Specifying No Endpoint Restrictions


Estimation of PDL(6,4) Model-- No Endpoint Restrictions

The MODEL Procedure



y 6 18 2070.8 115.0 10.7259 0.9998 0.9998



int 9.621969 2.3238 4.14 0.0006XPDL_0 0.084374 0.7587 0.11 0.9127 PDL(XPDL,6,4)

parameter for (L)**0XPDL_1 0.749956 2.0936 0.36 0.7244 PDL(XPDL,6,4)

parameter for (L)**1XPDL_2 -4.196 1.6215 -2.59 0.0186 PDL(XPDL,6,4)

parameter for (L)**2XPDL_3 -0.21489 0.4253 -0.51 0.6195 PDL(XPDL,6,4)

parameter for (L)**3XPDL_4 0.016133 0.0353 0.46 0.6528 PDL(XPDL,6,4)

parameter for (L)**4

The LIST output for the model without endpoint restrictions is shown in Out-put 14.5.1 and Output 14.5.2. The first seven statements in the generated programare the polynomial expressions for lag parameters XPDL–L0 through XPDL–L6.The estimated parameters are INT, XPDL–0, XPDL–1, XPDL–2, XPDL–3, andXPDL–4.

Portions of the output produced by the following PDL model with endpoints of themodel restricted to 0 are presented in Output 14.5.3 and Output 14.5.4.

title3 ’Estimation of PDL(6,4) Model-- Both Endpoint Restrictions’;proc model data=pdl ;

parms int; /* declare the intercept parameter */%pdl( xpdl, 6, 4, r=both ) /* declare the lag distribution */y = int + %pdl( xpdl, x ); /* define the model equation */fit y /list; /* estimate the parameters */

run;



Output 14.5.3. PROC MODEL Results Specifying Both Endpoint Restrictions


Estimation of PDL(6,4) Model-- Both Endpoint Restrictions

The MODEL Procedure



y 4 20 449868 22493.4 150.0 0.9596 0.9535



int 17.08581 32.4032 0.53 0.6038XPDL_2 13.88433 5.4361 2.55 0.0189 PDL(XPDL,6,4)

parameter for (L)**2XPDL_3 -9.3535 1.7602 -5.31 <.0001 PDL(XPDL,6,4)

parameter for (L)**3XPDL_4 1.032421 0.1471 7.02 <.0001 PDL(XPDL,6,4)


Note that XPDL–0 and XPDL–1 are not shown in the estimate summary. They wereused to satisfy the endpoint restrictions analytically by the generated %PDL macrocode. Their values can be determined by back substitution.

To estimate the PDL model with one or more of the polynomial terms dropped, spec-ify the largest degree of the polynomial desired with the %PDL macro and use theDROP= option on the FIT statement to remove the unwanted terms. The droppedparameters should be set to 0. The following PROC MODEL code demonstratesestimation with a PDL of degree 2 without the 0th order term.

title3 ’Estimation of PDL(6,2) Model-- With XPDL_0 Dropped’;proc model data=pdl list;

parms int; /* declare the intercept parameter */%pdl( xpdl, 6, 2 ) /* declare the lag distribution */y = int + %pdl( xpdl, x ); /* define the model equation */xpdl_0 =0;fit y drop=xpdl_0; /* estimate the parameters */

run;

The results from this estimation are shown in Output 14.5.4.



Output 14.5.4. PROC MODEL Results Specifying %PDL( XPDL, 6, 2)


Estimation of PDL(6,2) Model-- With XPDL_0 Dropped

The MODEL Procedure



y 3 21 2114.1 100.7 10.0335 0.9998 0.9998



int 9.536382 2.1685 4.40 0.0003XPDL_1 1.883315 0.3159 5.96 <.0001 PDL(XPDL,6,2)

parameter for (L)**1XPDL_2 -5.08827 0.0656 -77.56 <.0001 PDL(XPDL,6,2)


Example 14.6. General-Form Equations

Data for this example are generated. General-form equations are estimated and fore-cast using PROC MODEL. The system is a basic supply-demand model. Portions ofthe output from the following code is shown in Output 14.6.1 through Output 14.6.4.

title1 "General Form Equations for Supply-Demand Model";

proc model;var price quantity income unitcost;parms d0-d2 s0-s2;eq.demand=d0+d1*price+d2*income-quantity;eq.supply=s0+s1*price+s2*unitcost-quantity;

/* estimate the model parameters */fit supply demand / data=history outest=est n2sls;instruments income unitcost year;

run;

/* produce forecasts for income and unitcost assumptions */solve price quantity / data=assume out=pq;

run;

/* produce goal-seeking solutions forincome and quantity assumptions*/

solve price unitcost / data=goal out=pc;run;

title2 "Parameter Estimates for the System";proc print data=est;run;



title2 "Price Quantity Solution";proc print data=pq;run;

title2 "Price Unitcost Solution";proc print data=pc;run;

Three data sets were used in this example. The first data set, HISTORY, was used toestimate the parameters of the model. The ASSUME data set was used to produce aforecast of PRICE and QUANTITY. Notice that the ASSUME data set does not haveto contain the variables PRICE and QUANTITY.

data history;input year income unitcost price quantity;datalines;

1976 2221.87 3.31220 0.17903 266.7141977 2254.77 3.61647 0.06757 276.0491978 2285.16 2.21601 0.82916 285.8581979 2319.37 3.28257 0.33202 295.0341980 2369.38 2.84494 0.63564 310.7731981 2395.26 2.94154 0.62011 319.1851982 2419.52 2.65301 0.80753 325.9701983 2475.09 2.41686 1.01017 342.4701984 2495.09 3.44096 0.52025 348.3211985 2536.72 2.30601 1.15053 360.750;

data assume;input year income unitcost;datalines;

1986 2571.87 2.312201987 2609.12 2.456331988 2639.77 2.516471989 2667.77 1.656171990 2705.16 1.01601;

The output produced by the first SOLVE statement is shown in Output 14.6.3.

The third data set, GOAL, is used in a forecast of PRICE and UNITCOST as a func-tion of INCOME and QUANTITY.

data goal;input year income quantity;datalines;

1986 2571.87 371.41987 2721.08 416.51988 3327.05 597.31989 3885.85 764.11990 3650.98 694.3;



The output from the final SOLVE statement is shown in Output 14.6.4.

Output 14.6.1. Printed Output from the FIT Statement

General Form Equations for Supply-Demand Model

The MODEL Procedure


supply = F(s0(1), s1(price), s2(unitcost))demand = F(d0(1), d1(price), d2(income))

Instruments 1 income unitcost year

General Form Equations for Supply-Demand Model

The MODEL Procedure

Nonlinear 2SLS Summary of Residual Errors


supply 3 7 3.3240 0.4749 0.6891demand 3 7 1.0829 0.1547 0.3933

Nonlinear 2SLS Parameter Estimates


d0 -395.887 4.1841 -94.62 <.0001d1 0.717328 0.5673 1.26 0.2466d2 0.298061 0.00187 159.65 <.0001s0 -107.62 4.1780 -25.76 <.0001s1 201.5711 1.5977 126.16 <.0001s2 102.2116 1.1217 91.12 <.0001

Output 14.6.2. Listing of OUTEST= Data Set Created in the FIT Statement

General Form Equations for Supply-Demand ModelParameter Estimates for the System

_S _

_ _ T NN T A UA Y T S

O M P U Eb E E S D d d d s s ss _ _ _ _ 0 1 2 0 1 2

1 2SLS 0 Converged 10 -395.887 0.71733 0.29806 -107.620 201.571 102.212



Output 14.6.3. Listing of OUT= Data Set Created in the First SOLVE Statement

General Form Equations for Supply-Demand ModelPrice Quantity Solution

Obs _TYPE_ _MODE_ _ERRORS_ price quantity income unitcost year

1 PREDICT SIMULATE 0 1.20473 371.552 2571.87 2.31220 19862 PREDICT SIMULATE 0 1.18666 382.642 2609.12 2.45633 19873 PREDICT SIMULATE 0 1.20154 391.788 2639.77 2.51647 19884 PREDICT SIMULATE 0 1.68089 400.478 2667.77 1.65617 19895 PREDICT SIMULATE 0 2.06214 411.896 2705.16 1.01601 1990

Output 14.6.4. Listing of OUT= Data Set Created in the Second SOLVE Statement

General Form Equations for Supply-Demand ModelPrice Unitcost Solution

Obs _TYPE_ _MODE_ _ERRORS_ price quantity income unitcost year

1 PREDICT SIMULATE 0 0.99284 371.4 2571.87 2.72857 19862 PREDICT SIMULATE 0 1.86594 416.5 2721.08 1.44798 19873 PREDICT SIMULATE 0 2.12230 597.3 3327.05 2.71130 19884 PREDICT SIMULATE 0 2.46166 764.1 3885.85 3.67395 19895 PREDICT SIMULATE 0 2.74831 694.3 3650.98 2.42576 1990

Example 14.7. Spring and Damper Continuous System

This model simulates the mechanical behavior of a spring and damper system shownin Figure 14.80.

spring constant K damper constant C

initial displacement 10

initial velocity 0

mass 9.2

Figure 14.80. Spring and Damper System Model

A mass is hung from a spring with spring constant K. The motion is slowed by adamper with damper constant C. The damping force is proportional to the velocity,while the spring force is proportional to the displacement.

This is actually a continuous system; however, the behavior can be approximated bya discrete time model. We approximate the differential equation

@ disp

@ time= velocity



with the difference equation

� disp

� time= velocity

This is rewritten

disp� LAG(disp)

dt= velocity

wheredt is the time step used. In PROC MODEL, this is expressed with the programstatement

disp = lag(disp) + vel * dt;

This statement is simply a computing formula for Euler’s approximation for the inte-gral

disp =

Zvelocity dt

If the time step is small enough with respect to the changes in the system, the approx-imation is good. Although PROC MODEL does not have the variable step-size anderror-monitoring features of simulators designed for continuous systems, the proce-dure is a good tool to use for less challenging continuous models.

This model is unusual because there are no exogenous variables, and endogenousdata are not needed. Although you still need a SAS data set to count the simulationperiods, no actual data are brought in.

Since the variables DISP and VEL are lagged, initial values specified in the VARstatement determine the starting state of the system. The mass, time step, springconstant, and damper constant are declared and initialized by a CONTROL statement.

title1 ’Simulation of Spring-Mass-Damper System’;

/*- Generate some obs. to drive the simulation time periods ---*/data one;

do n=1 to 100;output;

end;run;

proc model data=one;var force -200 disp 10 vel 0 accel -20 time 0;control mass 9.2 c 1.5 dt .1 k 20;force = -k * disp -c * vel;disp = lag(disp) + vel * dt;vel = lag(vel) + accel * dt;accel = force / mass;time = lag(time) + dt;



The displacement scale is zeroed at the point where the force of gravity is offset,so the acceleration of the gravity constant is omitted from the force equation. Thecontrol variable C and K represent the damper and the spring constants respectively.

The model is simulated three times, and the simulation results are written to outputdata sets. The first run uses the original initial conditions specified in the VAR state-ment. In the second run, the time step is reduced by half. Notice that the path of thedisplacement is close to the old path, indicating that the original time step is shortenough to yield an accurate solution. In the third run, the initial displacement is dou-bled; the results show that the period of the motion is unaffected by the amplitude.These simulations are performed by the following statements:

/*- Simulate the model for the base case -------------------*/control run ’1’;solve / out=a;

run;

/*- Simulate the model with half the time step -------------*/control run ’2’ dt .05;solve / out=b;

run;

/*- Simulate the model with twice the initial displacement -*/control run ’3’;var disp 20;solve / out=c;

run;

The output SAS data sets containing the solution results are merged and the displace-ment time paths for the three simulations are plotted. The three runs are identifiedon the plot as 1, 2, and 3. The following code produces Output 14.7.1 through Out-put 14.7.2.

/*- Plot the results ---------------------------------------*/data p;

set a b c;run;

title2 ’Overlay Plot of All Three Simulations’;proc gplot data=p;

plot disp*time=run;run;



Output 14.7.1. Printed Output Produced by PROC MODEL SOLVE Statements

Simulation of Spring-Mass-Damper System

The MODEL Procedure

Model Summary

Model Variables 5Control Variables 5Equations 5Number of Statements 5Program Lag Length 1

Model Variables force(-200) disp(10) vel(0) accel(-20) time(0)Control Variables mass(9.2) c(1.5) dt(0.1) k(20) run(1)

Equations force disp vel accel time


The MODEL ProcedureDynamic Simultaneous Simulation

Data Set Options

DATA= ONEOUT= A

Solution Summary

Variables Solved 5Simulation Lag Length 1Solution Method NEWTONCONVERGE= 1E-8Maximum CC 8.68E-15Maximum Iterations 1Total Iterations 99Average Iterations 1



Variables Solved For force disp vel accel time





Data Set Options

DATA= ONEOUT= B

Solution Summary









Data Set Options

DATA= ONEOUT= C

Solution Summary





Output 14.7.2. Overlay Plot of all Three Simulations



Example 14.8. Nonlinear FIML Estimation

The data and model for this example were obtained from Bard (1974, p.133-138).The example is a two-equation econometric model used by Bodkin and Klein to fitU.S production data for the years 1909-1949. The model is the following:

g1 = c110c2z4(c5z

�c41 + (1� c5)z

�c42 )�c3=c4 � z3 = 0

g2 = [c5=(1 � c5)](z1=z2)(�1�c4) � z5 = 0

wherez1 is capital input,z2 is labor input,z3 is real output,z4 is time in years with1929 as year zero, andz5 is the ratio of price of capital services to wage scale. Theci’s are the unknown parameters.z1 andz2 are considered endogenous variables. AFIML estimation is performed.

data bodkin;input z1 z2 z3 z4 z5;

datalines;1.33135 0.64629 0.4026 -20 0.244471.39235 0.66302 0.4084 -19 0.234541.41640 0.65272 0.4223 -18 0.232061.48773 0.67318 0.4389 -17 0.222911.51015 0.67720 0.4605 -16 0.224871.43385 0.65175 0.4445 -15 0.218791.48188 0.65570 0.4387 -14 0.232031.67115 0.71417 0.4999 -13 0.238281.71327 0.77524 0.5264 -12 0.265711.76412 0.79465 0.5793 -11 0.234101.76869 0.71607 0.5492 -10 0.221811.80776 0.70068 0.5052 -9 0.181571.54947 0.60764 0.4679 -8 0.229311.66933 0.67041 0.5283 -7 0.205951.93377 0.74091 0.5994 -6 0.194721.95460 0.71336 0.5964 -5 0.179812.11198 0.75159 0.6554 -4 0.180102.26266 0.78838 0.6851 -3 0.169332.33228 0.79600 0.6933 -2 0.162792.43980 0.80788 0.7061 -1 0.169062.58714 0.84547 0.7567 0 0.162392.54865 0.77232 0.6796 1 0.161032.26042 0.67880 0.6136 2 0.144561.91974 0.58529 0.5145 3 0.200791.80000 0.58065 0.5046 4 0.183071.86020 0.62007 0.5711 5 0.183521.88201 0.65575 0.6184 6 0.188471.97018 0.72433 0.7113 7 0.204152.08232 0.76838 0.7461 8 0.188471.94062 0.69806 0.6981 9 0.178001.98646 0.74679 0.7722 10 0.199792.07987 0.79083 0.8557 11 0.211152.28232 0.88462 0.9925 12 0.23453



2.52779 0.95750 1.0877 13 0.209372.62747 1.00285 1.1834 14 0.198432.61235 0.99329 1.2565 15 0.188982.52320 0.94857 1.2293 16 0.172032.44632 0.97853 1.1889 17 0.181402.56478 1.02591 1.2249 18 0.194312.64588 1.03760 1.2669 19 0.194922.69105 0.99669 1.2708 20 0.17912;

proc model data=bodkin;parms c1-c5;endogenous z1 z2;exogenous z3 z4 z5;

eq.g1 = c1 * 10 **(c2 * z4) * (c5*z1**(-c4)+(1-c5)*z2**(-c4))**(-c3/c4) - z3;

eq.g2 = (c5/(1-c5))*(z1/z2)**(-1-c4) -z5;

fit g1 g2 / fiml ;run;

When FIML estimation is selected, the log likelihood of the system is output as theobjective value. The results of the estimation are show in Output 14.8.1.

Output 14.8.1. FIML Estimation Results for U.S. Production Data

The MODEL Procedure

Nonlinear FIML Summary of Residual Errors


g1 4 37 0.0529 0.00143 0.0378g2 1 40 0.0173 0.000431 0.0208

Nonlinear FIML Parameter Estimates


c1 0.58395 0.0218 26.76 <.0001c2 0.005877 0.000673 8.74 <.0001c3 1.3636 0.1148 11.87 <.0001c4 0.473688 0.2699 1.75 0.0873c5 0.446748 0.0596 7.49 <.0001


Used 41 Log Likelihood 110.7773Missing 0



Example 14.9. Circuit Estimation

Consider the nonlinear circuit shown in Figure 14.81.

21

Figure 14.81. Nonlinear Resistor Capacitor Circuit

The theory of electric circuits is governed by Kirchhoff’s laws: the sum of the cur-rents flowing to a node is zero, and the net voltage drop around a closed loop is zero.In addition to Kirchhoff’s laws, there are relationships between the current I througheach element and the voltage drop V across the elements. For the circuit in Figure14.81, the relationships are

CdV

dt= I

for the capacitor and

V = (R1 + R2(1� exp(�V )))I

for the nonlinear resistor. The following differential equation describes the current atnode 2 as a function of time and voltage for this circuit:

label dvdt

CdV2

dt� V1 �V2

R1 + R2(1� exp(�V ))= 0

This equation can be written in the form

dV2

dt=

V1 �V2

(R1 + R2(1� exp(�V )))C

Consider the following data.

data circ;input v2 v1 time@@;datalines;

-0.00007 0.0 0.0000000001 0.00912 0.5 0.00000000020.03091 1.0 0.0000000003 0.06419 1.5 0.00000000040.11019 2.0 0.0000000005 0.16398 2.5 0.00000000060.23048 3.0 0.0000000007 0.30529 3.5 0.00000000080.39394 4.0 0.0000000009 0.49121 4.5 0.00000000100.59476 5.0 0.0000000011 0.70285 5.0 0.00000000120.81315 5.0 0.0000000013 0.90929 5.0 0.0000000014



1.01412 5.0 0.0000000015 1.11386 5.0 0.00000000161.21106 5.0 0.0000000017 1.30237 5.0 0.00000000181.40461 5.0 0.0000000019 1.48624 5.0 0.00000000201.57894 5.0 0.0000000021 1.66471 5.0 0.0000000022

;

You can estimate the parameters in the previous equation by using the following SASstatements:

proc model data=circ mintimestep=1.0e-23;parm R2 2000 R1 4000 C 5.0e-13;dert.v2 = (v1-v2)/((r1 + r2*(1-exp( -(v1-v2)))) * C);fit v2;

run;

The results of the estimation are shown in Output 14.9.1.

Output 14.9.1. Circuit Estimation

The MODEL Procedure



R2 3002.465 1556.5 1.93 0.0688R1 4984.848 1504.9 3.31 0.0037C 5E-13 1.01E-22 4.941E9 <.0001

Example 14.10. Systems of Differential Equations

The following is a simplified reaction scheme for the competitive inhibitors withrecombinant human renin (Morelock et al. 1995).

I+E + D

k2r

EI

ED

k2f

k1f

k1r

Figure 14.82. Competitive Inhibition of Recombinant Human Renin

In Figure 14.82,E= enzyme,D= probe, andI= inhibitor.



The differential equations describing this reaction scheme are

dD

dt= k1r�ED � k1f�E�D

dED

dt= k1f�E�D � k1r�ED

dE

dt= k1r�ED � k1f�E�D + k2r�EI � k2f�E�I

dEI

dt= k2f�E�I � k2r�EI

dI

dt= k2r�EI � k2f�E�I

For this system, the initial values for the concentrations are derived from equilibriumconsiderations (as a function of parameters) or are provided as known values.

The experiment used to collect the data was carried out in two ways; pre-incubation(type=’disassoc’) and no pre-incubation (type=’assoc’). The data also contain re-peated measurements. The data contain values for fluorescence F, which is a functionof concentration. Since there are no direct data for the concentrations, all the differ-ential equations are simulated dynamically.

The SAS statements used to fit this model are

proc model data=fit;

parameters qf = 2.1e8qb = 4.0e9k2f = 1.8e5k2r = 2.1e-3l = 0;

k1f = 6.85e6;k1r = 3.43e-4;

/* Initial values for concentrations */control dt 5.0e-7

et 5.0e-8it 8.05e-6;

/* Association initial values --------------*/if type = ’assoc’ and time=0 then

do;ed = 0;

/* solve quadratic equation ----------*/a = 1;b = -(&it+&et+(k2r/k2f));c = &it*&et;ei = (-b-(((b**2)-(4*a*c))**.5))/(2*a);



d = &dt-ed;i = &it-ei;e = &et-ed-ei;

end;

/* Disassociation initial values ----------*/if type = ’disassoc’ and time=0 then

do;ei = 0;a = 1;b = -(&dt+&et+(&k1r/&k1f));c = &dt*&et;ed = (-b-(((b**2)-(4*a*c))**.5))/(2*a);d = &dt-ed;i = &it-ei;e = &et-ed-ei;

end;

if time ne 0 thendo;

dert.d = k1r* ed - k1f *e *d;

dert.ed = k1f* e *d - k1r*ed;

dert.e = k1r* ed - k1f* e * d + k2r * ei - k2f * e *i;

dert.ei = k2f* e *i - k2r * ei;

dert.i = k2r * ei - k2f* e *i;

end;

/* L - offset between curves */if type = ’disassoc’ then

F = (qf*(d-ed)) + (qb*ed) -L;else

F = (qf*(d-ed)) + (qb*ed);

Fit F / method=marquardt;run;

This estimation requires the repeated simulation of a system of 42 differential equa-tions (5 base differential equations and 36 differential equations to compute the par-tials with respect to the parameters).

The results of the estimation are shown in Output 14.10.1.



Output 14.10.1. Kinetics Estimation

The MODEL Procedure



f 5 797 2525.0 3.1681 1.7799 0.9980 0.9980



qf 2.0413E8 681443 299.55 <.0001qb 4.2263E9 9133179 462.74 <.0001k2f 6451229 867011 7.44 <.0001k2r 0.007808 0.00103 7.55 <.0001l -5.76981 0.4138 -13.94 <.0001

Example 14.11. Monte Carlo Simulation

This example illustrates how the form of the error in a ODE model affects the resultsfrom a static and dynamic estimation. The differential equation studied is

dy

dt= a� ay

The analytical solution to this differential equation is

y = 1� exp(�at)

The first data set contains errors that are strictly additive and independent. The datafor this estimation are generated by the following DATA step:

data drive1;a = 0.5;do iter=1 to 100;

do time = 0 to 50;y = 1 - exp(-a*time) + 0.1 *rannor(123);output;

end;end;

The second data set contains errors that are cumulative in form.

data drive2;a = 0.5;yp = 1.0 + 0.01 *rannor(123);do iter=1 to 100;



do time = 0 to 50;y = 1 - exp(-a)*(1 - yp);yp = y + 0.01 *rannor(123);output;

end;end;

The following statements perform the 100 static estimations for each data set:

proc model data=drive1 noprint;parm a 0.5;dert.y = a - a * y;fit y / outest=est;by iter;

run;

Similar code is used to produce 100 dynamic estimations with a fixed and an unknowninitial value. The first value in the data set is used to simulate an error in the initialvalue. The following PROC UNIVARIATE code processes the estimations:

proc univariate data=est noprint;var a;output out=monte mean=mean p5=p5 p95=p95;

run;

proc print data=monte; run;

The results of these estimations are summarized in Table 14.4.

Table 14.4. Monte Carlo Summary, A=0.5

Estimation Additive Error Cumulative ErrorType mean p95 p5 mean p95 p5

static 0.77885 1.03524 0.54733 0.57863 1.16112 0.31334dynamic fixed 0.48785 0.63273 0.37644 3.8546E24 8.88E10 -51.9249dynamic unknown 0.48518 0.62452 0.36754 641704.51 1940.42 -25.6054

For this example model, it is evident that the static estimation is the least sensitive tomisspecification.



References

Aiken, R.C., ed. (1985),Stiff Computation,New York: Oxford University Press.

Amemiya, T. (1974), "The Nonlinear Two-stage Least-squares Estimator,"Journal ofEconometrics, 2, 105–110.

Amemiya, T. (1977), "The Maximum Likelihood Estimator and the Nonlinear Three-Stage Least Squares Estimator in the General Nonlinear Simultaneous EquationModel," Econometrica, 45(4), 955–968.

Amemiya, T. (1985),Advanced Econometrics, Cambridge, MA: Harvard UniversityPress.

Andrews, D.W.K. (1991), "Heteroscedasticity and Autocorrelation Consistent Co-variance Matrix Estimation,"Econometrica, 59(3), 817–858.

Andrews, D.W.K., and Monahan, J.C. (1992), "An Improved Heteroscedasticity andAutocorrelation Consistent Covariance Matrix Estimator,"Econometrica, 60(4),953–966.

Bard, Yonathan (1974),Nonlinear Parameter Estimation, New York: AcademicPress, Inc.

Bates, D.M. and Watts, D.G. (1981), "A Relative Offset Orthogonality ConvergenceCriterion for Nonlinear Least Squares,"Technometrics, 23, 2, 179–183.

Belsley, D.A., Kuh, E., and Welsch, R.E. (1980),Regression Diagnostics, New York:John Wiley & Sons, Inc.

Binkley, J.K. and Nelson, G. (1984),"Impact of Alternative Degrees of Freedom Cor-rections in Two and Three Stage Least Squares,"Journal of Econometrics, 24,3,223–233.

Bowden, R.J. and Turkington, D.A. (1984),Instrumental Variables, Cambridge Uni-versity Press.

Bratley, P., B.L. Fox, and H. Niederreiter (1992), "Implementation and tests of Low-Discrepancy Sequences",ACM Transactions on Modeling and Computer Simu-lation, 2(3), pg 195-213.

Breusch, T.S. and Pagan, A.R., (1979), "A Simple Test for Heteroscedasticity andRandom Coefficient Variation,"Econometrica, 47(5), 1287–1294.

Breusch, T.S. and Pagan, A.R. (1980), "The Lagrange Multiplier Test and its Appli-cations to Model Specification in Econometrics,"Review of Econometric Studies,47, 239–253.

Byrne, G.D. and Hindmarsh, A.C. (1975), "A Polyalgorithm for the Numerical Solu-tion of ODEs," inACM TOMS,1(1), 71–96.

Calzolari, G. and Panattoni, L. (1988),"Alternative Estimators of FIML CovarianceMatrix: A Monte Carlo Study,"Econometrica, 56(3), 701–714.

Chan, K.C. and Karolyi, G. Andrew and Longstaff, Francis A. and Sanders, AnthonyB. (1992),“An Empirical Comparison of Alternate Models of the Short-Term In-terest Rate”,The Journal of Finance, 47(3), pg 1209–1227.


Chapter 14. References

Christensen, L.R., Jorgenson, D.W. and L.J. Lau (1975), "Transcendental Logarith-mic Utility Functions,"American Economic Review, 65, 367–383.

Dagenais, M. G. (1978), "The Computation of FIML Estimates as Iterative Gener-alized Least Squares Estimates in Linear and Nonlinear Simultaneous EquationModels,"Econometrica, 46, 6, 1351–1362.

Davidian, M and D.M Giltinan (1995),Nonlinear Models for Repeated MeasurementData,London: Chapman & Hall.

Davidson, R and MacKinnon, J.G. (1993),Estimation and Inference in Econometrics,New York: Oxford University Press.

Fair, R.C. (1984),Specification, Estimation, and Analysis of Macroeconometric Mod-els, Cambridge: Harvard University Press.

Ferson, Wayne E. and Foerster, Stephen R. (1993), "Finite Sample Properties of theGeneralized Method of Moments in Tests of Conditional Asset Pricing Models,"Working Paper No. 77, University of Washington.

Fox, B.L. (1986), “Algorithm 647: Implementation and Relative Ef ficiency ofQuasirandom Sequence Generators”,ACM Transactions on Mathematical soft-ware,12(4), pg 362-276.

Gallant, A.R. (1977), "Three-Stage Least Squares Estimation for a System of Simul-taneous, Nonlinear, Implicit Equations,"Journal of Econometrics, 5, 71–88.

Gallant, A.R. and Holly, Alberto (1980),"Statistical Inference in an Implicit, Non-linear, Simultaneous Equation Model in the Context of Maximum LikelihoodEstimation,"Econometrica, 48(3), 697–720.

Gallant, A.R. (1987),Nonlinear Statistical Models, New York: John Wiley and Sons,Inc.

Gallant, A.R. and Jorgenson, D.W. (1979), "Statistical Inference for a System ofSimultaneous, Nonlinear, Implicit Equations in the Context of Instrumental Vari-ables Estimation,"Journal of Econometrics, 11, 275–302.

Gill, Philip E., Murray, Walter, and Wright, Margaret H. (1981), "Practical Optimiza-tion," New York: Academic Press Inc.

Godfrey, L.G. (1978a), "Testing against General Autoregressive and Moving Aver-age Error Models When the Regressors Include Lagged Dependent Variables,"Econometrica, 46, 1293–1301.

Godfrey, L.G. (1978b), "Testing for Higher Order Serial Correlation in RegressionEquations When the Regressors Include Lagged Dependent Variables,"Econo-metrica, 46, 1303–1310.

Goodnight, J.H. (1979), "A Tutorial on the SWEEP Operator,"The American Statis-tician, 33, 149–158.

Greene, William H. (1993), "Econometric Analysis," New Your: Macmillian Pub-lishing Company Inc.

Gregory, A.W. and Veall, M.R. (1985), "On Formulating Wald Tests for NonlinearRestrictions,"Econometrica, 53, 1465–1468.



Grunfeld, Y. and Griliches, "Is Aggregation Necessarily Bad ?"Review of Economicsand Statistics, February 1960, 113–134.

Hansen, L.P. (1982), "Large Sample Properties of Generalized Method of MomentsEstimators,"Econometrica, 50(4), 1029–1054.

Hansen, L.P. (1985),"A Method for Calculating Bounds on the Asymptotic Co-variance Matrices of Generalized Method Of Moments Estimators,"Journal ofEconometrics, 30, 203–238.

Hatanaka, M. (1978),"On the Efficient Estimation Methods for the Macro-EconomicModels Nonlinear in Variables,"Journal of Econometrics, 8, 323–356.

Hausman, J. A. (1978), "Specification Tests in Econometrics,"Econometrica,46(6),1251–1271.

Hausman, J.A. and Taylor, W.E. (1982), "A Generalized Specification Test,"Eco-nomics Letters,8, 239–245.

Henze, N. and Zirkler, B. (1990), "A Class of Invariant Consistant tests for Multivari-ate Normality," Commun. Statist. - Theory Meth., 19(10), 3595–3617.

Johnston, J. (1984),Econometric Methods, Third Edition, New York: McGraw-HillBook Co.

Jorgenson, D.W. and Laffont, J. (1974), "Efficient Estimation of Nonlinear Simulta-neous Equations With Additive Disturbances,"Annals of Social and EconomicMeasurement, 3, 615–640.

Joy, C., P.P. Boyle, and K.S. Tan (1996), “Quasi-Monte Carlo Methods in NumericalFinance ”,Management Science, 42(6), pg 926-938.

LaMotte, L.R. (1994), "A Note on the Role of Independence in t Statistics Con-structed From Linear Statistics in Regression Models,"The American Statisti-cian, 48(3), 238–239.

Maddala, G.S. (1977),Econometrics, New York: McGraw-Hill Book Co.

Mardia, K. V. (1980), "Measures of Multivariate Skewness and Kurtosis with Appli-cations,"Biometrika57(3), 519–529.

Mardia, K. V. (1974), "Applications of Some Measures of Multivariate Skewnessand Kurtosis in Testing Normality and Robustness Studies,"The Indian Journalof Statistics36(B) pt. 2, 115–128.

Matis, J.H., Miller, T.H., and Allen, D.M. (1991),Metal Ecotoxicology Concepts andApplications, eds. M.C Newman and Alan W. McIntosh, Chelsea, MI; LewisPublishers Inc.

Mikhail, W.M. (1975), "A Comparative Monte Carlo Study of the Properties of Eco-nomic Estimators,"Journal of the American Statistical Association, 70, 94–104.

Miller, D.M. (1984), "Reducing Transformation Bias in Curve Fitting,"The AmericanStatistician, 38(2), 124–126.


Chapter 14. References

Morelock, M.M., Pargellis, C.A., Graham, E.T., Lamarre, D., and Jung, G. (1995),"Time-Resolved Ligand Exchange Reactions: Kinetic Models for Competive In-hibitors with Recombinant Human Renin,"Journal of Medical Chemistry, 38,1751–1761.

Newey, W.K, and West, D. W. (1987),"A Simple, Positive Semi-Definite, Het-eroscedasticity and Autocorrelation Consistent Covariance Matrix,"Economet-rica, 55, 703–708.

Noble, B. and Daniel, J.W. (1977),Applied Linear Algebra,Englewood Cliffs, NJ:Prentice-Hall.

Ortega, J. M. and Rheinbolt, W.C. (1970), "Iterative Solution of Nonlinear Equationsin Several Variables," Academic Press.

Parzen, E. (1957),"On Consistent Estimates of the Spectrum of a Stationary TimeSeries,"Annals of Mathematical Statistics, 28, 329–348.

Pearlman, J. G. (1980), "An Algorithm for Exact Likelihood of a High-OrderAutoregressive-Moving Average Process,"Biometrika, 67(1), 232–233.

Petzold, L.R. (1982), "Differential/Algebraic Equations Are Not ODEs,"Siam J. Sci.Stat. Comput., 3, 367–384.

Phillips, C.B. and Park, J.Y. (1988), "On Formulating Wald Tests of Nonlinear Re-strictions,"Econometrica, 56, 1065–1083.

Pindyck, R.S. and Rubinfeld, D.L. (1981),Econometric Models and Economic Fore-casts, Second Edition, New York: McGraw-Hill Book Co.

Savin, N.E. and White, K.J. (1978), "Testing for Autocorrelation with Missing Ob-servations,"Econometrics, 46, 59–67.

Sobol’, Ilya M.,A Primer for the Monte Carlo Method, CRC Press, 1994.

Srivastava, Virendra and Giles, David E. A., (1987), "Seemingly Unrelated Regres-sion Equation Models," New York: Marcel Dekker, Inc.

Theil, H. (1971),Principles of Econometrics, New York: John Wiley & Sons, Inc.

Thursby, J., (1982), "Misspecification, Heteroscedasticity, and the Chow andGoldfield-Quandt Test,"Review of Economics and Statistics, 64, 314–321.

Venzon, D.J. and Moolgavkar, S.H. (1988), "A Method for Computing Profile-Likelihood Based Confidence Intervals,"Applied Statistics, 37, 87–94.

White, Halbert, (1980), "A Heteroskedasticity-Consistant Covariance Matrix Estima-tor and a Direct Test for Heteroskedasticity,"Econometrica, 48(4), 817–838.

Wu, D. M. (July 1973), "Alternative Tests of Independence Between Stochastic Re-gressors and Disturbances,"Econometrica ,41(4), 733–750.


The correct bibliographic citation for this manual is as follows: SAS Institute Inc., SAS/ETS User’s Guide, Version 8, Cary, NC: SAS Institute Inc., 1999. 1546 pp.

SAS/ETS User’s Guide, Version 8Copyright © 1999 by SAS Institute Inc., Cary, NC, USA.ISBN 1–58025–489–6All rights reserved. Printed in the United States of America. No part of this publicationmay be reproduced, stored in a retrieval system, or transmitted, in any form or by anymeans, electronic, mechanical, photocopying, or otherwise, without the prior writtenpermission of the publisher, SAS Institute Inc.U.S. Government Restricted Rights Notice. Use, duplication, or disclosure of thesoftware by the government is subject to restrictions as set forth in FAR 52.227–19Commercial Computer Software-Restricted Rights (June 1987).SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.1st printing, October 1999SAS® and all other SAS Institute Inc. product or service names are registered trademarksor trademarks of SAS Institute Inc. in the USA and other countries.® indicates USAregistration.Other brand and product names are registered trademarks or trademarks of theirrespective companies.The Institute is a private company devoted to the support and further development of itssoftware and related services.

SAS NLN

Documents

Transcript of SAS NLN