On Theories Explaining the Success of the Gravity Equation 1

48
On Theories Explaining the Success of the Gravity Equation 1 Simon J. Evenett 2 Rutgers University and The Brookings Institution Wolfgang Keller 3 University of Texas and NBER August 2000 1 First version: November 1996.We have beneted from comments at seminar presentations at Berlin (Humboldt), Boston University, Brown, Dartmouth, Columbia, Geneva (Graduate Institute), Harvard, NYU, Penn, Rutgers, Toronto, UC San Diego, UC Santa Cruz, Wisconsin, and Yale, as well as from participants at the 1998 Midwest International Economics conference in Ann Arbor, the 1997 NBER Summer Institute, the August 1997 EEA meeting at Toulouse, the September 1997 ERWIT/CEPR conference in Helsinki, and the Empirical Trade Conference 1996 at Purdue University. Special thanks go to Don Davis, Peter Debaere, Mahmoud El-Gamal, Gordon Hanson, Elhanan Helpman, Dan Treer, Lars Peter Hansen as the editor, and especially to an anonymous referee. We also thank Jim Rauch and Karl Aiginger for providing data used in this paper. 2 Department of Economics, 75 Hamilton Street, New Brunswick NJ 08901-1248; email: evenett@fas- econ.rutgers.edu. 3 Department of Economics, Austin, TX 78712; email: [email protected].

Transcript of On Theories Explaining the Success of the Gravity Equation 1

Page 1: On Theories Explaining the Success of the Gravity Equation 1

On Theories Explaining the Success of the Gravity Equation1

Simon J. Evenett2

Rutgers University and The Brookings Institution

Wolfgang Keller3

University of Texas and NBER

August 2000

1 First version: November 1996.We have beneÞted from comments at seminar presentations at Berlin(Humboldt), Boston University, Brown, Dartmouth, Columbia, Geneva (Graduate Institute), Harvard, NYU,Penn, Rutgers, Toronto, UC San Diego, UC Santa Cruz, Wisconsin, and Yale, as well as from participantsat the 1998 Midwest International Economics conference in Ann Arbor, the 1997 NBER Summer Institute,the August 1997 EEA meeting at Toulouse, the September 1997 ERWIT/CEPR conference in Helsinki, andthe Empirical Trade Conference 1996 at Purdue University. Special thanks go to Don Davis, Peter Debaere,Mahmoud El-Gamal, Gordon Hanson, Elhanan Helpman, Dan Treßer, Lars Peter Hansen as the editor, andespecially to an anonymous referee. We also thank Jim Rauch and Karl Aiginger for providing data used inthis paper.

2 Department of Economics, 75 Hamilton Street, New Brunswick NJ 08901-1248; email: [email protected].

3 Department of Economics, Austin, TX 78712; email: [email protected].

Page 2: On Theories Explaining the Success of the Gravity Equation 1

Abstract

We examine whether two important theories of international trade, the Heckscher-Ohlin theory and

the Increasing Returns trade theory, can account for the empirical success of the so-called gravity

equation. Since versions of both models can predict this equation, we tackle the model identiÞcation

problem by conditioning bilateral trade relations on factor endowment differences and on the share

of intra-industry trade. Only for large factor endowment differences does the Heckscher-Ohlin model

predict production specialization in different countries as well as the gravity equation, and it generates

inter-, but not intra-industry trade. Our empirical analysis yields two Þndings. First, the perfect

specialization version of the Heckscher-Ohlin model is an unlikely explanation for the empirical

success of the gravity equation. In contrast, there is a substantial amount of support for a Heckscher-

Ohlin model with imperfect specialization, suggesting an important role for factor endowments in

determining trade ßows. Second, increasing returns is an important cause of perfect specialization

in samples with high levels of intra-industry trade. However, a model that includes both increasing

returns and factor endowments as sources of trade shows a mixed performance: it correctly predicts

more differentiated goods production when the level of intra-industry trade is higher, however, the

predicted link to factor proportions is tenuous. These Þndings suggest that factor endowments and

increasing returns explain different components of the international variation of production patterns

and trade volumes.

Keywords: Gravity equation, Heckscher-Ohlin trade theory, increasing returns trade theory,

model identiÞcation.

Page 3: On Theories Explaining the Success of the Gravity Equation 1

The so-called gravity equation of trade predicts that the volume of trade between two countries

is proportional to their gross domestic products and negatively related to trade barriers between

them. Empirical research has found that various versions of the gravity equation can account for

the variation in the volume of trade across country-pairs and over time (see Leamer and Levinsohn

1995).

Since Anderson (1979) it has been increasingly recognized that the gravity equation can be derived

from very different models, including Ricardian-, Heckscher-Ohlin (H-O)-, and increasing returns to

scale (IRS) models.1 All these models have one characteristic in common: perfect specialization,

where each commodity is produced in only one country. This multitude of models has led some to

argue that the gravity prediction cannot be used to test any one of these trade theories (Deardorff

1998). Yet, the gravity equation constitutes, perhaps along with the H-O-Vanek factor service trade

prediction, one of the most important results about trade ßows. Therefore, major insights into

the causes of international trade would be gained if it could be determined which theory actually

generated the gravity equation for a given sample of data. We refer to this as a model identiÞcation

problem.

We address this identiÞcation problem by noting that, in a constant returns (CRS) H-O world,

perfect specialization results only when bilateral factor proportions differences are large enough. In

contrast, when product specialization is the result of IRS, the gravity prediction can be obtained even

if there are no factor proportions differences. Figure 1 illustrates this. Furthermore, these models

predict different types of trade�in the H-O model, trade is exclusively in goods produced with different

factor intensities (inter-industry trade). In contrast, in the IRS model some, and potentially all trade

is intra-industry. Consequently we will take samples with high shares of intra-industry trade in total

trade as those where IRS-based specialization might drive the gravity equation. Furthermore, those

1Bergstrand (1990), Deardorff (1998), Eaton and Kortum (1997), Helpman and Krugman (1985), and Markusenand Wigle (1990); see also the recent survey by Helpman (1998).

1

Page 4: On Theories Explaining the Success of the Gravity Equation 1

samples with low shares of intra-industry trade in total trade and large factor proportions differences

are those where a model of H-O-based specialization might be behind the gravity equation.

In principle this could lead to a sample-by-sample reconciliation of the perfect specialization

models of the H-O and IRS trade theories. However, no such reconciliation emerges from our analysis

of a large and heterogeneous set of bilateral trade relations in the year of 1985. There is no evidence

that perfect specialization due to factor proportions differences explains in large part the success

of the gravity equation. In contrast, the IRS model of perfect specialization performs better in

our analysis. Trade among the industrialized countries (�North-North� trade) can thus be partly

captured by the IRS model of trade in differentiated goods. And trade between less developed

and industrialized countries (�North-South� trade) appears to be different from what the perfect

specialization H-O model of trade in homogeneous goods predicts.

The second contribution of this paper lies in the fact that we evaluate gravity equations where

production is imperfectly specialized. Empirical work has often found that perfect specialization

models, which predict a strict proportionality of trade with gross domestic product (GDP), typically

overpredict the volume of trade by a large amount. In contrast, the imperfect specialization models

that we present predict a factor of proportionality of trade with GDP of less than one. Moreover,

specialization is a function of relative factor abundance, a key exogenous variable of the models.

With data on the latter, we therefore have the means to evaluate these models.

Our analysis also sheds new light on the empirical relevance of the IRS trade models in general.

The inßuential Þnding of Helpman (1987), that key implications of the IRS model are consistent

with trade data for Organization for Economic Co-operation and Development (OECD) countries,

has been called into question. Hummels and Levinsohn (1995) repeated Helpman�s analysis with

a set of non-OECD countries, whose bilateral trade was not expected to contain much IRS-based

trade. They showed that many correlations found by Helpman continue to hold. Rather than

2

Page 5: On Theories Explaining the Success of the Gravity Equation 1

providing broad evidence against IRS trade models and, in particular, that these models cannot be

responsible for the empirical success of the gravity equation (see Hummels and Levinsohn 1995, 828),

their Þndings highlight the importance of the model identiÞcation problem. This paper accounts

for both Helpman�s (1987) and Hummels and Levinsohn�s (1995) Þndings by suggesting that the

former obtained his results because IRS-based trade is important among OECD countries, whereas

the latter found a similar correlation among non-OECD countries because of specialization driven

by factor proportions differences. Our results underscore the importance of both relative factor

abundance and IRS as determinants of the extent of specialization and international trade, with

their relative importance depending on the particular sample in question. This is in line with the

results of Antweiler and Treßer (1997) who estimate IRS and CRS models nested in the H-O Vanek

factor service trade expression. They Þnd that although a majority of industries seems to be well-

characterized by the assumption of CRS, there is evidence for IRS in a number of sectors.

The remainder of the paper is organized as follows. We present the gravity equation predictions

of four different trade models in the next section. Section 2 describes in more detail how we identify

a particular model in our empirical analysis. Section 3 discusses the data set employed in this study,

whereas section 4 presents the empirical results. Section 5 concludes. The sensitivity of our results

is discussed in the appendix.

1 The gravity equation: theories

1.1 The gravity equation with perfect specialization of production

Throughout the paper, we assume balanced trade, zero trade and transport costs, no trade in in-

termediate goods, two countries with identical production technologies, and that consumers in both

countries have identical homothetic preferences. First, we consider a typical IRS model as laid out

3

Page 6: On Theories Explaining the Success of the Gravity Equation 1

in Helpman and Krugman (1985, Ch. 8.1), where there are two countries, c = i, j, and two goods,

g = X,Z. Both X and Z come in many differentiated varieties which are identically produced with

increasing returns to scale. With preferences valuing product variety, a country will demand all

foreign varieties according to the country�s GDP as a share of world GDP. Let the GDPs of countries

i and j be denoted Y i and Y j respectively, and denote the GDP of the world by Y w. Given IRS leads

to perfect specialization of production for every variety, it can be shown that country i�s imports

from j, denoted M ij , is given by

M ij =Y iY j

Y w. (1)

Here imports are strictly proportional to the GDPs, and so this IRS trade model is a potential

candidate to explain the success of the gravity equation.2

The equation (1) is very general, as its derivation does not require assumptions on factor price

equalization, factor endowment differences across countries, factor intensities in the production of

goods X and Z, or even the number of sectors, factors, and countries (Helpman and Krugman 1985).

Equation (1) holds whenever there is perfect specialization in equilibrium, all consumers face the

same goods prices, have identical homothetic preferences, and trade is balanced. Also differences in

factor endowments that are large enough across countries relative to differences in the goods� factor

input requirements can lead to perfect specialization. Thus, a H-O model can also generate the

gravity equation (1). In the two-country, two-good, two-factor (2 × 2 × 2) case this requires that

the countries� relative endowment ratios differ by at least as much as the goods� relative input ratios

that are consistent with diversiÞed production and factor price equalization through trade. In the

empirical analysis below, we will refer to this as the multi-cone H-O model.

2This gravity prediction, as well as those below, is readily generalized to the case where tradable goods make up ashare λ, 0 ≤ λ ≤ 1, of GDP which is common to all countries. In that case, one obtains M ij = λY iY j

Y w .

4

Page 7: On Theories Explaining the Success of the Gravity Equation 1

1.2 Gravity equations with imperfect specialization of production

First, we present the gravity equation for a model where one sector (Z) produces a homogeneous good

under CRS, whereas a second sector (X) produces a differentiated good under IRS (Helpman 1981;

see also Helpman and Krugman 1985, Ch.7,8). This model will be referred to as IRS/uni-cone H-O

model. There are two countries (i and j) and two factors, capital (K) and labor (L). Assume that

the homogeneous good is more labor intensive in production and that country i is capital-abundant.

Let γc be the share of good Z in country c�s GDP, γc = Zc

pxXc+Zc , where px is the relative price

of good X. For endowments of the two countries that are sufficiently similar so that factor price

equalization through trade can be achieved, country i�s imports from country j are given by the

following gravity equation:3

M ij =³1− γi

´ Y iY j

Y w. (2)

For any value γi > 0, the level of bilateral imports is lower than in the case where both goods

are differentiated (compare equations 2 and 1). Furthermore, as the share of homogeneous goods in

GDP declines, the predicted level of imports rises; in the limit, as γi → 0, the generalized gravity

equation (2) reverts back to the simple gravity equation (1) above. Therefore, the volume of trade

is higher, the lower is the share of homogeneous goods in GDP.4

Second, we present the particular form of the gravity equation in the case of a 2× 2× 2 uni-cone

H-O model, where both good Z and good X are homogeneous and produced under CRS in both

countries. If country i is relatively capital-abundant and good Z is relatively labor-intensive, country

3In this equilibrium, country i exports only the capital-intensive X varieties, which are¡1− γi

¢of its GDP, in

proportion to country j�s size in the world, which is equal to Y j

Y w ; together with balanced trade this gives equation (2).4This Þnding is in part due to H-O reasons, because γi is inversely related to country i�s capital-labor ratio. A

decline in γi therefore implies an increase in the volume of imports due to an increase in the countries� factor proportionsdifferences. What we emphasize is that for given factor proportions differences, the more specialization there is thehigher is the level of imports; see below.

5

Page 8: On Theories Explaining the Success of the Gravity Equation 1

i�s imports from j are given by:5

M ij =³γj − γi

´ Y iY j

Y w=h³

1− γi´−³1− γj

´i Y iY j

Y w. (3)

As the capital-labor ratios in the two countries converge, so do γj and γi. In the limit, when the

factor proportions in i and j are equal, we have that γj = γi, in which case equation (3) gives the

familiar result that there is no trade in a H-O model when factor proportions are identical. Equation

(3) includes the volume of imports prediction of the multi-cone H-O model given in (1) as a special

case, because as factor proportions differences between i and j increase, the share of good Z in

country j�s GDP, γj , approaches one, whereas the share of good Z in the GDP of country i, γi,

tends to zero.

Both imperfect specialization models predict that an increase in GDP raises imports less than

proportionately. If we denote the volume of imports prediction for the case where the production of

both goods is perfectly specialized, equation (1), by MS ; the IRS/uni-cone H-O case with special-

ization for one good, but not for the other good (equation 2) with MIH ; and the uni-cone H-O case

in which both countries produce both goods, equation (3), with MH , the following inequalities hold,

ceteris paribus:

MS > MIH > MH . (4)

This conÞrms that the bilateral volume of imports is higher the more product specialization there is.

5Country i�s imports from country j are equal to px

³Xi − Y i

Y wXw´, where Xw is the world�s production of good

X. Using the deÞnition of γ and the fact that Y i

Y w = 1− Y j

Y w leads to (3).

6

Page 9: On Theories Explaining the Success of the Gravity Equation 1

2 Model identiÞcation

In the preceding section, we have discussed the speciÞc form of the gravity equation of trade for

four models. From now on, we consider only 2× 2 × 2 models. It is unlikely, however, that any of

the observed trade ßows are solely determined by any one of these four archetypical models. First,

the data comes from a world with more sectors, countries, and factors than exist in our 2 × 2 × 2

models. Secondly, there may be positive amounts of IRS-based trade even between countries with the

lowest recorded shares of IRS-based trade, and likewise for factor proportions-based trade. Observed

bilateral trade is likely to be the result of a combination of the four models considered here, and

perhaps of others which we have not addressed. However, in different circumstances (such as different

distributions of factor endowments across trading partners) we expect different trade models to

account for different proportions of the observed trade ßows. Our inferences are based on whether

each model actually performs better in the very sample(s) where one expects it to perform better.

Consider a cross-section of country-pairs where there is little (or no) specialization due to IRS, but

the absolute difference between the two countries� factor proportions, denoted FDIF , varies from

one pair to another. Increased specialization is associated with an increase in factor proportions

differences for given relative input requirements in production, so that we expect ceteris paribus

more specialization in a pair where FDIF is larger relative to another country-pair where FDIF is

smaller. Consequently, we expect the prediction MS to be more accurate for the pair where FDIF

is larger. This observation allows to identify the H-O motivation for specialization and the gravity

prediction: If factor proportions differences are important in explaining the success of the gravity

equation, then ceteris paribus the predictionMS should Þt better in samples where factor proportions

differences are higher. Moreover, we can test the multi-cone H-O model against the uni-cone H-O

model for different values of FDIF : If specialization is principally the result of factor proportions

differences, then the multi-cone H-O model should be increasingly preferred to the uni-cone H-O

7

Page 10: On Theories Explaining the Success of the Gravity Equation 1

model in samples where factor proportions differences are higher.

We employ the index proposed by Grubel and Lloyd (1975) to control for the extent of IRS-based

trade. Trade due to IRS and product differentiation results in a country simultaneously importing

and exporting varieties of a particular product (intra-industry trade).6 The index, denoted GLij ,

measures the share of intra-industry trade in total trade

GLij = 1− P

g

¯M ij

g −M jig

¯P

g

³M ij

g +M jig

´ , 0 ≤ GLij ≤ 1. (5)

In the extremes, when every good g is either exported or imported (no intra-industry trade), the

Grubel-Lloyd index will be equal to zero, whereas it is equal to one if a country�s imports and exports

of all goods are equal (only intra-industry trade).

This index is not a perfect indicator for the share of trade based on IRS. Finger (1975) has

emphasized that industry classiÞcations often contain quite different products, and clearly a high

degree of disaggregation is desirable when measuring intra-industry trade. Recently, researchers

have emphasized the importance of trade in intermediate goods in accounting for high values of GLij

(Greenaway, Hine, and Milner 1994, and Fontagne, Freudenberg and Peridy 1997), which can in

principle be due to IRS and other reasons. Moreover, a signiÞcant part of trade which is classiÞed

as inter-industry trade might involve IRS, such as wide-bodied aircraft exports from the U.S. to

most other countries.7 After discussing this issue in some detail, Krugman (1994, 23) concludes that

although GLij does not precisely measure the share of international trade which is due to IRS, there

is no indication that there is an important upward- or downward bias.

6We deÞne intra-industry trade as trade in goods with identical factor input requirements; for our empirical analysis,though, intra-industry trade is taken as two-way trade of goods in the same four-digit Standard International TradeClassiÞcation (SITC) class.

7The only important producer of such aircraft (made by Boeing) outside the U.S. is the European joint ventureAirbus, so that Boeing exports to any country other than the Airbus maker countries are classiÞed as inter-industrytrade. This is an example given in Krugman (1994).

8

Page 11: On Theories Explaining the Success of the Gravity Equation 1

We expect that the IRS model increasingly accounts for the performance of the gravity equation in

those samples where the bilateral Grubel-Lloyd indices are larger, indicating that a larger proportion

of bilateral trade is two-way trade in differentiated products. In section 4.1.1 we examine whether

in fact the prediction MS (of the IRS model) is more accurate in samples with higher Grubel-Lloyd

indices. We emphasize that predictionMS is common to both the multi-cone H-O and the IRS model.

However, if we were to Þnd that the prediction MS is increasingly accurate for the data in samples

with higher Grubel-Lloyd indices, it would be incorrect to interpret this Þnding as evidence in favor

of the multi-cone H-O theory. Furthermore, we can evaluate the performance of the IRS/uni-cone

H-O model versus that of the IRS model. We expect the latter to perform relatively best in the

sample with the highest share of intra-industry trade.

In the following section, we brießy discuss the data which will be used.

3 Data

We have assembled a cross-sectional data set for Þfty-eight countries in the year of 1985. The data set

includes all countries with GDPs above 1 billion U.S. dollars and where internationally comparable

capital-per-worker estimates are available. These Þfty-eight countries accounted for 67% of world

imports and 79% of world GDP. The data set includes nearly all of the industrialized countries

but relatively fewer of the less developed countries, reßecting the paucity of available capital stock

estimates for the latter (see Table 1A).

The source of data on GDP and capital-per-worker is the Summers and Heston (Version 5.6)

database (see Summers and Heston 1991). Both variables are reported in internationally-comparable

and purchasing power-adjusted real U.S. dollars. The total value of bilateral imports data comes from

the NBER World Bilateral Trade Database, see Feenstra et al. (1997). The quality of the data is

likely to be lower for the poorer countries in the sample. This differential degree of measurement error

9

Page 12: On Theories Explaining the Success of the Gravity Equation 1

requires caution when results for highly industrialized countries are compared with those obtained

from samples that include poorer countries.

The bilateral Grubel-Lloyd indices GLij are calculated using data on all goods trade at the four-

digit Standard International Trade ClassiÞcation (SITC) and therefore, unlike most reported GLij

indices, we do not conÞne ourselves to manufacturing goods trade (source: Feenstra et al. 1997).

The Grubel-Lloyd index can only be calculated for country-pairs where there are positive amounts of

trade, which is the case for 2870 observations, or 87% of the possible 3306 bilateral import relations

among 58 countries. The number of bilateral import relations varies across countries from a low

of 27 for Nepal to the maximum of 57 for most industrialized countries. Furthermore, the average

number of four-digit SITC categories that a country trades bilaterally varies considerably as well.

At one end, Mauritius trades on average 36 categories with its partners, whereas at the other end,

the U.S. trades on average 332 categories of goods with its trade partners.

Across each of their respective trading partners, Bolivia has the lowest average Grubel-Lloyd

index with a value of 0.0006; and the U.K. has the highest value of 0.1495. Each country�s average

Grubel-Lloyd index is shown in Table 1A, along with the country�s GDP per capita, the number of

bilateral relations the average is computed from, and the average number of goods categories where

there is trade. In Table 1B we report the correlation coefficients among the variables of Table 1A.

They are all positive; in particular, the correlation of the average Grubel-Lloyd index with GDP per

capita is 0.78, and the correlation between the average Grubel-Lloyd index and the average number

of goods traded is 0.93. The distribution of bilateral Grubel-Lloyd indices is very skewed: 44% of all

indices are equal to 0, and 78% of all (1120 country-pairs) have a value of 0.05 or less. In the empirical

analysis below, we will treat these 1120 country-pairs�or, 2240 observations�as trade relations where

there is little or no IRS-based trade. The remaining 315 country-pairs (where GLij > 0.05) are those

where IRS-based trade is thought to be present; they account for 87.1% of all of the imports in our

10

Page 13: On Theories Explaining the Success of the Gravity Equation 1

sample.

In the IRS/uni-cone H-O model, the bilateral imports prediction depends on whether the differ-

entiated good is relatively capital- or labor-intensive. We follow the presumption in major mono-

graphs such as Helpman and Krugman (1985) and assume that the differentiated good is relatively

capital-intensive. However, we have examined this question in some detail and concluded that the

assumption may not hold empirically.8 For this reason we also consider the alternative case in our

analysis below, according to which the differentiated good is relatively labor-intensive.

The gravity equation is a relationship between two endogenous variables, so the results might be

affected by simultaneity bias (e.g., Saxonhouse 1989). One possibility is to use factor endowments as

instruments for GDP (e.g., Harrigan 1996). However, factor endowments are known to be not only

correlated with GDP but also with trade ßows. Because this might call into question the validity of

factor endowments as instruments, we have decided to leave the gravity equation uninstrumented.

The data on bilateral distance between countries used below is the smallest arc tan distance between

capital cities, measured in kilometers (source: Haveman 1998).

4 Empirical Results

We split the 2870 bilateral imports observations into two subsamples according to an arbitrarily

chosen critical level of intra-industry trade, denoted GL. Those pairs where GLij ≤ GL belong to

8Rauch (1999) has computed the extent to which an industry produces goods that are reference-priced on organizedexchanges, denoted r. It is plausible to assume that this is inversely related to the degree of product differentiation.We have correlated r with industry-level data on capital-labor intensity in production from seven major industrializedcountries (from Keller 2000) as well as with more disaggregated U.S. data (from Bartelsmann-Gray 2000), Þnding a zeroor positive correlation. As an alternative measure of product differentiation, we have considered the standard deviationof unit import values, expecting a positive relationship (source: WIFO 1999). This variable is not signiÞcantly related tothe ratio of capital- to labor share in value added across industries (data from Peneder 1999). Due to data availability,this analysis covers only manufacturing products (details are available from the authors upon request). Findingsmight change when non-manufacturing sectors and broader deÞnitions of capital, especially including research anddevelopment, are adopted. However, within manufacturing there are a number of highly capital-intensive industriesthat are homogeneous (e.g. petrochemicals), as well as labor-intensive industries that are fairly differentiated (e.g.apparel), explaining the mixed results.

11

Page 14: On Theories Explaining the Success of the Gravity Equation 1

what is referred to as the low-GL sample, while the remaining observations are part of the high-

GL sample. The samples differ in that we expect substantial amounts of trade based on product

differentiation and IRS in the latter but not in the former. Given the critical level GL, we examine

the evidence for H-O-based trade in the low-GL sample by sorting the data, and by forming classes,

according to each observations�s factor proportions differences (FDIF ). Similarly, to shed light on

product differentiation/IRS-based trade we sort the observations in the high-GL sample according to

their level of intra-industry trade (measured by GL). Denote by V the number of classes into which

the sorted observations are allocated. The inferences are based on comparing the results across the

V classes as well as across models in the light of the identiÞcation strategy discussed above.

In the benchmark case that is discussed Þrst, GL is chosen to equal 0.05, and there are V = 5

classes of equal size. We then present results for two alternative cases. In one we adopt a criti-

cal level of GL = 0.033, which shifts more bilateral trade relations into the sample where product

differentiation/IRS-based trade is thought to be present. In the other case, we report results from

regressions in which the class size is endogenously chosen so as to maximize the Þt of the regres-

sion. This is complemented by a number of additional results and sensitivity analyses reported in

appendices A and B.

4.1 Benchmark case

The choice of GL = 0.05 leads to 2240 observations in the low-GL sample and the remaining 630

observations in the high-GL sample. We Þrst discuss results for the two perfect specialization models,

before we turn to examining the two imperfect specialization models.

12

Page 15: On Theories Explaining the Success of the Gravity Equation 1

4.1.1 IRS-based perfect specialization (high-GL sample)

The model considered Þrst is the IRS model, where both goods are differentiated and produced with

scale economies. As discussed above, the model has a volume of imports prediction of M ij = Y iY j

Y w .

We allocate the 630 observations into V = 5 classes, with GLij rising from class v = 1, ..., V. The

following equation, based on (1), is estimated:

M ijv = αv

Y ivY

jv

Y w+ εij

v , (6)

for each v. The disturbance εijv is assumed to have mean zero but might be heteroskedastic. In this

and all following regressions, we assume that the variance of εij is equal to σ2¡Zij

¢2, where Zij is a

vector that includes Þve variables: Y i, Y j , the sums of imports of country i and exports of country

j, and the bilateral distance between i and j (denoted Dij). Let the predicted values from regressing

log¯εij¯on a constant and log Zij be given by log εij; the parameter α is then estimated by ordinary

least squares (OLS) with the scaled variables M ij/εij and¡Y iY j/Y w

¢/εij . If this model captures

an important part of what explains the volume of trade between these countries, then αv should be

rising with GL class v. The results for these regressions can be seen in Table 2.

The estimated αv�s vary from 0.016 to 0.139; all fall well short of the theoretically predicted value

of one. By far the largest import parameter is estimated for the v = 3 class. Thus, an intermediate

level of product differentiation is associated with the maximum volume of trade. The table also

reports standard errors (in parentheses) and the symmetric 90% conÞdence intervals of the point

estimates.9 The conÞdence intervals suggest that α3 might not be larger than α5 (which is estimated

to be 0.099), and the general trend appears to be that αv is increasing with GL, the measure of

9The standard errors and conÞdence intervals are based on bootstrap techniques (see Bickel and Freedman 1981,Efron 1982); they tend to be larger, and appear to be preferable in this case relative to standard asymptotic results.

13

Page 16: On Theories Explaining the Success of the Gravity Equation 1

product differentiation in trade.10

The null hypothesis that the import parameters of all Þve classes are identical is rejected for

this model (a 10% signiÞcance level is adopted for all hypothesis tests). Also the hypothesis that

α1 = α5 is rejected. For all four models, we are interested in seeing if αv increases as v increases;

that is whether the point estimate is greater in classes where the extent of specialization should be

greater. Thus, we report as an additional, albeit rough, indicator of model performance the number

of bilateral comparisons of αv�s that are in line with these priors. That is, we ask whether α2 exceeds

α1, whether α3 is larger than α1, whether α3 exceeds α2, and so on. Eight out of ten bilateral

parameter comparisons are in accordance with the priors for the IRS model. Overall, while the

performance of the IRS model is far from perfect, broadly speaking its predictions seem to be born

out by our results. Thus, the Þndings are consistent with IRS-based trade theory being the reason

why the gravity prediction �works� empirically for these samples. We now turn to the multi-cone

H-O model.

4.1.2 H-O-based perfect specialization (low-GL sample)

Recall that in the multi-cone H-O model all trade is in homogeneous, perfectly specialized products.

The imports prediction is given by equation (1): M ij = Y iY j

Y w . Therefore, the only parameter αv is

predicted to equal 1 when all goods are tradable. Perfect specialization in a H-O world is caused

by increasingly different factor proportions. Thus we examine whether imports are rising as factor

proportions differences, measured by FDIF class v, are rising. We estimate

M ijv = αv

Y ivY

jv

Y w+ εij

v , (7)

10Equation (6) does not include a constant, which means that OLS estimates are inconsistent if E£εij

v

¤is, contrary to

the assumption, not equal to zero. While the mean of the Þtted εij is typically small relative to its standard deviation,we have nevertheless estimated all four models also with a constant. The pattern of how αv varies with v remains thesame, see Table A1.

14

Page 17: On Theories Explaining the Success of the Gravity Equation 1

for each v, v = 1, ..., 5. Factor proportions differences are lowest for v = 1 and highest for v = 5; with

2240 observations in the low-GL sample, each group has 448 observations. Table 2 shows the results.

The largest parameter, with 0.111, is estimated for class v = 2, whereas the α estimates for the

four other classes are between 0.039 and 0.047. The conÞdence intervals conÞrm that α2 is larger

than the other import parameters. Moreover, in contrast to the IRS model above, we do not reject

the null hypothesis that α1 = α5, that is, the import parameter for v = 1 and for v = 5 appear to

be equally large. Thus, while the import parameters vary with factor proportions differences in the

sample, they do not vary as one would expect if more product specialization due to factor proportions

differences were the driving force. For that, α5 should be higher than α1, and in fact, α5 should be

the highest parameter. Neither is the case. We Þnd that there are only two out of a total of ten

bilateral comparisons that are correct for this model.

Summarizing, the evidence that perfect specialization due to H-O reasons is an important expla-

nation for the success of the gravity equation is weak, while the evidence supporting the IRS model

is considerably stronger. Care must be taken though to not overinterpret this result, since as we have

noted above, the data quality for the high-GL sample is higher than for the low-GL sample. More-

over, the trade volume prediction of both of these models holds only if the production of all goods is

perfectly specialized, which is unlikely. We now turn to estimating the import volume predictions of

the two above models that incorporate imperfectly specialized production: import prediction MIH

for the IRS/uni-cone H-O model, and the prediction MH for the uni-cone H-O model.

15

Page 18: On Theories Explaining the Success of the Gravity Equation 1

4.1.3 IRS model incorporating imperfect specialization (high-GL sample)

The model considered here is the IRS/uni-cone H-O model. The regression equation when country

i is capital-abundant relative to j is

M ijv = (1− γi

v)Y i

vYj

v

Y w+ εij

v ,

for each v, v = 1, ..., 5, and M ijv = (1− γj

v)Y iv Y j

v

Y w + εijv if country j is relatively capital-abundant. The

number of estimated parameters (1−γiv) varies by class v.Moreover, the number of times that a given

country�s parameter is estimated in a given class changes as well. In Table 2 we report the median

of the predicted values d(1− γiv), which takes into account the different number of observations that

determine any (1− γiv) estimate. This statistic is denoted by αv.

The median predicted value varies across classes from 0.053 to 0.128. A non-structural interpre-

tation of αv is that it estimates the average size of the differentiated goods sector. This average

should, Þrst, be non-negative, and the results conÞrm that. Second, in this model higher values of

GL (and hence class v) should be associated with higher estimates of αv, because a higher share

of differentiated goods in GDP gives rise to a higher share of intra-industry trade in total trade.

The αv estimates rise from 0.078 for v = 1 to 0.128 for v = 5, with a non-monotonicity at v = 2

(α2 = 0.053). This pattern is broadly in accordance with the model�s prediction; as indicated in the

table, only one out of ten bilateral comparisons of αv�s is incorrect.

The conÞdence intervals suggest that some but not all differences of import parameter estimates

across GL classes are also statistically signiÞcant. In particular, the median import parameter for

intermediate levels of GL (v = 3) is statistically indistinguishable from that for the highest GL class.

At the same time, we strongly reject both the null hypothesis that the Þve αv are the same and the

null that α1 is equal to α5. Overall, this provides some support for the IRS/uni-cone H-O model.

16

Page 19: On Theories Explaining the Success of the Gravity Equation 1

We now turn to the uni-cone H-O model.

4.1.4 H-O-based imperfect specialization (low-GL sample)

We return to the sample of bilateral pairs where the computed Grubel-Lloyd (GLij) index is 0.05

or less. Here we estimate the uni-cone H-O model, where two homogeneous goods are produced

in both countries. As discussed in section 1 above, the model�s import prediction MH is equal to

M ij =¡γj − γi

¢Y iY j

Y w if country i is capital abundant. We estimate

M ijv =

³γj

v − γiv

´ Y ivY

jv

Y w+ εij

v ,

for each v, v = 1, ..., 5, andM ijv =

¡γi

v − γjv

¢ Y iv Y j

v

Y w +εijv if country j is capital-abundant. In each class

v, there are up to 58 countries. Out of 58 parameters γ, only 57 are identiÞed. We arbitrarily set

the γ for the country with the lowest capital-labor ratio in the sample (Sierra Leone) equal to one

in each class v. Because the composition of the classes varies in terms of countries, we report also

here the median of the predicted (net) import parameter, given byd³

γjv − γi

v

´and denoted by αv. A

non-structural interpretation is that it presents the difference between the size of the labor-intensive

sector in the labor- and in the capital-abundant country. We expect αv to be positive and that it

tends to rise when bilateral factor proportions differences increase.

Table 2 shows that αv for the uni-cone H-O model is estimated to lie between 0.021 and 0.080,

which together with the conÞdence intervals suggests that αv is positive for all classes. The median

value of d(γj − γi) when all 2240 observations are used in the estimation is 0.04 (last row). Further,

αv is generally rising with factor proportions differences across classes, although there is one non-

monotonicity at class v = 4. Nine out of ten bilateral comparisons are correctly predicted by the

uni-cone H-O model. The conÞdence intervals also allow to reject the null hypothesis that all αv�s

are identical as well as the hypothesis that α1 = α5. Thus, we Þnd support in favor of the uni-cone

17

Page 20: On Theories Explaining the Success of the Gravity Equation 1

H-O model. Interestingly, although we Þnd a non-monotonicity for αv at a relatively high FDIF

level (v = 4), the uni-cone H-O model appears to be working quite well even for the highest levels

of factor proportions differences. A priori one might have thought that the assumption of a common

cone of production diversiÞcation is then implausible. We will take up this issue again in section

4.1.6.

Summarizing, from this analysis of these imperfect specialization models, we Þnd supportive

evidence for both the IRS/uni-cone H-O and the uni-cone H-O model.

4.1.5 Comparing the perfect and imperfect-specialization models in terms of Þt

Table 3 shows a number of widely-used model selection criteria for the benchmark case. The R2

is computed using the expression y0y/y0y, where y and y are the deviations from the means of

the predicted values of the independent variable and the actual value of the independent variable,

respectively. Akaike�s Information Criterion (AIC) is deÞned as

AIC = ln

µe0eN

¶+ 2Q/N,

where e0e is the residual sum of squares, N is the number of observations, and Q is the number

of estimated parameters. Also the unadjusted log of e0e is reported. In the regression with all

630 observations, the R2 for the IRS model is 0.280, while for the IRS/uni-cone H-O, it is 0.105.11

Also according to the AIC criterion, the perfect specialization model is preferred to the imperfect

specialization model in the high-GL sample. The same qualitative results are obtained for the two

other models in the low-GL sample.

The comparison of Þt across the v classes is also of interest. In the high-GL sample, the question

11 It is sometimes suggested to compute y and y as deviations from zero when the regression does not contain aconstant (e.g. Kennedy 1992, 27). This changes little in the present context, though.

18

Page 21: On Theories Explaining the Success of the Gravity Equation 1

is whether the IRS model is increasingly preferred as the share of intra-industry trade in total trade

goes up. Comparing the AIC values of the IRS and the IRS/uni-cone H-O model, though, no clear

pattern emerges. In the low-GL sample, we Þnd that the multi-cone H-O model performs relatively

best when factor proportions differences are smallest (v = 1), and worst when they are largest

(v = 5). This is the opposite of what one would expect if factor proportions differences would be

important in determining trade in perfectly specialized goods.

Overall, it appears that our analysis of empirical Þt does not lead to major conclusions, except

perhaps that the reduction of the residual sum of squares that the imperfect specialization models

generate is small relative to the additional number of parameters that are estimated. None of the

models receives much support from its relative performance across classes. In particular, the pattern

of relative Þt for the multi-cone and uni-cone H-O models is the opposite of what should be expected

if the models performed well. The latter conÞrms the point that perfect specialization due to H-O

reasons is unlikely to be an empirically important determinant of international trade volumes. We

conclude the analysis of the benchmark case by examining the country-speciÞc estimates of the two

imperfect specialization models.

4.1.6 Country-speciÞc estimates of the imperfect specialization models

In this section we discuss the estimates of (1− γi) and γi from the IRS/uni-cone H-O and uni-cone

H-O models, respectively. The results are collected in Table 4. In the case of the IRS/uni-cone H-O

model, thirty-Þve country-speciÞc parameters are estimated. The remaining twenty-three countries

have no trade relations where the level of intra-industry trade exceeds the critical value of GL = 0.05.

These countries are primarily developing countries, as Table 4 shows. The parameter¡1− γi

¢denotes

the share of the differentiated goods sector in GDP in the capital-abundant country, which should

be bounded between zero and one. Moreover, under the assumption that the differentiated good is

19

Page 22: On Theories Explaining the Success of the Gravity Equation 1

relatively capital-intensive, the model implies that the share of differentiated goods in GDP increases

with the relative abundance of capital to labor, ceteris paribus. Thus one expects under these

conditions that the correlation of Ki

Li and¡1− γi

¢is positive. As reported in Table 4, 32 out of 35,

or 91% of the (1− γi) estimates satisfy the restriction of being between zero and one. We estimate

values above one, which would mean a negative share of the homogeneous goods sector, for Honduras,

Zimbabwe, and the Republic of Korea.

The table reports also the standard errors of the (1− γi) as well as p-values, both of which are

computed using bootstrap techniques. The p-values indicate the probability that a given country�s

(1− γ) exceeds that of Switzerland, the country with the highest relative capital endowment in the

sample. The estimate for Switzerland is equal to 0.091. The (1−γi) estimates have often a relatively

small standard error, but it is not that case that most of them are smaller than 0.091 : at a 10%

level, only 13 out of 34, or 38% of the parameters are signiÞcantly smaller than that of Switzerland.

Thus, it is not surprising that we do not Þnd the positive correlation between¡1− γi

¢and Ki

Li that

the model predicts. As reported in the lower part of the table, the simple correlation of¡1− γi

¢and Ki

Li is negative. The result is further conÞrmed by regressing the¡1− γi

¢on a constant and Ki

Li

using OLS: the t-statistic of the slope parameter in this regression is equal to -2.12.12 The analogous

correlations with GDP per capita are not much different.

These estimates do not provide support for the IRS/uni-cone H-O model. Given our difficulty

of establishing whether the differentiated good is relatively capital-intensive or not, we have also

considered the alternative assumption, that the homogenous good is relatively capital-intensive. It

can be shown that, on the maintained assumption that country i is relatively capital-abundant, the

import prediction becomesM ij =¡1− γj

¢ Y iY j

Y w . That is, the regression parameter estimates country

12We have also weighted each observation by the inverse of the standard error of (1−γi); the results of that regressiondo not provide more support for the model. When the three cases in which

¡1− γi

¢exceeds one are dropped from the

regression as they do not satisfy the restriction that¡1− γi

¢is a share, the t-statistic drops to -0.60.

20

Page 23: On Theories Explaining the Success of the Gravity Equation 1

j�s share of the differentiated good in GDP, instead of country i�s share. However, estimating this

alternative prediction also fails to conÞrm the theoretical relationship in the IRS/uni-cone H-O model

between the size of the differentiated goods sector and factor endowments.13

In the case of the uni-cone H-O model, we estimate 57 parameters γi, the share of the labor-

intensive good in GDP. The assumption of imperfect specialization means that γ should be between

zero and one for all countries, and our estimates are consistent with that (conditional on the set

value of γ = 1 for Sierra Leone). The range of the estimates is relatively small, from a low of 0.855

for Belgium-Luxemburg to a high of 1.027 for Mauritius. Because some international differences

in the size of the homogeneous goods sector in GDP are likely to be larger than (approximately)

seventeen percentage points, the γ estimates should not be interpreted too literally. Nevertheless, it

is interesting to see how different the point estimates are relative to the set value for Sierra Leone.

The p-values in the last column report the probability that the γ of a given country is at least as large

as that of Sierra Leone. For 36 out of 57 countries, or 63%, we can reject the null hypothesis that

γi is equal to one. Moreover, none of the γi is estimated signiÞcantly higher than one at standard

levels.

We now examine the correlation of relative capital endowments and the estimated γ�s. All else

equal, the higher a country�s relative capital endowment, the more it will specialize in the production

of the capital-intensive good, implying a lower share of the labor-intensive goods sector in GDP. Thus,

we expect a negative correlation of γ and KL . As reported in the lower part of Table 4, this is what

we obtain, with a correlation of −0.80. The result is strongly conÞrmed by an OLS regression of γ

on a constant and KL , which results in a t-statistic of the slope parameter of −9.65 and accounts for

13We estimate 47¡1− γj

¢parameters, all positive, and three parameters exceed one (for Malawi, Iceland, and West

Germany). If the homogeneous good Z is relatively capital-intensive, the model predicts a positive correlation of γ, Z�sshare in GDP, and K

L(or, equivalently, a negative correlation of

¡1− γj

¢and Kj

Lj ). Regressing¡1− γj

¢on a constant

and Kj

Lj , we Þnd a t-statistic for the slope parameter of -0.35, which changes to 2.06 once the three above-one estimatesare dropped.

21

Page 24: On Theories Explaining the Success of the Gravity Equation 1

63% of the variation in the estimated γ�s. We have also examined the correlation of γ and KL for

different levels of factor proportions differences, as indexed by class v, to see whether the negative

correlation holds for all levels of FDIF to the same extent. While the correlation is always found

to be negative, the differences in KL account for about 70% in the variation of γ�s for the classes

v = 1 and v = 2 where FDIF is relatively low, but only for less than 10% for the classes v = 4

and v = 5 where factor proportions differences are higher. These results suggests that the predicted

link between factor proportions and trade is stronger for countries with small- to intermediate-factor

proportions differences, where the assumption that such countries share the same cone of production

diversiÞcation is more plausible.

Overall, these results suggest that the factor proportions-driven pattern of specialization and

trade that the uni-cone H-O model predicts has indeed an important effect on the volume of bilateral

trade. We now brießy analyze the robustness of the major Þndings by considering (a) the case of

endogenous class formation, and (b) the case of a different critical level that separates the high- and

low-GL samples.

4.2 Alternative speciÞcations

4.2.1 Endogenous class size

We allow for a varying number of observations in each class, or equivalently, for the endogenous

formation of classes according to the variables FDIF ij and GLij , respectively. We choose that

allocation of observations across Þve classes that minimizes the AIC criterion. Another interpretation

of this analysis is that we estimate the Þve αv together with the four class breakpoint values of FDIF

(or GL, respectively) by maximum likelihood (MLE).14 Table 5 shows the results.

For the IRS model, we estimate αv between 0.016 (for the lowest GL sample) and 0.133 (for the

14 In order to ensure the estimability of the α�s in each class, we have to use certain minimum numbers of observationsper class. Our estimates are therefore only approximations to maximum likelihood estimates.

22

Page 25: On Theories Explaining the Success of the Gravity Equation 1

second-highest GL sample). Inspection of the conÞdence intervals reveals that both the hypothesis

that αv = α, ∀v, and the hypothesis that α1 = α5 are rejected, even though in the top end of the

distribution, there seems to be relatively little difference in the αv�s for v = 2 to v = 5 (see the 95

percentile estimates). The pattern of the αv is broadly increasing with GL, with eight out of ten

bilateral comparisons correct. With 630 high-GL observations, the best Þt tends to be obtained with

a relatively high number of observations in the v = 1 and v = 5 classes.15

In the case of the multi-cone H-O model, the αv estimates range from 0.029 to 0.111. As in the

benchmark case, we Þnd the highest estimate not for the class with the largest factor proportions

differences, but for the v = 2 class. Moreover, the estimate of α2 is signiÞcantly larger than α5 at

a 10% level, while we cannot reject the null hypothesis that α1 = α5. Only four out of ten bilateral

comparisons are correct. The best Þt tends to be obtained when the 2240 low-GL observations are

allocated more or less evenly across the Þve classes; only the v = 5 class tends to be relatively large.

We now turn to the imperfect specialization models.

In the high-GL sample, the results for the IRS/uni-cone H-O model differ from the benchmark

case discussed above. While the point estimates of the median d(1− γiv) still tend to increase with GL

�eight out of ten bilateral comparisons are correctly predicted�, the standard errors of the estimates

are fairly large. Consequently, one cannot reject the hypothesis that αv = α, ∀v, nor the hypothesis

that the estimates for the Þrst and the Þfth class are equal. The reason for the diverging MLE results

relative to the benchmark case has to do with the different allocation of the 630 observations into

the Þve classes. Here, there are approximately 78 observations each in class v = 1 to v = 4, and the

remainder of about 316 observations is in class Þve.

The MLE estimates of the median d(γj − γi) for the uni-cone H-O model range from 0.015 for

v = 1 to 0.079 for v = 4. As in the benchmark case, the αv are broadly increasing with higher

15Details on the class breakpoint analysis are available from the authors upon request.

23

Page 26: On Theories Explaining the Success of the Gravity Equation 1

factor proportions differences (nine of the bilateral comparisons correct). The allocation of the total

observations across classes varies from that in the benchmark case: the classes v = 1 to v = 4 tend

to have about 360 observations, while the remaining circa 800 are in the class with the higher factor

proportions differences. However, in contrast to the IRS/uni-cone H-O model, we can reject both

the null hypothesis that αv = α, ∀v as well as the hypothesis that α1 = α5 at standard levels of

signiÞcance.

In sum, while noting both the generally larger conÞdence intervals and the weaker results for the

IRS/uni-cone H-O model, we Þnd that the MLE results are broadly similar to those that we have

obtained in the benchmark case.

4.2.2 A different separation into high- and low-GL samples

Table 6 shows the results when the critical level GL to separate the high- and low-GL sample is set

at 0.033; this allocated 740 observations to the high- and 2130 observations to the low-GL sample.

The pattern of αv estimates for the IRS model is more erratic than in the benchmark case, with

only six out of ten bilateral comparisons correct. However, the estimates continue to suggest that α5

exceeds α1. There is no major change in the performance of the multi-cone H-O model; we still do

not estimate a larger import parameter in classes with higher levels of factor proportions differences.

Only four out of ten bilateral comparisons are correctly predicted by the model.

Among the imperfect specialization models, the IRS/uni-cone H-O model fares worse in that now

only six bilateral comparisons are correctly predicted, as opposed to nine in the benchmark case.

This conÞrms the MLE results for this model, and suggests that the differences between the d(1− γiv)

estimates for the higher GL classes should not be overemphasized. Nevertheless, it remains the case

that one can reject both the null hypothesis that all αv�s are the same and that α1 = α5 according

to the conÞdence intervals reported in Table 6. For the uni-cone H-O model, the pattern of the

24

Page 27: On Theories Explaining the Success of the Gravity Equation 1

estimates is the same as in the benchmark case. Moreover, we reject both the hypothesis that all

αv�s are identical and the hypothesis that α1 is as large as α5. Thus, the performance of the uni-cone

H-O model appears to be robust.

In some unreported analysis we have also examined both the goodness-of-Þt of the models as well

as the country-speciÞc parameter estimates for the imperfect specialization models. This produced

results very similar to those in the benchmark case. In particular, the correlation of¡1− γi

¢and

KiLiestimated from the IRS/uni-cone H-O model is non-positive, which differs from the positive

correlation predicted by the model. In contrast, the correlation of the γi�s estimated from the uni-

cone H-O model and data on KiLiremains strong and of the correct sign (which is negative).

Overall, the analysis of these two alternative speciÞcations suggests that the qualitative results

that we have obtained in the benchmark case are by and large robust.

4.3 The effect of distance and other sensitivity analyses

One element that we have not controlled for so far is differences in the distance between countries.

It is well-established empirically that bilateral trade volumes fall as distance increases (the second

pillar of the gravity equation), raising the possibility that the different parameter estimates are

driven by differences in bilateral distance among trade partners.16 Even though the import volume

predictions we use do not incorporate the effect of trade costs, at an empirical level it is important

to see whether our results depend on the omission of bilateral distance from our regressions. Our

analysis in Appendix A shows that this is not the case. While this cannot settle the question of how

structural bilateral import relations look in the presence of trade costs, it means that our earlier

Þndings are robust to incorporating the effects of distance.

We also report in appendices A and B the results from a range of additional sensitivity analyses.

16Note, however, the recent paper by Krishna (2000) who argues that the importance of distance has been exaggeratedin previous research.

25

Page 28: On Theories Explaining the Success of the Gravity Equation 1

This encompasses alternative speciÞcations, outlier analysis, as well as different sources and deÞn-

itions of the data; the latter is intended to address problems resulting from measurement error in

the data that we have used. The broad conclusion that emerges from this is that our results are not

sensitive to these factors.

5 Conclusions

Although it is well-known that the gravity equation can explain much of the variation in bilateral

trade volumes, to date there has been little agreement about which theory or theories underpins its

success. In this paper we have employed information on the extent of intra-industry trade and on

the differences in factor endowments in a given sample to help identify which of four models might

be driving the gravity equation.

Our approach relies on identifying intra-industry trade with IRS-based trade. Intra-industry

trade, though, can also be caused by other reasons, in particular by Ricardian technology differences

(Davis 1995). However, the lack of data on production technologies at the goods level makes it

difficult to evaluate this alternative hypothesis. In addition, we have not modeled other potentially

important determinants of trade ßows, especially transportation and trade costs. Although we

Þnd that controlling for distance does not overturn our principal Þndings, this analysis is far from

complete, and moreover, it does not address how trade impediments alter international trade ßows

in terms of theory.

We Þnd that specialization and trade are increasing as the share of intra-industry trade in total

trade rises. This provides support for the product differentiation cum IRS model of trade to play a

role in �North-North� trade. However, the perfect specialization version of this model overpredicts

the amount of bilateral trade by a large margin. Adding a factor abundance element leads to

some imperfectly specialized production. In line with the predictions of this IRS/uni-cone H-O

26

Page 29: On Theories Explaining the Success of the Gravity Equation 1

model, we Þnd that the size of the differentiated goods sector and the share of intra-industry trade

move together. However, the size of the differentiated goods sector is not related to relative factor

abundance in the way predicted by the IRS/uni-cone H-O model.

Before discussing our Þndings in samples where intra-industry trade is only a small part of overall

trade, and comparing them with those obtained for the high-GL sample, we caution again that the

data quality for the low-GL sample is worse than for the high-GL sample. Having said that, we

fail to Þnd that higher levels of factor proportions differences lead systematically to more perfect

specialization and trade. Thus, there is no support for the perfect specialization Heckscher-Ohlin

model. Other estimates suggest that imperfect specialization due to factor proportions differences is

important for explaining bilateral trade. Generally, trade volumes are going up with bilateral factor

proportions differences. Moreover, as predicted, the relative size of the labor-intensive goods sector

is negatively correlated with the relative capital endowment across countries. This suggests that the

factor abundance motive is important in explaining the volume of �North-South� trade.

In the light of this paper, among models that predict perfect specialization, the IRS model is

clearly the better candidate to explain the success of the gravity equation, relative to the multi-cone

H-O model. Moreover, there is no reason to believe that the results of Hummels and Levinsohn

(1995) question the empirical relevance of IRS trade theory. Imperfect specialization due to factor

proportions differences is likely to account for the regression results that these authors obtained from

a sample of trade relations between non-OECD nations, where little product differentiation/IRS was

expected. While our results are consistent with IRS and product differentiation being empirically

relevant elements in helping to explain the volume of bilateral trade, they stop well short of a full

endorsement of the IRS models that we have considered. SpeciÞcally, when product differentiation

in trade is important, differences in relative factor abundance across countries seem to matter less

for specialization than according to the synthesis of factor abundance and IRS motives for trade

27

Page 30: On Theories Explaining the Success of the Gravity Equation 1

proposed by Helpman (1981). In part our mixed results for the IRS/uni-cone H-O model might

be due to the fact that every differentiated good is neither more capital-intensive, nor more labor-

intensive, than homogeneous goods, in contrast to the clear-cut distinction in Helpman�s model.

Further, once product differentiation in trade is high (and bilateral factor proportions differences are

small), perhaps it is too much to ask of the data that the size of the differentiated goods sector be

related to relative factor abundance across countries. This corresponds to the widely-held expectation

that the factor abundance inßuence is muted in �North-North� trade relationships. Underlying all of

this is the strong suspicion that this two-good, two-factor IRS/uni-cone H-O model is too simple to

perform well empirically.

While we reject the multi-cone H-O model, the support for the uni-cone H-O model is perhaps

surprisingly strong. Of course, the latter does not mean that the world�s countries indeed occupy

the same cone of diversiÞcation; data on factor prices as well as on factor input ratios in production

casts doubt on this notion. Moreover, rejecting a model where each country occupies a distinct cone

does not mean that perfect specialization due to factor proportions differences is irrelevant; it leaves

open the possibility of a relatively small number, say three or four, of diversiÞcation cones. Future

research should develop multi-country IRS/H-O models in which production is partly specialized�

due to both factor abundance and IRS� and partly diversiÞed, and where bilateral trade volumes

are determined. We take the relative success of the uni-cone H-O model to suggest that the bilateral

trade ßow predictions of that multi-country model will probably turn out to be closely related to

that of the uni-cone H-O model that we have analyzed here. And we take the evidence on the IRS

and multi-cone H-O models to suggest that perfect specialization of production in different countries

has more to do with scale economies than with relative factor abundance differences, especially for

goods that are produced using similar factor intensities.

28

Page 31: On Theories Explaining the Success of the Gravity Equation 1

Table 1A

COUNTRY GDP per Capita GDP (billion Capital per Number of Average Number of Average Grubel-($ US) $ US) Worker ($ US) Bilateral Relations Industry Classes Traded Lloyd Index

U.S.A. 16570 3964.9 29925 57 332 0.1371CANADA 15589 392.3 34535 57 190 0.055SWITZERLAND 14864 96.2 62769 57 209 0.1227NORWAY 14144 58.7 44866 57 142 0.0537AUSTRALIA 13583 214 33875 56 151 0.0306SWEDEN 13451 112.3 31326 57 198 0.1163DENMARK 12969 66.3 29286 57 181 0.0923W. GERMANY 12535 765.4 47695 57 323 0.1473BELGIUM-LUX. 12230 116.1 33494.5 57 221 0.1347ICELAND 12209 2.9 15991 44 84 0.0064FRANCE 12206 673.4 31796 57 282 0.1377FINLAND 12051 59.1 38591 57 136 0.0625JAPAN 11771 1421.4 28106 57 244 0.064NETHERLANDS 11539 167.2 29325 57 255 0.1283NEW ZEALAND 11443 37.4 30511 53 106 0.0189U.K. 11237 636.2 17636 57 307 0.1495AUSTRIA 11131 84.1 29708 57 171 0.1031ITALY 10808 617.6 28117 57 265 0.1111HONG KONG 10599 57.8 11220 57 166 0.0805ISRAEL 8310 35.2 22307 52 119 0.0721SPAIN 7536 290.7 21831 57 209 0.0807IRELAND 7275 25.8 20700 56 117 0.0746VENEZUELA 6225 106.3 20419 48 91 0.0078GREECE 6224 61.8 22151 55 112 0.0304MEXICO 5621 420.3 14016 52 100 0.0239TAIWAN 5449 104.9 19194 57 167 0.0569ARGENTINA 5324 161.5 12084 52 92 0.0201PORTUGAL 5070 51.5 9503 56 111 0.0473SYRIA 4240 43.9 15681 41 70 0.0011MAURITIUS 4226 4.3 2393 35 36 0.0106S. KOREA 4217 172.1 12036 55 139 0.0644IRAN 4043 187.5 10686 41 81 0.0033PANAMA 3499 7.6 16417 45 73 0.0074CHILE 3467 42 7060 48 100 0.0081TURKEY 3077 154.8 6989 49 103 0.0195COLOMBIA 2968 87.5 12790 48 85 0.0092ECUADOR 2913 27.1 15536 44 67 0.0031PERU 2565 49.7 9480 48 90 0.0077THAILAND 2463 127.3 4051 53 114 0.0286JAMAICA 2215 5 3725 41 47 0.005DOMINICAN REP. 2111 13.5 5282 43 56 0.012GUATEMALA 2090 16.6 3985 45 57 0.0079PARAGUAY 2072 7.7 849 37 42 0.0021SRI LANKA 2045 32.4 8238 54 66 0.0186MOROCCO 1956 43.2 2803 48 76 0.0067BOLIVIA 1754 11.1 6987 37 49 0.0006IVORY COAST 1545 15.1 1171 51 57 0.0076PHILIPPINES 1542 84.4 4087 55 82 0.024HONDURAS 1387 6.1 4507 41 51 0.0028ZIMBABWE 1216 10.2 5750 48 52 0.0033NIGERIA 1062 88.4 1103 44 83 0.0024INDIA 1050 803.4 1712 57 115 0.0305NEPAL 936 15.6 758 27 41 0.0031SIERRA LEONE 905 3.3 215 34 38 0.0029ZAMBIA 808 5.5 1785 37 53 0.0053KENYA 794 16.1 1187 43 75 0.0052MADAGASCAR 769 7.7 1714 34 40 0.0059MALAWI 518 3.7 469 37 38 0.0028

Average 6248.55 222.31 16214.2 49.48 123.4 0.0427Median 4233 58.9 12437 52 100 0.0198Standard Deviation 4934.91 563.33 14136.79 8.05 78.57 0.0469

Page 32: On Theories Explaining the Success of the Gravity Equation 1

Table 1B

Correlation Matrix

GDP per GDP Capital per No. of Av. No. of Av. Grubel-Capita Worker Bilateral Pairs Industries Lloyd Index

GDP per Capita 1GDP 0.41 1Capital per Worker 0.89 0.26 1No. of Bilat. Pairs 0.68 0.31 0.66 1Av. No. of Industr. 0.81 0.6 0.73 0.74 1Av. Grubel-L. Index 0.78 0.45 0.73 0.71 0.93 1

Page 33: On Theories Explaining the Success of the Gravity Equation 1

Table 2: The benchmark case

IRS/H-O: High-GL sample (GL>0.05) H-O:Low-GL sample (GL<0.05)

Model

Name

IRS IRS/uni-cone H-O Multi-cone H-O Uni-cone H-O

2 types

of goods

IRS/IRS IRS/CRS CRS/CRS CRS/CRS

Only

perfect

specialization

of production

Yes No Yes No

GLαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

FDIFαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

v = 10.016

(0.012)

0.012;

0.044

0.078

(0.005)

0.072;

0.087

v = 10.039

(0.007)

0.030;

0.049

0.021

(0.004)

0.012;

0.026

v = 20.044

(0.005)

0.036;

0.052

0.053

(0.005)

0.047;

0.060

v = 20.111

(0.014)

0.087;

0.132

0.027

(0.008)

0.025;

0.043

v = 30.139

(0.013)

0.120;

0.164

0.117

(0.009)

0.112;

0.141

v = 30.047

(0.005)

0.040;

0.056

0.058

(0.008)

0.039;

0.066

v = 40.069

(0.017)

0.049;

0.097

0.123

(0.005)

0.109;

0.124

v = 40.039

(0.003)

0.034;

0.044

0.048

(0.006)

0.046;

0.064

v = 50.099

(0.015)

0.083;

0.125

0.128

(0.006)

0.119;

0.134

v = 50.039

(0.004)

0.033;

0.045

0.080

(0.007)

0.069;

0.101

H0 :

αi = α, ∀ireject reject reject reject

H0 :

α1 = α5

reject reject do not reject reject

Share of

bilat. comp.

correct

8/10 9/10 2/10 9/10

All observ.0.087

(0.009)

0.076;

0.104

0.086

(0.004)

0.079;

0.092

0.052

(0.003)

0.047;

0.056

0.040

(0.003)

0.034;

0.044

Page 34: On Theories Explaining the Success of the Gravity Equation 1

Table 3: Measures of Þt for the benchmark case

IRS/H-O: High-GL sample (GL>0.05) H-O: Low-GL sample (GL<0.05)

Model

Name

IRS IRS/uni-cone H-O Multi-cone H-O Uni-cone H-O

2 types

of goods

IRS/IRS IRS/CRS CRS/CRS CRS/CRS

Only

perfect

specialization

of production

Yes No Yes No

GL

ln(e0e)

AIC

R2

ln(e0e)

AIC

R2

FDIF

ln(e0e)

AIC

R2

ln(e0e)

AIC

R2

v = 1

44.61

39.79

1.083

43.74

39.30

0.832

v = 1

43.36

37.26

0.237

43.24

37.37

0.712

v = 2

45.21

40.39

0.456

44.93

40.46

0.619

v = 2

45.61

39.51

0.455

45.25

39.38

0.605

v = 3

48.81

43.99

0.958

48.75

44.26

0.674

v = 3

44.72

38.62

0.417

44.24

38.38

0.484

v = 4

47.17

42.34

0.917

46.45

41.93

0.869

v = 4

44.45

38.35

1.963

43.88

38.03

0.793

v = 5

50.61

45.79

0.214

49.61

45.04

0.934

v = 5

44.32

38.22

0.596

43.84

37.95

0.690

All observ.

50.91

44.46

0.280

51.03

44.69

0.105

All obs.

46.74

39.03

0.451

46.96

39.30

0.195

Page 35: On Theories Explaining the Success of the Gravity Equation 1

Table 4

IRS/Uni-cone H-O Uni-cone H-OGDP

COUNTRY per cap. K/L 1-gamma s.e. p-value gamma s.e. p-value

SIERRA LEONE 905 215 1.000 n/a n/aMALAWI 518 469 1.002 0.018 1.00NEPAL 936 758 0.962 0.018 0.02PARAGUAY 2072 849 1.001 0.016 1.00NIGERIA 1062 1103 1.007 0.020 1.00IVORY COAST 1545 1171 1.008 0.018 1.00KENYA 794 1187 0.999 0.015 0.96INDIA 1050 1712 0.977 0.015 0.00MADAGASCAR 769 1714 0.980 0.014 0.14ZAMBIA 808 1785 0.022 0.006 0.00 1.001 0.045 1.00MAURITIUS 4226 2393 1.027 0.032 1.00MOROCCO 1956 2803 0.976 0.015 0.04JAMAICA 2215 3725 1.009 0.017 1.00GUATEMALA 2090 3985 0.986 0.016 0.36THAILAND 2463 4051 0.976 0.014 0.02PHILIPPINES 1542 4087 0.976 0.015 0.00HONDURAS 1387 4507 2.647 0.685 1.00 1.009 0.021 1.00DOMINICAN REP. 2111 5282 0.211 0.001 1.00 0.986 0.016 0.28ZIMBABWE 1216 5750 6.415 1.521 1.00 0.976 0.016 0.02BOLIVIA 1754 6987 1.015 0.050 1.00TURKEY 3077 6989 0.981 0.014 0.22CHILE 3467 7060 1.005 0.015 1.00SRI LANKA 2045 8238 0.017 0.006 0.00 0.983 0.016 0.16PERU 2565 9480 0.982 0.015 0.12PORTUGAL 5070 9503 0.969 0.020 0.04IRAN 4043 10686 0.977 0.020 0.28HONG KONG 10599 11220 0.111 0.020 1.00 0.940 0.033 0.02REP. OF KOREA 4217 12036 1.497 0.227 1.00 0.975 0.020 0.24ARGENTINA 5324 12084 0.206 0.006 1.00 0.955 0.021 0.00COLOMBIA 2968 12790 0.140 0.018 1.00 0.971 0.015 0.00MEXICO 5621 14016 0.037 0.008 0.00 0.973 0.014 0.00ECUADOR 2913 15536 0.391 0.025 1.00 0.971 0.018 0.04SYRIA 4240 15681 0.964 0.015 0.00ICELAND 12209 15991 1.018 0.034 1.00PANAMA 3499 16417 0.406 0.163 1.00 0.975 0.034 0.54U.K. 11237 17636 0.048 0.006 0.00 0.903 0.020 0.00TAIWAN 5449 19194 0.111 0.020 1.00 0.959 0.019 0.02VENEZUELA 6225 20419 0.015 0.001 0.00 0.952 0.020 0.00IRELAND 7275 20700 0.198 0.035 1.00 0.934 0.023 0.00SPAIN 7536 21831 0.049 0.007 0.00 0.933 0.016 0.00GREECE 6224 22151 0.076 0.009 0.04 0.968 0.015 0.00ISRAEL 8310 22307 0.030 0.005 0.00 0.910 0.023 0.00JAPAN 11771 28106 0.072 0.008 0.06 0.954 0.016 0.00ITALY 10808 28117 0.021 0.003 0.00 0.932 0.017 0.00DENMARK 12969 29286 0.101 0.018 1.00 0.933 0.017 0.00NETHERLANDS 11539 29325 0.065 0.012 0.08 0.856 0.024 0.00AUSTRIA 11131 29708 0.080 0.013 0.38 0.943 0.015 0.00U.S.A. 16570 29925 0.042 0.006 0.00 0.953 0.014 0.00NEW ZEALAND 11443 30511 0.178 0.024 1.00 0.910 0.022 0.00SWEDEN 13451 31326 0.089 0.012 0.74 0.923 0.021 0.00FRANCE 12206 31796 0.054 0.004 0.00 0.925 0.019 0.00BELGIUM-LUX. 12230 33495 0.098 0.010 1.00 0.855 0.027 0.00AUSTRALIA 13583 33875 0.107 0.010 1.00 0.930 0.019 0.00CANADA 15589 34535 0.076 0.007 0.10 0.946 0.016 0.00FINLAND 12051 38591 0.143 0.020 1.00 0.930 0.017 0.00NORWAY 14144 44866 0.196 0.026 1.00 0.934 0.019 0.00W. GERMANY 12535 47695 0.086 0.005 0.30 0.906 0.018 0.00SWITZERLAND 14864 62769 0.091 0.010 n/a 0.874 0.018 0.00

Relation of (1-gamma) and gamma toCorrelation OLS Correlation OLS

t-stat. R squared t-stat. R squaredCapital-labor endowment ratio (K/L) -0.35 -2.12 0.12 -0.80 -9.65 0.63GDP per capita -0.40 -2.54 0.16 -0.74 -8.03 0.54

Page 36: On Theories Explaining the Success of the Gravity Equation 1

Table 5: Endogenous class size

IRS/H-O (GL>0.05) H-O (GL<0.05)

Model

Name

IRS IRS/uni-cone H-O Multi-cone H-O Uni-cone H-O

2 types

of goods

IRS/IRS IRS/CRS CRS/CRS CRS/CRS

Only

perfect

specialization

of production

Yes No Yes No

GLαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

FDIFαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

v = 10.016

(0.015)

0.011;

0.058

0.076

(0.030)

0.053;

0.144

v = 10.039

(0.014)

0.024;

0.056

0.015

(0.007)

0.008;

0.031

v = 20.042

(0.060)

0.034;

0.209

0.059

(0.026)

0.047;

0.126

v = 20.111

(0.026)

0.074;

0.157

0.030

(0.012)

0.019;

0.051

v = 30.027

(0.051)

0.037;

0.208

0.089

(0.037)

0.038;

0.150

v = 30.029

(0.033)

0.025;

0.123

0.047

(0.017)

0.034;

0.084

v = 40.133

(0.039)

0.040;

0.152

0.106

(0.064)

0.050;

0.270

v = 40.058

(0.018)

0.026;

0.076

0.079

(0.013)

0.043;

0.085

v = 50.102

(0.044)

0.080;

0.192

0.103

(0.033)

0.092;

0.192

v = 50.037

(0.015)

0.026;

0.062

0.058

(0.010)

0.052;

0.076

H0 :

αi = α,∀ireject do not reject reject reject

H0 :

α1 = α5

reject do not reject do not reject reject

Share of

bilat. comp.

correct

8/10 8/10 4/10 9/10

All observ. As in Table 2

Page 37: On Theories Explaining the Success of the Gravity Equation 1

Table 6: Alternative critical GL level (GL = 0.033)

IRS/H-O: High-GL sample (GL>0.05) H-O: Low-GL sample (GL<0.05)

Model

Name

IRS IRS/uni-cone H-O Multi-cone H-O Uni-cone H-O

2 types

of goods

IRS/IRS IRS/CRS CRS/CRS CRS/CRS

Only

perfect

specialization

of production

Yes No Yes No

GLαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

FDIFαv

(s.e.)

5%

95%

αv

(s.e.)

5%

95%

v = 10.057

(0.010)

0.047;

0.072

0.070

(0.007)

0.057;

0.079

v = 10.052

(0.008)

0.040;

0.067

0.015

(0.004)

0.013;

0.025

v = 20.019

(0.010)

0.014;

0.045

0.068

(0.005)

0.061;

0.077

v = 20.123

(0.020)

0.096;

0.152

0.026

(0.008)

0.020;

0.044

v = 30.141

(0.015)

0.121;

0.168

0.124

(0.006)

0.111;

0.134

v = 30.049

(0.005)

0.041;

0.057

0.056

(0.008)

0.035;

0.059

v = 40.103

(0.019)

0.078;

0.139

0.117

(0.005)

0.106;

0.123

v = 40.034

(0.004)

0.028;

0.041

0.049

(0.006)

0.044;

0.062

v = 50.091

(0.012)

0.078;

0.115

0.100

(0.016)

0.097;

0.154

v = 50.062

(0.006)

0.054;

0.072

0.069

(0.007)

0.055;

0.076

H0 :

αi = α,∀ireject reject reject reject

H0 :

α1 = α5

reject reject do not reject reject

Share of

bilat. comp.

correct

6/10 6/10 4/10 9/10

All observ.0.085

(0.007)

0.073;

0.095

0.087

(0.005)

0.075;

0.092

0.052

(0.003)

0.047;

0.056

0.033

(0.003)

0.031;

0.040

Page 38: On Theories Explaining the Success of the Gravity Equation 1

Figure 1

The Model Identification Issue in the Gravity Equation Context

1.Structural Assumptions:

Identical homotheticdemand, free trade,

and…

Consistent with absence of factor proportions

differences?

2.Implication for Nature of

Trade

consistent with tradein goods with identicalfactor requirements?

3.Import Volume

Prediction

Technology Differences(Ricardian)

Increasing Returns Heckscher-OhlinType of Model

Technology Differences within industry classes

across countries

yes

Increasing Returns at the firm level, monopolistic

competition, product differentiation

yes

Homogeneous goods and Multi-coneworld (Large

Factor Proportions Differences)

no

Intra-industry trade in goods with alternating

technological superiority

yes

Intra-industry trade in product varieties with potentially identical technologies across

countries

yes

Inter-industry trade

no

Gravity Equation

Other Models

Page 39: On Theories Explaining the Success of the Gravity Equation 1

References

[1] Anderson, J. (1979) �A Theoretical Foundation for the Gravity Equation,� American Economic

Review, 69(1): 106-116.

[2] Antweiler, W., and D. Treßer (1997), �Increasing Returns and All That: A View from Trade�,

mimeo, University of Toronto, August.

[3] Bartelsman, E., and W. Gray (2000), NBER Manufacturing Productivity Database, NBER,

Cambridge, MA; http://www.nber.org, March.

[4] Bergstrand, J. (1990), �The Heckscher-Ohlin-Samuelson Model, the Linder Hypothesis and the

Determinants of Bilateral Intra-Industry Trade�, Economic Journal 100: 1216-1229.

[5] Bickel, P., and D. Freedman (1981), �Some Asymptotic Theory for the Bootstrap�, The Annals

of Statistics 9: 1196-1217.

[6] Davis, D., (1995), �Intra-industry Trade: A Ricardo-Heckscher Ohlin Approach�, Journal of

International Economics, 39: 201-226.

[7] Deardorff, A. (1998) �Determinants of Bilateral Trade: Does Gravity Work in a Neoclassical

World?�, chapter 1 in J. A. Frankel (ed.),The Regionalization of the World Economy, Chicago:

The University of Chicago Press.

[8] Eaton, J., and S. Kortum (1997), �Technology and Bilateral Trade�, mimeo, Boston University,

April.

[9] Efron, B. (1982), The Jackknife, the Bootstrap and Other Resampling Plans, Philadelphia:

Society for Industrial and Applied Mathematics.

Page 40: On Theories Explaining the Success of the Gravity Equation 1

[10] Feenstra, R., R. E. Lipsey, and H. P. Bowen (1997), �World Trade Flows, 1970-1992, with

Production and Tariff Data�, NBER Working Paper # 5910 and accompanying CD-ROM,

Cambridge, MA.

[11] Finger, J. (1975), �A New View of the Product Cycle Theory�, Weltwirtschaftliches Archiv :

79-99.

[12] Fontagne, L., M. Freudenberg, and N. Peridy (1997), �Intra-Industry Trade and the Single Mar-

ket: An Empirical Assessment Based on Trade Types�, mimeo, Universite de Paris I Pantheon-

Sorbonne.

[13] Greenaway, D., R. Hine, and C. Milner (1994), �Country-SpeciÞc Factors and the Pattern of

Horizontal and Vertical Intra-Industry Trade in the UK�, Weltwirtschaftliches Archiv 130, Vol.

1.

[14] Grubel, H., and P. Lloyd (1975), Intra-Industry Trade: The Theory and Measurement of Inter-

national Trade in Differentiated Products, London: Macmillan.

[15] Harrigan, J. (1996), �Openness to trade in manufactures in the OECD�, Journal of International

Economics 40: 23-39.

[16] Haveman, J. (1998), Geographic Distance Data, http: //intrepid.mgmt. purdue.edu/Jon/

Data/TradeData.html/Gravity

[17] Helpman, E. (1998), �The Structure of Foreign Trade�, NBERWorking Paper # 6752, October.

[18] Helpman, E. (1987) �Imperfect Competition and International Trade: Evidence from Fourteen

Industrial Countries,� Journal of Japanese and International Economics, 1: 62-81.

Page 41: On Theories Explaining the Success of the Gravity Equation 1

[19] Helpman, E. (1981), �International Trade in the Presence of Product Differentiation, Economies

of Scale, and Monopolistic Competition: A Chamberlin-Heckscher-Ohlin Approach�, Journal of

International Economics 11: 305-340.

[20] Helpman, E. and P. Krugman (1985) Market Structure and Foreign Trade: Increasing Returns,

Imperfect Competition and the International Economy, Cambridge, MA: MIT Press.

[21] Hummels, D. and J. Levinsohn (1995) �Monopolistic Competition and International Trade:

Reconsidering The Evidence,� Quarterly Journal of Economics, 110(3): 799-836.

[22] International Monetary Fund (1995), Directions of Trade Statistics, 1995 yearbook, Washington,

D.C.

[23] Keller, W. (2000), �Geographic Localization of International Technology Diffusion�, NBER

Working Paper # 7509, January.

[24] Kennedy, P. (1992), A Guide to Econometrics, 3rd edition, Cambridge, MA: MIT Press.

[25] Krishna, P., (2000), �Are Regional Trading Partners �Natural�?� , working paper, Brown

University, Providence, RI.

[26] Krugman, P. (1994), �Empirical Evidence on the New Trade Theories: The Current State of

Play�, in New Trade Theories: A Look at the Empirical Evidence, CEPR Conference Report,

pp. 11-31.

[27] Leamer, E. and J. Levinsohn (1995) �International Trade Theory: The Evidence�, in G.M.

Grossman and K. Rogoff (eds.), Handbook of International Economics, Vol. 3, Elsevier, Ams-

terdam.

[28] Markusen, J., and R. Wigle (1990), �Explaining the Volume of North-South Trade�, Economic

Journal 100: 1206-1215.

Page 42: On Theories Explaining the Success of the Gravity Equation 1

[29] Peneder, M. (1999), �Intangible Investment and Human Resources. The newWIFO Taxonomy of

Manufacturing Industries�, WIFOWorking Paper # 114, WIFO (Austrian Institute of Economic

Research), Vienna, May.

[30] Rauch, J. (1999), �Networks versus markets in international trade�, Journal of International

Economics 48: 7-35.

[31] Saxonhouse, G. (1989), �Differentiated Products, Economies of Scale, and Access to the

Japanese Market�, chapter 5 in R. C. Feenstra (ed.), Trade Policies for International Com-

petitiveness, Chicago: University of Chicago Press.

[32] Summers, R., and A. Heston (1991), �The Penn World Table (Mark 5): An Expanded Set of

International Comparisons, 1950-1988�, Quarterly Journal of Economics 106: 327-368.

[33] WIFO (1999), Competitiveness Report 1999. Prepared by WIFO (Austrian Institute of Eco-

nomic Research) for the European Commission, Vienna.

[34] World Bank (2000), World Development Indicators 2000, CD ROM, Washington, D.C.

Page 43: On Theories Explaining the Success of the Gravity Equation 1

A Sensitivity analysis I

Results on the Þrst and more extensive set of sensitivity analyses are presented in Table A1; the

benchmark case is listed again for reference. The speciÞcations are: (1) A higher critical level that

separates the low- and the high-GL samples, with GL = 0.075. This completes the analysis in the

text, where results for the cases of GL = 0.050 (the benchmark case) and GL = 0.033 are discussed.

(2) A speciÞcation that incorporates bilateral distance between countries i and j; it is

M ijv = αv

Y ivY

jv

Y wdij+ εij

v , (8)

for all v, where dij is the log distance between countries i and j. (3) The third speciÞcation includes

a constant term, α0 :

M ijv = α0 + αv

Y ivY

jv

Y w+ εij

v . (9)

For each of these three speciÞcations, four statistics that summarize the performance of the models

are reported: Þrst, the test whether the αv across all Þve classes are equal; second, the test whether

the values for the Þrst (lowest GL− or FDIF−class, respectively) and the Þfth class are equal; third,

how many bilateral comparisons of αv point estimates are correctly predicted for a given model. All

results are based on bootstrapped αv and their symmetric conÞdence intervals; the tests are based

on a signiÞcance level of 10%. Fourth, for the two imperfect specialization models, Table A1 reports

the t-statistic from an OLS regression of γi (for the uni-cone H-O model) and¡1− γi

¢(IRS/uni-cone

model) on the relative capital endowments, Ki

Li . All three speciÞcations conÞrm the benchmark results

for the two hypothesis tests, with one exception: for the multi-cone H-O model in the GL = 0.075

case, we Þnd that α1 is signiÞcantly larger than α5. This does not provide more evidence in support

of this model, though, because it is the opposite of what one would expect if the model performs

well. The number of bilateral comparisons correctly predicted varies somewhat across the three

Page 44: On Theories Explaining the Success of the Gravity Equation 1

alternatives. It is always the case, though, that the multi-cone H-O model performs relatively poor

compared to the other three models, scoring always on fewer than half of the comparisons.

We now turn to the correlation of the relative capital endowments with the country-speciÞc

estimates from the imperfect specialization models. While there are no major differences relative to

the benchmark case for the GL = 0.075 and the distance speciÞcation, including a constant leads to

the following changes. The correlation of (1− γi) and Ki

Li is now positive, which is what one expects

based on the assumption that the differentiated good is capital-intensive. This correlation is not very

strong, though. In consequence, this cannot change our general assessment of the performance of the

IRS/uni-cone model, which has been mixed. For the uni-cone H-O model, the correlation between γi

and Ki

Li is negative, as predicted by the model. However, the substantially lower t-statistic (−2.07,

versus −9.65 for the benchmark case) indicates that the relation between production specialization

and factor proportions is weakened. In none of these three alternative speciÞcations, though, do we

obtain qualitatively different results relative to the benchmark case: for the IRS/uni-cone model,

the predicted correlation is weak or has the wrong sign, whereas for the uni-cone H-O model, the

correlation has the correct sign, and it is generally also quite strong.

While Table A1 shows that controlling for within-class bilateral distance does not affect the results

substantially, we have extended this analysis to control for between-class differences in bilateral

distance. It turns out that the bilateral distance between countries does not vary much across

FDIF classes. However, the bilateral distance across the Þve classes in the high-GL sample varies

substantially; in particular, the median bilateral distance is falling from 6357 kilometers for v = 1

to 1066 kilometers for v = 5. Because it is well-known that, ceteris paribus, there is a higher level

of trade over shorter distance, it is important to see whether our result that αv is rising with v for

the two models in the high-GL sample holds up even when we control for the differences in bilateral

distance across classes. Our analysis (not reported, but available upon request) shows that this is

Page 45: On Theories Explaining the Success of the Gravity Equation 1

the case.

B Sensitivity analysis II: measurement error and other problems

In Table A2, we report the results of further sensitivity analysis. The alternative speciÞcations are:

(1) V = 2 classes instead of Þve; (2) V = 3 classes; (3) Endogenous formation of three GL- and

FDIF -classes as a function of model Þt (denoted as Three-class MLE in Table A2); (4) ModiÞed

Grubel-Lloyd indices: Instead of using the GLij values as computed in part three above, we use mod-

iÞed Grubel-Lloyd values where the idiosyncratic component for a given bilateral pair is eliminated.

This series is constructed by running a linear regression of GLij on country effects, and then using

the Þtted values from this as the modiÞed GL series. The correlation between the two GL series is

equal to 0.70. (5) New dependent variable: The left-hand side variable is now imports divided by the

GDP term,M ij/¡Y iY j/Y w

¢. If simultaneity between imports and GDPs is a major concern or there

is mismeasurement in the GDP terms, this speciÞcation should help detecting it. For the perfect

specialization models, the αv�s are given by the ratio of the mean ofM ij and the mean of Y iY j/Y w.

(6) GDP data from the World Bank�s World Development Indicators (WDI) database: Our pre-

ferred measure is the GDP data from Summers and Heston which are based on PPP-exchange rates,

whereas the WDI GDP data employed here are computed using actual exchange rates. (7) Bilateral

imports data are taken from the International Monetary Fund�s Directions of Trade Statistics (1995

edition). The latter data is based on the reports of national treasuries to the IMF, and differs from

the trade ßows reported by customs officials, which forms the basis of the revised United Nations

trade data of Feenstra et al. (1997).

Table A2 indicates that the results are by and large robust. Generally, the performance of the

multi-cone H-O model is substantially worse than that of the other three models. In only one case�for

the three-class MLE analysis�does the pattern of the estimated αv�s for the multi-cone H-O model

Page 46: On Theories Explaining the Success of the Gravity Equation 1

broadly conÞrm the a priori expectation that αv is rising with v. This assessment is conÞrmed by the

last row in Table A2, which reports the average share of correctly predicted bilateral comparisons

from all eight listed speciÞcations: with 26%, the multi-cone H-O model predicts considerably worse

than the other three models, for which the corresponding values lie higher than 80%.

Finally, a number of unreported sensitivity checks regarding the following have been conducted:

(1) Computing world GDP with Summers-Heston data as the sum of the GDP of all 152 countries

for which the data is available, as opposed to the 58 countries in the sample that we use; (2) using

the WDI�s GDP data on 58 countries to compute world GDP; (3) employing Summers-Heston data

on GDP per capita as an alternative proxy of the relative capital endowment, Ki

Li for the analysis

of the multi-cone and uni-cone H-O models; (4) outlier and inßuential points analysis in form of

(a) dropping each observation one at a time, and re-estimating, and (b) discarding small groups of

observations where the absolute value of the residual exceeds some speciÞed value, and re-estimating

; (5) different as well as simpler heteroskedasticity adjustments (scaling). The analysis of these cases

has always led to results similar to those for the benchmark case.

Page 47: On Theories Explaining the Success of the Gravity Equation 1

Table A1

H0 : αi = α,∀i

H0 : α1 = α5

Share of bilateral comparisons correct

t-statistic∗

SpeciÞcation IRS IRS/uni-cone Multi-cone Uni-cone

Benchmark

reject

reject

8/10

n/a

reject

reject

9/10

-2.12

reject

not reject

2/10

n/a

reject

reject

9/10

-9.65

GL = 0.075

reject

reject

6/10

n/a

reject

reject

9/10

-0.91

reject

reject∗∗

3/10

n/a

reject

reject

8/10

-10.49

With distance

reject

reject

8/10

n/a

reject

reject

6/10

-2.08

reject

not reject

2/10

n/a

reject

reject

9/10

-9.69

With constant

reject

reject

8/10

n/a

reject

reject

7/10

1.00

reject

not reject

3/10

n/a

reject

reject

8/10

-2.07

∗t − statistic : Ratio of slope coefficient to its standard error in OLS regression of (1 − γi) (for the

IRS/uni-cone model) and γi (for the uni-cone H-O model) on Ki

Li .

∗∗ The coefficient α1 is signiÞcantly larger than α5.

Page 48: On Theories Explaining the Success of the Gravity Equation 1

Table A2

Share of bilateral comparisons correct

(out of: x)

SpeciÞcation IRS IRS/Unicone Multi-cone Uni-cone

Benchmark80%

(10)

90%

(10)

20%

(10)

90%

(10)

Two classes (V = 2)100%

(1)

100%

(1)

0%

(1)

100%

(1)

Three classes (V = 3)66%

(3)

66%

(3)

0%

(3)

100%

(3)

Three class MLE100%

(3)

100%

(3)

66%

(3)

100%

(3)

ModiÞed GL60%

(10)

70%

(10)

40%

(10)

70%

(10)

Dep. var.¡M ij/

£Y iY j/Y w

¤¢ 80%

(10)

100%

(10)

30%

(10)

40%

(10)

GDP data from WDI80%

(10)

70%

(10)

30%

(10)

90%

(10)

Imports data from IMF80%

(10)

70%

(10)

20%

(10)

100%

(10)

Average 81% 83% 26% 86%