On the Information Properties of Trading Networks


On the Informational Properties of Trading Networks

Lada Adamic, Celso Brunetti, Jeffrey Harris, and Andrei Kirilenko

September 9, 2009

ABSTRACT

We apply network analysis to trace patterns of information transmission in an electronic

limit order market. If market orders or large executable limit orders are submitted

by informed traders, then resulting star-shaped or diamond-shaped patterns or trading

networks should be associated with large changes in returns, smaller volume, and short

duration between trades. In contrast, the execution of small limit orders from uninformed

traders should result in networks with many triangular and reciprocal patterns and be

associated with smaller changes in returns, larger volume and longer duration between

trades. We compute a time series of trading networks using audit trail, transaction-level

data for all regular transactions in the September 2008 E-mini S&P 500 futures contract,

the cornerstone of price discovery for the S&P 500 Index. We find that network metrics

that quantify the shape of a network are statistically significantly related to returns,

volatility, volume, and duration.

Lada Adamic is with the University of Michigan and the Commodity Futures Trading Commission, Celso

Brunetti is with Johns Hopkins University and the Commodity Futures Trading Commission, Jeffrey Harris is

with the Commodity Futures Trading Commission and the University of Delaware, and Andrei Kirilenko is with

the Commodity Futures Trading Commission. We are grateful to Paul Tsyhura for invaluable assistance with

the retrieval, organization, and processing of transaction-level data. We thank Pat Fishe, Pete Kyle, Antonio

Mele, Han Ozsoylev, and seminar participants at the Chicago Mercantile Exchange, the Commodity Futures Trading Commission, the 2009 Econometric Society Summer Meetings in Barcelona, the Federal Reserve Board

of Governors, NASDAQ, the Securities and Exchange Commission, and the University of Maryland for very

helpful comments and suggestions. The views expressed in this paper are our own and do not constitute an

official position of the Commodity Futures Trading Commission, its Commissioners or staff.



Most securities exchanges around the world are electronic limit order markets. Yet, the

analysis of electronic limit order trading has proven to be very challenging. To quote from

the survey by Parlour and Seppi (2008): "Despite the simplicity of limit orders themselves,

the economic interactions in limit order markets are complex because the associated state

and action spaces are extremely large and because trading with limit orders is dynamic and

generates non-linear payoffs."

In this paper, we apply network analysis to quantify the dynamics of information transmission in an electronic limit order market - a complex dynamic problem. The networks we

analyze are trading networks. We define a trading network as a set of traders engaged in transactions

within a period of time. In graph theoretic terminology, a trading network is a graph,

consisting of a set of nodes and a set of edges. Each node denotes a unique trader and an

edge between two nodes denotes the occurrence of trading between two unique counterparties

within a period of time. The direction of an edge indicates buy or sell transactions between

unique counterparties. Namely, a directed edge from node A to node B indicates that trader A

sold (one time or several times) to trader B during a specified period of time.
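
As a minimal sketch of this definition, the snippet below builds a directed trading network from the trades of one period. The `(seller, buyer)` tuple layout and the name `build_trading_network` are illustrative assumptions, not the format of the audit trail data.

```python
def build_trading_network(trades):
    """Return (nodes, edges) for one period.

    trades: iterable of (seller, buyer) account identifiers.
    A directed edge (seller, buyer) is recorded once, no matter how many
    times that seller sold to that buyer during the period.
    """
    nodes, edges = set(), set()
    for seller, buyer in trades:
        nodes.add(seller)
        nodes.add(buyer)
        edges.add((seller, buyer))
    return nodes, edges


# Hypothetical period: A sells to B twice, then B sells to C.
nodes, edges = build_trading_network([("A", "B"), ("A", "B"), ("B", "C")])
print(sorted(nodes))  # ['A', 'B', 'C']
print(sorted(edges))  # [('A', 'B'), ('B', 'C')]
```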

A trading network formed over a designated number of transactions traces a pattern of

order execution in the limit order book. By analyzing the shape of that pattern, we can quantify the structure of the executed portion of the book. For example, the execution of a market

order will result in a star-shaped pattern with the node that submitted the market order in the

center and nodes that connected to it as the market order marched through the limit order

book in the periphery. This star-shaped network will also not have any triangular or reciprocal

connections. In contrast, the execution of two large limit orders that arrived at different times

will result in a diamond-shaped pattern with the two nodes that submitted large limit orders on

the ends and market makers that provided the immediacy of execution (in small installments)

in the middle. Finally, an execution of a sequence of small limit orders will look different

from the execution of market or large executable limit orders. Some nodes will have more

connections than others, but there will be no central dominant node or a diamond shape. There

will be a number of triangular connections and some pairs of nodes will have edges that go

both ways.
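
The contrast between these patterns can be sketched with a toy example. The two edge lists below are made up for illustration, and `reciprocal_pairs` is a hypothetical helper that counts pairs of nodes with edges going both ways.

```python
def reciprocal_pairs(edges):
    """Count unordered pairs {a, b} that are linked in both directions."""
    edge_set = set(edges)
    return sum(1 for a, b in edge_set if a < b and (b, a) in edge_set)


# Star: one market buy order (node "M") executes against four resting sell orders.
star = [("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")]
# Small limit orders: the same traders alternate between buying and selling.
mixed = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")]

print(reciprocal_pairs(star))   # 0
print(reciprocal_pairs(mixed))  # 1
```

In the star every edge points at the central node, so no pair of nodes trades in both directions, which matches the description of a market order execution above.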

If market orders or large executable limit orders are submitted by informed traders, then

patterns of order execution should be informative beyond transaction prices, volume or trade

duration. Intuitively, if market orders or large executable limit orders are submitted by informed

traders, then resulting star-shaped or diamond-shaped trading networks should be associated

with large changes in returns, possibly smaller volume, and short duration between

trades. Conversely, trading networks that are very dissimilar to a star or a diamond - e.g.,

those with triangular and reciprocal patterns - should be associated with smaller changes in

returns, possibly larger volume and longer duration between trades. Various network metrics that quantify the shape of a network - e.g., the number of central nodes or triangular connections

in a network - should then be statistically related to returns, volatility, volume, and

duration.


In this paper we find evidence that network metrics serve as primitive measures of limit

order book dynamics. Namely, we compute network and financial variables for all regular

transactions that occurred during August 2008 in the nearby E-mini S&P 500 futures contract

and find that network variables strongly Granger-cause intertrade duration and volume.

This suggests that network metrics presage the appearance of this information in duration and

volume. We also find that the network variable that quantifies centrality (or how star-shaped

a pattern is) exhibits a very high contemporaneous correlation with returns. Similarly, the

network variable that quantifies the assortativity of connections (or how diamond-shaped a

pattern is) exhibits high contemporaneous correlation with volatility.

These results are robust with respect to different equity index futures markets (E-mini

Dow Jones and Nasdaq 100), different observation periods (May 2008 and August 2008),

different levels of aggregation (at the broker level and individual trading account level), and

different sampling frequencies (240 and 600 transactions). Correlation results can also be

replicated in a simulated model, confirming that these empirical regularities do not arise by

chance. Furthermore, the results do not depend on any parametric specifications or modeling

assumptions.

This is the first paper to empirically link trading networks that trace the execution of the limit order book with the dynamics of high frequency financial variables - transaction prices,

quantities and duration. As such, it offers a way to analyze the dynamics of the executed

portion of the limit order book from transaction level data.

Empirical network analysis has previously been applied in finance to study investment

decisions and corporate governance.1 In contrast to strategically formed networks where participants

prefer to associate with specific counterparties, the networks we study are trading

networks in which connections are formed as a result of an automated matching algorithm and

reflect the participants' beliefs about the valuation of an asset. These networks are also highly

dynamic: whereas boards of directors and portfolio holdings evolve gradually, over weeks,

months, or years, financial trading networks change second by second.

Our paper proceeds as follows. In Section I, we describe our unique ultra high frequency

data, explain how we chose the sampling frequency, and describe financial variables. In Section II,

we describe network variables. In Section III we outline our conjecture of why patterns

of order execution (trading networks) contain valuable information beyond prices, quantities,

or intertrade duration. In Section IV, we present the empirical properties of network and financial

variables. In Section V, we analyze time series properties and employ Granger-causality

tests between and among network and financial variables. Section VI demonstrates that our

results are robust with respect to different markets, different observation periods, and different

sampling frequencies. In Section VII, we use an agent-based simulation model of trading

networks to further test that our empirical results do not arise by chance. Finally, Section VIII

summarizes our findings and suggests further applications of the network analysis methodology

to trading networks.

1 For a recent survey, see Allen and Babus (2008).

I. Data and Financial Variables

We use audit trail, transaction-level data for all regular transactions in the September 2008

E-mini S&P 500 futures contract. The transactions take place during the month of August

2008 during the time when the markets for stocks underlying the S&P 500 Index are open:

weekdays between 9:30 a.m. EST and 4:00 p.m. EST. The E-mini S&P 500 futures contract

is a highly liquid, fully electronic, cash-settled contract traded on the CME GLOBEX trading

platform. It is designed to track the price movements of the S&P 500 Index - the most widely

followed benchmark of stock market performance. Empirically, the E-mini futures has been

shown to contribute the most to price discovery for the S&P 500 Index.2 Price discovery

typically occurs in the front month contract: in August 2008, the September 2008 futures

contract is the front month, most actively traded contract (see Figure 1).

For each transaction, we utilize the following data fields: date, time (up to the second),

unique transaction ID (to identify consecutive transactions within a second), executing broker,

opposite broker, trading account of the executing broker, trading account of the opposite

broker, buy or sell flag (for the executing broker), price, and quantity.3

Using the audit trail-level of detail, we uniquely identify two trading accounts for each

transaction: one for the broker who booked a buy and the opposite for the broker who booked

a sale. Our dataset consists of over 6 million transactions that took place among 26950 trading

accounts that belong to 346 brokers.

We first test the quality of the data by applying standard filters designed to look for recording

errors and outliers in the price and quantity series.4 We find the data to be of very high quality: the standard filters did not find any data irregularities.

We then determine the optimal sampling frequency by utilizing two techniques designed

to mitigate the effect of market microstructure noise in ultra high-frequency data.5 The first

technique is developed in Andersen, Bollerslev, Diebold and Labys (2000) and is commonly

referred to as the volatility signature plot. According to this technique, the effects of market

microstructure noise in our data are mitigated at the level of 120 transactions or higher.

The second technique is developed by Bandi and Russell (2006) to select the sampling frequency

that minimizes the variance of market microstructure noise. According to the second

technique, the optimal sampling frequency is just below 100 transactions. Neither technique

makes any use of network variables. We adopt a very conservative approach and select 240

transactions as the sampling frequency for our data.6

2 See Hasbrouck (2003).

3 While the data fields are named "executing broker" and "opposite broker", transaction data does not specify which trader initiated a transaction; in fact, for each transaction, there are two mirror entries for the two counterparties, one booking a sale and the other booking a buy.

4 See Hansen and Lunde (2004).

5 For the literature on the subject, see, among others, Zhang, Mykland and Ait-Sahalia (2005), Oomen (2005), Bandi and Russell (2006), Hansen and Lunde (2006), Barndorff-Nielsen et al. (2008).

For each period consisting of 240 transactions (which amount to a total of 25,104 such

periods in our sample), we compute the following financial variables: returns, volatility, intertrade duration, and trading volume. These four variables are typically assumed to both contain

and convey valuable information to market participants about the true (but unobserved)

stochastic price process.7 Intuitively, market participants can learn about the true underlying

price process by observing transaction prices, trading volume, and times between trades.

Transaction prices contain valuable information about the true underlying price process,

but with a possibly significant amount of noise due to, among other reasons, market microstructure

issues (e.g., bid-ask bounce), measurement issues (e.g., time scale, discrete realizations

from a continuous process), and seasonality (e.g., predictable intraday patterns).8

Both returns and their volatility are computed from observed prices and, thus, suffer from

the same noise issues. However, a number of techniques have been developed to reduce the impact of different noise components in ultra high-frequency data. The techniques we use to

deal with measurement errors and reduce market microstructure noise (filters and optimal

sampling frequency) are described just above. In addition, we remove a predictable intraday

seasonal component from the computed raw returns by regressing them on a constant and a

sequence of dummy variables for each half-hour during the trading period. We then use the

unexplained term as our measure of returns.9 We compute returns as differences in log prices

using both the last price to the first price within the same period (close-to-open) and last prices

for consecutive periods (close-to-close). The results reported below refer to the close-to-open

deseasonalized returns, because we believe it to be an intuitively more appealing measure to

compare with network variables (also cleaned of seasonality), which are defined within each

sampling period. Having said that, the main results are not affected by the two different ways

to compute returns nor by the deseasonalization procedure.10
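
The return construction described above can be sketched as follows. This is an illustration under stated assumptions: the function names, the toy numbers, and the integer half-hour bin labels are hypothetical. The sketch relies on the fact that, in a regression on a constant and a full set of half-hour dummies, the fitted value of each observation is the mean of its half-hour bin, so the unexplained term is the value minus that bin mean.

```python
import math
from collections import defaultdict


def close_to_open_return(prices):
    """Log price change from the first to the last trade of one period."""
    return math.log(prices[-1]) - math.log(prices[0])


def deseasonalize(values, half_hour_bins):
    """Residuals of a regression on a constant and half-hour dummies.

    The fitted value of each observation is the mean of its half-hour
    bin, so the residual is the value minus its bin mean.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for value, bin_id in zip(values, half_hour_bins):
        sums[bin_id] += value
        counts[bin_id] += 1
    return [v - sums[b] / counts[b] for v, b in zip(values, half_hour_bins)]


raw = [0.010, 0.030, -0.020, 0.000]  # made-up raw period returns
bins = [0, 0, 1, 1]                  # half-hour bin of each period
clean = deseasonalize(raw, bins)     # approximately [-0.01, 0.01, -0.01, 0.01]
```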

6 In order to ensure the robustness of our results, we repeat our analysis at a higher sampling frequency (see our discussion on robustness later in the paper). The main results are unaffected.

7 There is a vast theoretical and empirical literature on the subject. For a recent summary, see Manganelli (2005).

8 See, for example, Engle (2000).

9 We apply the same technique to all financial and network variables.

10 We also used a Fourier flexible form to remove seasonality. It did not qualitatively change our results.

Volatility is another measure that contains valuable information about the true underlying

price process. As mentioned above, because it is computed from observed prices, it suffers

from the same noise issues as returns. Moreover, volatility suffers from the fact that unlike

prices, volatility is never directly observed. Thus, volatility estimates contain not only the

volatility of the noise, but also a possibly nontrivial factor due to covariance between the

true price process and the noise component.11 We use three measures to estimate volatility

during each period: absolute returns, squared returns, and the price range. Absolute and

squared returns are proxies for the standard deviation and variance of returns, respectively.

The price range is defined as the difference between the high and low price (in logs) during

the period. For the results reported below, we use the price range as the measure of volatility.

Range-based volatility estimators have been shown to be more efficient than return-based

volatility estimators, because they incorporate the full sample path of observed prices (to select

a maximum and a minimum) rather than just open and close prices.12 Our main results are not

affected by the choice of volatility estimator.
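
A short sketch of the three volatility proxies for a single period, assuming the prices of the period are available as a plain list (the function name and the toy prices are hypothetical):

```python
import math


def volatility_measures(prices):
    """Absolute return, squared return, and log price range for one period."""
    ret = math.log(prices[-1]) - math.log(prices[0])
    log_range = math.log(max(prices)) - math.log(min(prices))
    return {"abs_return": abs(ret), "squared_return": ret ** 2, "range": log_range}


# Made-up period: the price moves up and down but ends where it started.
vol = volatility_measures([100.0, 102.0, 99.0, 100.0])
print(vol["abs_return"])  # 0.0
print(vol["range"] > 0)   # True
```

The example shows why the range can be more informative: the two return-based proxies see no movement because the first and last prices coincide, while the range picks up the excursion between 99 and 102.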

Intertrade duration contains valuable information, because the estimation of characteristics

of the true price process obtained during periods of shorter intertrade duration can be more

precise. This would happen irrespective of the reasons for shorter intertrade duration: whether

more frequent trading occurs due to more informed trading or more liquidity trading, more

frequent sampling would result in greater precision with respect to the true process. Having

said that, there is a view that since information is disseminated through trading, the interval

of time between trades can be interpreted as a proxy for the arrival of new information to the

market.13 We compute duration as the time (in seconds) elapsed between the start and end

of the period. We compute three measures of duration: total (unweighted) period duration,

volume weighted period duration, and average for 239 intertrade (within period) durations.

The results reported below are for total period duration. The main results are unaffected by

the way we compute intertrade duration.

Trading volume contains valuable information, because volume together with observed

transaction prices can be driven by a common latent factor often referred to in the literature

as information intensity.14 Intuitively, during periods of higher volume, transaction prices

also exhibit greater precision about the characteristics of the true underlying price process.

We compute volume as the number of contracts both bought and sold during the observation

period.
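
The duration and volume variables can be sketched in the same way. The `(timestamp_seconds, quantity)` layout is an assumption, volume is taken here as the sum of traded quantities, and the volume-weighted duration is omitted because its exact weighting is not spelled out above.

```python
def period_duration_and_volume(trades):
    """trades: chronological list of (timestamp_seconds, quantity).

    Returns (total period duration, average intertrade duration, volume).
    """
    times = [t for t, _ in trades]
    total_duration = times[-1] - times[0]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    average_gap = sum(gaps) / len(gaps)
    volume = sum(q for _, q in trades)
    return total_duration, average_gap, volume


# Made-up period of four trades (a real period would hold 240).
total, average, volume = period_duration_and_volume([(0, 5), (1, 2), (1, 10), (4, 1)])
print(total, volume)  # 4 18
```

With 240 trades per period there are 239 intertrade gaps, so the average intertrade duration is the total period duration divided by 239.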

11 A number of techniques have been developed to estimate volatility components separately by varying the time window. See, for example, Zhang, Mykland, and Ait-Sahalia (2005). Application of these techniques to trading networks will be explored in our future research.

12 For the literature on price range as an efficient estimator of asset price volatility, see, for example, Parkinson (1980), Garman and Klass (1980), Beckers (1983), and Brunetti and Lildholdt (2006). In recent years, the price range has also been used to compute realized volatility in high frequency data. See, for example, Christensen and Podolskij (2009).

13 See, for example, Engle and Russell (1998) and Engle (2000).

14 There is a vast theoretical and empirical literature on the subject. See, for example, Clark (1973), Epps and Epps (1976), Tauchen and Pitts (1983), Admati and Pfleiderer (1988), Easley and O'Hara (1992), and Andersen (1996).

II. Network variables

Quantitative analysis of networks employs a set of standard metrics.15 Network metrics directly

depend on what is defined as a node, an edge, and a network. In our analysis, a node

denotes a trading account, an edge indicates that a transaction has occurred between two trading

accounts, and a trading network is constructed from a specified number of consecutive

transactions (e.g., 240 transactions) among trading accounts in an electronic limit order market.

The formation of a trading network consists of three interconnected steps: (i) the arrival

and accumulation of orders in the limit order book; (ii) the process of matching buy and sell

orders; and (iii) the display of transaction prices for matched orders.16 From the network

perspective, limit orders can be visualized as stubs (ends of edges) attached to a given node.

At a simple level, each stub has a time stamp (for the time it was created), a direction ("in" for

buy and "out" for sell), a price, and a quantity. A node can grow a large number of stubs

subject to the specifics of its trading strategy, the costs of creating, modifying, maintaining,

and cancelling stubs, as well as limits imposed by the stub-matching algorithm.17

Depending on its attributes, each stub is assigned to a specific place in the limit order book.

"In" stubs go into the "Buy Orders" side of the book and "Out" stubs go into the "Sell Orders"

side. On each side of the order book, the stubs are sorted in accordance with the rules of a

matching algorithm, e.g., by price and then the time stamp or by price, quantity and then the

time stamp. The matching algorithm makes edges out of stubs by linking together top stubs

from each side of the book, provided that they agree on a price.18
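
The way a matching algorithm turns stubs into edges can be sketched with a simplified price-time priority matcher. This is an illustration only, not the GLOBEX algorithm: it matches a static batch of stubs rather than a continuous order flow, the `(time, node, price, quantity)` layout is hypothetical, and the rule that a trade prints at the price of the earlier stub is an assumption.

```python
from collections import deque


def match_fifo(buys, sells):
    """Match buy ("in") and sell ("out") stubs by price, then time stamp.

    buys, sells: lists of (time, node, price, quantity).
    Returns edges as (seller, buyer, price, quantity).
    """
    buy_q = deque(sorted(buys, key=lambda s: (-s[2], s[0])))   # highest bid first
    sell_q = deque(sorted(sells, key=lambda s: (s[2], s[0])))  # lowest ask first
    edges = []
    while buy_q and sell_q and buy_q[0][2] >= sell_q[0][2]:
        b_time, b_node, b_price, b_qty = buy_q.popleft()
        s_time, s_node, s_price, s_qty = sell_q.popleft()
        qty = min(b_qty, s_qty)
        price = b_price if b_time < s_time else s_price  # earlier stub sets the price
        edges.append((s_node, b_node, price, qty))
        if b_qty > qty:  # put the unfilled remainder back on top of the book
            buy_q.appendleft((b_time, b_node, b_price, b_qty - qty))
        if s_qty > qty:
            sell_q.appendleft((s_time, s_node, s_price, s_qty - qty))
    return edges


# A marketable buy stub from node "M" arrives after three resting sell stubs.
sells = [(1, "S1", 100.00, 2), (2, "S2", 100.00, 2), (0, "S3", 100.25, 3)]
buys = [(3, "M", 101.00, 5)]
matched = match_fifo(buys, sells)
# [('S1', 'M', 100.0, 2), ('S2', 'M', 100.0, 2), ('S3', 'M', 100.25, 1)]
```

All three resulting edges point at node "M", which is the star-shaped pattern associated with the execution of a single large order.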

Once two stubs are connected by a matching algorithm, two things happen: an edge

between two nodes is created and the associated transaction price at which the match was

achieved is displayed for all nodes to see. After seeing a transaction price or a sequence of

transaction prices, some nodes may decide to modify or remove some existing stubs or grow

new stubs, thus affecting the network formation process.

15 See Newman (2003) for a review of basic network concepts and quantitative indicators.

16 Without loss of generality, the modification and/or removal of unmatched existing orders is viewed as a part of the order arrival process.

17 For example, the Chicago Mercantile Exchange (CME) Group allows most trading firms to grow the following number of "free" stubs (known as "messages") for the products matched via its GLOBEX algorithm during regular business hours: 3,000 plus no more than a ratio of grown stubs to total executed volume for this product. This volume ratio is set at 4 to 1 for E-mini S&P 500 Futures and Spreads, 8 to 1 for E-mini NASDAQ-100 Futures and Spreads, and 25 to 1 for E-mini Dow Futures and Spreads. Stubs grown in excess of 3,000 plus the product-specific volume ratio are penalized by a surcharge fee.

18 For example, on the CME's GLOBEX Trading System, there are three algorithms to match stubs (orders): First In, First Out (FIFO); Pro Rata Allocation (Pro Rata); and Lead Market Maker (LMM). Quoting from the CME documentation available to the public, "FIFO uses price and time as the only criteria for filling an order: all orders at the same price level are filled according to time priority. Pro Rata matches orders based on price, top orders (the first order only that betters a market), and size. The LMM is a firm or trader designated by CME to make a two-sided market in an assigned product. This LMM will have the benefit of certain matching privileges and associated pricing concessions in return for meeting CME determined market obligations."

Empirically, we construct trading networks as follows. At 9:30:00 a.m. EST on August 1,

2008, we start counting transactions in the September 2008 E-mini S&P 500 futures contract.

For each transaction, we know which account bought from or sold to which other account (or

itself), at what price, and what number of contracts. We designate 240 consecutive transactions

as one period. Transactions 1 through 240 mark the first period, transactions 241 through 480 mark the second period, and so on. While for each period, we do not observe the limit order book

itself, we know that transactions occurred because market orders or limit orders were matched

with existing orders in the limit order book. We can then trace the pattern of order execution

or a trading network within each period. Even though the number of transactions for each

period is the same, a pattern for a large market order executed over the period will look very

different compared to a pattern for several smaller limit orders. Metrics that we compute for

each network should be interpreted as quantitative measures of the pattern of order execution

in the limit order book.
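
The period construction can be sketched as a simple split of the chronological transaction list into consecutive blocks. Dropping a trailing incomplete block is an assumption of this sketch, and the integer IDs stand in for real transaction records.

```python
def split_into_periods(transactions, period_size=240):
    """Split a chronological list into consecutive, non-overlapping periods.

    A trailing block shorter than period_size is dropped (an assumption).
    """
    n_full = len(transactions) // period_size
    return [transactions[i * period_size:(i + 1) * period_size] for i in range(n_full)]


transactions = list(range(1, 501))  # stand-in for 500 transaction IDs
periods = split_into_periods(transactions)
print(len(periods))                   # 2
print(periods[0][0], periods[0][-1])  # 1 240
print(periods[1][0], periods[1][-1])  # 241 480
```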

We realize that by taking snapshots of the market at equal transaction time intervals, we

cannot hope to characterize the whole complexity of changes that take place in the underlying limit order book. Specifically, we cannot observe how the revelation of transaction prices

translates into modifications or cancellations of existing orders and submissions of new orders.

Or in terms of the network formation process, we cannot observe how nodes remove some

existing stubs and grow new stubs.

While we know that the process of trading network formation - stubs, edges, transaction

prices, new stubs - goes on continuously, we must designate the number of transactions that

add up to a trading network at a point in time. This designated number of transactions could

be at times too small and at times too large to clearly capture the impact of order execution on

the order book through network analysis within each period. However, as we analyze the time

series properties of trading networks, a statistically significant pattern, if there is one, should emerge. In other words, the approach we take is to compute and analyze network metrics for

a time series of consecutive trading networks rather than those for one aggregate network that

emerges over the whole period.

Given our intuition about how patterns should be related to the dynamics of transaction

prices and quantities, we are interested in network metrics that can measure centrality (or how

star-shaped a network is); assortativity of connections (or how diamond-shaped a network

is); as well as those that can measure reciprocity, triangular connections, and the size of the

network.

The size of the network can be characterized in terms of the total number of nodes, denoted by N, and the total number of edges, denoted by E. From these two quantities we can also

compute the average degree, AVDEG = E/N, the average number of nodes that a node is connected to, and the standard deviation of degree, STDEG, the standard deviation around

this average. These two variables characterize the first and second moment, respectively, of

the unconditional degree distribution.
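
A minimal sketch of these size metrics for one period is given below. Which degree enters STDEG is not spelled out above; the sketch assumes the indegree, whose mean across nodes is exactly E/N, and uses the population standard deviation. The function name is hypothetical.

```python
from statistics import pstdev


def size_metrics(edges):
    """N, E, AVDEG and STDEG for a set of directed (seller, buyer) edges."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    indegree = dict.fromkeys(nodes, 0)
    for _, buyer in edge_set:
        indegree[buyer] += 1
    n_nodes, n_edges = len(nodes), len(edge_set)
    return {
        "N": n_nodes,
        "E": n_edges,
        "AVDEG": n_edges / n_nodes,
        "STDEG": pstdev(list(indegree.values())),  # spread around E/N (assumed indegree)
    }


# Star: four sellers matched with one buyer "M".
metrics = size_metrics([("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")])
print(metrics["N"], metrics["E"], metrics["AVDEG"])  # 5 4 0.8
```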

Node centrality quantifies the position of a specific node on a network. There are several

node centrality measures, the simplest one being degree, or how many edges a node has. In a

directed network, degree can be further separated into indegree and outdegree in accordance

with the number of incoming or outgoing edges of a node.

However, the degree alone may not necessarily capture the role of a node on the network.

For example, a node that has a relatively low degree, but acts as a connector between otherwise

disconnected parts of the network, can be thought of as very central. To that end, there are

measures of centrality that take into account not just the degree of a node, but its position

relative to all other nodes in the network. For example, betweenness measures how many

other pairs of nodes would have to go through the given node in order to reach one another

in the shortest number of hops. Similarly, closeness measures how many hops away a node is

on average from every other node in the network. Figure 2 illustrates different node centrality

measures.
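
As an illustration of a position-based measure, the sketch below computes the average number of hops from one node to every node it can reach, using breadth-first search. Treating edges as undirected and the name `average_hops` are assumptions of this sketch; the source node must appear in at least one edge.

```python
from collections import deque


def average_hops(edges, source):
    """Average shortest-path length from source to all nodes it can reach."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for other in neighbours[node]:
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    hops = [d for node, d in distance.items() if node != source]
    return sum(hops) / len(hops)


star = [("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")]
print(average_hops(star, "M"))   # 1.0
print(average_hops(star, "S1"))  # 1.75
```

The central node of the star is one hop from everyone, while a peripheral node needs two hops to reach the other peripheral nodes.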

Node centrality is a critical input into the calculation of network centralization, a measure that characterizes the inequality of connectivity among the nodes. In order to capture this

inequality in connectivity within the network (whether there are a small number of nodes with

high centrality and a large number of nodes with low centrality), we compute a centralization

measure defined as centralization Gini:

$$G = \frac{\sum_{r=1}^{N} (2r - N - 1)\, k_r}{N E}, \qquad (1)$$

where $k_r$ is the centrality measure of the node whose rank order number is $r$ (nodes ranked from lowest to highest centrality).

Taking node i's degree as its centrality measure, we use the formula above to compute separate centralization measures for indegree and outdegree: incentralization, INCEN, and

outcentralization, OUTCEN, respectively. By construction, these measures are 0 if every node

has the same number of (incoming or outgoing) edges, and positive with increasing inequality:

e.g., one node has all the incoming (outgoing) edges, the others have no incoming (outgoing)

edges.

We also compute a combined measure of incentralization and outcentralization: CEN = INCEN - OUTCEN. Intuitively, since we use a node's degree as a measure of its centrality, the

difference between in and out centralization measures can be interpreted as the presence of a

dominant buyer or seller. CEN will be equal to 1 if there is a dominant buyer and -1 if there is

a dominant seller.
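
The centralization measures can be sketched as follows, assuming the Gini form with nodes ranked from lowest to highest degree and the sum of ranked degrees normalized by N times E. The function names are hypothetical, and the edge direction follows the convention above (an edge runs from seller to buyer).

```python
def gini_centralization(centralities, n_edges):
    """Centralization Gini: sum over ranks r of (2r - N - 1) * k_r, divided by N * E."""
    ranked = sorted(centralities)  # rank 1 = lowest centrality
    n_nodes = len(ranked)
    weighted = sum((2 * r - n_nodes - 1) * k for r, k in enumerate(ranked, start=1))
    return weighted / (n_nodes * n_edges)


def centralization(edges):
    """Return (INCEN, OUTCEN, CEN) for directed (seller, buyer) edges."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    indegree = dict.fromkeys(nodes, 0)
    outdegree = dict.fromkeys(nodes, 0)
    for seller, buyer in edge_set:
        outdegree[seller] += 1
        indegree[buyer] += 1
    n_edges = len(edge_set)
    incen = gini_centralization(indegree.values(), n_edges)
    outcen = gini_centralization(outdegree.values(), n_edges)
    return incen, outcen, incen - outcen


# Dominant buyer "M" matched with four sellers.
incen, outcen, cen = centralization([("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")])
# incen is about 0.8, outcen about 0.2, cen about 0.6
```

Under these assumptions a star with one buyer and N - 1 sellers gives INCEN = (N - 1)/N and OUTCEN = 1/N, so CEN = (N - 2)/N, which approaches 1 as the star grows. Reversing every edge (a dominant seller) flips the sign.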

To measure whether a node is both a buyer and a seller, we compute the Pearson correlation

coefficient between the indegree and the outdegree of each node, INOUT. A positive

correlation indicates that nodes with many in edges also have many out edges - i.e., they are both

buying and selling.
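
A sketch of INOUT, using a hand-written Pearson correlation so that the block has no dependencies. Returning NaN when either degree sequence has no variation is a choice made for this sketch, and both edge lists are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def inout(edges):
    """Correlation between indegree and outdegree across nodes."""
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    indegree = dict.fromkeys(nodes, 0)
    outdegree = dict.fromkeys(nodes, 0)
    for seller, buyer in edge_set:
        outdegree[seller] += 1
        indegree[buyer] += 1
    return pearson([indegree[n] for n in nodes], [outdegree[n] for n in nodes])


star = [("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")]
two_sided = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B"), ("D", "B")]
inout_star = inout(star)            # about -1: the buyer never sells
inout_two_sided = inout(two_sided)  # positive: node B both buys and sells
```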

We also calculate statistical properties of nodes one edge away from each individual node

or connectivity of node B conditional on it being connected to node A. Assortativity in networks

can represent any tendency of "like" to be connected with "like" for any node property

(see Newman (2002)), but here we will apply it to degree. Large degree nodes (i.e., those

with many edges) may connect more frequently to other large degree nodes or they may tend to connect to small degree nodes. Two large degree nodes connecting to a number of small

degree nodes between them will result in a diamond-shaped network.

One way to measure assortativity is by the Pearson correlation coefficient $\rho(k_i, k_j)$ for all edges $e_{ij}$. When the edges are directed there are four possible assortativity measures:

$\rho(k_i^{in}, k_j^{in})$, $\rho(k_i^{in}, k_j^{out})$, $\rho(k_i^{out}, k_j^{in})$, and $\rho(k_i^{out}, k_j^{out})$, corresponding to the four conditional degree distributions.

From these four correlation coefficients, we construct the following compound measure,

that we call assortativity index for directed networks:

$$AI = \frac{1}{4}\left[\rho(k_i^{in}, k_j^{in}) + \rho(k_i^{out}, k_j^{out}) - \left(\rho(k_i^{in}, k_j^{out}) + \rho(k_i^{out}, k_j^{in})\right)\right], \qquad (2)$$

computed over all edges $e_{ij}$.

Figure 3 illustrates network assortativity. For example, in the context of trading networks,

the coefficient $\rho(k_i^{out}, k_j^{in})$ measures the correlation between the number of unique buyers (connected

by an outward pointing edge) a seller is selling to (denoted by $k_i^{out}$) and the number of

unique sellers those buyers are buying from (denoted by $k_j^{in}$). A negative $\rho(k_i^{out}, k_j^{in})$ would

mean that when a seller has matched to many buyers, those buyers are likely to be transacting

with few or no other sellers.
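
The four edge-level degree correlations and a compound index can be sketched as below. The sign convention is an assumption of this sketch: the index is taken as one quarter of the two same-direction correlations (in-in, out-out) minus the two cross-direction correlations (in-out, out-in). The function names and the diamond-shaped toy network are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def assortativity_index(edges):
    """Assumed form: (rho_in_in + rho_out_out - rho_in_out - rho_out_in) / 4."""
    edge_list = sorted(set(edges))
    k_in, k_out = {}, {}
    for i, j in edge_list:
        k_out[i] = k_out.get(i, 0) + 1
        k_in[j] = k_in.get(j, 0) + 1

    def rho(degree_i, degree_j):
        # Correlate a degree of the seller i with a degree of the buyer j over edges.
        xs = [degree_i.get(i, 0) for i, _ in edge_list]
        ys = [degree_j.get(j, 0) for _, j in edge_list]
        return pearson(xs, ys)

    same = rho(k_in, k_in) + rho(k_out, k_out)
    cross = rho(k_in, k_out) + rho(k_out, k_in)
    return (same - cross) / 4


# Diamond: large seller "S" and large buyer "B" cross via three intermediaries.
diamond = [("S", "M1"), ("S", "M2"), ("S", "M3"), ("M1", "B"), ("M2", "B"), ("M3", "B")]
ai = assortativity_index(diamond)  # about 1.0 under the assumed sign convention
```

In this diamond the seller's many buyers each buy from a single seller, so the out-in correlation is negative, consistent with the interpretation given above.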

We also measure if nodes one edge away from each individual node also form particular

(e.g., triangular) patterns. Transitivity, also termed clustering, measures the prevalence of

closed triads in the network. In this paper, we use the global clustering coefficient, denoted by

CC, as a measure of transitivity:19

$$CC = \frac{3 \times \text{number of triangles in the network}}{\text{number of connected triples of vertices}}, \qquad (3)$$

where a "connected triple" means three nodes A-B-C such that there is an edge A-B and

an edge B-C.20 The prevalence of specific directed triads can be used to conduct a motif

analysis on a directed network.21

19 See Newman (2003).
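
A sketch of the global clustering coefficient with edges treated as undirected. The function name and the toy edge lists are illustrative; self-trades are ignored in this sketch.

```python
from itertools import combinations


def global_clustering(edges):
    """3 * triangles / connected triples, with edges treated as undirected."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-trades
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # A connected triple is a centre node together with two of its neighbours.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in neighbours.values())
    # A closed triple has its two end nodes linked as well; each triangle is
    # counted three times here, once per centre node.
    closed = sum(
        1
        for nb in neighbours.values()
        for x, y in combinations(nb, 2)
        if y in neighbours[x]
    )
    return closed / triples if triples else 0.0


triangle_plus_tail = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
star = [("S1", "M"), ("S2", "M"), ("S3", "M"), ("S4", "M")]
print(global_clustering(triangle_plus_tail))  # 0.6
print(global_clustering(star))                # 0.0
```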

Finally, in addition to regularities in connections between pairs and triplets of nodes, a network

as a whole may be composed of several separate connected components. A connected

component is a maximal subset of nodes such that any node can be reached from any other

node by traversing edges. Within a strongly connected component any node can be reached from any other by following directed edges. Figure 4 illustrates the largest strongly connected

component (LSCC). Once the largest strongly connected component is identified, we can measure

the global network structure by computing LSCC, the proportion of the network occupied

by this component.

Intuitively, the largest strongly connected component can only occupy a significant portion

of the network if many nodes have both incoming and outgoing edges during the same time

period, and there are cycles (the simplest of which are reciprocal ties and the triads mentioned

above) within the network. In other words, a large strongly connected component is much

more likely to emerge as a result of a large number of limit orders than one large market order.

III. Conjecture: Trading networks contain information

We believe that network analysis is very useful for quantifying patterns of information transmission in an electronic limit order market. Specifically, we conjecture that orders that contain information about the fundamental value of an asset, as well as demand and supply of liquidity for this asset, should have particular (star-shaped or diamond-shaped) execution patterns. In contrast, orders that have little such information should exhibit very different patterns; namely, they should contain many triangular and reciprocal connections.

To illustrate this intuition, we show in Figure 5 three sample networks, with their network

and financial statistics. The sample networks are chosen to display extremes in centralization

(CEN), assortativity index (AI), and the number of edges (E).

The left column of Figure 5 presents a star-shaped network with one dominant buyer

matched with many sellers. This pattern is consistent with the execution of a market order.

It has a centralization coefficient close to one, high standard deviation of degree and large

assortativity. This network is associated with a positive return and low period duration.

The center column of Figure 5 presents a network with four large traders forming diamond-shaped patterns as their limit orders are being crossed via intermediaries. A high assortativity index means that large traders are mostly matched with many small traders rather than with each other. A small largest strongly connected component suggests that large traders are not trading with each other: rather, they buy from or sell to small traders who quickly trade with another large trader. This pattern is associated with negative returns, as well as with higher volume and volatility.

20 For the clustering coefficient, we treat the edges as undirected, although directed clustering coefficients can also be defined. See Fagiolo (2007).

21 On motif analysis, see Milo et al. (2004).

The right column of Figure 5 presents a fairly uniform network with many buyers and sellers of various sizes. This situation is reflected in a pattern of connections that exhibits network parameters close to their sample averages, with the exception of the number of edges, which reflects a larger and more interconnected trading network. The financial variables estimated from transaction prices (the rate of return and volatility) are very close to their averages. Volume is somewhat above its sample average and period duration is quite high.

The examples above provide illustrative evidence in support of our intuitive conjecture. Our next step is to take our conjecture to the data, a time series of over 25,000 trading networks, and to show that metrics of order execution patterns are statistically related to returns, volatility, volume, and duration.

IV. Empirical properties of trading networks

A. Summary statistics for the financial variables

Table I presents summary statistics for the financial variables. All financial variables in our

sample are stationary. Standard ADF tests reject the null hypothesis of non-stationarity (p-

value = 0.00) for all financial variables. For the rate of return, the standard deviation dominates

the mean, as expected. In addition, returns exhibit positive skewness. The range has a period average of 0.04 percent, which corresponds to an annualized average volatility of 24 percent, in line with estimates reported in the literature.22 Intertrade period duration in this very liquid market ranges from zero to 176 seconds. In our sample, 240 transactions on average

occur every 19.5 seconds. Finally, volume, volatility, and duration are highly persistent, as

evidenced by autocorrelation coefficients at lags 1, 5, and 10.

22Of the three measures of volatility - absolute returns, squared returns, and range - the range exhibits the

lowest standard deviation (results available upon request). This is in line with the efficiency results of Parkinson

(1980).

B. Summary statistics for the network variables

Table II presents summary statistics for the network variables.23 All network variables under study are stationary. However, Jarque-Bera tests for individual network variables reject the null of normality at standard significance levels. Finally, all network variables exhibit persistent autocorrelation functions.
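For reference, the Jarque-Bera statistic combines the sample skewness $S$ and kurtosis $K$ of a series of length $n$ and is asymptotically chi-square distributed with two degrees of freedom under the null of normality. The worked number below is only a rough illustration: it takes the CEN skewness and kurtosis from Table II and assumes $n \approx 25{,}000$ periods, since the text reports only "over 25000" networks.

```latex
JB = \frac{n}{6}\left( S^{2} + \frac{(K-3)^{2}}{4} \right)
   \;\overset{a}{\sim}\; \chi^{2}_{2} \quad \text{under } H_{0}

% Illustration for CEN, assuming n \approx 25{,}000:
JB \approx \frac{25{,}000}{6}\left( (-0.0129)^{2} + \frac{(2.8782-3)^{2}}{4} \right) \approx 16.1
```

Under this assumption, even a series whose skewness and kurtosis look close to the normal values of 0 and 3 exceeds the 5 percent critical value of about 5.99, because the sample is very large.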

Our combined measure of the difference between in-centralization and out-centralization, CEN = 0.00 ± 0.23 (mean ± standard deviation), is on average equal to zero. This is partly due to the fact that the shapes of the distributions of incentralization and outcentralization are very similar, which indicates that both the buy side and the sell side of the limit order book are executed in a symmetrically balanced way.

The distributions of incentralization and outcentralization are also strongly negatively cor-

related. This indicates that when there are a few dominant buyers, the sellers tend to have

more even numbers of trading partners, and conversely, when there are a few dominant sell-

ers, the buyers tend to be more equal in trading partners. In other words, a large market order

or executable limit order to buy (sell) is likely to be executed against several small limit orders

on the sell (buy) side of the book.

These trading networks are highly dynamic, and the most central node in one period may have few or no edges in the next. In other words, it is quite unlikely that a market order or an executable limit order is so large that it spans several consecutive networks. Of the nearly 27,000 trading accounts that bought or sold S&P 500 E-mini futures contracts during August 2008, 17 accounts were extremely active, accounting for nearly 40 percent of all transactions. On the other hand, 85 percent of trading accounts accounted for only 10 percent of all transactions. The correlation in indegree from period to period for the individual 17 most active trading accounts varied between [0.28] and [0.64], while for the bottom 85 percent of accounts, the correlation was essentially zero.

At the level of the individual trader, the indegree is slightly correlated with outdegree (INOUT = 0.08 ± 0.22), suggesting that traders who have more buying interactions will tend to have a slightly higher number of selling interactions during the same transaction window.

Assortativity correlations in trading networks are on average negative. There is a moderate

negative correlation between the number of buyers a seller is selling to and the number of

sellers that a buyer is buying from. This relationship stems in part from the skewed degree

distribution. Most buyers have low indegree, therefore a seller with high outdegree must be

selling to many buyers with low indegree. Similarly, most sellers have low outdegree, and a

buyer with high indegree must be buying from many of them. The overall assortativity index, which is computed by assigning equal weights to the four assortativity correlations, is slightly positive: AI = 0.09 ± 0.07. This means that on average, when a seller (buyer) is matched with many buyers (sellers), they are just as likely to be transacting with many other sellers (buyers) as with a few or no sellers (buyers). A deviation from this pattern indicates that one buyer or one seller is dominant.

23 While the table focuses on eight network variables, we have also analyzed a wide range of other network metrics, including Freeman centralization, reciprocity, individual pairwise assortativity metrics, and alternate centrality measures, including directed and undirected betweenness, closeness, and PageRank.

The global clustering coefficient, or the ratio of observed triangular connections among nodes to all possible triangular connections, is 0.04 ± 0.03, nearly one standard deviation below the average clustering coefficient for randomized graphs with the same assignments of degrees. In other words, there is no tendency for the traders to cluster together.

Similar to the clustering coefficient, the size of the largest strongly connected component (0.04 ± 0.04) does not deviate from what would be expected for networks of that size, density, and distribution of in and out degrees. But as we will see in the following section, it does strongly correlate with density and other network variables.

V. Empirical analysis of trading networks

A. Correlations

Table III reports contemporaneous correlations among network variables. According to Ta-

ble III, the difference between buyer and seller centralization (CEN) is not correlated with any

other network variable. Average degree (AV DEG) is positively correlated with the standard

deviation of degree (SDDEG). This property is typical for a power law degree distribution, in

which a few dominant nodes have high degree (i.e., few accounts trade with a large number of

counterparties) and a large number of nodes have very low degree. As a result, for most power

law distributions, central moment estimators, like average degree and the standard deviation

of degree, grow with the sample size. Correlations between a node's indegree and outdegree (INOUT) boost the variance in undirected degree (SDDEG). This means that large buyers

are also large sellers, while small buyers are also small sellers. By construction, the assor-

tativity index (AI) is highest when high degree buyers are matched with low degree sellers.

This is more likely to occur when the network is less dense and the largest strongly connected

component (LSCC) is small.

Table IV presents correlations between financial and network variables. The return process

exhibits 68 percent correlation with centralization, but no other network variables. Intuitively,

a large positive CEN results from a large market order or executable limit order to buy. This

order is likely to push prices up, which results in a positive rate of return. The same intuition holds for a market order to sell.

Volatility is positively correlated with all the network variables, with the exception of the

assortativity index (AI). This means that when high degree buyers are matched with low

degree sellers (like in the case of a market order), volatility is somewhat smaller compared to

a situation when low degree buyers are matched with low degree sellers (several limit orders).

Intuitively, in a deep and liquid market like the E-mini S&P 500 futures, an incoming market

order has a significant chance to be executed against several limit orders sitting at (or near) the

same tick, resulting in high assortativity and centralization, but very little price impact and, hence, a low high-frequency volatility estimate. At the same time, intermediated execution of two large limit orders from both sides of the limit order book will result in a positive high-frequency estimate of volatility, if only due to the bid-ask bounce.

Duration is positively correlated with the average degree and in-out degree correlation and negatively correlated with the standard deviation of degree and the assortativity index. Intuitively,

a longer time interval between trades is associated with trades that are distributed more evenly

among traders, increasing the average degree and decreasing the standard deviation of degree

and the assortativity index. Over longer time intervals, it is also more likely that a node that

has a high indegree also has a high outdegree (it has time to be both a buyer and a seller),

which results in a positive in-out degree correlation.

B. Granger Causality

We next test for Granger causality in the context of Vector Autoregressive (VAR) models.

Since the variables exhibit heteroskedasticity and serial correlation, we estimate VAR models

using the generalized method of moments (GMM) and Newey-West robust standard errors.

We first consider a VAR model with eight network variables. According to the Akaike

Information Criterion, the system that includes all eight network variables has an optimal lag-

length of twenty.24 However, the results of the model with eight network variables (available

from the authors upon request) show strong evidence of feedback effects among the network

variables, i.e., network variables tend to Granger cause each other. In light of this, we use

standard tests to reduce the model to four network variables.25

Tables V-IX provide the results (p-values) of Granger-non-causality tests. The last column and the last row of each table are labelled "All". In the last column we test whether each variable is Granger-caused by all the other variables in the system, while in the last row we test whether each variable Granger-causes any other variable in the system. The null hypothesis is that of Granger-non-causality. Therefore, a p-value greater than five percent indicates a failure to reject the null.
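For readers less familiar with the test, the null hypothesis can be written in the standard textbook VAR notation. The symbols below ($y_t$, $c$, $A_k$, $p$, $\varepsilon_t$) are generic notation chosen for this illustration and are not taken from the paper's own specification.

```latex
y_t = c + \sum_{k=1}^{p} A_k \, y_{t-k} + \varepsilon_t ,
\qquad y_t \in \mathbb{R}^{m}

% Variable j does not Granger-cause variable i:
H_0 : \; [A_k]_{ij} = 0 \quad \text{for all } k = 1, \dots, p

% "All" test for equation i (variable i is not Granger-caused by any other variable):
H_0 : \; [A_k]_{ij} = 0 \quad \text{for all } j \neq i \text{ and all } k = 1, \dots, p
```

A small p-value rejects these zero restrictions, which is read as evidence that lagged values of variable $j$ (or of all other variables jointly) help predict variable $i$.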

Table V presents p-values for the Granger-non-causality test among three sets of four network variables (three panels). Panel 1 shows that centralization (CEN) neither Granger-causes the other network variables (p-value = 0.5556) nor is Granger-caused by them (p-value = 0.2387). On the other hand, Panels 2 and 3 show that the remaining network variables Granger-cause each other.

24 Throughout the analysis we use both Akaike and Schwarz Information Criteria.

25 Standard test statistics are available from the authors upon request.

Next, we test for Granger-causality between one financial variable and four network vari-

ables. Using standard techniques, we select groups of network variables that reflect degree

properties at the level of a single node (e.g., centralization, standard deviation of degree, and

in and out degree correlation), two nodes linked by an edge (assortativity index), connected

triples of nodes (clustering coefficient), and the connectivity of the whole network (the proportion of nodes in the largest strongly connected component).

Table VI presents p-values for the Granger-non-causality test for the rate of return and

network variables. We find that the return process is both Granger-caused and Granger-causes

network variables. The network variable that has a strong impact on returns is centralization.

This is in line with the correlation results in Table IV.

Table VII reports Granger-non-causality test results for the volatility process and network

variables. Similarly to the return process, we find a feedback effect between volatility and

network variables: volatility is both Granger-caused by network variables and Granger-causes

them.

Table VIII reports Granger-non-causality test results for intertrade duration and network

variables. We find that duration is Granger-caused by network variables (p-value = 0.0000),

but does not Granger-cause network variables (p-value = 0.1811).

Finally, Table IX presents p-values for the Granger-non-causality test for volume and net-

work variables. The results show that volume is Granger-caused by network variables (p-value

= 0.0000) but does not Granger-cause network variables (p-value = 0.3662).

What are the possible reasons for the presence of feedback effects in Granger causality test

results for the rate of return and volatility (vis-a-vis the network variables) and the absence

of such effects for volume and duration? We believe that there is one fundamental reason for these empirical findings: our results for the price-based variables are polluted by noise.

Unlike volume, duration, and all the network variables, which we can measure directly, the

rate of return and volatility are estimated from transaction prices. As a result, the variables

we call the rate of return and volatility are noisy proxies for the unobservable characteristics

of the true price process. The level of noise at this very high frequency is so high that it is

very hard to effectively measure the interaction between network variables and the true price

process.

VI. Robustness

Our results are robust with respect to different markets, different observation periods, different levels of aggregation, and different sampling frequencies. The results we report are for the E-mini S&P 500 futures for the month of August 2008 (over 6 million transactions). The results remain qualitatively the same when we repeat all procedures for the same market for the month of May 2008 (5.15 million transactions) at the sampling frequency of 240 transactions. The main results also remain the same for the sampling frequency of 600 transactions. Namely, both correlations and Granger-causality results hold. The results are also the same whether we construct networks at the broker level or trading account level. Finally, the results remain the same for other stock index futures markets, as confirmed by the analysis of the E-mini Nasdaq 100 (2.3 and 2.8 million transactions in May 2008 and August 2008, respectively) and E-mini Dow Jones (1.8 and 2.4 million transactions in May 2008 and August 2008, respectively) futures contracts at the sampling frequencies of 240 and 600 transactions.

VII. Agent-based simulation model of trading networks

In order to further test that our empirical results do not arise by chance and to examine the source of the high correlation between network centralization and returns, we construct an agent-based simulation model of trading networks. Figure 6 presents a snapshot of the simulation model.

The model setup is as follows. There is a fixed number of traders. At each interval of time, a new buy or sell order is assigned at random to one of the traders. The order arrival time is distributed according to a Poisson distribution. The order has an equal (50 percent) probability to either buy or sell a single quantity at a single price. The quantity is lognormally distributed.

For a sell order the price is set a small fixed number above the last transaction price plus

a lognormally distributed random variable with a mean of zero. For a buy order, the price is

set a small fixed number below the last transaction price plus a lognormally distributed ran-

dom variable with a mean of zero. Setting the order a small number above (below) the last

transaction price for a buy (sell) order indicates a willingness on the part of the trader to buy

(sell) at a slightly higher (lower) price than the market is currently trading at. The lognormally

distributed, zero mean random variable added to the order price represents the heterogeneity of beliefs. This parametric specification is consistent with a price function that arises in equilibrium under the assumption of heterogeneous beliefs about the true price process.26

Each incoming order is matched against previously placed orders by an automated matching algorithm based on price and time priority. If a match is made, an edge is created between two traders.27 Immediately following a match, order quantities are updated for the two matched traders. If the newly placed order is only partly fulfilled, the algorithm attempts to match the remaining quantity against another, more recent previously placed order. Orders are set to expire after a fixed amount of time from when they are first created, at which point they are cancelled and withdrawn from the market.

26 See, for example, Scheinkman and Xiong (2003).

27 Since the orders are randomly assigned, it is possible that an edge connects two orders submitted by the same trader.
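The matching step can be sketched as a small price-time priority book. This is a simplified illustration and not the authors' NetLogo model: the class and field names are hypothetical, trades are assumed to execute at the resting order's price, and order generation and order expiry are left out.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Order:
    trader: int
    side: str      # "buy" or "sell"
    price: float
    qty: int
    seq: int = 0   # arrival sequence number, used for time priority

class Book:
    def __init__(self):
        self.resting = {"buy": [], "sell": []}
        self._seq = count()

    def submit(self, trader, side, price, qty):
        """Match an incoming order; return a list of (seller, buyer, price, qty)."""
        order = Order(trader, side, price, qty, next(self._seq))
        other = "sell" if side == "buy" else "buy"
        trades = []
        while order.qty > 0 and self.resting[other]:
            # Best resting price first; ties are broken by earliest arrival.
            if side == "buy":
                best = min(self.resting[other], key=lambda o: (o.price, o.seq))
                crosses = best.price <= order.price
            else:
                best = max(self.resting[other], key=lambda o: (o.price, -o.seq))
                crosses = best.price >= order.price
            if not crosses:
                break
            fill = min(order.qty, best.qty)
            if side == "buy":
                seller, buyer = best.trader, order.trader
            else:
                seller, buyer = order.trader, best.trader
            trades.append((seller, buyer, best.price, fill))  # one network edge
            order.qty -= fill
            best.qty -= fill
            if best.qty == 0:
                self.resting[other].remove(best)
        if order.qty > 0:
            self.resting[side].append(order)  # unfilled remainder rests in the book
        return trades

book = Book()
book.submit(1, "sell", 100.0, 2)
book.submit(2, "sell", 100.0, 3)
book.submit(3, "sell", 101.0, 5)
print(book.submit(9, "buy", 100.5, 4))
# [(1, 9, 100.0, 2), (2, 9, 100.0, 2)]
```

In the example a single buy order is filled against two resting sell orders, so trader 9 receives two incoming edges in the resulting network.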

Using the resulting simulated transactions, we construct trading networks using a procedure identical to the one used for the empirically observed data. Namely, we simulate 6 million transactions, segment the data into periods of 240 consecutive transactions, and compute network and financial statistics for each period. Just as in actual trading, a single order may be reflected in multiple transactions in adjacent time windows.
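The segmentation step can be illustrated with a short sketch. It assumes, purely for illustration, that each transaction is a tuple (seller, buyer, price, quantity) and that an incomplete final block is discarded; the function names and the statistics returned are hypothetical and cover only a few of the variables discussed in the paper.

```python
def windows(trades, size=240):
    """Split trades into consecutive, non-overlapping blocks of `size` trades.

    An incomplete block at the end is dropped (an assumption of this sketch).
    """
    return [trades[k:k + size] for k in range(0, len(trades) - size + 1, size)]

def network_stats(block):
    """Basic network and financial statistics for one block of trades."""
    edges = {(seller, buyer) for seller, buyer, *_ in block}  # unique directed edges
    nodes = {n for edge in edges for n in edge}
    volume = sum(qty for *_, qty in block)
    return {"E": len(edges), "nodes": len(nodes), "volume": volume}

# Hypothetical trades: ten one-lot transactions among three sellers and two buyers.
trades = [("s%d" % (k % 3), "b%d" % (k % 2), 100.0, 1) for k in range(10)]
for block in windows(trades, size=4):
    print(network_stats(block))
```

With a block size of 4, the ten toy trades give two complete blocks and the last two trades are left out; with the paper's block size of 240 the same logic yields one network per 240 transactions.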

This setup allows for a possibility of heterogeneous beliefs about the price process, but

imparts no intentionality or memory upon the traders. It allows us to discern which features

of the trading networks are due to the arrival of information to the market, and which may be

due to strategic behavior on behalf of the traders.

We find that a sequence of orders with randomly distributed prices and quantities results in network and financial variables that are very similar to those obtained from the futures market data, but with the notable (and anticipated) exception of a dynamic structure. Specifically, we find that contemporaneous correlations among the network variables, as well as correlations among network and financial variables, are very similar to those we estimate from the actual market data.28 This confirms that our empirical results do not arise by chance.

We also use the agent-based simulation model to investigate possible sources of high cor-

relation between network centralization and returns. By observing the simulation, we find

that high correlation between centralization and returns reflects the network mechanics of

the information arrival process: a trader submitting a large buy order at a high price will be

matched against several existing sell orders, giving that trader a high indegree, and increasing

the centralization of the network. At the same time, because a greater number of sell orders was matched, the market price goes up, yielding a positive rate of return. Moreover, we find that for the simulated data (but not market data), centralization and other network variables Granger-cause returns, but not vice versa.29

At the same time, we also find that, as expected in a model with no intentionality or memory, Granger-causality tests among network variables and volatility, volume, and duration

yield very weak results: feedback effects, lack of significance or very poor fit. This suggests

that the Granger-causality results that we find in the futures markets data arise as a result of

the behavior of traders and are not a statistical artifact.

28 Results are available from the authors upon request.

29 We use a lag length of order one in the VAR. As expected from the lack of dynamics in the simulated data, the Akaike information criterion selects a lag length of order one in the VAR specification and the Schwarz information criterion selects a lag length of order zero.

VIII. Concluding remarks

We use network analysis to examine information transmission in an electronic limit order

market. We conjecture that orders that contain information about the fundamental value of

an asset, as well as demand and supply of liquidity for this asset, should have particular (star-shaped or diamond-shaped) execution patterns. In contrast, orders that have little such information should exhibit very different patterns; namely, they should contain many triangular and reciprocal connections.

We test this conjecture by computing a time series of trading networks from audit trail,

transaction-level data for all regular transactions in the September 2008 E-mini S&P 500 fu-

tures contract during the month of August 2008 (over 6 million transactions).

We find that star-shaped or diamond-shaped patterns, characterized by high centralization or assortativity and low transitivity (clustering coefficient) and connectedness, are positively related to returns and volume and negatively related to duration and volatility. In contrast, less heterogeneous patterns, those with centralization and assortativity close to zero (their averages), high transitivity, and high connectedness, are associated with average returns and volatility, and positively related to volume and duration.

Moreover, we find that network variables strongly Granger-cause intertrade duration and

volume, but not the other way around. This suggests that patterns of order execution presage

changes in duration and volume.

This is the first paper to employ network analysis to study complex dynamics of an elec-

tronic limit order market using transaction level data. While network analysis offers only a

partial look into the limit order book (i.e., the executed part), network technology offers sig-

nificant advantages for the task of analyzing the complexity of electronic limit order trading

beyond just financial variables.

References

[1] Allen, Franklin, and Ana Babus, 2008, Networks in Finance, Working Paper 08-07, Wharton

Financial Institutions Center, University of Pennsylvania.

[2] Andersen, T., Bollerslev, T., Diebold, F.X. and Labys, P., 2000, Great Realizations, Risk, 13,

105-108.

[3] Bandi, F. M. and Russell, J. R., 2006, Separating microstructure noise from volatility, Journal of

Financial Economics, 79, 655-692.

[4] Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A., and Shephard, N., 2008, Realised kernels in practice: trades and quotes, manuscript.

[5] Beckers, S., 1983, Variance of security price returns based on high, low and closing prices, Journal

of Business 56, 97-112.

[6] Braha, Dan, and Bar-Yam, Y., 2006, From Centrality to Temporary Fame: Dynamic Centrality in

Complex Networks, Complexity 12(2), 59-63.

[7] Brunetti, Celso, and Lildholdt, P.M., 2006, Relative efficiency of return- and range-based volatility estimators, manuscript.

[8] Clark, P., 1973, A subordinated stochastic process model with finite variance for speculative

prices, Econometrica 41, 135-155.

[9] Christensen, K., and Podolski, M., 2005, Asymptotic theory of range-based estimation of inte-

grated variance of a continuous semi-martingale, manuscript.

[10] Engle, Robert, 2000, The econometrics of ultra-high-frequency data, Econometrica 68, 1-22.

[11] Engle, R., and Gallo, G., 2006, A multiple indicators model for volatility using intra-daily data,

Journal of Econometrics 131, 3-27.

[12] Engle, R., and Russell, J., 1998, Autoregressive conditional duration: A new model for irregularly

spaced transaction data, Econometrica 66, 1127-1162.

[13] Epps, T. and Epps, M., 1976, The stochastic dependence of security price changes and transaction

volumes: Implications for the mixture-of-distribution hypothesis, Econometrica 44, 305-321.

[14] Fagiolo, G., 2007, Clustering in complex directed networks, Physical Review E 76(2), 26107.

[15] Garman, M. and Klass, M., 1980, On the estimation of security price volatilities from historical

data, Journal of Business 53(1), 67-78.

[16] Hansen, P. and Lunde, A., 2006, Realized variance and market microstructure noise, Journal of

Business and Economic Statistics 24, 127-218.

[17] Hasbrouck, Joel, 2003, Intraday Price Formation in U.S. Equity Index Markets, Journal of Finance

58(6), 2375-2400.

[18] Hong, Harrison, and Jeremy C. Stein, 1999, A unified theory of underreaction, momentum trading

and overreaction in asset markets, Journal of Finance 54, 2143-2184.

[19] Kossinets, G. and Watts, D.J., 2006, Empirical Analysis of an Evolving Social Network, Science

311 (5757), 88-90.

[20] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and U.

Alon, 2004, Superfamilies of Evolved and Designed Networks, Science 303, 1538-1542.

[21] Newman, M. E. J., 2002, Assortative mixing in networks, Physical Review Letters 89, 208701.

[22] Newman, M. E. J., 2003, The structure and function of complex networks, SIAM Review 45, 167.

[23] Oomen, R., 2005, Properties of bias-corrected realized variance under alternative sampling

schemes, Journal of Financial Econometrics 3, 555-577.

[24] Parlour, Christine A. and Duane J. Seppi, 2008, Limit Order Markets: A Survey, Handbook of

Financial Intermediation and Banking, Boot, Arnoud W.A., and Anjan V. Thakor, eds., Elsevier

B.V., Oxford, UK.

[25] Scheinkman, Jose A. and Wei Xiong, 2003, Overconfidence and Speculative Bubbles, Journal of

Political Economy 111(6), 1183-1219.

[26] Tauchen, G. and Pitts, M., 1983, The price variability-volume relationship on speculative markets,

Econometrica 51, 485-505.

[27] Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo. Center for Connected Learn-

ing and Computer-Based Modeling. Northwestern University, Evanston, IL.

[28] Zhang, L., Mykland, P. A., and Ait-Sahalia, Y., 2005, A tale of two scales: Determining integrated

volatility with noisy high-frequency data, Journal of American Statistical Association, 100, 1394-

1411.

Figure 1: E-mini S&P 500 front month futures contract.

[Figure 2 panels, left to right: indegree, outdegree, betweenness, closeness. Nodes X and Y are marked in each panel.]

Figure 2: Example networks with node X having greater centrality than node Y for the specified measure.

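[Figure 3 shows three example networks; the values below are listed for each network, from left to right.]

$(k_i^{in}, k_j^{in})$     -1    1   -1

$(k_i^{out}, k_j^{out})$   -1    1   -1

$(k_i^{in}, k_j^{out})$    -1   -1    1

$(k_i^{out}, k_j^{in})$    -1   -1    1

AI                          0    1   -1

Figure 3: Illustration of Network Assortativity.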

Figure 4: A network containing two connected components, ABCDE and FGH. The largest strongly connected component is BCDE.

            Left     Center    Right

CEN         0.920    -0.092    0.116

AV DEG      2.027     3.269    2.418

SDDEG       8.134     5.960    4.949

INOUT      -0.470    -0.101   -0.069

AI          0.353     0.559    0.195

CC          0.001     0.049    0.019

LSCC        0.014     0.016    0.006

E              75       103      185

Returns     0.059    -0.019    0.000

Range       0.059     0.059    0.039

Volume       1104      1343     1191

Duration        0        12       16

Figure 5: Examples of observed networks and their properties.

Figure 6: A screenshot of the agent-based simulation using NetLogo (Wilensky 1999). As orders (denoted by squares) are randomly assigned to traders (denoted by human figures), an edge is drawn between them. When sell orders (black squares) are matched with buy orders (red squares), their quantities are reduced, and there is a direct edge drawn between the traders.

Table I: Financial Variables: Summary Statistics

Returns Volatility Volume Duration

Mean 0.0002 0.0425 1236.6720 19.4941

Median 0.0000 0.0392 1153 14

Maximum 0.2165 0.2165 6645 176

Minimum -0.1378 0.0190 459 0

Std. Dev. 0.0271 0.0140 407.1451 17.4485

Skewness 0.0485 0.8273 2.6259 2.0299

Kurtosis 2.9876 6.4663 16.9377 9.3735

ADF prob 0.0001 0.0000 0.0000 0.0000

AC Lag 1 -0.001 [0.895] 0.187 [0.000] 0.528 [0.000] 0.473 [0.000]

AC Lag 5 -0.006 [0.062] 0.167 [0.000] 0.376 [0.000] 0.289 [0.000]

AC Lag 10 -0.011 [0.139] 0.151 [0.000] 0.284 [0.000] 0.241 [0.000]

ADF prob refers to the p-value of the ADF test for the null of unit root.

AC Lag X [Q-test prob] reports the autocorrelation at lag X and, in brackets, the p-value of the Portmanteau Q-test for no serial correlation at lags X = 1, 5, and 10.

Table II: Network Variables: Summary Statistics

CEN AV DEG SDDEG INOUT AI CC LSCC E

Mean 0.0049 2.9105 5.1401 0.0836 0.09903 0.0426 0.0403 164.4727

Median 0.0065 2.8814 4.9923 0.0101 0.0766 0.0365 0.0192 116.0000

Maximum 0.9804 5.7391 12.7021 0.9887 0.5589 0.2966 0.4889 219.0000

Minimum -0.9844 1.9692 2.4073 -1.0000 -0.0664 0.0000 0.0050 64.0000

Std. Dev. 0.2338 0.3691 1.0587 0.2228 0.0699 0.0300 0.0484 19.4519

Skewness -0.0129 0.5758 0.9584 1.3233 0.9233 1.1996 2.3741 -0.5869

Kurtosis 2.8782 3.7464 4.6527 4.5685 3.7564 4.9867 10.3298 3.5793

ADF prob 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

AC Lag 1 0.139 [0.000] 0.317 [0.000] 0.165 [0.000] 0.175 [0.000] 0.102 [0.000] 0.203 [0.000] 0.246 [0.000] 0.308 [0.000]

AC Lag 5 0.046 [0.000] 0.108 [0.000] 0.060 [0.000] 0.056 [0.000] 0.042 [0.000] 0.086 [0.000] 0.082 [0.000] 0.144 [0.000]

AC Lag 10 0.036 [0.000] 0.074 [0.000] 0.031 [0.000] 0.048 [0.000] 0.042 [0.000] 0.063 [0.000] 0.047 [0.000] 0.125 [0.000]

ADF prob refers to the p-value of the ADF test for the null of unit root.

AC Lag X [Q-test prob] reports the autocorrelation at lag X, with the p-value of the Portmanteau Q-test for no serial correlation in brackets, at lags X = 1, 5, and 10.


Table III: Pairwise correlations between network variables

CEN AV DEG SDDEG INOUT AI CC LSCC

CEN 1.0000

AV DEG -0.0012 1.0000

SDDEG -0.0015 0.0031 1.0000

INOUT -0.0019 0.2367 0.5119 1.0000

AI -0.0008 -0.1079 -0.2022 -0.6787 1.0000

CC -0.0008 0.8074 -0.0248 0.2052 -0.1095 1.0000

LSCC -0.0006 0.5042 0.4070 0.7226 -0.5019 0.4248 1.0000

Table IV: Correlations between financial and network variables

Returns Range Volume Duration

CEN 0.6774 -0.0076 0.0264 -0.0065

AV DEG -0.0034 0.0415 0.0061 0.1000

SDDEG 0.0037 0.0747 0.2363 -0.1620

INOUT -0.0061 0.0429 0.0853 0.0467

AI 0.0016 0.0635 0.0129 -0.0810

CC -0.0032 0.0314 0.0320 0.0360

LSCC -0.0076 0.0331 0.0884 0.0058


Table V: Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Panel 1: 20 lags

CEN AI CC LSCC All

CEN 0.2143 0.4227 0.1731 0.2387

AI 0.6243 0.0004 0.0000 0.0000

CC 0.8711 0.0000 0.0000 0.0000

LSCC 0.4375 0.0002 0.0219 0.0003

All 0.5556 0.0000 0.0000 0.0000

Panel 2: 18 lags

SDDEG AI CC LSCC All

SDDEG 0.1601 0.0054 0.0000 0.0000

AI 0.0000 0.0643 0.0078 0.0000

CC 0.0000 0.0000 0.0000 0.0000

LSCC 0.0000 0.0005 0.0001 0.0000

All 0.0000 0.0000 0.0000 0.0000

Panel 3: 14 lags

INOUT AI CC LSCC All

INOUT 0.1384 0.0794 0.0000 0.0000

AI 0.0000 0.0016 0.0029 0.0000

CC 0.0475 0.0000 0.0000 0.0000

LSCC 0.0000 0.0000 0.0185 0.0000

All 0.0000 0.0000 0.0000 0.0000

VAR estimated using GMM with HAC robust standard errors.

Optimal lag-length (26) is selected using Akaike Information Criterion.
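
To make the meaning of a Granger non-causality p-value concrete, here is a deliberately simplified sketch. It is not the procedure behind the tables, which use a multivariate VAR estimated by GMM with HAC robust standard errors and many lags. The sketch assumes a bivariate regression with a single lag, plain OLS, and a Wald statistic compared against a chi-square distribution with one degree of freedom. The data are simulated, and `ols_rss` and `granger_one_lag` are hypothetical helper names.

```python
import math
import random


def ols_rss(rows, y):
    """Residual sum of squares from an OLS fit of y on the regressors in rows."""
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * v for r, v in zip(rows, y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    return sum(
        (v - sum(b * c for b, c in zip(beta, r))) ** 2 for r, v in zip(rows, y)
    )


def granger_one_lag(y, x):
    """Wald statistic and p-value for H0: x does not Granger-cause y (one lag)."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    wald = max(len(target) * (rss_r - rss_u) / rss_u, 0.0)
    # Chi-square(1) upper tail: P(X > w) = erfc(sqrt(w / 2)).
    return wald, math.erfc(math.sqrt(wald / 2.0))


# Simulated data (assumption): x is noise, and y follows x with a one-period delay.
rng = random.Random(0)
x_series = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_series = [0.0] + [
    0.8 * x_series[t - 1] + 0.1 * rng.gauss(0.0, 1.0) for t in range(1, 200)
]

wald_xy, p_xy = granger_one_lag(y_series, x_series)  # does x help predict y?
wald_yx, p_yx = granger_one_lag(x_series, y_series)  # does y help predict x?
print(f"x -> y: Wald = {wald_xy:.2f}, p = {p_xy:.4f}")
print(f"y -> x: Wald = {wald_yx:.2f}, p = {p_yx:.4f}")
```

A small p-value rejects the null of Granger non-causality, which is how the entries near 0.0000 in the tables should be read.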

Table VI: Returns and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Returns CEN AI CC LSCC All

Returns 0.0148 0.8984 0.3530 0.7630 0.0320

CEN 0.0000 0.4459 0.8080 0.2615 0.0000

AI 0.0306 0.1491 0.0006 0.0000 0.0000

CC 0.0235 0.0632 0.0000 0.0000 0.0000

LSCC 0.1056 0.0826 0.0003 0.0240 0.0002

All 0.0000 0.0132 0.0000 0.0001 0.0000

VAR estimated using GMM with HAC robust standard errors.



Table VII: Volatility and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Volatility SDDEG AI CC LSCC All

Volatility 0.0005 0.2350 0.0000 0.0019 0.0000

SDDEG 0.0000 0.0263 0.0063 0.0000 0.0000

AI 0.0020 0.0000 0.0717 0.0093 0.0000

CC 0.0000 0.0000 0.0000 0.0000 0.0000

LSCC 0.0000 0.0000 0.0003 0.0116 0.0000

All 0.0000 0.0000 0.0000 0.0000 0.0000



Table VIII: Period Duration and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Duration INOUT AI CC LSCC All

Duration 0.3328 0.0017 0.0000 0.0000 0.0000

INOUT 0.9526 0.0000 0.0000 0.1215 0.0000

AI 0.3345 0.0000 0.0020 0.0021 0.0000

CC 0.5520 0.0498 0.0000 0.0000 0.0000

LSCC 0.1211 0.0000 0.0000 0.0336 0.0000

All 0.1811 0.0000 0.0000 0.0000 0.0000



Table IX: Volume and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Volume SDDEG AI CC LSCC All

Volume 0.0014 0.0012 0.0000 0.0063 0.0000

SDDEG 0.0669 0.1752 0.0053 0.0000 0.0000

AI 0.2008 0.0000 0.0911 0.0166 0.0000

CC 0.3970 0.0000 0.0000 0.0000 0.0000

LSCC 0.4034 0.0000 0.0014 0.0002 0.0000

All 0.3662 0.0000 0.0000 0.0000 0.0000


