8/3/2019 On the Information Properties of Trading Networks
On the Informational Properties of Trading Networks
Lada Adamic, Celso Brunetti, Jeffrey Harris, and Andrei Kirilenko
September 9, 2009
ABSTRACT
We apply network analysis to trace patterns of information transmission in an electronic limit order market. If market orders or large executable limit orders are submitted by informed traders, then the resulting star-shaped or diamond-shaped patterns of trading networks should be associated with large changes in returns, smaller volume, and short
duration between trades. In contrast, the execution of small limit orders from uninformed
traders should result in networks with many triangular and reciprocal patterns and be
associated with smaller changes in returns, larger volume and longer duration between
trades. We compute a time series of trading networks using audit trail, transaction-level
data for all regular transactions in the September 2008 E-mini S&P 500 futures contract, the cornerstone of price discovery for the S&P 500 Index. We find that network metrics that quantify the shape of a network are statistically significantly related to returns, volatility, volume, and duration.
Lada Adamic is with the University of Michigan and the Commodity Futures Trading Commission, Celso
Brunetti is with Johns Hopkins University and the Commodity Futures Trading Commission, Jeffrey Harris is
with the Commodity Futures Trading Commission and the University of Delaware, and Andrei Kirilenko is with
the Commodity Futures Trading Commission. We are grateful to Paul Tsyhura for invaluable assistance with
the retrieval, organization, and processing of transaction-level data. We thank Pat Fishe, Pete Kyle, Antonio
Mele, Han Ozsoylev, and seminar participants at the Chicago Mercantile Exchange, the Commodity Futures Trading Commission, the 2009 Econometric Society Summer Meetings in Barcelona, the Federal Reserve Board
of Governors, NASDAQ, the Securities and Exchange Commission, and the University of Maryland for very
helpful comments and suggestions. The views expressed in this paper are our own and do not constitute an
official position of the Commodity Futures Trading Commission, its Commissioners or staff.
Most securities exchanges around the world are electronic limit order markets. Yet, the
analysis of electronic limit order trading has proven to be very challenging. To quote from
the survey by Parlour and Seppi (2008): Despite the simplicity of limit orders themselves,
the economic interactions in limit order markets are complex because the associated state
and action spaces are extremely large and because trading with limit orders is dynamic and
generates non-linear payoffs.
In this paper, we apply network analysis to quantify the dynamics of information transmission in an electronic limit order market - a complex dynamic problem. The networks we
analyze are trading networks. We define a trading network as a set of traders engaged in trans-
actions within a period of time. In graph theoretic terminology, a trading network is a graph,
consisting of a set of nodes and a set of edges. Each node denotes a unique trader and an
edge between two nodes denotes the occurrence of trading between two unique counterparties
within a period of time. The direction of an edge indicates buy or sell transactions between
unique counterparties. Namely, a directed edge from node A to node B indicates that trader A
sold (one time or several times) to trader B during a specified period of time.
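This definition can be sketched in a few lines of Python, following the document's edge convention (a directed edge runs from seller to buyer); the account labels and the helper name are hypothetical:

```python
def build_trading_network(transactions):
    """Build a directed trading network from (seller, buyer) records.

    Nodes are unique trading accounts; a directed edge seller -> buyer
    means the seller sold to the buyer at least once within the period,
    so repeated trades between the same pair collapse into one edge.
    """
    nodes, edges = set(), set()
    for seller, buyer in transactions:
        nodes.update((seller, buyer))
        edges.add((seller, buyer))
    return nodes, edges

# One hypothetical period: A sells to B twice, A sells to C, C sells to A.
nodes, edges = build_trading_network([("A", "B"), ("A", "C"),
                                      ("A", "B"), ("C", "A")])
# 3 nodes and 3 directed edges: A->B, A->C, C->A
```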
A trading network formed over a designated number of transactions traces a pattern of
order execution in the limit order book. By analyzing the shape of that pattern, we can quantify the structure of the executed portion of the book. For example, the execution of a market
order will result in a star-shaped pattern with the node that submitted the market order in the
center and nodes that connected to it as the market order marched through the limit order
book in the periphery. This star-shaped network will also not have any triangular or reciprocal
connections. In contrast, the execution of two large limit orders that arrived at different times
will result in a diamond-shaped pattern with the two nodes that submitted large limit orders on
the ends and market makers that provided the immediacy of execution (in small installments)
in the middle. Finally, an execution of a sequence of small limit orders will look different
from the execution of market or large executable limit orders. Some nodes will have more
connections than others, but there will be no central dominant node or a diamond shape. There
will be a number of triangular connections and some pairs of nodes will have edges that go
both ways.
If market orders or large executable limit orders are submitted by informed traders, then
patterns of order execution should be informative beyond transaction prices, volume or trade
duration. Intuitively, if market orders or large executable limit orders are submitted by in-
formed traders, then resulting star-shaped or diamond-shaped trading networks should be as-
sociated with large changes in returns, possibly smaller volume, and short duration between
trades. Conversely, trading networks that are very dissimilar to a star or a diamond - e.g.,
those with triangular and reciprocal patterns - should be associated with smaller changes in
returns, possibly larger volume and longer duration between trades. Various network metrics that quantify the shape of a network - e.g., the number of central nodes or triangular con-
nections in a network - should then be statistically related to returns, volatility, volume, and
duration.
In this paper we find evidence that network metrics serve as primitive measures of limit
order book dynamics. Namely, we compute network and financial variables for all regular
transactions that occurred during August 2008 in the nearby E-mini S&P 500 futures con-
tract and find that network variables strongly Granger-cause intertrade duration and volume.
This suggests that network metrics presage the appearance of this information in duration and
volume. We also find that the network variable that quantifies centrality (or how star-shaped
a pattern is) exhibits a very high contemporaneous correlation with returns. Similarly, the
network variable that quantifies the assortativity of connections (or how diamond-shaped a pattern is) exhibits a high contemporaneous correlation with volatility.
These results are robust with respect to different equity index futures markets (E-mini
Dow Jones and Nasdaq 100), different observation periods (May 2008 and August 2008),
different levels of aggregation (at the broker level and individual trading account level), and
different sampling frequencies (240 and 600 transactions). Correlation results can also be
replicated in a simulated model, confirming that these empirical regularities do not arise by
chance. Furthermore, the results do not depend on any parametric specifications or modeling
assumptions.
This is the first paper to empirically link trading networks that trace the execution of the limit order book with the dynamics of high frequency financial variables - transaction prices,
quantities and duration. As such, it offers a way to analyze the dynamics of the executed
portion of the limit order book from transaction level data.
Empirical network analysis has previously been applied in finance to study investment
decisions and corporate governance.1 In contrast to strategically-formed networks where par-
ticipants prefer to associate with specific counterparties, the networks we study are trading
networks in which connections are formed as a result of an automated matching algorithm and
reflect the participants' beliefs about the valuation of an asset. These networks are also highly
dynamic: whereas boards of directors and portfolio holdings evolve gradually, over weeks, months, or years, financial trading networks change second by second.
Our paper proceeds as follows. In Section I, we describe our unique ultra high frequency
data, explain how we chose the sampling frequency, and describe financial variables. In Sec-
tion II, we describe network variables. In Section III we outline our conjecture of why patterns
of order execution (trading networks) contain valuable information beyond prices, quantities,
or intertrade duration. In Section IV, we present the empirical properties of network and finan-
cial variables. In Section V, we analyze time series properties and employ Granger-causality
tests between and among network and financial variables. Section VI demonstrates that our
results are robust with respect to different markets, different observation periods, and differ-
ent sampling frequencies. In Section VII, we use an agent-based simulation model of trading
networks to further test that our empirical results do not arise by chance. Finally, Section VIII
1For a recent survey, see Allen and Babus (2008).
summarizes our findings and suggests further applications of the network analysis methodol-
ogy to trading networks.
I. Data and Financial Variables
We use audit trail, transaction-level data for all regular transactions in the September 2008
E-mini S&P 500 futures contract. The transactions take place during the month of August
2008 during the time when the markets for stocks underlying the S&P 500 Index are open:
weekdays between 9:30 a.m. EST and 4:00 p.m. EST. The E-mini S&P 500 futures contract
is a highly liquid, fully electronic, cash-settled contract traded on the CME GLOBEX trading
platform. It is designed to track the price movements of the S&P 500 Index - the most widely
followed benchmark of stock market performance. Empirically, the E-mini futures has been
shown to contribute the most to price discovery for the S&P 500 Index.2 Price discovery
typically occurs in the front month contract: in August 2008, the September 2008 futures
contract is the front month, most actively traded contract (see Figure 1).
For each transaction, we utilize the following data fields: date, time (up to the second),
unique transaction ID (to identify consecutive transactions within a second), executing bro-
ker, opposite broker, trading account of the executing broker, trading account of the opposite
broker, buy or sell flag (for the executing broker), price, and quantity.3
Using the audit trail-level of detail, we uniquely identify two trading accounts for each
transaction: one for the broker who booked a buy and the opposite for the broker who booked
a sale. Our dataset consists of over 6 million transactions that took place among 26950 trading
accounts that belong to 346 brokers.
We first test the quality of the data by applying standard filters designed to look for recording errors and outliers in the price and quantity series.4 We find the data to be of very high quality: the standard filters did not find any data irregularities.
We then determine the optimal sampling frequency by utilizing two techniques designed
to mitigate the effect of market microstructure noise in ultra high-frequency data.5 The first
technique is developed in Andersen, Bollerslev, Diebold and Labys (2000) and is commonly
referred to as the volatility signature plot. According to this technique, the effects of mar-
ket microstructure noise in our data are mitigated at the level of 120 transactions or higher.
2 See Hasbrouck (2003).
3 While the data fields are named executing broker and opposite broker, transaction data does not specify which trader initiated a transaction; in fact, for each transaction, there are two mirror entries for the two counterparties: one booking a sale and the other booking a buy.
4 See Hansen and Lunde (2004).
5 For the literature on the subject, see, among others, Zhang, Mykland and Ait-Sahalia (2005), Oomen (2005), Bandi and Russell (2006), Hansen and Lunde (2006), Barndorff-Nielsen et al. (2008).
The second technique is developed by Bandi and Russell (2006) to select the sampling fre-
quency that minimizes the variance of market microstructure noise. According to the second
technique, the optimal sampling frequency is just below 100 transactions. Neither technique
makes any use of network variables. We adopt a very conservative approach and select 240
transactions as the sampling frequency for our data.6
For each period consisting of 240 transactions (there are 25,104 such periods in our sample), we compute the following financial variables: returns, volatility, intertrade duration, and trading volume. These four variables are typically assumed to both con-
tain and convey valuable information to market participants about the true (but unobserved)
stochastic price process.7 Intuitively, market participants can learn about the true underlying
price process by observing transaction prices, trading volume, and times between trades.
Transaction prices contain valuable information about the true underlying price process,
but with a possibly significant amount of noise due to, among other reasons, market mi-
crostructure issues (e.g., bid-ask bounce), measurement issues (e.g., time scale, discrete re-
alizations from a continuous process), and seasonality (e.g., predictable intraday patterns).8
Both returns and their volatility are computed from observed prices and, thus, suffer from
the same noise issues. However, a number of techniques have been developed to reduce the impact of different noise components in ultra high-frequency data. The techniques we use to deal with measurement errors and reduce market microstructure noise (filters and optimal sampling frequency) are described just above. In addition, we remove a predictable intraday
seasonal component from the computed raw returns by regressing them on a constant and a
sequence of dummy variables for each half-hour during the trading period. We then use the
unexplained term as our measure of returns.9 We compute returns as differences in log prices
using both the last price to the first price within the same period (close-to-open) and last prices
for consecutive periods (close-to-close). The results reported below refer to the close-to-open
deseasonalized returns, because we believe it to be an intuitively more appealing measure to
compare with network variables (also cleaned of seasonality), which are defined within each
sampling period. Having said that, the main results are not affected by the two different ways
to compute returns nor by the deseasonalization procedure.10
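Because regressing on a constant plus a full set of half-hour dummies leaves, as its residual, each observation minus its own half-hour bin mean, the deseasonalization step can be sketched without any regression library; the return values and bin labels below are hypothetical:

```python
from collections import defaultdict

def deseasonalize(values, bins):
    """Residual of a regression on a constant plus one dummy per half-hour
    bin, which is simply each observation minus its own bin's mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, b in zip(values, bins):
        sums[b] += v
        counts[b] += 1
    return [v - sums[b] / counts[b] for v, b in zip(values, bins)]

# Hypothetical raw period returns tagged with their half-hour bin.
raw = [0.004, 0.002, -0.001, 0.001]
bins = ["09:30", "09:30", "10:00", "10:00"]
deseasonalized = deseasonalize(raw, bins)   # residuals sum to zero per bin
```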
Volatility is another measure that contains valuable information about the true underlying
price process. As mentioned above, because it is computed from observed prices, it suffers
from the same noise issues as returns. Moreover, volatility suffers from the fact that unlike
prices, volatility is never directly observed. Thus, volatility estimates contain not only the
volatility of the noise, but also a possibly nontrivial factor due to covariance between the
6 In order to ensure the robustness of our results, we repeat our analysis at a higher sampling frequency (see our discussion on robustness later in the paper). The main results are unaffected.
7 There is a vast theoretical and empirical literature on the subject. For a recent summary, see Manganelli (2005).
8 See, for example, Engle (2000).
9 We apply the same technique to all financial and network variables.
10 We also used a Fourier flexible form to remove seasonality. It did not qualitatively change our results.
true price process and the noise component.11 We use three measures to estimate volatility
during each period: absolute returns, squared returns, and the price range. Absolute and
squared returns are proxies for the standard deviation and variance of returns, respectively.
The price range is defined as the difference between the high and low price (in logs) during
the period. For the results reported below, we use the price range as the measure of volatility.
Range-based volatility estimators have been shown to be more efficient than return-based
volatility estimators, because they incorporate the full sample path of observed prices (to select
a maximum and a minimum) rather than just open and close prices.12 Our main results are not
affected by the choice of volatility estimator.
Intertrade duration contains valuable information, because the estimation of characteristics
of the true price process obtained during periods of shorter intertrade duration can be more
precise. This would happen irrespective of the reasons for shorter intertrade duration: whether
more frequent trading occurs due to more informed trading or more liquidity trading, more
frequent sampling would result in greater precision with respect to the true process. Having
said that, there is a view that since information is disseminated through trading, the interval
of time between trades can be interpreted as a proxy for the arrival of new information to the
market.13 We compute duration as the time (in seconds) elapsed between the start and end
of the period. We compute three measures of duration: total (unweighted) period duration,
volume weighted period duration, and average for 239 intertrade (within period) durations.
The results reported below are for total period duration. The main results are unaffected by
the way we compute intertrade duration.
Trading volume contains valuable information, because volume together with observed
transaction prices can be driven by a common latent factor often referred to in the literature
as information intensity.14 Intuitively, during periods of higher volume, transaction prices
also exhibit greater precision about the characteristics of the true underlying price process.
We compute volume as the number of contracts both bought and sold during the observation
period.
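Under the definitions above (close-to-open log return, log price range, total period duration, and contract volume), the per-period financial variables can be sketched as follows; the transaction values are hypothetical and the period is shortened from 240 to 4 trades for readability:

```python
import math

def period_financials(prices, times, quantities):
    """Per-period financial variables: close-to-open log return, log price
    range, total duration in seconds, and contract volume."""
    ret = math.log(prices[-1] / prices[0])        # close-to-open return
    rng = math.log(max(prices) / min(prices))     # range-based volatility
    duration = times[-1] - times[0]               # total period duration
    volume = sum(quantities)                      # contracts traded
    return ret, rng, duration, volume

# A hypothetical 4-transaction mini-period (price, seconds, contracts).
ret, rng, dur, vol = period_financials(
    prices=[100.0, 100.25, 99.75, 100.5],
    times=[0.0, 1.5, 2.0, 4.0],
    quantities=[2, 1, 3, 1],
)
```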
11 A number of techniques have been developed to estimate volatility components separately by varying the time window. See, for example, Zhang, Mykland, Ait-Sahalia (2005). Application of these techniques to trading networks will be explored in our future research.
12 For the literature on price range as an efficient estimator of asset price volatility, see, for example, Parkinson (1980), Garman and Klass (1980), Beckers (1983), and Brunetti and Lildholdt (2006). In recent years, the price range has also been used to compute realized volatility in high frequency data. See, for example, Christensen and Podolski (2009).
13 See, for example, Engle and Russell (1998) and Engle (2000).
14 There is a vast theoretical and empirical literature on the subject. See, for example, Clark (1973), Epps and Epps (1976), Tauchen and Pitts (1983), Admati and Pfleiderer (1988), Easley and O'Hara (1992), and Andersen (1996).
II. Network variables
Quantitative analysis of networks employs a set of standard metrics.15 Network metrics di-
rectly depend on what is defined as a node, an edge, and a network. In our analysis, a node
denotes a trading account, an edge indicates that a transaction has occurred between two trad-
ing accounts, and a trading network is constructed from a specified number of consecutive
transactions (e.g., 240 transactions) among trading accounts in an electronic limit order market.
The formation of a trading network consists of three interconnected steps: (i) the arrival
and accumulation of orders in the limit order book; (ii) the process of matching buy and sell
orders; and (iii) the display of transaction prices for matched orders.16 From the network
perspective, limit orders can be visualized as stubs (ends of edges) attached to a given node.
At a simple level, each stub has a time stamp (for the time it was created), a direction (in for
buy and out for sell), a price, and a quantity. A node can grow a large number of stubs
subject to the specifics of its trading strategy, the costs of creating, modifying, maintaining,
and cancelling stubs, as well as limits imposed by the stub-matching algorithm.17
Depending on its attributes, each stub is assigned to a specific place in the limit order book.
In-stubs go into the Buy Orders side of the book and out-stubs go into the Sell Orders side. On each side of the order book, the stubs are sorted in accordance with the rules of a
matching algorithm, e.g., by price and then the time stamp or by price, quantity and then the
time stamp. The matching algorithm makes edges out of stubs by linking together top stubs
from each side of the book, provided that they agree on a price. 18
Once two stubs are connected by a matching algorithm, two things happen: an edge
between two nodes is created and the associated transaction price at which the match was
achieved is displayed for all nodes to see. After seeing a transaction price or a sequence of
15 See Newman (2003) for a review of basic network concepts and quantitative indicators.
16 Without loss of generality, the modification and/or removal of unmatched existing orders is viewed as part of the order arrival process.
17 For example, the Chicago Mercantile Exchange (CME) Group allows most trading firms to grow the following number of free stubs (known as messages) for the products matched via its GLOBEX algorithm during regular business hours: 3,000, plus no more than a given ratio of grown stubs to total executed volume for the product. This volume ratio is set at 4 to 1 for E-mini S&P 500 Futures and Spreads, 8 to 1 for E-mini NASDAQ-100 Futures and Spreads, and 25 to 1 for E-mini Dow Futures and Spreads. Stubs grown in excess of 3,000 plus the product-specific volume ratio are penalized by a surcharge fee.
18 For example, on the CME's GLOBEX Trading System, there are three algorithms to match stubs (orders): First In, First Out (FIFO); Pro Rata Allocation (Pro Rata); and Lead Market Maker (LMM). Quoting from the CME documentation available to the public, FIFO uses price and time as the only criteria for filling an order: all orders at the same price level are filled according to time priority. Pro Rata matches orders based on price, top orders (the first order only that betters a market), and size. The LMM is a firm or trader designated by CME to make a two-sided market in an assigned product. The LMM will have the benefit of certain matching privileges and associated pricing concessions in return for meeting CME-determined market obligations.
transaction prices, some nodes may decide to modify or remove some existing stubs or grow
new stubs, thus affecting the network formation process.
Empirically, we construct trading networks as follows. At 9:30:00 a.m. EST on August 1,
2008, we start counting transactions in the September 2008 E-mini S&P 500 futures contract.
For each transaction, we know which account bought from or sold to which other account (or
itself), at what price, and what number of contracts. We designate 240 consecutive transactions
as one period. Transactions 1 through 240 mark the first period, transactions 241 through 480 mark the second period, and so on. While for each period we do not observe the limit order book
itself, we know that transactions occurred because market orders or limit orders were matched
with existing orders in the limit order book. We can then trace the pattern of order execution
or a trading network within each period. Even though the number of transactions for each
period is the same, a pattern for a large market order executed over the period will look very
different compared to a pattern for several smaller limit orders. Metrics that we compute for
each network should be interpreted as quantitative measures of the pattern of order execution
in the limit order book.
We realize that by taking snapshots of the market at equal transaction time intervals, we cannot hope to characterize the whole complexity of changes that take place in the underlying limit order book. Specifically, we cannot observe how the revelation of transaction prices
translates into modifications or cancellations of existing orders and submissions of new orders.
Or in terms of the network formation process, we cannot observe how nodes remove some
existing stubs and grow new stubs.
While we know that the process of trading network formation - stubs, edges, transaction
prices, new stubs - goes on continuously, we must designate the number of transactions that
add up to a trading network at a point in time. This designated number of transactions could
be at times too small and at times too large to clearly capture the impact of order execution on
the order book through network analysis within each period. However, as we analyze the time
series properties of trading networks, a statistically significant pattern, if there is one, should emerge. In other words, the approach we take is to compute and analyze network metrics for a time series of consecutive trading networks rather than those for one aggregate network that
emerges over the whole period.
Given our intuition about how patterns should be related to the dynamics of transaction
prices and quantities, we are interested in network metrics that can measure centrality (or how
star-shaped a network is); assortativity of connections (or how diamond-shaped a network
is); as well as those that can measure reciprocity, triangular connections, and the size of the
network.
The size of the network can be characterized in terms of the total number of nodes, denoted by N, and the total number of edges, denoted by E. From these two quantities we can also compute the average degree, AVDEG = E/N, the average number of nodes that a node is connected to, and the standard deviation of degree, STDEG, the standard deviation around
this average. These two variables characterize the first and second moment, respectively, of
the unconditional degree distribution.
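A sketch of these size metrics, following the paper's definition AVDEG = E/N; for STDEG we take the population standard deviation of each node's total (in plus out) degree, an assumption since the paper does not spell out the degree convention:

```python
from collections import Counter
from statistics import pstdev

def size_metrics(edges):
    """N, E, AVDEG = E/N (the paper's definition), and STDEG, taken here
    as the population std of each node's total (in + out) degree."""
    deg, nodes = Counter(), set()
    for u, v in edges:
        nodes.update((u, v))
        deg[u] += 1          # one outgoing edge for the seller
        deg[v] += 1          # one incoming edge for the buyer
    N, E = len(nodes), len(edges)
    return N, E, E / N, pstdev(deg[n] for n in nodes)

N, E, avdeg, stdeg = size_metrics({("A", "B"), ("A", "C"), ("C", "A")})
```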
Node centrality quantifies the position of a specific node on a network. There are several
node centrality measures, the simplest one being degree, or how many edges a node has. In a
directed network, degree can be further separated into indegree and outdegree in accordance
with the number of incoming or outgoing edges of a node.
However, the degree alone may not necessarily capture the role of a node on the network.
For example, a node that has a relatively low degree, but acts as a connector between otherwise
disconnected parts of the network, can be thought of as very central. To that end, there are
measures of centrality that take into account not just the degree of a node, but its position
relative to all other nodes in the network. For example, betweenness measures how many
other pairs of nodes would have to go through the given node in order to reach one another
in the shortest number of hops. Similarly, closeness measures how many hops away a node is
on average from every other node in the network. Figure 2 illustrates different node centrality
measures.
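Of these measures, closeness is easy to sketch with a breadth-first search; the adjacency list below is a hypothetical undirected star like the ones discussed above (betweenness is omitted for brevity):

```python
from collections import deque

def closeness(adj, node):
    """Closeness centrality: number of reachable nodes divided by the sum
    of their BFS hop distances from `node` (higher = more central)."""
    dist, queue = {node: 0}, deque([node])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for n, d in dist.items() if n != node]
    return len(others) / sum(others) if others else 0.0

# Hypothetical undirected star: hub H linked to A, B, C.
adj = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
closeness(adj, "H")   # 1.0: the hub is one hop from everyone
closeness(adj, "A")   # 0.6: one hop to H, two to each of B and C
```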
Node centrality is a critical input into the calculation of network centralization, a measure that characterizes the inequality of connectivity among the nodes. In order to capture this inequality in connectivity within the network (whether there are a small number of nodes with high centrality and a large number of nodes with low centrality), we compute a centralization
measure defined as centralization Gini:
G = \frac{\sum_{r=1}^{N} (2r - N - 1)\, k_r}{N E}, \qquad (1)
where k_r is a node's centrality measure and r is the node's rank order number.
Taking a node's degree as its centrality measure, we use the formula above to compute separate centralization measures for indegree and outdegree: incentralization, INCEN, and outcentralization, OUTCEN, respectively. By construction, these measures are 0 if every node
has the same number of (incoming or outgoing) edges, and positive with increasing inequality:
e.g., one node has all the incoming (outgoing) edges, the others have no incoming (outgoing)
edges.
We also compute a combined measure of incentralization and outcentralization: CEN = INCEN - OUTCEN. Intuitively, since we use a node's degree as a measure of its centrality, the difference between in- and out-centralization measures can be interpreted as the presence of a dominant buyer or seller. CEN will be equal to 1 if there is a dominant buyer and -1 if there is a dominant seller.
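The centralization Gini of equation (1) and the combined CEN measure can be sketched as follows; the edge sets are hypothetical, and the ascending-rank convention is an assumption consistent with G = 0 when all degrees are equal:

```python
from collections import Counter

def centralization_gini(degrees, E):
    """Centralization Gini of eq. (1): with nodes ranked by centrality in
    ascending order, G = sum_r (2r - N - 1) k_r / (N E). G is 0 when every
    node has the same degree and grows toward 1 as one node takes all edges."""
    N = len(degrees)
    return sum((2 * r - N - 1) * k
               for r, k in enumerate(sorted(degrees), start=1)) / (N * E)

def cen(edges):
    """CEN = INCEN - OUTCEN, using indegree/outdegree as node centrality:
    positive when one dominant buyer absorbs the edges, negative for a
    dominant seller (edge u -> v means u sold to v)."""
    indeg, outdeg, nodes = Counter(), Counter(), set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))
    E = len(edges)
    incen = centralization_gini([indeg[n] for n in nodes], E)
    outcen = centralization_gini([outdeg[n] for n in nodes], E)
    return incen - outcen

# Star with a single dominant buyer B: every edge points into B.
cen({("S1", "B"), ("S2", "B"), ("S3", "B")})   # positive (here 0.5)
```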
To measure whether a node is both a buyer and a seller, we compute the Pearson corre-
lation coefficient between the indegree and the outdegree of each node, INOUT. A positive
correlation indicates that nodes with many in-edges also have many out-edges - i.e., they are both buying and selling.
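A sketch of the INOUT measure; the small edge set, in which a hypothetical intermediary M both buys and sells, is illustrative:

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def inout(edges):
    """INOUT: Pearson correlation between each node's indegree and
    outdegree; positive when nodes that buy a lot also sell a lot."""
    indeg, outdeg, nodes = Counter(), Counter(), set()
    for u, v in edges:        # u -> v: u sold to v
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))
    order = sorted(nodes)
    return pearson([indeg[n] for n in order], [outdeg[n] for n in order])

# Intermediary M both buys from and sells to A and B.
inout({("A", "M"), ("M", "B"), ("B", "M"), ("M", "A")})   # -> 1.0
```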
We also calculate statistical properties of nodes one edge away from each individual node
or connectivity of node B conditional on it being connected to node A. Assortativity in net-
works can represent any tendency of like to be connected with like for any node property
(see Newman (2002)), but here we will apply it to degree. Large degree nodes (i.e., those
with many edges) may connect more frequently to other large degree nodes or they may tend to connect to small degree nodes. Two large degree nodes connecting to a number of small degree nodes between them will result in a diamond-shaped network.
One way to measure assortativity is by the Pearson correlation coefficient \rho(k_i, k_j) computed over all edges e_{ij}. When the edges are directed, there are four possible assortativity measures: \rho(k_i^{in}, k_j^{in}), \rho(k_i^{in}, k_j^{out}), \rho(k_i^{out}, k_j^{in}), and \rho(k_i^{out}, k_j^{out}), corresponding to the four conditional degree distributions.
From these four correlation coefficients, we construct the following compound measure, which we call the assortativity index for directed networks:

AI = \frac{1}{4}\left[ \rho(k_i^{in}, k_j^{in}) + \rho(k_i^{out}, k_j^{out}) - \rho(k_i^{in}, k_j^{out}) - \rho(k_i^{out}, k_j^{in}) \right], \qquad (2)

computed over all edges e_{ij}.
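A sketch of the assortativity index of equation (2), with each correlation computed over all directed edges; the diamond-shaped example (two large traders crossed via two intermediaries) is hypothetical:

```python
import math
from collections import Counter

def _pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0   # degenerate series: no variation to correlate
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def assortativity_index(edge_list):
    """AI of eq. (2): [rho(in,in) + rho(out,out) - rho(in,out) - rho(out,in)] / 4,
    each rho a Pearson correlation over all directed edges i -> j."""
    edges = sorted(edge_list)           # fix an order so the series line up
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1

    def rho(src_deg, dst_deg):
        return _pearson([src_deg[u] for u, _ in edges],
                        [dst_deg[v] for _, v in edges])

    return (rho(indeg, indeg) + rho(outdeg, outdeg)
            - rho(indeg, outdeg) - rho(outdeg, indeg)) / 4.0

# Diamond: L1 sells to intermediaries M1, M2, who sell on to L2.
assortativity_index([("L1", "M1"), ("L1", "M2"),
                     ("M1", "L2"), ("M2", "L2")])   # -> 1.0
```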
Figure 3 illustrates network assortativity. For example, in the context of trading networks, the coefficient \rho(k_i^{out}, k_j^{in}) measures the correlation between the number of unique buyers (connected by an outward pointing edge) a seller i is selling to (denoted by k_i^{out}) and the number of unique sellers those buyers j are buying from (denoted by k_j^{in}). A negative \rho(k_i^{out}, k_j^{in}) would mean that when a seller has matched to many buyers, those buyers are likely to be transacting with few or no other sellers.
We also measure whether nodes one edge away from each individual node form particular (e.g., triangular) patterns. Transitivity, also termed clustering, measures the prevalence of closed triads in the network. In this paper, we use the global clustering coefficient, denoted by CC, as a measure of transitivity:19
CC = \frac{3 \times \text{number of triangles in the network}}{\text{number of connected triples of vertices}}, \qquad (3)
19See, Newman (2003).
where a connected triple means three nodes A, B, C such that there is an edge A-B and an edge B-C.20 The prevalence of specific directed triads can also be used to conduct a motif analysis on a directed network.21
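A sketch of the global clustering coefficient of equation (3), treating edges as undirected; the example edge sets are hypothetical:

```python
from itertools import combinations

def clustering_coefficient(edges):
    """Global clustering coefficient of eq. (3), edges treated as undirected:
    CC = 3 * (number of triangles) / (number of connected triples)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # A node with d neighbors sits at the center of d*(d-1)/2 triples.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # Each triangle is seen once from each of its three corner nodes.
    triangles = sum(1 for u in adj
                    for v, w in combinations(sorted(adj[u]), 2)
                    if w in adj[v]) // 3
    return 3 * triangles / triples if triples else 0.0

clustering_coefficient({("A", "B"), ("B", "C"), ("C", "A")})   # triangle -> 1.0
clustering_coefficient({("A", "B"), ("A", "C"), ("A", "D")})   # star -> 0.0
```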
Finally, in addition to regularities in connections between pairs and triplets of nodes, a net-
work as a whole may be composed of several separate connected components. A connected
component is a maximal subset of nodes such that any node can be reached from any other
node by traversing edges. Within a strongly connected component, any node can be reached from any other by following directed edges. Figure 4 illustrates the largest strongly connected component (LSCC). Once the largest strongly connected component is identified, we can measure the global network structure by computing LSCC, the proportion of the network occupied by this component.
Intuitively, the largest strongly connected component can only occupy a significant portion
of the network if many nodes have both incoming and outgoing edges during the same time
period, and there are cycles (the simplest of which are reciprocal ties and the triads mentioned
above) within the network. In other words, a large strongly connected component is much
more likely to emerge as a result of a large number of limit orders than one large market order.
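The LSCC proportion can be computed with any standard strongly-connected-components algorithm; the sketch below (ours, using Kosaraju's two-pass depth-first search) returns the fraction of nodes in the largest such component:

```python
def lscc_fraction(edges):
    """Fraction of nodes in the largest strongly connected component
    (Kosaraju's two-pass depth-first search)."""
    nodes = {n for e in edges for n in e}
    fwd = {n: [] for n in nodes}
    rev = {n: [] for n in nodes}
    for a, b in edges:
        fwd[a].append(b)
        rev[b].append(a)

    seen = set()

    def dfs(graph, start, out):
        # iterative DFS that appends nodes to `out` in finish order
        stack = [(start, iter(graph[start]))]
        seen.add(start)
        while stack:
            node, it = stack[-1]
            advanced = False
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    advanced = True
                    break
            if not advanced:
                stack.pop()
                out.append(node)

    order = []
    for n in nodes:
        if n not in seen:
            dfs(fwd, n, order)
    seen = set()  # reset for the second pass on the reversed graph
    largest = 0
    for n in reversed(order):
        if n not in seen:
            component = []
            dfs(rev, n, component)
            largest = max(largest, len(component))
    return largest / len(nodes) if nodes else 0.0
```

For the cycle 0 -> 1 -> 2 -> 0 with a dangling node 3 reachable from 2, the LSCC is {0, 1, 2} and the fraction is 0.75; a single reciprocal pair gives 1.0.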
III. Conjecture: Trading networks contain information
We believe that network analysis is very useful for quantifying patterns of information transmission in an electronic limit order market. Specifically, we conjecture that orders that contain information about the fundamental value of an asset, as well as about the demand for and supply of liquidity for this asset, should have particular (star-shaped or diamond-shaped) execution patterns. In contrast, orders that carry little such information should exhibit very different patterns: they should contain many triangular and reciprocal connections.
To illustrate this intuition, we show in Figure 5 three sample networks, with their network
and financial statistics. The sample networks are chosen to display extremes in centralization
(CEN), assortativity index (AI), and the number of edges (E).
The left column of Figure 5 presents a star-shaped network with one dominant buyer
matched with many sellers. This pattern is consistent with the execution of a market order.
It has a centralization coefficient close to one, high standard deviation of degree and large
assortativity. This network is associated with a positive return and low period duration.
The center column of Figure 5 presents a network with four large traders forming diamond-shaped patterns as their limit orders are being crossed via intermediaries. A high assortativity index means that large traders are mostly matched with many small traders rather than with each other. A small largest strongly connected component suggests that large traders are not trading with each other directly: rather, they buy from or sell to small traders who quickly trade with another large trader. This pattern is associated with negative returns, as well as with higher volume and volatility.

20 For the clustering coefficient, we treat the edges as undirected, although directed clustering coefficients can also be defined. See Fagiolo (2007).
21 On motif analysis, see Milo et al. (2004).
The right column of Figure 5 presents a fairly uniform network with many buyers and sellers of various sizes. This situation is reflected in a pattern of connections that exhibits network parameters close to their sample averages, with the exception of the number of edges, reflecting a larger and more interconnected trading network. The financial variables estimated from transaction prices, the rate of return and volatility, are very close to their averages. Volume is somewhat above its sample average and period duration is quite high.
The examples above provide illustrative evidence in support of our conjecture. Our next step is to take the conjecture to the data, a time series of over 25,000 trading networks, and to show that metrics of order execution patterns are statistically related to returns, volatility, volume, and duration.
IV. Empirical properties of trading networks
A. Summary statistics for the financial variables
Table I presents summary statistics for the financial variables. All financial variables in our
sample are stationary. Standard ADF tests reject the null hypothesis of non-stationarity (p-
value = 0.00) for all financial variables. For the rate of return, the standard deviation dominates the mean, as expected. In addition, returns exhibit positive skewness. The range has a period average of 0.04 percent, which corresponds to an annualized average volatility of 24 percent, in line with estimates reported in the literature.22 Intertrade period duration in this very liquid market ranges from zero to 176 seconds. In our sample, 240 transactions occur every 19.5 seconds on average. Finally, volume, volatility, and duration are highly persistent, as
evidenced by autocorrelation coefficients at lags 1, 5, and 10.
22 Of the three measures of volatility (absolute returns, squared returns, and range), the range exhibits the lowest standard deviation (results available upon request). This is in line with the efficiency results of Parkinson (1980).
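The square-root-of-time scaling behind the 24 percent figure can be checked with back-of-the-envelope arithmetic. The trading calendar below (roughly 250 days of 8 trading hours) is our assumption for illustration, not a number taken from the paper:

```python
import math

period_vol = 0.0004                 # average range per window: 0.04 percent
avg_duration_s = 19.5               # average window length in seconds
seconds_per_year = 250 * 8 * 3600   # assumed trading calendar (not from the paper)
periods_per_year = seconds_per_year / avg_duration_s
annualized = period_vol * math.sqrt(periods_per_year)
print(round(annualized, 2))         # roughly 0.24, i.e., 24 percent
```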
B. Summary statistics for the network variables
Table II presents summary statistics for the network variables.23 All network variables in our study are stationary. However, Jarque-Bera tests for individual network variables reject the null
of normality at standard significance levels. Finally, all network variables exhibit persistent
autocorrelation functions.
Our combined measure of the difference between in-centralization and out-centralization, CEN = 0.00 ± 0.23, is on average equal to zero. However, this is partly due to the fact that the shapes of the distributions of in-centralization and out-centralization are very similar, which
indicates that both the buy side and the sell side of the limit order book are executed in a
symmetrically balanced way.
The distributions of in-centralization and out-centralization are also strongly negatively correlated. This indicates that when there are a few dominant buyers, the sellers tend to have
more even numbers of trading partners, and conversely, when there are a few dominant sell-
ers, the buyers tend to be more equal in trading partners. In other words, a large market order
or executable limit order to buy (sell) is likely to be executed against several small limit orders
on the sell (buy) side of the book.
These trading networks are highly dynamic, and the most central node in one period may
have few or no edges in the next. In other words, it is quite unlikely that a market order or
an executable limit order is so large that it spans several consecutive networks. Of the nearly 27,000 trading accounts that bought or sold S&P 500 E-mini futures contracts during August 2008, 17 accounts were extremely active, accounting for nearly 40 percent of all transactions. On the other hand, 85 percent of trading accounts accounted for only 10 percent of all transactions. The correlation in indegree from period to period for the 17 most active trading accounts varied between 0.28 and 0.64, while for the bottom 85 percent of accounts, the correlation was essentially zero.
At the level of the individual trader, indegree is slightly correlated with outdegree (INOUT = 0.08 ± 0.22), suggesting that traders who have more buying interactions tend to have a slightly higher number of selling interactions during the same transaction window.
Assortativity correlations in trading networks are on average negative. There is a moderate
negative correlation between the number of buyers a seller is selling to and the number of
sellers that a buyer is buying from. This relationship stems in part from the skewed degree
distribution. Most buyers have low indegree, therefore a seller with high outdegree must be
selling to many buyers with low indegree. Similarly, most sellers have low outdegree, and a
buyer with high indegree must be buying from many of them. The overall assortativity index,
which is computed by assigning equal weights to the four assortativity correlations, is slightly positive: AI = 0.09 ± 0.07. This means that, on average, when a seller (buyer) is matched with many buyers (sellers), they are just as likely to be transacting with many other sellers (buyers) as with few or no sellers (buyers). A deviation from this pattern indicates that one buyer or one seller is dominant.

23 While the table focuses on eight network variables, we have also analyzed a wide range of other network metrics, including Freeman centralization, reciprocity, individual pairwise assortativity metrics, and alternative centrality measures, including directed and undirected betweenness, closeness, and PageRank.
The global clustering coefficient, the ratio of observed triangular connections among nodes to all possible triangular connections, is 0.04 ± 0.03, nearly one standard deviation below the average clustering coefficient for randomized graphs with the same assignments of degrees. In other words, there is no tendency for the traders to cluster together.
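The degree-preserving benchmark can be generated with the standard double-edge-swap procedure. The sketch below is ours and omits the repeated-sampling loop one would use to average the clustering coefficient over many randomized graphs; the graph and parameter values in the demo are illustrative:

```python
import random

def rewire(edges, n_swaps, seed=3):
    """Degree-preserving randomization: pick two directed edges (a->b), (c->d)
    and rewire to (a->d), (c->b), keeping every node's in- and out-degree."""
    random.seed(seed)
    eset = set(edges)
    done = 0
    while done < n_swaps:
        (a, b), (c, d) = random.sample(sorted(eset), 2)
        # skip swaps that would create self-loops or duplicate edges
        if a == d or c == b or (a, d) in eset or (c, b) in eset:
            continue
        eset -= {(a, b), (c, d)}
        eset |= {(a, d), (c, b)}
        done += 1
    return eset

# demo: randomize a small deterministic directed graph
edges = {(i % 30, (i * 7 + 3) % 30) for i in range(30)} | \
        {((i * 3 + 1) % 30, (i * 5 + 2) % 30) for i in range(30)}
rewired = rewire(edges, 25)
```

Because each swap only re-pairs the endpoints, the multiset of indegrees and outdegrees is unchanged, which is exactly the null model the comparison in the text requires.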
Similar to the clustering coefficient, the size of the largest strongly connected component (0.04 ± 0.04) does not deviate from what would be expected for networks of that size, density, and distribution of in- and out-degrees. But as we will see in the following section, it does strongly correlate with density and other network variables.
V. Empirical analysis of trading networks
A. Correlations
Table III reports contemporaneous correlations among network variables. According to Ta-
ble III, the difference between buyer and seller centralization (CEN) is not correlated with any
other network variable. Average degree (AV DEG) is positively correlated with the standard
deviation of degree (SDDEG). This property is typical for a power law degree distribution, in
which a few dominant nodes have high degree (i.e., few accounts trade with a large number of
counterparties) and a large number of nodes have very low degree. As a result, for most power
law distributions, central moment estimators, like average degree and the standard deviation
of degree, grow with the sample size. Correlation between a node's indegree and outdegree (INOUT) boosts the variance in undirected degree (SDDEG). This means that large buyers
are also large sellers, while small buyers are also small sellers. By construction, the assor-
tativity index (AI) is highest when high degree buyers are matched with low degree sellers.
This is more likely to occur when the network is less dense and the largest strongly connected
component (LSCC) is small.
Table IV presents correlations between financial and network variables. The return process
exhibits a 68 percent correlation with centralization, but is uncorrelated with the other network variables. Intuitively, a large positive CEN results from a large market order or executable limit order to buy. This order is likely to push prices up, which results in a positive rate of return. The same intuition holds for a market order to sell.
Volatility is positively correlated with all the network variables, with the exception of the
assortativity index (AI). This means that when high degree buyers are matched with low
degree sellers (as in the case of a market order), volatility is somewhat smaller compared to
a situation when low degree buyers are matched with low degree sellers (several limit orders).
Intuitively, in a deep and liquid market like the E-mini S&P 500 futures, an incoming market
order has a significant chance to be executed against several limit orders sitting at (or near) the
same tick, resulting in high assortativity and centralization, but very little price impact and, hence, a low high-frequency volatility estimate. At the same time, intermediated execution of
two large limit orders from both sides of the limit order book will result in a positive high
frequency estimate of volatility, if only due to the bid-ask bounce.
Duration is positively correlated with the average degree and in-out degree correlation and
negatively correlated with the standard deviation of degree and the assortativity index. Intuitively,
a longer time interval between trades is associated with trades that are distributed more evenly
among traders, increasing the average degree and decreasing the standard deviation of degree
and the assortativity index. Over longer time intervals, it is also more likely that a node that
has a high indegree also has a high outdegree (it has time to be both a buyer and a seller),
which results in a positive in-out degree correlation.
B. Granger Causality
We next test for Granger causality in the context of Vector Autoregressive (VAR) models.
Since the variables exhibit heteroskedasticity and serial correlation, we estimate VAR models
using the generalized method of moments (GMM) and Newey-West robust standard errors.
We first consider a VAR model with eight network variables. According to the Akaike
Information Criterion, the system that includes all eight network variables has an optimal lag-
length of twenty.24 However, the results of the model with eight network variables (available
from the authors upon request) show strong evidence of feedback effects among the network
variables, i.e., network variables tend to Granger cause each other. In light of this, we use
standard tests to reduce the model to four network variables.25
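The mechanics of the test can be sketched without the GMM and Newey-West machinery the paper actually uses: fit the restricted and unrestricted lag regressions by OLS and form the standard F statistic. This is our illustrative sketch, with all names and parameter values our own:

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from OLS via normal equations
    (Gaussian elimination with partial pivoting)."""
    k, n = len(X[0]), len(y)
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)] for i in range(k)]
    b = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, k))) / A[i][i]
    return sum((y[t] - sum(X[t][j] * beta[j] for j in range(k))) ** 2 for t in range(n))

def granger_f(y, x, p=1):
    """F statistic for H0: lags 1..p of x do not help predict y."""
    idx = list(range(p, len(y)))
    Xr = [[1.0] + [y[t - i] for i in range(1, p + 1)] for t in idx]       # restricted
    Xu = [row + [x[t - i] for i in range(1, p + 1)] for row, t in zip(Xr, idx)]
    yy = [y[t] for t in idx]
    rss_r, rss_u = ols_rss(Xr, yy), ols_rss(Xu, yy)
    df = len(yy) - (1 + 2 * p)
    return ((rss_r - rss_u) / p) / (rss_u / df)

# demo: x Granger-causes y by construction, not vice versa
random.seed(0)
x = [random.gauss(0, 1) for _ in range(400)]
y = [0.0] * 400
for t in range(1, 400):
    y[t] = 0.9 * x[t - 1] + random.gauss(0, 0.1)
f_xy = granger_f(y, x, p=1)   # large: lags of x predict y
f_yx = granger_f(x, y, p=1)   # small: lags of y do not predict x
```

In the simulated demo, the F statistic in the causal direction dwarfs the one in the non-causal direction, which is the qualitative pattern the tables below summarize with p-values.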
Tables V-IX provide the results (p-values) of Granger-non-causality tests. The last column and the last row of each table are labeled "All". In the last column we test whether each variable is Granger-caused by all the other variables in the system, while in the last row we test whether each variable Granger-causes any other variable in the system. The null hypothesis is that of Granger-non-causality. Therefore, a p-value greater than five percent indicates a failure to reject the null.
Table V presents p-values for the Granger-non-causality test among three sets of four network variables (three panels). Panel 1 shows that centralization (CEN) neither Granger-causes the other network variables (p-value = 0.5556) nor is Granger-caused by them (p-value = 0.2387). On the other hand, Panels 2 and 3 show that the remaining network variables Granger-cause each other.

24 Throughout the analysis we use both the Akaike and Schwarz Information Criteria.
25 Standard test statistics are available from the authors upon request.
Next, we test for Granger-causality between one financial variable and four network variables. Using standard techniques, we select groups of network variables that reflect degree properties at the level of a single node (e.g., centralization, standard deviation of degree, and in-out degree correlation), two nodes linked by an edge (assortativity index), connected triples of nodes (clustering coefficient), and the connectivity of the whole network (the proportion of nodes in the largest strongly connected component).
Table VI presents p-values for the Granger-non-causality test for the rate of return and
network variables. We find that the return process is both Granger-caused by and Granger-causes
network variables. The network variable that has a strong impact on returns is centralization.
This is in line with the correlation results in Table IV.
Table VII reports Granger-non-causality test results for the volatility process and network
variables. Similarly to the return process, we find a feedback effect between volatility and
network variables: volatility is both Granger-caused by network variables and Granger-causes
them.
Table VIII reports Granger-non-causality test results for intertrade duration and network
variables. We find that duration is Granger-caused by network variables (p-value =0.0000),
but does not Granger-cause network variables (p-value = 0.1811).
Finally, Table IX presents p-values for the Granger-non-causality test for volume and net-
work variables. The results show that volume is Granger-caused by network variables (p-value
= 0.0000) but does not Granger-cause network variables (p-value = 0.3662).
What are the possible reasons for the presence of feedback effects in Granger causality test
results for the rate of return and volatility (vis-a-vis the network variables) and the absence
of such effects for volume and duration? We believe that there is one fundamental reason for these empirical findings: our results for the price-based variables are polluted by noise.
Unlike volume, duration, and all the network variables, which we can measure directly, the
rate of return and volatility are estimated from transaction prices. As a result, the variables
we call the rate of return and volatility are noisy proxies for the unobservable characteristics
of the true price process. The level of noise at this very high frequency is so high that it is
very hard to effectively measure the interaction between network variables and the true price
process.
VI. Robustness
Our results are robust with respect to different markets, different observation periods, different
levels of aggregation, and different sampling frequencies. The results we report are for the E-mini S&P 500 futures for the month of August 2008 (over 6 million transactions). The results
remain qualitatively the same when we repeat all procedures for the same market for the month
of May 2008 (5.15 million transactions) at the sampling frequency of 240 transactions. The
main results also remain the same for the sampling frequency of 600 transactions. Namely,
both correlations and Granger-causality results hold. The results are also the same whether
we construct networks at the broker level or trading account level. Finally, the results remain
the same for other stock index futures markets as confirmed by the analysis of the E-mini
Nasdaq 100 (2.3 and 2.8 million transactions in May 2008 and August 2008, respectively) and
E-mini Dow Jones futures contracts for both May 2008 and August 2008 (1.8 and 2.4 million
transactions in May 2008 and August 2008, respectively) at the sampling frequencies of 240
and 600 transactions.
VII. Agent-based simulation model of trading networks
In order to further test that our empirical results do not arise by chance, and to examine the source of the high correlation between network centralization and returns, we construct an agent-based simulation model of trading networks. Figure 6 presents a snapshot of the simulation model.
The model setup is as follows. There is a fixed number of traders. At each interval of time,
a new buy or sell order is assigned at random to one of the traders. The order arrival time is
distributed according to a Poisson distribution. The order has an equal (50 percent) probability of being a buy or a sell of a single quantity at a single price. The quantity is lognormally distributed.
For a buy order, the price is set a small fixed number above the last transaction price plus a lognormally distributed random variable with a mean of zero. For a sell order, the price is set a small fixed number below the last transaction price plus a lognormally distributed random variable with a mean of zero. Setting the order price a small number above (below) the last transaction price for a buy (sell) order indicates a willingness on the part of the trader to buy (sell) at a slightly higher (lower) price than the market is currently trading at. The lognormally distributed, zero-mean random variable added to the order price represents the heterogeneity of beliefs. This parametric specification is consistent with a price function that arises in equilibrium under the assumption of heterogeneous beliefs about the true price process.26
Each incoming order is matched against previously placed orders by an automated match-
ing algorithm based on price and time priority. If a match is made, an edge is created be-
tween two traders.27 Immediately following a match, order quantities are updated for the two
matched traders. If the newly placed order is only partly fulfilled, the algorithm attempts to
26 See, for example, Scheinkman and Xiong (2003).
27 Since the orders are randomly assigned, it is possible that an edge connects two orders submitted by the same trader.
match the remaining quantity against another, more recent previously placed order. Orders are
set to expire after a fixed amount of time from when they are first created, at which point they
are cancelled and withdrawn from the market.
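The matching mechanics described above can be sketched as follows. This is our reconstruction of the setup, not the authors' NetLogo code: Poisson arrival times and order expiration are omitted, the zero-mean belief shock is drawn from a normal rather than a shifted lognormal distribution, and all parameter values (`tick`, the counts, the seed) are illustrative.

```python
import random

def simulate(n_traders=20, n_orders=500, tick=0.25, seed=7):
    """Price-time priority matching of randomly assigned orders.
    Returns the set of directed edges (seller -> buyer) and the last price."""
    random.seed(seed)
    last = 100.0
    bids, asks = [], []              # resting orders: (sort key, time, trader, qty)
    edges = set()
    for t in range(n_orders):
        trader = random.randrange(n_traders)
        qty = max(1, round(random.lognormvariate(0.0, 1.0)))
        shock = random.gauss(0.0, tick)      # heterogeneous-beliefs term (normal here)
        if random.random() < 0.5:            # buy, priced above the last trade
            price = last + tick + shock
            while qty and asks and asks[0][0] <= price:
                ap, at, seller, aq = asks[0]
                fill = min(qty, aq)
                qty -= fill
                edges.add((seller, trader))
                last = ap
                if aq > fill:
                    asks[0] = (ap, at, seller, aq - fill)
                else:
                    asks.pop(0)
            if qty:
                bids.append((-price, t, trader, qty))   # negate price: best bid first
                bids.sort()
        else:                                # sell, priced below the last trade
            price = last - tick + shock
            while qty and bids and -bids[0][0] >= price:
                nbp, bt, buyer, bq = bids[0]
                fill = min(qty, bq)
                qty -= fill
                edges.add((trader, buyer))
                last = -nbp
                if bq > fill:
                    bids[0] = (nbp, bt, buyer, bq - fill)
                else:
                    bids.pop(0)
            if qty:
                asks.append((price, t, trader, qty))
                asks.sort()
    return edges, last

edges, last = simulate()
```

Sorting the books on (price, time) tuples enforces price-time priority: the best-priced resting order is matched first, with earlier orders breaking price ties.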
Using the resulting simulated transactions, we construct trading networks using a procedure identical to the one used for the empirically observed data. Namely, we simulate 6 million transactions, segment the data into periods of 240 consecutive transactions, and compute network and financial statistics for each period. Just as in the actual trading data, a single order may be reflected in multiple transactions in adjacent time windows.
This setup allows for the possibility of heterogeneous beliefs about the price process, but imparts no intentionality or memory upon the traders. It allows us to discern which features of the trading networks are due to the arrival of information to the market, and which may be due to strategic behavior on the part of the traders.
We find that a sequence of orders with randomly distributed prices and quantities results in
network and financial variables that are very similar to those obtained from the futures market
data, but with the notable (and anticipated) exception of a dynamic structure. Specifically, we
find that contemporaneous correlations among the network variables, as well as correlations among network and financial variables, are very similar to those we estimate from the actual
market data.28 This confirms that our empirical results do not arise by chance.
We also use the agent-based simulation model to investigate possible sources of high cor-
relation between network centralization and returns. By observing the simulation, we find
that high correlation between centralization and returns reflects the network mechanics of
the information arrival process: a trader submitting a large buy order at a high price will be
matched against several existing sell orders, giving that trader a high indegree, and increasing
the centralization of the network. At the same time, because a greater number of sell orders
was matched, the market price goes up, yielding a positive rate of return. Moreover, we find that for the simulated data (but not the market data), centralization and other network variables Granger-cause returns, but not vice versa.29
At the same time, we also find that, as expected in a model with no intentionality or memory, Granger-causality tests among network variables and volatility, volume, and duration yield very weak results: feedback effects, lack of significance, or very poor fit. This suggests that the Granger-causality results that we find in the futures markets data arise from the behavior of traders and are not a statistical artifact.
28 Results are available from the authors upon request.
29 We use a lag length of one in the VAR. As expected from the lack of dynamics in the simulated data, the Akaike information criterion selects a lag length of one in the VAR specification and the Schwarz information criterion selects a lag length of zero.
VIII. Concluding remarks
We use network analysis to examine information transmission in an electronic limit order
market. We conjecture that orders that contain information about the fundamental value of an asset, as well as about the demand for and supply of liquidity for this asset, should have particular (star-shaped or diamond-shaped) execution patterns. In contrast, orders that carry little such information should exhibit very different patterns: they should contain many triangular and reciprocal connections.
We test this conjecture by computing a time series of trading networks from audit trail,
transaction-level data for all regular transactions in the September 2008 E-mini S&P 500 fu-
tures contract during the month of August 2008 (over 6 million transactions).
We find that star-shaped or diamond-shaped patterns, characterized by high centralization or assortativity and low transitivity (clustering coefficient) and connectedness, are positively related to returns and volume and negatively related to duration and volatility. In contrast, less heterogeneous patterns, those with centralization and assortativity close to zero (their averages), high transitivity, and high connectedness, are associated with average returns and volatility, and positively related to volume and duration.

Moreover, we find that network variables strongly Granger-cause intertrade duration and volume, but not the other way around. This suggests that patterns of order execution presage changes in duration and volume.
This is the first paper to employ network analysis to study the complex dynamics of an electronic limit order market using transaction-level data. While network analysis offers only a partial look into the limit order book (i.e., the executed part), network technology offers significant advantages for analyzing the complexity of electronic limit order trading beyond just financial variables.
References
[1] Allen, Franklin, and Ana Babus, 2008, Networks in Finance, Working Paper 08-07, Wharton
Financial Institutions Center, University of Pennsylvania.
[2] Andersen, T., Bollerslev, T., Diebold, F.X. and Labys, P., 2000, Great Realizations, Risk, 13,
105-108.
[3] Bandi, F. M. and Russell, J. R., 2006, Separating microstructure noise from volatility, Journal of
Financial Economics, 79, 655-692.
[4] Barndorff-Nielsen, O.E., Hansen, P.A., Lunde, A., and Shephard, N., 2008, Realised kernels in
practice: trades and quotes, manuscript.
[5] Beckers, S., 1983, Variance of security price returns based on high, low and closing prices, Journal
of Business 56, 97-112.
[6] Braha, Dan, and Bar-Yam, Y., 2006, From Centrality to Temporary Fame: Dynamic Centrality in
Complex Networks, Complexity 12(2), 59-63.
[7] Brunetti, Celso, and Lildholdt, P.M., 2006, Relative efficiency of return- and range-based volatility estimators, manuscript.
[8] Clark, P., 1973, A subordinated stochastic process model with finite variance for speculative
prices, Econometrica 41, 135-155.
[9] Christensen, K., and Podolski, M., 2005, Asymptotic theory of range-based estimation of inte-
grated variance of a continuous semi-martingale, manuscript.
[10] Engle, Robert, 2000, The econometrics of ultra-high-frequency data, Econometrica 68, 1-22.
[11] Engle, R., and Gallo, G., 2006, A multiple indicators model for volatility using intra-daily data,
Journal of Econometrics 131, 3-27.
[12] Engle, R., and Russell, J., 1998, Autoregressive conditional duration: A new model for irregularly
spaced transaction data, Econometrica 66, 1127-1162.
[13] Epps, T. and Epps, M., 1976, The stochastic dependence of security price changes and transaction
volumes: Implications for the mixture-of-distribution hypothesis, Econometrica 44, 305-321.
[14] Fagiolo, G., 2007, Clustering in complex directed networks, Physical Review E 76(2), 26107.
[15] Garman, M. and Klass, M., 1980, On the estimation of security price volatilities from historical
data, Journal of Business 53(1), 67-78.
[16] Hansen, P. and Lunde, A., 2006, Realized variance and market microstructure noise, Journal of
Business and Economic Statistics 24, 127-218.
[17] Hasbrouck, Joel, 2003, Intraday Price Formation in U.S. Equity Index Markets, Journal of Finance
58(6), 2375-2400.
[18] Hong, Harrison, and Jeremy C. Stein, 1999, A unified theory of underreaction, momentum trading
and overreaction in asset markets, Journal of Finance 54, 2143-2184.
[19] Kossinets, G. and Watts, D.J., 2006, Empirical Analysis of an Evolving Social Network, Science
311 (5757), 88-90.
[20] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and U.
Alon, 2004, Superfamilies of Evolved and Designed Networks, Science 303, 1538-1542.
[21] Newman, M. E. J., 2002, Assortative mixing in networks, Physical Review Letters 89, 208701.
[22] Newman, M. E. J., 2003, The structure and function of complex networks, SIAM Review 45, 167.
[23] Oomen, R., 2005, Properties of bias-corrected realized variance under alternative sampling
schemes, Journal of Financial Econometrics 3, 555-577.
[24] Parlour, Christine A. and Duane J. Seppi, 2008, Limit Order Markets: A Survey, Handbook of
Financial Intermediation and Banking, Boot, Arnoud W.A., and Anjan V. Thakor, eds., Elsevier
B.V., Oxford, UK.
[25] Scheinkman, Jose A. and Wei Xiong, 2003, Overconfidence and Speculative Bubbles, Journal of
Political Economy 111(6), 1183-1219.
[26] Tauchen, G. and Pitts, M., 1983, The price variability-volume relationship on speculative markets,
Econometrica 51, 485-505.
[27] Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo. Center for Connected Learn-
ing and Computer-Based Modeling. Northwestern University, Evanston, IL.
[28] Zhang, L., Mykland, P. A., and Ait-Sahalia, Y., 2005, A tale of two scales: Determining integrated
volatility with noisy high-frequency data, Journal of American Statistical Association, 100, 1394-
1411.
Figure 1: E-mini S&P 500 front month futures contract.
Figure 2: Example networks with node X having greater centrality than node Y for the specified measure (indegree, outdegree, betweenness, closeness).
Figure 3: Illustration of network assortativity. For three example networks, the four degree correlations and the resulting assortativity index are:

ρ(k_i^in, k_j^in)    -1    1   -1
ρ(k_i^out, k_j^out)  -1    1   -1
ρ(k_i^in, k_j^out)   -1   -1    1
ρ(k_i^out, k_j^in)   -1   -1    1
AI                    0    1   -1
Figure 4: A network containing two connected components, ABCDE and FGH. The largest strongly connected component is BCDE.
Figure 5: Examples of observed networks and their properties.

            Left     Center   Right
CEN         0.920    -0.092   0.116
AVDEG       2.027    3.269    2.418
SDDEG       8.134    5.960    4.949
INOUT       -0.470   -0.101   -0.069
AI          0.353    0.559    0.195
CC          0.001    0.049    0.019
LSCC        0.014    0.016    0.006
E           75       103      185
Returns     0.059    -0.019   0.000
Range       0.059    0.059    0.039
Volume      1104     1343     1191
Duration    0        12       16
Figure 6: A screenshot of the agent-based simulation using NetLogo (Wilensky 1999). As orders (denoted by squares) are randomly assigned to traders (denoted by human figures), an edge is drawn between them. When sell orders (black squares) are matched with buy orders (red squares), their quantities are reduced, and a directed edge is drawn between the traders.
Table I: Financial Variables: Summary Statistics
Returns Volatility Volume Duration
Mean 0.0002 0.0425 1236.6720 19.4941
Median 0.0000 0.0392 1153 14
Maximum 0.2165 0.2165 6645 176
Minimum -0.1378 0.0190 459 0
Std. Dev. 0.0271 0.0140 407.1451 17.4485
Skewness 0.0485 0.8273 2.6259 2.0299
Kurtosis 2.9876 6.4663 16.9377 9.3735
ADF prob 0.0001 0.0000 0.0000 0.0000
AC Lag 1 -0.001 [0.895] 0.187 [0.000] 0.528 [0.000] 0.473 [0.000]
AC Lag 5 -0.006 [0.062] 0.167 [0.000] 0.376 [0.000] 0.289 [0.000]
AC Lag 10 -0.011 [0.139] 0.151 [0.000] 0.284 [0.000] 0.241 [0.000]
ADF prob refers to the p-value of the ADF test for the null of a unit root.
AC Lag X [Q-test prob] refers to the p-value of the Portmanteau Q-test
for no serial correlation at lags X = 1, 5, and 10.
Table II: Network Variables: Summary Statistics
CEN AV DEG SDDEG INOUT AI CC LSCC E
Mean 0.0049 2.9105 5.1401 0.0836 0.0990 0.0426 0.0403 164.4727
Median 0.0065 2.8814 4.9923 0.0101 0.0766 0.0365 0.0192 116.0000
Maximum 0.9804 5.7391 12.7021 0.9887 0.5589 0.2966 0.4889 219.0000
Minimum -0.9844 1.9692 2.4073 -1.0000 -0.0664 0.0000 0.0050 64.0000
Std. Dev. 0.2338 0.3691 1.0587 0.2228 0.0699 0.0300 0.0484 19.4519
Skewness -0.0129 0.5758 0.9584 1.3233 0.9233 1.1996 2.3741 -0.5869
Kurtosis 2.8782 3.7464 4.6527 4.5685 3.7564 4.9867 10.3298 3.5793
ADF prob 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
AC Lag 1 0.139 [0.000] 0.317 [0.000] 0.165 [0.000] 0.175 [0.000] 0.102 [0.000] 0.203 [0.000] 0.246 [0.000] 0.308 [0.000]
AC Lag 5 0.046 [0.000] 0.108 [0.000] 0.060 [0.000] 0.056 [0.000] 0.042 [0.000] 0.086 [0.000] 0.082 [0.000] 0.144 [0.000]
AC Lag 10 0.036 [0.000] 0.074 [0.000] 0.031 [0.000] 0.048 [0.000] 0.042 [0.000] 0.063 [0.000] 0.047 [0.000] 0.125 [0.000]
ADF prob refers to the p-value of the ADF test for the null of a unit root.
AC Lag X [Q-test prob] refers to the p-value of the Portmanteau Q-test for no serial correlation at lags X = 1, 5, and 10.
Table III: Pairwise correlations between network variables

           CEN    AVDEG    SDDEG    INOUT       AI       CC     LSCC
CEN     1.0000
AVDEG  -0.0012   1.0000
SDDEG  -0.0015   0.0031   1.0000
INOUT  -0.0019   0.2367   0.5119   1.0000
AI     -0.0008  -0.1079  -0.2022  -0.6787   1.0000
CC     -0.0008   0.8074  -0.0248   0.2052  -0.1095   1.0000
LSCC   -0.0006   0.5042   0.4070   0.7226  -0.5019   0.4248   1.0000
Table IV: Correlations between financial and network variables

        Returns    Range   Volume  Duration
CEN      0.6774  -0.0076   0.0264   -0.0065
AVDEG   -0.0034   0.0415   0.0061    0.1000
SDDEG    0.0037   0.0747   0.2363   -0.1620
INOUT   -0.0061   0.0429   0.0853    0.0467
AI       0.0016   0.0635   0.0129   -0.0810
CC      -0.0032   0.0314   0.0320    0.0360
LSCC    -0.0076   0.0331   0.0884    0.0058
Table V: Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Panel 1: 20 lags
            CEN      AI      CC    LSCC     All
CEN              0.2143  0.4227  0.1731  0.2387
AI       0.6243          0.0004  0.0000  0.0000
CC       0.8711  0.0000          0.0000  0.0000
LSCC     0.4375  0.0002  0.0219          0.0003
All      0.5556  0.0000  0.0000  0.0000

Panel 2: 18 lags
          SDDEG      AI      CC    LSCC     All
SDDEG            0.1601  0.0054  0.0000  0.0000
AI       0.0000          0.0643  0.0078  0.0000
CC       0.0000  0.0000          0.0000  0.0000
LSCC     0.0000  0.0005  0.0001          0.0000
All      0.0000  0.0000  0.0000  0.0000

Panel 3: 14 lags
          INOUT      AI      CC    LSCC     All
INOUT            0.1384  0.0794  0.0000  0.0000
AI       0.0000          0.0016  0.0029  0.0000
CC       0.0475  0.0000          0.0000  0.0000
LSCC     0.0000  0.0000  0.0185          0.0000
All      0.0000  0.0000  0.0000  0.0000

VAR estimated using GMM with HAC robust standard errors.
Optimal lag-length (26) is selected using Akaike Information Criterion.
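As a rough illustration of the tests behind Tables V-IX, a bivariate Granger sketch in pure Python (ordinary OLS with a textbook F-statistic, not the paper's GMM/HAC estimation; the series and lag length are made up for the example):

```python
import random

def _solve(A, b):
    """Gaussian elimination with partial pivoting for A beta = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        beta[r] = (M[r][n] - sum(M[r][c] * beta[c]
                                 for c in range(r + 1, n))) / M[r][r]
    return beta

def _rss(y, cols):
    """Residual sum of squares of an OLS fit of y on the given columns."""
    k, n = len(cols), len(y)
    XtX = [[sum(cols[a][i] * cols[b][i] for i in range(n))
            for b in range(k)] for a in range(k)]
    Xty = [sum(cols[a][i] * y[i] for i in range(n)) for a in range(k)]
    beta = _solve(XtX, Xty)
    return sum((y[i] - sum(beta[a] * cols[a][i] for a in range(k))) ** 2
               for i in range(n))

def granger_f(y, x, p):
    """F-statistic for H0: p lags of x do not help predict y,
    given p lags of y and a constant (Granger non-causality)."""
    n = len(y)
    yy = y[p:]
    const = [1.0] * (n - p)
    ylags = [[y[t - k] for t in range(p, n)] for k in range(1, p + 1)]
    xlags = [[x[t - k] for t in range(p, n)] for k in range(1, p + 1)]
    rss_r = _rss(yy, [const] + ylags)          # restricted model
    rss_u = _rss(yy, [const] + ylags + xlags)  # unrestricted model
    return ((rss_r - rss_u) / p) / (rss_u / (len(yy) - (1 + 2 * p)))

# Example: x is white noise and y follows x with a one-period lag,
# so x should Granger-cause y but not the reverse.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(400)]
y = [0.0]
for t in range(1, 400):
    y.append(0.8 * x[t - 1] + rng.gauss(0, 1))

f_xy = granger_f(y, x, 2)  # large: x helps predict y
f_yx = granger_f(x, y, 2)  # small: y does not help predict x
```

In the paper's VAR setting the same restricted-versus-unrestricted comparison is run jointly across equations with HAC-robust inference; this sketch only conveys the basic idea behind the reported p-values.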
Table VI: Returns and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

          Returns     CEN      AI      CC    LSCC     All
Returns            0.0148  0.8984  0.3530  0.7630  0.0320
CEN        0.0000          0.4459  0.8080  0.2615  0.0000
AI         0.0306  0.1491          0.0006  0.0000  0.0000
CC         0.0235  0.0632  0.0000          0.0000  0.0000
LSCC       0.1056  0.0826  0.0003  0.0240          0.0002
All        0.0000  0.0132  0.0000  0.0001  0.0000

VAR estimated using GMM with HAC robust standard errors.
Optimal lag-length (18) is selected using Akaike Information Criterion.
Table VII: Volatility and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

            Volatility   SDDEG      AI      CC    LSCC     All
Volatility              0.0005  0.2350  0.0000  0.0019  0.0000
SDDEG          0.0000           0.0263  0.0063  0.0000  0.0000
AI             0.0020   0.0000          0.0717  0.0093  0.0000
CC             0.0000   0.0000  0.0000          0.0000  0.0000
LSCC           0.0000   0.0000  0.0003  0.0116          0.0000
All            0.0000   0.0000  0.0000  0.0000  0.0000

VAR estimated using GMM with HAC robust standard errors.
Optimal lag-length (18) is selected using Akaike Information Criterion.
Table VIII: Period Duration and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

           Duration   INOUT      AI      CC    LSCC     All
Duration            0.3328  0.0017  0.0000  0.0000  0.0000
INOUT      0.9526           0.0000  0.0000  0.1215  0.0000
AI         0.3345   0.0000          0.0020  0.0021  0.0000
CC         0.5520   0.0498  0.0000          0.0000  0.0000
LSCC       0.1211   0.0000  0.0000  0.0336          0.0000
All        0.1811   0.0000  0.0000  0.0000  0.0000

VAR estimated using GMM with HAC robust standard errors.
Optimal lag-length (15) is selected using Akaike Information Criterion.
Table IX: Volume and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

          Volume   SDDEG      AI      CC    LSCC     All
Volume            0.0014  0.0012  0.0000  0.0063  0.0000
SDDEG    0.0669           0.1752  0.0053  0.0000  0.0000
AI       0.2008   0.0000          0.0911  0.0166  0.0000
CC       0.3970   0.0000  0.0000          0.0000  0.0000
LSCC     0.4034   0.0000  0.0014  0.0002          0.0000
All      0.3662   0.0000  0.0000  0.0000  0.0000

VAR estimated using GMM with HAC robust standard errors.
Optimal lag-length (15) is selected using Akaike Information Criterion.