On the Information Properties of Trading Networks


  • 8/3/2019 On the Information Properties of Trading Networks


    On the Informational Properties of Trading Networks

    Lada Adamic, Celso Brunetti, Jeffrey Harris, and Andrei Kirilenko

    September 9, 2009

    ABSTRACT

    We apply network analysis to trace patterns of information transmission in an elec-

    tronic limit order market. If market orders or large executable limit orders are submitted

    by informed traders, then resulting star-shaped or diamond-shaped patterns of trading

    networks should be associated with large changes in returns, smaller volume, and short

    duration between trades. In contrast, the execution of small limit orders from uninformed

    traders should result in networks with many triangular and reciprocal patterns and be

    associated with smaller changes in returns, larger volume and longer duration between

    trades. We compute a time series of trading networks using audit trail, transaction-level

    data for all regular transactions in the September 2008 E-mini S&P 500 futures contract -

    the cornerstone of price discovery for the S&P 500 Index. We find that network met-

    rics that quantify the shape of a network are statistically significantly related to returns,

    volatility, volume, and duration.

    Lada Adamic is with the University of Michigan and the Commodity Futures Trading Commission, Celso

    Brunetti is with Johns Hopkins University and the Commodity Futures Trading Commission, Jeffrey Harris is

    with the Commodity Futures Trading Commission and the University of Delaware, and Andrei Kirilenko is with

    the Commodity Futures Trading Commission. We are grateful to Paul Tsyhura for invaluable assistance with

    the retrieval, organization, and processing of transaction-level data. We thank Pat Fishe, Pete Kyle, Antonio

    Mele, Han Ozsoylev, and seminar participants at the Chicago Mercantile Exchange, the Commodity Futures

    Trading Commission, the 2009 Econometric Society Summer Meetings in Barcelona, the Federal Reserve Board

    of Governors, NASDAQ, the Securities and Exchange Commission, and the University of Maryland for very

    helpful comments and suggestions. The views expressed in this paper are our own and do not constitute an

    official position of the Commodity Futures Trading Commission, its Commissioners or staff.


    Most securities exchanges around the world are electronic limit order markets. Yet, the

    analysis of electronic limit order trading has proven to be very challenging. To quote from

    the survey by Parlour and Seppi (2008): "Despite the simplicity of limit orders themselves,

    the economic interactions in limit order markets are complex because the associated state

    and action spaces are extremely large and because trading with limit orders is dynamic and

    generates non-linear payoffs."

    In this paper, we apply network analysis to quantify the dynamics of information transmission

    in an electronic limit order market - a complex dynamic problem. The networks we

    analyze are trading networks. We define a trading network as a set of traders engaged in trans-

    actions within a period of time. In graph theoretic terminology, a trading network is a graph,

    consisting of a set of nodes and a set of edges. Each node denotes a unique trader and an

    edge between two nodes denotes the occurrence of trading between two unique counterparties

    within a period of time. The direction of an edge indicates buy or sell transactions between

    unique counterparties. Namely, a directed edge from node A to node B indicates that trader A

    sold (one time or several times) to trader B during a specified period of time.

    A trading network formed over a designated number of transactions traces a pattern of

    order execution in the limit order book. By analyzing the shape of that pattern, we can quantify

    the structure of the executed portion of the book. For example, the execution of a market

    order will result in a star-shaped pattern with the node that submitted the market order in the

    center and nodes that connected to it as the market order marched through the limit order

    book in the periphery. This star-shaped network will also not have any triangular or reciprocal

    connections. In contrast, the execution of two large limit orders that arrived at different times

    will result in a diamond-shaped pattern with the two nodes that submitted large limit orders on

    the ends and market makers that provided the immediacy of execution (in small installments)

    in the middle. Finally, an execution of a sequence of small limit orders will look different

    from the execution of market or large executable limit orders. Some nodes will have more

    connections than others, but there will be no central dominant node or a diamond shape. There

    will be a number of triangular connections and some pairs of nodes will have edges that go

    both ways.

    If market orders or large executable limit orders are submitted by informed traders, then

    patterns of order execution should be informative beyond transaction prices, volume or trade

    duration. Intuitively, if market orders or large executable limit orders are submitted by in-

    formed traders, then resulting star-shaped or diamond-shaped trading networks should be as-

    sociated with large changes in returns, possibly smaller volume, and short duration between

    trades. Conversely, trading networks that are very dissimilar to a star or a diamond - e.g.,

    those with triangular and reciprocal patterns - should be associated with smaller changes in

    returns, possibly larger volume and longer duration between trades. Various network metrics

    that quantify the shape of a network - e.g., the number of central nodes or triangular connections

    in a network - should then be statistically related to returns, volatility, volume, and

    duration.


    In this paper we find evidence that network metrics serve as primitive measures of limit

    order book dynamics. Namely, we compute network and financial variables for all regular

    transactions that occurred during August 2008 in the nearby E-mini S&P 500 futures contract

    and find that network variables strongly Granger-cause intertrade duration and volume.

    This suggests that network metrics presage the appearance of this information in duration and

    volume. We also find that the network variable that quantifies centrality (or how star-shaped

    a pattern is) exhibits a very high contemporaneous correlation with returns. Similarly, the

    network variable that quantifies the assortativity of connections (or how diamond-shaped a

    pattern is) exhibits high contemporaneous correlation with volatility.

    These results are robust with respect to different equity index futures markets (E-mini

    Dow Jones and Nasdaq 100), different observation periods (May 2008 and August 2008),

    different levels of aggregation (at the broker level and individual trading account level), and

    different sampling frequencies (240 and 600 transactions). Correlation results can also be

    replicated in a simulated model, confirming that these empirical regularities do not arise by

    chance. Furthermore, the results do not depend on any parametric specifications or modeling

    assumptions.

    This is the first paper to empirically link trading networks that trace the execution of the

    limit order book with the dynamics of high frequency financial variables - transaction prices,

    quantities and duration. As such, it offers a way to analyze the dynamics of the executed

    portion of the limit order book from transaction level data.

    Empirical network analysis has previously been applied in finance to study investment

    decisions and corporate governance.1 In contrast to strategically-formed networks where par-

    ticipants prefer to associate with specific counterparties, the networks we study are trading

    networks in which connections are formed as a result of an automated matching algorithm and

    reflect the participants' beliefs about the valuation of an asset. These networks are also highly

    dynamic: whereas boards of directors and portfolio holdings evolve gradually, over weeks,

    months, or years, financial trading networks change second by second.

    Our paper proceeds as follows. In Section I, we describe our unique ultra high frequency

    data, explain how we chose the sampling frequency, and describe financial variables. In Sec-

    tion II, we describe network variables. In Section III we outline our conjecture of why patterns

    of order execution - trading networks - contain valuable information beyond prices, quantities,

    or intertrade duration. In Section IV, we present the empirical properties of network and finan-

    cial variables. In Section V, we analyze time series properties and employ Granger-causality

    tests between and among network and financial variables. Section VI demonstrates that our

    results are robust with respect to different markets, different observation periods, and differ-

    ent sampling frequencies. In Section VII, we use an agent-based simulation model of trading

    networks to further test that our empirical results do not arise by chance. Finally, Section VIII

    summarizes our findings and suggests further applications of the network analysis methodology

    to trading networks.

    1 For a recent survey, see Allen and Babus (2008).

    I. Data and Financial Variables

    We use audit trail, transaction-level data for all regular transactions in the September 2008

    E-mini S&P 500 futures contract. The transactions take place during the month of August

    2008 during the time when the markets for stocks underlying the S&P 500 Index are open:

    weekdays between 9:30 a.m. EST and 4:00 p.m. EST. The E-mini S&P 500 futures contract

    is a highly liquid, fully electronic, cash-settled contract traded on the CME GLOBEX trading

    platform. It is designed to track the price movements of the S&P 500 Index - the most widely

    followed benchmark of stock market performance. Empirically, the E-mini futures has been

    shown to contribute the most to price discovery for the S&P 500 Index.2 Price discovery

    typically occurs in the front month contract: in August 2008, the September 2008 futures

    contract is the front month, most actively traded contract (see Figure 1).

    For each transaction, we utilize the following data fields: date, time (up to the second),

    unique transaction ID (to identify consecutive transactions within a second), executing bro-

    ker, opposite broker, trading account of the executing broker, trading account of the opposite

    broker, buy or sell flag (for the executing broker), price, and quantity.3

    Using the audit trail-level of detail, we uniquely identify two trading accounts for each

    transaction: one for the broker who booked a buy and the opposite for the broker who booked

    a sale. Our dataset consists of over 6 million transactions that took place among 26950 trading

    accounts that belong to 346 brokers.

    We first test the quality of the data by applying standard filters designed to look for recording

    errors and outliers in the price and quantity series.4 We find the data to be of very high

    quality: the standard filters did not find any data irregularities.

    We then determine the optimal sampling frequency by utilizing two techniques designed

    to mitigate the effect of market microstructure noise in ultra high-frequency data.5 The first

    technique is developed in Andersen, Bollerslev, Diebold and Labys (2000) and is commonly

    referred to as the volatility signature plot. According to this technique, the effects of mar-

    ket microstructure noise in our data are mitigated at the level of 120 transactions or higher.

    The second technique is developed by Bandi and Russell (2006) to select the sampling frequency

    that minimizes the variance of market microstructure noise. According to the second

    technique, the optimal sampling frequency is just below 100 transactions. Neither technique

    makes any use of network variables. We adopt a very conservative approach and select 240

    transactions as the sampling frequency for our data.6

    2 See Hasbrouck (2003).

    3 While the data fields are named "executing broker" and "opposite broker", transaction data does not

    specify which trader initiated a transaction; in fact, for each transaction, there are two mirror entries for the two

    counterparties - one booking a sale and the other booking a buy.

    4 See Hansen and Lunde (2004).

    5 For the literature on the subject, see, among others, Zhang, Mykland and Ait-Sahalia (2005), Oomen (2005),

    Bandi and Russell (2006), Hansen and Lunde (2006), and Barndorff-Nielsen et al. (2008).

    For each period consisting of 240 transactions (which amount to a total of 25,104 such

    periods in our sample), we compute the following financial variables: returns, volatility,

    intertrade duration, and trading volume. These four variables are typically assumed to both

    contain and convey valuable information to market participants about the true (but unobserved)

    stochastic price process.7 Intuitively, market participants can learn about the true underlying

    price process by observing transaction prices, trading volume, and times between trades.

    Transaction prices contain valuable information about the true underlying price process,

    but with a possibly significant amount of noise due to, among other reasons, market mi-

    crostructure issues (e.g., bid-ask bounce), measurement issues (e.g., time scale, discrete re-

    alizations from a continuous process), and seasonality (e.g., predictable intraday patterns).8

    Both returns and their volatility are computed from observed prices and, thus, suffer from

    the same noise issues. However, a number of techniques have been developed to reduce the

    impact of different noise components in ultra high-frequency data. The techniques we use to

    deal with measurement errors and reduce market microstructure noise - filters and optimal

    sampling frequency - are described just above. In addition, we remove a predictable intraday

    seasonal component from the computed raw returns by regressing them on a constant and a

    sequence of dummy variables for each half-hour during the trading period. We then use the

    unexplained term as our measure of returns.9 We compute returns as differences in log prices

    using both the last price to the first price within the same period (close-to-open) and last prices

    for consecutive periods (close-to-close). The results reported below refer to the close-to-open

    deseasonalized returns, because we believe it to be an intuitively more appealing measure to

    compare with network variables (also cleaned of seasonality), which are defined within each

    sampling period. Having said that, the main results are not affected by the two different ways

    to compute returns nor by the deseasonalization procedure.10
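
    The return and deseasonalization steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the integer half-hour index are hypothetical. It relies on the fact that regressing a series on a constant and a set of mutually exclusive half-hour dummies yields, as the unexplained term, each observation minus the mean of its half-hour bin.

```python
import math
from collections import defaultdict

def close_to_open_return(prices):
    """Log return from the first to the last transaction price within one period."""
    return math.log(prices[-1]) - math.log(prices[0])

def close_to_close_returns(last_prices):
    """Log returns between the last prices of consecutive periods."""
    return [math.log(b) - math.log(a) for a, b in zip(last_prices, last_prices[1:])]

def deseasonalize(values, half_hour_bins):
    """Residuals from a regression on a constant and half-hour dummies.

    With mutually exclusive dummies the fitted value of each observation is
    the mean of its bin, so the residual is the value minus that bin mean.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for v, b in zip(values, half_hour_bins):
        sums[b] += v
        counts[b] += 1
    return [v - sums[b] / counts[b] for v, b in zip(values, half_hour_bins)]

# Hypothetical raw returns for four periods falling into two half-hour bins.
raw = [0.002, -0.001, 0.004, 0.000]
bins = [0, 0, 1, 1]
residuals = deseasonalize(raw, bins)
```

    The same `deseasonalize` helper could be applied to any other per-period series, in line with the statement that the same technique is applied to all financial and network variables.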

    6 In order to ensure the robustness of our results, we repeat our analysis at a higher sampling frequency (see

    our discussion on robustness later in the paper). The main results are unaffected.

    7 There is a vast theoretical and empirical literature on the subject. For a recent summary, see Manganelli (2005).

    8 See, for example, Engle (2000).

    9 We apply the same technique to all financial and network variables.

    10 We also used a Fourier flexible form to remove seasonality. It did not qualitatively change our results.

    Volatility is another measure that contains valuable information about the true underlying

    price process. As mentioned above, because it is computed from observed prices, it suffers

    from the same noise issues as returns. Moreover, volatility suffers from the fact that unlike

    prices, volatility is never directly observed. Thus, volatility estimates contain not only the

    volatility of the noise, but also a possibly nontrivial factor due to covariance between the

    true price process and the noise component.11 We use three measures to estimate volatility

    during each period: absolute returns, squared returns, and the price range. Absolute and

    squared returns are proxies for the standard deviation and variance of returns, respectively.

    The price range is defined as the difference between the high and low price (in logs) during

    the period. For the results reported below, we use the price range as the measure of volatility.

    Range-based volatility estimators have been shown to be more efficient than return-based

    volatility estimators, because they incorporate the full sample path of observed prices (to select

    a maximum and a minimum) rather than just open and close prices.12 Our main results are not

    affected by the choice of volatility estimator.
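
    The three volatility measures can be illustrated on a single period. The sketch below is an assumption-laden example (the function name and the prices are hypothetical); it shows a period whose first and last prices coincide, so that both return-based proxies are zero while the log price range still registers the movement within the period.

```python
import math

def volatility_proxies(prices):
    """Return (absolute return, squared return, log price range) for one period."""
    r = math.log(prices[-1]) - math.log(prices[0])
    price_range = math.log(max(prices)) - math.log(min(prices))
    return abs(r), r * r, price_range

# Hypothetical transaction prices within one period.
abs_ret, sq_ret, rng = volatility_proxies([1250.00, 1251.50, 1249.25, 1250.00])
```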

    Intertrade duration contains valuable information, because the estimation of characteristics

    of the true price process obtained during periods of shorter intertrade duration can be more

    precise. This would happen irrespective of the reasons for shorter intertrade duration: whether

    more frequent trading occurs due to more informed trading or more liquidity trading, more

    frequent sampling would result in greater precision with respect to the true process. Having

    said that, there is a view that since information is disseminated through trading, the interval

    of time between trades can be interpreted as a proxy for the arrival of new information to the

    market.13 We compute duration as the time (in seconds) elapsed between the start and end

    of the period. We compute three measures of duration: total (unweighted) period duration,

    volume weighted period duration, and average for 239 intertrade (within period) durations.

    The results reported below are for total period duration. The main results are unaffected by

    the way we compute intertrade duration.

    Trading volume contains valuable information, because volume together with observed

    transaction prices can be driven by a common latent factor often referred to in the literature

    as information intensity.14 Intuitively, during periods of higher volume, transaction prices

    also exhibit greater precision about the characteristics of the true underlying price process.

    We compute volume as the number of contracts both bought and sold during the observation

    period.
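
    A minimal sketch of the duration and volume computations for one period follows. It assumes time stamps in seconds and one quantity per transaction, and it counts each transaction's contracts once when summing volume; the function name and the sample values are hypothetical. Volume-weighted duration is omitted because its exact weighting is not spelled out in the text.

```python
def duration_and_volume(timestamps, quantities):
    """Total duration, mean intertrade duration, and volume for one period.

    timestamps: transaction times in seconds, in execution order.
    quantities: number of contracts in each transaction.
    """
    total_duration = timestamps[-1] - timestamps[0]
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = sum(gaps) / len(gaps)   # n transactions give n - 1 gaps (239 for n = 240)
    volume = sum(quantities)           # assumption: each transaction counted once
    return total_duration, mean_gap, volume

total, mean_gap, volume = duration_and_volume([0, 1, 1, 4], [2, 1, 5, 3])
```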

    11 A number of techniques have been developed to estimate volatility components separately by varying the

    time window. See, for example, Zhang, Mykland, and Ait-Sahalia (2005). Application of these techniques to trading

    networks will be explored in our future research.

    12 For the literature on price range as an efficient estimator of asset price volatility, see, for example, Parkinson

    (1980), Garman and Klass (1980), Beckers (1983), and Brunetti and Lildholdt (2006). In recent years, the price

    range has also been used to compute realized volatility in high frequency data. See, for example, Christensen

    and Podolskij (2009).

    13 See, for example, Engle and Russell (1998) and Engle (2000).

    14 There is a vast theoretical and empirical literature on the subject. See, for example, Clark (1973), Epps and

    Epps (1976), Tauchen and Pitts (1983), Admati and Pfleiderer (1988), Easley and O'Hara (1992), and Andersen

    (1996).

    II. Network variables

    Quantitative analysis of networks employs a set of standard metrics.15 Network metrics di-

    rectly depend on what is defined as a node, an edge, and a network. In our analysis, a node

    denotes a trading account, an edge indicates that a transaction has occurred between two trad-

    ing accounts, and a trading network is constructed from a specified number of consecutive

    transactions (e.g., 240 transactions) among trading accounts in an electronic limit order market.

    The formation of a trading network consists of three interconnected steps: (i) the arrival

    and accumulation of orders in the limit order book; (ii) the process of matching buy and sell

    orders; and (iii) the display of transaction prices for matched orders.16 From the network

    perspective, limit orders can be visualized as stubs (ends of edges) attached to a given node.

    At a simple level, each stub has a time stamp (for the time it was created), a direction ("in" for

    buy and "out" for sell), a price, and a quantity. A node can grow a large number of stubs

    subject to the specifics of its trading strategy, the costs of creating, modifying, maintaining,

    and cancelling stubs, as well as limits imposed by the stub-matching algorithm.17

    Depending on its attributes, each stub is assigned to a specific place in the limit order book.

    "In" stubs go into the "Buy Orders" side of the book and "Out" stubs go into the "Sell Orders"

    side. On each side of the order book, the stubs are sorted in accordance with the rules of a

    matching algorithm, e.g., by price and then the time stamp or by price, quantity and then the

    time stamp. The matching algorithm makes edges out of stubs by linking together top stubs

    from each side of the book, provided that they agree on a price.18
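
    The way a matching algorithm turns stubs into edges can be sketched with a simplified price-then-time rule. This is an illustration only and not the GLOBEX implementation: it crosses a static book in one pass, the `Stub` record and `match_fifo` function are hypothetical, and the rule that a trade prints at the price of the earlier stub is an assumption.

```python
from collections import namedtuple

# side is "in" for a buy stub and "out" for a sell stub.
Stub = namedtuple("Stub", "node side price qty time")

def match_fifo(stubs):
    """Link top stubs from each side by price, then time.

    Returns edges as (seller, buyer, price, quantity) tuples.
    """
    buys = sorted((s for s in stubs if s.side == "in"), key=lambda s: (-s.price, s.time))
    sells = sorted((s for s in stubs if s.side == "out"), key=lambda s: (s.price, s.time))
    edges = []
    bi = si = 0
    buy_left = buys[0].qty if buys else 0
    sell_left = sells[0].qty if sells else 0
    while bi < len(buys) and si < len(sells) and buys[bi].price >= sells[si].price:
        b, s = buys[bi], sells[si]
        qty = min(buy_left, sell_left)
        # Assumption: the trade prints at the price of the stub that arrived first.
        price = b.price if b.time < s.time else s.price
        edges.append((s.node, b.node, price, qty))
        buy_left -= qty
        sell_left -= qty
        if buy_left == 0:      # buy stub fully filled, move to the next one
            bi += 1
            buy_left = buys[bi].qty if bi < len(buys) else 0
        if sell_left == 0:     # sell stub fully filled, move to the next one
            si += 1
            sell_left = sells[si].qty if si < len(sells) else 0
    return edges

book = [
    Stub("D", "out", 101, 3, 0),
    Stub("B", "out", 100, 2, 1),
    Stub("C", "out", 100, 2, 2),
    Stub("A", "in", 101, 5, 3),   # large executable buy order arriving last
]
edges = match_fifo(book)
```

    In this hypothetical book the single large buy stub from node A is matched against three resting sell stubs, so every resulting edge points to A, which is the star-shaped pattern discussed in the introduction.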

    Once two stubs are connected by a matching algorithm, two things happen: an edge

    between two nodes is created and the associated transaction price at which the match was

    achieved is displayed for all nodes to see. After seeing a transaction price or a sequence of

    transaction prices, some nodes may decide to modify or remove some existing stubs or grow

    new stubs, thus affecting the network formation process.

    15 See Newman (2003) for a review of basic network concepts and quantitative indicators.

    16 Without loss of generality, the modification and/or removal of unmatched existing orders is viewed as a

    part of the order arrival process.

    17 For example, the Chicago Mercantile Exchange (CME) Group allows most trading firms to grow the following

    number of free stubs (known as "messages") for the products matched via its GLOBEX algorithm during

    regular business hours: 3,000 plus no more than a ratio of grown stubs to total executed volume for this product.

    This volume ratio is set at 4 to 1 for E-mini S&P 500 Futures and Spreads, 8 to 1 for E-mini NASDAQ-100

    Futures and Spreads, and 25 to 1 for E-mini Dow Futures and Spreads. Stubs grown in excess of 3,000 plus the

    product-specific volume ratio are penalized by a surcharge fee.

    18 For example, on the CME's GLOBEX Trading System, there are three algorithms to match stubs (orders):

    First In, First Out (FIFO); Pro Rata Allocation (Pro Rata); and Lead Market Maker (LMM). Quoting from the

    CME documentation available to the public, "FIFO uses price and time as the only criteria for filling an order: all

    orders at the same price level are filled according to time priority. Pro Rata matches orders based on price, top

    orders (the first order only that betters a market), and size. The LMM is a firm or trader designated by CME to

    make a two-sided market in an assigned product. This LMM will have the benefit of certain matching privileges

    and associated pricing concessions in return for meeting CME determined market obligations."

    Empirically, we construct trading networks as follows. At 9:30:00 a.m. EST on August 1,

    2008, we start counting transactions in the September 2008 E-mini S&P 500 futures contract.

    For each transaction, we know which account bought from or sold to which other account (or

    itself), at what price, and what number of contracts. We designate 240 consecutive transactions

    as one period. Transactions 1 through 240 mark the first period, transactions 241-480 mark

    the second period, and so on. While for each period, we do not observe the limit order book

    itself, we know that transactions occurred because market orders or limit orders were matched

    with existing orders in the limit order book. We can then trace the pattern of order execution

    or a trading network within each period. Even though the number of transactions for each

    period is the same, a pattern for a large market order executed over the period will look very

    different compared to a pattern for several smaller limit orders. Metrics that we compute for

    each network should be interpreted as quantitative measures of the pattern of order execution

    in the limit order book.
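
    The construction of a time series of trading networks from consecutive transactions can be sketched as below. The function name and the sample data are hypothetical, each transaction is reduced to a (seller, buyer) pair, and dropping a trailing incomplete period is an assumption. Repeated trades between the same pair collapse into one directed edge, and an account trading with itself appears as a self-loop.

```python
def trading_networks(transactions, period_size=240):
    """Split (seller, buyer) transactions into consecutive periods and return,
    for each full period, the set of unique directed edges seller -> buyer."""
    networks = []
    for start in range(0, len(transactions) - period_size + 1, period_size):
        period = transactions[start:start + period_size]
        networks.append({(seller, buyer) for seller, buyer in period})
    return networks

# Hypothetical stream of seven transactions, grouped three at a time.
stream = [("A", "B"), ("A", "B"), ("C", "B"),
          ("B", "A"), ("D", "D"), ("A", "C"),
          ("C", "A")]
networks = trading_networks(stream, period_size=3)
```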

    We realize that by taking snapshots of the market at equal transaction time intervals, we

    cannot hope to characterize the whole complexity of changes that take place in the underlying

    limit order book. Specifically, we cannot observe how the revelation of transaction prices

    translates into modifications or cancellations of existing orders and submissions of new orders.

    Or in terms of the network formation process, we cannot observe how nodes remove some

    existing stubs and grow new stubs.

    While we know that the process of trading network formation - stubs, edges, transaction

    prices, new stubs - goes on continuously, we must designate the number of transactions that

    add up to a trading network at a point in time. This designated number of transactions could

    be at times too small and at times too large to clearly capture the impact of order execution on

    the order book through network analysis within each period. However, as we analyze the time

    series properties of trading networks, a statistically significant pattern, if there is one, should

    emerge. In other words, the approach we take is to compute and analyze network metrics for

    a time series of consecutive trading networks rather than those for one aggregate network that

    emerges over the whole period.

    Given our intuition about how patterns should be related to the dynamics of transaction

    prices and quantities, we are interested in network metrics that can measure centrality (or how

    star-shaped a network is); assortativity of connections (or how diamond-shaped a network

    is); as well as those that can measure reciprocity, triangular connections, and the size of the

    network.

    The size of the network can be characterized in terms of the total number of nodes, denoted

    by N, and the total number of edges, denoted by E. From these two quantities we can also

    compute the average degree, AVDEG = E/N - the average number of nodes that a node is

    connected to, and the standard deviation of degree, STDEG - the standard deviation around

    this average. These two variables characterize the first and second moment, respectively, of

    the unconditional degree distribution.
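
    As a small worked example of the size metrics, the sketch below computes N, E, AVDEG, and STDEG for one network. The function name is hypothetical, and measuring STDEG on indegree with a population (divide by N) standard deviation is an assumption, since the text does not say which degree is used; indegree and outdegree share the same mean E/N.

```python
import math

def size_metrics(edges):
    """Return (N, E, AVDEG, STDEG) for a collection of directed edges (seller, buyer)."""
    nodes = {n for e in edges for n in e}
    n_nodes, n_edges = len(nodes), len(edges)
    indeg = {n: 0 for n in nodes}
    for _, buyer in edges:
        indeg[buyer] += 1
    avdeg = n_edges / n_nodes
    # Assumption: population standard deviation of indegree around E/N.
    stdeg = math.sqrt(sum((k - avdeg) ** 2 for k in indeg.values()) / n_nodes)
    return n_nodes, n_edges, avdeg, stdeg

# Hypothetical star: three sellers matched to one buyer A.
n_nodes, n_edges, avdeg, stdeg = size_metrics([("B", "A"), ("C", "A"), ("D", "A")])
```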

    Node centrality quantifies the position of a specific node on a network. There are several

    node centrality measures, the simplest one being degree, or how many edges a node has. In a

    directed network, degree can be further separated into indegree and outdegree in accordance

    with the number of incoming or outgoing edges of a node.

    However, the degree alone may not necessarily capture the role of a node on the network.

    For example, a node that has a relatively low degree, but acts as a connector between otherwise

    disconnected parts of the network, can be thought of as very central. To that end, there are

    measures of centrality that take into account not just the degree of a node, but its position

    relative to all other nodes in the network. For example, betweenness measures how many

    other pairs of nodes would have to go through the given node in order to reach one another

    in the shortest number of hops. Similarly, closeness measures how many hops away a node is

    on average from every other node in the network. Figure 2 illustrates different node centrality

    measures.

    Node centrality is a critical input into the calculation of network centralization, a measure

    that characterizes the inequality of connectivity among the nodes. In order to capture this

    inequality in connectivity within the network - whether there are a small number of nodes with

    high centrality and a large number of nodes with low centrality - we compute a centralization

    measure defined as centralization Gini:

    $$ G = \frac{\sum_{r=1}^{N} (2r - N - 1)\, k_i}{N E}, $$ (1)

    where $k_i$ is a node's centrality measure and r is a node's rank order number.

    Taking node i's degree as its centrality measure, we use the formula above to compute

    separate centralization measures for indegree and outdegree - incentralization, INCEN, and

    outcentralization, OUTCEN, respectively. By construction, these measures are 0 if every node

    has the same number of (incoming or outgoing) edges, and positive with increasing inequality:

    e.g., one node has all the incoming (outgoing) edges, the others have no incoming (outgoing)

    edges.

    We also compute a combined measure of incentralization and outcentralization: CEN = INCEN - OUTCEN.

    Intuitively, since we use a node's degree as a measure of its centrality, the

    difference between in and out centralization measures can be interpreted as the presence of a

    dominant buyer or seller. CEN will be equal to 1 if there is a dominant buyer and -1 if there is

    a dominant seller.
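
    The centralization measures can be sketched as follows, assuming the Gini takes the standard form in which degrees are ranked in ascending order (r = 1 for the smallest degree) and the sum of (2r - N - 1) times the ranked degree is divided by N times E. The function names and the example network are hypothetical.

```python
def gini_centralization(degrees, n_edges):
    """Gini of a degree sequence: sum_r (2r - N - 1) * k_(r) / (N * E),
    where k_(r) is the r-th smallest degree (ascending rank is an assumption)."""
    ranked = sorted(degrees)
    n = len(ranked)
    return sum((2 * r - n - 1) * k for r, k in enumerate(ranked, start=1)) / (n * n_edges)

def centralization(edges):
    """Return (INCEN, OUTCEN, CEN) for directed edges (seller, buyer)."""
    nodes = sorted({n for e in edges for n in e})
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for seller, buyer in edges:
        outdeg[seller] += 1
        indeg[buyer] += 1
    incen = gini_centralization(indeg.values(), len(edges))
    outcen = gini_centralization(outdeg.values(), len(edges))
    return incen, outcen, incen - outcen

# Hypothetical star with a dominant buyer A and four sellers.
incen, outcen, cen = centralization([("B", "A"), ("C", "A"), ("D", "A"), ("E", "A")])
```

    Under these assumptions a star with one dominant buyer among N nodes gives INCEN = (N - 1)/N and OUTCEN = 1/N, so CEN = (N - 2)/N, which is 0.6 here and approaches 1 as the star grows.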

    To measure whether a node is both a buyer and a seller, we compute the Pearson correlation

    coefficient between the indegree and the outdegree of each node, INOUT. A positive

    correlation indicates that nodes with many in edges also have many out edges - i.e., they are both

    buying and selling.
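
    A minimal sketch of the INOUT computation is given below. The helper names and the two example networks are hypothetical, and returning NaN when one degree series is constant is a choice made here, since the correlation is undefined in that case.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)

def inout_correlation(edges):
    """Correlation between indegree and outdegree across nodes."""
    nodes = sorted({n for e in edges for n in e})
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for seller, buyer in edges:
        outdeg[seller] += 1
        indeg[buyer] += 1
    return pearson([indeg[n] for n in nodes], [outdeg[n] for n in nodes])

# Reciprocal trading: every account both buys and sells.
reciprocal = inout_correlation([("A", "B"), ("B", "A"), ("A", "C"), ("C", "A")])
# Star with a dominant buyer: the buyer never sells, the sellers never buy.
star = inout_correlation([("B", "A"), ("C", "A"), ("D", "A")])
```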

    We also calculate statistical properties of nodes one edge away from each individual node

    or connectivity of node B conditional on it being connected to node A. Assortativity in net-

    works can represent any tendency of like to be connected with like for any node property

    (see Newman (2002)), but here we will apply it to degree. Large degree nodes (i.e., those

    with many edges) may connect more frequently to other large degree nodes or they may tend

    to connect to small degree nodes. Two large degree nodes connecting to a number of small

    degree nodes between them will result in a diamond-shaped network.

    One way to measure assortativity is by the Pearson correlation coefficient $\rho(k_i, k_j)$ for

    all edges $e_{ij}$. When the edges are directed there are four possible assortativity measures:

    $\rho(k_i^{in}, k_j^{in})$, $\rho(k_i^{in}, k_j^{out})$, $\rho(k_i^{out}, k_j^{in})$, and $\rho(k_i^{out}, k_j^{out})$, corresponding to the four conditional

    degree distributions.

    From these four correlation coefficients, we construct the following compound measure,

    that we call assortativity index for directed networks:

    AI=1

    4

    (kini ,k

    inj )+(k

    outi ,k

    outj )

    (kini ,k

    outj )+(k

    outi ,k

    inj )

    , (2)

    computed overall all edges ei j.
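
    The assortativity index can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the paper: it assumes that edges are (seller, buyer) pairs counted once per pair, that the sign pattern is the one implied by the values reported in Figure 3 (the same-type correlations enter positively, the mixed-type correlations negatively), and that a correlation with zero variance on either side is treated as 0. The function names are hypothetical.

```python
from math import sqrt


def pearson_or_zero(xs, ys):
    """Pearson correlation; 0.0 when undefined (an assumption of this sketch)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def assortativity_index(edges):
    """AI over all directed edges i -> j, with i the seller and j the buyer."""
    edge_list = sorted(set(edges))
    indeg, outdeg = {}, {}
    for i, j in edge_list:
        outdeg[i] = outdeg.get(i, 0) + 1
        indeg[j] = indeg.get(j, 0) + 1
    # Degree of each endpoint, listed edge by edge.
    in_i = [indeg.get(i, 0) for i, _ in edge_list]
    out_i = [outdeg.get(i, 0) for i, _ in edge_list]
    in_j = [indeg.get(j, 0) for _, j in edge_list]
    out_j = [outdeg.get(j, 0) for _, j in edge_list]
    same = pearson_or_zero(in_i, in_j) + pearson_or_zero(out_i, out_j)
    mixed = pearson_or_zero(in_i, out_j) + pearson_or_zero(out_i, in_j)
    return 0.25 * (same - mixed)


# Seller A sells to B and C; seller D also sells to C.
example_ai = assortativity_index([("A", "B"), ("A", "C"), ("D", "C")])
```

    In this small example only the (out, in) correlation is defined; it equals -0.5, so the index is 0.125.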

    Figure 3 illustrates network assortativity. For example, in the context of trading networks,
    the coefficient $\rho(k^{out}_i, k^{in}_j)$ measures the correlation between the number of unique buyers
    (connected by an outward pointing edge) a seller is selling to (denoted by $k^{out}_i$) and the number of
    unique sellers those buyers are buying from (denoted by $k^{in}_j$). A negative $\rho(k^{out}_i, k^{in}_j)$ would
    mean that when a seller has matched to many buyers, those buyers are likely to be transacting
    with few or no other sellers.

    We also measure whether nodes one edge away from each individual node form particular
    (e.g., triangular) patterns. Transitivity, also termed clustering, measures the prevalence of
    closed triads in the network. In this paper, we use the global clustering coefficient, denoted by
    CC, as a measure of transitivity:19

    $$CC = \frac{3 \times \text{number of triangles in the network}}{\text{number of connected triples of vertices}}, \qquad (3)$$

    19 See Newman (2003).


    where a connected triple means three nodes A, B, and C such that there is an edge AB and
    an edge BC.20 The prevalence of specific directed triads can be used to conduct a motif
    analysis on a directed network.21
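
    The global clustering coefficient in equation (3) can be computed by counting, for every node, the pairs of its neighbors (connected triples centered at that node) and the share of those pairs that are themselves linked. The Python sketch below treats edges as undirected, as in the text; the function name is hypothetical, and returning 0 for a network without connected triples is an assumption of this sketch.

```python
from itertools import combinations


def global_clustering(edges):
    """CC = 3 * triangles / connected triples, with edges taken as undirected."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # self-loops form no triples
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    triples = 0  # paths u - center - v
    closed = 0   # triples whose end points are also linked
    for center, adjacent in neighbors.items():
        for u, v in combinations(adjacent, 2):
            triples += 1
            if v in neighbors[u]:
                closed += 1
    # Each triangle closes three triples (one per center), so `closed`
    # already equals 3 * number of triangles.
    return closed / triples if triples else 0.0


# A triangle A-B-C with one extra node D attached to C.
cc_example = global_clustering([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
```

    Here there are five connected triples, three of which are closed by the single triangle, giving CC = 0.6.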

    Finally, in addition to regularities in connections between pairs and triplets of nodes, a network
    as a whole may be composed of several separate connected components. A connected
    component is a maximal subset of nodes such that any node can be reached from any other
    node by traversing edges. Within a strongly connected component any node can be reached
    from any other by following directed edges. Figure 4 illustrates the largest strongly connected
    component (LSCC). Once the largest strongly connected component is identified, we can measure
    the global network structure by computing LSCC, the proportion of the network occupied
    by this component.

    Intuitively, the largest strongly connected component can only occupy a significant portion

    of the network if many nodes have both incoming and outgoing edges during the same time

    period, and there are cycles (the simplest of which are reciprocal ties and the triads mentioned

    above) within the network. In other words, a large strongly connected component is much

    more likely to emerge as a result of a large number of limit orders than one large market order.
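
    One way to compute the LSCC statistic is with a standard two-pass depth-first search (Kosaraju's algorithm), sketched below in Python. The choice of algorithm, the function name, and the example edge list are assumptions made for illustration; the edge list is a hypothetical network that is merely consistent with the description of Figure 4. The sketch measures the proportion as nodes in the largest strongly connected component divided by all nodes, so a network without cycles yields 1/n.

```python
def largest_scc_share(edges):
    """Share of nodes that lie in the largest strongly connected component."""
    nodes = {node for edge in edges for node in edge}
    forward = {node: [] for node in nodes}
    backward = {node: [] for node in nodes}
    for a, b in set(edges):
        forward[a].append(b)
        backward[b].append(a)

    # Pass 1: depth-first search on the original graph, recording finish order.
    finished, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                finished.append(node)
                stack.pop()

    # Pass 2: search the reversed graph in decreasing finish time;
    # each search tree is one strongly connected component.
    assigned, best = set(), 0
    for start in reversed(finished):
        if start in assigned:
            continue
        assigned.add(start)
        size, stack = 0, [start]
        while stack:
            node = stack.pop()
            size += 1
            for nxt in backward[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0


# Hypothetical edges: A feeds a cycle B -> C -> D -> E -> B; F -> G -> H is separate.
figure4_like = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "B"),
                ("F", "G"), ("G", "H")]
lscc_example = largest_scc_share(figure4_like)
```

    In the example, the cycle B, C, D, E is the largest strongly connected component, so four of the eight nodes are covered.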

    III. Conjecture: Trading networks contain information

    We believe that network analysis is very useful for quantifying patterns of information transmission
    in an electronic limit order market. Specifically, we conjecture that orders that contain
    information about the fundamental value of an asset, as well as demand and supply of liquidity
    for this asset, should have particular (star-shaped or diamond-shaped) execution patterns.
    In contrast, orders that have little such information should exhibit very different patterns,
    namely, they should contain many triangular and reciprocal connections.

    To illustrate this intuition, we show in Figure 5 three sample networks, with their network

    and financial statistics. The sample networks are chosen to display extremes in centralization

    (CEN), assortativity index (AI), and the number of edges (E).

    The left column of Figure 5 presents a star-shaped network with one dominant buyer

    matched with many sellers. This pattern is consistent with the execution of a market order.

    It has a centralization coefficient close to one, high standard deviation of degree and large

    assortativity. This network is associated with a positive return and low period duration.

    The center column of Figure 5 presents a network with four large traders forming diamond-shaped
    patterns as their limit orders are being crossed via intermediaries. A high assortativity
    index means that large traders are mostly matched with many small traders rather than with
    each other. A small largest strongly connected component suggests that large traders are not
    trading with each other: rather, they buy from or sell to small traders who quickly trade with
    another large trader. This pattern is associated with negative returns, as well as with higher
    volume and volatility.

    20 For the clustering coefficient, we are treating the edges as undirected, although directed clustering
    coefficients can also be defined. See Fagiolo (2007).
    21 On motif analysis, see Milo et al. (2004).

    The right column of Figure 5 presents a fairly uniform network with many buyers and sellers
    of various sizes. This situation is reflected in a pattern of connections that exhibits network
    parameters close to their sample averages, with the exception of the number of edges, reflecting
    a larger and more interconnected trading network. The financial variables estimated
    from transaction prices - the rate of return and volatility - are very close to their averages.
    Volume is somewhat above its sample average and period duration is quite high.

    The examples above provide illustrative evidence in support of our intuitive conjecture.
    Our next step is to take our conjecture to the data, a time series of over 25,000 trading
    networks, and to show that metrics of order execution patterns are statistically related
    to returns, volatility, volume, and duration.

    IV. Empirical properties of trading networks

    A. Summary statistics for the financial variables

    Table I presents summary statistics for the financial variables. All financial variables in our

    sample are stationary. Standard ADF tests reject the null hypothesis of non-stationarity (p-

    value = 0.00) for all financial variables. For the rate of return, the standard deviation dominates

    the mean as expected. In addition, returns exhibit positive skewness. The range has a period

    average of 0.04 percent, which corresponds to an annualized average volatility of 24 percent, in line with estimates reported in the literature.22 Intertrade period duration in this very

    liquid market ranges from zero to 176 seconds. In our sample, 240 transactions on average

    occur every 19.5 seconds. Finally, volume, volatility, and duration are highly persistent, as

    evidenced by autocorrelation coefficients at lags 1, 5, and 10.

    22 Of the three measures of volatility - absolute returns, squared returns, and range - the range exhibits the

    lowest standard deviation (results available upon request). This is in line with the efficiency results of Parkinson

    (1980).


    B. Summary statistics for the network variables

    Table II presents summary statistics for the network variables.23 All network variables under study

    are stationary. However, Jarque-Bera tests for individual network variables reject the null

    of normality at standard significance levels. Finally, all network variables exhibit persistent

    autocorrelation functions.

    Our combined measure of the difference between in-centralization and out-centralization,

    CEN = 0.00 ± 0.23, is on average equal to zero. However, this is partly due to the fact that the shapes of distributions of incentralization and outcentralization are very similar, which

    indicates that both the buy side and the sell side of the limit order book are executed in a

    symmetrically balanced way.

    The distributions of incentralization and outcentralization are also strongly negatively cor-

    related. This indicates that when there are a few dominant buyers, the sellers tend to have

    more even numbers of trading partners, and conversely, when there are a few dominant sell-

    ers, the buyers tend to be more equal in trading partners. In other words, a large market order

    or executable limit order to buy (sell) is likely to be executed against several small limit orders

    on the sell (buy) side of the book.

    These trading networks are highly dynamic, and the most central node in one period may

    have few or no edges in the next. In other words, it is quite unlikely that a market order or

    an executable limit order is so large that it spans several consecutive networks. Of the nearly

    27000 trading accounts who bought or sold S&P 500 E-mini futures contracts during August

    2008, 17 accounts were extremely active, accounting for nearly 40 percent of all transactions.

    On the other hand, 85 percent of trading accounts traded only 10 percent of all transactions.

    The correlation in indegree from period to period for the individual 17 most active trading

    accounts varied between [0.28] and [0.64], while for the bottom 85 percent of accounts, the

    correlation was essentially zero.

    At the level of the individual trader, the indegree is slightly correlated with outdegree

    (INOUT = 0.08 ± 0.22), suggesting that traders who have more buying interactions will tend to have a slightly higher number of selling interactions during the same transaction window.

    Assortativity correlations in trading networks are on average negative. There is a moderate

    negative correlation between the number of buyers a seller is selling to and the number of

    sellers that a buyer is buying from. This relationship stems in part from the skewed degree

    distribution. Most buyers have low indegree, therefore a seller with high outdegree must be

    selling to many buyers with low indegree. Similarly, most sellers have low outdegree, and a

    buyer with high indegree must be buying from many of them. The overall assortativity index,

    which is computed by assigning equal weights to the four assortativity correlations, is slightly
    positive: AI = 0.09 ± 0.07. This means that on average, when a seller (buyer) is matched with
    many buyers (sellers) they are just as likely to be transacting with many other sellers (buyers)
    as with a few or no sellers (buyers). A deviation from this pattern indicates that one buyer or
    one seller is dominant.

    23 While the table focuses on eight network variables, we have also analyzed a wide range of other network
    metrics, including Freeman centralization, reciprocity, individual pairwise assortativity metrics, and alternate
    centrality measures, including directed and undirected betweenness, closeness, and PageRank.

    The global clustering coefficient, or the ratio of observed triangular connections among
    nodes to all possible triangular connections, is 0.04 ± 0.03, nearly one standard deviation below
    the average clustering coefficient for randomized graphs with the same assignments of degrees.
    In other words, there is no tendency for the traders to cluster together.

    Similar to the clustering coefficient, the size of the largest strongly connected component
    (0.04 ± 0.04) does not deviate from what would be expected for networks of that size, density,
    and distribution of in and out degrees. But as we will see in the following section, it does
    strongly correlate with density and other network variables.

    V. Empirical analysis of trading networks

    A. Correlations

    Table III reports contemporaneous correlations among network variables. According to Ta-

    ble III, the difference between buyer and seller centralization (CEN) is not correlated with any

    other network variable. Average degree (AV DEG) is positively correlated with the standard

    deviation of degree (SDDEG). This property is typical for a power law degree distribution, in

    which a few dominant nodes have high degree (i.e., few accounts trade with a large number of

    counterparties) and a large number of nodes have very low degree. As a result, for most power

    law distributions, central moment estimators, like average degree and the standard deviation

    of degree, grow with the sample size. Correlations between a node's indegree and outdegree (INOUT) boost the variance in undirected degree (SDDEG). This means that large buyers

    are also large sellers, while small buyers are also small sellers. By construction, the assor-

    tativity index (AI) is highest when high degree buyers are matched with low degree sellers.

    This is more likely to occur when the network is less dense and the largest strongly connected

    component (LSCC) is small.

    Table IV presents correlations between financial and network variables. The return process

    exhibits a 68 percent correlation with centralization, but with no other network variable. Intuitively,
    a large positive CEN results from a large market order or executable limit order to buy. This
    order is likely to push prices up, which results in a positive rate of return. The same intuition

    holds for a market order to sell.

    Volatility is positively correlated with all the network variables, with the exception of the

    assortativity index (AI). This means that when high degree buyers are matched with low

    degree sellers (like in the case of a market order), volatility is somewhat smaller compared to


    a situation when low degree buyers are matched with low degree sellers (several limit orders).

    Intuitively, in a deep and liquid market like the E-mini S&P 500 futures, an incoming market

    order has a significant chance to be executed against several limit orders sitting at (or near) the

    same tick, resulting in high assortativity and centralization, but very little price impact and,

    hence, a low high-frequency volatility estimate. At the same time, intermediated execution of

    two large limit orders from both sides of the limit order book will result in a positive high

    frequency estimate of volatility, if only due to the bid-ask bounce.

    Duration is positively correlated with the average degree and in-out degree correlation and

    negatively correlated with the standard deviation of degree and the assortativity index. Intuitively,

    a longer time interval between trades is associated with trades that are distributed more evenly

    among traders, increasing the average degree and decreasing the standard deviation of degree

    and the assortativity index. Over longer time intervals, it is also more likely that a node that

    has a high indegree also has a high outdegree (it has time to be both a buyer and a seller),

    which results in a positive in-out degree correlation.

    B. Granger Causality

    We next test for Granger causality in the context of Vector Autoregressive (VAR) models.

    Since the variables exhibit heteroskedasticity and serial correlation, we estimate VAR models

    using the generalized method of moments (GMM) and Newey-West robust standard errors.

    We first consider a VAR model with eight network variables. According to the Akaike

    Information Criterion, the system that includes all eight network variables has an optimal lag-

    length of twenty.24 However, the results of the model with eight network variables (available

    from the authors upon request) show strong evidence of feedback effects among the network

    variables, i.e., network variables tend to Granger cause each other. In light of this, we use

    standard tests to reduce the model to four network variables.25

    Tables V-IX provide the results (p-values) of Granger-non-causality tests. The last column
    and the last row of each table are labelled "all". In the last column we test whether each

    variable is Granger-caused by all the other variables in the system, while the last row is testing

    whether each variable is Granger-causing any other variable in the system. The null hypothesis

    is that of Granger-non-causality. Therefore, a p-value greater than five percent indicates a

    failure to reject the null.

    Table V presents p-values for the Granger-non-causality test among three sets of four network
    variables (three panels). Panel 1 shows that centralization (CEN) does not Granger-cause the
    other network variables (p-value = 0.5556) and is not Granger-caused by other network variables
    (p-value = 0.2387). On the other hand, Panels 2 and 3 show that the remaining network
    variables Granger-cause each other.

    24 Throughout the analysis we use both Akaike and Schwarz Information Criteria.
    25 Standard test statistics are available from the authors upon request.

    Next, we test for Granger-causality between one financial variable and four network vari-

    ables. Using standard techniques, we select groups of network variables that reflect degree

    properties at the level of a single node (e.g., centralization, standard deviation of degree, and

    in and out degree correlation), two nodes linked by an edge (assortativity index), connected

    triples of nodes (clustering coefficient), and the connectivity of the whole network (the proportion of nodes in the largest strongly connected component).

    Table VI presents p-values for the Granger-non-causality test for the rate of return and

    network variables. We find that the return process is both Granger-caused and Granger-causes

    network variables. The network variable that has a strong impact on returns is centralization.

    This is in line with the correlation results in Table IV.

    Table VII reports Granger-non-causality test results for the volatility process and network

    variables. Similarly to the return process, we find a feedback effect between volatility and

    network variables: volatility is both Granger-caused by network variables and Granger-causes

    them.

    Table VIII reports Granger-non-causality test results for intertrade duration and network

    variables. We find that duration is Granger-caused by network variables (p-value = 0.0000),

    but does not Granger-cause network variables (p-value = 0.1811).

    Finally, Table IX presents p-values for the Granger-non-causality test for volume and net-

    work variables. The results show that volume is Granger-caused by network variables (p-value

    = 0.0000) but does not Granger-cause network variables (p-value = 0.3662).

    What are the possible reasons for the presence of feedback effects in Granger causality test

    results for the rate of return and volatility (vis-a-vis the network variables) and the absence

    of such effects for volume and duration? We believe that there is one fundamental reason for these empirical findings: our results for the price-based variables are polluted by noise.

    Unlike volume, duration, and all the network variables, which we can measure directly, the

    rate of return and volatility are estimated from transaction prices. As a result, the variables

    we call the rate of return and volatility are noisy proxies for the unobservable characteristics

    of the true price process. The level of noise at this very high frequency is so high that it is

    very hard to effectively measure the interaction between network variables and the true price

    process.

    VI. Robustness

    Our results are robust with respect to different markets, different observation periods, different

    levels of aggregation, and different sampling frequencies. The results we report are for the E-


    mini S&P 500 futures for the month of August 2008 (over 6 million transactions). The results

    remain qualitatively the same when we repeat all procedures for the same market for the month

    of May 2008 (5.15 million transactions) at the sampling frequency of 240 transactions. The

    main results also remain the same for the sampling frequency of 600 transactions. Namely,

    both correlations and Granger-causality results hold. The results are also the same whether

    we construct networks at the broker level or trading account level. Finally, the results remain

    the same for other stock index futures markets, as confirmed by the analysis of the E-mini
    Nasdaq 100 (2.3 and 2.8 million transactions in May 2008 and August 2008, respectively) and
    E-mini Dow Jones futures contracts (1.8 and 2.4 million transactions in May 2008 and
    August 2008, respectively) at the sampling frequencies of 240 and 600 transactions.

    VII. Agent-based simulation model of trading networks

    In order to further test that our empirical results do not arise by chance and to examine the
    source of the high correlation between network centralization and returns, we construct an
    agent-based simulation model of trading networks. Figure 6 presents a snapshot of the
    simulation model.

    The model setup is as follows. There is a fixed number of traders. At each interval of time,

    a new buy or sell order is assigned at random to one of the traders. The order arrival time is

    distributed according to a Poisson distribution. The order has an equal (50 percent) probability

    to either buy or sell a single quantity at a single price. The quantity is lognormally distributed.

    For a sell order the price is set a small fixed number above the last transaction price plus

    a lognormally distributed random variable with a mean of zero. For a buy order, the price is

    set a small fixed number below the last transaction price plus a lognormally distributed ran-

    dom variable with a mean of zero. Setting the order a small number above (below) the last

    transaction price for a buy (sell) order indicates a willingness on the part of the trader to buy

    (sell) at a slightly higher (lower) price than the market is currently trading at. The lognormally

    distributed, zero mean random variable added to the order price represents the heterogene-

    ity of beliefs. This parametric specification is consistent with a price function that arises in

    equilibrium under the assumption of heterogeneous beliefs about the true price process.26

    Each incoming order is matched against previously placed orders by an automated match-

    ing algorithm based on price and time priority. If a match is made, an edge is created be-

    tween two traders.27 Immediately following a match, order quantities are updated for the two

    matched traders. If the newly placed order is only partly fulfilled, the algorithm attempts to
    match the remaining quantity against another, more recent previously placed order. Orders are
    set to expire after a fixed amount of time from when they are first created, at which point they
    are cancelled and withdrawn from the market.

    26 See, for example, Scheinkman and Xiong (2003).
    27 Since the orders are randomly assigned, it is possible that an edge connects two orders submitted by the
    same trader.
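
    The matching step described above can be sketched in Python as follows. This is an illustrative reading of price and time priority, not the simulation code used for the paper: the `Order` fields, the `match` function, and the example orders are all hypothetical, and the sketch omits random order generation and order expiry. Each trade is recorded as a (seller, buyer, quantity) triple, so that every trade corresponds to one directed edge from seller to buyer.

```python
from dataclasses import dataclass


@dataclass
class Order:
    trader: str
    side: str     # "buy" or "sell"
    price: float
    qty: int
    time: int     # arrival sequence number; smaller means earlier


def match(book, incoming):
    """Match `incoming` against resting orders in `book` by price, then time.

    Returns a list of (seller, buyer, quantity) trades, one per matched pair.
    Any unfilled remainder of `incoming` is left resting in the book.
    """
    if incoming.side == "buy":
        eligible = [o for o in book if o.side == "sell" and o.price <= incoming.price]
        eligible.sort(key=lambda o: (o.price, o.time))    # lowest ask, then oldest
    else:
        eligible = [o for o in book if o.side == "buy" and o.price >= incoming.price]
        eligible.sort(key=lambda o: (-o.price, o.time))   # highest bid, then oldest
    trades = []
    for resting in eligible:
        if incoming.qty == 0:
            break
        fill = min(incoming.qty, resting.qty)
        incoming.qty -= fill   # quantities are updated right after each match
        resting.qty -= fill
        if incoming.side == "buy":
            trades.append((resting.trader, incoming.trader, fill))
        else:
            trades.append((incoming.trader, resting.trader, fill))
    book[:] = [o for o in book if o.qty > 0]   # drop fully filled orders
    if incoming.qty > 0:
        book.append(incoming)
    return trades


# Hypothetical example: three resting sell orders, one incoming buy order.
book = [
    Order("T1", "sell", 100.25, 2, time=1),
    Order("T2", "sell", 100.00, 1, time=2),
    Order("T3", "sell", 100.00, 1, time=3),
]
trades = match(book, Order("T9", "buy", 100.25, 3, time=4))
```

    In the example, the buy order from T9 is filled first against the two cheaper sell orders in order of arrival and then against T1, producing three edges that all point to T9, the star-shaped pattern associated with a large executable order.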

    Using the resulting simulated transactions, we construct trading networks using the pro-

    cedure identical to the one used for the empirically observed data. Namely, we simulate 6

    million transactions, segment the data into periods of 240 consecutive transactions and compute
    network and financial statistics for each period. Just as in the actual trading, a single order may be reflected in multiple transactions in adjacent time windows.
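
    The segmentation step can be illustrated with a short Python sketch. It assumes, for illustration only, that each transaction is reduced to a (seller, buyer) pair, that each period's network is the set of unique directed edges observed in that period, and that a trailing incomplete period is dropped; the function name is hypothetical.

```python
def period_networks(transactions, window=240):
    """Yield one set of unique (seller, buyer) edges per full window."""
    for start in range(0, len(transactions) - window + 1, window):
        chunk = transactions[start:start + window]
        yield {(seller, buyer) for seller, buyer in chunk}


# Five hypothetical transactions with a window of two: the last one is dropped.
sample = [("a", "b"), ("a", "b"), ("a", "b"), ("c", "d"), ("e", "f")]
networks = list(period_networks(sample, window=2))
```

    Note that the repeated trade between "a" and "b" falls into two adjacent windows, which mirrors the observation that a single order may be reflected in adjacent periods.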

    This setup allows for a possibility of heterogeneous beliefs about the price process, but

    imparts no intentionality or memory upon the traders. It allows us to discern which features

    of the trading networks are due to the arrival of information to the market, and which may be

    due to strategic behavior on behalf of the traders.

    We find that a sequence of orders with randomly distributed prices and quantities results in

    network and financial variables that are very similar to those obtained from the futures market

    data, but with the notable (and anticipated) exception of a dynamic structure. Specifically, we

    find that contemporaneous correlations among the network variables, as well as correlations among network and financial variables are very similar to those we estimate from the actual

    market data.28 This confirms that our empirical results do not arise by chance.

    We also use the agent-based simulation model to investigate possible sources of high cor-

    relation between network centralization and returns. By observing the simulation, we find

    that high correlation between centralization and returns reflects the network mechanics of

    the information arrival process: a trader submitting a large buy order at a high price will be

    matched against several existing sell orders, giving that trader a high indegree, and increasing

    the centralization of the network. At the same time, because a greater number of sell orders

    was matched, the market price goes up, yielding a positive rate of return. Moreover, we find
    that for the simulated data (but not market data), centralization and other network variables
    Granger-cause returns, but not vice versa.29

    At the same time, we also find that, as expected in a model with no intentionality or
    memory, Granger-causality tests among network variables and volatility, volume and duration

    yield very weak results: feedback effects, lack of significance or very poor fit. This suggests

    that the Granger-causality results that we find in the futures markets data arise as a result of

    the behavior of traders and are not a statistical artifact.

    28 Results are available from the authors upon request.
    29 We use the lag-length of order one in the VAR. As expected from the lack of dynamics in the simulated data,
    the Akaike information criterion selects a lag-length of order one in the VAR specification and the Schwarz
    information criterion selects a lag-length of order zero.


    VIII. Concluding remarks

    We use network analysis to examine information transmission in an electronic limit order
    market. We conjecture that orders that contain information about the fundamental value of
    an asset, as well as demand and supply of liquidity for this asset, should have particular
    (star-shaped or diamond-shaped) execution patterns. In contrast, orders that have little such
    information should exhibit very different patterns, namely, they should contain many triangular
    and reciprocal connections.

    We test this conjecture by computing a time series of trading networks from audit trail,

    transaction-level data for all regular transactions in the September 2008 E-mini S&P 500 fu-

    tures contract during the month of August 2008 (over 6 million transactions).

    We find that star-shaped or diamond-shaped patterns, characterized by high centralization
    or assortativity and low transitivity (clustering coefficient) and connectedness, are positively
    related to returns and volume and negatively related to duration and volatility. In contrast,
    less heterogeneous patterns, those with centralization and assortativity close to zero (their
    averages), high transitivity and high connectedness, are associated with average returns and
    volatility, and positively related to volume and duration.

    Moreover, we find that network variables strongly Granger-cause intertrade duration and

    volume, but not the other way around. This suggests that patterns of order execution presage

    changes in duration and volume.

    This is the first paper to employ network analysis to study complex dynamics of an elec-

    tronic limit order market using transaction level data. While network analysis offers only a

    partial look into the limit order book (i.e., the executed part), network technology offers sig-

    nificant advantages for the task of analyzing the complexity of electronic limit order trading

    beyond just financial variables.


    References

    [1] Allen, Franklin, and Ana Babus, 2008, Networks in Finance, Working Paper 08-07, Wharton

    Financial Institutions Center, University of Pennsylvania.

    [2] Andersen, T., Bollerslev, T., Diebold, F.X. and Labys, P., 2000, Great Realizations, Risk, 13,

    105-108.

    [3] Bandi, F. M. and Russell, J. R., 2006, Separating microstructure noise from volatility, Journal of

    Financial Economics, 79, 655-692.

    [4] Barndorff-Nielsen, O.E., Hansen, P.A., Lunde, A., and Shephard, N., 2008, Realised kernels in

    practice: trades and quotes, manuscript.

    [5] Beckers, S., 1983, Variance of security price returns based on high, low and closing prices, Journal

    of Business 56, 97-112.

    [6] Braha, Dan, and Bar-Yam, Y., 2006, From Centrality to Temporary Fame: Dynamic Centrality in

    Complex Networks, Complexity 12(2), 59-63.

    [7] Brunetti, Celso, and Lildholdt, P.M., 2006, Relative efficiency of return- and range-based volatility estimators, manuscript.

    [8] Clark, P., 1973, A subordinated stochastic process model with finite variance for speculative

    prices, Econometrica 41, 135-155.

    [9] Christensen, K., and Podolski, M., 2005, Asymptotic theory of range-based estimation of inte-

    grated variance of a continuous semi-martingale, manuscript.

    [10] Engle, Robert, 2000, The econometrics of ultra-high-frequency data, Econometrica 68, 1-22.

    [11] Engle, R., and Gallo, G., 2006, A multiple indicators model for volatility using intra-daily data,

    Journal of Econometrics 131, 3-27.

    [12] Engle, R., and Russell, J., 1998, Autoregressive conditional duration: A new model for irregularly

    spaced transaction data, Econometrica 66, 1127-1162.

    [13] Epps, T. and Epps, M., 1976, The stochastic dependence of security price changes and transaction

    volumes: Implications for the mixture-of-distribution hypothesis, Econometrica 44, 305-321.

    [14] Fagiolo, G., 2007, Clustering in complex directed networks, Physical Review E 76(2), 26107.

    [15] Garman, M. and Klass, M., 1980, On the estimation of security price volatilities from historical

    data, Journal of Business 53(1), 67-78.

    [16] Hansen, P. and Lunde, A., 2006, Realized variance and market microstructure noise, Journal of

    Business and Economic Statistics 24, 127-218.

    [17] Hasbrouck, Joel, Intraday Price Formation in U.S. Equity Index Markets, 2003, Journal of Finance

    58(6), 2375-2400.


    [18] Hong, Harrison, and Jeremy C. Stein, 1999, A unified theory of underreaction, momentum trading

    and overreaction in asset markets, Journal of Finance 54, 2143-2184.

    [19] Kossinets, G. and Watts, D.J., 2006, Empirical Analysis of an Evolving Social Network, Science

    311 (5757), 88-90.

    [20] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and U.

    Alon, 2004, Superfamilies of Evolved and Designed Networks, Science 303, 1538-1542.

    [21] Newman, M. E. J., 2002, Assortative mixing in networks, Physical Review Letters 89, 208701.

    [22] Newman, M. E. J., 2003, The structure and function of complex networks, SIAM Review 45, 167.

    [23] Oomen, R., 2005, Properties of bias-corrected realized variance under alternative sampling

    schemes, Journal of Financial Econometrics 3, 555-577.

    [24] Parlour, Christine A. and Duane J. Seppi, 2008, Limit Order Markets: A Survey, Handbook of

    Financial Intermediation and Banking, Boot, Arnoud W.A., and Anjan V. Thakor, eds., Elsevier

    B.V., Oxford, UK.

    [25] Scheinkman, Jose A. and Wei Xiong, 2003, Overconfidence and Speculative Bubbles, Journal of

    Political Economy 111(6), 1183-1219.

    [26] Tauchen, G. and Pitts, M., 1983, The price variability-volume relationship on speculative markets,

    Econometrica 51, 485-505.

    [27] Wilensky, U., 1999, NetLogo, http://ccl.northwestern.edu/netlogo, Center for Connected

    Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

    [28] Zhang, L., Mykland, P. A., and Ait-Sahalia, Y., 2005, A tale of two scales: Determining integrated

    volatility with noisy high-frequency data, Journal of the American Statistical Association 100, 1394-1411.

    Figure 1: E-mini S&P 500 front month futures contract.

    Figure 2: Example networks with node X having greater centrality than node Y for the specified

    measure. Panels: indegree, outdegree, betweenness, closeness.
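As a minimal sketch of three of the measures named in Figure 2, the following computes indegree, outdegree, and closeness for a small directed graph. The edge list is hypothetical (the original drawings are not reproduced here), and closeness is taken as the reciprocal of the mean shortest-path distance from a node to the nodes it can reach, which is one common convention and may differ from the definition used in the paper. Betweenness is omitted for brevity.

```python
from collections import deque

# Hypothetical directed edges (source, target); not taken from the paper.
EDGES = [("Y", "X"), ("A", "X"), ("B", "X"), ("X", "C"), ("C", "Y")]


def degrees(edges):
    """Return (indegree, outdegree) dictionaries for a directed edge list."""
    indeg, outdeg = {}, {}
    for u, v in edges:
        for n in (u, v):
            indeg.setdefault(n, 0)
            outdeg.setdefault(n, 0)
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg


def closeness(edges, source):
    """Reciprocal of the mean BFS distance from source to the nodes it reaches."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    reached = [d for n, d in dist.items() if n != source]
    return len(reached) / sum(reached) if reached else 0.0


indeg, outdeg = degrees(EDGES)
print(indeg["X"], indeg["Y"])   # indegree of X and of Y in this toy graph
print(closeness(EDGES, "A"))
```

In this toy graph X receives three edges and Y receives one, so X has the greater indegree, in the spirit of the first panel.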

    (k_i^in,  k_j^in)      -1     1    -1
    (k_i^out, k_j^out)     -1     1    -1
    (k_i^in,  k_j^out)     -1    -1     1
    (k_i^out, k_j^in)      -1    -1     1
    AI                      0     1    -1

    Figure 3: Illustration of Network Assortativity.
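The AI values tabulated in Figure 3 are consistent with an index that averages the four degree correlations, adding the same-direction pairs (in-in, out-out) and subtracting the cross-direction pairs (in-out, out-in). The sketch below checks that arithmetic. The formula is an assumption inferred from the tabulated values, not a definition quoted from the text, and the function name is illustrative.

```python
def assortativity_index(r_in_in, r_out_out, r_in_out, r_out_in):
    """Assumed form: (r_in_in + r_out_out - r_in_out - r_out_in) / 4."""
    return (r_in_in + r_out_out - r_in_out - r_out_in) / 4


# One tuple per example in Figure 3, ordered as
# (in-in, out-out, in-out, out-in) degree correlations.
examples = [(-1, -1, -1, -1), (1, 1, -1, -1), (-1, -1, 1, 1)]
ai_values = [assortativity_index(*ex) for ex in examples]
print(ai_values)  # [0.0, 1.0, -1.0], matching the AI row of Figure 3
```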

    Figure 4: A network containing two connected components, ABCDE and FGH. The largest strongly

    connected component is BCDE.
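The caption of Figure 4 can be reproduced on a small example. The sketch below uses Kosaraju's algorithm to find strongly connected components. The edge list is hypothetical, chosen only so that the node sets match the caption (the drawing itself is not reproduced here), and the function name is illustrative.

```python
# Hypothetical directed edges chosen to reproduce the caption of Figure 4.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "B"),
         ("F", "G"), ("G", "H")]


def strongly_connected_components(edges):
    """Kosaraju's algorithm: DFS finish order, then DFS on the reversed graph."""
    nodes = sorted({n for e in edges for n in e})
    fwd = {n: [] for n in nodes}
    rev = {n: [] for n in nodes}
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    def dfs(start, adj, seen, out):
        # Iterative DFS that appends nodes to `out` in order of completion.
        stack = [(start, iter(adj[start]))]
        seen.add(start)
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                stack.pop()
                out.append(node)
            else:
                seen.add(nxt)
                stack.append((nxt, iter(adj[nxt])))

    # First pass: record finish order on the original graph.
    order, seen = [], set()
    for n in nodes:
        if n not in seen:
            dfs(n, fwd, seen, order)

    # Second pass: each DFS tree on the reversed graph is one component.
    components, seen = [], set()
    for n in reversed(order):
        if n not in seen:
            comp = []
            dfs(n, rev, seen, comp)
            components.append(set(comp))
    return components


components = strongly_connected_components(EDGES)
largest = max(components, key=len)
print(sorted(largest))  # ['B', 'C', 'D', 'E']
```

Node A can reach the cycle B-C-D-E but cannot be reached from it, so A belongs to the connected component ABCDE without belonging to the strongly connected component BCDE.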

    CEN           0.920    -0.092     0.116
    AVDEG         2.027     3.269     2.418
    SDDEG         8.134     5.960     4.949
    INOUT        -0.470    -0.101    -0.069
    AI            0.353     0.559     0.195
    CC            0.001     0.049     0.019
    LSCC          0.014     0.016     0.006
    E                75       103       185
    Returns       0.059    -0.019     0.000
    Range         0.059     0.059     0.039
    Volume         1104      1343      1191
    Duration          0        12        16

    Figure 5: Examples of observed networks and their properties.

    Figure 6: A screenshot of the agent-based simulation using NetLogo (Wilensky 1999). As orders (denoted by squares) are randomly assigned to traders (denoted by human figures), an

    edge is drawn between them. When sell orders (black squares) are matched with buy orders

    (red squares), their quantities are reduced, and there is a direct edge drawn between the traders.
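The matching step described in the caption of Figure 6 can be sketched in a few lines. This is not the authors' NetLogo model: the order sizes, the first-in, first-out matching rule, the absence of prices, and the seller-to-buyer edge direction are all assumptions made for illustration, and the function and parameter names are hypothetical.

```python
import random


def simulate(num_traders=10, num_orders=50, seed=0):
    """Assign random buy/sell orders to traders and match them first-in, first-out."""
    rng = random.Random(seed)
    buys, sells = [], []   # resting orders stored as [trader, remaining_quantity]
    edges = []             # one (seller, buyer, quantity) tuple per match
    for _ in range(num_orders):
        trader = rng.randrange(num_traders)   # random assignment of the order
        qty = rng.randint(1, 5)               # assumed order size range
        is_buy = rng.random() < 0.5
        book, opposite = (buys, sells) if is_buy else (sells, buys)
        # Match the incoming order against resting orders on the other side,
        # reducing both quantities by the traded amount.
        while qty > 0 and opposite:
            resting = opposite[0]
            traded = min(qty, resting[1])
            qty -= traded
            resting[1] -= traded
            buyer, seller = (trader, resting[0]) if is_buy else (resting[0], trader)
            edges.append((seller, buyer, traded))
            if resting[1] == 0:
                opposite.pop(0)
        # Any unfilled remainder rests in the book.
        if qty > 0:
            book.append([trader, qty])
    return edges, buys, sells


edges, buys, sells = simulate()
print(len(edges), "matches")
```

The sketch does not prevent a trader from matching against their own resting order; a fuller model would need to decide how to treat such cases.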

    Table I: Financial Variables: Summary Statistics

    Returns Volatility Volume Duration

    Mean 0.0002 0.0425 1236.6720 19.4941

    Median 0.0000 0.0392 1153 14

    Maximum 0.2165 0.2165 6645 176

    Minimum -0.1378 0.0190 459 0

    Std. Dev. 0.0271 0.0140 407.1451 17.4485

    Skewness 0.0485 0.8273 2.6259 2.0299

    Kurtosis 2.9876 6.4663 16.9377 9.3735

    ADF prob 0.0001 0.0000 0.0000 0.0000

    AC Lag 1 -0.001 [0.895] 0.187 [0.000] 0.528 [0.000] 0.473 [0.000]

    AC Lag 5 -0.006 [0.062] 0.167 [0.000] 0.376 [0.000] 0.289 [0.000]

    AC Lag 10 -0.011 [0.139] 0.151 [0.000] 0.284 [0.000] 0.241 [0.000]

    ADF prob refers to the p-value of the ADF test for the null of unit root.

    AC Lag X reports the autocorrelation at lag X; the bracketed value [Q-test prob] is the p-value of the

    Portmanteau Q-test for no serial correlation at lags X = 1, 5, and 10.
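The two quantities in the AC Lag rows can be illustrated with a short sketch: the sample autocorrelation at a given lag and a portmanteau Q statistic. The Ljung-Box form is used here as an assumption, since the table notes do not say which portmanteau variant was applied; the series is made-up data and the function names are illustrative.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic: n(n+2) * sum over k=1..max_lag of r_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


series = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0]  # made-up data
r1 = autocorrelation(series, 1)
q5 = ljung_box_q(series, 5)
print(r1, q5)
```

Under the null of no serial correlation, Q is asymptotically chi-square with `max_lag` degrees of freedom, and the bracketed p-values would come from that distribution; the p-value computation is left out to keep the sketch free of external libraries.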

    Table II: Network Variables: Summary Statistics

    CEN AVDEG SDDEG INOUT AI CC LSCC E

    Mean 0.0049 2.9105 5.1401 0.0836 0.09903 0.0426 0.0403 164.4727

    Median 0.0065 2.8814 4.9923 0.0101 0.0766 0.0365 0.0192 116.0000

    Maximum 0.9804 5.7391 12.7021 0.9887 0.5589 0.2966 0.4889 219.0000

    Minimum -0.9844 1.9692 2.4073 -1.0000 -0.0664 0.0000 0.0050 64.0000

    Std. Dev. 0.2338 0.3691 1.0587 0.2228 0.0699 0.0300 0.0484 19.4519

    Skewness -0.0129 0.5758 0.9584 1.3233 0.9233 1.1996 2.3741 -0.5869

    Kurtosis 2.8782 3.7464 4.6527 4.5685 3.7564 4.9867 10.3298 3.5793

    ADF prob 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

    AC Lag 1 0.139 [0.000] 0.317 [0.000] 0.165 [0.000] 0.175 [0.000] 0.102 [0.000] 0.203 [0.000] 0.246 [0.000] 0.308 [0.000]

    AC Lag 5 0.046 [0.000] 0.108 [0.000] 0.060 [0.000] 0.056 [0.000] 0.042 [0.000] 0.086 [0.000] 0.082 [0.000] 0.144 [0.000]

    AC Lag 10 0.036 [0.000] 0.074 [0.000] 0.031 [0.000] 0.048 [0.000] 0.042 [0.000] 0.063 [0.000] 0.047 [0.000] 0.125 [0.000]

    ADF prob refers to the p-value of the ADF test for the null of unit root.

    AC Lag X reports the autocorrelation at lag X; the bracketed value [Q-test prob] is the p-value of the Portmanteau Q-test for no serial correlation at lags X = 1, 5, and 10.

    Table III: Pairwise correlations between network variables

                     CEN     AVDEG     SDDEG     INOUT        AI        CC      LSCC
    CEN           1.0000
    AVDEG        -0.0012    1.0000
    SDDEG        -0.0015    0.0031    1.0000
    INOUT        -0.0019    0.2367    0.5119    1.0000
    AI           -0.0008   -0.1079   -0.2022   -0.6787    1.0000
    CC           -0.0008    0.8074   -0.0248    0.2052   -0.1095    1.0000
    LSCC         -0.0006    0.5042    0.4070    0.7226   -0.5019    0.4248    1.0000

    Table IV: Correlations between financial and network variables

                 Returns     Range    Volume  Duration
    CEN           0.6774   -0.0076    0.0264   -0.0065
    AVDEG        -0.0034    0.0415    0.0061    0.1000
    SDDEG         0.0037    0.0747    0.2363   -0.1620
    INOUT        -0.0061    0.0429    0.0853    0.0467
    AI            0.0016    0.0635    0.0129   -0.0810
    CC           -0.0032    0.0314    0.0320    0.0360
    LSCC         -0.0076    0.0331    0.0884    0.0058

    Table V: Network Variables: P-values for the Null Hypothesis of Granger Non-causality

    Panel 1: 20 lags

                     CEN        AI        CC      LSCC       All
    CEN                -    0.2143    0.4227    0.1731    0.2387
    AI            0.6243         -    0.0004    0.0000    0.0000
    CC            0.8711    0.0000         -    0.0000    0.0000
    LSCC          0.4375    0.0002    0.0219         -    0.0003
    All           0.5556    0.0000    0.0000    0.0000         -

    Panel 2: 18 lags

                   SDDEG        AI        CC      LSCC       All
    SDDEG              -    0.1601    0.0054    0.0000    0.0000
    AI            0.0000         -    0.0643    0.0078    0.0000
    CC            0.0000    0.0000         -    0.0000    0.0000
    LSCC          0.0000    0.0005    0.0001         -    0.0000
    All           0.0000    0.0000    0.0000    0.0000         -

    Panel 3: 14 lags

                   INOUT        AI        CC      LSCC       All
    INOUT              -    0.1384    0.0794    0.0000    0.0000
    AI            0.0000         -    0.0016    0.0029    0.0000
    CC            0.0475    0.0000         -    0.0000    0.0000
    LSCC          0.0000    0.0000    0.0185         -    0.0000
    All           0.0000    0.0000    0.0000    0.0000         -

    VAR estimated using GMM with HAC robust standard errors.

    Optimal lag-length (26) is selected using Akaike Information Criterion.

    Table VI: Returns and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

                 Returns       CEN        AI        CC      LSCC       All
    Returns            -    0.0148    0.8984    0.3530    0.7630    0.0320
    CEN           0.0000         -    0.4459    0.8080    0.2615    0.0000
    AI            0.0306    0.1491         -    0.0006    0.0000    0.0000
    CC            0.0235    0.0632    0.0000         -    0.0000    0.0000
    LSCC          0.1056    0.0826    0.0003    0.0240         -    0.0002
    All           0.0000    0.0132    0.0000    0.0001    0.0000         -

    VAR estimated using GMM with HAC robust standard errors.

    Optimal lag-length (18) is selected using Akaike Information Criterion.

    Table VII: Volatility and Network Variables: P-values for the Null Hypothesis of Granger

    Non-causality

                Volatility     SDDEG        AI        CC      LSCC       All
    Volatility           -    0.0005    0.2350    0.0000    0.0019    0.0000
    SDDEG           0.0000         -    0.0263    0.0063    0.0000    0.0000
    AI              0.0020    0.0000         -    0.0717    0.0093    0.0000
    CC              0.0000    0.0000    0.0000         -    0.0000    0.0000
    LSCC            0.0000    0.0000    0.0003    0.0116         -    0.0000
    All             0.0000    0.0000    0.0000    0.0000    0.0000         -

    VAR estimated using GMM with HAC robust standard errors.

    Optimal lag-length (18) is selected using Akaike Information Criterion.

    Table VIII: Period Duration and Network Variables: P-values for the Null Hypothesis of

    Granger Non-causality

                Duration     INOUT        AI        CC      LSCC       All
    Duration           -    0.3328    0.0017    0.0000    0.0000    0.0000
    INOUT         0.9526         -    0.0000    0.0000    0.1215    0.0000
    AI            0.3345    0.0000         -    0.0020    0.0021    0.0000
    CC            0.5520    0.0498    0.0000         -    0.0000    0.0000
    LSCC          0.1211    0.0000    0.0000    0.0336         -    0.0000
    All           0.1811    0.0000    0.0000    0.0000    0.0000         -

    VAR estimated using GMM with HAC robust standard errors.

    Optimal lag-length (15) is selected using Akaike Information Criterion.

    Table IX: Volume and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

                  Volume     SDDEG        AI        CC      LSCC       All
    Volume             -    0.0014    0.0012    0.0000    0.0063    0.0000
    SDDEG         0.0669         -    0.1752    0.0053    0.0000    0.0000
    AI            0.2008    0.0000         -    0.0911    0.0166    0.0000
    CC            0.3970    0.0000    0.0000         -    0.0000    0.0000
    LSCC          0.4034    0.0000    0.0014    0.0002         -    0.0000
    All           0.3662    0.0000    0.0000    0.0000    0.0000         -

    VAR estimated using GMM with HAC robust standard errors.

    Optimal lag-length (15) is selected using Akaike Information Criterion.
