SSRN-id2424840

  • Electronic copy available at: http://ssrn.com/abstract=2424840

    MEASURING E-COMMERCE CONCENTRATION EFFECTS WHEN PRODUCT POPULARITY IS

    CHANNEL-SPECIFIC

    Gonca Soysal

    Alejandro Zentner

    August, 2014

    This paper uses household-level panel data from three large apparel retailers to examine how

    e-commerce affects the concentration of sales across products. In our data there are remarkable

    differences between the products that are popular in online versus offline channels. When the

    relative popularity of products differs by channel, as in our data, we demonstrate that the

    traditional long tail metrics used in the literature provide biased results regarding changes to the

    concentration of sales caused by the growth in online sales. We propose an alternative metric that

    allows us to measure concentration effects when product popularity varies by channel. Our results

    demonstrate that ignoring differences in product popularity across channels can lead to erroneous

    conclusions regarding whether e-commerce increases or decreases the concentration of sales across

    products. Examining how the migration of consumers from brick and mortar to online channels

    affects the anatomy of their purchases is important for guiding managerial practice.

    * Gonca Soysal is Assistant Professor of Marketing and Alejandro Zentner is Associate Professor of

    Managerial Economics, Naveen Jindal School of Management, University of Texas at Dallas. We are

    grateful to the Wharton Customer Analytics Initiative (WCAI) and the anonymous retailers for making

    the dataset used in this study available.

    1. Introduction

    E-commerce need not affect all industries in the same way. Most prior

    empirical studies examining how the migration of consumers from offline to online channels affects

    their purchases have focused on a narrow set of product categories (namely books and movies), but it is

    unclear whether we can generalize the results from these studies to other product categories. For

    example, while several academic studies focusing on the book or movie markets have found

    support for the long tail hypothesis predicting a lower concentration of sales across products as

    consumers move online (e.g., Brynjolfsson, Hu, and Smith 2003; Zentner, Smith, and Kaya 2013),

    it is unclear whether e-commerce changes the concentration of sales in the same way for other

    product categories where physical examination before purchase might be more important (e.g.,

    clothing, art, furniture, eyewear, or fresh produce). In this paper we focus on the apparel industry

    and use household-level panel data from three large apparel retailers in order to study how

    e-commerce affects the concentration of sales across products.

    The e-commerce literature distinguishes between digital and non-digital product attributes based

    on how easily these product attributes can be communicated over the Internet (e.g., Lal and

    Sarvary 1999; Lee and Bell 2013; Bell, Gallino, and Moreno 2013), and products have different

    combinations of digital and non-digital product attributes.1 Markets where digital

    product attributes are prevalent (e.g., books, movies) have captured most of the attention in the long tail versus superstar

    effects literature. It is not clear, however, whether the results from these

    studies will generalize to markets where non-digital product attributes are important in product

    choice. Specifically, one key difference when examining markets where non-digital attributes are

    important (e.g., the apparel industry) is that brick and mortar stores have an advantage over the online

    channel for certain products because physical stores give consumers the opportunity to

    physically examine product characteristics (e.g., personal fit, color, or texture). Since non-digital

    product attributes are critical when choosing certain items (e.g., women's swimsuits), customers

    may have a preference towards purchasing such items offline, which might cause a discrepancy in

    product popularity across channels. We show that in our data there are remarkable differences

    between the products that are popular in the online and the offline channels. A substantial number

    of products that take a large share of the transactions in one channel (are superstar products in one

    channel) take a small share of the transactions in the other channel (are niche products in the other

    channel), and vice versa. For instance, the data from one of our focal companies show that 14 of

    the top 50 best-selling products in the online channel are not even among the top 1000 best-selling

    products in the offline channel. The data from our two other companies show similar patterns.

    1 For example, price is a digital product attribute because information regarding price is easily communicated over the

    Internet. Conversely, personal fit is a non-digital attribute because personal fit is not easily communicated over the

    Internet.

    Our objective is not restricted to examining whether or not empirical results regarding e-commerce

    concentration effects from examining books or movies are generalizable to other product

    categories. We also demonstrate that the metrics employed to evaluate concentration effects in the

    book or movie industries should not be used to examine other product categories. We will show

    that in our data online sales and sales at physical stores center on different sets of products and as

    a result the locations of the online and offline sales distributions are different (see Figure 1). A

    metric seeking to gauge whether the overall concentration of sales increases or decreases as

    consumers move online must therefore account for the differences in not only the concentration

    but also the location of the distributions of online and offline sales.

    Figure 1: Distribution Concentration versus Location

    The figure illustrates an example with only five products (products A through E), which are

    arranged arbitrarily on the horizontal axis. The distributions 1 and 2 in the left panel, representing

    online and offline sales respectively, have different concentration but are centered on the same set

    of products: product B, product C, and product D are popular in both channels, and product C is

    the most popular product in both channels. The distributions 3 and 4 in the right panel, representing

    online and offline sales respectively, have the same concentration but are centered on different sets

    of products: product B is the most popular product in the online channel (distribution 3) whereas

    product D is the most popular product in the offline channel (distribution 4). Figure 1 only

    represents an example where we have arbitrarily designed the distributions of sales.
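    The distinction drawn in Figure 1 can be made concrete with a small numerical sketch. The shares

    below are invented for illustration (the figure itself reports no numbers), and top_product and

    top_k_share are hypothetical helper names. Concentration is measured here simply as the cumulative

    share of the k best sellers ranked within each distribution.

```python
# Hypothetical sales shares for products A-E (illustrative numbers only).
PRODUCTS = ["A", "B", "C", "D", "E"]

# Left panel: different concentration, same location (both peak at C).
dist1_online = [0.10, 0.20, 0.40, 0.20, 0.10]
dist2_offline = [0.05, 0.15, 0.60, 0.15, 0.05]

# Right panel: same concentration, different location (peaks at B and D).
dist3_online = [0.20, 0.40, 0.20, 0.10, 0.10]
dist4_offline = [0.10, 0.10, 0.20, 0.40, 0.20]


def top_product(shares):
    """Return the product with the largest share (the location of the peak)."""
    return PRODUCTS[shares.index(max(shares))]


def top_k_share(shares, k):
    """Cumulative share of the k best sellers, ranked within the distribution itself."""
    return sum(sorted(shares, reverse=True)[:k])


for name, shares in [("1", dist1_online), ("2", dist2_offline),
                     ("3", dist3_online), ("4", dist4_offline)]:
    print(name, top_product(shares), round(top_k_share(shares, 1), 2))
```

    Distributions 3 and 4 have identical sorted shares, so any concentration measure computed from

    within-distribution ranks treats them as equal even though their best sellers differ.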

    Most prior studies on the long tail use data from either online or offline markets exclusively to

    define product sales ranks and assume that product popularity online and offline are either equal or

    at least similar. We will show that using the traditional long tail metrics (not accounting for

    differences in locations of the distributions of online and offline sales) for our focal industry may

    result in seemingly contradictory results regarding whether the concentration of sales increases or

    decreases as consumers move online. For instance, studies using data from offline sales

    exclusively to determine product sales ranks will incorrectly tend to find large long tail effects

    when the distributions of online and offline sales concentrate around different sets of products.

    However, when correctly interpreted, the results using a metric based exclusively on offline sales

    intuitively suggest that, as consumers move from offline to online markets, sales concentrate

    around products that are popular online and away from products that are popular offline.

    Conversely, studies using data from online sales exclusively to determine product sales ranks will

    tend to find spurious superstar effects because when consumers move from offline to online

    markets sales concentrate around those products that are popular online and niche offline.
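    The direction of both biases can be reproduced with a toy example. The sketch below assumes five

    products whose online and offline sales are mirror images, so the two channels are equally

    concentrated but centered on different products. All counts and the helper names top_k and share

    are hypothetical, and the own-channel comparison at the end is only a reference point, not the

    metric proposed in this paper.

```python
# Hypothetical unit sales; online is the mirror image of offline.
offline = {"A": 5, "B": 10, "C": 15, "D": 45, "E": 25}
online = {"A": 25, "B": 45, "C": 15, "D": 10, "E": 5}


def top_k(sales, k):
    """Set of the k best-selling products in one channel."""
    return set(sorted(sales, key=sales.get, reverse=True)[:k])


def share(sales, products):
    """Share of a channel's sales taken by the given products."""
    return sum(sales[p] for p in products) / sum(sales.values())


K = 2
head_by_offline = top_k(offline, K)  # "popular" defined from offline ranks only
head_by_online = top_k(online, K)    # "popular" defined from online ranks only

# Change in the share of "popular" products when moving from offline to online.
offline_rank_gap = share(online, head_by_offline) - share(offline, head_by_offline)
online_rank_gap = share(online, head_by_online) - share(offline, head_by_online)
# Reference: each channel's head is defined by that channel's own ranks.
own_rank_gap = share(online, head_by_online) - share(offline, head_by_offline)

print(offline_rank_gap, online_rank_gap, own_rank_gap)
```

    With offline ranks the head appears to lose share online (a spurious long tail effect), with online

    ranks it appears to gain share (a spurious superstar effect), and with own-channel ranks the gap is

    zero because the two distributions are equally concentrated by construction.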

    Because traditional long-tail metrics do not provide information on the overall concentration effects of

    e-commerce when sales online and offline concentrate around different sets of products, in

    this paper we propose a metric that allows the measurement of the overall concentration effects

    accounting for the differences in the locations of the online and offline sales distributions. Using

    our proposed metric, we find long tail effects when consumers use the online channel more

    frequently for two of our retailers and no changes in concentration for the other retailer. We also

    show how biased the estimates can be, either toward finding large long tail effects or large

    superstar effects, when using metrics based on either online sales or offline sales exclusively.

    We believe that our results are important for the long tail literature. The examination of long tail

    versus superstar e-commerce effects has focused on a narrow set of products and has overlooked

    the possible existence of differences in product popularity across channels. Our analyses

    demonstrate that these differences are substantial for the apparel industry, and our results show

    how ignoring these differences can cause incorrect conclusions regarding whether e-commerce

    increases or decreases the concentration of sales and, more importantly, regarding the size of the

    concentration effects. Our paper demonstrates that long tail versus superstar examinations must

    account for cross channel popularity differences. Examining how prevalent product popularity

    differences are across channels for other industries is an important avenue for future research.

    Studying how the concentration of sales changes as consumers move online is not only important

    for the academic literature but also important for managerial practice. Our results show that

    producers and retailers in this industry should consider the differences in product popularity in the

    online versus offline channels, and shift their efforts toward products that are popular in the online

    channel as consumers move online. Our results suggest that two of our focal companies should

    shift their efforts toward a larger set of products as e-commerce gains market share relative to sales

    at brick and mortar stores, but our results also suggest the third company should not change its

    overall variety as consumers move online. Comparing the size of our concentration effect

    estimates and those from using traditional metrics is important for managers when making

    decisions over variety; our concentration effect estimates are bracketed by large spurious long tail

    effects when measuring product popularity based on offline sales exclusively and large spurious

    superstar effects when measuring product popularity based on online sales exclusively.

    2. Literature

    Our paper contributes to a growing literature on how information technology is affecting the

    anatomy of consumer purchases, and most directly to how e-commerce affects sales concentration

    patterns. The empirical studies in this literature can be categorized by the type of data they employ.

    Using cross-sectional data, Brynjolfsson, Hu, and Smith (2003) find that online book retailers offer

    a wider variety than physical stores do, and also show that a large proportion of online book

    purchases occur for titles not stocked in brick and mortar stores. Brynjolfsson, Hu, and Simester

    (2011) also use cross-sectional data to examine sales of clothing for a retailer selling via both the

    Internet and the catalog channels. They find that the concentration of sales is lower for the Internet

    channel compared to the catalog channel. Although Brynjolfsson, Hu, and Simester (2011) also

    examine the clothing industry, our study is different because neither the catalog channel nor the

    Internet channel allows for an easy evaluation of non-digital product attributes, unlike the brick

    and mortar channel. Brynjolfsson et al. (2009) use cross-sectional data to study the nature of

    competition between brick-and-mortar and internet retailers. They show that internet retailers face

    significant competition from brick and mortar retailers when selling popular products, but the

    competition is not as intense for niche products. Using aggregated data at the level of the movie

    title, Elberse and Oberholzer-Gee (2007) study how online commerce affected the distribution of

    video sales from 2000 to 2005. They find that the tail was longer in 2005, but also that superstar

    products took a larger proportion of sales in 2005 than ever before. Waldfogel (2012) uses data aggregated

    by album, and shows that Internet markets decrease the concentration of music sales among a few

    artists. Using panel data, Zentner, Smith, and Kaya (2013) examine the movie rental market, and

    find that superstar DVD titles take a smaller share of the market as the closure of physical stores

    made consumers shift from offline to online marketplaces. Goldfarb et al. (2013) also use panel

    data, and show how e-commerce might produce long tail effects by decreasing social inhibitions.

    While Pozzi (2012) does not focus on how e-commerce may affect the concentration of sales per

    se, he examines brand exploration online and offline by using panel data on grocery shopping. He

    finds that brand exploration for groceries is more prevalent at physical stores than in online

    marketplaces. Forman et al. (2009) use location and product level monthly panel data from

    Amazon.com on sales of books, and show that when a physical store opens locally, people's online

    purchases of the nationally most popular products decline relative to the purchases of products

    unlikely to be popular or available offline. Our study also uses panel data and documents that for

    the apparel industry a substantial number of products that are superstars in one channel are niche in

    the other channel. We also show how to measure overall sales concentration in this setting (i.e.,

    when the distributions of online and offline sales center on different sets of products).

    Our results also contribute to the literature on how recommendation systems and popularity lists

    affect the sales of popular and niche products. In this literature Tucker and Zhang (2011) study the

    impact of popularity information on sales, arguing that titles with niche appeal may benefit from

    being listed in popular product lists more than general appeal products do. Likewise, Fleder and

    Hosanagar (2009) and Oestreicher-Singer and Sundararajan (2011) analyze how peer-based

    automated recommendation lists influence online preferences for long tail versus superstar

    products, with the former authors finding that recommendation lists can either increase or decrease

    sales of long tail products, and the latter authors finding that product categories that are more

    sensitive to recommendation networks are also more likely to have higher sales of long tail

    products. In contrast to our study, these studies focus on the online market exclusively and do not

    examine sales from physical stores or cross-channel choices.

    3. Data and Setting

    Our data come from three different brands of a North American specialty apparel retailer.

    Although these three brands belong to the same parent company, each brand has its own

    brick-and-mortar stores and an online store selling the brand's independent and exclusive line of clothing and

    home accessories. The stores from the three brands are not co-located and the brands are managed

    independently. Our data cover a two-year period from July 1st, 2010 through June 30th, 2012. For

    each brand, the company randomly sampled a total of 14,000 customers from all the customers who

    have been active during the two-year data period. However, the total number of customers we

    observe for each brand exceeds 14,000 customers due to cross-shopping across brands (when

    customers are sampled for a brand, the information is collected for the transactions from all three

    brands). For each customer, we observe purchases from both the retailer's physical stores and the

    online store. The company matches credit and debit card purchases to a specific customer using

    card numbers and customer names, and matches cash purchases using e-mail addresses, which the

    store clerks are trained to request. For each purchase event, we observe the purchase channel,

    physical store number for transactions made at physical stores, list of purchased items, and number

    of units and dollar amount for each item purchased. In addition to the detailed transaction data, for

    each customer we also observe the gender, age, date of first purchase from the retailer, and

    geographic location (latitude and longitude of the census block of the customer's residence).2

    Table 1 presents summary statistics for our data. Over our two-year period of analysis, we observe

    a total of 197,220 transactions for Brand A; 19% of these transactions (or 36,528 transactions) took

    place online. For Brand B we observe a total of 148,069 transactions; 27% (or 40,162 transactions)

    took place online. For Brand C we observe a total of 37,225 transactions; 47% (or 17,453

    transactions) took place online.

    Table 1 shows that the share of online business is rather large for all three brands, and that a

    substantial fraction of customers use both the online and offline channels. Out of 24,072 unique

    customers who made a purchase from Brand A, 42% (or 10,079 consumers) ever made a purchase

    online, 92% (or 22,206 customers) ever made a purchase offline, and 34% (or 8,213 customers)

    used both channels. For Brand B, there are a total of 27,008 unique customers who made a

    purchase: 51% (or 13,826 customers) ever made a purchase online, 87% (or 23,582 customers)

    ever made a purchase offline, and 39% (or 10,400 customers) used both channels. For Brand C,

    there are a total of 12,035 unique customers who made a purchase in our sample: 55% (or 6,587

    customers) made a purchase online, 67% (or 8,012 customers) made a purchase offline, and 21%

    (or 2,564 customers) used both channels.

    2 Census blocks are small geographic units. According to the Census, Census Blocks are generally defined to contain

    between 600 and 3,000 people. http://www.census.gov/geo/reference/gtc/gtc_bg.html
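    The shares and customer totals reported above can be cross-checked from the underlying counts. The

    short sketch below recomputes the online share of transactions and applies inclusion-exclusion

    (online + offline - both) to recover the number of unique customers. The counts are those stated in

    the text; the variable names are our own.

```python
# Counts reported in the text, by brand.
transactions = {  # (online, offline)
    "A": (36_528, 160_692),
    "B": (40_162, 107_907),
    "C": (17_453, 19_772),
}
customers = {  # (ever online, ever offline, both channels)
    "A": (10_079, 22_206, 8_213),
    "B": (13_826, 23_582, 10_400),
    "C": (6_587, 8_012, 2_564),
}

# Online share of transactions, rounded to whole percentage points.
online_share_pct = {
    brand: round(100 * on / (on + off)) for brand, (on, off) in transactions.items()
}

# Inclusion-exclusion: unique customers = online + offline - multichannel.
unique_customers = {
    brand: on + off - both for brand, (on, off, both) in customers.items()
}

print(online_share_pct)   # {'A': 19, 'B': 27, 'C': 47}
print(unique_customers)   # {'A': 24072, 'B': 27008, 'C': 12035}
```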

    Because a substantial portion of the customers in our data are multichannel and we also observe

    several transactions per customer during our observation period, in our empirical analysis we are

    able to exploit the panel nature of the data. Customers made on average 7.31 transactions offline

    and 3.62 transactions online for Brand A, 4.64 transactions offline and 2.91 transactions online for

    Brand B, and 2.54 transactions offline and 2.65 transactions online for Brand C.

    Customers purchased a similar number of units per transaction when they used the online versus the

    offline channel. For Brand A (Brand B, Brand C) they purchased an average of 2.54 (2.96, 2.01)

    units when they used the online channel and 2.63 (2.60, 1.89) units when they used the offline

    channel. In spite of buying a similar number of units per transaction from both channels, it is

    interesting that the dollar values of the transactions were smaller in the physical store channel than

    in the online channel.

    Table 1: Summary Statistics

    Standard deviations are in parentheses.

                                                   Brand A             Brand B             Brand C
                                                Online   Offline    Online   Offline    Online   Offline
    # of Transactions by Channel                36,528   160,692    40,162   107,907    17,453    19,772
    # of Transactions Overall                      197,220             148,069              37,225
    # of Customers by Channel                   10,079    22,206    13,826    23,582     6,587     8,012
    # of Multichannel Customers                      8,213              10,400               2,564
    # of Customers Overall                          24,072              27,008              12,035
    Average # of Transactions Per Customer        3.62      7.31      2.91      4.64      2.65      2.54
                                                (7.15)   (11.98)    (3.48)    (6.55)    (6.08)    (5.68)
    Average Transaction Size (# of Units)         2.54      2.63      2.96      2.60      2.01      1.89
                                                (2.97)    (1.77)    (2.34)    (1.82)    (1.63)    (1.40)
    Average Transaction Size ($)                159.97    115.12     96.17     70.56    145.54    113.74
                                               (170.3)   (99.37)   (76.29)   (62.34)  (153.50)  (106.54)

    A prior stream of literature that investigated the impact of migration of customers from offline to

    online channels on the distribution of sales across products made a distinction between popular and

    niche products based on the number of transactions made for each product. These previous studies

    used sales ranks from one channel (either offline or online) to classify products as either popular or

    niche. This approach is valid for categories like books or movies where the distributions of sales

    across products are likely to be similar for the online and offline channels, and popular products in

    one channel are also popular in the other channel. One possible explanation for this similarity is

    that the need to physically examine the product before purchase is minimal for product categories

    such as books or movies: digital product attributes are prevalent in these product categories

    allowing consumers to easily evaluate product features and characteristics both online and offline.

    However, the ability to physically examine the product before purchase (a benefit unique to the

    offline channel) is more important for some product categories where non-digital product attributes

    are more prevalent. For example, in the apparel industry, the importance of actual color, quality of

    materials, and personal fit (i.e., the non-digital attributes) for some products might result in large

    differences between the distributions of sales across products for the online and offline channels.

    Products where physical examination is important (e.g., a woman's swimsuit) might be more

    popular in the offline channel whereas products where physical examination is relatively less

    important (e.g., a men's classic white dress shirt) might be more popular in the online channel. If

    the distributions of sales across products are substantially different for the online and the offline

    channels, examining sales from only one channel in order to classify goods as popular versus niche

    would be misleading.

    For our empirical analysis we must also specify the appropriate level in the hierarchy for product

    definition. Unlike the case of movies or books where the title may suffice to identify a unique

    product,3 in the apparel industry a unique style (design) is often offered in a variety of sizes and

    colors. One can therefore either conduct the analysis at the style level (aggregating over different

    sizes and colors) or at the unique SKU level (SKUs are unique identifiers for a color and size

    within a specific style). We conduct our analysis at the style level in the main text, and replicate

    the analysis at the SKU level in the appendix as a robustness check.

    3 There are also some issues when defining products for the movie or book industries. For example, should a movie in

    DVD and the same movie in Blu-ray disc be classified as the same product?
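    The choice between the style level and the SKU level matters because popularity ranks can differ

    between the two. The sketch below counts transactions at both levels for a handful of invented

    purchase records. The (style, color, size) tuple layout and all identifiers are assumptions made

    for illustration, not the structure of our dataset.

```python
from collections import Counter

# Hypothetical purchase records: (style, color, size). A SKU is the full tuple.
purchases = [
    ("shirt-01", "white", "M"),
    ("shirt-01", "white", "L"),
    ("shirt-01", "blue", "M"),
    ("swim-07", "red", "S"),
    ("swim-07", "red", "S"),
]

# SKU level: every distinct (style, color, size) combination is its own product.
sku_counts = Counter(purchases)

# Style level: aggregate over colors and sizes within a style.
style_counts = Counter(style for style, _color, _size in purchases)

print(sku_counts.most_common(1))    # best-selling SKU
print(style_counts.most_common(1))  # best-selling style
```

    In this toy data the best-selling SKU belongs to the swimsuit style, while the best-selling style is

    the shirt, whose sales are spread over three SKUs.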

    Figure 2 shows the commonality of superstar products across the online and offline channels for

    our observation period for all three brands. Areas B, C, and D in Figure 2 represent the 100

    products with highest overall sales (Top 100 Overall). The products in area B are among the Top

    100 in sales in the offline channel and in overall sales but not among the Top 100 in sales in the

    online channel, and the products in area D are among the Top 100 in sales in the online channel

    and in overall sales but not among the Top 100 in sales in the offline channel. The products in area

    C (the middle intersection) are in the Top 100 by all three measures: overall sales, online sales, and offline sales.

    The low commonality of top products online and offline is remarkable for all three brands. Only 43

    products for Brands A and B and 45 products for Brand C rank within the Top 100 in sales in both

    the online and offline channels; these statistics demonstrate the important differences in superstar

    products across the online and offline channels.

    The area E in Figure 2 represents the products that are Top 100 from the online channel but not

    Top 100 when considering overall sales from both channels. This area is rather large for all brands

    (50 for Brand A, 57 for Brand B, and 36 for Brand C).
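    The areas described above are plain set operations on three Top-N lists. The following sketch

    assumes per-product sales counts stored in dictionaries and uses a toy Top 3 instead of a Top 100.

    The function names and the sales figures are hypothetical, and ties in sales are not treated

    specially.

```python
from collections import Counter


def top_n(sales, n):
    """Set of the n best-selling products."""
    return {product for product, _count in Counter(sales).most_common(n)}


def figure2_areas(online, offline, n):
    """Split Top-n products into the areas B, C, D, and E described in the text."""
    overall = Counter(online) + Counter(offline)  # overall sales = online + offline
    top_on, top_off, top_all = top_n(online, n), top_n(offline, n), top_n(overall, n)
    return {
        "B": (top_off & top_all) - top_on,   # top offline and overall, not online
        "C": top_on & top_off & top_all,     # top in all three lists
        "D": (top_on & top_all) - top_off,   # top online and overall, not offline
        "E": top_on - top_all,               # top online but not overall
    }


# Hypothetical sales in which the offline channel is much larger than the online one.
online = {"p1": 50, "p2": 40, "p3": 30, "p4": 2, "p5": 1, "p6": 3}
offline = {"p1": 100, "p2": 5, "p3": 4, "p4": 90, "p5": 80, "p6": 6}

areas = figure2_areas(online, offline, n=3)
print({name: sorted(products) for name, products in areas.items()})
```

    Because the offline channel dominates overall sales in this toy data, the overall Top 3 coincides

    with the offline Top 3, area D is empty, and two of the three online best sellers fall into area E.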

    The differences in product assortment online versus offline are not as wide in our focal industry

    as in other industries examined in prior research, where product assortment is substantially

    larger online than at physical stores (e.g., see Zentner, Smith, and Kaya 2013 for statistics on

    product assortment online versus offline in the movie rental industry). Our data suggest that almost

    all top products either online or offline were available from both channels. For example, we can

    investigate product assortment by channel for the products in Figure 2 by examining products with

    zero sales in either the online or offline channel. In Figure 2 for Brand A (Brand B, Brand C) only

    3 (4, 0) out of the 57 (57, 55) products that are among the Top 100 in the offline channel and not

    among the Top 100 in the online channel actually have zero sales in the online channel. Similarly,

    in Figure 2 for Brand A (Brand B, Brand C) only 6 (10, 6) out of the 57 (57, 55) products that are

    among the Top 100 in the online channel and not among the Top 100 in the offline channel have

    zero sales in the offline channel. Although our focus in this paper is not on explaining whether the

    concentration effects of e-commerce arise from the demand side (e.g., the way consumers search

    online versus offline) or the supply side (e.g., assortment differences online versus offline), these

    statistics suggest that the cross-channel popularity differences that we document are likely to be

    driven by the demand side and not by cross-channel differences in product assortment.

    The offline channel has a larger share of overall transactions than the online channel: 81%

    of all transactions for Brand A take place in brick and mortar stores, 73% for Brand B, and 53%

    for Brand C. Thus, the list of popular products in overall sales is dominated by products that are

    popular in the offline channel. As we explain below, the dominance of the offline channel also

    invalidates the use of overall sales (adding the sales from online and offline channels) to classify

    products as popular or niche when there are differences in product popularity across online and

    offline channels.

    Figure 2: Commonality in Popular Products in Online and Offline Channels (panels: Brand A, Brand B, Brand C)

    In Tables 2a, 2b, and 2c we further investigate the differences in product popularity ranks online

    and offline by computing popularity ranks for various thresholds. For example, the top left cell in

    Table 2a for Brand A shows that only 21 of the top 50 products offline are also within the top 50

    products in the online channel. Table 2a also shows that some products can be very popular in one

    channel and have very low sales in the other channel (e.g., for Brand A 10 of the top 50 online

    products are not even among top 1000 products offline). Tables 2b and 2c show similar patterns

    for Brands B and C.

    Table 2a: Comparison of Popular Products Offline vs. Online for Brand A for Various Rank Thresholds

    Table 2b: Comparison of Popular Products Offline vs. Online for Brand B for Various Rank Thresholds

    Table 2c: Comparison of Popular Products Offline vs. Online for Brand C for Various Rank Thresholds

    While Figure 2 reveals the differences in superstar products in online and offline markets, Tables

    2a, 2b, and 2c further reveal that a substantial number of products are superstar in one channel and

    niche in the other channel. Together these statistics demonstrate that a metric seeking to examine

    the existence of long tail effects as consumers move online must account for the differences in the

    locations of the distributions of online versus offline sales.
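    The rank-threshold comparison behind Tables 2a, 2b, and 2c can be sketched as a cross-tabulation of

    Top-k sets. The code below is a minimal illustration on invented sales counts with thresholds 2 and

    4 rather than 50 through 1000. The (offline threshold, online threshold) key layout and the

    function names are assumptions, not the exact layout of the tables.

```python
def ranked(sales):
    """Products ordered from best to worst seller within one channel."""
    return sorted(sales, key=sales.get, reverse=True)


def overlap_counts(online, offline, thresholds):
    """Count products in the offline Top k_off that are also in the online Top k_on."""
    on_rank, off_rank = ranked(online), ranked(offline)
    return {
        (k_off, k_on): len(set(off_rank[:k_off]) & set(on_rank[:k_on]))
        for k_off in thresholds
        for k_on in thresholds
    }


def missing_from_other(online, offline, k_on, k_off):
    """Count online Top k_on products that fall outside the offline Top k_off."""
    return len(set(ranked(online)[:k_on]) - set(ranked(offline)[:k_off]))


# Hypothetical sales counts for six products.
online = {"a": 60, "b": 50, "c": 40, "d": 30, "e": 20, "f": 10}
offline = {"d": 60, "a": 50, "f": 40, "e": 30, "b": 20, "c": 10}

table = overlap_counts(online, offline, thresholds=(2, 4))
print(table)
print(missing_from_other(online, offline, k_on=2, k_off=4))
```

    The second function corresponds to statements such as the number of top 50 online products that are

    not among the top 1000 products offline.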

    Table 3 presents summary statistics showing the concentration of sales for all brands in the online

    versus offline channels. We measure the concentration of sales by channel using the cumulative

    share of transactions taken by the top-ranked products in each channel. Computing concentration

    in a channel looking at the share of best sellers in that channel allows us to isolate the differences

    in the concentration of the sales distributions across channels from the positions of these

    distributions. Table 3 shows that, once we control for the differences in product popularity across

    the channels, the cumulative share of transactions taken by the top-ranked products is slightly

    larger offline than online for Brands A and B, and larger online than offline for

    Brand C.
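    A minimal sketch of the within-channel concentration measure described above, under the assumption that sales are available as a mapping from product id to units sold in one channel. The function name and the toy numbers are hypothetical.

```python
def cumulative_share(sales, k):
    """Share of all units in one channel accounted for by that channel's k best sellers."""
    units = sorted(sales.values(), reverse=True)
    total = sum(units)
    return sum(units[:k]) / total if total else 0.0

# Hypothetical toy data: the two channels favor different products
# but are equally concentrated.
offline = {"p1": 50, "p2": 30, "p3": 15, "p4": 5}
online = {"p4": 40, "p3": 40, "p1": 10, "p2": 10}

print(cumulative_share(offline, 2), cumulative_share(online, 2))
```

    Because each channel is ranked by its own best sellers, the measure compares the spread of the two distributions without being affected by which products sit at the head of each one.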

    Table 3: Cumulative Share of Transactions Taken by the Top Ranked Products in the Online and Offline Channels (panels: Brand A, Brand B, Brand C)

    4. Econometric Model

    Our goal in this paper is to establish how the concentration of sales is expected to change when

    consumers gravitate toward online channels. While previous research has found that e-commerce

    is expected to drive sales from the most popular products in the head of the popularity distribution

    toward less popular products in the tail of the popularity distribution, we noted previously that these

    results may not generalize to the specialty apparel industry because of the likely differences between the

    way individuals search for clothing and the way individuals search for books or movies,

    which have captured most of the attention in prior research.

    Moreover, the summary statistics presented previously show that in our focal industry there are

    remarkable differences between the popularity rankings of products across online versus offline

    markets: online and offline sales concentrate around different sets of products. Overlooking the

    differences in the locations of the distributions of online and offline sales may lead to misleading

    managerial implications for producers and retailers in the specialty apparel industry in deciding

    whether they should focus their production and stocking efforts on either long tail or superstar

    products as consumers move online.

    Our empirical analysis uses individual and transaction level panel data. For each transaction in the

    data made by household i on date t we define a dummy variable indicating whether the transaction

    was made online or offline:

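    $$\mathit{Online}_{it} = \begin{cases} 1 & \text{if the transaction made by household } i \text{ on date } t \text{ was made online} \\ 0 & \text{if it was made offline} \end{cases}$$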

    For each transaction we also define the share of products in the transaction taken by the purchases

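    of top 100 products as follows:

    $$\mathit{Top100Share}^{m,c}_{it} = \frac{\text{number of products in the transaction that rank among the top 100}}{\text{total number of products in the transaction}}$$

    The superscript m indicates that the sales ranking used to compute the numerator of

    $\mathit{Top100Share}^{m,c}_{it}$ is calculated using monthly sales, and the superscript c indicates which one of

    the following three different ways is used to compute the monthly sales ranks of products: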

    a- Using sales from the physical store channel exclusively;

    b- Using sales from the online channel exclusively;

    c- Using sales from the physical store channel for transactions made at physical stores and

    using sales from the online channel for transactions made online, i.e.:


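    $$c = \begin{cases} \text{ranks based on physical store sales} & \text{if the transaction was made at a physical store} \\ \text{ranks based on online sales} & \text{if the transaction was made online} \end{cases}$$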

    While the first two metrics have been traditionally used to measure how e-commerce affects the

    concentration of sales, these metrics are less useful in our context due to the large differences

    between the popularity rankings of products across online versus offline markets: niche products in

    the offline market might be popular products in the online market and vice versa. In this context, it

    becomes necessary to disentangle the differences in the spread (concentration) of the distributions

    of online and offline sales from the differences in the location of the distributions of online and

    offline sales, since these distributions concentrate around different sets of products (see Figure 1

    above).4 By separately evaluating the popularity ranks for transactions made at physical stores

    versus online, our third and proposed metric is useful for examining whether consumers' purchases

    become more or less concentrated when they move to the online market in a way that is not

    contaminated by the differences in the location of the distributions of online and offline sales.
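    The three ways of ranking products can be contrasted with a small sketch. It is only an illustration under simplifying assumptions: sales are taken for a single month, the names `top_set` and `top_share` and the `basis` labels are hypothetical, and the toy data use a top 2 rather than a top 100 cutoff.

```python
def top_set(sales, n):
    """Ids of the n best sellers in one channel (ties broken by product id)."""
    ranked = sorted(sales, key=lambda p: (-sales[p], p))
    return set(ranked[:n])

def top_share(items, is_online, offline_sales, online_sales, basis, n=100):
    """Share of the items in one transaction that belong to the top-n set.

    basis: "offline" (option a), "online" (option b), or
    "own" (option c: ranks from the channel where the transaction was made).
    """
    if basis == "own":
        basis = "online" if is_online else "offline"
    sales = online_sales if basis == "online" else offline_sales
    top = top_set(sales, n)
    return sum(item in top for item in items) / len(items)

# Hypothetical toy data: each channel has its own two best sellers.
offline_sales = {"a": 9, "b": 8, "c": 2, "d": 1}
online_sales = {"c": 9, "d": 8, "a": 2, "b": 1}
store_basket, web_basket = ["a", "b"], ["c", "d"]

for basis in ("offline", "online", "own"):
    before = top_share(store_basket, False, offline_sales, online_sales, basis, n=2)
    after = top_share(web_basket, True, offline_sales, online_sales, basis, n=2)
    print(basis, before, after)
```

    In this toy case a household that buys each channel's best sellers appears to move toward the tail under the offline ranking and toward the head under the online ranking, while the channel-specific ranking shows no change in concentration.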

    Our empirical approach is to estimate fixed effects models of the following form:

    $$\mathit{Top100Share}^{m,c}_{it} = \beta \, \mathit{Online}_{it} + \alpha_i + \delta_m + \varepsilon_{it} \qquad (1)$$

    Model (1) examines the change in $\mathit{Top100Share}^{m,c}_{it}$, the share of products in each transaction taken by the purchases

    of top 100 products, when consumers move from offline to online channels (when the dummy

    variable representing an online transaction, $\mathit{Online}_{it}$, goes from 0 to 1). In Model (1), $\alpha_i$ represents a fixed

    effect for household i, $\delta_m$ represents a fixed effect for month m, and $\varepsilon_{it}$ is the error term. Household fixed effects

    capture the heterogeneity from time-invariant characteristics of each household as well as other

    characteristics that are unlikely to change substantially during our two-year observation period

    (e.g., preferences, income, or household size). Month fixed effects absorb aggregate shocks over

    time such as time trends or seasonality shocks.

    4 It might seem reasonable to use aggregate sales from both online and offline markets to calculate an overall ranking

    of sales. However, using aggregate sales does not separate the location from the spread of the distributions of online

    and offline sales. Moreover, using aggregate sales to compute ranks would provide an invalid metric in our setting

    because it would give a substantially larger weight to the distribution of sales at physical stores than to the distribution

    of sales online (we note above that a much larger fraction of the overall sales from our focal firms are made at physical

    stores than online). For reference, we replicate our analysis using a ranking based on aggregate sales in the Appendix.

    Similar to Zentner, Smith, and Kaya (2012), our panel data estimation approach accounts for the

    potential sorting of heterogeneous consumers into channels by controlling for all time-invariant

    characteristics of each household.
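    A regression with household and month fixed effects can be sketched with explicit dummy variables. This is an illustrative sketch only: it assumes NumPy, uses a hypothetical function name and noise-free toy data, and omits the household-clustered standard errors used in the paper. With tens of thousands of households, a within transformation would normally replace the explicit dummies.

```python
import numpy as np

def two_way_fe_ols(y, online, household, month):
    """OLS of y on an online dummy plus household and month dummies.

    Returns the coefficient on the online dummy.
    """
    y = np.asarray(y, dtype=float)
    online = np.asarray(online, dtype=float)
    _, h_idx = np.unique(household, return_inverse=True)
    _, m_idx = np.unique(month, return_inverse=True)
    H = np.eye(h_idx.max() + 1)[h_idx]         # one dummy per household
    M = np.eye(m_idx.max() + 1)[m_idx][:, 1:]  # month dummies, first month omitted
    X = np.column_stack([online, H, M])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]

# Hypothetical toy panel: 4 households observed in 3 months, no noise.
households = np.repeat(np.arange(4), 3)
months = np.tile(np.arange(3), 4)
online = (households + months) % 2
alpha = np.array([0.30, 0.10, 0.25, 0.40])[households]  # household effects
delta = np.array([0.00, 0.02, -0.01])[months]           # month effects
share = -0.05 * online + alpha + delta

beta = two_way_fe_ols(share, online, households, months)
print(round(beta, 6))
```

    Because the toy outcome is built without noise, the estimate recovers the coefficient of -0.05 that was used to generate it, even though households differ widely in their baseline shares.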

    5. Ordinary Least Squares Results

    Table 4 presents OLS estimation results for Model (1) using data from Firms A, B, and C. All

    regressions include fixed effects for each month and also include household level fixed effects.

    The standard errors are clustered at the household level to allow for the possibility of serial

    correlation over time.

    In Column I of Table 4 we define product popularity based on sales from the online channel

    exclusively. When defining popularity in this way the regression results show superstar effects for

    all firms: the results show that the share of transactions taken by the top 100 online products

    increases when consumers move to online markets. The predicted superstar effects are large in size

    for all three firms. For Firm A the top 100 products online take 17.8% of all sales, and the

    estimation results predict that the share of transactions taken by the top 100 online products would

    increase by 43.3% if consumers moved all of their transactions to online markets.5 For Firms B

    and C the estimation results predict that the share of transactions taken by the top 100 online

    products would increase by 42.3% and 19.6% respectively if all transactions moved to online

    markets.

    In Column II of Table 4 we define product popularity based on sales from the physical store

    channel exclusively. The results in Column II contrast with those in Column I; they show long tail

    effects instead of superstar effects for all three firms: the results in Column II show that the share

    of transactions taken by the top 100 offline products decreases when consumers move from offline

    to online markets. The predicted long tail effects from Column II are large in size; according to the

    estimates in Column II the share of transactions taken by the top 100 offline products would

    decrease as consumers move all of their transactions to online markets (by 32.9% for Firm A,

    55.8% for Firm B, and 47.8% for Firm C).

    5 This predicted effect is calculated using the proportion of all transactions taken by the online channel (18.9%), the

    regression coefficient (0.0955), and the proportion of transactions taken by the top 100 products from the online

    market (0.178): (1-0.189)x(0.0955/0.178).
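    The calculation in footnote 5 can be written out as a short worked example. The function name is hypothetical; the inputs are the rounded figures quoted in footnote 5 for Firm A.

```python
def predicted_relative_change(online_share, coefficient, mean_top_share):
    """(1 - online share) x (coefficient / mean top-100 share), as in footnote 5."""
    return (1.0 - online_share) * coefficient / mean_top_share

# Firm A, popularity ranked on online sales: inputs as quoted in footnote 5.
firm_a_online_rank = predicted_relative_change(0.189, 0.0955, 0.178)
print(f"{firm_a_online_rank:.1%}")
```

    With these rounded inputs the formula gives a value of roughly 0.435, close to the 43.3% reported in the text; the small gap presumably reflects rounding in the published inputs.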

    Table 4: Share Taken by Top 100 Products in Each Transaction

    OLS Estimates

                                 I                II               III
                                 Rank Based on    Rank Based on    Proposed
                                 Online Sales     Offline Sales    Metric
    Firm A
      Dummy Online Purchases     0.0955***        -0.0988***       0.0038
                                 (0.0033)         (0.0029)         (0.0035)
      Observations               171,964          171,964          171,964
      R-squared                  0.1909           0.2021           0.1866
    Firm B
      Dummy Online Purchases     0.0738***        -0.1519***       -0.0577***
                                 (0.0027)         (0.0026)         (0.0030)
      Observations               134,934          134,934          134,934
      R-squared                  0.2537           0.2889           0.261
    Firm C
      Dummy Online Purchases     0.1220***        -0.3516***       -0.1594***
                                 (0.0095)         (0.0087)         (0.0097)
      Observations               33,355           33,355           33,355
      R-squared                  0.411            0.5117           0.4157

    Includes fixed effects for both months (24) and individuals (Company A: 22,464; Company B: 23,991; Company C: 11,106).
    Standard errors in parentheses are clustered by household.
    * significant at 10%; ** significant at 5%; *** significant at 1%
    For Firm A, the mean of the dependent variable is 0.178 in Column I, 0.242 in Column II, and 0.261 in Column III.
    For Firm B, the mean of the dependent variable is 0.125 in Column I, 0.196 in Column II, and 0.221 in Column III.
    For Firm C, the mean of the dependent variable is 0.329 in Column I, 0.389 in Column II, and 0.481 in Column III.
    The mean of the independent variable (Dummy Online Purchases) is 0.189 for Firm A, 0.278 for Firm B, and 0.469 for Firm C.

    The estimates in Columns I and II of Table 4 provide contradictory conclusions regarding whether

    online commerce produces long tail or superstar effects. Based on the results from these

    regressions it is unclear whether producers and retailers should focus their resources on producing

    and stocking long tail or superstar products as consumers move to online markets.

    The seemingly implausible results in Columns I and II actually have an intuitive explanation, and

    the explanation might have been anticipated from our summary statistics showing that there are

    differences in the locations of the distributions of online and offline sales: some popular products

    at physical stores are niche in online markets and vice versa. The results in Columns I and II are

    consistent with consumers increasing their purchases of products that are niche at physical stores

    but popular in online markets as they move from offline to online markets.

    It might be thought that using overall sales combining sales from both the online and offline

    channels to compute popularity ranks might be the correct way of measuring concentration effects.

    However, a metric seeking to measure changes in the concentration of sales as consumers move to

    online markets must allow for separating the differences in the location versus the concentration of

    the distributions of online and offline sales. Although ranks based on overall sales combine sales

    from both channels, this metric does not separate the two relevant moments: location and

    concentration of the online and offline sales distributions. Moreover, using overall sales gives a

    substantially larger weight to the distribution of sales at physical stores than to the distribution of

    sales online as we note above (see footnote 4 and the Appendix where we present results using

    overall sales to compute sales rankings).

    In Column III of Table 4 we show the results using our proposed metric that accounts for the

    differences in the locations of the distributions of online and offline sales. This metric uses sales

    from physical stores when evaluating popularity ranks for transactions made at physical stores, and

    sales from the online channel when evaluating popularity ranks for transactions made online. The

    results show no change in the concentration of sales for Firm A. When consumers buying from

    Firm A move their purchases to the online channel, they buy a similar fraction of products from

    the head of the online sales distribution as they were buying from the head of the offline sales

    distribution. The estimation results in Column III show long tail effects for Firms B and C; these

    long tail effects are, however, substantially smaller in size than when using sales from the offline

    channel exclusively to define popularity ranks.

    Our results demonstrate the importance of accounting for the differences in product popularity

    across the online and offline channels when measuring the size of the e-commerce impact on the

    concentration of sales. This is particularly important when examining industries where product

    popularity differs substantially online and offline, such as our focal industry, where

    ignoring these differences would lead to wrong conclusions regarding the effect of e-commerce on

    the concentration of sales.


    6. Instrumental Variable Results

    Our regressions in the previous section allow us to control for the heterogeneity across consumers

    who might sort into either the online or the physical store channel. However, a second confounding

    factor may arise due to channel selection if customers who use both channels choose the specific

    channel based on the types of products they desire to purchase. This confounding factor would

    cause a bias in measuring concentration effects if the choice of the channel is correlated with

    product popularity (e.g., consumers choose the online channel to purchase either niche or popular

    products). In this section we examine e-commerce concentration effects after controlling for this

    source of potential endogeneity in channel choice for multichannel consumers. In order to break

    this potential endogeneity problem, we need to observe changes in channel choice that are

    unrelated to the popularity of the products that the customers wish to purchase. We use the entry of

    physical stores during our study period as an instrumental variable. Our focal retailer opened a

    large number of physical stores for its three brands: it increased the number of physical stores by

    17.6% for Brand A, 23.2% for Brand B, and 89.1% for Brand C. The entry of physical stores

    decreases the transportation cost for customers who live near the location of the entrant stores, and

    therefore may affect their channel choice (see Forman et al. 2009 and Choi and Bell 2011). We

    believe that the entry of physical stores is a valid instrument. While the entry of physical stores and

    the location of the new store are obviously choices for the firms, we believe these choices are

    unrelated to the relative demand for niche versus superstar products, making the instrument

    unrelated to the error term (i.e., exogenous).
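    The logic of the instrumental variable approach can be illustrated with a bare-bones two-stage least squares sketch. It is a simplified illustration, not the paper's specification: it assumes NumPy, a single binary instrument (a hypothetical "store nearby" dummy), no fixed effects, and toy data constructed so that the unobserved taste term is uncorrelated with the instrument but correlated with channel choice.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x, instrument z, and an intercept."""
    const = np.ones((len(y), 1))
    Z = np.column_stack([const, z])
    x_hat = Z @ ols(Z, x)                     # first stage: fitted values of x
    X_hat = np.column_stack([const, x_hat])
    return ols(X_hat, y)[1]                   # second stage: coefficient on fitted x

# Hypothetical toy data (8 transactions).
store_nearby = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)   # instrument
online = np.array([1, 1, 1, 0, 1, 0, 0, 0], dtype=float)         # channel choice
taste = np.array([0.1, 0.1, 0.1, -0.3, 0.3, -0.1, -0.1, -0.1])   # unobserved
share = 0.3 - 0.1 * online + taste                                # true effect: -0.1

beta_ols = ols(np.column_stack([np.ones(8), online]), share)[1]
beta_iv = two_stage_least_squares(share, online, store_nearby)
print(round(beta_ols, 4), round(beta_iv, 4))
```

    In this constructed example, households with a taste for top products are also the ones buying online, so the simple regression shows a positive coefficient, whereas the instrumented estimate recovers the negative effect used to generate the data.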

    Table 5 presents Instrumental Variables results for Firms A, B, and C. The regressions in Column I

    of Table 5 present the first stage regression results examining how the entry of physical stores

    affects consumers' channel choices. These regressions include household level fixed effects; the

    results of these regressions show that customers decrease the proportion of purchases from the

    online channel and increase the proportion of purchases from physical stores when physical stores

    enter close to where they live. For Firm A some of the estimates for the coefficients on the

    dummies indicating the presence of a store near the consumer's location in each month are not

    statistically significant; for Firms B and C the coefficient estimates on all dummies indicating the

    presence of a store near the consumer's location are statistically significant and also large in size.

    For example, using the first stage coefficient estimates for Firm B shows that consumers who did


    not have a store within 20 miles from where they live would increase the likelihood of buying from

    the physical store by 25.9% when a physical store enters within a mile of where they live.

    Column II in Table 5 presents the second stage instrumental variable results. These results are

    consistent with the OLS results in Column III of Table 4: they show no statistically significant

    changes in concentration for Firm A (the positive sign in fact suggests superstar effects) and long

    tail effects for Firms B and C.

    Unlike the OLS results in the previous section, the Instrumental Variable results account for the

    types of products that customers select to purchase online versus offline. Thus, comparing the OLS

    results in Table 4 that are affected by the selection of the channel and the Instrumental Variable

    results in Table 5 that account for the selection of the channel we can speculate about whether

    consumers choose to purchase niche versus popular products in online versus offline markets. The

    Instrumental Variable results for Firms B and C in Table 5 predict a longer tail compared to the

    OLS results in Column III of Table 4, suggesting that consumers actually select the online channel

    to purchase relatively more superstar products than when selecting the physical store channel. Any

    bias in long tail effect estimates from the OLS regressions for Firms B and C would therefore be in

    the direction of finding superstar effects. For Firm A, the size of the instrumental variable

    coefficient estimate in Table 5 shows more superstar effects than the size of the OLS coefficient

    estimate in Table 4, suggesting that consumers select the online channel to purchase relatively

    more long tail products than when selecting the brick and mortar store channel (although neither

    the OLS nor the IV coefficient estimates are statistically significant for Firm A).


    Table 5: Share Taken by Top 100 Products in Each Transaction

    IV Estimates

                                                I               II
                                                First Stage     Second Stage -
                                                                Proposed Metric
    Firm A
      Dummy Store between 0 and 1 Miles         -0.0760**       na
                                                (0.0321)        na
      Dummy Store between 1 and 3 Miles         -0.0294         na
                                                (0.0235)        na
      Dummy Store between 3 and 10 Miles        -0.0569***      na
                                                (0.0210)        na
      Dummy Store between 10 and 20 Miles       -0.0346         na
                                                (0.0265)        na
      Dummy Online Purchases                    na              0.0946
                                                na              (0.2727)
      Observations                              171,964         171,964
      R-squared                                 0.4259          na
    Firm B
      Dummy Store between 0 and 1 Miles         -0.2599***      na
                                                (0.0434)        na
      Dummy Store between 1 and 3 Miles         -0.2630***      na
                                                (0.0339)        na
      Dummy Store between 3 and 10 Miles        -0.1701***      na
                                                (0.0235)        na
      Dummy Store between 10 and 20 Miles       -0.1129***      na
                                                (0.0234)        na
      Dummy Online Purchases                    na              -0.1100*
                                                na              (0.0668)
      Observations                              134,934         134,934
      R-squared                                 0.4495          na
    Firm C
      Dummy Store between 0 and 1 Miles         -0.3711***      na
                                                (0.0927)        na
      Dummy Store between 1 and 3 Miles         -0.2584***      na
                                                (0.0646)        na
      Dummy Store between 3 and 10 Miles        -0.2024***      na
                                                (0.0402)        na
      Dummy Store between 10 and 20 Miles       -0.1380***      na
                                                (0.0460)        na
      Dummy Online Purchases                    na              -0.1800*
                                                na              (0.1038)
      Observations                              33,355          33,355
      R-squared                                 0.6417          na

    Includes fixed effects for both months (24) and individuals (Company A: 22,464; Company B: 23,991; Company C: 11,106).
    Standard errors in parentheses are clustered by household.
    * significant at 10%; ** significant at 5%; *** significant at 1%


    7. Conclusion

    While the long tail hypothesis was considered one of the best ideas of 2005 by industry

    observers (Businessweek 2005), most empirical examinations on the long tail hypothesis have

    focused on a narrow set of product categories: movies and books. One important question we

    raise in this paper is whether prior findings regarding the long tail from examining movies or

    books generalize to other contexts, in particular industries where physical examination before

    purchase might be more prevalent than the movie and book industries.

    Our empirical analysis focuses on the apparel industry, where non-digital product attributes are

    more prevalent and thus physical examination of product characteristics such as personal fit, color,

    or texture is more important than it is for movies or books. We use a unique individual-level panel

    data set of purchase transactions from three specialty apparel retailers that operate both brick and

    mortar and online channels. We demonstrate that sales online and offline are substantially different

    in our focal industry, to an extent that creates wide differences between the sets of products that

    are popular online versus offline. This characteristic of our focal industry not only invalidates

    the empirical generalization of prior results from examining movies or books to the apparel

    industry; we also show that the methods previously employed to examine long tail effects in

    the markets for movies or books are invalid for examining the apparel industry in particular, and

    are therefore generally invalid except in special cases.

    In this paper we demonstrate how ignoring the way sales in the online and offline channels

    concentrate around different sets of products leads to misleading conclusions regarding e-

    commerce concentration effects. We show that measuring long tail effects using data from offline

    (online) sales exclusively when the distributions of online and offline sales concentrate around

    different sets of products biases the estimates toward finding large long tail (superstar) effects.

    However, these estimated long tail (superstar) effects just indicate that sales move toward products

    that are popular online (offline) and away from products that are popular offline (online).

    To overcome this challenge, we propose a metric that isolates the difference in the concentration of

    sales across the offline and the online channels (second moment) by controlling for the difference

    in the locations of the online and offline sales distributions. In our empirical analysis we control


    for consumer heterogeneity by using individual-level fixed effects, and further control for channel

    selection effects by using store entry as an instrumental variable. Qualitatively, using our proposed

    metric we find long tail effects when consumers use the online channel more frequently for two of

    our retailers and no changes in concentration for the other retailer. More importantly, in terms of

    size our concentration effect estimates are bracketed by large spurious long tail effects when

    measuring product popularity based on offline sales exclusively and large spurious superstar

    effects when measuring product popularity based on online sales exclusively.

    We believe that our results and methods are important not only for the long tail versus superstar

    literature, but also for managerial practice. Ignoring the differences in product popularity across

    channels, and using an incorrect measure of the e-commerce concentration effects may lead to

    managerial errors regarding product selection, product variety, and stocking decisions. Examining

    differences in product popularity by channel and long tail effects for other industries where

    product popularity might differ across online and offline channels is warranted.


    Appendix

    Appendix A Analysis at the SKU Level

    In the database each specific color and size combination within a style is assigned a unique SKU

    code. In the main text we conducted our analysis at the style level, which aggregates items

    offered in a variety of sizes and colors. To check for robustness, in this appendix we replicate our

    analysis in the main text using information disaggregated at the SKU level. We show that the

    conclusions in the main text are accentuated when using data at the SKU level; our analysis at the

    style level in the main text therefore represents a conservative choice.

    Table A1 presents OLS estimates for Model (1) in the main text using SKU level data. The results

    in Table A1 are similar to the results at the style level presented in the main text. Column I in

    Table A1 shows superstar effects when product popularity is based on online sales exclusively.

    Moreover, the predicted superstar effects in Column I of Table A1 are larger in magnitude

    compared to the results in the main text using data at the style level. The estimation results in

    Column I of Table A1 indicate that the share of transactions taken by the top 100 online products

    would increase by 118% if consumers moved all of their transactions to online markets for Firm A

    (109% for Firm B and 42% for Firm C).

    Column II of Table A1 presents results computing product popularity based on offline sales

    exclusively; the results predict long tail effects for all three firms as in the style level analysis

    presented in the main text. In terms of size, the estimation results in Column II of Table A1

    indicate that the share of transactions taken by the top 100 offline products would decrease by 47%

    if consumers moved all of their transactions to online markets for Firm A (66% for Firm B and

    61% for Firm C). The predicted long tail effects in Column II of Table A1 are slightly larger in

    magnitude compared to those in the main text using data at the style level.

    In Column III of Table A1 we present the results using our proposed metric on data disaggregated at

    the SKU level. The results indicate the existence of superstar effects for Firms A and B (23% and

    5% respectively) and long tail effects for Firm C (15%).


    Table A1: Share Taken by Top 100 Products in Each Transaction

    OLS Estimates at the SKU Level

                                 I                II               III
                                 Rank Based on    Rank Based on    Proposed
                                 Online Sales     Offline Sales    Metric
    Firm A
      Dummy Online Purchases     0.0696***        -0.0439***       0.0257***
                                 (0.0021)         (0.0016)         (0.0023)
      Observations               171,964          171,964          171,964
      R-squared                  0.2393           0.2259           0.2169
    Firm B
      Dummy Online Purchases     0.0575***        -0.0549***       0.0051***
                                 (0.0018)         (0.0015)         (0.0020)
      Observations               134,934          134,934          134,934
      R-squared                  0.3128           0.2526           0.2610
    Firm C
      Dummy Online Purchases     0.1010***        -0.1816***       -0.0605***
                                 (0.0063)         (0.0067)         (0.0075)
      Observations               33,355           33,355           33,355
      R-squared                  0.417            0.4441           0.3931

    Includes fixed effects for both months (24) and individuals (Company A: 22,464; Company B: 23,991; Company C: 11,106).
    Standard errors in parentheses are clustered by household.
    * significant at 10%; ** significant at 5%; *** significant at 1%
    For Firm A, the mean of the dependent variable is 0.048 in Column I, 0.076 in Column II, and 0.089 in Column III.
    For Firm B, the mean of the dependent variable is 0.038 in Column I, 0.06 in Column II, and 0.077 in Column III.
    For Firm C, the mean of the dependent variable is 0.128 in Column I, 0.157 in Column II, and 0.216 in Column III.
    The mean of the independent variable (Dummy Online Purchases) is 0.189 for Firm A, 0.278 for Firm B, and 0.469 for Firm C.

    The differences in product popularity across channels are magnified when conducting the analysis

    at the more disaggregated SKU level compared to the style level used in the main text. Figure A1 is

    analogous to Figure 2 in the main text, and shows the commonality of superstar products across the

    online and offline channels for all three brands when examining the data at the more disaggregated

    SKU level. Compared to the style level analysis presented in Figure 2 in the main text, the analysis

    at the SKU level in Figure A1 shows that the number of products in the intersection area C,

    representing the number of products that are top 100 in all overall, online, and offline rankings,

    drops from 43 (in Figure 2 in the main text) to 23 (in Figure A1) for Brand A, from 43 to 14 for

    Brand B and from 45 to 19 for Brand C.

    Figure A1: Commonality in Popular Products in Online and Offline Channels, SKU Level Analysis (panels: Brand A, Brand B, Brand C)

    Tables A2a through A2c are analogous to Tables 2a through 2c in the main text, and tabulate

    popularity ranks for various thresholds for the online and offline channels calculated at the SKU

    level. Compared to the statistics in Tables 2a through 2c in the main text using data at the style

    level, the differences in product popularity across the online and offline channels are substantially

    larger when computing the statistics at the SKU level. For example, Table A2b for Brand B shows

    that 28 of the top 50 products in the online channel are not even among the top 1000 products in

    the offline channel, whereas Table 2b for the same brand in the main text shows that 14 of the top 50

    products in the online channel are not among the top 1000 products in the offline channel. The data

    from our two other companies show similar patterns.

    Table A2a: Comparison of Popular Products Offline vs. Online for Brand A at the SKU Level

    Table A2b: Comparison of Popular Products Offline vs. Online for Brand B at the SKU Level

    Table A2c: Comparison of Popular Products Offline vs. Online for Brand C at the SKU Level

    The greater differences in product popularity across channels when conducting the analysis at the

    SKU level generate a greater contrast between the superstar effects in Column I of Table A1 and

    the long tail effects in Column II of Table A1 compared to the results in the main text (the contrast

    between the superstar effects in Column I of Table 4 in the main text and the long tail effects in

    Column II of Table 4 in the main text is smaller than the contrast between the estimates in Columns

    I and II of Table A1).

    Table A3 presents IV results using data at the SKU level. The first stage results in Column I are

    identical to those in the main text (the instrumented and instrumental variables in Table A3 and

    Table 5 in the main text are the same). The second stage IV results in Column II of Table A3 are

    consistent with our results at the style level in the main text.


    Table A3: Share Taken by Top 100 Products in Each Transaction

    IV Estimates at the SKU Level

                                                I               II
                                                First Stage     Second Stage -
                                                                Proposed Metric
    Firm A
      Dummy Store between 0 and 1 Miles         -0.0760**       na
                                                (0.0321)        na
      Dummy Store between 1 and 3 Miles         -0.0294         na
                                                (0.0235)        na
      Dummy Store between 3 and 10 Miles        -0.0569***      na
                                                (0.0210)        na
      Dummy Store between 10 and 20 Miles       -0.0346         na
                                                (0.0265)        na
      Dummy Online Purchases                    na              0.062
                                                na              (0.1741)
      Observations                              171,964         171,964
      R-squared                                 0.4259          na
    Firm B
      Dummy Store between 0 and 1 Miles         -0.2599***      na
                                                (0.0434)        na
      Dummy Store between 1 and 3 Miles         -0.2630***      na
                                                (0.0339)        na
      Dummy Store between 3 and 10 Miles        -0.1701***      na
                                                (0.0235)        na
      Dummy Store between 10 and 20 Miles       -0.1129***      na
                                                (0.0234)        na
      Dummy Online Purchases                    na              -0.1016**
                                                na              (0.0432)
      Observations                              134,934         134,934
      R-squared                                 0.4495          na
    Firm C
      Dummy Store between 0 and 1 Miles         -0.3711***      na
                                                (0.0927)        na
      Dummy Store between 1 and 3 Miles         -0.2584***      na
                                                (0.0646)        na
      Dummy Store between 3 and 10 Miles        -0.2024***      na
                                                (0.0402)        na
      Dummy Store between 10 and 20 Miles       -0.1380***      na
                                                (0.0460)        na
      Dummy Online Purchases                    na              -0.0869
                                                na              (0.0854)
      Observations                              33,355          33,355
      R-squared                                 0.6417          na

    Includes fixed effects for both months (24) and individuals (Company A: 22,464; Company B: 23,991; Company C: 11,106).
    Standard errors in parentheses are clustered by household.
    * significant at 10%; ** significant at 5%; *** significant at 1%


    Appendix B Popularity Based on Overall Sales

    Table A4 presents results using popularity ranks based on overall sales (aggregating sales from

    both the online and offline channels). The table also presents the results from Table 4 in the main

    text for comparison.

    For Firms A and B, Table A4 shows that ranking product popularity by overall sales (Column IV) yields results that closely resemble those obtained when popularity is computed from offline sales exclusively (Column II). This is expected for Firms A and B, since offline sales account for a substantially larger share of overall sales than online sales for these companies (see Table 1 in the main text). As a result, basing product popularity on overall sales is similar to basing it on sales from the dominant channel. Using overall sales does not control for the differences in product popularity across channels when the distributions of online and offline sales are centered on different locations. Ranking products by overall sales therefore also produces misleading results.
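The mechanism described above can be illustrated with a small sketch, assuming invented unit sales for four products and a top-2 cutoff in place of the top 100. It shows that when offline volume dominates, the set of top products by overall sales coincides with the set of top products by offline sales, so an online basket scores the same under both rankings. The helper names `top_n` and `top_share` and all figures are hypothetical; the sketch does not implement the proposed metric from the main text.

```python
def top_n(sales, n):
    """Return the set of the n best-selling products (ties broken by name)."""
    ranked = sorted(sales, key=lambda product: (-sales[product], product))
    return set(ranked[:n])


def top_share(basket, top_set):
    """Share of the items in one transaction that belong to top_set."""
    return sum(1 for product in basket if product in top_set) / len(basket)


# Invented unit sales by channel; offline volume is much larger than online.
offline = {"a": 900, "b": 800, "c": 100, "d": 50}
online = {"a": 10, "b": 20, "c": 90, "d": 80}
overall = {product: offline[product] + online[product] for product in offline}

top_offline = top_n(offline, 2)
top_online = top_n(online, 2)
top_overall = top_n(overall, 2)

# One hypothetical online transaction with four items.
online_basket = ["c", "d", "c", "a"]

share_by_online_rank = top_share(online_basket, top_online)
share_by_offline_rank = top_share(online_basket, top_offline)
share_by_overall_rank = top_share(online_basket, top_overall)

print(share_by_online_rank, share_by_offline_rank, share_by_overall_rank)
```

Because the overall ranking inherits the offline ranking, the same online basket looks concentrated under the online ranking and dispersed under both the offline and the overall rankings.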

    Table A4: Share Taken by Top 100 Products in Each Transaction
    OLS Estimates Including Overall Sales Measure

                              I               II              III          IV
                              Rank Based on   Rank Based on   Proposed     Rank Based on
                              Online Sales    Offline Sales   Metric       Overall Sales

    Firm A
    Dummy Online Purchases    0.0955***       -0.0988***      0.0038       -0.0709***
                              (0.0033)        (0.0029)        (0.0035)     (0.0031)
    Observations              171,964         171,964         171,964      171,964
    R-squared                 0.1909          0.2021          0.1866       0.1967

    Firm B
    Dummy Online Purchases    0.0738***       -0.1519***      -0.0577***   -0.1162***
                              (0.0027)        (0.0026)        (0.0030)     (0.0027)
    Observations              134,934         134,934         134,934      134,934
    R-squared                 0.2537          0.2889          0.261        0.2743

    Firm C
    Dummy Online Purchases    0.1220***       -0.3516***      -0.1594***   -0.1861***
                              (0.0095)        (0.0087)        (0.0097)     (0.0096)
    Observations              33,355          33,355          33,355       33,355
    R-squared                 0.411           0.5117          0.4157       0.4359

    Includes fixed effects for both months (24) and individuals (Company A: 22,464; Company B: 23,991; Company C: 11,106).
    Standard errors in parentheses are clustered by household.
    * significant at 10%; ** significant at 5%; *** significant at 1%
    For Firm A, the mean of the dependent variable is 0.178 in Column I, 0.242 in Column II, 0.261 in Column III, and 0.244 in Column IV.
    For Firm B, the mean of the dependent variable is 0.125 in Column I, 0.196 in Column II, 0.221 in Column III, and 0.199 in Column IV.
    For Firm C, the mean of the dependent variable is 0.329 in Column I, 0.389 in Column II, 0.481 in Column III, and 0.413 in Column IV.
    The mean of the independent variable (Dummy Online Purchases) is 0.189 for Firm A, 0.278 for Firm B, and 0.469 for Firm C.
