Quantifying the “Pitchfork Effect”
Michael B. Briskin Brown University
December 2014
Abstract The Internet has changed the way people discover and consume new music, shifting the role of critics and reviewers. This study attempts to isolate the effect that the music website Pitchfork has on the popularity of the albums it reviews. Specifically, I focus on Pitchfork’s signature “Best New Music” (BNM) distinction, awarded to albums that Pitchfork deems worthy. I compare albums that received the award to albums of the same score that didn’t receive the award, allowing me to attribute any differences in popularity to the award. I find that “Best New Music” has a large and statistically significant impact on an album’s popularity immediately after its release—if popularity is measured on a 0-100 scale, the effect of “Best New Music” is about 15-17. That effect is sustained over time. “Best New Music,” not score, seems to be the most significant predictor of long-term popularity.
Background
When nineteen-year-old Ryan Schreiber founded Pitchfork Media out of his
parents’ basement in 1995, he capitalized on a gaping hole in the rapidly developing
Internet; while music nerds thrived on “fanzines” to keep up with the latest trends, there
was no continually updated source that could provide fans instant information and
reviews. Pitchfork “took the model and the voice of a print publication to the Internet,
where it could cultivate a small but influential readership and write about music in any
form and at any length it wanted,” says journalist Dave Itzkoff (2006) in his Wired article
“The Pitchfork Effect.” In many ways, the website was “speaking directly to listeners no
longer served by traditional media outlets.”
According to Matt Frampton, Pitchfork’s vice president of sales, the website
receives about 6.2 million unique visitors and 40 million page views each month, making
it the most popular independent-focused music site in the world (Singer 2014). In
selecting which albums to review, Pitchfork attempts to bring attention to more obscure
artists and up-and-coming acts. In general, artists signed to major labels (Sony, EMI,
Universal, and Warner) are excluded. While known for its album reviews, the website
also has news, feature stories, interviews, video series, and its annual Pitchfork Festival.
Journalists have written extensively about the so-called “Pitchfork effect,”
described by the New York Times as “the power to pluck a band from obscurity and
thrust it into the indie consciousness, and to push it out just as quickly” (Caramanica
2010). Indie rock gods Arcade Fire are frequently cited as an example of Pitchfork’s
powerful influence. Before Arcade Fire released their debut album Funeral in 2004, they
had a modest following in Montreal. A rare Pitchfork score of 9.7, however, brought
them immediate widespread attention that they likely would not have seen otherwise.
Building on the success of their first album, Arcade Fire has become one of the most
recognizable rock bands in the world today, and they likely owe a share of
their success to that initial glowing Pitchfork review. Conversely, if Pitchfork pans an
album and writes a derisive review, the artist faces a possibility of lackluster album sales
and empty venues.
This study attempts to identify the existence and magnitude of the “Pitchfork
Effect” by taking advantage of a particular facet of Pitchfork’s grading: the award of
“Best New Music” (BNM). While albums with higher scores have a greater chance of
receiving BNM, there is no cutoff score for the award. Thus, albums with the same score
may differ in BNM. The basic question this study tries to answer is: What effect does a
BNM distinction have on an album’s popularity? I compare albums of the same score
with BNM to albums of the same score without BNM, hoping to find a difference in
popularity for BNM albums. I use Google Trends to study the award’s effect one week
after an album’s release, and I find that albums with BNM exhibit a 15-17 greater change
in popularity where popularity for each album is measured on a 0-100 scale relative to its
peak. I use Spotify to measure long-term popularity. Spotify’s absolute measure of
popularity ranks all albums on the same 0-100 scale, and I find that albums with BNM
are about 15-17 more popular than those without any award.
First, I present findings from a survey of past literature. Second, I describe my
data sources and assess their strengths and weaknesses. Third, I present the results of my
linear regressions. Fourth, I discuss the implications of these findings and ideas for future
research.
Relevant Literature
Researchers have studied the role of critics in the creative industries for years, and
many newer studies focus on the changing landscape of online and consumer reviews.
Entertainment products often fall under the category of “experience goods”—“products
whose quality is difficult to observe or sample adequately before purchase” (Dhar &
Chang 2007). Consequently, many people rely on word-of-mouth and advice from critics
to inform their consumption decisions. We would expect positive movie reviews and box
office sales, for example, to be correlated, but establishing causality is a major challenge.
It is possible that critics are simply good at predicting consumer decisions. Alternatively,
it is possible that they actually change consumer decisions and directly impact sales.
Eliashberg and Shugan (1997) show that critical reviews are significantly correlated with
cumulative box office sales, but not with sales during a film’s first few weeks. This
would suggest that critics merely predict consumer behavior rather than influence it.
Basuroy et al. (2003) build upon this approach and separate movie reviews into positive
and negative. Both positive and negative reviews are correlated with sales over an eight-
week period. The impact of negative reviews is significant in a film’s first week but
diminishes over time, suggesting that critics may be able to influence rather than just
predict sales. These studies, while insightful, suffer from a fundamental causality
problem; the authors can speculate about causality based on patterns observed in the data,
but they cannot make any strong causal claims.
Sorensen (2007) exploits accidental omissions in the New York Times Bestseller
List and finds that appearing on the list is associated with a moderate increase in sales,
with a larger effect for debut best-selling authors. The New York Times calculates its
best-seller list by sampling 4,000 bookstores. This method occasionally causes errors;
sometimes authors that actually sold enough books to appear on the list are mistakenly
kept off. Thus, the author can compare books that appeared on the list with books that
“should have” appeared on the list. In this case, the New York Times functions not as a
critic, but as an information-provider to consumers. It seems that even appearing on a list
published by an influential media outlet can affect sales, in this case by about 8%.
As reviews have moved to the Internet, people have almost immediate access to a
plethora of opinions on every new book, movie, or album. This flood of information has
led to the growing importance of blogs and other user-generated reviews that may not
come from well-respected newspapers. Dhar and Chang (2007) consider three types of
online reviews: consumer reviews, online media reviews, and mainstream media reviews.
Consumer reviews consist of user reviews from Amazon as well as a measure of “blog
chatter.” Pitchfork and similar but less influential review websites comprise the
“online media reviews” category. Given that each website reviews albums differently
(some on a 0-10 scale, some with stars, some with letter grades), the authors create a
scale that (perhaps imprecisely) allows them to compare reviews across different
platforms. Finally, the “mainstream media reviews” come from large publications like
Rolling Stone that also exist in print. Tracking a sample of 108 albums four weeks before
and after their release, the authors find blog chatter to be most predictive of sales. Online
media like Pitchfork come in second, while mainstream ratings are least correlated with
sales. Once again, the authors struggle with the correlation versus causation question, and
their results cannot be extrapolated because of the small sample size of 108 albums.
My study takes advantage of a particular facet of Pitchfork’s grading system that
has yet to be studied. Pitchfork’s “Best New Music” distinction can be thought of as a
treatment; by comparing albums that received the award to albums that received the same
score but no award, I can capture the causal effect of BNM on popularity, assuming that
there are no other existing differences between BNM and non-BNM albums that may
affect popularity. If this assumption holds, I effectively have natural treatment and
control groups, an advantage that allows me to make stronger claims about causality, as
opposed to previous studies that are trapped into studying only correlation. In addition,
rather than collecting a sample of albums, I have access to every album Pitchfork has
reviewed, allowing me to make stronger claims about its influence.
Data
1. Pitchfork Data
I begin with a dataset consisting of every album Pitchfork has reviewed from
1998 through October 17, 2014—a total of 15,210 albums. For each album reviewed, I
have information for the artist name, album name, release year, label name, Pitchfork
reviewer, score, accolade, and review date. The “accolade” column contains two awards:
“Best New Music” and “Best New Reissue.” Since I’m interested only in Pitchfork’s
effect on new music, I removed reissued albums from the dataset and created a dummy
variable called bnm which is equal to 1 if the album received Best New Music, and 0
otherwise. Figure 1 displays the distribution of scores taken from every album in the
dataset. The distribution is skewed left with a mean of about 7.
Figure 1. How does Pitchfork rate all the albums it reviews?
I’m interested in albums that have the same score but may differ in bnm. This
occurs only between scores of 8.1 and 8.9: no album scoring below 8.1 receives the
distinction, and every album scoring 9 or higher does receive BNM. Thus, I limit my
dataset to albums with scores between 8.1 and 8.9. To clean the data further, I also
removed live albums, compilations, anthologies, greatest hits, and any other albums
that are not “new.” Keeping
in mind that Pitchfork did not start awarding BNM until 2003 and that Google Trends
data only goes as far back as 2004, I removed all albums from the dataset that were
released before 2004. In total, my final dataset consists of 1,105 albums. It is important to
note that this selection of albums is not a random sample, but rather the entire population
[Histogram: Distribution of Pitchfork Scores; x-axis: score (0-10), y-axis: density.]
Notes: Observations consist of all 15,209 albums that Pitchfork reviewed from 1998 through October 2014.
of interest; I have the relevant information for every album Pitchfork has reviewed in the
score range and time period crucial to my analysis. We can think of these observations,
then, as a sample of the larger population that will include future album reviews.
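The cleaning steps described above can be sketched with pandas. The column names (score, accolade, release_year) and the toy rows below are assumptions about the dataset's layout for illustration, not its actual schema:

```python
import pandas as pd

# Hypothetical layout of the Pitchfork review dataset; actual column names may differ.
reviews = pd.DataFrame({
    "album":        ["A", "B", "C", "D"],
    "score":        [8.4, 8.4, 7.9, 9.1],
    "accolade":     ["Best New Music", None, None, "Best New Reissue"],
    "release_year": [2010, 2006, 2012, 2003],
})

# Dummy equal to 1 if the album received "Best New Music", 0 otherwise.
reviews["bnm"] = (reviews["accolade"] == "Best New Music").astype(int)

# Keep the score range where BNM is not determined by score alone (8.1-8.9),
# drop reissues, and drop albums released before 2004 (no Google Trends data).
sample = reviews[
    reviews["score"].between(8.1, 8.9)
    & (reviews["accolade"] != "Best New Reissue")
    & (reviews["release_year"] >= 2004)
]
```

With the toy rows above, only albums A and B survive the filters, with bnm values 1 and 0.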
Figure 2 describes the likelihood of BNM at each score. Of course, as score
increases, so does the chance of BNM. The y-axis denotes the percentage of albums that
received BNM, and the numbers above each bar represent the total number of albums. For
example, among albums with a score of 8.5, 87 received BNM while 62 received no
award.
Figure 2. Percentage and total number of BNM and non-BNM albums at each score of interest.
[Bar chart titled “How Likely Is BNM For Each Score?”; legend: No Award, Best New Music.]
Notes: The x-axis denotes score. The y-axis shows the percentage of BNM and non-BNM albums at each score. The number above each bar represents the total number of albums.
My analysis is predicated on the assumption that, among albums of the same
score, BNM and non-BNM albums are appropriate treatment and comparison groups. If
certain albums are more likely than others to receive BNM, and if those unobserved
characteristics drive differences in popularity, my regressions will yield biased estimates
that likely overstate the effect of BNM on popularity. Ideally, albums of the same score
will not differ on unobserved characteristics that increase their propensity for BNM.
However, my estimates will still be unbiased as long as any existing differences between
the two groups do not affect popularity. To test this assumption, I ran nine chi-square
tests (for every score 8.1-8.9) to check if, for each score, BNM is equally distributed
among record labels. A low p-value would imply that Pitchfork favors some labels over
others, a notion that would confound my results. Table 1 shows that, for each score, we
cannot reject the null hypothesis that BNM is uniformly distributed among labels. While
it’s possible that other confounds exist, this falsification test gives credence to my key
assumption.
Table 1: Are different record labels equally likely to receive BNM?
Notes: At each score, the chi-square test checks if the BNM award is equally distributed among labels. A p-value below .05 would indicate that some labels are more likely to be awarded BNM.
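The falsification test above can be sketched with scipy. The contingency table below is purely illustrative (made-up counts for three hypothetical labels at one score), not the paper's actual label data:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative label-by-award contingency table for albums at one score:
# rows are record labels, columns are (no award, BNM). Counts are made up.
table = np.array([
    [10, 12],   # label 1
    [ 8, 11],   # label 2
    [ 9, 10],   # label 3
])

# chi2_contingency tests whether BNM is distributed independently of label.
chi2, p, dof, expected = chi2_contingency(table)

# A p-value below .05 would suggest BNM is not evenly distributed across
# labels, threatening the comparability of BNM and non-BNM albums.
print(f"chi2 = {chi2:.3f}, p = {p:.3f}, dof = {dof}")
```

Running one such test per score (8.1 through 8.9) reproduces the structure of the nine tests reported in Table 1.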
2. Google Trends Data
Using the Pitchfork review date, I’m able to calculate each album’s popularity
one week before, the week of, and one week after the review. Google does not allow us to compare
absolute search volume. Instead, every query is displayed on a 0-100 scale, with 100
being the peak interest for that query. I use the syntax “Artist+Album” to input each
album. For example, the search query “Best Coast Crazy For You” gives the results for
the album Crazy For You by the band Best Coast. Since I can’t calculate absolute
popularity, I calculate each album’s change in popularity by subtracting its popularity the
week before its Pitchfork review from its popularity one week after the review. I can
compare this change in popularity across albums because they are all based on a 0-100
scale relative to the album’s peak popularity.
Given that Pitchfork is known for promoting “independent” music, it comes as no
surprise that Google does not have enough data to give results for over half of my dataset.
About 52% of the 1,105 albums are too obscure, leaving me with 533 albums for which
Google returns results. Aside from the missing data and the problem of relative
popularity, a key limitation to Google Trends is that it does not measure actual
consumption of the product. Unlike the studies reviewed in the introduction, this study is
unable to directly track sales data. While Google Trends might provide a good estimate
for the “buzz” generated by an album upon release, I cannot claim that it is strongly
related to actual sales. Goel et al. (2010) track a sample of 307 songs and match Yahoo!
search entries to rank on the Billboard Hot 100 Chart, a measure of every week’s 100
most popular songs. They find the correlation between search volume and rank to be
0.56, suggesting a moderate relationship between search popularity and music sales.
3. Spotify Data
Similar to Google Trends, Spotify calculates every album’s popularity on a 0-100
scale, with higher values for more popular albums. This album popularity measure is
based on the individual play counts of each song on the album. To my knowledge, it is
not possible to track Spotify popularity over time; the popularity measure only gives us
an account of how popular an album has been up to the point of data collection. Spotify
does not have results for about 22% of the dataset in this case, leaving me with a final list
of 875 albums.
While Google Trends allowed me to measure popularity at an exact point in time,
Spotify popularity reflects cumulative plays since the album’s release. Thus, I
should be able to tell if the Pitchfork effect is sustained over longer periods of time. The
Spotify data is also more relevant to the question at hand because it measures direct
consumption. Google Trends doesn’t actually tell us how many people bought or listened
to an album. Conversely, Spotify popularity is based on the actual consumption of the
music.
Given the longer time frame studied compared to Google Trends, it’s possible that
other factors over time may contribute to an album’s Spotify popularity. For instance,
Pitchfork and other websites release their “Best of the Year” lists at the end of every year,
and it’s likely that albums appearing on these lists see another spike in popularity. If this
is the case, then my regressions will overstate the effect of BNM on popularity. It may, in
fact, be that these end-of-year lists are the cause of sustained popularity.
I took a random sample of 55 albums and manually checked to see if they
appeared on Pitchfork’s and Rolling Stone’s “Best Albums” of the year or “Best Songs”
of the year. I created a dummy variable pitchfork_list which is equal to one if the
album or a song from that album was featured on an end-of-the-year Pitchfork list.
Similarly, a dummy variable rollingstone_list is equal to one if the album or a song
from that album was featured on an end-of-year Rolling Stone list. Controlling for score
and release year, albums appearing on a Pitchfork list have a Spotify popularity score
4.62 higher than albums not appearing on a Pitchfork list, though this coefficient is not
statistically significant. The effect of appearing on a Rolling Stone list, however, appears
to be large and statistically significant; controlling for score and release year, albums
appearing on a Rolling Stone end-of-year list have a Spotify popularity score 13.79 higher than albums not
on a Rolling Stone list, significant at the 5% level.
Of albums from the sample that appear on a Rolling Stone list, 62.5% of them
received BNM. Thus, if BNM albums are more likely to appear on a Rolling Stone list
and those lists are strongly associated with an increase in Spotify popularity, my
estimates will overstate the effect of BNM on Spotify popularity. It could be
that appearing on a Rolling Stone list actually accounts for a portion of the effect
seemingly caused by BNM. While the results from this sample illustrate a possible
confound to my methodology, I don’t have enough information to conclude that my
assumption is violated. First, the sample size is very small for this context, and I simply
don’t have enough observations. Of the 55 albums in this sample, only 17 received BNM.
In the future, I hope to collect information on end-of-year lists for each album so I can
include appearance on these lists as a control in my regressions. Second, the correlation
between BNM and appearing on a Rolling Stone list is a fairly weak 0.282, suggesting
that BNM is not a strong predictor of appearing on one of these lists.
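The correlation between the two binary indicators can be computed directly; the vectors below are illustrative stand-ins for the 55-album sample, not the actual data:

```python
import numpy as np

# Hypothetical 0/1 indicators for a handful of albums (made-up values):
bnm     = np.array([1, 1, 0, 0, 1, 0, 0, 1])   # received Best New Music
rs_list = np.array([1, 0, 0, 0, 1, 0, 1, 0])   # appeared on a Rolling Stone list

# Pearson correlation between two binary variables (the phi coefficient).
corr = np.corrcoef(bnm, rs_list)[0, 1]
```

A value near zero would, as in the sample discussed above, suggest that BNM is a weak predictor of list appearance.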
Methodology
1. Google Trends Methodology
I use a difference-in-differences approach to estimate the effect of BNM on an
album’s change in popularity between one week before and one week after the Pitchfork
review is published. My hypothesis is that albums with BNM will display a greater
change in popularity than albums that do not receive the award. Comparing each album’s
change in popularity allows me to circumvent the problem of relative search results. Two
albums may be vastly different in their absolute search queries, but calculating the change
in popularity gives me a unit of measurement I can use to directly compare albums.
Consider the following example that reinforces the intuition behind the design.
Best Coast’s album Crazy For You and The Hold Steady’s album Stay Positive both
received a score of 8.4, though Crazy For You was the only one of the two to receive
BNM. Figure 3a displays the album’s popularity over time. The week before the review
was published, the album’s popularity was 25 on the 0-100 scale. Popularity peaks at 100
on the week of the Pitchfork review and decreases to 59 the week after. I use these values
to calculate the album’s change in popularity as 34. Using the same procedure, I find that
The Hold Steady’s Stay Positive had a change in popularity of 6 (Figure 3b). Even though
the albums differ in their absolute popularity, this measure of change in popularity allows
me to compare the albums to each other. The difference-in-differences estimate I’m after
subtracts the average change in popularity for albums with no award from the average
change in popularity for albums with BNM. In this simplified example with two albums,
the difference-in-differences estimator would be 34− 6 = 28.
Figure 3a. Popularity path for a BNM album before and after its release.
Notes: This album received BNM. Popularity is based on a 0-100 scale where the peak usually occurs during the week of an album’s release. Change in popularity subtracts popularity the week before from the week after.
Figure 3b. Popularity path for a non-BNM album before and after its release.
Notes: This album did not receive BNM. Popularity is based on a 0-100 scale where the peak usually occurs during the week of an album’s release. Change in popularity subtracts popularity the week before from the week after.
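The two-album example above reduces to simple arithmetic, sketched here using the Google Trends values quoted in the text (0-100, relative to each album's peak):

```python
# Best Coast, Crazy For You (score 8.4, BNM): popularity 25 the week before
# the review and 59 the week after.
best_coast_before, best_coast_after = 25, 59
best_coast_change = best_coast_after - best_coast_before   # change in popularity

# The Hold Steady, Stay Positive (score 8.4, no BNM): change in popularity of 6.
hold_steady_change = 6

# Difference-in-differences estimate: change for the BNM album minus
# change for the comparable non-BNM album.
did_estimate = best_coast_change - hold_steady_change
```

This reproduces the 34 − 6 = 28 estimate computed in the text.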
A difference-in-differences design requires a treatment and two time periods.
The “treatment” in this case is a BNM award, while the control group consists of albums
that got no award. The “pre” time period is a week before the Pitchfork review, and the
“post” time is one week after the review. It is easiest to interpret this regression as an
OLS regression with change in popularity as the outcome variable. The basic regression
is given by
∆popularity_i = a_0 + a_1·score_i + a_2·bnm_i + ε_i
where i indexes albums, ∆popularity is change in popularity, score is an album’s
Pitchfork score, bnm is a dummy equal to 1 if the album received “Best New Music” and
zero otherwise, ε is an error term, and a_2 is the parameter of interest that estimates the
causal effect of BNM on change in popularity. In order for my estimates to be unbiased, I
must assume that E[ε_i | bnm_i, score_i] = 0; in particular, bnm and ε must be uncorrelated.
Conditional on score, BNM cannot be correlated with unobserved factors that might
affect popularity. The chi-square tests I ran demonstrated that likelihood of BNM does
not depend on record label, giving me more confidence in the assumption that 𝑏𝑛𝑚 and 𝜀
are not correlated. For a difference-in-differences design, I also have to make a parallel
trends assumption; in this case, I assume that if albums with BNM had not received the
award, their change in popularity would be the same as that of albums without BNM. In
other words, the treatment is uncorrelated with other variables that may affect the
albums’ change in popularity. While I can’t validate the parallel trends assumption
empirically, the chi-square tests give me confidence that albums of the same score differ
only in their treatment, so albums with BNM should have displayed similar changes in
popularity had they not received the award.
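In code, the regression above amounts to an OLS fit of change in popularity on score and the bnm dummy. The sketch below uses simulated data, with a true BNM effect of 15 and no independent score effect assumed purely for illustration; it is not the paper's dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated albums: scores between 8.1 and 8.9, with BNM more likely at
# higher scores (mimicking Figure 2's pattern).
score = rng.choice(np.round(np.arange(8.1, 9.0, 0.1), 1), size=n)
bnm = rng.binomial(1, score - 8.0)

# Assumed data-generating process: true BNM effect of 15, zero score effect.
delta_pop = 5.0 + 15.0 * bnm + rng.normal(0, 1, n)

# OLS: delta_pop_i = a0 + a1*score_i + a2*bnm_i + e_i
X = np.column_stack([np.ones(n), score, bnm])
a0, a1, a2 = np.linalg.lstsq(X, delta_pop, rcond=None)[0]
```

Because bnm is conditionally independent of the error given score here, the fitted a2 recovers the true treatment effect, which is the logic the identification strategy relies on.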
2. Spotify Methodology
For my outcome variable, I use Spotify’s measure of an album’s popularity,
calculated on a 0-100 scale with the most popular album being 100. I use an OLS model
to estimate the effect of BNM on popularity, controlling for score. The basic regression is
given by
spotify_i = β_0 + β_1·score_i + β_2·bnm_i + μ_i
where i indexes albums, spotify is the measure of Spotify popularity, bnm is a dummy
equal to 1 if the album received “Best New Music” and zero otherwise, and μ is an error
term. My coefficient of interest is β_2, which measures the effect of BNM on Spotify
popularity, controlling for score. Once again, I must assume E[μ_i | bnm_i, score_i] = 0.
While the results from the chi-square test give me some confidence in this assumption,
the longer time frame of the Spotify data may cause other problems. Pitchfork’s “Top
100 Tracks” and “Top 50 Albums” lists at the end of every year may give albums on
those lists an additional bump in popularity that could be sustained over time. If albums
with BNM are more likely to appear on these end-of-year lists, then bnm will be
correlated with μ, inducing bias into my coefficient of interest β_2 that will most likely
overstate the impact of BNM on popularity. It is possible that an appearance on one of
these lists drives popularity, not the distinction of BNM. Given that I don’t control for
appearance on these lists, it is possible that my Spotify estimates overstate the impact of
BNM on popularity. In future research, I hope to collect this data and include appearance
on an end-of-year list as another control in my regressions.
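Adding a score-by-BNM interaction to the design matrix lets the score slope differ by award status, which is how the specification in column (3) of the results is built. The sketch below again uses simulated data with made-up coefficients (a non-BNM slope of -5 and an interaction of 9), chosen only to show how the implied BNM slope combines the score and interaction terms:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 800
score = rng.uniform(8.1, 8.9, n)
bnm = rng.binomial(1, 0.4, n)

# Assumed data-generating process: score slope of -5 for non-BNM albums and
# -5 + 9 = 4 for BNM albums (illustrative values, not estimates).
spotify = (40.0 - 5.0 * (score - 8.5) + 12.0 * bnm
           + 9.0 * (score - 8.5) * bnm + rng.normal(0, 2, n))

# OLS with interaction: spotify_i = b0 + b1*score_i + b2*bnm_i + b3*score_i*bnm_i + u_i
X = np.column_stack([np.ones(n), score, bnm, score * bnm])
b0, b1, b2, b3 = np.linalg.lstsq(X, spotify, rcond=None)[0]

# The implied score slope for BNM albums is the sum of the score
# and interaction coefficients.
bnm_slope = b1 + b3
```

This sum of coefficients is exactly the calculation used later when interpreting the interaction column of the Spotify results.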
Results
1. Google Trends
Figures 4a and 4b depict the average change in popularity for non-BNM and
BNM albums, respectively. The error bars show two standard deviations above and
below the mean popularity at each score, indicating that popularity varies greatly at each
score level. Figure 4c depicts the average change in popularity for albums of each score,
separated by whether or not they received BNM. The graph clearly shows that, on
average, albums receiving BNM exhibited a greater change in popularity than their
counterparts that received no award.

Figure 4a. Google Trends popularity for non-BNM albums.
Notes: The bars extend two standard deviations above and below the mean change in popularity for each score. Non-BNM albums with a score of 8.8 or 8.9 were too obscure for Google to return results, which is why the figure does not show any values at 8.8 or 8.9.

Figure 4b. Google Trends popularity for BNM albums.
Notes: The bars extend two standard deviations above and below the mean change in popularity for each score.

Figure 4c. Google Trends popularity for all albums.
Notes: These values are the same as in Figures 4a and 4b, but they are plotted on the same scale. Standard deviation bars are omitted for clarity.

Table 2. Regression results with change in Google Trends popularity as the outcome variable.

                      Change in Google Trends Popularity
                   (1)        (2)        (3)        (4)        (5)
Score             15.16     -8.582     -12.64     -13.30     -23.39
                 (1.77)    (-0.77)    (-0.69)    (-1.17)    (-1.23)
Best New Music              15.12**    -38.75     16.97***  -112.3
                            (3.30)    (-0.20)     (3.63)    (-0.58)
Score*BNM                               6.469                 15.54
                                       (0.28)                (0.66)
Release Year                                      -1.276     -1.369
                                                 (-1.85)    (-1.94)
Constant         -118.2      72.68      106.1     2675.9     2945.3*
                (-1.65)     (0.79)     (0.71)     (1.90)     (2.01)
Observations       533        533        533        533        533
t statistics in parentheses. * p<0.05, ** p<0.01, *** p<0.001

The regression in column (2) of Table 2 confirms
this observation. Controlling for score, albums with BNM displayed a change in
popularity 15.122 greater than albums with no award, where change in popularity is
based on a common 0-100 scale. This coefficient on bnm is large and statistically
significant at the 1% level, suggesting that the award of BNM gives albums a large,
immediate bump in popularity upon their release. The coefficient on the interaction term
in column (3) is not statistically significant, suggesting that the effect of score on
popularity is not different for albums with BNM. That is, the two trends observable in
Figure 4 have similar slopes. Controlling for the album’s release year slightly increases
the coefficient on bnm to 16.97, which is significant at the 0.1% level.
2. Spotify
Figures 5a and 5b depict the average Spotify popularity at each score for non-
BNM and BNM albums, respectively. The bars extend two standard deviations above and
below the mean Spotify popularity at each score. It is immediately clear that there is great
variation in popularity at each score. Figure 5c depicts the average Spotify popularity for
albums of each score, separated by whether or not they received BNM. It is immediately
apparent that albums with BNM are, on average, much more popular than albums with no
award. Column (2) of Table 3 confirms this observation; controlling for score, albums
with BNM were 17.47 more popular on a 0-100 scale compared to albums without BNM.
This coefficient tells us the average difference in popularity between albums with and
without BNM, but the graph reveals a curious trend: the difference in popularity between
these two types of albums seems to widen as score increases. This result was certainly not
one I expected to find. The simplest regression in column (1) reinforces the basic
intuition that, as an album’s score increases, so does its Spotify popularity. I expected
that to be the case for both BNM and non-BNM albums, but the results show otherwise.
Figure 5a. Spotify Popularity for non-BNM albums.
Notes: The bars extend two standard deviations above and below mean Spotify popularity at each score. Spotify only returned results for one non-BNM album with a score of 8.9, which is why there are no bars extending from the point at 8.9.

Figure 5b. Spotify Popularity for BNM albums.
Notes: The bars extend two standard deviations above and below mean Spotify popularity at each score.

Figure 5c. Spotify Popularity for all albums.
Notes: These values are the same as in Figures 5a and 5b, but they are plotted on the same scale. Standard deviation bars are omitted for clarity.
Table 3. Regression results with Spotify popularity as the outcome variable.

                           Spotify Popularity
                   (1)        (2)        (3)        (4)        (5)
Score             14.85***  -8.743**   -18.49***   -3.490    -11.02**
                 (5.38)    (-2.86)    (-4.63)     (-1.14)    (-2.72)
Best New Music              17.47***  -176.6***    14.80***  -128.8*
                           (13.54)    (-3.42)     (11.32)    (-2.54)
Score*BNM                              23.11***               17.12**
                                       (3.76)                 (2.83)
Release Year                                       1.270***    1.197***
                                                  (7.24)      (6.77)
Constant         -85.89***  104.6***   185.1*** -2488.9*** -2280.3***
                (-3.73)     (4.14)     (5.62)    (-6.93)    (-6.24)
Observations       875        875        875        875        875
t statistics in parentheses. Popularity based on 0-100 scale. * p<0.05, ** p<0.01, *** p<0.001
Column (2) tells us that, controlling for BNM, a .1 increase in score is
associated with a decrease in popularity of .874. This effect appears both large and
statistically significant. Conversely, the trend for albums with BNM is positive, though
not statistically significant. For albums with BNM, a .1 increase in score is associated
with a .462 increase in popularity. This coefficient is calculated by adding the
coefficients of the score and interaction terms in Column (3): -18.49 + 23.11 = 4.62 per
full point of score, or .462 per .1. Interestingly, it appears that
once an album has received BNM, its score is not a strong determinant of its popularity.
The award alone seems to give albums a large popularity boost that does not increase
very much as the album’s score increases. The strong negative trend for albums without
BNM, however, is perplexing. The statistically significant coefficient on the interaction
term in column (3) confirms this strange finding that the effect of score on popularity is
quite different for albums with and without BNM.
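The arithmetic behind that .462 figure can be checked directly from the Column (3) coefficients in Table 3; a minimal sketch:

```python
# Coefficients from Column (3) of Table 3 (Spotify popularity regression).
score_coef = -18.49        # score slope for albums without BNM
interaction_coef = 23.11   # additional score slope for BNM albums (Score*BNM)

# For BNM albums, the total score slope is the sum of the two coefficients.
bnm_slope = score_coef + interaction_coef  # slope per full point of score
effect_of_tenth = bnm_slope * 0.1          # effect of a .1 increase in score

print(round(bnm_slope, 2))        # 4.62
print(round(effect_of_tenth, 3))  # 0.462
```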
It seems that some unobservable characteristic that differs between the two types
of albums may be driving these divergent trends. If this were the case, then my key
assumption that BNM is randomly assigned to albums of the same score would be
violated; this outside factor would cause 𝑏𝑛𝑚 to be correlated with the error term in all
the regressions. In an effort to unpack what might drive this difference, I went back to the
context of the albums. Given that it’s very unusual for an album with a score of 8.8 or 8.9
to not receive BNM, I postulated that perhaps those albums differed in some way from
other albums without BNM.
Consider the case of This Is Riphop by Death Comet Crew, an album that
received a score of 8.8 but no BNM distinction. Every track on the album has been
played fewer than 1,000 times on Spotify. A Google search for the album reveals little
more than the Pitchfork review written about it. In short, the album is completely
obscure, even by Pitchfork’s standards. The other albums with high scores and no BNM
seem to follow the same pattern. It seems, then, that as albums without BNM increase in
score, they also increase in obscurity. If this is the case, then Pitchfork may be factoring
anticipated popularity into its BNM decisions. That is, when albums are so obscure that
they’re unlikely to reach any discernible audience, Pitchfork might still give them a high
score without bestowing the BNM award upon them. Of course, the variable 𝑏𝑛𝑚 would
then be correlated with the error term, which includes anticipated popularity. While I
can’t prove this theory empirically, the context suggests that anticipated popularity may
factor into BNM decisions, thus presenting a confound to my regression models.
However, the significant negative trend for albums without BNM greatly
attenuates when controlling for an album’s release year. All of my regression tables add
release year as a control, allowing me to arrive at more precise estimates. As we can see
from Figure 6, Pitchfork has become more generous over time in the percentage of
albums it gives BNM. Release year also appears to be positively correlated with Spotify
popularity, as shown in Figure 7. More recent albums are more popular. Spotify doesn’t
explain exactly how it determines popularity, but it’s possible that more recently played
albums are given heavier weight. Or, it may be that recently released albums are more
popular simply because Spotify’s consumer base continues to grow. Google Trends
accounts for user growth over the past decade, but it’s unclear if Spotify takes this same
approach.
Figure 6. How does the likelihood of BNM change over time?
[Chart: "Change in BNM Frequency (For Albums With Score 8.1-8.9)"; x-axis: Release Year (2004-2014); y-axis: Percentage of Albums With BNM (.2-.5).]
Notes: The y-axis denotes the percentage of albums with a score 8.1-8.9 that received BNM in a given year. Over time, Pitchfork gives a greater percentage of BNM to albums in this range.

Figure 7. Spotify Popularity Over Time
[Chart: "Current Spotify Popularity Depending on Release Year"; x-axis: Release Year (2004-2014); y-axis: Mean Spotify Popularity (30-50).]
Notes: Again, only albums in the 8.1-8.9 score range are included. The y-axis denotes the average Spotify popularity for all albums in a given year. On average, more recent albums are more popular.
My initial regression of Spotify popularity on score and BNM suffers from
omitted variable bias. If an album’s release year is positively correlated with both its
Spotify popularity and its chances of BNM, then the initial regression overstates the
impact of BNM on Spotify popularity. Indeed, Column (4) of Table 3 shows that adding
release year as a control decreases the coefficient on BNM to about 14.8, though it is
still large and highly significant. The coefficient on score, while still negative, decreases
in magnitude to -3.5 and is no longer significantly different from zero. Including
the interaction term shows that the effect of score on Spotify popularity is still
significantly different for albums that get BNM compared to those that don’t. These
results suggest that release year is a greater determinant of popularity than obscurity.
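To see why this bias runs upward, consider a small simulation on synthetic data (the numbers below are invented for illustration, not drawn from the actual album sample): when release year raises both the probability of BNM and Spotify popularity, a regression that omits year overstates the BNM coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic albums: later releases are both more popular and more likely
# to receive BNM, mimicking the patterns in Figures 6 and 7.
year = rng.integers(2003, 2015, n)  # release years 2003-2014
bnm = (rng.random(n) < 0.2 + 0.03 * (year - 2003)).astype(float)
spotify = 40 + 15 * bnm + 1.2 * (year - 2003) + rng.normal(0, 5, n)

def ols(y, *cols):
    """OLS coefficients for y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

naive = ols(spotify, bnm)[1]             # BNM coefficient, year omitted
controlled = ols(spotify, bnm, year)[1]  # BNM coefficient with year control

# Omitting year loads its positive effect onto BNM, inflating the estimate.
print(naive > controlled)
```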
Figure 8 attempts to show the same relationship as in Figure 5, “controlling” for
release year. Though I am not able to formally control for release year, I compute an
adjusted popularity measure for each album. Mathematically,

    adjusted_spotify_i = spotify_i − mean_spotify_releaseyear_i

where i indexes albums, spotify_i is the Spotify popularity calculated in the original
regression, and mean_spotify_releaseyear_i is the mean popularity score for all albums
released in the same year as album i. By standardizing albums to the average popularity
of all albums released in the same year, I should be able to correct for the changes in
popularity over time observed in Figure 7. Compared to Figure 5c, the trend for non-
BNM albums in Figure 8 is noticeably less steep, though still negative, and the two trends
more closely mirror each other. Both trends are statistically indistinguishable from zero,
suggesting that the BNM award, not score, is what determines an album’s popularity.
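A minimal sketch of this year-demeaning adjustment in Python, using invented album records (the names, years, and popularity values are hypothetical):

```python
# Hypothetical (album, release_year, spotify_popularity) records.
albums = [
    ("A", 2005, 30), ("B", 2005, 40),
    ("C", 2013, 55), ("D", 2013, 65),
]

# Mean Spotify popularity of all albums released in each year.
years = {year for _, year, _ in albums}
year_mean = {
    y: sum(pop for _, year, pop in albums if year == y)
       / sum(1 for _, year, _ in albums if year == y)
    for y in years
}

# adjusted_spotify_i = spotify_i - mean popularity of album i's release year
adjusted = {name: pop - year_mean[year] for name, year, pop in albums}

print(adjusted)  # {'A': -5.0, 'B': 5.0, 'C': -5.0, 'D': 5.0}
```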
Discussion
Based on the Google Trends results, I find strong evidence suggesting that the
BNM award has a large and statistically significant impact on album popularity
immediately upon release. On a 0-100 scale, albums with BNM displayed a change in
popularity about 15 to 17 points higher than albums with no award. Access to Nielsen
SoundScan data would provide a more meaningful connection between a Pitchfork
review and an album’s success, but Google Trends gives a useful measure for how often
an album was searched for.
Given that Spotify is a better measure of actual consumption, I wanted to find the
correlation between an album’s Google Trends popularity the week of its release and its
Spotify popularity.

Figure 8. Adjusted Spotify popularity for all albums.
[Chart: "Adjusted Spotify Popularity by BNM, Controls for Release Year"; x-axis: Score (8-9); y-axis: (Mean) Adjusted Spotify Popularity (-20 to 20); series: No Award, Best New Music.]
Notes: Adjusted Spotify popularity attempts to control for differences in popularity over time. The popularity of each album is scaled by subtracting the mean popularity of all albums in its release year from the album’s own popularity score.

In order to obtain a measure of absolute search volume for each
album, I tried comparing every album to one benchmark album. The disparity between
popular and unpopular albums is so large that comparing them directly to each other
always returns a value of zero for the unpopular album. I reasoned that, by calculating a
ratio where I compared all albums to an album of “medium” popularity, I could then
compare all albums to each other. I used the album Parastrophics by Mouse on Mars, as it
was popular enough to compare to the most popular albums, but unpopular enough to
compare to the least popular albums. Figures 9a-9c reinforce the problem and solution
visually. Comparing The Suburbs by Arcade Fire directly to Satan Is Real by the Louvin
Brothers is useless because the former album dwarfs the latter in popularity. But by
comparing each of those two albums to Parastrophics by Mouse on Mars, I can compute a
ratio of the form
    absolute_popularity_i = popularity_i / parastrophics

where popularity_i is the peak popularity of album i on the graph, parastrophics is the
peak popularity of the album Parastrophics on the graph, and absolute_popularity_i is the
absolute popularity of album i. This approach still misses some of the most popular and
least popular albums—in total, almost 2/3 of the dataset is missing, compared to only
52% when using the change in popularity method. Although this measure of popularity is
less precise and required an arbitrary benchmark, it allows me to correlate week-of
popularity with Spotify data. As expected, the correlation between Google and actual
consumption (Spotify) is a fairly small .24, suggesting that Google Trends does not
strongly predict the final popularity of albums. Still, I am able to show that BNM does, in
fact, create an initial buzz around albums that receive it. For future research, I will
attempt to obtain SoundScan data so I can relate my data directly to album sales.
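The benchmark-ratio construction above can be sketched as follows; the peak values below are invented placeholders, not the actual Google Trends numbers:

```python
# Each Google Trends query pairs one album against the benchmark album
# (Parastrophics by Mouse on Mars) and rescales so the pair's highest
# point equals 100. Peaks below are illustrative, not real query results.
pairwise_peaks = {
    "The Suburbs": (100, 4),     # popular album dwarfs the benchmark
    "Satan Is Real": (12, 100),  # benchmark dwarfs the obscure album
}

# absolute_popularity_i = popularity_i / parastrophics
absolute_popularity = {
    album: album_peak / benchmark_peak
    for album, (album_peak, benchmark_peak) in pairwise_peaks.items()
}

print(absolute_popularity)  # {'The Suburbs': 25.0, 'Satan Is Real': 0.12}
```

Because every album is expressed as a multiple of the same benchmark, the resulting ratios are comparable across albums even though each Google Trends query is rescaled independently.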
Figure 9a. Comparing a highly popular album to a highly unpopular album in Google Trends.
Notes: Popularity is calculated on a 0-100 scale where 100 is the highest point on the graph—in this case, the release of Arcade Fire’s The Suburbs. In comparison to this album, Satan Is Real by the Louvin Brothers is so unpopular that it does not register on the graph. Comparing popular albums to unpopular ones will entirely miss the least popular albums.
Figure 9b. Comparing a highly popular album to a benchmark album in Google Trends.
Notes: Parastrophics by Mouse on Mars is an effective benchmark because it still registers on the graph when compared to some of the most popular albums like The Suburbs. The absolute popularity for The Suburbs is calculated by dividing its peak of 100 by the small peak of Parastrophics.
The results for Spotify popularity are somewhat unexpected and may suffer from
more unobserved variables given the longer time frame. On a 0-100 scale, it seems that
albums with BNM are roughly 15 to 17 points higher in popularity. These magnitudes are
strikingly similar to those found in the Google Trends regressions, suggesting that the
large effect of BNM is both immediate and sustained over time. Bizarrely, albums
without BNM seem to strongly decrease in popularity as they increase in score. A closer
look at these high-score albums without BNM reveals them to be quite arcane and
strange. Pitchfork, perhaps, factors anticipated popularity into its BNM decisions, a
practice that would violate my assumption that BNM is randomly assigned to albums of
the same score. However, when controlling for release year, this unexpected effect
dissipates and becomes statistically indistinguishable from zero. Including this control,
along with the results from the chi-square tests, still gives me confidence in this key
assumption.

Figure 9c. Comparing a benchmark album to a highly unpopular album in Google Trends.
Notes: Using Parastrophics by Mouse on Mars as a benchmark allows me to capture the popularity of Satan Is Real, which was previously unobservable. The absolute popularity ratio is calculated by dividing Satan Is Real’s peak by Parastrophics’s peak.
The extended timeframe of the Spotify data presents the possibility of other
confounds that could bias my results. Based on my small sample of 55 albums, it seems
possible that an appearance on a Pitchfork or Rolling Stone end-of-year list may give
albums that appear on those lists a significant boost in popularity. If bnm is correlated
with appearance on these lists, my estimates will be biased and may not capture the
causal effect of BNM on long-term popularity. If an appearance on these lists, rather than
BNM, causes an increase in popularity, then my estimates of BNM will be biased
upward. In the future, I hope to gather the data for every end-of-the-year list Pitchfork
has produced and include appearance on one of those lists as another control in my
regressions. Given Pitchfork’s ability to “break” new artists like Arcade Fire, I would be
interested to see if the effect of BNM is larger for an artist’s debut album. Additionally, it
would be fascinating to study the actual content of these reviews. A text mining analysis
could help reveal what phrases are predictive of higher scores and BNM.
Conclusion
This study contributes to the growing body of literature that finds significant
effects of critical reviews and awards on creative products such as movies, books, and
music. Similar to many of these studies, I am able to track popularity both at the
product’s release and over time. By taking advantage of Pitchfork’s grading system that
includes both a score and the possibility of a BNM accolade, I am able to arrive at a
causal estimate of BNM on popularity. Using Google Trends, I find that albums with
BNM display a greater change in popularity one week before to one week after their
release compared to albums with no award. That effect, on a 0-100 scale, is estimated to
be about 15-17 and is highly significant. The effect of BNM on Spotify popularity also
appears to be about 15-17 on a 0-100 scale that captures every album. I am able to
conclude that the “Pitchfork Effect” is, in fact, real. In particular, Pitchfork’s “Best New
Music” award gives albums an immediate boost in popularity that appears to continue
over time.
Acknowledgements

I would like to thank Myles Gurule, a talented computer scientist and even better
friend, who helped me obtain the Google Trends and Spotify data. This project would not
have been possible without his coding skills, generosity, and frequent advice.
References

Basuroy, S., Chatterjee, S., & Ravid, S. A. (2003). How critical are critical reviews? The box office effects of film critics, star power, and budgets. Journal of Marketing, 67(4), 103-117.

Caramanica, J. (2010). Upstart Music Site Becomes Establishment. The New York Times.

Dhar, V., & Chang, E. A. (2009). Does chatter matter? The impact of user-generated content on music sales. Journal of Interactive Marketing, 23(4), 300-307.

Eliashberg, J., & Shugan, S. M. (1997). Film critics: Influencers or predictors? The Journal of Marketing, 68-78.

Goel, S., Hofman, J. M., Lahaie, S., Pennock, D. M., & Watts, D. J. (2010). Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences, 107(41), 17486-17490.

Itzkoff, D. (2006). The Pitchfork Effect. Wired, September. http://www.wired.com/wired/archive/14.09/pitchfork.html

Singer, D. (2014). Music Critics See Their Role and Influence Waning in the Era of Digital Music. American Journalism Review. http://ajr.org/2014/11/13/music-critics-role-changing/

Sorensen, A. T. (2007). Bestseller lists and product variety. The Journal of Industrial Economics, 55(4), 715-738.