San Francisco Bay Area News Ecology

Daniel Ramos, CS790G, Fall 2010


Page 1: San Francisco Bay Area News Ecology

San Francisco Bay Area News Ecology

Daniel Ramos
CS790G
Fall 2010

Page 2: San Francisco Bay Area News Ecology

Outline

- Introduction
- Related Work
- Methodology
- Conclusion

Page 3: San Francisco Bay Area News Ecology

Evolution of News

- Keeping current on events has changed radically.
- “Mass Media”
  - Radio, television, newspapers, magazines, books, etc.
- “Networked Media”
  - Based on the Internet.
  - Collaborative and global in nature.

Page 4: San Francisco Bay Area News Ecology

Role of Journalists

- Traditional Reporting
  - Journalists worked mostly alone and locally.
  - National/International news came from other organizations (Associated Press).
- Networked Media Reporting
  - Journalists can easily talk to others across the globe.
  - They can freelance for many news outlets.

Page 5: San Francisco Bay Area News Ecology

Project Goals

- Use network analysis to characterize a “news ecosystem”.
  - Traditional news outlets are shrinking.
  - Start-up news organizations are quickly forming.
- Using the San Francisco Bay area.
- Transitioning from “mass media” to “networked media”?

Page 6: San Francisco Bay Area News Ecology

Tracing Ties

- Between news organizations
- Between reporters and anybody else
- Between users and news organizations
- Potentially measure the density of ties
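
The slides do not say how the density of ties would be computed. A minimal sketch, assuming the standard definition for a directed network without self-links (ties present divided by ties possible); the function name and the example numbers are hypothetical:

```python
def directed_density(num_nodes: int, num_edges: int) -> float:
    """Fraction of possible directed ties (excluding self-links) that are present."""
    if num_nodes < 2:
        return 0.0  # no ties are possible with fewer than two nodes
    return num_edges / (num_nodes * (num_nodes - 1))


# Hypothetical example: 4 sites with 3 distinct directed links among them.
example_density = directed_density(4, 3)
print(example_density)
```

A value near 0 would indicate sites that rarely link to one another; a value near 1 would indicate a tightly interlinked ecosystem.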

Page 7: San Francisco Bay Area News Ecology

Related Work

- I. Himelboim, “The International Network Structure of News Media: An Analysis of Hyperlinks Usage in News Web sites”
  - Analyzed 6,298 foreign news stories.
  - 223 news web sites.
  - 73 countries.
  - Studied use of external hyperlinks.

Page 8: San Francisco Bay Area News Ecology

Related Work (Cont.)

- Found news sites rarely used external hyperlinks.
  - Only 6% had one or more.
- If they did, it followed patterns based on:
  - Preferential Attachment Theorem
  - World System Theory
- Conclusions
  - Journalists are trained to not reveal sources.
  - Distrust for outside sources.
  - External links lead users away from the news site.

Page 9: San Francisco Bay Area News Ecology

Related Work (Cont.)

- Gordon, Contractor, and Johnson, “Linking Audiences to News and Information: A Network Analysis of Chicago Websites”
  - Collected a list of 277 “seed sites”
  - Categorized the sites into: legacy, legacy-affiliated, micropublisher, organization/institution, national brand, and service.
  - Used a web crawler to examine links.

Page 10: San Francisco Bay Area News Ecology

Related Work (Cont.)

- Conclusion
  - Organizations are authorities.
  - Micropublishers and organizations are hubs.
  - Organizations are intermediaries and switchboards.
  - Organizations are most prestigious.

Page 11: San Francisco Bay Area News Ecology

Methodology

- Use network theory to study three main ties:
  - News Organization to News Organization
  - Journalists to the “Community”
  - Commenters to News Organizations

Page 12: San Francisco Bay Area News Ecology

News Organization Ties

- Compiled a list of 143 different web sites we feel encompasses our news ecosystem.
  - Traditional news outlets' web sites
  - Blogs
  - Other non-traditional (e.g. news aggregators)
- Use a web crawler to crawl the seed sites and record all external links to a database.
  - Each site will be its own network at first.
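
The link-recording step can be illustrated with a short sketch. This is not the WebSPHINX crawler the project names; it is a standard-library Python stand-in, and the names `LinkCollector`, `external_hosts`, and the `.example` hosts are hypothetical. It treats a link as external when its hostname differs from the page's own, which is a simplification (for instance, `www.` variants of the same site would count as external):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_hosts(page_url, html):
    """Return hostnames of http(s) links that point outside the page's own host."""
    own_host = urlparse(page_url).netloc.lower()
    parser = LinkCollector()
    parser.feed(html)
    hosts = []
    for href in parser.hrefs:
        parsed = urlparse(urljoin(page_url, href))  # resolve relative links
        host = parsed.netloc.lower()
        if parsed.scheme in ("http", "https") and host and host != own_host:
            hosts.append(host)
    return hosts


sample_html = (
    '<a href="/local">a</a>'
    '<a href="http://other.example/story">b</a>'
    '<a href="mailto:x@y.example">c</a>'
)
print(external_hosts("http://seed.example/news", sample_html))
```

Each returned hostname would become one (seed site, external site) row in the database.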

Page 13: San Francisco Bay Area News Ecology

News Organization Ties

- Won’t record duplicates, but will record number of references.
- News Organization graphs will be generated from the database.
  - Nodes are websites.
  - Edges are directional hyperlink references.
  - Edge weights are number of times linked.
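
The graph described here (one directed edge per pair of sites, weighted by the number of links) can be sketched as follows. The record format, a list of (source site, target site) pairs, is an assumption about what the database would hold, and the hostnames are placeholders:

```python
from collections import Counter


def build_weighted_edges(link_records):
    """Collapse repeated (source, target) links into one weighted directed edge."""
    weights = Counter()
    for source, target in link_records:
        if source != target:  # ignore links from a site to itself
            weights[(source, target)] += 1
    return weights


# Hypothetical records: a.example links to b.example twice.
records = [
    ("a.example", "b.example"),
    ("a.example", "b.example"),
    ("b.example", "a.example"),
    ("a.example", "c.example"),
]
edges = build_weighted_edges(records)
print(edges)
```

Because the edges are directional, a.example to b.example and b.example to a.example are kept as separate edges with their own weights.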

Page 14: San Francisco Bay Area News Ecology

News Organization Ties

- Categorizing Links
  - First pass will try to categorize linked sites automatically if they match the seed site list.
  - Second pass will require manual human coding.
    - Remove all links deemed not a news organization.
- Merge all independent networks together.
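
A minimal sketch of the automatic first pass and the final merge, assuming each per-site network is stored as a mapping from (source, target) to link count. The function names and hostnames are hypothetical, and the manual second pass is represented only by the `to_review` result that a human coder would go through:

```python
from collections import Counter


def first_pass(edges, seed_sites):
    """Split edges by whether the target is on the seed list or needs manual coding."""
    known, to_review = Counter(), Counter()
    for (source, target), weight in edges.items():
        bucket = known if target in seed_sites else to_review
        bucket[(source, target)] += weight
    return known, to_review


def merge_networks(networks):
    """Combine per-site networks, summing the weights of edges that appear in several."""
    merged = Counter()
    for network in networks:
        merged.update(network)  # Counter.update adds counts
    return merged


seeds = {"a.example", "b.example"}
net_a = Counter({("a.example", "b.example"): 2, ("a.example", "shop.example"): 5})
net_b = Counter({("b.example", "a.example"): 1})
known_a, review_a = first_pass(net_a, seeds)
merged = merge_networks([known_a, net_b])
print(merged, review_a)
```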

Page 15: San Francisco Bay Area News Ecology

News Organization Ties

- Metrics
  - Degree (both in and out)
    - Determine hubs and authorities.
  - Betweenness
    - Determine which sites link otherwise unconnected sites.
  - Centrality
    - Determine which sites are important to the network.
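
The project plans to compute these metrics with tools such as Pajek. Purely as an illustration of what the degree and betweenness metrics measure, here is a standard-library sketch; it uses Brandes' algorithm for unweighted, unnormalized betweenness, which is a choice made for this example rather than something the slides specify, and the hostnames are hypothetical:

```python
from collections import defaultdict, deque


def degrees(edges):
    """Return (in_degree, out_degree) dicts for distinct directed (source, target) pairs."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for source, target in set(edges):
        out_deg[source] += 1
        in_deg[target] += 1
    return dict(in_deg), dict(out_deg)


def betweenness(edges):
    """Unnormalized betweenness on an unweighted directed graph (Brandes' algorithm)."""
    adj, nodes = defaultdict(list), set()
    for source, target in set(edges):
        adj[source].append(target)
        nodes.update((source, target))
    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        # Breadth-first search from s, counting shortest paths to each node.
        order, preds = [], {v: [] for v in nodes}
        sigma, dist = dict.fromkeys(nodes, 0), dict.fromkeys(nodes, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate each node's share of shortest paths, farthest nodes first.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical network: b.example sits between two sources and c.example.
links = [("a.example", "b.example"), ("d.example", "b.example"), ("b.example", "c.example")]
in_deg, out_deg = degrees(links)
between = betweenness(links)
print(in_deg, out_deg, between)
```

In this toy network b.example has the highest in-degree and the only nonzero betweenness, since every path to c.example passes through it.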

Page 16: San Francisco Bay Area News Ecology

Journalists to the Community

- Determine the linking patterns of reporters who publish on the seed sites.
  - Traditional writing versus using the web to its full potential.
- Use a web crawler to crawl the seed sites and record all external links to a database.
  - Focus on only a few larger sites.
  - No standard for bylines of article authors.
  - Requires site specific crawling rules.

Page 17: San Francisco Bay Area News Ecology

Journalists to the Community

- Journalist graph will be generated from the database.
  - Forms a bipartite graph.
  - Nodes are authors and sites.
  - Edges are an author linking a site.
  - Some manual human coding required to remove non-community sites.

Page 18: San Francisco Bay Area News Ecology

Journalists to the Community

- Metrics
  - Degree (both for journalists and sites)
    - Determine which authors cite more often.
    - Determine which sites are referenced most often.
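
Degree on the bipartite journalist graph can be sketched as below, assuming the database yields (author, linked site) pairs. Degree is counted here as the number of distinct neighbours; counting every link instead would be an equally reasonable reading of the slide. All names are hypothetical:

```python
from collections import defaultdict


def bipartite_degrees(author_site_pairs):
    """Return (author_degree, site_degree), counting distinct neighbours on each side."""
    sites_by_author = defaultdict(set)
    authors_by_site = defaultdict(set)
    for author, site in author_site_pairs:
        sites_by_author[author].add(site)
        authors_by_site[site].add(author)
    author_degree = {a: len(sites) for a, sites in sites_by_author.items()}
    site_degree = {s: len(authors) for s, authors in authors_by_site.items()}
    return author_degree, site_degree


# Hypothetical records; the repeated pair counts once.
pairs = [
    ("Reporter A", "cityhall.example"),
    ("Reporter A", "cityhall.example"),
    ("Reporter A", "blog.example"),
    ("Reporter B", "cityhall.example"),
]
author_degree, site_degree = bipartite_degrees(pairs)
print(author_degree, site_degree)
```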

Page 19: San Francisco Bay Area News Ecology

Commenters to News Organizations

- Determine the patterns of users who comment on stories on the seed sites.
  - How do they interact with news organizations and each other?
- Use a web crawler to crawl the seed sites and record all commenters to a database.
  - Focus on only a few larger sites.
  - No standard for user comments and accounts.
  - Requires site specific crawling rules.

Page 20: San Francisco Bay Area News Ecology

Commenters to News Organizations

- Commenter graph will be generated from the database.
  - Forms a bipartite graph.
  - Each site will be its own graph.
  - Nodes are commenters and news stories.
  - Edges are a user commenting on a story.
  - Might require some manual human coding to remove spam & bots.

Page 21: San Francisco Bay Area News Ecology

Commenters to News Organizations

- Metrics
  - Degree (both for users and stories)
    - Determine which users comment most.
    - Determine which stories garner most attention.
  - Transform to a 1-mode network leaving users.
    - Edge weights are how many of the same stories two users commented on.
    - Do users form clusters and communities?
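
The transformation to a 1-mode user network can be sketched as follows, assuming the database yields (user, story) pairs. The edge weight between two users is the number of stories both commented on, as the slide describes; the function name and sample data are hypothetical:

```python
from collections import Counter, defaultdict
from itertools import combinations


def project_to_users(comment_records):
    """Project the user-story bipartite graph onto users, weighting by shared stories."""
    users_by_story = defaultdict(set)
    for user, story in comment_records:
        users_by_story[story].add(user)  # repeat comments on one story count once
    weights = Counter()
    for users in users_by_story.values():
        # Every pair of users on the same story shares one more story.
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical records: ann and bob both comment on two stories.
comments = [
    ("ann", "story1"), ("bob", "story1"), ("ann", "story1"),
    ("ann", "story2"), ("bob", "story2"), ("cat", "story2"),
]
user_edges = project_to_users(comments)
print(user_edges)
```

The resulting edges are undirected (each pair is stored once, in sorted order), and a clustering or community-detection step could then be run on this weighted network.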

Page 22: San Francisco Bay Area News Ecology

Tools

- WebSPHINX
- Pajek
- GUESS or Gephi
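
To move a crawled network from the database into Pajek, the weighted edges would need to be written in Pajek's `.net` text format. A minimal sketch, assuming the common layout of a `*Vertices` section with numbered, quoted labels followed by an `*Arcs` section of `source target weight` lines; the function name and hostnames are hypothetical:

```python
def to_pajek(weighted_edges):
    """Serialize {(source, target): weight} as Pajek .net text with directed arcs."""
    nodes = sorted({name for edge in weighted_edges for name in edge})
    index = {name: i for i, name in enumerate(nodes, start=1)}  # Pajek ids start at 1
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{i} "{name}"' for name, i in index.items()]
    lines.append("*Arcs")
    for (source, target), weight in sorted(weighted_edges.items()):
        lines.append(f"{index[source]} {index[target]} {weight}")
    return "\n".join(lines) + "\n"


pajek_text = to_pajek({("a.example", "b.example"): 2, ("b.example", "a.example"): 1})
print(pajek_text)
```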

Page 23: San Francisco Bay Area News Ecology

Conclusion

- Is news media transitioning because of new technologies like it has in the past?
- How is the Internet affecting news outlets, journalists, and readers?
- Hopefully, network theory and analysis can help find these answers.

Page 24: San Francisco Bay Area News Ecology

References

[1] I. Himelboim, "The International Network Structure of News Media: An Analysis of Hyperlinks Usage in News Web sites," Journal of Broadcasting & Electronic Media, Volume 54, Issue 3, pp. 373-390, July 2010.

[2] R. Gordon, N. Contractor, and Z. P. Johnson, "Linking Audiences to News and Information: A Network Analysis of Chicago Websites," unpublished, http://www.cct.org/sites/cct.org/files/CNM_LinkingAudiences1.pdf