Building and Managing Social Media Collections

Building and Managing Social Media Collections Laura Wrubel @liblaura Jason Casden @cazzerson Slides: http://j.mp/DLF_Social_Media DLF Forum October 27, 2015


Page 1: Building and Managing Social Media Collections

Building and Managing Social Media Collections

Laura Wrubel @liblaura
Jason Casden @cazzerson

Slides: http://j.mp/DLF_Social_Media

DLF Forum
October 27, 2015

Page 2: Building and Managing Social Media Collections

Outline

1. Introductions
2. Tour of social media archives
3. Ethical and legal discussion
4. Questions for cultural heritage organizations
5. Technical tools review
6. Collecting workflows demo
7. Wrap-up

Page 3: Building and Managing Social Media Collections

Introductions

● Have you done any work related to social media archives?

● What are you hoping to get out of this workshop?

Page 4: Building and Managing Social Media Collections

Social Media in Collections

• 50% Social media data in collections, but not in significant amounts

• 39% No social media in their collections

NCSU Social Media Archives Toolkit North Carolina C.H.O. survey

Page 5: Building and Managing Social Media Collections

“I strongly believe in the relevance of this information because it is the "front lines" of movement development--this is where the important ideas and debates are happening. Traditional academic spaces are usually behind (it takes 1-3 years for articles and books to be published) and, again, they tend to bias in favor of whites, men, and long-standing leaders. Ignoring social media means ignoring marginalized voices and it thus provides an incomplete picture of the movement.”

NCSU Social Media Archives Toolkit Researcher Survey

Page 6: Building and Managing Social Media Collections

Future value

• 71% of surveyed researchers saw future value in using social media as a source for research

• Only 51% of surveyed cultural heritage organizations thought it was likely that their institution would archive social media in the future

NCSU Social Media Archives Toolkit NCSU researcher survey

Page 7: Building and Managing Social Media Collections

More representative collections

Page 9: Building and Managing Social Media Collections

Cases

Page 11: Building and Managing Social Media Collections

“Twitter has been a public and open communications platform since its beginning. Twitter is donating an archive of what it determines to be public. Private account information and deleted tweets will not be part of the archive. Linked information such as pictures and websites is not part of the archive, and the Library has no plans to collect the linked sites. There will be at least a six-month window between the original date of a tweet and its date of availability for research use.”

The Library and Twitter: An FAQ, April 28, 2010

Page 12: Building and Managing Social Media Collections

Student life

Page 13: Building and Managing Social Media Collections

Community

Page 14: Building and Managing Social Media Collections

Donated accounts

Page 15: Building and Managing Social Media Collections

Institutional record

“Archiving social media,” UK National Archives. http://blog.nationalarchives.gov.uk/blog/archiving-social-media/

Page 16: Building and Managing Social Media Collections

WUSTL: Documenting Ferguson

Page 18: Building and Managing Social Media Collections

GWU Researchers

● Media and public affairs faculty and graduate students researching how media outlets and journalists use Twitter, how members of Congress tweet

● International relations graduate students studying how ISIS tweets

● Freshman writing seminar student analyzing tweets with the hashtags #YesAllWomen and #BringBackOurGirls

● Business graduate students studying social media use by Korean companies

Page 19: Building and Managing Social Media Collections

What potential do you see for the archival and research value of social media?

Page 20: Building and Managing Social Media Collections

Funding

● Institute of Museum and Library Services
○ National Leadership Grants [ODU/Archive-It]
○ Library Services and Technology Act (LSTA) Grants [NCSU]
○ Sparks Innovation Grants

● NEH / ODH - Digital Humanities Start-Up Grants [Univ. of Florida]

● National Historical Publications and Records Commission (NHPRC) [GWU]

● Council on East Asian Libraries (from Mellon) [JHU, GWU, Georgetown]

Page 21: Building and Managing Social Media Collections

Ethical and legal issues

Page 22: Building and Managing Social Media Collections

Ethical and legal discussion scenarios

Form a group of 2-3 people who have selected the same scenario as you.

1. What legal and ethical issues do you see arising in these scenarios?

2. What are some ways you might address and manage these issues?

Page 23: Building and Managing Social Media Collections

Scenario #1

A researcher writing a book visits your library to use the university archives and study student activism related to the environment. The university archives has collected tweets by several university-sponsored environmental clubs and has around 5,000 tweets from eight clubs over two years. The researcher would like to use the social media collection as part of her research.

Page 24: Building and Managing Social Media Collections

Scenario #2

A local person of prominence has donated their personal papers to your library’s archives. They also have exported their Facebook account data and would like to include that in their donation. This data includes their posts, messages, photos, and videos as well as all other information in the Facebook-supported account download feature.

Page 25: Building and Managing Social Media Collections

Scenario #3

A faculty member is using Twitter as a discussion medium in her class on public policy. Students are asked to tweet as part of their class participation. The professor knows that your library is able to collect tweets and asks if you can help her in collecting tweets by her students for the purpose of class evaluation.

Page 26: Building and Managing Social Media Collections

Scenario #4

Your university has a well-regarded political science department. To support faculty and students exploring the role of social media in elections, your library has been proactively collecting tweets by presidential candidates and tweets using particular hashtags during the presidential debates. The collection currently contains close to a million tweets over two years. A faculty member is researching differences in communication patterns by party and requests your dataset.

Page 27: Building and Managing Social Media Collections

Developing a collecting program

Page 28: Building and Managing Social Media Collections

“If we are to begin actively archiving and using social media content, plans need to be developed as to what we are saving and who social media portrays and how it portrays individuals and large communities.”

NCSU Social Media Archives Toolkit Researcher Survey

Page 29: Building and Managing Social Media Collections

Collecting strategies

● Hashtags
● Searches
● Account targeting
● Friend networks
● Geolocation
● Donations

Page 30: Building and Managing Social Media Collections

Account spreadsheet

Page 31: Building and Managing Social Media Collections

Hashtag calendar
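
A hashtag calendar can also be kept as a small data table that a collecting script consults each day. The following is a minimal sketch, not the presenters' workflow: the dates, hashtags, and the function name active_hashtags are all hypothetical and only illustrate the idea.

```python
from datetime import date

# Hypothetical calendar entries: (first day, last day, hashtags to collect).
HASHTAG_CALENDAR = [
    (date(2015, 8, 15), date(2015, 8, 25), ["#moveinday", "#welcomeweek"]),
    (date(2015, 10, 20), date(2015, 10, 27), ["#homecoming"]),
    (date(2015, 10, 26), date(2015, 10, 28), ["#DLFforum"]),
]


def active_hashtags(on_day, calendar=HASHTAG_CALENDAR):
    """Return the sorted hashtags whose collecting window includes on_day."""
    tags = set()
    for first_day, last_day, hashtags in calendar:
        if first_day <= on_day <= last_day:
            tags.update(hashtags)
    return sorted(tags)


if __name__ == "__main__":
    # Overlapping windows are merged into one list for the day.
    print(active_hashtags(date(2015, 10, 27)))
```

Keeping the windows as plain data means curators can edit the calendar without touching the collecting code.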

Page 32: Building and Managing Social Media Collections

Role of the institution

● How do we handle consent?
● These items are ephemeral, but not unique, right?
● How do we determine what to collect?
● Are there special preservation considerations?

Page 33: Building and Managing Social Media Collections

What is the item?

Page 34: Building and Managing Social Media Collections

Web content?

Page 36: Building and Managing Social Media Collections

Images?

Page 37: Building and Managing Social Media Collections

Associated content?

● Linked web pages
● Responses
● Videos and other media
● Retweeting accounts
● Engagement metrics

Page 38: Building and Managing Social Media Collections

Vendor API responses?

Page 39: Building and Managing Social Media Collections

{
  contributors: null,
  truncated: false,
  text: "We love it when artists like @cyndilauper speak up for our youth! #EndYouthHomelessness \u2013 On C-SPAN http://t.co/Gw17OHyTiO #edchat",
  in_reply_to_status_id: null,
  id: 524985632775741440,
  favorite_count: 7,
  source: "<a href='http://twitter.com' rel='nofollow'>Twitter Web Client</a>",
  retweeted: false,
  coordinates: null,
  entities: {
    symbols: [ ],
    user_mentions: [
      {
        id: 74501824,
        indices: [ 29, 41 ],
        id_str: "74501824",
        screen_name: "cyndilauper",
        name: "Cyndi Lauper"
      }

Twitter

Page 40: Building and Managing Social Media Collections

    ],
    hashtags: [
      {
        indices: [ 66, 87 ],
        text: "EndYouthHomelessness"
      },
      {}
    ],
    urls: [
      {
        url: "http://t.co/Gw17OHyTiO",
        indices: [ 101, 123 ],
        expanded_url: "http://cs.pn/1FCx6KY",
        display_url: "cs.pn/1FCx6KY"
      }
    ]
  },

Twitter

Page 41: Building and Managing Social Media Collections

  in_reply_to_screen_name: null,
  in_reply_to_user_id: null,
  retweet_count: 7,
  id_str: "524985632775741440",
  favorited: false,
  geo: null,
  in_reply_to_user_id_str: null,
  possibly_sensitive: false,
  lang: "en",
  created_at: "Wed Oct 22 18:08:23 +0000 2014",
  in_reply_to_status_id_str: null,
  place: null,
  user: {
    follow_request_sent: false,
    profile_use_background_image: false,
    profile_text_color: "333333",
    default_profile_image: false,
    id: 22789766,
    profile_background_image_url_https: "https://pbs.twimg.com/profile_background_images/70908209/NYLono_MercerCo_LarchmontElem_182.jpg_twitter.jpg",
    verified: true,
    profile_location: null,
    profile_image_url_https: "https://pbs.twimg.com/profile_images/502152204040425472/eVCt0lz8_normal.jpeg",
    profile_sidebar_fill_color: "DDEEF6",

Twitter
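
To make the "vendor API response as item" question concrete, the sketch below reduces a tweet like the sample above to a few fields an archive might index. It is illustrative only: SAMPLE_TWEET is a trimmed copy of the slide's record with keys quoted so that it parses as JSON, and summarize_tweet and the choice of fields are assumptions, not part of any tool named in this workshop.

```python
import json

# Trimmed copy of the sample tweet from the slides, with keys quoted so it is
# valid JSON. Most of the original fields are omitted here.
SAMPLE_TWEET = json.loads("""
{
  "id_str": "524985632775741440",
  "created_at": "Wed Oct 22 18:08:23 +0000 2014",
  "retweet_count": 7,
  "favorite_count": 7,
  "entities": {
    "user_mentions": [{"screen_name": "cyndilauper"}],
    "hashtags": [{"text": "EndYouthHomelessness"}],
    "urls": [{"url": "http://t.co/Gw17OHyTiO",
              "expanded_url": "http://cs.pn/1FCx6KY"}]
  },
  "user": {"id": 22789766, "verified": true}
}
""")


def summarize_tweet(tweet):
    """Pull out a flat record of fields an archive might choose to index."""
    entities = tweet.get("entities", {})
    return {
        # Use the string form of the ID; very large integers can lose
        # precision in tools that store numbers as floating point.
        "id": tweet["id_str"],
        "created_at": tweet["created_at"],
        "hashtags": [h["text"] for h in entities.get("hashtags", []) if "text" in h],
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        "links": [u["expanded_url"] for u in entities.get("urls", [])],
        "retweet_count": tweet.get("retweet_count", 0),
    }


if __name__ == "__main__":
    print(json.dumps(summarize_tweet(SAMPLE_TWEET), indent=2))
```

Note how much of the original response such a summary discards, which is one reason to keep the full API response alongside any derived record.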

Page 42: Building and Managing Social Media Collections

{
  "data": {
    "type": "image",
    "users_in_photo": [{
      "user": { "username": "kevin", "full_name": "Kevin S", "id": "3", "profile_picture": "..." },
      "position": { "x": 0.315, "y": 0.9111 }
    }],
    "filter": "Walden",
    "tags": [],
    "comments": {
      "data": [{
        "created_time": "1279332030",
        "text": "Love the sign here",
        "from": {
          "username": "mikeyk",

Instagram

Page 43: Building and Managing Social Media Collections

      {
        "created_time": "1279341004",
        "text": "Chilako taco",
        "from": { "username": "kevin", "full_name": "Kevin S", "id": "3", "profile_picture": "..." },
        "id": "3"
      }],
      "count": 2
    },
    "caption": null,
    "likes": {
      "count": 1,
      "data": [{ "username": "mikeyk", "full_name": "Mikeyk", "id": "4", "profile_picture": "..." }]
    },
    "link": "http://instagr.am/p/D/",
    "user": { "username": "kevin", "full_name": "Kevin S", "profile_picture": "...", "id": "3" },
    "created_time": "1279340983",
    "images": {
      "low_resolution": {
        "url": "http://distillery.s3.amazonaws.com/media/2010/07/16/4de37e03aa4b4372843a7eb33fa41cad_6.jpg",
        "width": 306,
        "height": 306
      },

Instagram
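
The two sample responses record time differently: the Twitter sample uses a text date ("Wed Oct 22 18:08:23 +0000 2014") and the Instagram sample uses Unix seconds stored as a string ("1279340983"). A minimal sketch of normalizing both to ISO 8601 in UTC follows. The function names are hypothetical, and the format string is an assumption based only on the sample shown here.

```python
from datetime import datetime, timezone

# Assumed layout of the Twitter created_at value, based on the slide sample.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def twitter_time_to_iso(created_at):
    """Convert a Twitter-style created_at string to an ISO 8601 UTC timestamp."""
    parsed = datetime.strptime(created_at, TWITTER_FORMAT)
    return parsed.astimezone(timezone.utc).isoformat()


def instagram_time_to_iso(created_time):
    """Convert an Instagram-style created_time (Unix seconds as text) to ISO 8601 UTC."""
    seconds = int(created_time)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    print(twitter_time_to_iso("Wed Oct 22 18:08:23 +0000 2014"))
    print(instagram_time_to_iso("1279340983"))
```

A shared timestamp format is one small prerequisite for describing or browsing content from multiple platforms together.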

Page 44: Building and Managing Social Media Collections

Metadata from harvesting software

Page 45: Building and Managing Social Media Collections

What is the container?

● Should we mix content from multiple platforms?

● How do we define container boundaries?

● How do we describe containers?

Page 46: Building and Managing Social Media Collections

What is the collection?

● To what extent are these artificial collections?
● Should these materials be integrated into existing collections?

Page 47: Building and Managing Social Media Collections

Access policies

● Can we balance privacy and research value?
● Can we provide research access while adhering to the Terms of Service?
○ “Hydration?”

● How do researchers browse materials?
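
"Hydration" refers to retrieving full tweets from a list of tweet IDs. The reverse step, reducing a collected dataset to IDs only so that the list can be shared, is sometimes called dehydration. The sketch below shows that reverse step, assuming the collection is stored as one JSON object per line; the function name dehydrate and the second sample ID are hypothetical. Rehydrating the IDs requires a tool with platform API access and is not shown.

```python
import json


def dehydrate(jsonl_lines):
    """Yield the id_str of each tweet in an iterable of JSON lines."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)["id_str"]


if __name__ == "__main__":
    # Two stand-in records; only the first ID comes from the slide sample.
    collected = [
        '{"id_str": "524985632775741440", "text": "full tweet text"}',
        '{"id_str": "524985632775741441", "text": "another tweet"}',
    ]
    for tweet_id in dehydrate(collected):
        print(tweet_id)
```

Whether sharing IDs alone satisfies a given platform's terms is a policy question to check against the current Terms of Service, not something the code settles.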

Page 48: Building and Managing Social Media Collections

Building research datasets

● Dataset stability and decay
○ Snapshots
○ Deletion
■ All Tweets will eventually be deleted
● Reproducibility
● Data sharing
● Research area restrictions
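
Dataset decay can be measured by comparing the tweet IDs in an original snapshot with the IDs that are still retrievable at a later date. The following is a rough sketch under that assumption; decay_report and its field names are hypothetical, and the IDs in the example are placeholders.

```python
def decay_report(original_ids, rehydrated_ids):
    """Compare a snapshot's tweet IDs with the IDs still retrievable later."""
    original = set(original_ids)
    still_available = original & set(rehydrated_ids)
    missing = original - still_available
    # Guard against an empty snapshot to avoid dividing by zero.
    share_missing = len(missing) / len(original) if original else 0.0
    return {
        "original": len(original),
        "still_available": len(still_available),
        "missing": sorted(missing),
        "share_missing": share_missing,
    }


if __name__ == "__main__":
    # Placeholder IDs: four tweets collected, two retrievable later.
    report = decay_report(["1", "2", "3", "4"], ["1", "3"])
    print(report)
```

Reporting the missing share alongside a shared ID list gives other researchers a sense of how closely they can reproduce the original dataset.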

Page 49: Building and Managing Social Media Collections

Tools and methods

Page 50: Building and Managing Social Media Collections

“Along with email, social media will probably provide the main source of information for researchers studying our current time. However, our institution just does not have the resources right now to collect and store the social media of other people or organizations.”

NCSU Social Media Archives Toolkit C.H.O. survey

Page 51: Building and Managing Social Media Collections

What are your goals?

● create archival / special collections
● support current faculty research
● support students with class projects

Page 52: Building and Managing Social Media Collections

What data do you need?

● current and going forward; recent or far past
● metadata
● images and other media referenced
● comments, responses, conversation

Page 53: Building and Managing Social Media Collections

What do you want to do?

● analyze, visualize
● archive, locally accession
● play back
● hydrate

Page 54: Building and Managing Social Media Collections

What are your financial resources?

Page 55: Building and Managing Social Media Collections

What are your technical resources?

Page 56: Building and Managing Social Media Collections

Some of the many options

Commercial

● Gnip
● Texifter
● Crimson Hexagon
● Sysomos
● Archive-It
● ArchiveSocial
● Radian6, Sprout Social, HootSuite

Free / open source

● TAGS
● NodeXL
● IFTTT
● R (twitteR)
● twarc
● Social Feed Manager [Twitter, Tumblr*, Flickr*]
● lentil [Instagram]
● youtube-dl [YouTube]
● MassMine* [Twitter, Tumblr]

*pre-release

Page 57: Building and Managing Social Media Collections

Twitter Collection Options

Options compared:

1. Commercial: Bulk data purchase (Gnip, Texifter)
2. Commercial: Firehose access (Gnip)
3. Commercial: Value-added platform (ArchiveSocial, Texifter)
4. TAGS, NodeXL
5. twarc
6. Social Feed Manager

Real-time data: x x x x x x
Historical data: x* x* past week; user data only, limited
Collect by user handle: x x x x x x
Collect by filter / hashtag: x x x x x x
Collect sample stream: x x x x
High reliability (backfill and redundancy): x x x
Built-in analysis or visualization tools: x x x
CSV export: x x x x (user data)
Free: x x x
Requires some local technical expertise: x depends x x

Page 62: Building and Managing Social Media Collections

twarc
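
twarc saves tweets as line-oriented JSON, one tweet per line, which makes simple summaries easy to script. The sketch below counts hashtags across such a file. It assumes each line carries the entities.hashtags structure shown in the Twitter sample earlier; count_hashtags and the file name tweets.jsonl are hypothetical.

```python
import json
from collections import Counter


def count_hashtags(jsonl_lines):
    """Count hashtags, ignoring case, across tweets stored one JSON object per line."""
    counts = Counter()
    for line in jsonl_lines:
        if not line.strip():
            continue  # skip blank lines
        tweet = json.loads(line)
        for tag in tweet.get("entities", {}).get("hashtags", []):
            text = tag.get("text")
            if text:  # tolerate empty hashtag objects
                counts[text.lower()] += 1
    return counts


if __name__ == "__main__":
    # Stand-in lines; in practice these would come from a file, for example:
    #     with open("tweets.jsonl") as handle: count_hashtags(handle)
    lines = [
        '{"entities": {"hashtags": [{"text": "EndYouthHomelessness"}, {"text": "edchat"}]}}',
        '{"entities": {"hashtags": [{"text": "EdChat"}, {}]}}',
    ]
    for tag, total in count_hashtags(lines).most_common():
        print(tag, total)
```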

Page 63: Building and Managing Social Media Collections

twarc-report visualizations with d3

Page 64: Building and Managing Social Media Collections

twarc-report visualizations with d3

Page 65: Building and Managing Social Media Collections

lentil

“The Shawu150 Project: Viewing DH from an HBCU,” Desiree Dighton.

Page 67: Building and Managing Social Media Collections

Wrap-up

Page 68: Building and Managing Social Media Collections

Bibliography

“National Archives and Records Administration White Paper on Best Practices for the Capture of Social Media Records,” May 2013. http://www.archives.gov/records-mgmt/resources/socialmediacapture.pdf.

Beckles, Julian, Samuel Collins, Glenn Daniels, Natalie Demyan, Matthew Durington, Cara Heasley, and David Rico. “Tagging Culture: Building a Public Anthropology through Social Media.” Human Organization 72, no. 4 (December 1, 2013): 358–68.

boyd, danah, and Kate Crawford. “Critical Questions for Big Data.” Information, Communication & Society 15, no. 5 (June 1, 2012): 662–79. doi:10.1080/1369118X.2012.678878.

Bruns, Axel, and Tim Highfield. “POLITICAL NETWORKS ON TWITTER: Tweeting the Queensland State Election.” Information, Communication & Society 16, no. 5 (June 2013): 667–91. doi:10.1080/1369118X.2013.782328.

Casden, Jason and Brian Dietz (co-PI). Social Media Archives Toolkit. http://www.lib.ncsu.edu/social-media-archives-toolkit

Cohen, Dan. “Digital Ephemera and the Calculus of Importance.” Dan Cohen, May 17, 2010. http://www.dancohen.org/2010/05/17/digital-ephemera-and-the-calculus-of-importance/.

Page 69: Building and Managing Social Media Collections

Collins, Samuel, Matthew Durington, Glenn Daniels, Natalie Demyan, David Rico, Julian Beckles, and Cara Heasley. “Tagging Culture: Building a Public Anthropology through Social Media.” Human Organization 72, no. 4 (November 13, 2013): 358–68. doi:10.17730/humo.72.4.v5x0205248427516.

Dash, Anil. “What Is Public? — The Message.” Medium. Accessed August 12, 2014. https://medium.com/message/what-is-public-f33b16d780f9.

Dixon, Kitsy. “Feminist Online Identity: Analyzing the Presence of Hashtag Feminism.” Journal of Arts & Humanities 3, no. 7 (2014): 34–40.

“Ethical Decision-Making and Internet Research Recommendations from the AoIR Ethics Working Committee (Version 2.0),” 2012. http://aoir.org/reports/ethics2.pdf

Halegoua, Germaine R., and Raz Schwartz. “The Spatial Self: Location-Based Identity Performance on Social Media.” New Media & Society, April 9, 2014, 1–18. doi:10.1177/1461444814531364.

Jules, Bergis. “Documenting the Now: #Ferguson in the Archives — On Archivy.” Medium, April 8, 2015. https://medium.com/on-archivy/documenting-the-now-ferguson-in-the-archives-adcdbe1d5788.

Lomborg, Stine. “Personal Internet Archives and Ethics.” Research Ethics 9, no. 20 (2013). doi:10.1177/1747016112459450.

Marshall, Catherine C. “Rethinking Personal Digital Archiving, Part 1: Four Challenges from the Field.” D-Lib Magazine, April 2008. http://www.dlib.org/dlib/march08/marshall/03marshall-pt1.html#Top.

Page 70: Building and Managing Social Media Collections

Nathan, Lisa P., and Elizabeth Shaffer. “Preserving Social Media: Opening a Multi-Disciplinary Dialogue.” UNESCO, n.d. http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/CI/CI/pdf/mow/VC_Nathan_Shaffer_27_B_1140.pdf.

Rivero, Enrique. “Twitter ‘Big Data’ Can Be Used to Monitor HIV and Drug-Related Behavior, UCLA Study Shows.” UCLA Newsroom, February 26, 2014. http://newsroom.ucla.edu/portal/ucla/twitter-big-data-can-be-used-to-250162.aspx.

Storrar, Tom. “Archiving Social Media.” The National Archives, May 8, 2014. http://blog.nationalarchives.gov.uk/blog/archiving-social-media/.

Summers, Ed. “An Invitation to Study Ferguson — On Archivy.” Medium, December 3, 2014. https://medium.com/on-archivy/an-invitation-to-study-ferguson-367b423cff29.

Tufekci, Zeynep. “Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls.” arXiv:1403.7400 [physics], March 28, 2014. http://arxiv.org/abs/1403.7400.

Zimmer, Michael, and Nicholas John Proferes. “A Topology of Twitter Research: Disciplines, Methods, and Ethics.” Aslib Journal of Information Management 66, no. 3 (2014): 250–61.

Zimmer, Michael. “The Twitter Archive at the Library of Congress: Challenges for Information Practice and Information Policy.” First Monday 20, no. 7 (June 21, 2015). http://firstmonday.org/ojs/index.php/fm/article/view/5619.

Page 71: Building and Managing Social Media Collections

Social Feed Manager is supported by the National Historical Publications & Records Commission Grant NAR14-DI-50017-14 (2014-2017)