Building the Open Computational Social Science...

Page 1: Building the Open Computational Social Science Toolboxvanatteveldt.com/wp-content/uploads/atteveldt_stavanger... · 2019. 6. 11. · Computational Social Science Data Gathering Analyzing

Computational Social Science Data Gathering Analyzing content Conclusion

Building the Open Computational Social Science Toolbox

Wouter van Atteveldt

June 2019

Building the Open Computational Social Science Toolbox Wouter van Atteveldt

CSS: What? Why? How?

CSS & Societal Resilience
• What, why, how?
• Building the Toolchain:
  • Data gathering: scraping & tracking
  • Analysis: Text and beyond
• Open Science and Research Transparency

Computational Communication Research

Welcoming your submissions!

Computational Social Science

• Our life is increasingly online
• Leaving ’digital traces’
• Which can be analysed to study social behaviour

(Lazer et al., 2009, Science)

Why now?

• Explosive increase in available data, tools, processing
  • (and much "big data" is communicative)
• Potential radical boost to study of communication
• But has numerous problems, challenges, pitfalls

(Van Atteveldt & Peng, 2018)

CSS: challenges & pitfalls

• Accessibility of data
• Representativeness/validity of ’found’ data
• Validity of computational methods
• Ethical conduct
• Skills & Infrastructure

Elements of the ’microscope’

• Data: How do we get the ’digital traces’?
• Analysis: From (textual) traces to data
• Open Science: Resilient Science?

Goals and challenges

Why do we need digital trace data?

Fragmentation of information
• Minimal mass media effects?
• TV as last homogeneous medium, demographically challenged?

Specific effects of online communication
• E.g. fear of online filter bubbles / polarisation

Ideal goal

• Message consumption/production data
• Full text and metadata
• Of representative sample of population
  • and/or fully connected subsample(s)
• Linked with attitude/behaviour measures

Getting text: Technical challenges

• News: can be scraped, retrieved via Nexis etc
• Paywalls make it more difficult
• Social media companies try to block scraping

"We are in the post-API age"

(Deen Freelon, PolComm, forthcoming)

Getting text: Legal challenges

• Copyright/database law can block scraping, blocks sharing
• Contract law can block scraping of restricted content
• Hacking laws might make scraping actually illegal
• Laws are uncertain and vary over jurisdictions/time
• Many researchers are anarchists, many institutions cautious

(IANAL!)

Gathering digital trace data

Desktop browsing:
• Desktop plugin (e.g. ASCoR personalized communication)
• History donation (e.g. Web Historian; Menchen-Trevino)

Mobile app use
• App can access (e.g. MobileDNA)

Mobile phone logs
• App can access (e.g. Kobayashi & Boase)

Mobile news browsing
• problem: how to get it :-)

CCS.Amsterdam: Mobile news tracking

Mobile news viewing: Challenges

We want to know what people see on their mobile, but
• Most mobile browsers don’t allow plugins
• HTTPS/encrypted app communication makes proxy/MITM difficult
  • esp. combined with certificate pinning
• Most apps have in-app browsing; proprietary protocols

Mobile news viewing: possibilities

1 Browser sync + desktop plug-in / application
  • Can build on browser plugins
2 GDPR requests by user
  • No facility to make request on behalf of user
  • Instructions needed for each app
  • (need something akin to FSD)

CCS.Amsterdam: Tracking the filter bubble

• NWO-Joint Escience Data Science (JEDS) programme
• Goal: develop mobile tracking, analyse effects of mobile news on attitudes
• Team:
  • Social science: me, Damian Trilling (UvA), Judith Moller (UvA), Felicia Locherbach (VU)
  • Engineering: Antske Fokkens (VU), Laura Hollink (CWI), Jisk Attema (NLeSC), Laurens Bogaardt (NLeSC)
  • Law/normative theory: Natali Helberger (UvA)

(See Van Atteveldt et al., ICA 2019; and ICA postconf)

The need for content analysis

• Trace data often partially unstructured/symbolic (text, speech, image, video)
• Need to convert ’text to data’
  • Measure relevant quantities
  • In a valid and scalable way
• Focus on text
  • But really cool stuff is happening with image analysis, see e.g. ICA pre-conf & panel!

What are the relevant quantities?

Depends on RQ, but often see message as (collection of) statement(s):
• source
• topic and/or target
• tone/sentiment

This can yield measurement per message, or per statement
• Per message is easier, but many texts contain multiple statements
• Construct semantic network from text
  • (Core Sentence approach; political claims analysis; NET)
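
The per-statement view can be illustrated with a minimal Python sketch. It assumes statements have already been coded as (source, target, tone) triples; the actor names and tone values are invented for illustration and are not data from the talk. The sketch aggregates statements into a weighted, directed network.

```python
from collections import defaultdict

# Hypothetical coded statements: (source, target, tone in [-1, 1]).
statements = [
    ("PartyA", "PartyB", -1.0),
    ("PartyA", "PartyB", -0.5),
    ("PartyB", "economy", 1.0),
    ("PartyA", "economy", 0.5),
]

def build_network(statements):
    """Aggregate statements into edges: (source, target) -> (count, mean tone)."""
    totals = defaultdict(lambda: [0, 0.0])
    for source, target, tone in statements:
        totals[(source, target)][0] += 1
        totals[(source, target)][1] += tone
    return {edge: (n, total / n) for edge, (n, total) in totals.items()}

network = build_network(statements)
for (source, target), (n, mean_tone) in sorted(network.items()):
    print(f"{source} -> {target}: n={n}, mean tone={mean_tone:+.2f}")
```

A per-message measurement would instead collapse all statements in a text into one score, losing who says what about whom.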

What techniques do we need?
• Identifying actors: (easiest)
  • Dictionaries, Named Entity Recognition, Coreference resolution
• Identifying issues/topics: (doable)
  • "Automatic text classification"
  • Dictionaries, (Structural) Topic modeling, Supervised machine learning
• Identifying tone/sentiment: (hard!)
  • "Sentiment analysis"/"Opinion extraction"
  • Dictionaries, Supervised machine learning
• From text to statements
  • "Semantic Role Labeling" (sort of)
  • Syntactic analysis, Supervised machine learning

CCS.Amsterdam: Sentiment Analysis

CCS.Amsterdam: Text analysis

• Multiple projects on sentiment analysis, syntactic analysis, deep learning, crowd coding, topic modeling, etc
• Members (i.a.): Damian Trilling (UvA), Anne Kroon (UvA), Kasper Welbers (VU), Antske Fokkens (VU)

The importance of sentiment analysis

• Many theories in (political) communication connected to tone
  • Issue positions
  • Negative campaigning
  • Conflict news
  • Reviews, reputation, etc
• Tone is notoriously hard to define & measure (automatically)
  • Ambiguous
  • Creative

Test case: Dutch economic news

• Is news positive or negative about economy?
• Interesting for retrospective voting, framing, news bias
• Should be ‘best-case’ scenario for automatic analysis
  • Relatively unambiguous
  • Relatively factual
• RQ: Can we automatically measure the tone of economic news?

How do off-the-shelf dictionaries do?

• Mark Boukes et al., ICA 2018
• Compare undergrad coders with existing dictionaries
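
To make the dictionary approach concrete, here is a minimal sketch of dictionary-based tone scoring. The word lists and the scoring rule (positive minus negative hits, divided by all hits) are assumptions for illustration; they are not the dictionaries or the scoring used in the study.

```python
import re

# Toy word lists; a real study would load an existing, validated dictionary.
POSITIVE = {"growth", "recovery", "profit", "rises"}
NEGATIVE = {"crisis", "loss", "unemployment", "falls"}

def dictionary_tone(text):
    """Return (positive - negative) / matched words, or 0.0 if nothing matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0

print(dictionary_tone("Unemployment falls as recovery continues"))
```

With these toy lists the example headline gets a negative score, because "unemployment" and "falls" each count as negative, although a human coder would probably read it as good news. This is one way word counting can go wrong on economic headlines.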

What else can we try?

Compare (triple-coded) gold standard with:
• Undergrads
• Dictionaries
• Crowd coding
• (translation + dictionaries)
• Machine learning

Gold standard

• Selected ~300 headlines from ICR sample
• Coded independently by me, Mark, Mariken van der Velden (α=.78)
• All differences resolved except some remaining disagreements
  • (E.g. “Interest rates hit zero”, “Greece will be fine for a couple more weeks”, “Aging population puts brake on house prices”)
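
The slide does not say which agreement coefficient α refers to. Assuming it is Krippendorff's alpha and that tone codes are treated as nominal categories, a minimal sketch of the computation could look as follows; the codes in the example are toy data, not the study's.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    `units` is a list of lists: the codes one unit received from its coders.
    Units with fewer than two codes cannot be paired and are skipped.
    The result is undefined (division by zero) if only one category occurs.
    """
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue
        # Every ordered pair of codes from different coders within a unit.
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    return 1 - observed / expected

# Toy example: two coders, ten units, binary codes.
coder_1 = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_2 = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(coder_1, coder_2)])
print(round(alpha, 3))
```

For three coders, each inner list would simply hold three codes per headline.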

Crowd coding

• Crowd coding promising solution for sentiment coding
  • Decision is simple / “intuitive”
  • More cheap coders > Fewer better coders
• Method:
  • Same n~300 sentences
  • Each sentence coded by ~5 coders
  • (.02$ per sentence/coder, <50$ total)
  • Simple instructions, single question
  • Use gold questions to filter coders

[note: current results based on subset of data]
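
The crowd-coding method can be sketched as two steps: screen coders on gold questions, then take a majority vote per sentence. The data, the 0.7 accuracy threshold, and the tie handling are assumptions for illustration, not the settings used in the study.

```python
from collections import Counter, defaultdict

# Hypothetical judgments: (coder, sentence_id, code) with code in {-1, 0, 1}.
judgments = [
    ("c1", "gold1", 1), ("c2", "gold1", 1), ("c3", "gold1", -1),
    ("c1", "s1", -1), ("c2", "s1", -1), ("c3", "s1", 1),
    ("c1", "s2", 0), ("c2", "s2", 1), ("c3", "s2", -1),
]
gold_answers = {"gold1": 1}  # known answers, used only to screen coders

def trusted_coders(judgments, gold_answers, min_accuracy=0.7):
    """Keep coders whose share of correct gold questions is high enough."""
    correct, seen = Counter(), Counter()
    for coder, item, code in judgments:
        if item in gold_answers:
            seen[coder] += 1
            correct[coder] += code == gold_answers[item]
    return {c for c in seen if correct[c] / seen[c] >= min_accuracy}

def majority_codes(judgments, gold_answers, coders):
    """Majority code per non-gold sentence; ties yield None."""
    votes = defaultdict(Counter)
    for coder, item, code in judgments:
        if coder in coders and item not in gold_answers:
            votes[item][code] += 1
    result = {}
    for item, counts in votes.items():
        ranked = counts.most_common(2)
        tie = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        result[item] = None if tie else ranked[0][0]
    return result

coders = trusted_coders(judgments, gold_answers)
print(majority_codes(judgments, gold_answers, coders))
```

At the quoted rate, roughly 300 sentences × 5 coders × $0.02 comes to about $30, which fits the "<50$ total" figure.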

Machine learning

• Train on 6,203 manually coded headlines
• Test on gold sample
• Compare:
  • ’Traditional’ SVM on document-lemma matrix
  • ’Deep learning’ Convolutional Neural Network with word embeddings

Problems with classical machine learning

• Data scarcity
  • Never enough coded data available
  • More parameters than cases
  • Words not in training material have unknown ’meaning’
• Simplistic representation
  • Bag of words
    • "it wasn’t bad, it was actually quite good"
    • "it wasn’t good, it was actually quite bad"
  • Richer features increase data scarcity (and require domain/NLP knowledge)
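
The bag-of-words problem with the two example sentences can be checked directly. This is a minimal sketch with a simple regular-expression tokenizer, which is an assumption; any tokenizer that discards word order behaves the same way here.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase, split into word tokens, and count them (order is discarded)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

a = bag_of_words("it wasn't bad, it was actually quite good")
b = bag_of_words("it wasn't good, it was actually quite bad")
print(a == b)  # True: both sentences contain exactly the same words
```

Because the two representations are identical, no classifier trained on them can tell the sentences apart, whatever its tone labels are.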

Solution part 1: Word embeddings
• Bag of words treats each word as unique
• Similar words can be treated similarly
• Treat each word as (relatively small) vector of scores, so:
  • unseen word can be interpolated
  • fewer parameters need to be trained
• Embedding vectors based on (very large) uncoded text collection
  • Amsterdam Embedding Model trained on >10M news articles (Kroon et al, ICA 2019)
  • Trained to maximize prediction of each word based on context window

Note: essentially dimensionality reduction similar to factor analysis, topic modeling, latent semantic indexing (but more effective due to training on more data and using explicit contexts)
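
A small sketch of why vectors help with unseen words: if "excellent" never occurs in the coded training data but its vector lies close to that of "good", a model can treat the two similarly. The three-dimensional vectors below are made up for illustration; real embeddings are learned from a large uncoded corpus and have far more dimensions.

```python
import math

# Made-up 3-dimensional vectors, for illustration only.
embeddings = {
    "good":      [0.9, 0.1, 0.0],
    "excellent": [0.8, 0.2, 0.1],
    "bad":       [-0.9, 0.1, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

print(cosine(embeddings["excellent"], embeddings["good"]))
print(cosine(embeddings["excellent"], embeddings["bad"]))
```

In this toy space "excellent" is much closer to "good" than to "bad", so information learned about "good" carries over.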

Solution part 2: "Deep Learning"

• Deep learning builds richer features as part of training
• Possibility to use context of words
• See: Yoav Goldberg, Neural Network Methods for Natural Language Processing; Anne Kroon et al, 2019 ICA on Dutch embeddings.
• This study: Convolutional Neural Network
  • Method originating from image analysis
  • N-grams of word representations are concatenated, pooled per unit
  • Pooled output is then used as input for regular learning
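
The concatenate-and-pool step can be sketched in plain Python. This is a simplified illustration, not the network used in the study: the embeddings and filter weights are fixed toy values (in a real network the filters are learned), the n-gram size is 2, and the sentence is assumed to have at least n words.

```python
def ngram_windows(vectors, n):
    """Concatenate the vectors of each run of n consecutive words."""
    return [
        [x for vec in vectors[i:i + n] for x in vec]
        for i in range(len(vectors) - n + 1)
    ]

def conv_max_pool(vectors, filters, n=2):
    """Apply each filter to every n-gram window, then max-pool over positions."""
    windows = ngram_windows(vectors, n)
    pooled = []
    for weights in filters:
        activations = [
            max(0.0, sum(w * x for w, x in zip(weights, window)))  # ReLU
            for window in windows
        ]
        pooled.append(max(activations))
    return pooled

# Toy 2-dimensional embeddings for a 4-word headline and two bigram filters.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [
    [1.0, 0.0, 0.0, 1.0],    # dim 0 of word i plus dim 1 of word i+1
    [-1.0, 0.0, 0.0, -1.0],  # the opposite pattern
]
features = conv_max_pool(sentence, filters)
print(features)
```

The pooled list has one value per filter regardless of sentence length, which is what allows it to serve as input to a regular classifier layer.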

Convolutional Neural Network

Results

• How do all methods compare to gold standard?
• How do methods correlate with each other?
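
Both questions can be answered with simple statistics. The sketch below assumes accuracy against the gold codes and Pearson correlation between methods as the metrics, which the slide does not specify, and uses invented codes rather than results from the study.

```python
import math

def accuracy(predicted, gold):
    """Share of items where the method agrees with the gold standard."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Invented codes (-1, 0, 1) for six headlines; not results from the study.
gold = [1, -1, 0, 1, -1, 0]
methods = {
    "dictionary": [1, 0, 0, -1, -1, 0],
    "crowd":      [1, -1, 0, 1, -1, 1],
}
for name, codes in methods.items():
    print(name, round(accuracy(codes, gold), 2), round(pearson(codes, gold), 2))
print("dictionary vs crowd",
      round(pearson(methods["dictionary"], methods["crowd"]), 2))
```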

Open Science: Sharing & Transparency

What is open science?

• Research Transparency:
  • data access
  • transparent design
  • analytical transparency
• Openness boosts:
  • Reproducibility
  • Robustness
  • Replicability
  • Generalizability

(e.g. Bowman & Keene, 2018; Klein et al., 2018; Munafò et al., 2017; Nosek et al., 2015)

Why open CSS?

Opportunity:
• Data and tools are digital
• Culture of open source, sharing
• Possibility for reproducible research

Need:
• Big data can be abused easily
• Skills and tools can be scarce
• Strong need for openness: share, inspect, improve

Open CSS: Challenges

Data sharing challenges:
• Fear of being scooped
• Proprietary data, copyright, legal uncertainty
• Not enough incentives

Code/Tool sharing challenges:
• Fear of being caught making mistakes
• Effort required to turn research code into software
• Not enough incentives

(Van Atteveldt et al., 2019, IJoC; Van Atteveldt et al., 2019, CCR)

What’s next? Towards an open CSS

• Building the toolchain, doing the research
• Focus on validity, usability, re-usability:
  • Sharing the tools
  • Sharing the data
  • Sharing the results
  • Sharing the skills
• This effort needs to be collaborative!
• See https://github.com/ccs-amsterdam/

Conclusion

• Computational Social Science offers great promise to study societal resilience
• Need to overcome specific challenges
  • Acquiring data
  • Building tools
  • Growing skills (& institutional incentives)
• Requires Transparent, Open, Collaborative research to thrive

Some links
• https://vanatteveldt.com (this talk, publications)
• https://ccs.amsterdam (project descriptions)
• https://github.com/ccs-amsterdam (code)
• https://computationalcommunication.org (submit & first issue!)
