Post on 23-Feb-2016
DIGGING INTO DATA: A presentation to the NSF
Cascades, Islands, or Streams? Time, Topic and Scholarly Activities in Humanities and Social Science Research
DIGGING INTO DATA: Who is
LARIVIÈRE
MILOJEVIĆ
SUGIMOTO
DING
THELWALL
HOLMBERG
DIGGING INTO DATA: What
Time: 1743-2011
Dissertations: 2,307,555
Subjects: 166
Schools: 1,490
Countries: 66
DIGGING INTO DATA: What
Time: 1900-2011
Medicine Articles: 14,698,810
Medicine References: 380,058,817
Social Science Articles: 4,228,702
Social Science References: 77,908,552
Arts & Humanities Articles: 3,151,986
Arts & Humanities References: 26,180,296
Natural Science Articles: 14,853,029
Natural Science References: 335,144,498
DIGGING INTO DATA: What
Time: 2007-2012
Articles: 744,584
Broad Subject areas: 7
Matching ISI records: ~50%
DIGGING INTO DATA: What
Time: 2010-current
Tweets: 100,000 per month
Subjects: 11
Generalist journals: 4
Scientists and science journalists: 350
DIGGING INTO DATA: What
Time: 2006-2012
Videos: 1,202
Views on TED: 620,406,446
Views on YouTube: 111,681,275
Comments on YouTube: 414,311
DIGGING INTO DATA: Why are we
Integrate several datasets representing a broad range of scholarly activities
Use methodological and data triangulation to explore the lifecycle of topics within and across a range of scholarly activities
Develop transparent tools and techniques to enable future predictive analyses
DIGGING INTO DATA: Show me the
Domain                        # Papers    Average delay
Earth and Space                 52,438             0.40
Physics                        176,557             0.57
Biomedical Research              3,119             0.57
Chemistry                        1,664             0.61
Engineering and Technology       8,020             1.04
Biology                            285             1.18
Mathematics                     44,535             1.44
Social Sciences                    367             1.61
Professional Fields                382             1.77
Clinical Medicine                  566             1.93
Psychology                          80             2.26
Humanities                         111             3.88
Health                              37             4.54
Arts                                36             4.69
All disciplines                288,200             0.69
DIGGING INTO DATA: Show me the
[Figure: Mean citation rate (y-axis, 0 to 3.5) by year (x-axis, 1995-2010). Series: ArXiv and WoS (WoS version); Only in WoS; ArXiv and WoS (ArXiv version); ArXiv Only.]
DIGGING INTO DATA: Show me the
H = Hedges: lowered certainty (“perhaps”)
B = Boosters: heightened certainty (“absolutely”)
SM = Self-mentions: self-references (“the author”)
AM = Attitude markers: author-text positions (“admittedly”)
EM = Engagement markers: reader positions (“should”)
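The five marker categories above lend themselves to simple counting. The following is a minimal sketch, assuming a marker can be detected by whole-phrase matching; the `MARKERS` table and the `count_markers` function are hypothetical names, and each category holds only the single example phrase given on the slide, not the project's actual marker lists.

```python
import re
from collections import Counter

# Hypothetical marker table: one example phrase per category, taken from the
# slide. A real analysis would need a much fuller list per category.
MARKERS = {
    "H": ["perhaps"],       # hedges
    "B": ["absolutely"],    # boosters
    "SM": ["the author"],   # self-mentions
    "AM": ["admittedly"],   # attitude markers
    "EM": ["should"],       # engagement markers
}


def count_markers(text):
    """Count whole-phrase occurrences of each marker category in text."""
    lowered = text.lower()
    counts = Counter()
    for category, phrases in MARKERS.items():
        for phrase in phrases:
            # \b keeps "should" from matching inside "shoulder", for example
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts


sample = "Perhaps the author should reconsider; admittedly, the author is absolutely right."
print(dict(count_markers(sample)))
```

Phrase matching of this kind ignores context, so a word such as "should" is counted even where it does not address the reader.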
DIGGING INTO DATA: Show me the
Metric                      Minimum   Median     Mean    Maximum        Total  Valid
TED web site views           44,441  338,969  517,437  9,946,996  620,406,446  1,199
YouTube views                   462   43,311   99,184  3,991,983  111,681,275  1,126
Blog citations                    0    3,120    9,073    441,000   10,905,376  1,202
YouTube Likes                     2      485      900     26,591    1,013,231  1,126
YouTube Favorite count            3      299      767     38,139      863,458  1,126
YouTube comments                  0      195      368     21,703      414,311  1,126
TED web site comments             8      117      187      5,921      224,629  1,199
YouTube Dislikes                  0       34       69      1,456       78,053  1,126
Academic syllabi                  0        1        2         50        2,070  1,202
PDF and Word documents            0        0        0         49          592  1,202
Google Scholar citations          0        0        0         75          505  1,202
Google Books citations            0        0        0         18          434  1,202
PowerPoint presentations          0        0        0        238          392  1,202
Mendeley readers                  0        0        0         30          231  1,202
Web of Knowledge citations        0        0        0          5           47  1,202
YouTube Like proportion       0.260    0.941    0.900      1.000            -  1,126
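The Mean, Total, and Valid columns appear to be related by Mean = Total / Valid, where Valid is read as the number of videos with usable data for that metric. That reading is an assumption, not something the slide states. A small sketch that checks it against a few rows of the table (the `mean_matches` helper is a hypothetical name):

```python
# Rows copied from the table: metric -> (total, valid, reported mean).
ROWS = {
    "TED web site views": (620_406_446, 1_199, 517_437),
    "YouTube views": (111_681_275, 1_126, 99_184),
    "Blog citations": (10_905_376, 1_202, 9_073),
    "YouTube Likes": (1_013_231, 1_126, 900),
    "YouTube comments": (414_311, 1_126, 368),
    "TED web site comments": (224_629, 1_199, 187),
}


def mean_matches(total, valid, reported):
    """True if total / valid, rounded to the nearest integer, equals reported."""
    return round(total / valid) == reported


for metric, (total, valid, reported) in ROWS.items():
    print(metric, total / valid, mean_matches(total, valid, reported))
```

The last row of the table is different in kind: its Total is blank, which fits a proportion computed per video and then summarised, rather than a count summed over videos.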
DIGGING INTO DATA: Show me the
Metric                                                     Academic  Non-academics
TED web site views                                          327,904        321,320
YouTube views                                                49,660         45,414
Blog citations                                                2,340          2,246
YouTube comments                                                223            190
TED web site comments                                           111            112
Online mentions related to academic syllabi                       1              1
Online mentions in PDF and Word documents (acad. higher)          0             0*
Google Scholar citations                                          0              0
Google Books citations                                            0              0
Online mentions in PowerPoint presentations                       0              0
Mendeley readers                                                  0              0
Web of Knowledge citations                                        0              0
YouTube Like proportion                                      0.9574       0.9271**
DIGGING INTO DATA: Keep on
DIGGING INTO DATA: Comments
DIGGING INTO DATA: Analyzing sentiment
We are developing the sentiment analysis software SentiStrength for use on the texts in the project
The program will classify the sentiment of texts based on lexicons of words (e.g., good, bad), plus special rules for negation, booster words (e.g., very), etc.
The lexicon will be customised for different genres: e.g., flawed and incomplete for academic texts; dull and inspiring for videos
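The approach described above (a word lexicon, rules for negation and booster words, and genre-specific lexicon entries) can be sketched in a few lines. This is an illustrative toy only, not SentiStrength's actual algorithm, lexicon, or scoring scale: the strength values, the word sets beyond the slide's examples, and the names `score_text`, `BASE_LEXICON`, and `GENRE_LEXICONS` are all assumptions made for the example.

```python
import re

# Hypothetical strengths: positive values are positive sentiment, negative
# values are negative sentiment. The words are the slide's examples; the
# numbers are invented for illustration.
BASE_LEXICON = {"good": 2, "bad": -2}
GENRE_LEXICONS = {
    "academic": {"flawed": -3, "incomplete": -2},
    "video": {"dull": -2, "inspiring": 3},
}
NEGATORS = {"not", "never", "no"}        # assumed negation words
BOOSTERS = {"very": 1, "extremely": 2}   # assumed booster increments


def score_text(text, genre=None):
    """Return (positive_total, negative_total) for text under a genre lexicon."""
    lexicon = dict(BASE_LEXICON)
    lexicon.update(GENRE_LEXICONS.get(genre, {}))
    tokens = re.findall(r"[a-z']+", text.lower())
    positive, negative = 0, 0
    boost, negate = 0, False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        if token in BOOSTERS:
            boost += BOOSTERS[token]
            continue
        if token in lexicon:
            strength = lexicon[token]
            # Boosters push the strength away from zero.
            strength += boost if strength > 0 else -boost
            # Negation flips the polarity of the word that follows it.
            if negate:
                strength = -strength
            if strength > 0:
                positive += strength
            else:
                negative += strength
        # Modifiers apply only to the next non-modifier word.
        boost, negate = 0, False
    return positive, negative


print(score_text("The talk was very inspiring", genre="video"))
print(score_text("The argument is flawed and incomplete", genre="academic"))
print(score_text("not very good"))
```

Keeping the genre entries in a separate table means the same scoring loop serves every genre, with only the lexicon swapped per text type.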
DIGGING INTO DATA: Lead-lag analysis
DIGGING INTO DATA: After
Scott Weingart
DIGGING INTO DATA: Towards a new model
[Diagram. Document types: Draft, Report, Conf. paper, Article, Review, Book, Tweet, Email, Blog, Slide show, Multimedia, Curated DB. Roles: Producer, Consumer, Prosumer.]
DIGGING INTO DATA: Questions about
Cassidy R. Sugimoto (PI)
Assistant Professor
School of Library and Information Science
Indiana University Bloomington
sugimoto@indiana.edu