Data matters-bournemouth-2015
![Page 1: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/1.jpg)
Data Matters
Alan Dix
Talis & University of Birmingham
http://alandix.com/ref2014/
![Page 2: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/2.jpg)
University of Birmingham
Tiree
Tiree Tech Wave
22-26 October 2015
![Page 3: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/3.jpg)
today I am not talking about …
• intelligent internet interfaces
• visualisation and sampling
• situated displays, eCampus, small device – large display interactions
• fun and games, virtual crackers, artistic performance, slow time
• creativity and Bad Ideas
• modelling dreams and regret and the emergence of self
…
![Page 4: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/4.jpg)
… or even lots of lights
http://www.hcibook.com/alan/projects/firefly/
![Page 5: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/5.jpg)
I am talking about ...
REF data analysis
long tail of small data
![Page 6: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/6.jpg)
REF
![Page 7: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/7.jpg)
REF 2014
Research Excellence Framework
approx 5 yearly research assessment in the UK
not just about the UK …
lots of countries are thinking of doing something similar ... and looking to REF as an example
![Page 8: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/8.jpg)
REF elements
three elements:
outputs (mainly papers)
impact
environment
focus of this work
![Page 9: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/9.jpg)
REF panels
4 main panels, 36 sub-panels, ~200K outputs
sub-panel 11: computer science and informatics
I was on this panel but NO confidential data here
everything public domain
![Page 10: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/10.jpg)
REF profiles
every output graded: 4* / 3* / 2* / 1*
individual grades confidential and destroyed
each ‘Unit of Assessment’ (dept) given a profile
http://results.ref.ac.uk/Results/ByUoa/11/Outputs
![Page 11: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/11.jpg)
sub-area profiles
N.B. computing only
each output given ACM code
originally to enable allocation to panelists
… but, also used to create sub-area profiles …
![Page 12: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/12.jpg)
sub-area profiles
From Morris Sloman’s slides & panel report
theoretical areas: 30-40% 4*
applied/human areas: 10-20% 4*
![Page 13: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/13.jpg)
data not information
sub-panel report warning: "These data should be treated with circumspection …"
however, already affecting institutional policy: hiring, internal investment
… and may influence research council policy
![Page 14: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/14.jpg)
possible reasons for variation …
1. best applied work is weak – including HCI :-/
2. long tail – weak researchers choose applied areas
3. latent bias – despite panel’s efforts to be fair
can bibliometrics disentangle these?
![Page 15: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/15.jpg)
metrics and assessment
citation metrics known to be good post-hoc correlates of sophisticated measures
… but not for individuals and small cohorts, and there is a danger of gaming and policy distortion
suitable for verifying large-scale patterns (and HEFCE is using them for this)
![Page 16: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/16.jpg)
data used for analysis
all in public domain
(virtually) complete list of outputs:
– excluding a few confidential ones
– for each: name, DOI, ACM topic area, Scopus citations
Google Scholar citations for each
– gathered after REF (not used in assessment)
UoA and sub-area profiles
![Page 17: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/17.jpg)
metrics used
Scopus (late 2013 census)
– with/without 2012/13, as these have few citations
‘Normalised Scopus’
– using ‘contextual data’, corrects for different citation patterns between areas
– places output in top 1%, 5%, 10% of its area worldwide
Google Scholar (late 2014 census)
– with/without 2012/13; zero treated as zero/missing
seven variants – all give similar results
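One of the simpler variants above (raw citation counts, with zeros optionally treated as missing) could be turned into a per-area figure roughly as follows. This is only a minimal sketch, not the analysis code behind the talk: the function name `top_quartile_share`, the `(area, citations)` record shape and the choice of a single quartile threshold over the whole sample are all assumptions made for illustration.

```python
from collections import defaultdict


def top_quartile_share(outputs, zero_as_missing=False):
    """Return {area: % of that area's outputs in the top quartile by citations}.

    `outputs` is a list of (area, citations) pairs (hypothetical data shape).
    With zero_as_missing=True, outputs with zero citations are dropped first,
    mirroring the 'zero treated as zero/missing' variants.
    """
    if zero_as_missing:
        outputs = [(area, cites) for area, cites in outputs if cites > 0]
    ranked = sorted((cites for _, cites in outputs), reverse=True)
    if not ranked:
        return {}
    # Citation count of the last output inside the top 25% of the whole sample.
    # Ties at the threshold are all counted, so slightly more than 25% may qualify.
    threshold = ranked[max(len(ranked) // 4 - 1, 0)]
    totals = defaultdict(int)
    top = defaultdict(int)
    for area, cites in outputs:
        totals[area] += 1
        if cites >= threshold:
            top[area] += 1
    return {area: 100.0 * top[area] / totals[area] for area in totals}
```

The resulting percentage for each area can then be set beside that area's published % 4* to look for large mismatches.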
![Page 18: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/18.jpg)
results … massive differences
[figure: % citations in top quartile, % REF 4* and their ratio, with ‘winners’ and ‘losers’ marked]
![Page 19: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/19.jpg)
‘scatter’ graph
[figure axes: % outputs in top quartile for citations; % outputs awarded REF 4*]
![Page 20: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/20.jpg)
rank scores
winners
losers
diagram thanks to Andrew Howes
![Page 21: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/21.jpg)
Another way of looking at it …
world ranking within own field
![Page 22: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/22.jpg)
recall REF …
![Page 23: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/23.jpg)
for example, HCI research (web similar) …
on average …
• HCI/CSCW paper needs to be in top 0.5% worldwide to get 4*
• logic/algorithms paper just needs to be in top 5%
10-fold difference
![Page 24: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/24.jpg)
and just as you thought it was all over …
… institutional effects
look at +/- 25% REF compared with citations
N.B. use high-end weighted measure as money is focused (4:1:0:0)
of 35 losers, 25 are post-1992 universities
of 17 winners, 16 are pre-1992 universities
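The 4:1:0:0 weighting and the +/- 25% band can be illustrated with a short sketch. The slides do not say how the citation-based expectation was computed, so `citation_profile` here is a hypothetical input (a star profile predicted from citations), and the function names `weighted_score` and `classify` are illustrative only.

```python
WEIGHTS = (4, 1, 0, 0)  # weights for 4*, 3*, 2*, 1*, as given on the slide


def weighted_score(profile, weights=WEIGHTS):
    """Weighted score for a profile of percentages at (4*, 3*, 2*, 1*)."""
    return sum(w * p for w, p in zip(weights, profile))


def classify(ref_profile, citation_profile, margin=0.25):
    """Label a unit 'winner', 'loser' or 'neutral'.

    A unit is a winner if its REF weighted score is more than `margin`
    above the score expected from citations, and a loser if it is more
    than `margin` below. `citation_profile` is an assumed, hypothetical
    citation-based prediction of the same four percentages.
    """
    ref = weighted_score(ref_profile)
    expected = weighted_score(citation_profile)
    if expected == 0:
        return "neutral"  # no basis for comparison
    ratio = ref / expected
    if ratio > 1 + margin:
        return "winner"
    if ratio < 1 - margin:
        return "loser"
    return "neutral"
```

For instance, a unit with a REF profile of (30, 50, 20, 0) against a citation-based expectation of (20, 50, 25, 5) scores 170 against 130, which falls outside the 25% band on the winning side.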
![Page 25: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/25.jpg)
an example …
XXXXXXX – a new university
YYYYYYYY – an old university
World Rankings
REF
![Page 26: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/26.jpg)
and Gender?
Female authors in main panel B were significantly less likely to achieve a 4* output than male authors with the same metrics ratings. When considered in the UOA models, women were significantly less likely to have 4* outputs than men whilst controlling for metric scores in the following UOAs: Psychology, Psychiatry and Neuroscience; Computer Science and Informatics; Architecture, Built Environment and Planning; Economics and Econometrics.
The Metric Tide (HEFCE, 2015)
![Page 27: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/27.jpg)
implicit bias?
HEFCE analysis: male staff in computing are 1/3 more likely to get a 4* than female staff
areas and types of institutions disadvantaged by REF are often those with more women
… implications for future recruitment?
![Page 28: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/28.jpg)
future for research assessment?
• pure metrics?
• metrics as part (e.g. older outputs)
• metrics as under-girding (burden of proof)
• human process – metrics for in-process feedback
![Page 29: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/29.jpg)
![Page 30: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/30.jpg)
long tail of small data
![Page 31: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/31.jpg)
Big Data
everyone is talking about it
Twitter, Google, Facebook, NSA, universities, … and funding
Big Data does it with MapReduce
Semantic Data does it with RDF
![Page 32: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/32.jpg)
the long tail
size of data set
a few very large data sets
e.g. Twitter, streams, Open Govt., OS, geonames, dbpedia
the small data of ordinary life:
from local bus timetables to squash club league tables
![Page 33: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/33.jpg)
stories of small data …
Walking Wales
Learning analytics
Open Data Islands and Communities
Musicology
![Page 34: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/34.jpg)
![Page 35: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/35.jpg)
Alan Walks Wales
1058 miles (1700km)
3 million footfalls
3 ½ months
April-July 2013
focus on IT at the margins
one thousand miles of poetry, technology and community
![Page 36: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/36.jpg)
vision
personal: encircling, encompassing, pilgrimage, homecoming
practical: IT for the walker & IT for local communities
philosophical: reflections on walking and space, locality and identity
research: personal agenda and living lab
lots of data
![Page 37: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/37.jpg)
data
location: GPX ... batteries ... sporadic signals ....
bio-sensing: ECG (heart), EDA (skin) and accelerometers
audio and images: in the moment
text: after the event
implicit
explicit
The largest ECG trace in the public domain
![Page 38: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/38.jpg)
challenges (1)
location: GPX – merging and mending
bio-sensing: ECG & EDA – special formats & volume
audio and images: volume, transcription and annotation
text: semantic markup, synchronising sources
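As an illustration of what "merging and mending" GPX might involve, here is a minimal sketch using only the Python standard library. It is not the tooling used for the walk: the function names are hypothetical, it assumes GPX 1.1 files whose track points carry whole-second UTC timestamps ending in `Z`, and it only detects gaps rather than repairing them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Namespace used by GPX 1.1 documents (GPX 1.0 files use a different one).
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_points(gpx_text):
    """Yield (time, lat, lon) for every track point that has a timestamp."""
    root = ET.fromstring(gpx_text)
    for pt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time_el = pt.find("gpx:time", GPX_NS)
        if time_el is None or not time_el.text:
            continue  # skip points with no time: they cannot be ordered
        # Assumes timestamps such as 2013-04-18T09:00:00Z (no fractional seconds).
        when = datetime.strptime(time_el.text.strip(), "%Y-%m-%dT%H:%M:%SZ")
        yield when, float(pt.get("lat")), float(pt.get("lon"))


def merge_tracks(gpx_texts):
    """Merge several GPX documents into one time-ordered list of points.

    Where two files contain a point with the same timestamp, the first
    one encountered is kept.
    """
    merged = {}
    for text in gpx_texts:
        for when, lat, lon in read_points(text):
            merged.setdefault(when, (lat, lon))
    return [(when, *merged[when]) for when in sorted(merged)]


def find_gaps(points, max_gap=timedelta(minutes=5)):
    """Return (start, end) time pairs where consecutive points are too far apart."""
    return [(a[0], b[0]) for a, b in zip(points, points[1:]) if b[0] - a[0] > max_gap]
```

The gaps reported by `find_gaps` mark the stretches (flat batteries, lost signal) that would need mending by hand or from another source.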
![Page 39: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/39.jpg)
challenges (2)
documentation: methodology of creation, data formats; for other people to use!
meta-data: for machines to use
PR: telling the world about it!
academic culture: we do not value data!
![Page 40: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/40.jpg)
an offer
multiple synchronisable data streams
largest public domain ECG trace
post-hoc analysis
simulate real use
please use it!
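For anyone taking up the offer, one way to synchronise two of the streams is to attach each sensor sample to its nearest location fix in time. The sketch below assumes timestamps are plain numbers (seconds), that fixes are sorted by time, and that a `tolerance` is needed because location signals were sporadic. The function names and tuple shapes are hypothetical, not the format of the published data.

```python
from bisect import bisect_left


def nearest_index(times, t):
    """Index of the entry in the sorted list `times` that is closest to t."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Choose between the neighbours on either side of t (earlier one wins ties).
    return i if times[i] - t < t - times[i - 1] else i - 1


def align(samples, fixes, tolerance):
    """Pair each (time, value) sample with the nearest (time, lat, lon) fix.

    Returns (time, value, (lat, lon)) tuples, with None in place of the
    position when no fix lies within `tolerance` seconds of the sample.
    """
    fix_times = [fix[0] for fix in fixes]
    aligned = []
    for t, value in samples:
        position = None
        if fixes:
            fix = fixes[nearest_index(fix_times, t)]
            if abs(fix[0] - t) <= tolerance:
                position = fix[1:]
        aligned.append((t, value, position))
    return aligned
```

Samples left with `None` show where the streams cannot be reconciled without interpolation or manual annotation.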
![Page 41: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/41.jpg)
![Page 42: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/42.jpg)
Learning analytics
macro-analytics: university strategy, MOOCs
micro-analytics: individual course, student, resource
![Page 43: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/43.jpg)
time frames for learning analytics
days and hours: email, during lectures and labs, student meetings, gaps
week: preparing for teaching, exercises
months/mid-semester: reporting points, staff meetings, cohort/student progress
end of semester/term/year: exams, exam boards, course review
start of semester/term/year: preparing for new courses or re-runs, rollover!
years: new courses, professional development, appraisal, promotion
![Page 44: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/44.jpg)
![Page 45: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/45.jpg)
Open Data
everyone is doing it
Governments, Cities, local gov.
In C21 Data is Power
![Page 46: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/46.jpg)
why not an island?
![Page 47: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/47.jpg)
island data flows
Community
groups and individuals
rest of the world
other communities
[diagram: data flows numbered 1, 2, 3 and 4]
![Page 48: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/48.jpg)
island data flows
from community to world
Community
groups and individuals
rest of the world
1
• visibility and control
• identity and empowerment
• level of detail
• local knowledge
![Page 49: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/49.jpg)
island data flows
from world to community
Community
groups and individuals
rest of the world
2
• making the most of open data
• local decision making
• lobbying and negotiation
![Page 50: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/50.jpg)
island data flows
within the community
Community
groups and individuals
3
• gossip is not enough!
• sparse, dispersed population
• social cohesion and economic benefits
![Page 51: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/51.jpg)
island data flows
between communities
Community
groups and individuals
other communities
4
• sharing best practice
• brand presence
• interlinked data
![Page 52: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/52.jpg)
benefits to …
the community
• empowerment and control
• availability of information
• communication within and between communities
the world
• improved quality of data
• level of detail of data
• local knowledge and understanding
![Page 53: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/53.jpg)
![Page 54: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/54.jpg)
In Concert
Concert ephemera
• 1750–1800 Calendar of London Concerts
• 1815–1895 Concert Life in London
• 1894–1944 Concert Programme Exchange (BL)
External sources
• MusicBrainz
• MBz id as a connection into Linked Data, BBC, etc.
Authoritative sources (future)
• e.g. British Library BNB, Concert Programmes metadata
![Page 55: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/55.jpg)
![Page 56: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/56.jpg)
![Page 57: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/57.jpg)
concert database
classic digital humanities?
[diagram labels: original sources; selected sources; systematic sample; digitised sources; transcription & extraction (medium expertise); interpretation (high expertise); authoritative data; analysis & use (high expertise); academic publication; large digital archive (e.g. BBC); possibly create linkage]
![Page 58: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/58.jpg)
Barriers to progress
effort and expertise
authority and quality
digital acontextuality
openness
![Page 59: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/59.jpg)
Openness and Reward
Career development
Leverhulme & REF
Building the discipline?
![Page 60: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/60.jpg)
Re-envisioning the Digital Archive: Curation and Use
![Page 61: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/61.jpg)
big bang to incremental
digitised sources
authoritative data
academic publication
...
![Page 62: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/62.jpg)
big bang to incremental
problem focused augmentation
transform cost-benefit
digital archive
academic publications
...
partial enhancement & interpretation
![Page 63: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/63.jpg)
scenario-focused investigations
![Page 64: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/64.jpg)
=> reflection and requirements
digital symbiosis
suggestion and confirmation
provenance and authority
spreadsheet as user interface
semantics through interaction
![Page 65: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/65.jpg)
![Page 66: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/66.jpg)
themes and take-aways ...
data in context
heterogeneity and linking
value and values
ethics and empowerment
… and please use my data
![Page 67: Data matters-bournemouth-2015](https://reader036.fdocuments.net/reader036/viewer/2022070516/586fcf711a28aba24c8b8097/html5/thumbnails/67.jpg)