Managing sensitive data and authorship in Humanities and Social Sciences

ODIN conference, Cologne, October 2013

Louise Corti, Collections Development and Producer Support


Transcript of Managing sensitive data and authorship in Humanities and Social Sciences

Page 1: Managing sensitive  data and authorship in Humanities and  Social Sciences

Managing sensitive data and authorship in Humanities and Social Sciences

ODIN conference, Cologne, October 2013

Louise Corti, Collections Development and Producer Support

Page 2: Managing sensitive  data and authorship in Humanities and  Social Sciences

Overview

• Introducing the UK Data Service

• Our data portfolio and users

• Citation, impact measurement and DOIs

• Challenges for social science citation

Page 3: Managing sensitive  data and authorship in Humanities and  Social Sciences

The UK Data Archive

• Based at the University of Essex, since 1967

• 45 years of selecting, ingesting, curating and providing access to social science data

• Designated as a Place of Deposit by The National Archives

• Data and data support services for higher and further education for research, teaching and learning

• Recently attained the highest information security standard, ISO 27001

Page 4: Managing sensitive  data and authorship in Humanities and  Social Sciences

University of Essex

The Archive

Page 5: Managing sensitive  data and authorship in Humanities and  Social Sciences

SISTER DATA ARCHIVES

Council of European Social Science Data Archives (CESSDA)

ADA: Australian Social Science Data Archive

ICPSR (USA): Inter-University Consortium for Political and Social Research

Page 6: Managing sensitive  data and authorship in Humanities and  Social Sciences

What is the UK Data Service?

• Comprehensive data resource funded by the UK Economic and Social Research Council

• Single virtual point of access to a wide range of secondary data for social science research (Directed from Essex)

• Offer promotion, support, training and guidance

Page 7: Managing sensitive  data and authorship in Humanities and  Social Sciences

What does the UK Data Service do?

• Put together a collection of the most valuable data

• Preserve data for the long term for future research purposes

• Make the data and documentation available for reuse

• Provide data management advice for data creators

• Provide training and support for users of the service

• Bring together owners, producers and users

• Demonstrate impact through evidence of usage

• Easy access through website - ukdataservice.ac.uk

Page 8: Managing sensitive  data and authorship in Humanities and  Social Sciences

Who is our service for?

• Data for secondary analysis, research, policy making

• Teaching and learning

• Academic researchers and students

• Government analysts

• Charities and foundations

• Business consultants

• Independent research centres

• Think tanks

Page 9: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 10: Managing sensitive  data and authorship in Humanities and  Social Sciences

Our data portfolio

• Over 6,000 datasets in the collection

• 230 new datasets added each year

• Official agencies - mainly central government

• International statistical time series

• Individual academics' research grants

• Market research agencies

• Public records/historical sources

• Access to international data via links with other data archives worldwide

Page 11: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 12: Managing sensitive  data and authorship in Humanities and  Social Sciences

UK survey series

• High quality repeated cross-sectional surveys

• Individual or household level data

• Cover many topics including health, work, crime, social attitudes, family expenditure, living costs, housing etc.

• Labour Force Survey

• British Crime Survey

• Health Survey for England

• British Social Attitudes

• Annual Population Survey…

Page 13: Managing sensitive  data and authorship in Humanities and  Social Sciences

Cross-national surveys and macro databanks

• Eurobarometers

• European Social Survey

• European Values Survey

• International Social Survey Programme

• Time series data aggregated to country/region

• International governmental organisations (IMF, OECD, IEA, World Bank)

Page 14: Managing sensitive  data and authorship in Humanities and  Social Sciences

Longitudinal studies

• British Household Panel Survey and Understanding Society

• Understanding Society (2009-)

• English Longitudinal Study of Ageing

• Families and Children Study

• Growing Up in Scotland

• Longitudinal Study of Young People in England

Page 15: Managing sensitive  data and authorship in Humanities and  Social Sciences

UK census data

• 1971-2011 census data

• Baseline for other statistics

• Detailed combinations of characteristics

• Small geographies

• Census outputs

• Aggregate data

• Boundary data

• Flow data

• Microdata

Page 16: Managing sensitive  data and authorship in Humanities and  Social Sciences

Business data

• Collected through a wide range of surveys and administrative sources:

– productivity, innovation, workforce skills, earnings

– international trade, foreign direct investment

– research and development

– business demography

– industrial relations

Page 17: Managing sensitive  data and authorship in Humanities and  Social Sciences

Qualitative data

• Interviews, focus groups

• Essays, diaries, open-ended survey questions

• Observations, case notes etc.

• Family Life and Work Experience before 1918, Middle and Upper Class Families in the Early 20th Century, 1870-1977

• Gender Difference, Anxiety and the Fear of Crime, 1995

• Mothers Alone: Poverty and the Fatherless Family, 1955-1966

Page 18: Managing sensitive  data and authorship in Humanities and  Social Sciences

Usage of data

• Operate a spectrum of access

– Web download under End User Licence

– Permission only via Special Licence access

– ‘Approved researcher’ access via remote secure access

• End User Licence includes:

– Appropriate data usage

– Full citation of data and informing us of re-use

• Have always provided a citation format

• Over 22,000 registered users

• Approximately 60,000 downloads worldwide p.a.

• 3,000+ user support queries

Page 19: Managing sensitive  data and authorship in Humanities and  Social Sciences

Evidence of access and re-use

User access information

• Collect user information and ‘projects’ upon registration

• Collate data and documentation download statistics

• Users can share project information for others to see

• Report data access stats on demand

Usage information

• Email all users every 6 months after registration about activity

• Manually add all research output references to the data record

• Reporting rate of publications is poor!

• Prior to DOIs, have scanned citation literature for dataset mentions – very manual and unreliable, and datasets are poorly cited

Page 20: Managing sensitive  data and authorship in Humanities and  Social Sciences

Impactful case studies of use

• Identify and seek out case studies of re-use: research or teaching.

• Very successful!

• 125 case studies in our database

• Can help provide impact stories for data owners/producers and users

• And can inspire others!

• Some are harvested by ESRC for their website

• Often include ongoing work – no need to wait for publications

Page 21: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 22: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 23: Managing sensitive  data and authorship in Humanities and  Social Sciences

Our Persistent identifiers approach

• Our data collections are not digital objects

• Need to capture changes made to data

– Versioning data in a commonly understood manner

– Needed rule-based definition of a ‘significant’ change

• Integrate processes with digital preservation activities & work flows

• In 2011 we assigned DataCite DOIs for all of our collections

• Mint and update DOIs with our metadata management infrastructure

Page 24: Managing sensitive  data and authorship in Humanities and  Social Sciences

Recording significant change

• Approx. 15% of UKDA data collections are altered within the first year after first publication

• We have distinguished between major and minor changes to a data collection = high impact vs. low impact

• DOI allocated to a metadata instance of a data collection

– DOIs resolve to jump page pointing to all external instances

– New DOI = High Impact change, with explicit logging

• Provided access only to most up-to-date version of data

Page 25: Managing sensitive  data and authorship in Humanities and  Social Sciences

Major changes – high impact

• New variable added

• New labels/value codes added

• Weighting variables reconstructed

• Wrong data supplied (e.g., March not April)

• Mis-coded data (e.g., Don’t know/Refused confused)

• Change in format (file migration)

• Significant changes in documentation

• Change in access conditions
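The rule-based distinction between high- and low-impact changes could be sketched roughly as follows. This is only an illustration in Python: the change-type labels, the `Collection` record, the placeholder prefix `10.1234` and the `EXAMPLE-<study>-<version>` DOI pattern are all assumptions made for the example, not the Archive's actual implementation.

```python
from dataclasses import dataclass

# Change types treated as high impact, following the list on this slide.
# The label strings are hypothetical names chosen for this sketch.
HIGH_IMPACT_CHANGES = frozenset({
    "new_variable",
    "new_labels_or_value_codes",
    "weights_reconstructed",
    "wrong_data_supplied",
    "miscoded_data",
    "format_change",
    "significant_documentation_change",
    "access_conditions_change",
})


@dataclass(frozen=True)
class Collection:
    study_number: int  # hypothetical collection identifier
    version: int       # incremented on each high-impact change


def is_high_impact(change_types):
    """A change set is high impact if any single change in it is."""
    return any(change in HIGH_IMPACT_CHANGES for change in change_types)


def apply_change(collection, change_types):
    """Return the collection after a change; only high impact bumps the version."""
    if is_high_impact(change_types):
        return Collection(collection.study_number, collection.version + 1)
    return collection


def doi_for(collection, prefix="10.1234"):
    """Build a DOI string using a placeholder prefix and an assumed pattern."""
    return f"{prefix}/EXAMPLE-{collection.study_number}-{collection.version}"


original = Collection(study_number=5050, version=1)
after_minor = apply_change(original, ["typo_in_title"])
after_major = apply_change(original, ["new_variable"])
```

In this sketch a low-impact change keeps the existing DOI, while a high-impact change yields a new one, mirroring the "New DOI = High Impact change" rule.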

Page 26: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 27: Managing sensitive  data and authorship in Humanities and  Social Sciences
Page 28: Managing sensitive  data and authorship in Humanities and  Social Sciences

Raising awareness in the social sciences

• ESRC funding for short-term project on citation

• Advocacy for best practice in citing research data

• Audiences

– Professional organisations

– Academic publishers and journal editors

– Researchers and postgraduates

• Key activities

– Data citation principles for social sciences

– Personal communications

– Events with BL DataCite, JISC and wider PI community

– Outreach through Doctoral Training Centres

Page 29: Managing sensitive  data and authorship in Humanities and  Social Sciences

Making

Page 30: Managing sensitive  data and authorship in Humanities and  Social Sciences

Demonstrating impact with citation

• Assuming better use of DOIs…

• Starting to search for use of our DOIs – Google

• Automate this process and compile reports; promote

• Gather data citation statistics from Thomson Reuters Data Citation Index. We are one of the 20 early feeder repositories, but our own access is limited!

• Work with BL DataCite and ODIN to gain connectivity between identifiers & outputs – early adopters
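Automating the search for DOI mentions might look something like the sketch below. It assumes the texts to scan are already available as plain strings, uses a deliberately broad regular expression for DOIs, and takes a placeholder prefix (`10.1234`) rather than any real registrant prefix; the function names are hypothetical.

```python
import re
from collections import Counter

# Broad pattern for DOIs: "10." + registrant code + "/" + suffix.
# This is a heuristic; DOI suffixes are not strictly defined.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def extract_dois(text):
    """Return all DOI-like strings in text, lower-cased, trailing punctuation removed."""
    return [match.rstrip(".,;:)]").lower() for match in DOI_PATTERN.findall(text)]


def count_citations(documents, prefix):
    """Count how many documents mention each DOI under the given prefix."""
    counts = Counter()
    for document in documents:
        # Use a set so each document counts at most once per DOI.
        for doi in set(extract_dois(document)):
            if doi.startswith(prefix.lower() + "/"):
                counts[doi] += 1
    return counts


documents = [
    "See doi:10.1234/EXAMPLE-1-1.",
    "Data (http://dx.doi.org/10.1234/example-1-1) and 10.9999/other-2",
]
report = count_citations(documents, "10.1234")
```

DOIs are compared case-insensitively here, so the two spellings in the sample documents are counted as the same dataset.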

Page 31: Managing sensitive  data and authorship in Humanities and  Social Sciences

CHALLENGES FOR THE FUTURE

• Citing parts (fragments) of data collections

– single files

– subsets of quantitative data

– extracts of textual data

• ESRC project Digital Futures will enable extract level citation within a web-based browsing system

• Using rich highly structured XML metadata

• GUIDs for everything

Page 32: Managing sensitive  data and authorship in Humanities and  Social Sciences

UK QualiBank

Page 33: Managing sensitive  data and authorship in Humanities and  Social Sciences

Resolving citation objects

• Will enable extract level citation

• Citation object and citation format created on the fly – using GUIDs and URIs

• URI resolves directly to the data extract

• Some more sensitive collections will be closed, so cannot resolve to data

• As yet uncertain of relationship to our collection-level DOIs
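As a rough illustration of a citation object created on the fly, the Python sketch below derives a GUID from a document identifier and character offsets, builds a URI from it, and formats a citation string. Everything here is assumed for the example: the base URIs under `example.org`, the field names, the citation layout, and the choice of a deterministic name-based UUID (so that the same extract always receives the same identifier). None of it describes the actual system.

```python
import uuid
from dataclasses import dataclass

BASE_URI = "https://example.org/extract/"     # placeholder, not a real service
LANDING_URI = "https://example.org/landing/"  # placeholder for closed collections


@dataclass(frozen=True)
class CitationObject:
    guid: str
    creator: str
    collection_title: str
    document_id: str
    start: int        # character offset where the extract begins
    end: int          # character offset where the extract ends
    restricted: bool  # True for closed (sensitive) collections

    @property
    def uri(self):
        return BASE_URI + self.guid


def make_citation(creator, collection_title, document_id, start, end, restricted=False):
    """Create a citation object; the GUID is derived from the extract's location."""
    name = f"{document_id}#{start}-{end}"
    guid = str(uuid.uuid5(uuid.NAMESPACE_URL, name))
    return CitationObject(guid, creator, collection_title, document_id, start, end, restricted)


def resolution_target(citation):
    """Open extracts resolve to the data; closed ones only to a landing page."""
    if citation.restricted:
        return LANDING_URI + citation.document_id
    return citation.uri


def format_citation(citation):
    """Format a citation string in an assumed, illustrative layout."""
    return (f"{citation.creator}. {citation.collection_title} "
            f"[extract, characters {citation.start}-{citation.end}]. {citation.uri}")


open_extract = make_citation("Smith, J.", "Example Interviews", "int001", 120, 480)
closed_extract = make_citation("Smith, J.", "Example Interviews", "int002", 0, 50,
                               restricted=True)
```

The citation itself can always be generated, even for a closed collection; only the resolution target differs.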

Page 34: Managing sensitive  data and authorship in Humanities and  Social Sciences

CONTACT

UK Data Service
University of Essex
Wivenhoe Park
Colchester
Essex CO4 3SQ

T +44 (0)1206 872001
E [email protected]