Databrary Advisory Board Spring Meeting

April 7, 2014, NYU

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

Announcements
- Next Board Meeting: October 13, 2014 at NYU
- Wi-fi access: nyuguest
  - Guest ID: guest132
  - Password: keddlith
- Meeting will be recorded
- Push-to-talk microphones
- Reimbursement form and receipts to Lina Wictoren Roy
- Dinner following at Hundred Acres
  - 38 Macdougal Street, corner of Prince

Databrary PIs
- Karen Adolph, NYU
- David Millman, NYU
- Rick Gilmore, Penn State

Agency partners
- Jim Griffin, NICHD Science Officer
- Lisa Freund, NICHD Program Officer
- Laura Namy, NSF Director, DLS

Special Guests
- Lynn Liben: Distinguished Professor of Psychology, Penn State; President of the Society for Research in Child Development
- Daphne Maurer: Distinguished University Professor of Psychology, Neuroscience & Behaviour, McMaster University; President of the International Society on Infant Studies
- Amanda Woodward: William S. Gray Professor of Psychology, University of Chicago; President of the Cognitive Development Society

Special thanks to NYU
- David Ackerman: IT support, storage, hiring
- Nancy Daneau: Authorization
- Alison Dewhurst and Marti Dunne: Participant release, IRB
- Pamela Morris: IHDSC administrative support
- Eric Rasmussen and Mark Righter: Authorization, university agreements, legal

Very special thanks
- Richard Louth

Databrary team
- Dylan Simon
- Jesse Lingeman
- John Franchak
- Vicky Foo
- Jon Coe
- Mike Continues
- Andrea Byrne
- Lisa Steiger
- Lina Wictoren Roy

Board member updates
Welcome to new advisors
- Daniel Messinger, University of Miami (analytic tools)
- Brian Nosek, University of Virginia (COS)
- Tal Yarkoni, University of Texas (neurosynth.org)
Farewell to departing member
- Sarah Morrow, PSU

Publications
Submitted
- International Journal of Digital Libraries
In progress
- WIRE's Cognitive Science (invited paper)
- Best practices in behavioral coding
- Call to arms
- Top 10 reasons to share
- Top 10 concerns about sharing

Presentations
- NFAIS/CENDI/Library of Congress (Nov 12, 2013)
- CNI 2013 Fall Meeting (Dec 9-10, 2013)
- Spatial DataMine Workshop (Feb 7-9, 2014)
- 12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

Datavyu
- Released stable version of Datavyu 1.1
- Published user guide
- Best practices in behavioral coding forthcoming

Seeding the repository: Datasets acquired
- Adolph
- Tamis-LeMonda
- Karasik
- Gordon
- LoBue

Forthcoming datasets
- Messinger lab
- Gernsbacher lab
- Maher longitudinal study
- CHILDES videos
- Gesell archives
  - In partnership with MIAP, Howard Besser

Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
- Collecting and storing video is costly
- Open up videos to be repurposed for new questions

Key aims of the Databrary project
- Build Databrary repository for sharing video
- Create appropriate policy environment
- Provide data management tools
- Enhance Datavyu tool for scoring video
- Transform the culture of developmental science

Roadmap
[Diagram labels: View & Use Videos, Videos to Share, Upload]

Data dump
- Datasets are browsable but not searchable at session level

Structured datasets and studies
- Assign appropriate metadata to each session

Videos are identifiable
- So each video (session) must have a release level
- Dataset dump not possible because metadata tags for each session are required

Roadmap
[Diagram labels, added over successive builds: Videos to Share, Upload, Upload After the Fact, Upload As You Go, Data Collection, Coding, Analyses, Writing, Paper in Press, Ask Ss to Share, Datavyu, IRB, Grants & Contracts, View & Use Videos]

Upload After the Fact
- Incentives: preservation; data easily available to collaborators; mensch
- Disincentives: time, effort, $

Upload As You Go
- Incentives: preservation; "lab server"; easy access for students & collaborators; data organization; coding; transcoding, syncing, and splitting videos
- Disincentives: change current practices

IRB, Ask Ss to Share
- Incentives: use of excerpts
- Disincentives: none

Grants & Contracts
- Incentives: full access
- Disincentives: none

Timeline
[Diagram: the roadmap elements (Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share, View & Use Videos) placed on a timeline. Dates shown across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search; Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public
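
The release levels can be read as an ordered scale. The sketch below is one way to express them in Python; it is not Databrary's implementation. The `Viewer` roles, the function names, and the choice to treat "Public" as also permitting excerpts are assumptions made for illustration.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # "Did not ask": undocumented, so it functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts may be shown
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles; these names are not from the slides."""
    ANYONE = 0
    AUTHORIZED_INVESTIGATOR = 1
    OWNER_OR_EDITOR = 2


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see the full session."""
    if viewer is Viewer.OWNER_OR_EDITOR:
        return True
    if release is Release.PUBLIC:
        return True
    if viewer is Viewer.AUTHORIZED_INVESTIGATOR:
        return release in (Release.SHARED, Release.SHARED_EXCERPTS)
    return False


def excerpts_allowed(release: Release) -> bool:
    """Assumption: excerpts may be used at SHARED_EXCERPTS and at PUBLIC."""
    return release >= Release.SHARED_EXCERPTS
```

Under this reading, a session whose release was never documented is handled exactly like a private one: only owners and editors see it.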

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after, when it's clear what was recorded

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

Video example
databrary.org/user-guide/getting-started/release-script/example-videos.html

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

[Diagram: nested access levels labeled Everybody, Registered Users, Authorized Users, Collaborators, Lab; "Interested Parties" also appears as a label. Capabilities shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data]

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

Registration demo

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

Break

Datavyu current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

Databrary site

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

Walkthrough of live site

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
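
To make the discovery steps concrete, here is a minimal sketch of searching, filtering, sorting, and counting over session records. The records, field names (`description`, `age_months`, `gender`), and function names are invented for illustration and do not reflect Databrary's actual schema or API.

```python
from collections import Counter
from typing import Any, Dict, List

# Made-up session records, for illustration only.
SESSIONS: List[Dict[str, Any]] = [
    {"id": "s1", "description": "infant crawling down slopes", "age_months": 11, "gender": "F"},
    {"id": "s2", "description": "toddler walking over bridges", "age_months": 14, "gender": "M"},
    {"id": "s3", "description": "infant crawling at home", "age_months": 9, "gender": "M"},
]


def search(sessions: List[Dict[str, Any]], keyword: str) -> List[Dict[str, Any]]:
    """Searching: keep sessions whose text description contains the keyword."""
    kw = keyword.lower()
    return [s for s in sessions if kw in s["description"].lower()]


def filter_by_age(sessions: List[Dict[str, Any]], lo: int, hi: int) -> List[Dict[str, Any]]:
    """Filtering: limit results to an inclusive age range in months."""
    return [s for s in sessions if lo <= s["age_months"] <= hi]


def sort_by(sessions: List[Dict[str, Any]], key: str) -> List[Dict[str, Any]]:
    """Sorting: order results by one field for visual scanning."""
    return sorted(sessions, key=lambda s: s[key])


def count_by(sessions: List[Dict[str, Any]], key: str) -> Dict[Any, int]:
    """Aggregating: count sessions per value of one field."""
    return dict(Counter(s[key] for s in sessions))


hits = sort_by(filter_by_age(search(SESSIONS, "crawling"), 8, 12), "age_months")
hit_ids = [s["id"] for s in hits]
```

Each step takes and returns a plain list, so the steps can be chained in any order, which matches the slide's framing of search, filter, and sort as separate controls.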

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: upload as you go
- Build browser interface that also makes sense for entering and editing data

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Example session (April 7, 2014)
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location
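
A session with these metadata fields could be sketched as a small record type. This is an illustrative data structure only; the field names, defaults, the `in_target_sample` helper, and the example values ("shared", "subject-01", "lab") are assumptions, not Databrary's schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Session:
    """One session: the basic unit, covered by a single release level."""
    release: str = "not asked"             # one of the five release levels
    session_date: Optional[date] = None    # identifiable; may be unknown
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)  # raw data files on the timeline

    def in_target_sample(self) -> bool:
        """Hypothetical helper: neither a pilot nor an excluded session."""
        return not self.pilot and self.excluded_reason is None


# The example session from the slide; release, participant ID, and location are made up.
example = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=["subject-01"],
    location="lab",
    files=[
        "intake.mp4",
        "baby-cam.mp4",
        "coding-spreadsheet.opf",
        "overhead-cam.mp4",
        "analysis.sav",
        "survey.doc",
    ],
)
```

Keeping pilot and excluded sessions as flagged records, instead of dropping them, fits the deck's point that such sessions should still be shareable.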

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

Pilot
Indicates that the methods used in the session were not finalized or standard

Excluded
Indicates that session was not usable for the target study
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
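
One way to picture the layering idea: a study refers to sessions that live in a dataset and adds its own analysis files on top, without copying or changing the raw data. The classes and names below are hypothetical and only sketch that relationship.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session ID -> raw files."""
    name: str
    sessions: Dict[str, List[str]]


@dataclass
class Study:
    """Includes sessions from a dataset and layers analysis files over them."""
    title: str
    session_ids: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        if session_id not in self.session_ids:
            raise KeyError(session_id)
        # Concatenation builds a new list, so the dataset itself is untouched.
        return dataset.sessions[session_id] + self.layers.get(session_id, [])


raw = Dataset("example-dataset", {"s1": ["baby-cam.mp4"], "s2": ["overhead-cam.mp4"]})
study = Study(
    "example-study",
    session_ids=["s1"],
    layers={"s1": ["coding-spreadsheet.opf"]},
)
```

Because the study holds only references and layers, a second study can repurpose the same sessions with different coding files.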

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Example session (April 7, 2014): intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
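
As a rough illustration of merging conceptually identical tags, the sketch below groups tags that share a crude stem, so "crawl", "crawling", and "crawlers" land together. The suffix list is a deliberately naive stand-in chosen for this example; a real system would more likely rely on a proper stemmer or curated synonym lists.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Naive suffix list, checked in order; an assumption for this sketch only.
SUFFIXES = ("ers", "ing", "er", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group the original tags under their normalized form."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)


merged = merge_tags(["crawl", "Crawling", "crawlers", "walks"])
```

The same `normalize` step could back a search box: normalizing the query before matching is what lets a search for "crawl" also find sessions tagged "crawling".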

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

Break

Quick SWOT analysis
- Strengths
- Weaknesses
- Opportunities
- Threats

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

Threats
- Open science space moving quickly
- Attracting venture capital

Data-sharing

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

Your turn
- More threats?
- Different spin on them?

Opportunities
Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

Opportunities
Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

Your thoughts
- Partnerships
  - Journals
  - Conferences
- Sharpen our message
- Other ideas

Opportunities
Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

Thank you

130

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

2

AnnouncementsNext Board Meeting October 13 2014 at NYUWi-fi access nyuguest

Guest ID guest132Password keddlith

Meeting will be recordedPush-to-talk microphonesReimbursement form and receipts to Lina Wictoren Roy Dinner following at Hundred Acres

38 Macdougal Street corner of Prince

3

Databrary PIs

Karen AdolphNYU

David MillmanNYU

Rick GilmorePenn State

4

Agency partnersJim Griffin NICHD Science OfficerLisa Freund NICHD Program OfficerLaura Namy NSF Director DLS

5

Special GuestsLynn Liben

Distinguished Professor of Psychology Penn State President of the Society for Research in Child Development

Daphne Maurer Distinguished University Professor of Psychology Neuroscience amp Behaviour McMaster University President of the International Society on Infant Studies

Amanda Woodward William S Gray Professor of Psychology University of Chicago President of the Cognitive Development Society

6

Special thanks to NYUDavid Ackerman IT support storage hiring

Nancy Daneau Authorization

Alison Dewhurst and Marti Dunne Participant release IRB

Pamela Morris IHDSC administrative support

Eric Rasmussen and Mark Righter Authorization university agreements legal

7

Very special thanksRichard Louth

8

Databrary team

Dylan Simon Jesse Lingeman John FranchakVicky FooJon CoeMike Continues

Andrea ByrneLisa Steiger Lina Wictoren Roy

9

Board member updatesWelcome to new advisors

Daniel Messinger University of Miami analytic toolsBrian Nosek University of Virginia COSTal Yarkoni University of Texas neurosynthorg

Farewell to departing memberSarah Morrow PSU

10

PublicationsSubmitted

International Journal of Digital LibrariesIn progress

WIRErsquos Cognitive Science (invited paper)Best practices in behavioral codingCall to armsTop 10 reasons to shareTop 10 concerns about sharing

11

PresentationsNFAISCENDILibrary of Congress (Nov 12 2013)CNI 2013 Fall Meeting (Dec 9-10 2013)Spatial DataMine Workshop (Feb 7-9 2014)12+ presentations to psychologists pediatric physicaloccupational therapists movement scientists

12

DatavyuReleased stable version of Datavyu 11Published user guideBest practices in behavioral coding forthcoming

13

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

Roadmap
[Diagram, built up over slides 19-21; labels: Videos to Share, Upload, View & Use Videos]

Data dump: datasets are browsable but not searchable at session level
Structured datasets and studies: assign appropriate metadata to each session

20

Videos are identifiable, so each video (session) must have a release level
Dataset dump is not possible because metadata tags for each session are required

21

Roadmap
[Diagram, built up over slides 22-26; labels: Videos to Share, Upload, Upload After the Fact, Upload As You Go]

Upload After the Fact
Incentives: preservation, data easily available to collaborators, mensch
Disincentives: time, effort, $

26

Roadmap
[Diagram, built up over slides 27-30; labels added: Ask Ss to Share, Data Collection, Coding/Analyses/Writing, Paper in Press, Datavyu]

Upload As You Go
Incentives: preservation, "lab server," easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: change current practices

30

Roadmap
[Diagram, built up over slides 31-34; labels added: IRB, Grants & Contracts, View & Use Videos]

IRB, Ask Ss to Share
Incentives: use of excerpts
Disincentives: none

32

Grants & Contracts
Incentives: full access
Disincentives: none

34

Timeline
[Diagram, built up over slides 35-41; labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding/Analyses/Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share, View & Use Videos]
Dates shown, by slide:
35: 2018-04
36: 2017-07
37: 2017-07, 2014-04
38: 2017-07, 2015-04, 2014-04
39: 2017-07, 2014-04, 2016-04
40: 2017-07, 2014-09, 2015-09
41: 2017-07, 2014-09, 2016-09

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public
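The release levels above form an ordered scale from most to least restrictive, which can be sketched in code. This is a minimal illustration only: the names `ReleaseLevel`, `UserTier`, `can_view`, and `excerpts_may_be_shown` are hypothetical and are not taken from the Databrary software, and the three viewer tiers are a simplifying assumption.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class UserTier(IntEnum):
    """Hypothetical viewer categories (an assumption for illustration)."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(release: ReleaseLevel, viewer: UserTier) -> bool:
    """Return True if a viewer in the given tier may see the full session."""
    if viewer == UserTier.OWNER:
        return True
    if viewer == UserTier.AUTHORIZED:
        return release >= ReleaseLevel.SHARED
    return release == ReleaseLevel.PUBLIC


def excerpts_may_be_shown(release: ReleaseLevel) -> bool:
    """Excerpts may be used for talks or teaching only at SHARED_EXCERPTS or above."""
    return release >= ReleaseLevel.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest value means a session with no documented release is handled exactly like a private one, which matches the slide text.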

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: what is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Diagram of access tiers; labels: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties]
Access publicly shared data
Access and share Databrary data
Upload data
Access released data selectively shared
Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65


Break

67


Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
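The search, filter, sort, and summarize steps listed above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Databrary's search implementation: `SessionRecord`, `search_sessions`, and `count_by` are hypothetical names, and the flat record with an age in months is invented for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class SessionRecord:
    """Hypothetical flat view of one session, for illustration only."""
    session_id: str
    description: str
    age_months: float
    gender: str


def search_sessions(
    sessions: Iterable[SessionRecord],
    keyword: Optional[str] = None,
    min_age: Optional[float] = None,
    max_age: Optional[float] = None,
    gender: Optional[str] = None,
) -> List[SessionRecord]:
    """Search by keyword, filter by age and gender, then sort by age."""
    results = []
    for s in sessions:
        if keyword and keyword.lower() not in s.description.lower():
            continue  # keyword search over the text description
        if min_age is not None and s.age_months < min_age:
            continue  # numeric filter: lower bound
        if max_age is not None and s.age_months > max_age:
            continue  # numeric filter: upper bound
        if gender and s.gender != gender:
            continue  # categorical filter
        results.append(s)
    return sorted(results, key=lambda s: s.age_months)


def count_by(sessions: Iterable[SessionRecord], attribute: str) -> Dict[str, int]:
    """Aggregate: count sessions per value of one attribute."""
    return dict(Counter(getattr(s, attribute) for s in sessions))
```

A call such as `search_sessions(sessions, keyword="crawling", max_age=12)` would return matching sessions ordered youngest first, and `count_by(sessions, "gender")` gives the kind of count a summary view might show.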

77

User stories: teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: upload as you go
Build browser interface that also makes sense for entering and editing data

82

User stories: upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

83

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

84

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

[Example session timeline: Session April 7, 2014, with files intake.mp4, baby-cam.mp4, overhead-cam.mp4, coding-spreadsheet.opf, analysis.sav, survey.doc]

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
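The session metadata described on these slides can be pictured as a small record structure. The sketch below is only an illustration under assumptions: the class and field names (`Participant`, `Session`, `excluded_reason`, `age_in_days`) are hypothetical and do not reflect the actual Databrary schema, and only a few of the listed fields are included.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: files collected at the same time, plus metadata."""
    release_level: str = "not asked"       # functions as private until documented
    session_date: Optional[date] = None    # may be unknown for archival footage
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # None means the session was usable
    participants: List[Participant] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None          # e.g. lab, home, classroom
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """True if an exclusion reason has been recorded."""
        return self.excluded_reason is not None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days
```

Keeping the birthdate on the participant and the date on the session lets age be derived per session, so the same participant can appear in several sessions of a longitudinal dataset.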

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
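One way to picture the dataset/study relationship is a study that references sessions in a dataset and keeps its own analysis files alongside, leaving the raw files untouched. The sketch below assumes this reading of "layering"; `Dataset`, `Study`, `include`, `layer`, and `files_for` are hypothetical names, not Databrary's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, and papers layered over dataset sessions."""
    name: str
    included: List[Tuple[Dataset, str]] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        """Reference a session from a dataset without copying its files."""
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id} is not in dataset {dataset.name}")
        self.included.append((dataset, session_id))

    def layer(self, session_id: str, filename: str) -> None:
        """Attach an analysis file to an included session."""
        self.layers.setdefault(session_id, []).append(filename)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        for dataset, sid in self.included:
            if sid == session_id:
                return dataset.sessions[sid] + self.layers.get(sid, [])
        raise KeyError(f"{session_id} is not included in study {self.name}")
```

Because the study only references sessions, two studies can repurpose the same raw video with different coding files, and the dataset's own record stays unchanged.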

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

[Example session timeline: Session April 7, 2014, with files intake.mp4, baby-cam.mp4, overhead-cam.mp4, coding-spreadsheet.opf, analysis.sav, survey.doc]

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: how much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (Google "crawl" finds "crawling," "crawlers," etc.)
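The three options above (suggest existing tags, merge identical tags, search loosely) can each be sketched with very simple string handling. This is a rough illustration, not a proposal for the real system: `crude_stem` is a deliberately naive suffix stripper written for this example (a production system would more likely use an established stemmer or a search engine), and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Suffixes tried in order, longest first; purely illustrative.
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def crude_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower().strip()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def suggest(prefix: str, existing_tags: Iterable[str]) -> List[str]:
    """Pop-up suggestions: existing tags that start with what was typed."""
    p = prefix.lower()
    return sorted({t.lower() for t in existing_tags if t.lower().startswith(p)})


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Curator clean-up: group tags that share a stem."""
    groups = defaultdict(set)
    for tag in tags:
        groups[crude_stem(tag)].add(tag.lower().strip())
    return {stem: sorted(members) for stem, members in groups.items()}


def loose_search(query: str, tags: Iterable[str]) -> List[str]:
    """Leave the mess as-is: match tags whose stem equals the query's stem."""
    target = crude_stem(query)
    return sorted(t for t in tags if crude_stem(t) == target)
```

With this sketch, a query for "crawl" matches both "crawling" and "crawlers", which is the behavior the slide describes.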

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107


Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128


Thank you

130

Board member updates
Welcome to new advisors
- Daniel Messinger, University of Miami: analytic tools
- Brian Nosek, University of Virginia: COS
- Tal Yarkoni, University of Texas: neurosynth.org
Farewell to departing member
- Sarah Morrow, PSU

10

Publications
Submitted
- International Journal of Digital Libraries
In progress
- WIREs Cognitive Science (invited paper)
- Best practices in behavioral coding
- Call to arms
- Top 10 reasons to share
- Top 10 concerns about sharing

11

PresentationsNFAISCENDILibrary of Congress (Nov 12 2013)CNI 2013 Fall Meeting (Dec 9-10 2013)Spatial DataMine Workshop (Feb 7-9 2014)12+ presentations to psychologists pediatric physicaloccupational therapists movement scientists

12

DatavyuReleased stable version of Datavyu 11Published user guideBest practices in behavioral coding forthcoming

13

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

View amp UseVideos

Videos toShare

Roadmap

19

View amp UseVideos

Videos toShareUpload

Roadmap

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

Structured datasets and studiesAssign appropriate metadata to each session

20

View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

[Roadmap diagram. Elements: IRB; Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Incentives: Full access. Disincentives: None.]

34

[Timeline diagram, built up across slides 35-41: the same roadmap elements with dates added on successive builds. Dates shown: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04.]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
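As a rough sketch of how the release levels above could be expressed in software, the ordering can be captured with an ordered enumeration. The names `ReleaseLevel`, `Viewer`, and `can_view` are assumptions made for this illustration and are not taken from the Databrary code base; excerpt handling is left out.

```python
from enum import Enum, IntEnum
from typing import Optional


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    PRIVATE = 0
    SHARED = 1
    SHARED_EXCERPTS = 2
    PUBLIC = 3


class Viewer(Enum):
    """Hypothetical viewer categories, assumed for this example."""
    OWNER_OR_EDITOR = "owner_or_editor"
    AUTHORIZED_INVESTIGATOR = "authorized_investigator"
    PUBLIC = "public"


def effective_level(asked: Optional[ReleaseLevel]) -> ReleaseLevel:
    """'Did not ask' is modeled as None and functions as private."""
    return ReleaseLevel.PRIVATE if asked is None else asked


def can_view(asked: Optional[ReleaseLevel], viewer: Viewer) -> bool:
    """Return True if this viewer may see the full session video."""
    level = effective_level(asked)
    if viewer is Viewer.OWNER_OR_EDITOR:
        return True
    if viewer is Viewer.AUTHORIZED_INVESTIGATOR:
        return level >= ReleaseLevel.SHARED
    # Members of the public see full videos only at the public level.
    return level == ReleaseLevel.PUBLIC
```

Modeling "did not ask" as a missing value rather than as a fifth level keeps the default conservative: anything undocumented is treated as private.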

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Access levels diagram. User groups: Everybody, Registered Users, Authorized Users, Interested Parties, Collaborators, Lab. Permissions shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
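The discovery operations listed above (searching, filtering, sorting, and aggregated counts) can be sketched over a small in-memory list of sessions. This is only an illustration under assumed names: `SessionRecord` and its fields are hypothetical and do not describe the Databrary search interface.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class SessionRecord:
    """Hypothetical flattened view of one session, for illustration only."""
    name: str
    description: str
    age_months: float
    gender: str


def search(records: Iterable[SessionRecord], keyword: str) -> List[SessionRecord]:
    """Searching: case-insensitive keyword match on name and description."""
    kw = keyword.lower()
    return [r for r in records
            if kw in r.name.lower() or kw in r.description.lower()]


def filter_by(records: Iterable[SessionRecord],
              min_age: Optional[float] = None,
              max_age: Optional[float] = None,
              gender: Optional[str] = None) -> List[SessionRecord]:
    """Filtering: limit results by age range and/or gender."""
    out = []
    for r in records:
        if min_age is not None and r.age_months < min_age:
            continue
        if max_age is not None and r.age_months > max_age:
            continue
        if gender is not None and r.gender != gender:
            continue
        out.append(r)
    return out


def sort_by_age(records: Iterable[SessionRecord]) -> List[SessionRecord]:
    """Sorting: order results by age for visual scanning."""
    return sorted(records, key=lambda r: r.age_months)


def count_by_gender(records: Iterable[SessionRecord]) -> Dict[str, int]:
    """Aggregation: simple counts per gender."""
    return dict(Counter(r.gender for r in records))
```

Because each function takes and returns plain lists, the steps compose: a keyword search can be filtered by age and then sorted before display.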

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

[Diagram: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
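The session metadata described on the preceding slides could be modeled roughly as below. This is a minimal sketch, not the Databrary schema: every class and field name is an assumption chosen to mirror the slide wording, only a subset of the listed variables is included, and "not asked" and an unknown date are both represented as `None`.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Task:
    """A task, activity, or phase during all or part of a session."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    """Where session data were collected."""
    setting: str  # e.g. "lab", "home", "classroom"
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    """The basic unit: files recorded together, annotated with metadata."""
    release_level: Optional[str] = None    # None models "not asked"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session is included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason has been recorded."""
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, when both dates are known."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days
```

Storing the exclusion reason instead of a bare flag keeps excluded sessions in the record along with why they were set aside.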

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
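One way to picture the layering idea is that a study keeps references to sessions in a dataset and adds its own files on top, leaving the raw data untouched. The sketch below assumes hypothetical `Dataset` and `Study` classes and made-up file names; it illustrates the concept and is not how Databrary stores provenance.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class Dataset:
    """Raw data: maps a session id to the raw files collected in it."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data layered over sessions drawn from datasets."""
    name: str
    # (dataset name, session id) -> (raw file list, layered file list)
    entries: Dict[Tuple[str, str], Tuple[List[str], List[str]]] = field(
        default_factory=dict)

    def include(self, dataset: Dataset, session_id: str,
                layer_files: Sequence[str] = ()) -> None:
        """Reference a dataset session and layer study files over it."""
        raw = dataset.sessions[session_id]  # KeyError if the session is unknown
        self.entries[(dataset.name, session_id)] = (raw, list(layer_files))

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files followed by this study's layered files."""
        raw, layers = self.entries[(dataset.name, session_id)]
        return list(raw) + list(layers)
```

Two studies can include the same session with different layers, which is the re-purposing case named on the slide.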

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

[Diagram: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
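To make the tag-merging and "crawl" finds "crawling" ideas concrete, here is a deliberately crude sketch based on suffix stripping. The function names and the suffix list are assumptions for this example; a real system would more likely rely on an established stemmer or a curated vocabulary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Suffixes tried in order; longer ones first so "crawlers" loses "ers".
_SUFFIXES = ("ers", "ing", "er", "s")


def crude_stem(word: str) -> str:
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    w = word.lower().strip()
    for suffix in _SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that share a stem, as a curator might when cleaning up."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[crude_stem(tag)].append(tag)
    return dict(groups)


def search_tags(query: str, tags: Iterable[str]) -> List[str]:
    """Return every tag whose stem matches the stem of the query."""
    stem = crude_stem(query)
    return [t for t in tags if crude_stem(t) == stem]
```

The same stem function serves both approaches on the slide: curators can use `merge_tags` to find candidates to merge, while `search_tags` leaves the stored tags as-is and matches at query time. Suffix stripping this simple will miss cases such as doubled consonants ("sitting").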

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
- Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities
Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities
Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130


Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data
Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+)
Bring Datavyu into Databrary web framework
Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
Piled up into studies
Minimal organization
"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements
Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface
Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
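
The discovery steps named on the slide above (searching, filtering, sorting, aggregated views) can be pictured with a small sketch. It is only an illustration over made-up records: the field names (`description`, `age_months`, `gender`) and the helpers `find_sessions` and `count_by` are assumptions for this example, not Databrary's schema or API.

```python
from collections import Counter

# Hypothetical session records; field names and values are illustrative only.
SESSIONS = [
    {"id": "s1", "description": "infant crawling on slope", "age_months": 11, "gender": "F"},
    {"id": "s2", "description": "toddler walking with toy", "age_months": 18, "gender": "M"},
    {"id": "s3", "description": "infant crawling at home", "age_months": 9, "gender": "M"},
]


def find_sessions(sessions, keyword=None, min_age=None, max_age=None, sort_by="age_months"):
    """Search by keyword, filter by age range, then sort the matches."""
    results = []
    for s in sessions:
        if keyword is not None and keyword.lower() not in s["description"].lower():
            continue
        if min_age is not None and s["age_months"] < min_age:
            continue
        if max_age is not None and s["age_months"] > max_age:
            continue
        results.append(s)
    return sorted(results, key=lambda s: s[sort_by])


def count_by(sessions, field):
    """Aggregated view: number of sessions per value of one field."""
    return Counter(s[field] for s in sessions)
```

For instance, `find_sessions(SESSIONS, keyword="crawling")` keeps only the two crawling sessions and orders them by age, and `count_by(SESSIONS, "gender")` gives a simple count per gender.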

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87
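
The session described on the two slides above (a unit covered by one release level, holding raw files and annotated with metadata) could be modeled roughly as follows. This is a minimal sketch, not Databrary's data model: the class name `Session`, its field names, and the participant ID `P001` are assumptions made for illustration; the file names come from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Session:
    """One data-collection session, the basic unit on the slides (illustrative only)."""
    release_level: str                        # e.g. "Private", "Shared"
    session_date: Optional[date] = None       # may be unknown for archival data
    pilot: bool = False
    excluded_reason: Optional[str] = None     # None means the session was not excluded
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)  # raw data files on the session timeline

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


# Example built from the file names shown on the slide; "P001" is a made-up ID.
example = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=["P001"],
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
```

Keeping the exclusion reason as an optional field, rather than a separate flag, is one way to make "excluded" and its reason impossible to get out of step.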

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
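
The dataset/study relationship on the slide above (studies reuse sessions from datasets and "layer" their own files over them) can be sketched as below. The classes `Dataset` and `Study`, the method `files_for`, and the dataset and study names are hypothetical; only the idea of layering comes from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions (illustrative structure)."""
    name: str
    raw_files: Dict[str, List[str]] = field(default_factory=dict)  # session id -> raw files


@dataclass
class Study:
    """Generated data layered over sessions borrowed from a dataset."""
    title: str
    session_ids: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)     # session id -> layered files

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        if session_id not in self.session_ids:
            raise KeyError(f"{session_id} is not part of study {self.title!r}")
        return dataset.raw_files.get(session_id, []) + self.layers.get(session_id, [])


# Made-up example: one dataset with two sessions, one study that reuses only "s1".
dataset = Dataset("locomotion", {"s1": ["baby-cam.mp4"], "s2": ["overhead-cam.mp4"]})
study = Study("reanalysis", ["s1"], {"s1": ["coding-spreadsheet.opf", "analysis.sav"]})
```

Because the study only references session IDs, the raw files stay in the dataset untouched, and several studies could layer different analysis files over the same session.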

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
Studies, no datasets
Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
Subgroups can do this (emo-group)
Not standardizing codes
Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
Is this enough? Too much? Are there others?
Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
Should we restrict these? Add more?

102
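
The split discussed on the slide above (a fixed set of metadata types, with user-extensible variables inside each type) might look like this in code. The set `FIXED_TYPES` lists the five types named in the deck; the function `make_record` and the sample values are assumptions for illustration only.

```python
# The metadata types named on the slides; treated here as a closed set.
FIXED_TYPES = {"participant", "task", "condition", "group", "location"}


def make_record(metadata_type, **variables):
    """Build a metadata record: the type is checked, the variables are free-form."""
    if metadata_type not in FIXED_TYPES:
        raise ValueError(f"unknown metadata type: {metadata_type!r}")
    return {"type": metadata_type, "variables": dict(variables)}


# Made-up values: any variable name is accepted inside a known type.
record = make_record("participant", birthdate="2013-05-01", language="English")
```

Restricting the variables as well would amount to replacing the free-form `**variables` with a per-type list of allowed names, which is the trade-off the slide asks the board to weigh.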

Benefits and costs of standardization
More powerful searching
Video or session level searches
Aggregating across datasets
But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)

104
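
To make the tag clean-up and "crawl finds crawling" ideas above concrete, here is a deliberately naive sketch that groups tags by a crude stem. It is not how Databrary or any search engine does it; the suffix list and the helpers `naive_stem` and `merge_tags` are assumptions, and a real system would more likely use an established stemmer or curated synonym lists.

```python
from collections import defaultdict

# Suffixes checked in this order, so "ers" is tried before "er" and "s".
SUFFIXES = ("ers", "ing", "er", "s")


def naive_stem(tag):
    """Strip one common English suffix, keeping at least three letters."""
    word = tag.lower().strip()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a stem, preserving the order they were seen in."""
    merged = defaultdict(list)
    for tag in tags:
        merged[naive_stem(tag)].append(tag)
    return dict(merged)
```

With this rule, "crawl", "crawling", and "crawlers" fall into one group, but words with doubled consonants (for example "running") are stemmed poorly, which is one reason curator review is listed as a separate option.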

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
How is this done now?
How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
Cloud storage/data sharing
Offer new services via APIs
Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
Visualization
Additional analysis streams

128

Thank you

130

Board member updates
Welcome to new advisors
Daniel Messinger, University of Miami, analytic tools
Brian Nosek, University of Virginia, COS
Tal Yarkoni, University of Texas, neurosynth.org
Farewell to departing member
Sarah Morrow, PSU

10

Publications
Submitted
International Journal of Digital Libraries
In progress
WIREs Cognitive Science (invited paper)
Best practices in behavioral coding
Call to arms
Top 10 reasons to share
Top 10 concerns about sharing

11

Presentations
NFAIS, CENDI, Library of Congress (Nov 12, 2013)
CNI 2013 Fall Meeting (Dec 9-10, 2013)
Spatial DataMine Workshop (Feb 7-9, 2014)
12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

12

Datavyu
Released stable version of Datavyu 1.1
Published user guide
Best practices in behavioral coding forthcoming

13

Seeding the repository
Datasets acquired
Adolph
Tamis-LeMonda
Karasik
Gordon
LoBue

14

Forthcoming datasets
Messinger lab
Gernsbacher lab
Maher longitudinal study
CHILDES videos
Gesell archives
In partnership with MIAP, Howard Besser

15

Why Databrary exists
Challenge in developmental psychology
Most developmental researchers collect video data
Not exploiting the richness in video
Good data going to waste
Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
Diagram labels: View & Use Videos; Videos to Share

19

Roadmap
Diagram labels: View & Use Videos; Videos to Share; Upload
Data dump: Datasets are browsable but not searchable at session level
Structured datasets and studies: Assign appropriate metadata to each session

20

Roadmap
Diagram labels: View & Use Videos; Videos to Share; Upload
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
Diagram labels: Videos to Share; Upload

22-23

Roadmap
Diagram labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload

24-25

Roadmap
Diagram labels: Upload As You Go; Videos to Share; Upload; Upload After the Fact
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
Diagram labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload

27

Roadmap
Diagram labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

28

Roadmap
Diagram labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Datavyu; Upload As You Go; Upload After the Fact; Upload; Videos to Share

29

Diagram labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
Diagram labels: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

31

Roadmap
Diagram labels: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
Diagram labels: Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

33

Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Incentives: Full access
Disincentives: None

34

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Date label: 2018-04

35

Timeline (same diagram)
Date label: 2017-07

36

Timeline (same diagram)
Date labels: 2017-07, 2014-04

37

Timeline (same diagram)
Date labels: 2017-07, 2015-04, 2014-04

38

Timeline (same diagram)
Date labels: 2017-07, 2014-04, 2016-04

39

Timeline (same diagram)
Date labels: 2017-07, 2014-09, 2015-09

40

Timeline (same diagram)
Date labels: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public

47
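
The five release levels on the slide above form an ordered scale, which can be sketched as an enumeration with a simple access check. This is an illustration of the rules as stated on the slide, not Databrary's implementation: the names `ReleaseLevel`, `Viewer`, `can_view_full_session`, and `excerpts_may_be_shown` are hypothetical, and treating Public as also permitting excerpts is an assumption.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    NOT_ASKED = 0        # "Did not ask": functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    PUBLIC = 0       # anyone
    AUTHORIZED = 1   # authorized Databrary investigator
    OWNER = 2        # data owner or editor


def can_view_full_session(level, viewer):
    """Return True if this kind of viewer may see the full session."""
    if viewer is Viewer.OWNER:
        return True
    if level in (ReleaseLevel.NOT_ASKED, ReleaseLevel.PRIVATE):
        return False
    if level is ReleaseLevel.PUBLIC:
        return True
    return viewer is Viewer.AUTHORIZED


def excerpts_may_be_shown(level):
    """Assumption: excerpts are allowed at SHARED_EXCERPTS and at PUBLIC."""
    return level >= ReleaseLevel.SHARED_EXCERPTS
```

Using an ordered enumeration keeps the "at least this permissive" comparison to a single `>=`, which matches how the slide presents the levels as a progression from most to least restricted.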

What data are being released
Videos from a session (or part of a session)
Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88
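The five release levels above can be sketched as an ordered enumeration with a simple access check. This is an illustration under stated assumptions, not Databrary's implementation: it assumes "Not asked" is treated like "Private", both "Shared" levels are limited to authorized investigators, and "Public" is open to everyone. The Viewer roles and the can_view function are hypothetical names.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Session release levels, ordered from most to least restricted."""
    NOT_ASKED = 0
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Viewer(IntEnum):
    """Hypothetical viewer roles, for illustration only."""
    ANYONE = 0       # general public
    AUTHORIZED = 1   # authorized investigator
    OWNER = 2        # data owner or editor


def can_view(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if this viewer may see a session at this release level."""
    if viewer == Viewer.OWNER:
        return True
    if level == ReleaseLevel.PUBLIC:
        return True
    if level in (ReleaseLevel.SHARED, ReleaseLevel.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    # NOT_ASKED and PRIVATE: owners and editors only (assumption)
    return False
```

The sketch covers viewing whole sessions only; the extra permission that "Shared + excerpts" grants for showing clips is not modeled here.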

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically developing
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
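One way to picture the dataset/study split described above: datasets own the raw session files, and each study references sessions and layers its own analysis files on top without altering the raw data. All class and method names below are hypothetical, a sketch rather than the actual Databrary schema.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    raw_files: list  # raw data files collected in this session


@dataclass
class Dataset:
    name: str
    sessions: dict = field(default_factory=dict)  # session_id -> Session


@dataclass
class Study:
    title: str
    included: list = field(default_factory=list)  # (dataset name, session_id)
    layers: dict = field(default_factory=dict)    # same key -> layered files

    def include(self, dataset, session_id, layer_files=()):
        """Reference a dataset session and layer study-specific files over it."""
        if session_id not in dataset.sessions:
            raise KeyError(session_id)
        key = (dataset.name, session_id)
        if key not in self.included:
            self.included.append(key)
        self.layers.setdefault(key, []).extend(layer_files)

    def files_for(self, dataset, session_id):
        """Raw files from the dataset plus this study's layered files."""
        key = (dataset.name, session_id)
        raw = dataset.sessions[session_id].raw_files
        return list(raw) + list(self.layers.get(key, []))


# Two studies re-purpose the same session with different layered files.
dataset = Dataset("locomotion")
dataset.sessions["s1"] = Session("s1", ["baby-cam.mp4", "overhead-cam.mp4"])

original = Study("Original paper")
original.include(dataset, "s1", ["coding-spreadsheet.opf"])

reuse = Study("Reanalysis")
reuse.include(dataset, "s1", ["analysis.sav"])
```

Keeping layers inside the study rather than the session is the design choice that lets several studies reuse one session independently.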

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One to one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
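A rough sketch of how a query for "crawl" could also find "crawling" and "crawlers", and how conceptually identical tags could be grouped for curators. It uses naive suffix stripping as a stand-in for a real stemmer or semantic search; the function names are hypothetical and the rule will mis-handle many English words.

```python
from collections import defaultdict


def normalize_tag(tag: str) -> str:
    """Lower-case a tag and strip one common suffix (naive stemming)."""
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that normalize to the same stem (curator clean-up aid)."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return {stem: sorted(members) for stem, members in groups.items()}


def search_tags(query, tags):
    """Return every tag whose stem matches the query's stem."""
    stem = normalize_tag(query)
    return sorted(t for t in tags if normalize_tag(t) == stem)


# Made-up example tags.
tags = ["crawling", "crawlers", "Crawl", "walking"]
matches = search_tags("crawl", tags)
```

The same normalization could drive pop-up suggestions: when a user types a new tag, existing tags with the same stem are offered first.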

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated
- Weaknesses we neglected or overstated

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats
- Different spin on them

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
  - Offer new services via APIs
    - Lab management
    - Project/task management
    - Data visualization
    - Social media, communications
    - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128


Thank you

130

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
How to compete/cooperate with commercial entities?

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities: Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities: Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  • Developmental science only
  • Psychology
  • Neuroscience

122

Opportunities: Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities: Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities: Integration with external services
Cloud storage/data sharing
Offer new services via APIs
Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities: Beyond storing, tagging video
Visualization
Additional analysis streams

128

Thank you

130

Board member updates
Welcome to new advisors
Daniel Messinger, University of Miami, analytic tools
Brian Nosek, University of Virginia, COS
Tal Yarkoni, University of Texas, neurosynth.org
Farewell to departing member
Sarah Morrow, PSU

10

Publications
Submitted
International Journal of Digital Libraries
In progress
WIREs Cognitive Science (invited paper)
Best practices in behavioral coding
Call to arms
Top 10 reasons to share
Top 10 concerns about sharing

11

Presentations
NFAIS/CENDI/Library of Congress (Nov 12, 2013)
CNI 2013 Fall Meeting (Dec 9-10, 2013)
Spatial DataMine Workshop (Feb 7-9, 2014)
12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

12

Datavyu
Released stable version of Datavyu 1.1
Published user guide
Best practices in behavioral coding forthcoming

13

Seeding the repository: Datasets acquired
Adolph
Tamis-LeMonda
Karasik
Gordon
LoBue

14

Forthcoming datasets
Messinger lab
Gernsbacher lab
Maher longitudinal study
CHILDES videos
Gesell archives
In partnership with MIAP, Howard Besser

15

Why Databrary exists: Challenge in developmental psychology
Most developmental researchers collect video data
Not exploiting the richness in video
Good data going to waste
Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap [diagram: View & Use Videos; Videos to Share]

19

Roadmap [diagram: View & Use Videos; Videos to Share; Upload]
Data dump
Datasets are browsable but not searchable at session level
Structured datasets and studies
Assign appropriate metadata to each session

20

Roadmap [diagram: View & Use Videos; Videos to Share; Upload]
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap [diagram: Videos to Share; Upload]

23

Roadmap [diagram: Upload After the Fact; Upload As You Go; Videos to Share; Upload]

25

Roadmap [diagram: Upload As You Go; Videos to Share; Upload; Upload After the Fact]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap [diagram: Upload After the Fact; Upload As You Go; Videos to Share; Upload]

27

Roadmap [diagram: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]

28

Roadmap [diagram: same elements as slide 28, plus Datavyu]

29

[Diagram: Upload After the Fact; Videos to Share; Upload; Upload As You Go; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap [diagram: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB]

31

Roadmap [diagram: same elements as slide 31]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap [diagram: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Grants & Contracts; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]

33

[Diagram: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB; Upload As You Go; Upload; Upload After the Fact; Grants & Contracts; Videos to Share; View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline [diagram, repeated on slides 35-41: Upload As You Go; Upload; Videos to Share; Upload After the Fact; View & Use Videos; Grants & Contracts; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB; Ask Ss to Share]
Dates shown on each slide:
35: 2018-04
36: 2017-07
37: 2017-07, 2014-04
38: 2017-07, 2015-04, 2014-04
39: 2017-07, 2014-04, 2016-04
40: 2017-07, 2014-09, 2015-09
41: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
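One way to read the release levels is as an ordered scale that gates who may see a session. The sketch below assumes that reading; the `Viewer` categories and the two helper functions are hypothetical names introduced for illustration, not Databrary's actual access-control code.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, and excerpts may be shown
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories, assumed for this sketch."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view_full(release: Release, viewer: Viewer) -> bool:
    """Decide whether a viewer may see the full session recording."""
    if viewer is Viewer.OWNER:
        return True
    if viewer is Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release is Release.PUBLIC


def excerpts_allowed(release: Release) -> bool:
    """Excerpts may be used only at 'Shared + excerpts' or 'Public'."""
    return release >= Release.SHARED_EXCERPTS


print(can_view_full(Release.NOT_ASKED, Viewer.AUTHORIZED))  # False
print(excerpts_allowed(Release.SHARED))                     # False
```

Treating "Did not ask" as the lowest value makes the safe behavior the default: a session with no documented decision is never exposed beyond its owners.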

47

What data are being released
Videos from a session (or part of a session)
Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
Consent to participate (in the study) before
Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
Tested on our own participants
Adaptable and flexible
Only local IRB must approve
Can collect decisions of all depicted individuals on one page
Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example
databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes
Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
No requirement to contribute
Institutional sign-off by Authorizing Official
Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
Regardless of authorization
Four pre-selected options
Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Diagram: access levels]
User groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Capabilities: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data
Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
Piled up into studies
Minimal organization
"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements
Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface
Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
Searching by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
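The discovery steps listed above (search, filter, sort, summarize) compose naturally as small functions over session records. The sketch below is an illustration only: the `SessionRecord` fields and the sample rows are made up for the example and do not describe real Databrary data or its API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    # Hypothetical fields chosen for illustration only.
    dataset: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Keyword search over free-text descriptions (case-insensitive)."""
    kw = keyword.lower()
    return [r for r in records if kw in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Limit results to an inclusive age range."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Order results for visual scanning, youngest first."""
    return sorted(records, key=lambda r: r.age_months)


def count_by(records, field_name):
    """Aggregate view: how many sessions per value of one field."""
    return Counter(getattr(r, field_name) for r in records)


# Made-up sample rows spanning two datasets.
records = [
    SessionRecord("A", "Infant crawling down slopes", 11.0, "F"),
    SessionRecord("A", "Infant walking over bridges", 14.5, "M"),
    SessionRecord("B", "Free play, crawling and cruising", 9.5, "F"),
]

hits = sort_by_age(filter_by_age(search(records, "crawling"), 9, 12))
print([r.description for r in hits])
print(count_by(records, "gender"))
```

Because each step takes and returns a plain list, the same functions work within one dataset or across several, which is the "within and across datasets" requirement on the slide.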

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)
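Two of these user stories, keeping track of which data were entered and exporting entered data for analysis, are simple enough to sketch. The code assumes session metadata is held as plain dictionaries with an `id` key; that representation and the function names are assumptions made for this example.

```python
import csv
import io


def export_csv(sessions: list[dict], columns: list[str]) -> str:
    """Export entered session metadata as CSV text, one row per session."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for s in sessions:
        # Fields not yet entered are exported as empty cells.
        writer.writerow({c: s.get(c, "") for c in columns})
    return buf.getvalue()


def missing_fields(sessions: list[dict], required: list[str]) -> dict[str, list[str]]:
    """Map each session id to the required fields that are still empty."""
    gaps: dict[str, list[str]] = {}
    for s in sessions:
        absent = [f for f in required if not s.get(f)]
        if absent:
            gaps[s["id"]] = absent
    return gaps


# Made-up sessions: the second one has no condition entered yet.
sessions = [
    {"id": "s01", "date": "2014-04-07", "condition": "slope"},
    {"id": "s02", "date": "2014-04-08"},
]

print(export_csv(sessions, ["id", "date", "condition"]))
print(missing_fields(sessions, ["date", "condition"]))  # {'s02': ['condition']}
```

The CSV output opens directly in Excel or SPSS, which matches the export habits labs already have.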

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc
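The session described here can be pictured as a small record: one release level, one collection date, and a set of files placed on a shared timeline. The sketch below assumes that picture; the class names, the `offset_s` field, and the offsets in the example are hypothetical and only the file names come from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class DataFile:
    name: str
    # Hypothetical: when the file starts, in seconds from the session start.
    offset_s: float = 0.0


@dataclass
class Session:
    collected_on: Optional[date]  # the date may be unknown for archival data
    release: str                  # one release level covers the whole session
    files: list[DataFile] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)

    def timeline(self) -> list[DataFile]:
        """Files ordered by where they start on the session timeline."""
        return sorted(self.files, key=lambda f: f.offset_s)


# File names follow the slide; the offsets are made up for illustration.
session = Session(
    collected_on=date(2014, 4, 7),
    release="shared",
    files=[
        DataFile("baby-cam.mp4", 300.0),
        DataFile("intake.mp4", 0.0),
        DataFile("overhead-cam.mp4", 300.0),
    ],
    metadata={"location": "lab"},
)

print([f.name for f in session.timeline()])
```

Keeping the release level on the session rather than on each file reflects the slide's point that everything collected at the same time is covered by one decision.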

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected
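The five metadata types above (participant, task, condition, group, location) share a shape: a fixed category plus a bag of variables. A minimal sketch of that shape follows, assuming the categories are fixed while the variables inside each record are open-ended; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Fixed metadata categories, taken from the slides.
CATEGORIES = {"participant", "task", "condition", "group", "location"}


@dataclass
class Record:
    category: str
    # Variables within a category are open-ended (user-extensible).
    variables: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown metadata category: {self.category!r}")


@dataclass
class SessionMetadata:
    release: str
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session is included
    records: list[Record] = field(default_factory=list)

    def of(self, category: str) -> list[Record]:
        """All records of one category attached to this session."""
        return [r for r in self.records if r.category == category]


meta = SessionMetadata(
    release="shared",
    records=[
        Record("participant", {"subject_id": "P01", "gender": "F"}),
        Record("location", {"setting": "lab", "country": "US"}),
    ],
)

print(len(meta.of("participant")))  # 1
```

Rejecting unknown categories while accepting any variable name is one concrete point on the continuum between enforced structure and flexibility.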

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
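The dataset/study split can be sketched as two structures: a dataset owns raw session files, and a study references sessions and layers its own analysis files on top without modifying the raw data. The class names, method names, and sample values below are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    # session id -> raw data files collected in that session
    sessions: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Study:
    title: str
    # (dataset, session id) pairs this study draws on
    included: list[tuple[Dataset, str]] = field(default_factory=list)
    # session id -> analysis files layered over the raw data
    layers: dict[str, list[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str, layer_files=()) -> None:
        """Reference a session from a dataset and attach study-specific files."""
        if session_id not in dataset.sessions:
            raise KeyError(session_id)
        self.included.append((dataset, session_id))
        self.layers.setdefault(session_id, []).extend(layer_files)

    def files_for(self, session_id: str) -> list[str]:
        """Raw files plus this study's layered files, as a new list."""
        for dataset, sid in self.included:
            if sid == session_id:
                return dataset.sessions[sid] + self.layers.get(sid, [])
        raise KeyError(session_id)


raw = Dataset("locomotion", {"s01": ["baby-cam.mp4", "overhead-cam.mp4"]})
study = Study("Paper A")
study.include(raw, "s01", ["coding-spreadsheet.opf", "analysis.sav"])

print(study.files_for("s01"))
print(raw.sessions["s01"])  # raw data are unchanged by the study
```

Because a second study could include the same session with different layer files, this arrangement supports re-purposing without duplicating or altering the raw recordings.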

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on
Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One to one mapping likely to be rare
Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
Studies, no datasets
Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
Subgroups can do this (emo-group)
Not standardizing codes
Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turnMore threatsDifferent spin on them

120

OpportunitiesPotential for partnering

TalkBankCHILDESOther sharing organizations

- Research Data Alliance (RDA)- Center for Open Science (COS)- Stanford Meta-Research Center

Societies- Report on developmental society meeting

121

OpportunitiesPotential for partnering

Journals- Store supplemental data- Store extended materials for all submissions- Which fields

bull Developmental science onlybull Psychologybull Neuroscience

122

OpportunitiesPotential for partnering

Conferences (both society-affiliated and not)- Store share talk recordings- Store supplemental data

Sustainability component- Member or attendee fees-for-service

123

OpportunitiesEmphasize competitive advantages

Domain-focus visibility- Your colleagues are here

Policies to share identifiable research data especially recordingsUniversity amp Federal supportStability and long-term sustainabilitySharing is our business model

124

Your thoughtsPartnershipsJournalsConferencesSharpen our messageOther ideas

125

OpportunitiesIntegration with external services

Cloud storagedata sharingOffer new services via APIs

Lab managementProjecttask managementData visualizationSocial media communicationsElectronic CV profile management

126

Your thoughtsDo you use any of these servicesIs providing data sharing services enoughDoes integration with other tools help or hinder our goals

127

OpportunitiesBeyond storing tagging video

VisualizationAdditional analysis streams

128

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

129

Thank you

130

Very special thanksRichard Louth

8

Databrary team

Dylan Simon Jesse Lingeman John FranchakVicky FooJon CoeMike Continues

Andrea ByrneLisa Steiger Lina Wictoren Roy

9

Board member updatesWelcome to new advisors

Daniel Messinger University of Miami analytic toolsBrian Nosek University of Virginia COSTal Yarkoni University of Texas neurosynthorg

Farewell to departing memberSarah Morrow PSU

10

PublicationsSubmitted

International Journal of Digital LibrariesIn progress

WIRErsquos Cognitive Science (invited paper)Best practices in behavioral codingCall to armsTop 10 reasons to shareTop 10 concerns about sharing

11

PresentationsNFAISCENDILibrary of Congress (Nov 12 2013)CNI 2013 Fall Meeting (Dec 9-10 2013)Spatial DataMine Workshop (Feb 7-9 2014)12+ presentations to psychologists pediatric physicaloccupational therapists movement scientists

12

DatavyuReleased stable version of Datavyu 11Published user guideBest practices in behavioral coding forthcoming

13

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

Roadmap

[Diagram labels: Upload, Videos to Share, View & Use Videos]

Data dump: Datasets are browsable but not searchable at session level

Structured datasets and studies: Assign appropriate metadata to each session

20

Roadmap

[Diagram labels: Upload, Videos to Share, View & Use Videos]

Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap

[Diagram labels: Upload After the Fact, Upload As You Go, Upload, Videos to Share]

Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap

[Diagram labels: Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Datavyu, Upload As You Go, Upload After the Fact, Upload, Videos to Share]

29

[Diagram labels: Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share]

Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap

[Diagram labels: IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share]

Incentives: Use of excerpts
Disincentives: None

32

Roadmap

[Diagram labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share, View & Use Videos]

Incentives: Full access
Disincentives: None

34

Timeline

[Timeline diagram labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload, Videos to Share, View & Use Videos]

Dates shown on successive builds of the timeline:
Slide 35: 2018-04
Slide 36: 2017-07
Slide 37: 2017-07, 2014-04
Slide 38: 2017-07, 2015-04, 2014-04
Slide 39: 2017-07, 2014-04, 2016-04
Slide 40: 2017-07, 2014-09, 2015-09
Slide 41: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
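
The release levels above can be read as an ordered scale. The following is a minimal sketch under that reading, not Databrary's implementation: the `Viewer` roles and the functions `can_view` and `can_show_excerpt` are hypothetical names introduced only for illustration.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles (an assumption, not the real account model)."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if the viewer may see a full session at this release level."""
    if viewer is Viewer.OWNER:
        return True
    if viewer is Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release is Release.PUBLIC


def can_show_excerpt(release: Release) -> bool:
    """Excerpts may be shown to outside audiences only at these levels."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "Did not ask" as the lowest value makes an undocumented session behave exactly like a private one, which matches the slide's wording.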

47

What data are being released
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Diagram of user groups and what each can do]

User groups: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties

Actions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

Break

67

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
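
As a rough illustration of the searching, filtering, and sorting steps listed above, here is a small sketch over in-memory records. The record fields (`age_months`, `gender`, `description`) and the `search` function are assumptions made for the example; they do not describe the Databrary API.

```python
from collections import Counter
from typing import Optional

# Hypothetical session records; field names are illustrative only.
sessions = [
    {"id": "s1", "age_months": 12, "gender": "F", "description": "infant crawling on slope"},
    {"id": "s2", "age_months": 18, "gender": "M", "description": "toddler walking at home"},
    {"id": "s3", "age_months": 13, "gender": "F", "description": "free play with mother"},
]


def search(records, keyword: Optional[str] = None, min_age: Optional[int] = None,
           max_age: Optional[int] = None, gender: Optional[str] = None,
           sort_key: str = "age_months"):
    """Keyword search, then filter by age and gender, then sort the results."""
    results = []
    for r in records:
        if keyword and keyword.lower() not in r["description"].lower():
            continue
        if min_age is not None and r["age_months"] < min_age:
            continue
        if max_age is not None and r["age_months"] > max_age:
            continue
        if gender and r["gender"] != gender:
            continue
        results.append(r)
    return sorted(results, key=lambda r: r[sort_key])


def summarize(records, field: str = "gender") -> Counter:
    """Aggregate view: count records per value of one field."""
    return Counter(r[field] for r in records)
```

For example, `search(sessions, gender="F")` returns sessions `s1` and `s3` ordered by age, and `summarize(sessions)` gives a per-gender count of the kind a summary view might display.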

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
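
The session metadata described on the preceding slides could be modeled roughly as follows. This is a sketch under assumptions: the class and field names are invented for illustration and cover only a subset of the listed variables; they are not Databrary's schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None    # identifiable; may be withheld
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    release: str                         # e.g. "private", "shared", "public"
    session_date: Optional[date] = None  # may be unknown for archival footage
    pilot: bool = False
    excluded_reason: Optional[str] = None
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None        # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason has been recorded."""
        return self.excluded_reason is not None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, available only when both dates are known."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days
```

Keeping the session date and birthdate optional reflects the slides' points that dates are identifiable and may be unknown; a derived age can then be computed only when both are present.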

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
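
One way to picture the dataset and study relationship described above is a study that references dataset sessions by ID and adds its own files on top, leaving the raw data untouched. The classes and the `files_for` method below are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: maps a session ID to the raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: references dataset sessions and layers files over them."""
    title: str
    included: List[str] = field(default_factory=list)            # session IDs
    layers: Dict[str, List[str]] = field(default_factory=dict)   # session ID -> analysis files

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset plus this study's layered files."""
        if session_id not in self.included:
            raise KeyError(f"{session_id} is not part of study {self.title!r}")
        return dataset.sessions[session_id] + self.layers.get(session_id, [])
```

Because the study only holds references and extra files, two studies can reuse the same session with different coding spreadsheets, which is the repurposing case the slide has in mind.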

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
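
To make the tag clean-up idea concrete, here is a small sketch that groups tags sharing a normalized form. It uses naive suffix stripping as a stand-in for a real stemmer or semantic search, so it is only an approximation of the behavior the slide describes; `normalize` and `merge_tags` are hypothetical helpers.

```python
import re
from collections import defaultdict

# Naive suffix list, checked in order; a real system would use a proper stemmer.
_SUFFIXES = ("ing", "ers", "er", "ed", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag, drop non-letters, and strip one common English suffix."""
    word = re.sub(r"[^a-z]", "", tag.lower())
    for suffix in _SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group raw tags that share a normalized form."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize(tag)].add(tag)
    return dict(groups)
```

With this sketch, "crawl", "Crawling", and "crawlers" fall into one group, so a curator (or a pop-up suggestion) could offer a single canonical tag. Suffix stripping will mis-handle many words, which is why the slide's curator review step would still matter.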

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

PresentationsNFAISCENDILibrary of Congress (Nov 12 2013)CNI 2013 Fall Meeting (Dec 9-10 2013)Spatial DataMine Workshop (Feb 7-9 2014)12+ presentations to psychologists pediatric physicaloccupational therapists movement scientists

12

DatavyuReleased stable version of Datavyu 11Published user guideBest practices in behavioral coding forthcoming

13

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

View amp UseVideos

Videos toShare

Roadmap

19

View amp UseVideos

Videos toShareUpload

Roadmap

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

Structured datasets and studiesAssign appropriate metadata to each session

20

View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public
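
A minimal sketch of how the five release levels could gate viewing. The three viewer roles (owner, authorized, anyone) and both function names are assumptions made for illustration; this is not Databrary's implementation.

```python
from enum import Enum


class Release(Enum):
    NOT_ASKED = "did not ask"        # undocumented: functions as private
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


# Hypothetical viewer roles, named here only for illustration.
OWNER, AUTHORIZED, ANYONE = "owner", "authorized", "anyone"


def can_view(release: Release, viewer: str) -> bool:
    """Return True if a viewer in the given role may see the full video."""
    if viewer == OWNER:
        return True  # data owners and editors can always see their own data
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer == AUTHORIZED
    return True  # Release.PUBLIC


def excerpts_allowed(release: Release) -> bool:
    """Assumption: excerpts are usable at 'shared + excerpts' and 'public'."""
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

Treating "did not ask" exactly like "private" mirrors the rule on the slide that undocumented sessions function as private.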

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Figure: access levels. User groups shown: Everybody; Registered users; Authorized users; Collaborators; Lab; Interested parties. Permissions shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
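
The search, filter, and sort steps listed on this slide can be sketched in a few lines. The record type and its field names (session_id, age_months, gender, description) are hypothetical, chosen only to make the example run; the actual Databrary search design was still open at this point.

```python
from dataclasses import dataclass


@dataclass
class SessionRow:
    # Hypothetical flattened record; field names are illustrative only.
    session_id: str
    age_months: float
    gender: str
    description: str


def search(rows, keyword=None, min_age=None, max_age=None, gender=None):
    """Keyword search plus age/gender filters, sorted by age for scanning."""
    hits = []
    for row in rows:
        if keyword and keyword.lower() not in row.description.lower():
            continue
        if min_age is not None and row.age_months < min_age:
            continue
        if max_age is not None and row.age_months > max_age:
            continue
        if gender and row.gender != gender:
            continue
        hits.append(row)
    return sorted(hits, key=lambda r: r.age_months)


rows = [
    SessionRow("s1", 14.0, "F", "walking on slopes"),
    SessionRow("s2", 11.5, "M", "crawling over bridges"),
    SessionRow("s3", 12.0, "F", "crawling on slopes"),
]
print([r.session_id for r in search(rows, keyword="slopes", gender="F")])
# prints ['s3', 's1']
```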

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

[Figure: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, overhead-cam.mp4, coding-spreadsheet.opf, analysis.sav, survey.doc]

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
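
The session and its metadata types described on the preceding slides could be modeled roughly as below. This is a sketch under stated assumptions: the class and field names are invented for illustration, only a few participant variables are included, and the file names come from the example session with their extension dots restored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Session:
    release: str = "not asked"           # one of the five release levels
    session_date: Optional[date] = None  # may be unknown for archival data
    pilot: bool = False
    excluded_reason: Optional[str] = None
    participants: list = field(default_factory=list)
    conditions: list = field(default_factory=list)
    tasks: list = field(default_factory=list)
    groups: list = field(default_factory=list)
    location: Optional[str] = None
    files: list = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason is recorded."""
        return self.excluded_reason is not None


s = Session(release="shared", session_date=date(2014, 4, 7),
            participants=[Participant("S01", gender="F")],
            files=["intake.mp4", "baby-cam.mp4"])
```

Defaulting `release` to "not asked" keeps an undocumented session at the most restrictive setting until a participant decision is entered.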

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
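
One way to picture a study layering files over dataset sessions, as a hedged sketch: `Dataset`, `Study`, `files_for`, and the example names are all hypothetical, and the raw files stay owned by the dataset while the study only adds to the view.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    sessions: dict = field(default_factory=dict)  # session id -> raw files


@dataclass
class Study:
    name: str
    included: list = field(default_factory=list)  # (dataset, session id) pairs
    layers: dict = field(default_factory=dict)    # session id -> added files

    def files_for(self, dataset: Dataset, session_id: str) -> list:
        """Raw files from the dataset plus this study's layered files."""
        return dataset.sessions[session_id] + self.layers.get(session_id, [])


ds = Dataset("locomotion", {"s1": ["overhead-cam.mp4"]})
study = Study("slopes-recoded", included=[(ds, "s1")],
              layers={"s1": ["coding-spreadsheet.opf"]})
print(study.files_for(ds, "s1"))
# prints ['overhead-cam.mp4', 'coding-spreadsheet.opf']
```

Because the study never modifies the dataset's file list, a second study can reuse the same session with a different layer.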

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

[Figure: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, overhead-cam.mp4, coding-spreadsheet.opf, analysis.sav, survey.doc]

98

Datasets and Studies
One to one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
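
As a rough illustration of merging conceptually identical tags, the sketch below groups tags by a crude stem. Simple suffix stripping stands in for real semantic search here, and the suffix list and function names are assumptions made for the example.

```python
SUFFIXES = ("ers", "ing", "er", "s")  # checked in this order, longest first


def stem(tag: str) -> str:
    """Very rough stemmer: lowercase and strip one common suffix."""
    word = tag.lower().strip()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a stem, as a curator might merge them."""
    groups = {}
    for tag in tags:
        groups.setdefault(stem(tag), []).append(tag)
    return groups


print(merge_tags(["crawl", "Crawling", "crawlers", "walking"]))
# prints {'crawl': ['crawl', 'Crawling', 'crawlers'], 'walk': ['walking']}
```

The same grouping could also drive pop-up suggestions: when a user types a new tag, offer existing tags that share its stem.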

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

Board member updates
Welcome to new advisors
- Daniel Messinger, University of Miami, analytic tools
- Brian Nosek, University of Virginia, COS
- Tal Yarkoni, University of Texas, neurosynth.org
Farewell to departing member
- Sarah Morrow, PSU

10

Publications
Submitted
- International Journal of Digital Libraries
In progress
- WIREs Cognitive Science (invited paper)
- Best practices in behavioral coding
- Call to arms
- Top 10 reasons to share
- Top 10 concerns about sharing

11

Presentations
NFAIS/CENDI/Library of Congress (Nov 12, 2013)
CNI 2013 Fall Meeting (Dec 9-10, 2013)
Spatial DataMine Workshop (Feb 7-9, 2014)
12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

12

Datavyu
Released stable version of Datavyu 1.1
Published user guide
Best practices in behavioral coding forthcoming

13

Seeding the repository: Datasets acquired
- Adolph
- Tamis-LeMonda
- Karasik
- Gordon
- LoBue

14

Forthcoming datasets
Messinger lab
Gernsbacher lab
Maher longitudinal study
CHILDES videos
Gesell archives
- In partnership with MIAP, Howard Besser

15

Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Figure, built up over slides 19-20: Upload; Videos to Share; View & Use Videos]
Data dump: datasets are browsable but not searchable at session level
Structured datasets and studies: assign appropriate metadata to each session

20

Roadmap
[Figure: Upload; Videos to Share; View & Use Videos]
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Figure, built up over slides 22-25: Upload; Videos to Share; Upload As You Go; Upload After the Fact]

25

Roadmap
[Figure: Upload; Videos to Share; Upload As You Go; Upload After the Fact]
Incentives: preservation, data easily available to collaborators, mensch
Disincentives: time, effort, $

26

Roadmap
[Figure, built up over slides 27-29: Upload; Videos to Share; Upload As You Go; Upload After the Fact; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Datavyu]

29

[Figure: Upload; Videos to Share; Upload As You Go; Upload After the Fact; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]
Incentives: preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: change current practices

30

Roadmap
[Figure: same stages as the previous slide, with IRB added]

31

Roadmap
[Figure: same stages, including IRB and Ask Ss to Share]
Incentives: use of excerpts
Disincentives: none

32

Roadmap
[Figure: same stages, with Grants & Contracts added]

33

[Figure: same stages, with Grants & Contracts and View & Use Videos]
Incentives: full access
Disincentives: none

34

Timeline
[Figure, built up over slides 35-37: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos. Date markers across builds: 2018-04, 2017-07.]

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67


Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
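The discovery steps on this slide (searching, filtering, sorting, aggregating) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Databrary implementation: the record fields (`age_months`, `gender`, `description`) and the sample sessions are hypothetical.

```python
from collections import Counter

# Hypothetical session records; the field names are assumptions for illustration.
SESSIONS = [
    {"id": 1, "age_months": 12, "gender": "F", "description": "infant crawling over bridge"},
    {"id": 2, "age_months": 18, "gender": "M", "description": "toddler walking down slope"},
    {"id": 3, "age_months": 11, "gender": "F", "description": "crawling on slope"},
]

def search(sessions, keyword):
    """Searching: keep sessions whose description contains the keyword."""
    kw = keyword.lower()
    return [s for s in sessions if kw in s["description"].lower()]

def filter_by_age(sessions, min_months, max_months):
    """Filtering: limit results to an inclusive age range in months."""
    return [s for s in sessions if min_months <= s["age_months"] <= max_months]

def sort_by(sessions, field):
    """Sorting: order results by one field for visual scanning."""
    return sorted(sessions, key=lambda s: s[field])

def count_by(sessions, field):
    """Aggregating: count sessions per value of a field."""
    return Counter(s[field] for s in sessions)

# Chain the steps: keyword search, then age filter, then sort youngest first.
hits = sort_by(filter_by_age(search(SESSIONS, "crawl"), 10, 14), "age_months")
hit_ids = [s["id"] for s in hits]
gender_counts = dict(count_by(SESSIONS, "gender"))
print(hit_ids)
print(gender_counts)
```

Keeping each step as a separate function means the same filters could, in principle, run within one dataset or across several.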

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically developing
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
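The session metadata described on these slides can be pictured as a small record type. The sketch below is a hedged illustration only: the class and field names are assumptions chosen to mirror the slide headings, not Databrary's schema, and only a few participant variables are shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None

@dataclass
class Session:
    """One data-collection session: raw files plus descriptive metadata."""
    release_level: str                      # e.g. "shared"
    session_date: Optional[date] = None     # may be unknown for archival footage
    pilot: bool = False                     # methods not finalized or standard
    excluded_reason: Optional[str] = None   # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)

    @property
    def usable(self) -> bool:
        """True unless the session was excluded from the target study."""
        return self.excluded_reason is None

session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("S01", birthdate=date(2013, 4, 1), gender="F")],
    tasks=["free play"],
    location="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
excluded = Session(release_level="private", excluded_reason="Outlier")
print(session.usable, excluded.usable)
```

Storing the exclusion reason on the session, rather than dropping the session, keeps excluded data available for later repurposing.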

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
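One way to picture the dataset/study split is that a study only references sessions and adds its own files on top, leaving the raw dataset untouched. The following is a minimal sketch under that reading; the `Dataset` and `Study` classes, their methods, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Dataset:
    """Raw data: session id mapped to the raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)

@dataclass
class Study:
    """Generated data: references dataset sessions and layers files over them."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def add_layer(self, session_id: str, filename: str) -> None:
        """Attach an analysis file to a session without touching the raw data."""
        if session_id not in self.dataset.sessions:
            raise KeyError(f"unknown session: {session_id}")
        if session_id not in self.included:
            self.included.append(session_id)
        self.layers.setdefault(session_id, []).append(filename)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files first, then this study's layered files (a new list)."""
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])

raw = Dataset("walking-dataset", {"s1": ["overhead-cam.mp4"], "s2": ["baby-cam.mp4"]})
original = Study("original-study", raw)
original.add_layer("s1", "coding-spreadsheet.opf")
reuse = Study("repurposed-study", raw)
reuse.add_layer("s1", "analysis.sav")

print(original.files_for("s1"))
print(reuse.files_for("s1"))
print(raw.sessions["s1"])
```

Because each study keeps its own layers, two studies can repurpose the same session without seeing or altering each other's analysis files.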

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
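The "crawl" finds "crawling" and "crawlers" idea can be approximated by reducing tags to a shared stem before matching. The sketch below uses a deliberately naive suffix-stripping rule, which is an assumption for illustration; it is not a real stemming algorithm and not what any particular search engine does.

```python
from collections import defaultdict

# Naive suffix list, longer suffixes first; an assumption, not a real stemmer.
SUFFIXES = ("ers", "ing", "er", "ed", "s")

def stem(tag: str) -> str:
    """Lowercase a tag and strip at most one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Require at least three remaining letters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def merge_tags(tags):
    """Group raw tags under a shared stem, as a curator might after the fact."""
    groups = defaultdict(list)
    for tag in tags:
        groups[stem(tag)].append(tag)
    return dict(groups)

def search(tags, query):
    """Return every tag whose stem matches the stem of the query."""
    return merge_tags(tags).get(stem(query), [])

TAGS = ["crawl", "Crawling", "crawlers", "walking", "walks"]
crawl_hits = search(TAGS, "crawl")
print(crawl_hits)
print(merge_tags(TAGS))
```

The same grouping supports both options on the slide: curators could merge the grouped tags, or the mess could be left as-is and resolved at search time.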

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107


Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated
Weaknesses we neglected or overstated

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats
Different spin on them

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  • Developmental science only
  • Psychology
  • Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128


Thank you

130

Publications
Submitted
- International Journal of Digital Libraries
In progress
- WIREs Cognitive Science (invited paper)
- Best practices in behavioral coding
- Call to arms
- Top 10 reasons to share
- Top 10 concerns about sharing

11

Presentations
NFAIS/CENDI/Library of Congress (Nov 12, 2013)
CNI 2013 Fall Meeting (Dec 9-10, 2013)
Spatial DataMine Workshop (Feb 7-9, 2014)
12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

12

Datavyu
Released stable version of Datavyu 1.1
Published user guide
Best practices in behavioral coding forthcoming

13

Seeding the repository
Datasets acquired
- Adolph
- Tamis-LeMonda
- Karasik
- Gordon
- LoBue

14

Forthcoming datasets
Messinger lab
Gernsbacher lab
Maher longitudinal study
CHILDES videos
Gesell archives
- In partnership with MIAP, Howard Besser

15


Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Diagram: View & Use Videos; Videos to Share]

19

Roadmap
[Diagram: View & Use Videos; Videos to Share; Upload]
Data dump
- Datasets are browsable but not searchable at session level
Structured datasets and studies
- Assign appropriate metadata to each session

20

Roadmap
[Diagram: View & Use Videos; Videos to Share; Upload]
Videos are identifiable
- So each video (session) must have a release level
- Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram: Videos to Share; Upload]

23

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

25

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Diagram: Ask Ss to Share; Data Collection; Coding; Analyses; Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

28

Roadmap
[Diagram: same stages as the previous slide, with Datavyu added]

29

[Diagram: Ask Ss to Share; Data Collection; Coding; Analyses; Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram: same stages as slide 30, with IRB added]

31

Roadmap
[Diagram: same stages as slide 30, with IRB added]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram: Grants & Contracts; Ask Ss to Share; Data Collection; Coding; Analyses; Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

33

[Diagram: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding; Analyses; Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline
[Diagram, built up across slides 35-41: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding; Analyses; Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Date markers shown during the build: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42


Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
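The release levels above form an ordered scale, so an access check can be sketched as a comparison. The code below is an illustration under stated assumptions, not Databrary's access-control logic: the `Viewer` roles are hypothetical names, and treating public sessions as also allowing excerpts is an inference from the ordering rather than something the slide states.

```python
from enum import IntEnum

class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4

class Viewer(IntEnum):
    """Hypothetical viewer roles; the names are assumptions for this sketch."""
    PUBLIC = 0
    AUTHORIZED_INVESTIGATOR = 1
    OWNER_OR_EDITOR = 2

def can_view(release: Release, viewer: Viewer) -> bool:
    """Decide whether a viewer may see a session's full recordings."""
    if viewer is Viewer.OWNER_OR_EDITOR:
        return True                       # data owners and editors see everything
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False                      # "did not ask" is treated as private
    if release is Release.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS: authorized Databrary investigators only
    return viewer is Viewer.AUTHORIZED_INVESTIGATOR

def excerpts_allowed(release: Release) -> bool:
    """Excerpts may be shown at 'shared + excerpts' or above (assumed for public)."""
    return release >= Release.SHARED_EXCERPTS

print(can_view(Release.SHARED, Viewer.AUTHORIZED_INVESTIGATOR))
print(can_view(Release.SHARED, Viewer.PUBLIC))
print(excerpts_allowed(Release.SHARED))
```

Using an ordered enumeration keeps "did not ask" at the restrictive end by default, which matches the rule that an undocumented release functions as private.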

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example
databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS, LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling," "crawlers," etc.)
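The "crawl" finds "crawling" idea can be illustrated with a very small sketch. It is deliberately crude and is an assumption-laden stand-in, not how Databrary or any search engine actually works: the suffix list, the function names (`normalize_tag`, `merge_tags`, `search`), and the sample tags are all made up for illustration, and a real system would more likely use a proper stemmer or lemmatizer.

```python
from collections import defaultdict

# Crude, hand-picked suffix list (an assumption for this sketch only).
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def normalize_tag(tag):
    """Lowercase a tag and strip at most one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group raw tags under their normalized form (curator-style merge)."""
    merged = defaultdict(list)
    for tag in tags:
        merged[normalize_tag(tag)].append(tag)
    return dict(merged)


def search(query, tags):
    """Return every raw tag whose normalized form matches the query's."""
    return merge_tags(tags).get(normalize_tag(query), [])


# Made-up tags.
raw_tags = ["crawl", "Crawling", "crawlers", "walking", "walks"]
matches = search("crawl", raw_tags)
```

The same normalization serves both options on the slide: applied at search time it leaves the stored tags as they are, and applied by curators it produces the merge groups.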

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
  - Identify and share already-collected data
  - Request participants to share their data
  - Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116


Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities: Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities: Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities: Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities: Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities: Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities: Beyond storing, tagging video
- Visualization
- Additional analysis streams

128


Thank you

130

Presentations
- NFAIS/CENDI/Library of Congress (Nov 12, 2013)
- CNI 2013 Fall Meeting (Dec 9-10, 2013)
- Spatial DataMine Workshop (Feb 7-9, 2014)
- 12+ presentations to psychologists, pediatric physical/occupational therapists, movement scientists

12

Datavyu
- Released stable version of Datavyu 1.1
- Published user guide
- Best practices in behavioral coding forthcoming

13

Seeding the repository: Datasets acquired
- Adolph
- Tamis-LeMonda
- Karasik
- Gordon
- LoBue

14

Forthcoming datasets
- Messinger lab
- Gernsbacher lab
- Maher longitudinal study
- CHILDES videos
- Gesell archives
  - In partnership with MIAP, Howard Besser

15


Why Databrary exists: Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
- Collecting and storing video is costly
- Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
- Build Databrary repository for sharing video
- Create appropriate policy environment
- Provide data management tools
- Enhance Datavyu tool for scoring video
- Transform the culture of developmental science

18

Roadmap
[Diagram boxes: Videos to Share; View & Use Videos]

19

Roadmap
[Diagram boxes: Upload; Videos to Share; View & Use Videos]
- Data dump: datasets are browsable but not searchable at session level
- Structured datasets and studies: assign appropriate metadata to each session

20

Roadmap
[Diagram boxes: Upload; Videos to Share; View & Use Videos]
- Videos are identifiable, so each video (session) must have a release level
- Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram boxes: Upload After the Fact; Upload As You Go; Upload; Videos to Share]
- Incentives: preservation, data easily available to collaborators, mensch
- Disincentives: time, effort, $

26

Roadmap
[Diagram boxes: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

28

Roadmap
[Diagram boxes: Ask Ss to Share; Data Collection; Datavyu; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

29

[Diagram boxes: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]
- Incentives: preservation, "lab server," easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
- Disincentives: change current practices

30

Roadmap
[Diagram boxes: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

31

Roadmap
[Diagram boxes: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]
- Incentives: use of excerpts
- Disincentives: none

32

Roadmap
[Diagram boxes: Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share]

33

[Diagram boxes: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload After the Fact; Upload As You Go; Upload; Videos to Share; View & Use Videos]
- Incentives: full access
- Disincentives: none

34

Timeline
[Diagram boxes: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
[Date labels shown across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42


Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
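The release levels read naturally as an ordered set of permissions. The sketch below is one hedged way to express them in code; it is not Databrary's implementation. The names `Release`, `Viewer`, `effective_release`, `can_view`, and `excerpts_usable` are hypothetical, "Did not ask" is represented as `None`, and treating Public as also allowing excerpt use is an assumption that goes slightly beyond what the slide states.

```python
from enum import Enum
from typing import Optional


class Release(Enum):
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Viewer(Enum):
    PUBLIC = 1      # anyone
    AUTHORIZED = 2  # authorized Databrary investigator
    OWNER = 3       # data owner or editor


def effective_release(asked: Optional[Release]) -> Release:
    """'Did not ask' (None here) functions as private."""
    return Release.PRIVATE if asked is None else asked


def can_view(asked: Optional[Release], viewer: Viewer) -> bool:
    """Who may see the full session video at a given release level."""
    level = effective_release(asked)
    if viewer is Viewer.OWNER:
        return True
    if viewer is Viewer.AUTHORIZED:
        return level in (Release.SHARED, Release.SHARED_EXCERPTS, Release.PUBLIC)
    return level is Release.PUBLIC


def excerpts_usable(asked: Optional[Release]) -> bool:
    """Whether excerpts may be shown for informational, scientific,
    and educational purposes. Including PUBLIC is an assumption."""
    return effective_release(asked) in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

For example, a session whose participants were never asked behaves like a private one: `can_view(None, Viewer.AUTHORIZED)` is false, while the data owner can still see it.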

47

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after: it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Diagram of access tiers: Everybody; Registered users; Authorized users; Collaborators; Lab; Interested parties]
[Diagram labels: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65


Break

67


Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
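The four discovery operations listed here (search, filter, sort, summarize) compose in a simple pipeline. The following is a minimal sketch of that composition over in-memory records; the `SessionRecord` fields, the function names, and the sample records are all invented for illustration and say nothing about how the Databrary site or its API implements discovery.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionRecord:
    """Hypothetical flattened view of one session's searchable metadata."""
    session_id: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: case-insensitive keyword match on the text description."""
    needle = keyword.lower()
    return [r for r in records if needle in r.description.lower()]


def filter_by(records, min_age=None, max_age=None, gender=None):
    """Filtering: keep records inside an age range and/or of one gender."""
    kept = []
    for r in records:
        if min_age is not None and r.age_months < min_age:
            continue
        if max_age is not None and r.age_months > max_age:
            continue
        if gender is not None and r.gender != gender:
            continue
        kept.append(r)
    return kept


def sort_by_age(records):
    """Sorting: youngest first, for scanning results visually."""
    return sorted(records, key=lambda r: r.age_months)


def count_by_gender(records):
    """Aggregation: a simple count per gender."""
    return Counter(r.gender for r in records)


# Made-up records, for illustration only.
records = [
    SessionRecord("s1", "Infant crawling down slopes", 11.0, "female"),
    SessionRecord("s2", "Free play at home", 13.5, "male"),
    SessionRecord("s3", "Crawling and cruising in the lab", 9.5, "male"),
]

hits = sort_by_age(filter_by(search(records, "crawling"), max_age=12))
```

Keeping each step a plain function over a list means the same pieces work within one dataset or across several concatenated ones.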

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot

Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session
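Taken together, the session-level fields above (release level, date, pilot, excluded, participants) suggest a record shape along the following lines. This is a sketch under assumptions rather than the project's schema: the class and field names are hypothetical, only a few participant fields are shown, the release level is kept as a plain optional string, and the `in_final_sample` and `age_in_days` helpers are illustrative conveniences that the slides do not describe.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class ExclusionReason(Enum):
    DID_NOT_MEET_CRITERIA = "Did not meet inclusion criteria"
    PROCEDURAL_ERROR = "Procedural/experimenter error"
    WITHDREW = "Withdrew/fussy/tired"
    OUTLIER = "Outlier"


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None  # identifiable, may be withheld
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    release: Optional[str] = None          # None stands for "Not asked"
    session_date: Optional[date] = None    # could be unknown
    pilot: bool = False                    # methods not finalized or standard
    excluded: Optional[ExclusionReason] = None
    participants: list[Participant] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def in_final_sample(self) -> bool:
        """Assumed rule: neither a pilot session nor an excluded one."""
        return not self.pilot and self.excluded is None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Made-up example values.
baby = Participant("S001", birthdate=date(2013, 4, 7), gender="female")
session = Session(
    release="Shared",
    session_date=date(2014, 4, 7),
    participants=[baby],
    files=["intake.mp4", "baby-cam.mp4"],
)
fussy = Session(session_date=None, excluded=ExclusionReason.WITHDREW)
```

Keeping pilot and excluded sessions in the record, flagged rather than deleted, matches the stated aim that there should be no disincentive for sharing them.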


DatavyuReleased stable version of Datavyu 11Published user guideBest practices in behavioral coding forthcoming

13

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

Roadmap
[Diagram: View & Use Videos; Videos to Share]

19

Roadmap
[Diagram: View & Use Videos; Videos to Share; Upload]

Data dump
- Datasets are browsable but not searchable at session level

Structured datasets and studies
- Assign appropriate metadata to each session

20

Roadmap
[Diagram: View & Use Videos; Videos to Share; Upload]

Videos are identifiable
- So each video (session) must have a release level
- Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram: Videos to Share; Upload]

22-23

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

24-25

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

27

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]

28

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Datavyu; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]

29

[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB]

31

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB; Ask Ss to Share]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Grants & Contracts; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share]

33

[Diagram: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload; Upload As You Go; Upload After the Fact; Videos to Share; View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline
[Diagram: Upload As You Go; Upload; Videos to Share; Upload After the Fact; View & Use Videos; Grants & Contracts; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB; Ask Ss to Share]
Date labels: 2018-04

35

Timeline
[Diagram as on slide 35]
Date labels: 2017-07

36

Timeline
[Diagram as on slide 35]
Date labels: 2017-07, 2014-04

37

Timeline
[Diagram as on slide 35]
Date labels: 2017-07, 2015-04, 2014-04

38

Timeline
[Diagram as on slide 35]
Date labels: 2017-07, 2014-04, 2016-04

39

Timeline
[Diagram as on slide 35]
Date labels: 2017-07, 2014-09, 2015-09

40

Timeline
[Diagram as on slide 35]
Date labels: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

43

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: Identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
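
The release levels on the preceding slide form an ordered scale from most to least restricted. The sketch below is one way to express that in Python; it is an illustration only, not Databrary's implementation. The names `ReleaseLevel`, `Viewer`, `can_view`, and `excerpts_allowed` are hypothetical, and the viewer categories are a simplifying assumption.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories (an assumption, not from the slides)."""
    PUBLIC = 0
    AUTHORIZED_INVESTIGATOR = 1
    OWNER_OR_EDITOR = 2


def can_view(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if the viewer may see a full session at this release level."""
    if viewer == Viewer.OWNER_OR_EDITOR:
        return True
    if viewer == Viewer.AUTHORIZED_INVESTIGATOR:
        return level >= ReleaseLevel.SHARED
    return level == ReleaseLevel.PUBLIC


def excerpts_allowed(level: ReleaseLevel) -> bool:
    """Assumed rule: excerpts may be shown at 'Shared + excerpts' or 'Public'."""
    return level >= ReleaseLevel.SHARED_EXCERPTS
```

Using an ordered enum keeps the "did not ask functions as private" rule implicit: both levels sit below `SHARED`, so neither is visible to anyone but owners and editors.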

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after (it's clear what was recorded)

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Diagram: access levels]
User groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Capabilities: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

66

Break

67

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases

77
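
The discovery features listed on the preceding slide (searching, filtering, sorting, and summarizing) can be sketched over a small in-memory list of session records. This is a minimal illustration only: the record fields and the function names `search`, `filter_by_age`, `sort_by`, and `count_by` are hypothetical and do not describe the Databrary API.

```python
from typing import Dict, List

# Hypothetical session records; the field names are illustrative only.
SESSIONS: List[Dict] = [
    {"id": 1, "description": "infant crawling on slopes", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "toddler walking at home", "age_months": 14, "gender": "M"},
    {"id": 3, "description": "infant crawling over gaps", "age_months": 9, "gender": "M"},
]


def search(sessions, keyword):
    """Keep sessions whose description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [s for s in sessions if needle in s["description"].lower()]


def filter_by_age(sessions, min_months, max_months):
    """Keep sessions whose participant age falls within the inclusive range."""
    return [s for s in sessions if min_months <= s["age_months"] <= max_months]


def sort_by(sessions, key):
    """Return sessions ordered by the given field, for visual scanning."""
    return sorted(sessions, key=lambda s: s[key])


def count_by(sessions, key):
    """Summarize sessions as counts per value of a field."""
    counts = {}
    for s in sessions:
        counts[s[key]] = counts.get(s[key], 0) + 1
    return counts


# Search, then filter, then sort: crawling sessions aged 8-12 months, youngest first.
results = sort_by(filter_by_age(search(SESSIONS, "crawling"), 8, 12), "age_months")
```

Each step takes and returns a plain list, so the operations compose in any order, which matches the slide's framing of search, filter, and sort as separate controls.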

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

85

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

[Diagram: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87
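
The session metadata listed on the preceding slide can be pictured as a single record type. The following is a minimal sketch under stated assumptions, not Databrary's schema: the class name `Session`, the field types, and the `is_usable` rule (a session counts toward the target study only if it is neither pilot nor excluded) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Session:
    """One data-collection session: the basic unit covered by a release level."""
    release_level: str = "not asked"       # an undocumented release functions as private
    session_date: Optional[date] = None    # may be unknown, e.g. for archival footage
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)  # raw data files on the session timeline

    def is_usable(self) -> bool:
        """Assumed rule: usable if the session is neither pilot nor excluded."""
        return not self.pilot and self.excluded_reason is None


# Example values are invented for illustration.
session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=["S001"],
    tasks=["free play"],
    location="lab",
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
```

Keeping pilot and excluded sessions as flagged records, rather than deleting them, fits the deck's point that excluded data are often re-purposed later.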

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant

- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically developing
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task

- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
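
The idea on the preceding slide, that a study includes sessions from a dataset and layers its own files over them without altering the raw data, can be sketched as follows. This is a hypothetical illustration: the classes `Dataset` and `Study` and their methods are assumptions made for the example, not Databrary's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data: a named collection of sessions, each with its raw files."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)  # session id -> raw files


@dataclass
class Study:
    """Generated data: selects sessions from datasets and layers files over them."""
    title: str
    included: List[Tuple[str, str]] = field(default_factory=list)  # (dataset name, session id)
    layers: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        """Reference a session from a dataset; the dataset itself is not copied."""
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id} is not in dataset {dataset.name}")
        self.included.append((dataset.name, session_id))

    def add_layer(self, dataset: Dataset, session_id: str, filename: str) -> None:
        """Attach a study-specific file (e.g. a coding spreadsheet) to a session."""
        self.layers.setdefault((dataset.name, session_id), []).append(filename)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        key = (dataset.name, session_id)
        return dataset.sessions[session_id] + self.layers.get(key, [])


# Example names are invented for illustration.
raw = Dataset("locomotion", {"s1": ["baby-cam.mp4"], "s2": ["baby-cam.mp4"]})
study = Study("Crawling on slopes")
study.include(raw, "s1")
study.add_layer(raw, "s1", "coding-spreadsheet.opf")
```

Because layered files live on the study rather than the dataset, a second study could include the same session with different analysis files, which is the re-purposing case the slide describes.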

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

[Diagram: Session, April 7, 2014, containing the files intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
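
Two of the options on the preceding slide, merging conceptually identical tags and letting a search for "crawl" also find "crawling" and "crawlers", both rest on reducing related words to a common form. The sketch below uses a deliberately crude suffix-stripping rule as a stand-in for real stemming or semantic search; the suffix list and the function names `crude_stem`, `merge_tags`, and `search_tags` are assumptions made for this example.

```python
from collections import defaultdict

# Suffixes tried in order, longest first; a crude stand-in for real stemming.
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def crude_stem(word: str) -> str:
    """Lowercase a word and strip one common English suffix, if present."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a crude stem, as a curator might merge them."""
    groups = defaultdict(list)
    for tag in tags:
        groups[crude_stem(tag)].append(tag)
    return dict(groups)


def search_tags(query, tags):
    """Return the tags whose crude stem matches the stem of the query."""
    stem = crude_stem(query)
    return [t for t in tags if crude_stem(t) == stem]


tags = ["crawl", "crawling", "crawlers", "Walking", "walk"]
```

A rule this simple will mis-handle many English words, so a production system would more plausibly rely on an established stemmer or a curated synonym list.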

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated
- Weaknesses we neglected or overstated

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

129

Thank you

130

Seeding the repository Datasets acquired

AdolphTamis-LeMondaKarasikGordonLoBue

14

Forthcoming datasetsMessinger labGernsbacher labMaher longitudinal studyCHILDES videosGesell archives

In partnership with MIAP Howard Besser

15

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

16

Why Databrary exists

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

View amp UseVideos

Videos toShare

Roadmap

19

View amp UseVideos

Videos toShareUpload

Roadmap

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

Structured datasets and studiesAssign appropriate metadata to each session

20

View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
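The release levels on the slide above form an ordered scale, which can be sketched in code. This is an illustrative Python sketch only: the names `Release`, `Viewer`, `can_view_session`, and `can_use_excerpt` are hypothetical and are not taken from Databrary's software. It encodes the rules as stated on the slide, with "Did not ask" treated as private.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view_session(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see the full session video."""
    if viewer is Viewer.OWNER:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    return False  # NOT_ASKED and PRIVATE stay with owners and editors


def can_use_excerpt(release: Release) -> bool:
    """Excerpts may be shown only at 'Shared + excerpts' or 'Public'."""
    return release >= Release.SHARED_EXCERPTS
```

Ordering the levels as integers keeps the excerpt rule to a single comparison; a real system would also need to handle segments of a session that carry a different level than the session as a whole.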

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, once it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Access levels diagram]
Groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Capabilities: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
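The discovery steps listed on the slide above (searching, filtering, sorting, and viewing counts) can be sketched on a toy list of session records. This is a minimal Python illustration, not Databrary's search implementation: the records, the field names (`description`, `age_months`, `gender`), and the helper functions are all made up for the example.

```python
from collections import Counter

# Hypothetical session records; field names are assumptions for illustration.
SESSIONS = [
    {"id": 1, "description": "infant crawling over slopes", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "toddler walking on bridges", "age_months": 14, "gender": "M"},
    {"id": 3, "description": "infant crawling and cruising", "age_months": 9, "gender": "M"},
]


def search(sessions, keyword):
    """Searching: keep sessions whose text description contains the keyword."""
    kw = keyword.lower()
    return [s for s in sessions if kw in s["description"].lower()]


def filter_by_age(sessions, min_months, max_months):
    """Filtering: limit results to an inclusive age range in months."""
    return [s for s in sessions if min_months <= s["age_months"] <= max_months]


def sort_by(sessions, field):
    """Sorting: order results by one field for visual scanning."""
    return sorted(sessions, key=lambda s: s[field])


def count_by(sessions, field):
    """Aggregating: count sessions per value of one field."""
    return Counter(s[field] for s in sessions)


hits = sort_by(filter_by_age(search(SESSIONS, "crawling"), 8, 12), "age_months")
print([s["id"] for s in hits])       # [3, 1]
print(count_by(SESSIONS, "gender"))  # counts per gender
```

Each step takes and returns a plain list, so the steps can be chained in any order, which is the property a search interface spanning several datasets would need.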

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82
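Two of the upload-as-you-go user stories above, keeping track of which data were entered and exporting previously entered data for analysis, can be illustrated with a short Python sketch. The field list and the example rows are hypothetical; the sketch only shows the idea of flagging incomplete sessions and writing the entered metadata out as CSV.

```python
import csv
import io

# Hypothetical per-session fields a lab might enter while collecting data.
FIELDS = ["session_id", "date", "condition", "video"]


def missing_fields(row):
    """Keep track of which data were entered: list the fields still empty."""
    return [f for f in FIELDS if not row.get(f)]


def export_csv(rows):
    """Export previously entered data as CSV text for analysis elsewhere."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Fields not yet entered are written as empty cells.
        writer.writerow({f: row.get(f, "") for f in FIELDS})
    return buf.getvalue()


rows = [
    {"session_id": "S01", "date": "2014-04-07", "condition": "A", "video": "intake.mp4"},
    {"session_id": "S02", "date": "2014-04-08", "condition": "B"},  # video not uploaded yet
]

for row in rows:
    print(row["session_id"], "missing:", missing_fields(row))
print(export_csv(rows))
```

Writing empty cells rather than skipping incomplete sessions keeps the export rectangular even while data collection is still in progress.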

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

96
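The session metadata described on the preceding slides can be pictured as a small data model. The Python sketch below is only one possible reading of the slides: the class and field names are hypothetical, `Record` is a stand-in used for tasks, conditions, and groups alike, and the `age_days` helper is an added illustration rather than something the slides specify.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Record:
    """Generic named record, used here for tasks, conditions, and groups."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Session:
    """The basic unit: files collected at the same time, plus metadata."""
    release: str = "not asked"               # one of the five release levels
    session_date: Optional[date] = None      # may be unknown for archival data
    pilot: bool = False
    excluded_reason: Optional[str] = None    # e.g. "Outlier"; None if included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Record] = field(default_factory=list)
    conditions: List[Record] = field(default_factory=list)
    groups: List[Record] = field(default_factory=list)
    setting: Optional[str] = None            # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


baby = Participant("P01", birthdate=date(2013, 4, 7))
session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[baby],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
print(session.excluded, session.age_days(baby))
```

Making the date and the exclusion reason optional reflects two points from the slides: session dates can be unknown, and exclusion is recorded per session rather than by deleting the data.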

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
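The dataset/study distinction on the slide above can be sketched as follows. This is a hedged Python illustration with hypothetical names (`Dataset`, `Study`, `files_for`): a dataset holds the raw session files, and a study selects sessions from it and layers its own analysis files over them without changing the raw data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw files."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, papers layered over dataset sessions."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)           # session ids used
    layers: Dict[str, List[str]] = field(default_factory=dict)  # session id -> added files

    def files_for(self, session_id: str) -> List[str]:
        """Raw files for a session plus this study's layered files."""
        if session_id not in self.included:
            raise KeyError(f"session {session_id!r} is not part of study {self.name!r}")
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])


raw = Dataset("locomotion", {"S01": ["baby-cam.mp4"], "S02": ["baby-cam.mp4"]})

# Two studies re-purpose the same raw sessions with different layered files.
original = Study("original", raw, included=["S01", "S02"],
                 layers={"S01": ["coding-spreadsheet.opf"]})
reanalysis = Study("reanalysis", raw, included=["S01"],
                   layers={"S01": ["analysis.sav"]})

print(original.files_for("S01"))
print(reanalysis.files_for("S01"))
```

Because each study keeps its layers separately, the raw dataset is never modified, which is what allows the same sessions to be re-presented in more than one study.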

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
One to one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
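Two of the options on the slide above, merging conceptually identical tags and letting a search for "crawl" also find "crawling" and "crawlers", both rest on mapping word variants to a common form. The Python sketch below uses a deliberately crude suffix-stripping rule as a stand-in for real stemming or semantic search; `crude_stem`, `merge_tags`, and `search_tags` are hypothetical helpers, not Databrary features.

```python
from collections import defaultdict

# Longest suffixes first, so "crawlers" loses "ers" rather than just "s".
SUFFIXES = ("ing", "ers", "er", "s")


def crude_stem(word):
    """Very rough stemmer: lowercase, then strip one common suffix."""
    w = word.lower().strip()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def merge_tags(tags):
    """Group tags that share a stem, as a curator might when cleaning up."""
    groups = defaultdict(set)
    for tag in tags:
        groups[crude_stem(tag)].add(tag)
    return dict(groups)


def search_tags(query, tags):
    """Return every tag whose stem matches the stem of the query."""
    stem = crude_stem(query)
    return sorted(t for t in tags if crude_stem(t) == stem)


tags = ["crawling", "crawlers", "Crawl", "walking", "walkers"]
print(search_tags("crawl", tags))  # ['Crawl', 'crawlers', 'crawling']
print(merge_tags(tags))
```

A rule this simple will mis-group some words, which is one reason the slide pairs automated matching with suggestions at entry time and curator cleanup afterwards.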

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths, Weaknesses

Opportunities, Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities?

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Forthcoming datasets
Messinger lab
Gernsbacher lab
Maher longitudinal study
CHILDES videos
Gesell archives
- In partnership with MIAP, Howard Besser

15

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

16

Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Roadmap diagram, slides 19-20, built up over several steps. Labels: Upload; Videos to Share; View & Use Videos]
Data dump: Datasets are browsable but not searchable at session level
Structured datasets and studies: Assign appropriate metadata to each session

20

Roadmap
[Roadmap diagram. Labels: Upload; Videos to Share; View & Use Videos]
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Roadmap diagram, slides 22-25, built up over several steps. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share]

25

Roadmap
[Roadmap diagram. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Roadmap diagram, slides 27-29, built up over several steps. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Datavyu (added on slide 29)]

29

[Roadmap diagram. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]
Incentives: Preservation; "lab server"; easy access for students & collaborators; data organization; coding; transcoding, syncing, and splitting videos
Disincentives: Change current practices

30

Roadmap
[Roadmap diagram, slide 31. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

31

Roadmap
[Roadmap diagram. Labels as on slide 31, with IRB and Ask Ss to Share shown together]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Roadmap diagram, slide 33. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share; Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

33

[Roadmap diagram. Labels: Upload; Upload After the Fact; Upload As You Go; Videos to Share; View & Use Videos; Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]
Incentives: Full access
Disincentives: None

34

Timeline
[Timeline diagram, slides 35-39, built up over several steps. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Dates shown across the builds: 2014-04, 2015-04, 2017-07, 2018-04]

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
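
The four discovery operations named on this slide (searching, filtering, sorting, aggregating) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `SessionRecord` fields, the function names, and the sample records are hypothetical and do not describe Databrary's actual API or data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical flat view of one session's searchable metadata."""
    dataset: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: case-insensitive substring match on the description."""
    needle = keyword.lower()
    return [r for r in records if needle in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: keep sessions whose age falls in an inclusive range."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Sorting: youngest first, for scanning results visually."""
    return sorted(records, key=lambda r: r.age_months)


def count_by_gender(records):
    """Aggregating: counts per gender for whatever records are passed in."""
    return Counter(r.gender for r in records)


# Made-up records, for illustration only.
records = [
    SessionRecord("A", "Infant crawling over a bridge", 11.0, "F"),
    SessionRecord("A", "Infant walking down a slope", 14.5, "M"),
    SessionRecord("B", "Free play with crawling and cruising", 9.5, "M"),
]

hits = sort_by_age(filter_by_age(search(records, "crawling"), 9, 12))
print([r.age_months for r in hits])
print(count_by_gender(records))
```

Because each step takes and returns a plain list, the same functions work within one dataset or across several, which is the "within and across datasets" case above.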

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
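
The session metadata described on these slides can be pictured as a small data structure. The Python sketch below is one possible reading, not Databrary's actual schema: the class names, field names, the shared `Descriptor` shape for tasks, conditions, and groups, and the `age_in_days` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Descriptor:
    """Assumed common shape of Task, Condition, and Group."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    setting: str                            # lab, home, classroom, ...
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    release_level: str                      # one of the five release levels
    session_date: Optional[date] = None     # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None   # None means not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Descriptor] = field(default_factory=list)
    conditions: List[Descriptor] = field(default_factory=list)
    groups: List[Descriptor] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, available only when both dates are known."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Hypothetical example values.
infant = Participant(subject_id="S01", birthdate=date(2013, 4, 7))
session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=[infant],
    tasks=[Descriptor(name="free play")],
    location=Location(setting="lab"),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
print(session.age_in_days(infant))  # 365
```

Keeping the session date and birthdate optional reflects the slides' points that a date can be unknown and that both are identifiable, so a derived age may be all that can be shown at some release levels.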

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One to one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
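
Two of the options above, merging conceptually identical tags and letting a search for "crawl" find "crawling" and "crawlers", can both be approximated by normalizing tags before comparing them. The Python sketch below uses a deliberately crude suffix-stripping rule as a stand-in; it is an assumption for illustration, not a description of how Databrary or any search engine works, and a real system would more likely use an established stemmer or a curated synonym list.

```python
from collections import defaultdict

# Longest suffixes first, so "crawlers" loses "ers" rather than only "s".
SUFFIXES = ("ers", "ing", "er", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip at most one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group raw tags under their normalized form, keeping input order."""
    merged = defaultdict(list)
    for tag in tags:
        merged[normalize(tag)].append(tag)
    return dict(merged)


def matches(query: str, tag: str) -> bool:
    """A query matches a tag when both normalize to the same form."""
    return normalize(query) == normalize(tag)


groups = merge_tags(["crawl", "Crawling", "crawlers", "walking"])
print(groups)
print(matches("crawl", "crawlers"))
```

Normalizing at search time corresponds to "leave the mess as-is": the stored tags stay untouched. Running `merge_tags` once and rewriting the stored tags corresponds to curators cleaning up after the fact.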

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Thank you

130

Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
- Collecting and storing video is costly
- Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
- Build Databrary repository for sharing video
- Create appropriate policy environment
- Provide data management tools
- Enhance Datavyu tool for scoring video
- Transform the culture of developmental science

18

Roadmap
[Diagram, built up across several slides. Labels: View & Use Videos; Videos to Share; Upload]
- Data dump: Datasets are browsable but not searchable at session level
- Structured datasets and studies: Assign appropriate metadata to each session
- Videos are identifiable
  - So each video (session) must have a release level
  - Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram, built up across several slides. Labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload]
- Incentives: Preservation, data easily available to collaborators, mensch
- Disincentives: Time, effort, $

26

Roadmap
[Diagram, built up across several slides. Labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share; Datavyu]
- Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
- Disincentives: Change current practices

30

Roadmap
[Diagram, built up across two slides. Labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Data Collection; Coding, Analyses, Writing; Paper in Press; IRB; Ask Ss to Share]
- Incentives: Use of excerpts
- Disincentives: None

32

Roadmap
[Diagram, built up across two slides. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
- Incentives: Full access
- Disincentives: None

34

Timeline
[Diagram, built up across several slides. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Dates shown across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta, Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: Identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
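
The release levels read naturally as an access rule. The Python sketch below encodes the five levels as stated on this slide; the three viewer categories, the function names, and the choice to treat public sessions as also allowing excerpts are assumptions for illustration, not Databrary's actual permission model.

```python
from enum import Enum


class Release(Enum):
    NOT_ASKED = "Did not ask"
    PRIVATE = "Private"
    SHARED = "Shared"
    SHARED_EXCERPTS = "Shared + excerpts"
    PUBLIC = "Public"


class Viewer(Enum):
    # Hypothetical viewer categories, simplified for this sketch.
    ANYONE = "member of the public"
    AUTHORIZED = "authorized Databrary investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, viewer: Viewer) -> bool:
    """Whether a viewer may see a session at the given release level."""
    if viewer is Viewer.OWNER:
        return True
    # "Did not ask" functions as private.
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False
    if release is Release.PUBLIC:
        return True
    # Shared and Shared + excerpts: authorized investigators only.
    return viewer is Viewer.AUTHORIZED


def excerpts_allowed(release: Release) -> bool:
    """Whether clips may be shown for informational or educational use.

    Including PUBLIC here is an assumption; the slide states this
    only for Shared + excerpts.
    """
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)


print(can_view(Release.SHARED, Viewer.AUTHORIZED))
print(can_view(Release.NOT_ASKED, Viewer.AUTHORIZED))
print(excerpts_allowed(Release.SHARED))
```

Defaulting an undocumented session to the most restrictive behavior keeps the rule safe: a session becomes visible to others only after a participant's decision has been recorded.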

47

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example
databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Diagram: access tiers. Labels: Everybody; Registered users; Authorized users; Interested parties; Collaborators; Lab. Capabilities shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128


Thank you

130

Why Databrary exists: Challenge in developmental psychology

Most developmental researchers collect video data
Not exploiting the richness in video
Good data going to waste
Impeding transparency and the pace of discovery

Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Diagram labels: View & Use Videos; Videos to Share]

19

Roadmap
[Diagram labels: View & Use Videos; Videos to Share; Upload]

Data dump
Datasets are browsable but not searchable at session level

Structured datasets and studies
Assign appropriate metadata to each session

20

Roadmap
[Diagram labels: View & Use Videos; Videos to Share; Upload]

Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram labels: Videos to Share; Upload]

22-23

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

24-25

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

27

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding; Analyses; Writing; Paper in Press; Ask Ss to Share]

28

Roadmap
[Diagram labels: as on slide 28, plus Datavyu]

29

[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding; Analyses; Writing; Paper in Press; Ask Ss to Share]

Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Data Collection; Coding; Analyses; Writing; Paper in Press; IRB; Ask Ss to Share]

31

Roadmap
[Diagram labels: same as slide 31]

Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Grants & Contracts; Data Collection; Coding; Analyses; Writing; Paper in Press; Ask Ss to Share]

33

[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; View & Use Videos; Grants & Contracts; IRB; Data Collection; Coding; Analyses; Writing; Paper in Press; Ask Ss to Share]

Incentives: Full access
Disincentives: None

34

Timeline
[Diagram labels on every timeline slide: Upload As You Go; Upload; Videos to Share; Upload After the Fact; View & Use Videos; Grants & Contracts; Data Collection; Coding; Analyses; Writing; Paper in Press; IRB; Ask Ss to Share]

Dates shown on each slide (final build):
35: 2018-04
36: 2017-07
37: 2017-07, 2014-04
38: 2017-07, 2015-04, 2014-04
39: 2017-07, 2014-04, 2016-04
40: 2017-07, 2014-09, 2015-09
41: 2017-07, 2014-09, 2016-09

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42


Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search; Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
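One way to read the release levels is as an ordered scale checked against a viewer's role. The Python sketch below illustrates that reading only. The `Role` names and the `can_view` and `excerpts_allowed` functions are assumptions made for the example and do not describe Databrary's actual access logic.

```python
from enum import Enum, IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be used
    PUBLIC = 4           # available to the public


class Role(Enum):
    """Hypothetical viewer roles, introduced only for this sketch."""
    ANYONE = "anyone"
    AUTHORIZED = "authorized investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, role: Role) -> bool:
    """Return True if a viewer with this role may see the full session."""
    if role is Role.OWNER:
        return True
    if role is Role.AUTHORIZED:
        return release >= Release.SHARED
    return release == Release.PUBLIC


def excerpts_allowed(release: Release) -> bool:
    """Excerpts may be used only at 'Shared + excerpts' or 'Public'."""
    return release >= Release.SHARED_EXCERPTS


print(can_view(Release.NOT_ASKED, Role.AUTHORIZED))  # False
print(can_view(Release.SHARED, Role.AUTHORIZED))     # True
print(excerpts_allowed(Release.SHARED))              # False
```

Because "Did not ask" sorts below "Private", an undocumented session is treated at least as restrictively as a private one, which matches the slide's wording.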

47

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after, once it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Diagram: access tiers]
Tiers: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Capabilities: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPTool)
Resource sharing plans

65


Break

67


Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
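The four discovery operations named on this slide (searching, filtering, sorting, and aggregating) can be illustrated on a small in-memory list. The sketch below is hypothetical: the `SessionRecord` fields, the function names, and the sample records are invented for the example and do not reflect the Databrary API or schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical flat record; field names are assumptions for this sketch."""
    name: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: keep records whose name or description mentions the keyword."""
    k = keyword.lower()
    return [r for r in records if k in r.name.lower() or k in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range in months."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Sorting: order results so they can be scanned visually."""
    return sorted(records, key=lambda r: r.age_months)


def count_by_gender(records):
    """Aggregating: summarize results as counts per gender."""
    return Counter(r.gender for r in records)


records = [
    SessionRecord("S01", "infant crawling on slope", 11.5, "F"),
    SessionRecord("S02", "toddler walking over bridge", 14.0, "M"),
    SessionRecord("S03", "infant crawling over gap", 9.0, "M"),
]

hits = sort_by_age(filter_by_age(search(records, "crawling"), 8, 12))
print([r.name for r in hits])    # ['S03', 'S01']
print(count_by_gender(records))  # Counter({'M': 2, 'F': 1})
```

Each step takes and returns a plain list, so the operations can be chained in any order.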

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev.
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
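The session metadata types described on the preceding slides (participant, task, condition, group, location) map naturally onto a small set of record types. The Python sketch below is one possible rendering, not Databrary's actual schema: the class names, the field subset, and the `included` property are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None  # identifiable, so it may be withheld
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Annotation:
    """Shared shape for tasks, conditions, and groups (name, summary, description)."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    setting: str  # e.g. "lab", "home", "classroom"
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    release: str                           # e.g. "Private", "Shared", "Public"
    session_date: Optional[date] = None    # could be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session is included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Annotation] = field(default_factory=list)
    conditions: List[Annotation] = field(default_factory=list)
    groups: List[Annotation] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    @property
    def included(self) -> bool:
        """True when no exclusion reason has been recorded."""
        return self.excluded_reason is None


session = Session(
    release="Shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S01", gender="F")],
    tasks=[Annotation(name="free play")],
    location=Location(setting="lab", country="US"),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
print(session.included)  # True
```

Keeping every metadata field optional except the release level reflects the point that each session must carry a release level while other details may be missing or unknown.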

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing?
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103


Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costly

17

Why Databrary exists Challenge in developmental psychology

Most developmental researchers collect video dataNot exploiting the richness in videoGood data going to wasteImpeding transparency and the pace of discovery

Collecting and storing video is costlyOpen up videos to be repurposed for new questions

17

Key aims of the Databrary projectBuild Databrary repository for sharing videoCreate appropriate policy environmentProvide data management tools Enhance Datavyu tool for scoring videoTransform the culture of developmental science

18

View amp UseVideos

Videos toShare

Roadmap

19

View amp UseVideos

Videos toShareUpload

Roadmap

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

Structured datasets and studiesAssign appropriate metadata to each session

20

View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

Timeline (diagram built up over slides 35-41; same labels as the Roadmap diagram, with dates added)

Dates shown on the final build of each slide:
35: 2018-04
36: 2017-07
37: 2017-07, 2014-04
38: 2017-07, 2015-04, 2014-04
39: 2017-07, 2014-04, 2016-04
40: 2017-07, 2014-09, 2015-09
41: 2017-07, 2014-09, 2016-09

35-41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search; Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
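
The release levels above form an ordered scale, which can be sketched in code. The following is a minimal illustration only, not Databrary's implementation: the names `Release`, `Viewer`, `can_view`, and `can_show_excerpt` are assumptions made for this example, and the viewer roles are simplified to three.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented; treated the same as PRIVATE
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles, named here only for illustration."""
    ANYONE = 0
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see a full session at this release level."""
    if viewer is Viewer.OWNER:
        return True
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False
    if release is Release.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS both require authorization for full access.
    return viewer is Viewer.AUTHORIZED


def can_show_excerpt(release: Release) -> bool:
    """Excerpts may be shown only at SHARED_EXCERPTS or PUBLIC."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest value keeps the safe default explicit: a session with no documented decision is never more open than a private one.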

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after (it's clear what was recorded)

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

Access (diagram)
User groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
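
The discovery steps listed above (search, filter, sort, then summarize) can be illustrated with a small sketch over in-memory records. This is an assumption-laden example rather than the Databrary API: the record fields (`description`, `age_months`, `gender`) and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical session records; field names are illustrative only.
sessions = [
    {"id": 1, "description": "infant crawling over a bridge", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "toddler walking down a slope", "age_months": 18, "gender": "M"},
    {"id": 3, "description": "infant crawling down a slope", "age_months": 12, "gender": "M"},
]


def search(records, keyword):
    """Searching: keep records whose description contains the keyword."""
    kw = keyword.lower()
    return [r for r in records if kw in r["description"].lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range in months."""
    return [r for r in records if min_months <= r["age_months"] <= max_months]


def sort_by(records, field):
    """Sorting: order results by one field for visual scanning."""
    return sorted(records, key=lambda r: r[field])


# Search, then filter, then sort.
hits = sort_by(filter_by_age(search(sessions, "slope"), 10, 24), "age_months")

# Aggregated view: count of sessions per gender.
gender_counts = Counter(r["gender"] for r in sessions)
```

Each step takes and returns a plain list, so the steps can be combined in any order, which matches the slide's framing of search, filter, and sort as separate controls.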

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Example session (April 7, 2014): intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
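
The metadata types described on the preceding slides can be pictured as a small record structure. The sketch below is illustrative only and assumes a much-reduced field set; the class and attribute names (`Participant`, `Session`, `excluded_reason`, `age_days`) are hypothetical and not taken from the Databrary schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    release: str                           # e.g. "private", "shared"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session is included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None          # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, or None when either date is missing."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("S01", birthdate=date(2013, 4, 7))],
    files=["intake.mp4", "baby-cam.mp4"],
)
```

Keeping the session date and birthdate optional reflects the slides' point that dates can be unknown, and deriving age from them avoids storing a value that could drift out of sync.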

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
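
One way to picture the layering idea is a study that references sessions in a dataset and adds its own files without modifying the raw data. This is a hedged sketch under that reading of the slide; `Dataset`, `Study`, and `files_for` are hypothetical names, and the real data model may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    name: str
    # session id -> raw files collected in that session
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    name: str
    dataset: Dataset
    # session ids drawn from the dataset
    included: List[str] = field(default_factory=list)
    # session id -> files layered on by this study (codes, analyses)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files plus this study's layered files for one included session."""
        if session_id not in self.included:
            raise KeyError(f"{session_id} is not part of study {self.name!r}")
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])


raw = Dataset("locomotion", {"s1": ["overhead-cam.mp4"], "s2": ["baby-cam.mp4"]})
study = Study("recoding", raw, included=["s1"], layers={"s1": ["coding-spreadsheet.opf"]})
```

Because the study only points at the dataset, a second study could reuse session `s2` with different layered files, which is the repurposing case the slide describes.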

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Example session (April 7, 2014): intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication, abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
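
Two of the options above, pop-up suggestions and merging conceptually identical tags, can be sketched with a simple lookup table. This is an illustration, not a semantic search engine: the `SYNONYMS` table stands in for curator decisions, and all names here are hypothetical.

```python
from collections import defaultdict

# Hypothetical curator-maintained table of tags that mean the same thing.
SYNONYMS = {"crawlers": "crawl", "crawling": "crawl", "walker": "walk", "walking": "walk"}


def canonical(tag: str) -> str:
    """Lower-case a tag and map known variants onto one canonical form."""
    cleaned = tag.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)


def build_index(tagged_sessions):
    """Map each canonical tag to the set of session ids that carry it."""
    index = defaultdict(set)
    for session_id, tags in tagged_sessions.items():
        for tag in tags:
            index[canonical(tag)].add(session_id)
    return index


def suggest(prefix: str, index, limit: int = 5):
    """Pop-up style suggestions: existing tags that start with the typed prefix."""
    p = prefix.strip().lower()
    return sorted(t for t in index if t.startswith(p))[:limit]


index = build_index({"s1": ["Crawling"], "s2": ["crawlers", "slope"], "s3": ["walking"]})
```

With this index, a search for "crawl" finds sessions tagged "Crawling" and "crawlers" alike, and typing "cr" would suggest the existing tag rather than inviting a new variant.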

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication, citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities?

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

Why Databrary exists: Challenge in developmental psychology
Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap (diagram built up over slides 19-27; labels: View & Use Videos; Videos to Share; Upload; Upload After the Fact; Upload As You Go)

Data dump: Datasets are browsable but not searchable at session level
Structured datasets and studies: Assign appropriate metadata to each session

20

Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap (diagram; labels: Upload After the Fact; Upload As You Go; Videos to Share; Upload; Data Collection; Coding, Analyses, Writing; Paper in Press; Ask Ss to Share)

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
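
The discovery steps on this slide (searching, filtering, sorting, aggregating) can be sketched in a few lines of Python. This is only an illustration: the records are invented, and the field names (`description`, `age_months`, `gender`) and helper functions are assumptions made for the example, not Databrary's schema or API.

```python
from collections import Counter
from typing import Dict, List

# Invented example records; field names are assumptions, not Databrary's schema.
SESSIONS: List[Dict] = [
    {"id": 1, "description": "Infant crawling on slopes", "age_months": 12, "gender": "F"},
    {"id": 2, "description": "Toddler walking over bridges", "age_months": 18, "gender": "M"},
    {"id": 3, "description": "Crawling and cruising at home", "age_months": 10, "gender": "M"},
]


def search(records: List[Dict], keyword: str) -> List[Dict]:
    """Searching: keep records whose description contains the keyword."""
    needle = keyword.lower()
    return [r for r in records if needle in r["description"].lower()]


def filter_age(records: List[Dict], low: int, high: int) -> List[Dict]:
    """Filtering: keep records with low <= age_months <= high."""
    return [r for r in records if low <= r["age_months"] <= high]


def sort_by(records: List[Dict], key: str) -> List[Dict]:
    """Sorting: order records by one field for visual scanning."""
    return sorted(records, key=lambda r: r[key])


def count_by(records: List[Dict], key: str) -> Dict:
    """Aggregating: count records per value of one field."""
    return dict(Counter(r[key] for r in records))


hits = sort_by(filter_age(search(SESSIONS, "crawl"), 9, 12), "age_months")
print([r["id"] for r in hits])        # [3, 1]
print(count_by(SESSIONS, "gender"))   # {'F': 1, 'M': 2}
```

Chaining small functions like this keeps each discovery step independent, so the same filters could run within one dataset or across several.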

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant

- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task

- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
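
The session metadata described on the preceding slides could be modeled roughly as follows. This is a minimal sketch, not Databrary's actual data model: the class names, field names, and example values are assumptions chosen to mirror the slide text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Session:
    """One data-collection session: raw files plus descriptive metadata."""
    release_level: str                      # e.g. "Private", "Shared"
    session_date: Optional[date] = None     # may be unknown
    pilot: bool = False                     # methods not finalized or standard
    excluded_reason: Optional[str] = None   # None means the session is not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None           # e.g. "lab", "home", "classroom"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """True when an exclusion reason has been recorded."""
        return self.excluded_reason is not None


# Hypothetical example values for illustration only.
session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S01")],
    tasks=["free play"],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Storing the exclusion reason rather than a bare flag keeps excluded sessions searchable by why they were excluded, which fits the goal of sharing them for repurposing.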

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
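
One way to picture the dataset/study split is that a study borrows sessions from a dataset and layers its own files over them without changing the raw data. The sketch below assumes hypothetical `Dataset` and `Study` classes and made-up identifiers; it illustrates the idea only and does not describe Databrary's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: a named collection of session IDs."""
    name: str
    session_ids: List[str]


@dataclass
class Study:
    """Generated data: includes sessions from datasets and layers files on top."""
    name: str
    sessions: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_ids: List[str]) -> None:
        """Add sessions from a dataset; the dataset itself is left unchanged."""
        missing = [s for s in session_ids if s not in dataset.session_ids]
        if missing:
            raise ValueError(f"not in {dataset.name}: {missing}")
        for s in session_ids:
            if s not in self.sessions:
                self.sessions.append(s)

    def layer(self, session_id: str, filename: str) -> None:
        """Attach an analysis file to a session already included in the study."""
        if session_id not in self.sessions:
            raise KeyError(session_id)
        self.layers.setdefault(session_id, []).append(filename)


# Hypothetical names for illustration only.
raw = Dataset("locomotion-raw", ["s1", "s2", "s3"])
study = Study("recoding-study")
study.include(raw, ["s1", "s3"])
study.layer("s1", "coding-spreadsheet.opf")
```

Because the study only references session IDs, the same raw session can appear in several studies, each with its own layered files.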

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
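
To make the "crawl" finds "crawling" idea concrete, here is a deliberately naive sketch that groups tags by a crude suffix-stripped form. The suffix list and the `normalize` and `merge_tags` helpers are assumptions for illustration; a real system would more likely use an established stemmer or curated synonym lists.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Crude English suffixes, longest first. An illustrative assumption, not a real stemmer.
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    t = tag.strip().lower()
    for suffix in SUFFIXES:
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[: -len(suffix)]
    return t


def merge_tags(tags: Iterable[str]) -> Dict[str, Set[str]]:
    """Group the original tags under their normalized form."""
    merged: Dict[str, Set[str]] = defaultdict(set)
    for tag in tags:
        merged[normalize(tag)].add(tag)
    return dict(merged)


print(merge_tags(["crawl", "crawling", "crawlers", "walking"]))
```

The same grouping could serve either option on the slide: suggesting an existing tag as the user types, or letting curators merge variants after the fact.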

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities
Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities
Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

Why Databrary exists
Challenge in developmental psychology
- Most developmental researchers collect video data
- Not exploiting the richness in video
- Good data going to waste
- Impeding transparency and the pace of discovery
Collecting and storing video is costly
Open up videos to be repurposed for new questions

17

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Diagram labels: View & Use Videos; Videos to Share]

19

Roadmap
[Diagram labels: View & Use Videos; Videos to Share; Upload]
Data dump: Datasets are browsable but not searchable at session level
Structured datasets and studies: Assign appropriate metadata to each session

20

Roadmap
[Diagram labels: View & Use Videos; Videos to Share; Upload]
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram labels: Videos to Share; Upload]

22

Roadmap
[Diagram labels: Videos to Share; Upload]

23

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

24

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

25

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

27

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

28

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Datavyu; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

29

[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

31

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram labels: Videos to Share; Upload; Upload After the Fact; Upload As You Go; Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press]

33

[Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline
[Same diagram labels as slide 34, with dates added across builds]
Date shown: 2018-04

35

Timeline
Date shown: 2017-07

36

Timeline
Dates shown: 2017-07, 2014-04

37

Timeline
Dates shown: 2017-07, 2015-04, 2014-04

38

Timeline
Dates shown: 2017-07, 2014-04, 2016-04

39

Timeline
Dates shown: 2017-07, 2014-09, 2015-09

40

Timeline
Dates shown: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
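
The release levels read as an ordered scale from most to least restrictive. A minimal sketch of how such a scale might gate access to a full session is shown below; the `ReleaseLevel`, `Viewer`, and `can_view` names are hypothetical and are not taken from Databrary's software, and the excerpt-specific rules are left out.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be shown more widely
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if this viewer may see a full session at this release level."""
    if viewer == Viewer.OWNER:
        return True
    if viewer == Viewer.AUTHORIZED:
        return level >= ReleaseLevel.SHARED
    return level == ReleaseLevel.PUBLIC


print(can_view(ReleaseLevel.SHARED, Viewer.AUTHORIZED))  # True
print(can_view(ReleaseLevel.NOT_ASKED, Viewer.AUTHORIZED))  # False
```

Treating "Did not ask" as the lowest value makes the safe default explicit: anything undocumented is handled like private data.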

47

What data are being released?
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

Key aims of the Databrary project
Build Databrary repository for sharing video
Create appropriate policy environment
Provide data management tools
Enhance Datavyu tool for scoring video
Transform the culture of developmental science

18

Roadmap
[Diagram boxes: View & Use Videos; Videos to Share]

19

Roadmap
[Diagram boxes: View & Use Videos; Videos to Share; Upload]
Data dump: datasets are browsable but not searchable at session level
Structured datasets and studies: assign appropriate metadata to each session

20

Roadmap
[Diagram boxes: View & Use Videos; Videos to Share; Upload]
Videos are identifiable, so each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram boxes: Videos to Share; Upload]

22-23

Roadmap
[Diagram boxes: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

24-25

Roadmap
[Diagram boxes: Videos to Share; Upload; Upload After the Fact; Upload As You Go]
Incentives: preservation, data easily available to collaborators, mensch
Disincentives: time, effort, $

26

Roadmap
[Diagram boxes: Videos to Share; Upload; Upload After the Fact; Upload As You Go]

27

Roadmap
[Diagram boxes: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share]

28

Roadmap
[Same boxes as slide 28, plus Datavyu]

29

[Same boxes as slide 28]
Incentives: preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: change current practices

30

Roadmap
[Diagram boxes: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share]

31

Roadmap
[Same boxes as slide 31]
Incentives: use of excerpts
Disincentives: none

32

Roadmap
[Diagram boxes: Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share]

33

[Diagram boxes: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
Incentives: full access
Disincentives: none

34

Timeline
[Diagram boxes on every build: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos]
Date label: 2018-04

35

Timeline
[Same diagram] Date labels across builds: none; then 2017-07

36

Timeline
[Same diagram] Date labels across builds: 2017-07; then 2017-07, 2014-04

37

Timeline
[Same diagram] Date labels across builds: 2017-07, 2014-04; then 2017-07, 2015-04, 2014-04

38

Timeline
[Same diagram] Date labels across builds: 2017-07, 2014-04; then 2017-07, 2014-04, 2016-04

39

Timeline
[Same diagram] Date labels across builds: 2017-07; then 2017-07, 2014-09, 2015-09

40

Timeline
[Same diagram] Date labels across builds: 2017-07, 2014-09; then 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
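
The five release levels above form an ordered scale, which suggests a simple access check. The sketch below is illustrative only and not Databrary's implementation: the `Viewer` roles and both helper functions are assumptions made for the example, as is the choice to let public sessions also supply excerpts.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: treated like PRIVATE
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # anyone


class Viewer(IntEnum):
    """Hypothetical viewer roles, not taken from the slides."""
    ANYONE = 0
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Whether a viewer may see the full session video."""
    if viewer is Viewer.OWNER:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    return False  # NOT_ASKED and PRIVATE


def can_show_excerpt(release: Release) -> bool:
    """Whether excerpts may be shown for informational, scientific, or educational use."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest level means that missing documentation can never widen access by accident.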

What data are being released?
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later, possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
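
The discovery steps on the slide above (search, filter, sort, then summarize) compose naturally into one small pipeline. The following is a minimal sketch under assumptions: `SessionRecord` is a hypothetical flattened view of a session, and neither function reflects the actual Databrary API.

```python
from dataclasses import dataclass
from collections import Counter
from typing import Optional


@dataclass
class SessionRecord:
    """Hypothetical flattened view of one session for discovery."""
    session_id: str
    description: str
    age_months: float
    gender: str


def discover(records, keyword: Optional[str] = None,
             min_age: Optional[float] = None, max_age: Optional[float] = None,
             gender: Optional[str] = None):
    """Search by keyword, filter by age and gender, then sort by age."""
    hits = []
    for r in records:
        if keyword is not None and keyword.lower() not in r.description.lower():
            continue
        if min_age is not None and r.age_months < min_age:
            continue
        if max_age is not None and r.age_months > max_age:
            continue
        if gender is not None and r.gender != gender:
            continue
        hits.append(r)
    return sorted(hits, key=lambda r: (r.age_months, r.session_id))


def summarize(records):
    """Aggregate view: count of sessions per gender."""
    return dict(Counter(r.gender for r in records))
```

Every criterion is optional, so the same function covers plain browsing (no arguments) as well as narrow queries.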

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

96
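
Slides 86-96 together describe a record layout for a session and its metadata. As a rough sketch of how those pieces might fit together (an illustration, not the actual Databrary schema), the classes below use field names adapted from the slides; the shared `Descriptor` type and the two helper methods are assumptions added for the example, and several participant fields are omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Descriptor:
    """Shared shape for task, condition, and group: name, summary, description."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    setting: str  # e.g. "lab", "home", "classroom"
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    release: str = "not asked"             # one of the five release levels
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means not excluded
    participants: list[Participant] = field(default_factory=list)
    tasks: list[Descriptor] = field(default_factory=list)
    conditions: list[Descriptor] = field(default_factory=list)
    groups: list[Descriptor] = field(default_factory=list)
    location: Optional[Location] = None
    files: list[str] = field(default_factory=list)

    def usable_for_study(self) -> bool:
        """A session counts toward the target study if neither pilot nor excluded."""
        return not self.pilot and self.excluded_reason is None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test; None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days
```

Keeping pilot and excluded sessions in the same structure, flagged rather than deleted, matches the slides' point that such sessions should still be shareable.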

View amp UseVideos

Videos toShare

Roadmap

19

View amp UseVideos

Videos toShareUpload

Roadmap

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

20

View amp UseVideos

Videos toShareUpload

Roadmap

Data dumpDatasets are browsable but not searchable at session level

Structured datasets and studiesAssign appropriate metadata to each session

20

View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

[Timeline diagram, repeated as an animation build on slides 35-41. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Dates shown: 2018-04 (slide 35); 2017-07 (slides 36-41); 2014-04 (slides 37-39); 2015-04 (slide 38); 2016-04 (slide 39); 2014-09 (slides 40-41); 2015-09 (slide 40); 2016-09 (slide 41)]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
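
The release levels on this slide form an ordered scale from most to least restrictive. As a rough illustration only, and not Databrary's actual implementation, the scale could be sketched in Python as below. The names `Release`, `Viewer`, `can_view`, and `excerpt_allowed`, and the three viewer roles, are hypothetical and are assumptions read off the slide text.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # restricted to data owners and editors
    SHARED = 2           # restricted to authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like shared, but excerpts may be shown
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles (an assumption for this sketch)."""
    PUBLIC = 0      # anyone
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if the viewer may see the full session recording."""
    if viewer == Viewer.OWNER:
        return True
    if viewer == Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release == Release.PUBLIC


def excerpt_allowed(release: Release) -> bool:
    """Return True if excerpts may be shown for informational, scientific, or educational purposes."""
    return release >= Release.SHARED_EXCERPTS
```

Ordering the levels as integers keeps each check to a single comparison, and treating "Did not ask" as the lowest level matches the slide's note that undocumented sessions function as private.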

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Access-levels diagram. User tiers: Everybody; Registered users; Authorized users; Collaborators; Lab; Interested parties. Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
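
The discovery steps listed on this slide (searching, filtering, sorting, and summarizing) can be illustrated with a small sketch. This is a hedged example in plain Python over in-memory records, not the Databrary search interface; the function names `find_sessions` and `count_by`, the record fields, and the sample records themselves are all made up for illustration.

```python
from typing import Dict, List, Optional

# Made-up sample records; the field names are assumptions for this sketch.
SESSIONS: List[Dict] = [
    {"id": "s1", "description": "Infant crawling over a bridge", "age_days": 330, "gender": "F"},
    {"id": "s2", "description": "Toddler walking down a slope", "age_days": 420, "gender": "M"},
    {"id": "s3", "description": "Infant crawling down a slope", "age_days": 300, "gender": "M"},
]


def find_sessions(
    sessions: List[Dict],
    keyword: Optional[str] = None,
    min_age_days: Optional[int] = None,
    max_age_days: Optional[int] = None,
    gender: Optional[str] = None,
    sort_by: str = "age_days",
) -> List[Dict]:
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for s in sessions:
        if keyword is not None and keyword.lower() not in s["description"].lower():
            continue
        if min_age_days is not None and s["age_days"] < min_age_days:
            continue
        if max_age_days is not None and s["age_days"] > max_age_days:
            continue
        if gender is not None and s["gender"] != gender:
            continue
        results.append(s)
    return sorted(results, key=lambda s: s[sort_by])


def count_by(sessions: List[Dict], field: str) -> Dict[str, int]:
    """Summarize sessions as counts per value of one field."""
    counts: Dict[str, int] = {}
    for s in sessions:
        counts[s[field]] = counts.get(s[field], 0) + 1
    return counts
```

For example, `find_sessions(SESSIONS, keyword="crawling")` keeps the two crawling sessions and orders them by age, and `count_by(SESSIONS, "gender")` gives the kind of aggregate count the slide mentions.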

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

[Diagram: Session, April 7, 2014, containing intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant

- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task

- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
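
The session and metadata slides above describe a small data model: a session carries a release level, a date, pilot and exclusion flags, participants, and files. The sketch below is one possible way to express part of that model in Python. It is an illustration under assumptions, not Databrary's schema: the class and field names (`Participant`, `Session`, `excluded_reason`, `age_in_days`) and the sample subject are hypothetical, and only a few of the listed fields are included.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of the slide's fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Session:
    """The basic unit: data collected at the same time, covered by one release level."""
    release: str                            # e.g. "Shared + excerpts"
    session_date: Optional[date] = None     # may be unknown
    pilot: bool = False                     # methods not finalized or standard
    excluded_reason: Optional[str] = None   # None means the session is not excluded
    participants: List[Participant] = field(default_factory=list)
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Hypothetical example values.
baby = Participant(subject_id="S01", birthdate=date(2013, 4, 7))
session = Session(
    release="Shared",
    session_date=date(2014, 4, 7),
    participants=[baby],
    files=["intake.mp4", "baby-cam.mp4"],
)
```

Keeping birthdate and session date as optional values reflects the slides' points that both are identifiable and that a session date can be unknown; a derived age is only computed when both are present.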

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
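
The idea that studies include sessions from datasets and layer files over them can be sketched as follows. This is a minimal illustration assuming a study references a dataset rather than copying it; the `Dataset` and `Study` classes, the `files_for` method, and the session identifiers are hypothetical and are not taken from the Databrary software.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw files."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Selects sessions from a dataset and layers analysis files over them."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files for the session followed by this study's layered files."""
        if session_id not in self.included:
            raise KeyError(f"{session_id} is not part of study {self.name!r}")
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])


raw = Dataset("example dataset", {
    "2014-04-07": ["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
    "pilot-01": ["baby-cam.mp4"],
})
original = Study(
    "original study", raw,
    included=["2014-04-07"],
    layers={"2014-04-07": ["coding-spreadsheet.opf", "analysis.sav"]},
)
repurposed = Study("repurposed study", raw, included=["2014-04-07", "pilot-01"])
```

Because both studies point at the same dataset, the second study can pick up a pilot session that the first one left out, without duplicating or altering the raw files.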

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

[Diagram: Session, April 7, 2014, containing intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
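
Merging conceptually identical tags, or letting a search for "crawl" also find "crawling" and "crawlers", both come down to mapping tag variants onto a shared form. The sketch below shows the idea with a deliberately naive suffix-stripping rule. It is an assumption-laden illustration, not a real stemmer and not how Databrary or any search engine does it; `SUFFIXES`, `normalize`, and `merge_tags` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Naive suffix list, longest first; a real system would use a proper stemmer.
SUFFIXES = ("ers", "ing", "er", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common suffix if a stem of 3+ letters remains."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= 3:
            return stem
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tag variants under their normalized form, keeping input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)
```

A curator could review the groups returned by `merge_tags` before merging anything, which fits the "clean up after the fact" option; the same `normalize` step applied to a query and to stored tags gives the looser matching described for search.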

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

[Roadmap diagram, repeated as an animation build on slides 20-29. Labels: View & Use Videos; Videos to Share; Upload; Upload After the Fact; Upload As You Go; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Datavyu (added on slide 29)]

Roadmap
Data dump: Datasets are browsable but not searchable at session level
Structured datasets and studies: Assign appropriate metadata to each session

20

Roadmap
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

[Roadmap diagram]

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
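The discovery steps on this slide (searching, filtering, sorting, summarizing) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`age_months`, `gender`, `description`) and all function names are hypothetical and do not describe the actual Databrary API.

```python
from typing import Dict, List

# Hypothetical session records; the field names are illustrative only.
sessions: List[Dict] = [
    {"id": 1, "age_months": 12, "gender": "F", "description": "infant crawling on slope"},
    {"id": 2, "age_months": 18, "gender": "M", "description": "toddler walking over bridge"},
    {"id": 3, "age_months": 13, "gender": "F", "description": "walking on slope"},
]


def search(records, keyword):
    """Searching: keep records whose text description mentions the keyword."""
    return [r for r in records if keyword.lower() in r["description"].lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range."""
    return [r for r in records if min_months <= r["age_months"] <= max_months]


def sort_by(records, key):
    """Sorting: order results so they can be scanned visually."""
    return sorted(records, key=lambda r: r[key])


def count_by(records, key):
    """Summarizing: count records per value, e.g. sessions per gender."""
    counts = {}
    for r in records:
        counts[r[key]] = counts.get(r[key], 0) + 1
    return counts


# Chain the steps: search, then filter, then sort.
hits = sort_by(filter_by_age(search(sessions, "slope"), 12, 14), "age_months")
```

Keeping each step as a separate function mirrors the slide's split between searching, filtering, and sorting, so the steps can be combined in any order.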

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)
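A few of these user stories (pasting in session metadata, tracking what has been entered, exporting for analysis) can be sketched with Python's standard `csv` module. The column names, subject IDs, and function names below are hypothetical and chosen only for illustration; this is not how Databrary itself stores or exports metadata.

```python
import csv
import io

# Hypothetical spreadsheet-style rows a researcher might paste in.
PASTED = """session,subject_id,condition,video
2014-04-07,S01,slope,baby-cam.mp4
2014-04-08,S02,flat,
"""


def import_sessions(text):
    """Parse pasted, comma-separated rows into session records."""
    return list(csv.DictReader(io.StringIO(text)))


def missing_video(records):
    """Track which data were entered: list subjects with no video yet."""
    return [r["subject_id"] for r in records if not r["video"]]


def export_sessions(records):
    """Export previously entered metadata as CSV text for analysis."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


records = import_sessions(PASTED)
```

In this sketch, export simply reverses import, so metadata entered during data collection can go back out to SPSS, Excel, or similar tools unchanged.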

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier
- Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES
- Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description
- Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables
- Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables
- Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
- Represents a particular setting or other variable aspect of where/when/how session data were collected
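The session model described on the preceding slides (a session with a release level, date, pilot and exclusion flags, participants, tasks, a location, and a list of files) can be sketched as Python dataclasses. This is a minimal illustration under assumptions, not Databrary's schema: the class and field names are hypothetical, only a subset of the listed variables is included, and the subject ID and task name are made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Task:
    name: str
    summary: str = ""


@dataclass
class Location:
    setting: str                  # e.g. "lab", "home", "classroom"
    country: Optional[str] = None


@dataclass
class Session:
    """One data-collection session: the basic unit described in the slides."""
    release_level: str                     # e.g. "Shared"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, if both the birthdate and the session date are known."""
        if participant.birthdate is None or self.session_date is None:
            return None
        return (self.session_date - participant.birthdate).days


# Hypothetical example session using file names from the slide.
session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("S01", birthdate=date(2013, 4, 7))],
    tasks=[Task("free play")],
    location=Location("lab", country="US"),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Making the date and birthdate optional reflects the slides' point that a session date can be unknown and that real data always have gaps.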

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
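The layering idea on this slide can be sketched as follows: a dataset holds the raw files per session, and each study adds its own files over the same sessions without modifying the raw data. The class names, the `files_for` helper, and the session IDs are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions."""
    name: str
    # Hypothetical layout: session ID -> raw files recorded in that session.
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, and papers layered over dataset sessions."""
    title: str
    # Session ID -> files this study layers over the raw session.
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset, followed by this study's layered files."""
        raw_files = dataset.sessions.get(session_id, [])
        return raw_files + self.layers.get(session_id, [])


raw = Dataset("locomotion", {"s1": ["baby-cam.mp4", "overhead-cam.mp4"]})
original = Study("original paper", {"s1": ["coding-spreadsheet.opf"]})
reuse = Study("re-purposed analysis", {"s1": ["analysis.sav"]})
```

Because two studies can layer different files over the same session, the raw dataset is stored once and its provenance stays traceable when the data are re-purposed.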

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
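To make the tag-merging and "crawl"/"crawling"/"crawlers" idea concrete, here is a toy sketch that groups tags by a crude stem. It is deliberately naive suffix stripping, not real semantic search or a proper stemming algorithm, and the function names are hypothetical.

```python
def normalize_tag(tag: str) -> str:
    """Reduce a tag to a crude stem: lowercase it and strip one common suffix."""
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "s"):
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a stem, as a curator might merge identical concepts."""
    merged = {}
    for tag in tags:
        merged.setdefault(normalize_tag(tag), []).append(tag)
    return merged


groups = merge_tags(["crawl", "crawling", "crawlers", "Walking", "walk"])
```

A rule this simple will mis-merge many real tags, which is one reason the slide pairs automated matching with pop-up suggestions and after-the-fact clean-up by curators.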

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Thank you

130

Roadmap
[Diagram labels: Upload Videos to Share; View & Use Videos]
- Data dump
  - Datasets are browsable but not searchable at session level
- Structured datasets and studies
  - Assign appropriate metadata to each session

20

Roadmap
[Diagram labels: Upload Videos to Share; View & Use Videos]
- Videos are identifiable
  - So each video (session) must have a release level
  - Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram built up across slides 22-34. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Datavyu; Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos]
- Upload After the Fact
  - Incentives: preservation, data easily available to collaborators, mensch
  - Disincentives: time, effort, $
- Upload As You Go
  - Incentives: preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
  - Disincentives: change current practices
- IRB, Ask Ss to Share
  - Incentives: use of excerpts
  - Disincentives: none
- Grants & Contracts
  - Incentives: full access
  - Disincentives: none

34

Timeline
[Diagram built up across slides 35-41. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos. Dates shown across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
- Available restrictions on identifiable data
  - Did not ask: if undocumented, functions as private
  - Private: restricted to data owners and editors
  - Shared: restricted to authorized Databrary investigators
  - Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: available to the public
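The release levels above form an ordered scale from most to least restricted, which can be sketched as a Python enum with a simple access check. This is only an illustration: the enum name, the three viewer roles, and the `can_view` helper are hypothetical and are not Databrary's actual implementation. Treating "Did not ask" as private follows the slide.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be shown
    PUBLIC = 4           # available to the public


# Hypothetical viewer roles, used only for this illustration.
OWNER, AUTHORIZED, ANYONE = "owner", "authorized", "anyone"


def can_view(role: str, level: ReleaseLevel) -> bool:
    """Return True if a viewer with this role may see the full session video."""
    if role == OWNER:
        return True
    if role == AUTHORIZED:
        return level >= ReleaseLevel.SHARED
    return level == ReleaseLevel.PUBLIC
```

Using an ordered enum lets the check for authorized investigators be written as a single comparison, and it makes the "Did not ask" default fall on the restrictive end of the scale.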

47

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study): before
  - Release to share: after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Diagram: access tiers]
- Tiers: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES
- Actions
  - Access publicly shared data
  - Access and share Databrary data
  - Upload data
  - Access released data selectively shared
  - Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
  How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
  TalkBank/CHILDES
  Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
  Societies
  - Report on developmental society meeting

121

Opportunities
Potential for partnering
  Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    Developmental science only
    Psychology
    Neuroscience

122

Opportunities
Potential for partnering
  Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
  Sustainability component
  - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
  Domain-focus, visibility
  - Your colleagues are here
  Policies to share identifiable research data, especially recordings
  University & Federal support
  Stability and long-term sustainability
  Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
  Cloud storage/data sharing
Offer new services via APIs
  Lab management
  Project/task management
  Data visualization
  Social media, communications
  Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
  Visualization
  Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Roadmap
[Diagram labels: Upload Videos to Share; View & Use Videos]
Data dump
  Datasets are browsable but not searchable at session level
Structured datasets and studies
  Assign appropriate metadata to each session

20

Roadmap
[Diagram labels: Upload Videos to Share; View & Use Videos]
Videos are identifiable
So each video (session) must have a release level
Dataset dump not possible because metadata tags for each session are required

21

Roadmap
[Diagram build, slides 22-25: Upload Videos to Share, with Upload After the Fact and Upload As You Go added in the later builds]

25

Roadmap
[Diagram labels: Upload Videos to Share; Upload After the Fact; Upload As You Go]
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Roadmap
[Diagram build, slides 27-29: Upload Videos to Share, Upload After the Fact, and Upload As You Go shown with the research workflow labels Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; the final build adds Datavyu]

29

[Diagram labels: Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload After the Fact, Upload As You Go]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram labels: IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload After the Fact, Upload As You Go]

31

Roadmap
[Diagram labels: IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload After the Fact, Upload As You Go]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram labels: Grants & Contracts, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload After the Fact, Upload As You Go]

33

[Diagram labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload As You Go, Upload After the Fact; View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline
[Diagram build, slides 35-41: the roadmap diagram (Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share, Upload As You Go, Upload After the Fact; View & Use Videos) with date markers added across builds: 2018-04, 2017-07, 2014-04, 2015-04, 2016-04, 2014-09, 2015-09, 2016-09]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
  Informed consent to share videos
  Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
  Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
  Did not ask: If undocumented, functions as private
  Private: Restricted to data owners and editors
  Shared: Restricted to authorized Databrary investigators
  Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  Public: Available to the public

47
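
As a rough illustration of how the release levels above could translate into an access check, here is a minimal sketch. The `Viewer` roles are a simplification introduced for this example, and `can_view` and `excerpts_allowed` are hypothetical names, not Databrary's implementation. Treating "public" as also permitting excerpts is an assumption.

```python
from enum import Enum


class Release(Enum):
    """Release levels named on the slide."""
    NOT_ASKED = "did not ask"
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    """Hypothetical viewer roles, simplified for this sketch."""
    ANYONE = "anyone"
    AUTHORIZED = "authorized Databrary investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see a session at this release level."""
    if viewer is Viewer.OWNER:
        return True  # owners and editors can always see their own data
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False  # "did not ask" functions as private
    if release is Release.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS: authorized investigators only
    return viewer is Viewer.AUTHORIZED


def excerpts_allowed(release: Release) -> bool:
    """Assumption: excerpts may be shown for 'shared + excerpts' and 'public'."""
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

Encoding "did not ask" as its own level, rather than folding it into "private", keeps a record of whether participants were ever asked while still restricting access in the same way.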

What data are being released
Videos from a session (or part of a session)
  Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
  Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
  Consent to participate (in the study) before
  Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
  Tested on our own participants
Adaptable and flexible
  Only local IRB must approve
  Can collect decisions of all depicted individuals on one page
Used in many contexts
  International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
  Clips for talks, teaching
  Demonstration of specific procedure or method
  Relations between behaviors and codes
Why ask for excerpts separately?
  Effectively public
  Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
  Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
  No requirement to contribute
Institutional sign-off by Authorizing Official
  Grants and contracts
  No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
  Regardless of authorization
Four pre-selected options
  Lab and collaborators only
  Databrary data only
  Lab data and Databrary data
  Proxy

57

[Diagram: user groups and what they can do]
User groups: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES
Capabilities shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
  NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
  Does not go to the IRB office for signing
  Further IRB approval not required for authorization
  Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
  Where are researchers getting hung up?
  Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
  Once schools are comfortable with Investigator Agreement
  No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
  Register and request authorization
  Explore the site and tell us what you think
  We will authorize you temporarily through the beta if needed
  Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
  Invites to ICIS, CDS, SRCD members
  Conference exhibits at major academic society meetings
  Workshops on sharing identifiable data
Timesavers as incentives
  Developing 'boilerplate' Databrary language for grant proposals and reports
  Data management plans (partner with DMPtool)
  Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  Target simple coding tasks on-line
  Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
  Piled up into studies
  Minimal organization
"Static" views
  Single presentation option
  Limited flexibility

71

Development progress
Revised technical requirements
  Flexible discovery, browsing options
  Responsive, intuitive editing, uploading
  High-level visualizations, summaries
  API, scripting interface
Built new, completely dynamic JS interface
  Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
  Missing, multiple videos
  Don't fit cleanly in defined groups
  Excluded and later re-purposed data
  Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
  Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
  Searching: by keyword, text descriptions, names, etc.
  Filtering: limiting results by age, gender, numeric values, etc.
  Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
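
The discovery steps listed above (searching, filtering, sorting, aggregating) can be sketched over a handful of in-memory records. Everything here is illustrative: the `SessionRow` fields, the function names, and the sample rows are made up for the example and do not describe the Databrary API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRow:
    """Hypothetical flattened view of one session, used only for discovery."""
    name: str
    description: str
    age_months: float
    gender: str


def search(rows, keyword):
    """Searching: case-insensitive keyword match on name or description."""
    needle = keyword.lower()
    return [r for r in rows
            if needle in r.name.lower() or needle in r.description.lower()]


def filter_by_age(rows, min_months, max_months):
    """Filtering: keep rows whose age falls in an inclusive range."""
    return [r for r in rows if min_months <= r.age_months <= max_months]


def sort_by_age(rows):
    """Sorting: youngest first, so results are easy to scan visually."""
    return sorted(rows, key=lambda r: r.age_months)


def count_by_gender(rows):
    """Aggregating: a simple count per gender value."""
    return Counter(r.gender for r in rows)


# Made-up sample data for illustration only.
rows = [
    SessionRow("s01", "infant crawling on slope", 11.0, "F"),
    SessionRow("s02", "toddler walking over bridge", 14.5, "M"),
    SessionRow("s03", "crawling in free play", 9.5, "M"),
]
hits = sort_by_age(filter_by_age(search(rows, "crawling"), 9, 12))
```

Keeping each step as a separate function means the same pieces can be combined within one dataset or across several, which is the "within and across datasets" goal on the slide.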

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
[Diagram: Session, April 7, 2014, containing intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
  Did not meet inclusion criteria
  Procedural/experimenter error
  Withdrew/fussy/tired
  Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically developing
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

96
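
The session-centered model described on the preceding slides can be sketched with plain dataclasses. This is a minimal illustration, assuming simplified field types: the class and field names below are hypothetical and are not Databrary's actual schema. Tasks, conditions, and groups are reduced to name strings for brevity.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: files collected at the same time, plus metadata."""
    release: str                            # e.g. "private", "shared"
    session_date: Optional[date] = None     # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None   # None means the session is included
    participants: list[Participant] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)
    conditions: list[str] = field(default_factory=list)
    groups: list[str] = field(default_factory=list)
    setting: Optional[str] = None           # e.g. "lab", "home"
    files: list[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded when a reason has been recorded."""
        return self.excluded_reason is not None


def age_in_days(participant: Participant, session: Session) -> Optional[int]:
    """Age at test, or None when either date is unknown."""
    if participant.birthdate is None or session.session_date is None:
        return None
    return (session.session_date - participant.birthdate).days
```

Making the session date and the birthdate optional reflects the point that a date can be unknown, and deriving age from the two dates avoids storing a value that could drift out of sync.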

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
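
One way to picture the dataset/study split is a study that points at sessions in a dataset and layers its own files over them without touching the raw data. The sketch below is an assumption-laden illustration: `Dataset`, `Study`, and `files_for` are hypothetical names, and sessions are reduced to lists of file names.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Raw data: session id -> raw files collected in that session."""
    name: str
    sessions: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: a selection of sessions plus layered analysis files."""
    name: str
    dataset: Dataset
    session_ids: list[str] = field(default_factory=list)
    layers: dict[str, list[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> list[str]:
        """Raw files for a session followed by this study's layered files."""
        if session_id not in self.session_ids:
            raise KeyError(f"{session_id} is not part of study {self.name!r}")
        # Concatenation builds a new list, so the dataset is never modified.
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])
```

Because a study only references session ids, the same raw session can appear in several studies, each with its own layer of coding or analysis files.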

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on
[Diagram: Session, April 7, 2014, containing intake.mp4, baby-cam.mp4, coding-spreadsheet.opf, overhead-cam.mp4, analysis.sav, survey.doc]

98

Datasets and Studies
One to one mapping likely to be rare
  Excluded Ss, pilot Ss
  In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
  Studies, no datasets
Repurposing data
  Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
  Publication abstract
  Excerpts only
  Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
  Subgroups can do this (emo-group)
Not standardizing codes
  Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
  Is this enough? Too much? Are there others?
  Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
  Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
  Video or session level searches
  Aggregating across datasets
But possible change in current practices
More work for PIs to document measures

103


View amp UseVideos

Videos toShareUpload

Roadmap

Videos are identifiableSo each video (session) must have a release levelDataset dump not possible because metadata tags for each session are required

21

Videos toShareUpload

Roadmap

22

Videos toShareUpload

Roadmap

23

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
  Informed consent to share videos
  Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
  Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
  Did not ask: if undocumented, functions as private
  Private: restricted to data owners and editors
  Shared: restricted to authorized Databrary investigators
  Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  Public: available to the public
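
The release levels above can be read as an ordered scale. The following is a minimal sketch of that reading, not Databrary's implementation: the `Viewer` roles, the function names, and the choice to allow excerpts at the public level are all illustrative assumptions.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles (names are assumptions, not from the slides)."""
    ANYONE = 0
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see a session's full recordings."""
    if viewer is Viewer.OWNER:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer is Viewer.AUTHORIZED
    return False  # NOT_ASKED and PRIVATE stay with owners and editors


def excerpts_allowed(release: Release) -> bool:
    """Return True if excerpts may be shown (assumes PUBLIC also permits this)."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest value keeps undocumented sessions private by default, which matches the slide's wording.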

47

What data are being released
Videos from a session (or part of a session)
  Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
  Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
  Consent to participate (in the study) before
  Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
  Tested on our own participants
Adaptable and flexible
  Only local IRB must approve
  Can collect decisions of all depicted individuals on one page
Used in many contexts
  International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
  Clips for talks, teaching
  Demonstration of specific procedure or method
  Relations between behaviors and codes
Why ask for excerpts separately?
  Effectively public
  Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
  Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
  No requirement to contribute
Institutional sign-off by Authorizing Official
  Grants and contracts
  No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
  Regardless of authorization
Four pre-selected options
  Lab and collaborators only
  Databrary data only
  Lab data and Databrary data
  Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
  NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
  Does not go to the IRB office for signing
  Further IRB approval not required for authorization
  Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
  Where are researchers getting hung up?
  Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
  Once schools are comfortable with Investigator Agreement
  No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
  Register and request authorization
  Explore the site and tell us what you think
  We will authorize you temporarily through the beta if needed
  Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
  Invites to ICIS, CDS, SRCD members
  Conference exhibits at major academic society meetings
  Workshops on sharing identifiable data
Timesavers as incentives
  Developing 'boilerplate' Databrary language for grant proposals and reports
  Data management plans (partner with DMPtool)
  Resource sharing plans

65


Break

67


Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  Target simple coding tasks on-line
  Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
  Piled up into studies
  Minimal organization
"Static" views
  Single presentation option
  Limited flexibility

71

Development progress
Revised technical requirements
  Flexible discovery, browsing options
  Responsive, intuitive editing, uploading
  High-level visualizations, summaries
  API, scripting interface
Built new, completely dynamic JS interface
  Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
  Missing, multiple videos
  Don't fit cleanly in defined groups
  Excluded and later re-purposed data
  Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
  Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
  Searching: by keyword, text descriptions, names, etc.
  Filtering: limiting results by age, gender, numeric values, etc.
  Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
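
The discovery steps listed above (search, filter, sort, summarize) can be sketched over a small in-memory list. This is only an illustration under stated assumptions: `SessionRecord`, its fields, and the sample rows are hypothetical and do not describe the Databrary API or its data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical flat record for one session; field names are assumptions."""
    dataset: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: keep records whose description contains the keyword."""
    kw = keyword.lower()
    return [r for r in records if kw in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range in months."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Sorting: order results youngest first for visual scanning."""
    return sorted(records, key=lambda r: r.age_months)


# Made-up example rows, for illustration only.
records = [
    SessionRecord("A", "Infant crawling over a bridge", 11.0, "F"),
    SessionRecord("A", "Infant walking down a slope", 14.5, "M"),
    SessionRecord("B", "Free play, crawling and cruising", 9.5, "M"),
]

hits = sort_by_age(filter_by_age(search(records, "crawling"), 9, 12))
gender_counts = Counter(r.gender for r in records)  # a simple aggregate view
print([(r.dataset, r.age_months) for r in hits], dict(gender_counts))
```

Because each step takes and returns a plain list, the steps can be chained in any order, which is one simple way to let results cross dataset boundaries.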

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
  intake.mp4
  baby-cam.mp4
  coding-spreadsheet.opf
  overhead-cam.mp4
  analysis.sav
  survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
  Did not meet inclusion criteria
  Procedural/experimenter error
  Withdrew/fussy/tired
  Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
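
The session metadata described in slides 86-96 can be sketched as a pair of record types. This is a simplified illustration, not the Databrary schema: the class and field names, the subset of fields chosen, and the `age_in_days` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: files collected at the same time under one release level."""
    release: str = "not asked"
    session_date: Optional[date] = None    # identifiable; may be unknown
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None          # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Made-up example values, for illustration only.
child = Participant(subject_id="S01", birthdate=date(2013, 4, 7), gender="F")
session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[child],
    tasks=["free play"],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4"],
)
print(session.age_in_days(child))
```

Keeping the date optional reflects the slide's note that a session date could be unknown, in which case derived values such as age are simply unavailable.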

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
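
One way to picture the dataset and study relationship is a study that references sessions and keeps its own layered files. The sketch below is an assumption-laden illustration: the `Dataset` and `Study` classes, their methods, and the sample file names are hypothetical, not Databrary's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, papers: references sessions and layers files."""
    name: str
    included: List[Tuple[str, str]] = field(default_factory=list)
    layers: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str, layer_files=()):
        """Reference a session from a dataset and attach study-specific files."""
        if session_id not in dataset.sessions:
            raise KeyError(session_id)
        key = (dataset.name, session_id)
        if key not in self.included:
            self.included.append(key)
        self.layers.setdefault(key, []).extend(layer_files)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files plus this study's layer; the dataset itself is not modified."""
        key = (dataset.name, session_id)
        return dataset.sessions[session_id] + self.layers.get(key, [])


# Made-up example values, for illustration only.
raw = Dataset("locomotion", {"s1": ["baby-cam.mp4", "overhead-cam.mp4"],
                             "s2": ["intake.mp4"]})
study_a = Study("original paper")
study_a.include(raw, "s1", ["coding-spreadsheet.opf"])
study_b = Study("re-analysis")
study_b.include(raw, "s1", ["analysis.sav"])
print(study_a.files_for(raw, "s1"), study_b.files_for(raw, "s1"))
```

Because each study stores only references and its own layer, two studies can re-purpose the same session without copying or altering the raw files.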

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
  intake.mp4
  baby-cam.mp4
  coding-spreadsheet.opf
  overhead-cam.mp4
  analysis.sav
  survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
  Excluded Ss, pilot Ss
  In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
  Studies, no datasets
Repurposing data
  Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
  Publication abstract
  Excerpts only
  Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
  Subgroups can do this (emo-group)
Not standardizing codes
  Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
  Is this enough? Too much? Are there others?
  Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
  Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
  Video or session level searches
  Aggregating across datasets
But possible change in current practices
  More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
  Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
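
The idea of a "crawl" query also finding "crawling" and "crawlers" can be approximated with crude suffix stripping. This is a deliberately simple stand-in, not true semantic search and not how any search engine works: the suffix list and the function names are assumptions for illustration.

```python
from collections import defaultdict

# Assumed suffix list, longest first so "crawlers" loses "ers" rather than "s".
SUFFIXES = ("ers", "er", "ing", "ed", "s")


def stem(word):
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    w = word.lower().strip()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def matches(query, tag):
    """True when the query and the tag reduce to the same stem."""
    return stem(query) == stem(tag)


def merge_tags(tags):
    """Group tags that share a stem, as a curator might when cleaning up."""
    groups = defaultdict(list)
    for tag in tags:
        groups[stem(tag)].append(tag)
    return dict(groups)


merged = merge_tags(["crawl", "Crawling", "crawlers", "walking"])
print(merged)
```

The same grouping could back pop-up suggestions while a user types a tag, or be run by curators after the fact. A real system would likely use a proper stemmer or synonym list, since suffix stripping mishandles many English words.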

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
  How is this done now?
  How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
  Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107


Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
  TalkBank/CHILDES
  Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  Societies
    - Report on developmental society meeting

121

Opportunities
Potential for partnering
  Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      • Developmental science only
      • Psychology
      • Neuroscience

122

Opportunities
Potential for partnering
  Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
  Domain-focus, visibility
    - Your colleagues are here
  Policies to share identifiable research data, especially recordings
  University & Federal support
  Stability and long-term sustainability
  Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
  Cloud storage/data sharing
Offer new services via APIs
  Lab management
  Project/task management
  Data visualization
  Social media, communications
  Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
  Visualization
  Additional analysis streams

128


Thank you

130

Roadmap

[Figure: Roadmap diagram, built up across slides 22-34. Labels: Upload, Videos to Share, Upload After the Fact, Upload As You Go, Ask Ss to Share, Data Collection, Coding/Analyses/Writing, Paper in Press, Datavyu, IRB, Grants & Contracts, View & Use Videos.]

Upload After the Fact
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

26

Upload As You Go
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

IRB, Ask Ss to Share
Incentives: Use of excerpts
Disincentives: None

32

Grants & Contracts
Incentives: Full access
Disincentives: None

34

[Figure: Timeline (slides 35-38). Same labels as the Roadmap diagram, with dates marked across the builds: 2014-04, 2015-04, 2017-07, 2018-04.]

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
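The session metadata described on slides 86 to 96 can be sketched as a small set of record types. The field names below follow the slide wording, but the class layout, defaults, and the `age_in_days` helper are assumptions made for illustration, not Databrary's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class NamedItem:
    """Shared shape for Task, Condition, and Group: name, summary, description."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    setting: str                       # e.g. lab, home, classroom
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    release_level: str = "Not asked"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[NamedItem] = field(default_factory=list)
    conditions: List[NamedItem] = field(default_factory=list)
    groups: List[NamedItem] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


def age_in_days(participant, session):
    """Age at test, available only when both dates are known."""
    if participant.birthdate is None or session.session_date is None:
        return None
    return (session.session_date - participant.birthdate).days


# Made-up example values, for illustration only.
infant = Participant(subject_id="S01", birthdate=date(2013, 4, 7), gender="F")
session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=[infant],
    tasks=[NamedItem(name="free play")],
    location=Location(setting="lab", country="US"),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Making the date and exclusion reason optional reflects the slides' points that a session date can be unknown and that exclusion carries a reason.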

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
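One way to picture "studies include sessions from datasets and layer files over them" is a study that holds references to sessions rather than copies, plus a per-session list of layered analysis files. All names below (`SessionRef`, `Study`, `include`, `layer`) are hypothetical and serve only as a sketch of the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SessionRef:
    """Points at a session inside a dataset without copying its raw files."""
    dataset: str
    session_id: str


@dataclass
class Study:
    title: str
    sessions: list = field(default_factory=list)  # SessionRef items drawn from datasets
    layers: dict = field(default_factory=dict)    # SessionRef -> analysis file names

    def include(self, ref):
        """Add a session to the study; the raw data stay in their dataset."""
        if ref not in self.sessions:
            self.sessions.append(ref)

    def layer(self, ref, filename):
        """Attach an analysis file over a session already in the study."""
        if ref not in self.sessions:
            raise KeyError("session is not part of this study")
        self.layers.setdefault(ref, []).append(filename)

    def datasets_used(self):
        """Provenance: which datasets this study draws on."""
        return sorted({ref.dataset for ref in self.sessions})


# Made-up example: one study re-purposing sessions from two datasets.
study = Study(title="Re-coded locomotion")
a = SessionRef(dataset="slopes-2012", session_id="s01")
b = SessionRef(dataset="free-play-2013", session_id="s07")
study.include(a)
study.include(b)
study.layer(a, "recoding.opf")
```

Because the study stores references, the same session can appear in several studies, which matches the slide's point that a one-to-one mapping between datasets and studies is likely to be rare.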

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
Excluded Ss, pilot Ss: in interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only: studies, no datasets
Repurposing data: studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)

104
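The "crawl" finds "crawling", "crawlers" idea can be illustrated with a deliberately crude suffix-stripping stemmer that maps related tags to one key. This is a toy under stated assumptions: the suffix list and the function names are invented for the example, and a real system would use a proper stemming or search library.

```python
def naive_stem(word):
    """Strip a few common English suffixes. Crude on purpose; real stemmers do far more."""
    word = word.lower()
    for suffix in ("ing", "ers", "er", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a stem, as a curator might merge them after the fact."""
    merged = {}
    for tag in tags:
        merged.setdefault(naive_stem(tag), []).append(tag)
    return merged


def search_tags(tags, query):
    """Return every tag whose stem matches the stem of the query."""
    return merge_tags(tags).get(naive_stem(query), [])


# Made-up tags, for illustration only.
tags = ["Crawling", "crawlers", "crawl", "walking"]
merged = merge_tags(tags)
```

The same grouping serves both options on the slide: curators could use it to merge tags, or search could apply it at query time and leave the stored tags untouched.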

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
How to compete/cooperate with commercial entities

113

Your thoughts?
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  • Developmental science only
  • Psychology
  • Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts?
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts?
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
Visualization
Additional analysis streams

128

Thank you

130

Roadmap

[Diagram, built up over slides 23-27: Upload Videos to Share, via Upload After the Fact or Upload As You Go]

Upload After the Fact
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

23-27

Roadmap

[Diagram, slides 28-29: research workflow (Data Collection, Coding, Analyses, Writing, Paper in Press) with Ask Ss to Share, alongside Upload Videos to Share (Upload After the Fact, Upload As You Go). Slide 29 adds Datavyu.]

28-29

Upload As You Go
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

[Diagram labels: Upload After the Fact, Upload Videos to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Ask Ss to Share]

30

Roadmap

[Diagram, slide 31: adds IRB to the workflow (IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share via Upload After the Fact or Upload As You Go)]

31

Roadmap

IRB, Ask Ss to Share
Incentives: Use of excerpts
Disincentives: None

32

Roadmap

[Diagram, slide 33: adds Grants & Contracts to the workflow]

33

Grants & Contracts
Incentives: Full access
Disincentives: None

[Diagram adds View & Use Videos alongside Upload Videos to Share]

34

Timeline

[Diagram, built up over slides 35-41: the roadmap (Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press; Upload Videos to Share via Upload As You Go or Upload After the Fact; View & Use Videos) annotated with dates. Dates shown: 2018-04 (slide 35); 2017-07 (slides 36-41); 2014-04, 2015-04, 2016-04 (slides 37-39); 2014-09, 2015-09, 2016-09 (slides 40-41).]

35-41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
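The release levels above read naturally as an ordered scale with an access check. The sketch below is an assumption-laden illustration: the `Release` and `Viewer` enums, the three viewer roles, and the choice to treat public sessions as also permitting excerpts are all invented for the example and are not Databrary's implementation.

```python
from enum import IntEnum


class Release(IntEnum):
    NOT_ASKED = 0
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Viewer(IntEnum):
    """Hypothetical viewer roles, from least to most privileged."""
    PUBLIC = 0
    AUTHORIZED = 1   # authorized Databrary investigator
    OWNER = 2        # data owner or editor


def effective(level):
    """'Did not ask' functions as private."""
    return Release.PRIVATE if level == Release.NOT_ASKED else level


def can_view(level, viewer):
    """Return True if this viewer may see a session at this release level."""
    level = effective(level)
    if level == Release.PUBLIC:
        return True
    if level in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    return viewer == Viewer.OWNER


def excerpts_allowed(level):
    """Excerpts may be shown for 'shared + excerpts'; public is assumed to allow them too."""
    return effective(level) >= Release.SHARED_EXCERPTS
```

For example, `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is false because an undocumented release falls back to private, while `can_view(Release.SHARED, Viewer.AUTHORIZED)` is true.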

What data are being released?
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

24

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

Roadmap
Diagram labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

28

Roadmap
Diagram labels: Ask Ss to Share; Data Collection; Datavyu; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

29

Diagram labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
Diagram labels: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

31

Roadmap
Diagram labels: IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
Diagram labels: Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share

33

Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Incentives: Full access
Disincentives: None

34

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2018-04

35

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07

36

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07, 2014-04

37

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07, 2015-04, 2014-04

38

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07, 2014-04, 2016-04

39

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07, 2014-09, 2015-09

40

Timeline
Diagram labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos
Dates shown: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
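
The release levels can be read as a small access policy. The following is a minimal Python sketch of that reading, not Databrary's implementation: the names `Release`, `Viewer`, and `can_view` are hypothetical, the three viewer categories are a simplification, and it assumes that "Shared + excerpts" behaves like "Shared" for access to full recordings.

```python
from enum import Enum


class Release(Enum):
    """Release levels named on the slide; identifiers are illustrative."""
    NOT_ASKED = "did not ask"
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    """Hypothetical viewer categories (an assumption, not from the source)."""
    PUBLIC = "public"
    AUTHORIZED = "authorized Databrary investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see the full recording."""
    if viewer is Viewer.OWNER:
        return True   # owners and editors can always see their own data
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False  # an undocumented release functions as private
    if release is Release.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS: authorized investigators only
    return viewer is Viewer.AUTHORIZED
```

Under these assumptions, `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is `False`, which matches the rule that an undocumented release is treated as private.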

47

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

Diagram labels, user groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES
Diagram labels, permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
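
The three discovery operations (searching, filtering, sorting) and the aggregated view compose naturally. The following Python sketch illustrates that composition over in-memory records; the record fields, the sample data, and the functions `discover` and `summarize` are all hypothetical and do not describe the Databrary site or its API.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical session records; field names and values are made up.
SESSIONS: List[Dict] = [
    {"id": "s1", "age_months": 12, "gender": "F", "description": "infant crawling on ramp"},
    {"id": "s2", "age_months": 18, "gender": "M", "description": "toddler walking in lab"},
    {"id": "s3", "age_months": 14, "gender": "F", "description": "infant walking at home"},
]


def discover(records: List[Dict],
             keyword: Optional[str] = None,
             gender: Optional[str] = None,
             max_age: Optional[int] = None,
             sort_key: str = "age_months") -> List[Dict]:
    """Search by keyword, filter by gender and age, then sort the hits."""
    hits = []
    for rec in records:
        if keyword is not None and keyword.lower() not in rec["description"].lower():
            continue  # searching: keyword must appear in the description
        if gender is not None and rec["gender"] != gender:
            continue  # filtering: exact match on a categorical value
        if max_age is not None and rec["age_months"] > max_age:
            continue  # filtering: upper bound on a numeric value
        hits.append(rec)
    return sorted(hits, key=lambda rec: rec[sort_key])  # sorting


def summarize(records: List[Dict], key: str) -> Dict[str, int]:
    """Aggregated view: count records per value of one field."""
    return dict(Counter(rec[key] for rec in records))
```

For example, `discover(SESSIONS, keyword="walking")` returns the two walking sessions ordered by age, and `summarize(SESSIONS, "gender")` gives a count per gender.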

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
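
Taken together, the session metadata types on these slides (release level, date, pilot, excluded, participants, conditions, tasks, groups, location) suggest a record shape along the following lines. This is a minimal Python sketch for illustration only: the class and field names are hypothetical, most participant variables are omitted, and nothing here reflects the actual Databrary schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (fields abridged)."""
    subject_id: str
    birthdate: Optional[date] = None  # identifiable, so it may be withheld
    gender: Optional[str] = None


@dataclass
class Session:
    """One data collection session: the basic unit covered by a release level."""
    release: str                          # e.g. "private", "shared"
    session_date: Optional[date] = None   # may be unknown for archival data
    pilot: bool = False                   # methods not finalized or standard
    excluded_reason: Optional[str] = None # e.g. "Outlier"; None if usable
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None        # e.g. "lab", "home"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason has been recorded."""
        return self.excluded_reason is not None


session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S01")],
    location="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Keeping the exclusion reason as an optional field, instead of a separate flag, is one way to make sure an excluded session always carries its reason.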

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
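
One way to picture the layering idea is a study that refers to dataset sessions by identifier and adds its own analysis files, leaving the raw data untouched. The Python sketch below makes that assumption explicit; `Dataset`, `Study`, and `files_for` are hypothetical names, and the sample file names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: session id mapped to the files recorded in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """A study includes some dataset sessions and layers its own files on top."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)           # session ids
    layers: Dict[str, List[str]] = field(default_factory=dict)  # id -> analysis files

    def files_for(self, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        if session_id not in self.included:
            raise KeyError(session_id)
        # List concatenation builds a new list, so the dataset is not modified.
        return self.dataset.sessions[session_id] + self.layers.get(session_id, [])


raw = Dataset("walking", {"s1": ["overhead-cam.mp4"], "s2": ["baby-cam.mp4"]})
study = Study("gait-recoding", raw, included=["s1"], layers={"s1": ["analysis.sav"]})
```

Because the study only references session identifiers, a second study could reuse session `s1` with different layered files, which is the re-purposing case the slide describes.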

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
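
Both tag merging and the "crawl" example rest on mapping word variants to a common form. The Python sketch below shows the idea with a deliberately naive suffix stripper, which is an assumption made for illustration: a production system would use a real stemmer or a search engine's analyzer, and `stem` and `merge_tags` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Deliberately naive suffix list, checked in order; not a real stemmer.
SUFFIXES = ("ing", "ers", "er", "s")


def stem(tag: str) -> str:
    """Lower-case a tag and strip at most one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group raw tags under a shared stem, as a curator might merge them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[stem(tag)].append(tag)
    return dict(groups)
```

With this sketch, `merge_tags(["crawl", "crawling", "crawlers"])` places all three tags under the stem `crawl`, so a search for any variant could be answered from the same group.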

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media communications
Electronic CV profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

25

Roadmap

UploadAs You Go

Videos toShareUpload

UploadAfter the Fact

Incentives Preservation data easily available to collaborators menschDisincentives Time effort $

26

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Roadmap

27

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
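
The discovery steps on this slide (searching, filtering, sorting, summarizing) could be sketched roughly as follows. This is only an illustration, not the Databrary implementation: the record fields (`name`, `description`, `age_months`, `gender`), the sample data, and the function names are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample records; the field names are assumptions for this sketch.
SESSIONS = [
    {"name": "S1", "description": "infant crawling down slopes", "age_months": 12, "gender": "F"},
    {"name": "S2", "description": "toddler walking over bridges", "age_months": 18, "gender": "M"},
    {"name": "S3", "description": "crawling over gaps", "age_months": 9, "gender": "M"},
]


def discover(sessions, keyword=None, min_age=None, max_age=None,
             gender=None, sort_key="age_months"):
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for s in sessions:
        # Searching: case-insensitive match on the free-text fields.
        text = (s["name"] + " " + s["description"]).lower()
        if keyword is not None and keyword.lower() not in text:
            continue
        # Filtering: limit results by numeric range and category.
        if min_age is not None and s["age_months"] < min_age:
            continue
        if max_age is not None and s["age_months"] > max_age:
            continue
        if gender is not None and s["gender"] != gender:
            continue
        results.append(s)
    # Sorting: order results for visual scanning.
    return sorted(results, key=lambda s: s[sort_key])


def summarize(sessions, field="gender"):
    """Aggregate view: counts per value of one metadata field."""
    return dict(Counter(s[field] for s in sessions))
```

For example, `discover(SESSIONS, keyword="crawling")` returns the two crawling sessions ordered by age, and `summarize(SESSIONS)` gives counts per gender.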

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
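
The session metadata described on the preceding slides could be modeled along these lines. This is a minimal sketch under stated assumptions, not Databrary's actual schema: the class names, field names, and the `age_in_days` helper are hypothetical, and only a subset of the listed variables is shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Record:
    """Shared shape for task, condition, and group entries."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    setting: str  # e.g. lab, home, classroom
    state: str = ""
    country: str = ""


@dataclass
class Session:
    release: str = "not asked"             # one of the five release levels
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Record] = field(default_factory=list)
    conditions: List[Record] = field(default_factory=list)
    groups: List[Record] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


def age_in_days(session: Session, participant: Participant) -> Optional[int]:
    """Age at test, or None when either date is unknown."""
    if session.session_date is None or participant.birthdate is None:
        return None
    return (session.session_date - participant.birthdate).days
```

Keeping the session date and birthdate optional reflects the point that dates are identifiable and may be unknown; derived values such as age then have to tolerate missing inputs.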

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
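
One way to picture the dataset/study relationship is the sketch below, in which a study refers to sessions held by a dataset and adds its own files on top without modifying the raw data. The class and method names are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw files."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Includes sessions from a dataset and layers analysis files over them."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files plus this study's layered files for one included session."""
        if session_id not in self.included:
            raise KeyError(f"session {session_id!r} is not part of study {self.name!r}")
        raw = self.dataset.sessions[session_id]
        # Concatenation builds a new list, so the dataset's raw files stay untouched.
        return raw + self.layers.get(session_id, [])
```

Because the study only references sessions, the same raw session can appear in several studies, each with its own layered files, which is the re-purposing case the slide describes.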

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
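
The "crawl" finds "crawling" idea can be illustrated with a deliberately naive suffix-stripping match. This is a toy sketch, not a real stemmer such as Porter's and not a description of how Databrary or Google search works; the suffix list and function names are assumptions for the example.

```python
# Suffixes are tried longest first so "crawlers" loses "ers" rather than just "s".
SUFFIXES = ("ers", "ing", "er", "s")


def naive_stem(word):
    """Lowercase the word and strip one common suffix, keeping at least 3 letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[:-len(suffix)]
    return w


def search_tags(query, tags):
    """Return the tags whose naive stem matches the query's stem."""
    target = naive_stem(query)
    return [t for t in tags if naive_stem(t) == target]
```

The same stem comparison could also be used to suggest existing tags as a user types, or to flag candidate tags for curators to merge after the fact.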

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated
Weaknesses we neglected or overstated

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

Roadmap (workflow diagram shown as a series of animation builds, slides 26-41; box labels only)
Research workflow: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press
Databrary steps: View & Use Videos, Upload As You Go, Upload After the Fact, Upload Videos to Share, Datavyu

Slide 26
Incentives: Preservation, data easily available to collaborators, mensch
Disincentives: Time, effort, $

Slide 30
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

Slide 32
Incentives: Use of excerpts
Disincentives: None

Slide 34
Incentives: Full access
Disincentives: None

Timeline (slides 35-41; dates overlaid on the workflow diagram)
Slide 35: 2018-04
Slides 36-41: 2017-07
Other dates appearing across slides 37-39: 2014-04, 2015-04, 2016-04
Other dates appearing across slides 40-41: 2014-09, 2015-09, 2016-09

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
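
The release levels above can be read as a simple access rule. The sketch below encodes that reading in code; it is not Databrary's implementation. The `Viewer` categories, the function names, and the choice to treat public data as usable for excerpts are assumptions made for illustration.

```python
from enum import Enum


class Release(Enum):
    NOT_ASKED = "did not ask"
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    ANYONE = "anyone"                 # member of the public
    AUTHORIZED = "authorized"         # authorized Databrary investigator
    OWNER_OR_EDITOR = "owner/editor"  # data owners and editors


def can_view(release, viewer):
    """Apply the restrictions listed for each release level."""
    if viewer is Viewer.OWNER_OR_EDITOR:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer is Viewer.AUTHORIZED
    # PRIVATE, and NOT_ASKED, which functions as private.
    return False


def excerpts_allowed(release):
    """Whether excerpts may be shown beyond authorized investigators.

    Including PUBLIC here is an assumption; the slide states it only for
    "shared + excerpts".
    """
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

Treating "did not ask" the same as private keeps the default restrictive: data with no documented release decision is visible only to its owners and editors.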

47

What data are being released
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after (it's clear what was recorded)

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?
  • Developmental science only
  • Psychology
  • Neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

Roadmap
[Diagram, built up over slides 27-29, with the labels: Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share. Datavyu is added in the last build.]

29

[Roadmap diagram build with an incentives callout]
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap
[Diagram build: an IRB label is added to the roadmap stages]

31

Roadmap
[Diagram build with an incentives callout; the IRB and Ask Ss to Share labels appear together]
Incentives: Use of excerpts
Disincentives: None

32

Roadmap
[Diagram build: a Grants & Contracts label is added to the roadmap stages]

33

[Roadmap diagram build with an incentives callout; includes the labels Grants & Contracts, IRB, and View & Use Videos]
Incentives: Full access
Disincentives: None

34

Timeline
[Diagram, built up over slides 35-41, with the labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload; Upload After the Fact; Videos to Share; View & Use Videos. Date labels appearing across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04.]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public
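
The five release levels can be read as an ordered scale from most to least restrictive. The sketch below encodes that reading; the enum, the function names, and the rule that public data also permits excerpts are illustrative assumptions, not Databrary's actual implementation.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like shared, but excerpts may be shown
    PUBLIC = 4           # available to the public


def can_view(release, is_owner, is_authorized):
    """Decide whether a user may view a session at the given release level."""
    if is_owner:
        return True  # owners and editors always see their own data
    if release == Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return is_authorized
    return False  # NOT_ASKED and PRIVATE stay with the owners


def can_show_excerpt(release):
    """Assumption: excerpts are allowed at 'shared + excerpts' and above."""
    return release >= Release.SHARED_EXCERPTS


if __name__ == "__main__":
    print(can_view(Release.SHARED, is_owner=False, is_authorized=True))
    print(can_show_excerpt(Release.SHARED))
```

Treating "did not ask" as the lowest value means a session with no documented decision can never be exposed by accident.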

47

What data are being released
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
- Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Diagram of user groups and what each can do. Group labels: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES. Capability labels: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
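
The four discovery operations named above (search, filter, sort, aggregate) can be sketched over plain records. This is only an illustration under assumed field names (`description`, `age_months`, `gender`); it does not reflect the Databrary API or schema.

```python
from collections import Counter

# Hypothetical session records; field names and values are illustrative only.
SESSIONS = [
    {"id": "s1", "description": "infant crawling on slope", "age_months": 11, "gender": "F"},
    {"id": "s2", "description": "toddler walking over bridge", "age_months": 14, "gender": "M"},
    {"id": "s3", "description": "infant crawling at home", "age_months": 9, "gender": "M"},
]


def search(records, keyword):
    """Searching: case-insensitive keyword match on the text description."""
    keyword = keyword.lower()
    return [r for r in records if keyword in r["description"].lower()]


def filter_range(records, field, low, high):
    """Filtering: keep records whose numeric field lies in [low, high]."""
    return [r for r in records if low <= r[field] <= high]


def sort_by(records, field):
    """Sorting: order results by one field for visual scanning."""
    return sorted(records, key=lambda r: r[field])


def count_by(records, field):
    """Aggregating: count records per value of a field."""
    return Counter(r[field] for r in records)


if __name__ == "__main__":
    hits = search(SESSIONS, "crawling")
    print([r["id"] for r in filter_range(hits, "age_months", 10, 12)])
    print(count_by(SESSIONS, "gender"))
```

Because each function takes and returns a list of records, the operations compose in any order, which is the property that cross-dataset discovery would need.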

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)
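
Several of these user stories (paste metadata, attach a video, track what is missing, export) can be sketched with the standard library alone. The column names `session_id` and `condition` and all function names are hypothetical; this shows the workflow shape, not the planned interface.

```python
import csv
import io


def import_sessions(pasted_text):
    """Parse pasted CSV text into sessions keyed by a 'session_id' column."""
    reader = csv.DictReader(io.StringIO(pasted_text))
    return {row["session_id"]: {"metadata": dict(row), "videos": []} for row in reader}


def attach_video(sessions, session_id, filename):
    """Associate an uploaded video file with an existing session."""
    sessions[session_id]["videos"].append(filename)


def sessions_missing_video(sessions):
    """Keep track of which sessions still have no video attached."""
    return sorted(sid for sid, s in sessions.items() if not s["videos"])


def export_csv(sessions):
    """Export previously entered data as CSV text for analysis."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["session_id", "condition", "n_videos"], lineterminator="\n"
    )
    writer.writeheader()
    for sid in sorted(sessions):
        entry = sessions[sid]
        writer.writerow({
            "session_id": sid,
            "condition": entry["metadata"].get("condition", ""),
            "n_videos": len(entry["videos"]),
        })
    return out.getvalue()


if __name__ == "__main__":
    sessions = import_sessions("session_id,condition\nS01,slope\nS02,flat\n")
    attach_video(sessions, "S01", "baby-cam.mp4")
    print(sessions_missing_video(sessions))
    print(export_csv(sessions))
```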

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically developing
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
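
The session-centred model described on the preceding slides can be sketched as a pair of dataclasses. The class and field names below are assumptions drawn from the slide wording, not the actual Databrary schema, and only a subset of the listed variables is shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (subset of fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: one release level, one date, files plus metadata."""
    release: str = "not asked"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session is included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None          # e.g. "lab", "home", "classroom"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self):
        return self.excluded_reason is not None

    def age_in_days(self, participant):
        """Derive age at test; None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


if __name__ == "__main__":
    child = Participant("P01", birthdate=date(2013, 4, 7))
    session = Session(
        release="shared",
        session_date=date(2014, 4, 7),
        participants=[child],
        files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
    )
    print(session.age_in_days(child))
```

Storing an exclusion reason rather than a bare flag keeps excluded sessions in the dataset while recording why they were not used.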

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
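
One way to picture the dataset/study split: a dataset owns the raw files per session, and each study selects sessions and layers its own files on top without touching the raw data. The classes and method names here are hypothetical, offered as a sketch of the idea rather than the system's design.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: maps a session id to the raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: selects sessions from a dataset and layers files over them."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def add_layer(self, session_id, filename):
        """Attach an analysis file to an included session."""
        if session_id not in self.included:
            raise KeyError(session_id)
        self.layers.setdefault(session_id, []).append(filename)

    def files_for(self, session_id):
        """Raw files from the dataset followed by this study's layered files."""
        if session_id not in self.included:
            raise KeyError(session_id)
        return list(self.dataset.sessions[session_id]) + self.layers.get(session_id, [])


if __name__ == "__main__":
    raw = Dataset("slopes", {"S01": ["baby-cam.mp4"], "S02": ["baby-cam.mp4", "overhead-cam.mp4"]})
    original = Study("original", raw, included=["S01", "S02"])
    reuse = Study("reanalysis", raw, included=["S02"])
    original.add_layer("S01", "analysis.sav")
    reuse.add_layer("S02", "coding-spreadsheet.opf")
    print(reuse.files_for("S02"))
```

Two studies can draw on the same session while the dataset's own file list stays unchanged, which is what makes re-purposing traceable.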

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after, once it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

(Diagram of user groups and permissions)
User groups: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties
Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
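
The four discovery operations named above (searching, filtering, sorting, summarizing) can be sketched over a handful of session records. This is an illustrative sketch under stated assumptions: the records, field names, and function names are all hypothetical and do not reflect the Databrary API or schema.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]

# Hypothetical session records; none of these come from Databrary itself.
SESSIONS: List[Record] = [
    {"id": "s1", "age_months": 12, "gender": "F", "description": "Infant crawling down a slope"},
    {"id": "s2", "age_months": 18, "gender": "M", "description": "Toddler walking in the lab"},
    {"id": "s3", "age_months": 13, "gender": "F", "description": "Free play at home"},
]


def search(records: List[Record], keyword: str) -> List[Record]:
    """Searching: keep records whose description mentions the keyword."""
    needle = keyword.lower()
    return [r for r in records if needle in str(r["description"]).lower()]


def filter_age(records: List[Record], low: int, high: int) -> List[Record]:
    """Filtering: keep records with low <= age_months <= high."""
    return [r for r in records if low <= int(r["age_months"]) <= high]


def sort_by(records: List[Record], key: str, descending: bool = False) -> List[Record]:
    """Sorting: order records by one field for visual scanning."""
    return sorted(records, key=lambda r: r[key], reverse=descending)


def count_by(records: List[Record], key: str) -> Dict[str, int]:
    """Summarizing: count records per value of one field."""
    counts: Dict[str, int] = {}
    for r in records:
        value = str(r[key])
        counts[value] = counts.get(value, 0) + 1
    return counts
```

Because each function takes and returns a plain list, the operations compose, for example `sort_by(filter_age(SESSIONS, 12, 13), "age_months")`.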

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
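
The session metadata described on the preceding slides (release level, date, pilot, excluded, participants, tasks, conditions, groups, location) can be sketched as a small data model. This is only one way to arrange it, under assumptions: the class and field names are hypothetical, tasks, conditions, and groups are reduced to plain name strings, and `age_days` is an added convenience that the slides do not mention.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None  # identifiable; may be missing
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    release: str                            # "not asked", "private", "shared", ...
    session_date: Optional[date] = None     # may be unknown for archival recordings
    pilot: bool = False                     # methods not yet finalized or standard
    excluded_reason: Optional[str] = None   # None means the session was usable
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None           # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Hypothetical example session, loosely following the file list shown earlier.
example = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("S01", birthdate=date(2013, 4, 7), gender="F")],
    tasks=["free play"],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Keeping the exclusion reason as an optional field, rather than a separate flag plus reason, avoids records that are marked excluded without saying why.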

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
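
Merging conceptually identical tags, as in the "crawl" example above, can be approximated by normalizing each tag to a common key. The sketch below uses naive suffix stripping as a stand-in: it is not a real stemmer or semantic search, the function names are hypothetical, and it will mis-handle many English words.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_tag(tag: str) -> str:
    """Lowercase a tag and strip one common English suffix (naive stemming)."""
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that normalize to the same key."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag.strip().lower())
    return {key: sorted(members) for key, members in groups.items()}
```

With this sketch, `merge_tags(["crawl", "Crawling", "crawlers", "walking"])` groups the first three tags under `"crawl"` and leaves `"walking"` under `"walk"`. A curator could review such groups before merging, since suffix stripping alone cannot tell whether two tags mean the same thing.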

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated
Weaknesses we neglected or overstated

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats
Different spin on them

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Roadmap (diagram)
Stages: Data Collection; Coding, Analyses, Writing; Paper in Press
Sharing steps: Ask Ss to Share; Upload As You Go; Upload After the Fact; Videos to Share

28

Roadmap (diagram, same elements, adding Datavyu)

29

(Diagram, same elements)
Incentives: Preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos
Disincentives: Change current practices

30

Roadmap (diagram, same elements, adding IRB)

31

Roadmap (diagram, same elements)
Incentives: Use of excerpts
Disincentives: None

32

Roadmap (diagram, same elements, adding Grants & Contracts)

33

(Diagram, same elements, adding View & Use Videos)
Incentives: Full access
Disincentives: None

34

Timeline (diagram, same elements)
Dates shown: 2018-04

35

Timeline (diagram, same elements)
Dates shown: 2017-07

36

Timeline (diagram, same elements)
Dates shown: 2017-07, 2014-04

37

Timeline (diagram, same elements)
Dates shown: 2017-07, 2015-04, 2014-04

38

Timeline (diagram, same elements)
Dates shown: 2017-07, 2014-04, 2016-04

39

Timeline (diagram, same elements)
Dates shown: 2017-07, 2014-09, 2015-09

40

Timeline (diagram, same elements)
Dates shown: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss: in interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only: studies, no datasets
- Repurposing data: studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks: subgroups can do this (emo-group)
- Not standardizing codes: Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is: semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
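To make the "merge conceptually identical tags" and "crawl finds crawling" ideas concrete, here is a minimal sketch that groups free-text tags by a crude stem. The suffix list and the function names are assumptions for illustration; a real system would more likely use an established stemmer or a curated vocabulary.

```python
from collections import defaultdict

SUFFIXES = ("ing", "ers", "er", "ed", "s")  # crude and illustrative only


def stem(tag: str) -> str:
    """Lowercase a tag and strip one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: list[str]) -> dict[str, list[str]]:
    """Group tags that share a stem, as a curator might merge them."""
    groups: dict[str, list[str]] = defaultdict(list)
    for tag in tags:
        groups[stem(tag)].append(tag)
    return dict(groups)


def search(query: str, tags: list[str]) -> list[str]:
    """Return every tag whose stem matches the stem of the query."""
    return merge_tags(tags).get(stem(query), [])


matches = search("crawl", ["crawling", "crawlers", "walking"])
```

The same grouping could also drive pop-up suggestions: when a user types a new tag, existing tags with the same stem are offered first.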

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated
- Weaknesses we neglected or overstated

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats
- Different spin on them

120

Opportunities
Potential for partnering
TalkBank/CHILDES
Other sharing organizations
- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center
Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering
Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields: developmental science only, psychology, neuroscience

122

Opportunities
Potential for partnering
Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data
Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
Domain-focus, visibility
- Your colleagues are here
Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
Integration with external services
Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Roadmap
[Diagram, built up step by step over slides 28-34. Research workflow stages: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding / Analyses / Writing, Paper in Press. Databrary steps: Upload As You Go, Upload After the Fact, Upload Videos to Share, View & Use Videos. Slide 29 adds Datavyu; slide 31 adds IRB; slide 33 adds Grants & Contracts; slide 34 adds View & Use Videos.]
- Slide 30: Incentives: preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos. Disincentives: change current practices.
- Slide 32: Incentives: use of excerpts. Disincentives: none.
- Slide 34: Incentives: full access. Disincentives: none.

34

Timeline
[Diagram, built up step by step over slides 35-41. Same workflow as the roadmap (Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding / Analyses / Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload Videos to Share, View & Use Videos) with date markers added.]
Dates shown on each slide:
- Slide 35: 2018-04
- Slide 36: 2017-07
- Slide 37: 2017-07, 2014-04
- Slide 38: 2017-07, 2015-04, 2014-04
- Slide 39: 2017-07, 2014-04, 2016-04
- Slide 40: 2017-07, 2014-09, 2015-09
- Slide 41: 2017-07, 2014-09, 2016-09

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public
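The release levels above amount to a small access rule. The following is a sketch under stated assumptions, not Databrary's implementation: the `Release` and `Viewer` enums, the three viewer categories, and the choice to treat Public as also permitting excerpts are all simplifications introduced for this example.

```python
from enum import IntEnum


class Release(IntEnum):
    NOT_ASKED = 0        # undocumented, so it functions as private
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Viewer(IntEnum):
    PUBLIC = 0           # anyone
    AUTHORIZED = 1       # authorized Databrary investigator
    OWNER = 2            # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Can this viewer see the full session recording?"""
    if viewer is Viewer.OWNER:
        return True
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    return True  # Release.PUBLIC


def excerpt_allowed(release: Release) -> bool:
    """May excerpts be shown for informational, scientific, or educational purposes?"""
    # Assumption: a Public release also permits excerpts.
    return release >= Release.SHARED_EXCERPTS
```

For example, `can_view(Release.SHARED, Viewer.AUTHORIZED)` is true while `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is false, matching the rule that an undocumented release functions as private.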

47

What data are being released?
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary? Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
- No requirement to contribute
Institutional sign-off by Authorizing Official
- Grants and contracts
- No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
- Regardless of authorization
Four pre-selected options
- Lab and collaborators only
- Databrary data only
- Lab data and Databrary data
- Proxy

57

[Diagram: nested access levels]
User groups: Everybody, Registered users, Authorized users, Interested parties, Collaborators, Lab
Capabilities:
- Access publicly shared data
- Access and share Databrary data
- Upload data
- Access released data selectively shared
- Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
- NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
- Does not go to the IRB office for signing
- Further IRB approval not required for authorization
- Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
- Where are researchers getting hung up?
- Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
- Once schools are comfortable with Investigator Agreement
- No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
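The discovery steps listed above (search, filter, sort, then summarize) can be sketched over plain session records. Everything here is hypothetical: the record fields, the `discover` and `summarize` helpers, and the sample data are invented for illustration and do not describe the Databrary API.

```python
from collections import Counter

# Hypothetical session records; field names are illustrative only.
SESSIONS = [
    {"id": 1, "description": "infant crawling on slope", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "toddler walking at home", "age_months": 14, "gender": "M"},
    {"id": 3, "description": "infant crawling in lab", "age_months": 9, "gender": "M"},
]


def discover(sessions, keyword=None, min_age=None, max_age=None, sort_by="age_months"):
    """Keyword search, then numeric filtering, then sorting."""
    hits = []
    for s in sessions:
        if keyword and keyword.lower() not in s["description"].lower():
            continue
        if min_age is not None and s["age_months"] < min_age:
            continue
        if max_age is not None and s["age_months"] > max_age:
            continue
        hits.append(s)
    return sorted(hits, key=lambda s: s[sort_by])


def summarize(sessions, field):
    """Aggregate view: count of sessions per value of one field."""
    return Counter(s[field] for s in sessions)


crawlers = discover(SESSIONS, keyword="crawling", max_age=12)
by_gender = summarize(SESSIONS, "gender")
```

Searching across datasets would apply the same steps to records pooled from several datasets, which is where standardized metadata variables pay off.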

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turnMore threatsDifferent spin on them

120

OpportunitiesPotential for partnering

TalkBankCHILDESOther sharing organizations

- Research Data Alliance (RDA)- Center for Open Science (COS)- Stanford Meta-Research Center

Societies- Report on developmental society meeting

121

OpportunitiesPotential for partnering

Journals- Store supplemental data- Store extended materials for all submissions- Which fields

bull Developmental science onlybull Psychologybull Neuroscience

122

OpportunitiesPotential for partnering

Conferences (both society-affiliated and not)- Store share talk recordings- Store supplemental data

Sustainability component- Member or attendee fees-for-service

123

OpportunitiesEmphasize competitive advantages

Domain-focus visibility- Your colleagues are here

Policies to share identifiable research data especially recordingsUniversity amp Federal supportStability and long-term sustainabilitySharing is our business model

124

Your thoughtsPartnershipsJournalsConferencesSharpen our messageOther ideas

125

OpportunitiesIntegration with external services

Cloud storagedata sharingOffer new services via APIs

Lab managementProjecttask managementData visualizationSocial media communicationsElectronic CV profile management

126

Your thoughtsDo you use any of these servicesIs providing data sharing services enoughDoes integration with other tools help or hinder our goals

127

OpportunitiesBeyond storing tagging video

VisualizationAdditional analysis streams

128

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

129

Thank you

130

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

28

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

[Timeline diagram, slides 36-41 (animation builds). Research workflow stages: IRB; Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press. Databrary steps: Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos. Dates shown across the builds: 2014-04, 2015-04, 2016-04, 2014-09, 2015-09, 2016-09, 2017-07.]

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search; Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
- Available restrictions on identifiable data
  - Did not ask: if undocumented, functions as private
  - Private: restricted to data owners and editors
  - Shared: restricted to authorized Databrary investigators
  - Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: available to the public

47
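The release levels on slide 47 can be read as an ordered scale. The sketch below is one hedged way to express that ordering in code. The viewer roles, function names, and the rule that public recordings may also be excerpted are assumptions made for illustration. They are not taken from Databrary's software.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: treated the same as PRIVATE
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles, used only for this illustration."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if the viewer may see the full recording."""
    if viewer is Viewer.OWNER:
        return True
    if viewer is Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release is Release.PUBLIC


def can_show_excerpt(release: Release) -> bool:
    """Assumption: excerpts may be shown at SHARED_EXCERPTS and above."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest level keeps the safe default: a session with no documented release behaves as private.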

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos: contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Access-levels diagram. User groups: Everybody; Registered users; Authorized users; Collaborators; Lab; Interested parties. Permissions shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

Break

67

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases

77
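The three discovery steps on slide 77 (searching, filtering, sorting) compose naturally. The following is a minimal sketch under stated assumptions: the record type, field names, and sample rows are invented for illustration and do not describe Databrary's actual schema or API.

```python
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical flat record; field names are illustrative only."""
    name: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: keep records whose name or description mentions the keyword."""
    kw = keyword.lower()
    return [r for r in records if kw in r.name.lower() or kw in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Sorting: order results for visual scanning, youngest first."""
    return sorted(records, key=lambda r: r.age_months)


# Made-up sample rows, used only to exercise the functions.
records = [
    SessionRecord("S1", "infant crawling down slopes", 11.0, "F"),
    SessionRecord("S2", "toddler walking over bridges", 18.5, "M"),
    SessionRecord("S3", "crawling on a raised walkway", 9.5, "M"),
]
hits = sort_by_age(filter_by_age(search(records, "crawling"), 9, 12))
print([r.name for r in hits])  # ['S3', 'S1']
```

Keeping each step as a separate function means the same pieces can serve both within-dataset and cross-dataset discovery.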

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
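Slides 86 through 96 describe the session as the basic unit, annotated with a fixed set of metadata types. The sketch below shows one way that structure might look as plain data classes. The class and field names are assumptions chosen to mirror the slide headings, and the subject ID is a made-up placeholder. None of this is the actual Databrary schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """Basic unit: files collected at the same time under one release level."""
    release: str = "not asked"             # default when nothing is documented
    session_date: Optional[date] = None    # may be unknown for archival data
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # None means the session is usable
    participants: list = field(default_factory=list)
    conditions: list = field(default_factory=list)
    tasks: list = field(default_factory=list)
    groups: list = field(default_factory=list)
    location: Optional[str] = None
    files: list = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason is recorded."""
        return self.excluded_reason is not None


session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S01")],  # hypothetical ID
    tasks=["intake"],
    location="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Storing the exclusion reason rather than a bare flag keeps excluded and pilot sessions in the record, so they stay available for later repurposing.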

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
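The layering idea on slide 97 can be illustrated with a small sketch: a dataset owns the raw session files, and each study refers to those sessions while adding its own files on top. All class, method, and dataset names here are hypothetical. They were chosen for the example and do not come from Databrary's code.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Raw data: session ID mapped to the raw files collected in that session."""
    name: str
    sessions: dict = field(default_factory=dict)


@dataclass
class Study:
    """Selects sessions from datasets and layers its own files over them."""
    name: str
    included: list = field(default_factory=list)  # (dataset name, session ID) pairs
    layers: dict = field(default_factory=dict)    # same pair -> layered files

    def include(self, dataset, session_id, layer_files=()):
        """Reference a session from a dataset and attach this study's files."""
        key = (dataset.name, session_id)
        self.included.append(key)
        self.layers[key] = list(layer_files)

    def files_for(self, dataset, session_id):
        """Raw files first, then this study's layer; the dataset is not modified."""
        key = (dataset.name, session_id)
        return dataset.sessions[session_id] + self.layers.get(key, [])


raw = Dataset("raw-videos", {"s1": ["baby-cam.mp4"], "s2": ["overhead-cam.mp4"]})

original = Study("original-study")
original.include(raw, "s1", ["coding-spreadsheet.opf"])

reuse = Study("reuse-study")  # a second study repurposing the same session
reuse.include(raw, "s1", ["analysis.sav"])
```

Because each study keeps only references plus its own layer, two studies can present the same raw session differently without copying or altering it.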

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)

104
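The "crawl" example on slide 104 can be approximated with a very rough stemmer. The sketch below is only an illustration of the idea. The suffix list and function names are assumptions, and a production system would more likely use an established stemming or search library.

```python
def stem(tag: str) -> str:
    """Rough stemmer: lowercase, then strip one common English suffix."""
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "s"):
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group tags that share a stem, as a curator might merge them."""
    groups = {}
    for tag in tags:
        groups.setdefault(stem(tag), []).append(tag)
    return groups


def search_tags(query, tags):
    """Return every tag whose stem matches the stem of the query."""
    return merge_tags(tags).get(stem(query), [])


print(search_tags("crawl", ["crawling", "crawlers", "Crawl", "walking"]))
# ['crawling', 'crawlers', 'Crawl']
```

The same grouping supports both options on the slide: curators can use `merge_tags` to find candidates for merging, or the tags can be left as-is and matched at search time.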

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Thank you

130

[Roadmap diagram, slides 28-35 (animation builds). Research workflow stages: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press. Databrary steps: Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos. Tool shown: Datavyu. Date shown on slide 35: 2018-04.]

Slide 30
- Incentives: preservation; "lab server"; easy access for students & collaborators; data organization; coding; transcoding, syncing, and splitting videos
- Disincentives: change current practices

Slide 32
- Incentives: use of excerpts
- Disincentives: none

Slide 34
- Incentives: full access
- Disincentives: none

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
Indicates that session was not usable for the target study
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

91

Participant
Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically developing
- Language
- Country (of origin)
- SES

92

Task
Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

96
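The session metadata described on the preceding slides can be pictured as a small record type. This is a minimal sketch under stated assumptions: the class names, field names, and the `age_days` helper are hypothetical and only a subset of the listed variables is modeled; it is not Databrary's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None       # identifiable information
    gender: Optional[str] = None


@dataclass
class Location:
    setting: str                           # e.g. "lab", "home", "classroom"
    country: Optional[str] = None


@dataclass
class Session:
    release_level: str                     # e.g. "private", "shared"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    def is_excluded(self) -> bool:
        """True when an exclusion reason has been recorded."""
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, or None if either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Made-up example session.
session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("P01", birthdate=date(2013, 4, 7))],
    location=Location("lab", country="US"),
    files=["intake.mp4", "baby-cam.mp4"],
)
```

Making the date and birthdate optional reflects the slides' point that a session date can be unknown and that both values are identifiable, so derived values such as age have to tolerate missing inputs.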

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
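One way to picture studies "layering" files over dataset sessions is sketched below. The `Dataset` and `Study` classes and their methods are hypothetical, chosen only to show that a study can add analysis files to a session without changing the raw dataset, and that several studies can reuse the same session.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SessionKey = Tuple[str, str]  # (dataset name, session id)


@dataclass
class Dataset:
    """Raw data: session id mapped to the raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: references dataset sessions and layers its own files on top."""
    name: str
    layers: Dict[SessionKey, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        if session_id not in dataset.sessions:
            raise KeyError(session_id)
        self.layers.setdefault((dataset.name, session_id), [])

    def layer(self, dataset: Dataset, session_id: str, filename: str) -> None:
        key = (dataset.name, session_id)
        if key not in self.layers:
            raise ValueError("include the session before layering files on it")
        self.layers[key].append(filename)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files followed by this study's layered files (a new list each call)."""
        key = (dataset.name, session_id)
        return dataset.sessions[session_id] + self.layers[key]


# Made-up example: two studies reuse the same raw session.
raw = Dataset("slopes", {"s1": ["baby-cam.mp4"], "s2": ["overhead-cam.mp4"]})
original = Study("original paper")
original.include(raw, "s1")
original.layer(raw, "s1", "coding-spreadsheet.opf")
reuse = Study("re-purposed analysis")
reuse.include(raw, "s1")
reuse.layer(raw, "s1", "analysis.sav")
```

Because layered files live on the study rather than on the dataset, the raw session stays untouched no matter how many studies draw on it.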

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss: in interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only: studies, no datasets
- Repurposing data: studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
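The "crawl" / "crawling" / "crawlers" example from the flexibility slide can be approximated by normalizing tags before matching. The sketch below uses a deliberately naive suffix-stripping rule as a stand-in for real stemming or semantic search; the function names are hypothetical, and a production system would more likely rely on an established stemmer or search engine.

```python
def normalize_tag(tag: str) -> str:
    """Very naive stemmer: lowercase, then strip one common English suffix."""
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags):
    """Group raw tags under their normalized form, preserving input order."""
    merged = {}
    for tag in tags:
        merged.setdefault(normalize_tag(tag), []).append(tag)
    return merged
```

For example, `merge_tags(["crawl", "Crawling", "crawlers", "walking"])` groups the first three tags under `"crawl"` and the last under `"walk"`, which is the kind of after-the-fact merge a curator could review rather than perform by hand.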

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
  - Offer new services via APIs
    - Lab management
    - Project/task management
    - Data visualization
    - Social media, communications
    - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Thank you

130

[Roadmap diagram, slides 28-34 (animation builds). Workflow stages: IRB; Grants & Contracts; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press. Sharing paths: Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos. Datavyu is added to the diagram on slide 29.]

Callouts on individual builds:
- Slide 30. Incentives: preservation, "lab server", easy access for students & collaborators, data organization, coding, transcoding, syncing and splitting videos. Disincentives: change current practices.
- Slide 32. Incentives: use of excerpts. Disincentives: none.
- Slide 34. Incentives: full access. Disincentives: none.

34

[Timeline diagram, slides 35-41 (animation builds), using the same workflow stages and sharing paths as the roadmap diagram. Date markers appearing across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04.]

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta, Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
- Available restrictions on identifiable data
  - Did not ask: if undocumented, functions as private
  - Private: restricted to data owners and editors
  - Shared: restricted to authorized Databrary investigators
  - Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: available to the public

47
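The release levels above form an ordered scale, which makes access checks easy to express. The following is a minimal sketch, not Databrary's implementation: the `Role` type, the function names, and the assumption that public sessions also permit excerpt use are all hypothetical.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Role(IntEnum):
    """Hypothetical viewer roles for this sketch."""
    PUBLIC = 0
    AUTHORIZED = 1       # authorized Databrary investigator
    OWNER = 2            # data owner or editor


def can_view(release: Release, role: Role) -> bool:
    """Whether a viewer in the given role may see the full recording."""
    if role == Role.OWNER:
        return True
    if role == Role.AUTHORIZED:
        return release >= Release.SHARED
    return release == Release.PUBLIC


def can_show_excerpt(release: Release) -> bool:
    """Whether excerpts may be shown (assumes public sessions also allow this)."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "not asked" as the lowest value means an undocumented session is never more accessible than a private one, which matches the slide's statement that it functions as private.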

What data are being released?
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique?
- Completely separates consent to participate from release to share
  - Consent to participate (in the study): before
  - Release to share: after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos: contribute and/or use data
  - No requirement to contribute
  - Institutional sign-off by Authorizing Official
    - Grants and contracts
    - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
  - Four pre-selected options
    - Lab and collaborators only
    - Databrary data only
    - Lab data and Databrary data
    - Proxy

57

[Access-levels diagram. User groups: Everybody; Registered users; Authorized users; Interested parties; Collaborators; Lab. Permissions shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

Break

67

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later, possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Datavyu

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

29

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

Timeline

[Timeline diagram, shown as a series of animation builds. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos]

[Dates appearing across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

35-41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public

47
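
The release levels on the slide above can be read as an ordered scale from most to least restricted. The following is a minimal sketch in Python, not Databrary's actual implementation: the names `Release`, `Viewer`, `can_view`, and `excerpts_allowed` are hypothetical, the three viewer categories are a simplification, and treating Public data as also permitting excerpt use is an assumption.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # restricted to data owners and editors
    SHARED = 2           # restricted to authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be used more widely
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories, for illustration only."""
    ANYONE = 0
    AUTHORIZED_INVESTIGATOR = 1
    OWNER_OR_EDITOR = 2


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see full data at this release level."""
    if viewer == Viewer.OWNER_OR_EDITOR:
        return True
    if release == Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED_INVESTIGATOR
    # NOT_ASKED and PRIVATE: owners and editors only
    return False


def excerpts_allowed(release: Release) -> bool:
    """Assumption: excerpt use is permitted at SHARED_EXCERPTS and above."""
    return release >= Release.SHARED_EXCERPTS


print(can_view(Release.NOT_ASKED, Viewer.AUTHORIZED_INVESTIGATOR))
print(excerpts_allowed(Release.SHARED_EXCERPTS))
```

Using an ordered enum keeps "did not ask" at the restrictive end, which matches the slide's rule that an undocumented release functions as private.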

What data are being released
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text, descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
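
The three discovery operations named on the slide above (searching, filtering, sorting) could be composed as plain functions over session records. This is an illustrative sketch only: `SessionRecord`, its fields, and the sample records are made up for the example and do not describe Databrary's schema or API.

```python
from dataclasses import dataclass


@dataclass
class SessionRecord:
    # Hypothetical fields, chosen only to illustrate the three operations
    title: str
    description: str
    age_months: float
    gender: str


def search(records, keyword):
    """Searching: keep records whose title or description mentions the keyword."""
    kw = keyword.lower()
    return [r for r in records
            if kw in r.title.lower() or kw in r.description.lower()]


def filter_by_age(records, min_months, max_months):
    """Filtering: limit results to an inclusive age range."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records):
    """Sorting: order results for visual scanning."""
    return sorted(records, key=lambda r: r.age_months)


# Made-up sample records
records = [
    SessionRecord("Free play", "toddlers walking during play", 19.0, "F"),
    SessionRecord("Crawling study", "infants crawling down slopes", 11.5, "M"),
    SessionRecord("Walking study", "infants walking over bridges", 14.0, "F"),
]

hits = sort_by_age(filter_by_age(search(records, "walking"), 12, 24))
print([r.title for r in hits])
```

Keeping each step a separate function lets them be chained in any order, which is one way to support both within-dataset and cross-dataset discovery.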

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
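
The session-centered model described on the preceding slides (a session covered by one release level, annotated with participant, task, condition, group, and location metadata, and holding raw files) can be sketched as a few small record types. This is a hedged illustration, not Databrary's schema: the class and field names are assumptions derived from the slide headings, and only a subset of the listed variables is shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Task:
    """A task, activity, or phase during all or part of a session."""
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Location:
    """Where the session data were collected."""
    setting: str                  # e.g. "lab", "home", "classroom"
    state: Optional[str] = None
    country: Optional[str] = None


@dataclass
class Session:
    """The basic unit: one release level, one collection time, many files."""
    release: str                              # e.g. "not asked", "private", "shared"
    session_date: Optional[date] = None       # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None     # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)   # condition names only, for brevity
    groups: List[str] = field(default_factory=list)       # group names only, for brevity
    location: Optional[Location] = None
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant("S01")],   # hypothetical subject ID
    tasks=[Task("free play")],           # hypothetical task name
    location=Location("lab"),
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
print(session.excluded, len(session.files))
```

Making most fields optional reflects the point, raised elsewhere in the deck, that real data often have missing or irregular values.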

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
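
The dataset/study distinction above, where a study includes sessions from a dataset and layers its own files over them, might be modeled as follows. This is a sketch under stated assumptions: `Dataset`, `Study`, `include`, and `files_for` are hypothetical names, and the dataset name and the file `new-codes.opf` are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SessionKey = Tuple[str, str]  # (dataset name, session id)


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Includes sessions from datasets and layers generated files over them."""
    name: str
    layers: Dict[SessionKey, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str, layer_files=()) -> None:
        """Add a dataset session to this study, with optional layered files."""
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id!r} is not in dataset {dataset.name!r}")
        self.layers.setdefault((dataset.name, session_id), []).extend(layer_files)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files plus this study's layered files; the dataset is not modified."""
        key = (dataset.name, session_id)
        if key not in self.layers:
            raise KeyError(f"{session_id!r} is not included in study {self.name!r}")
        return list(dataset.sessions[session_id]) + list(self.layers[key])


raw = Dataset("example dataset", {"s1": ["baby-cam.mp4", "overhead-cam.mp4"]})

original = Study("original paper")
original.include(raw, "s1", ["coding-spreadsheet.opf", "analysis.sav"])

reuse = Study("repurposed analysis")
reuse.include(raw, "s1", ["new-codes.opf"])

print(original.files_for(raw, "s1"))
print(reuse.files_for(raw, "s1"))
```

Because each study keeps its own layer, two studies can reuse the same raw session without touching each other's analysis files, which is the re-purposing case the slide describes.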

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
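
Two of the options above, suggesting existing tags as a user types and merging conceptually identical tags afterwards, can be illustrated with a deliberately crude sketch. The suffix stripping below is a naive stand-in for real stemming or semantic search, not what any search engine or Databrary uses; `normalize`, `merge_tags`, and `suggest` are hypothetical names.

```python
from collections import defaultdict

# Crude suffix list, checked in order; a real system would use a proper stemmer.
SUFFIXES = ("ers", "ing", "er", "s")


def normalize(tag):
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    t = tag.strip().lower()
    for suffix in SUFFIXES:
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[: -len(suffix)]
    return t


def merge_tags(tags):
    """Group tags that normalize to the same root (curator clean-up)."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)


def suggest(prefix, existing_tags):
    """Pop-up suggestions: existing tags that start with what the user typed."""
    p = prefix.strip().lower()
    return sorted({t for t in existing_tags if t.lower().startswith(p)})


tags = ["crawl", "crawling", "Crawlers", "walking"]
print(merge_tags(tags))
print(suggest("cra", ["crawl", "crawling", "walking"]))
```

Suggestions reduce the mess up front, while merging handles what slips through; the slide's third option, leaving tags as-is, would rely on search-time matching such as `normalize` alone.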

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

UploadAfter the Fact

Videos toShareUpload

Incentives Preservation ldquolab serverrdquo easy access for students amp collaborators data organization coding transcoding syncing and splitting videosDisincentives Change current practices

UploadAs You Go

DataCollection

Coding AnalysesWriting

Paper in Press

Ask Ss to Share

30

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Roadmap

31

Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86
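A minimal sketch of the session as a data structure, assuming a simple in-memory model: one release level, one (possibly unknown) date, a timeline of files, and free-form metadata. The class names, the `start_seconds` field, and the offsets are hypothetical and only illustrate the idea on the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DataFile:
    name: str
    start_seconds: float = 0.0  # hypothetical offset on the session timeline


@dataclass
class Session:
    release: str                   # e.g. "private", "shared", "public"
    session_date: Optional[date]   # may be unknown for older collections
    files: List[DataFile] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def timeline(self) -> List[str]:
        """File names ordered by where they start on the session timeline."""
        ordered = sorted(self.files, key=lambda f: f.start_seconds)
        return [f.name for f in ordered]


# Example using file names from the slide; the offsets are made up.
session = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    files=[
        DataFile("baby-cam.mp4", 60.0),
        DataFile("intake.mp4", 0.0),
        DataFile("overhead-cam.mp4", 60.0),
    ],
    metadata={"pilot": "no"},
)
```

Keeping the release level on the session rather than on each file mirrors the statement that a session is "covered by a release level".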

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
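One way to picture the dataset/study relationship is sketched below, assuming a study holds references to dataset sessions plus its own layered files, so the raw data are never copied or changed. `Dataset`, `Study`, and their methods are hypothetical names, not Databrary's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, papers layered over dataset sessions."""
    name: str
    included: List[Tuple[Dataset, str]] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        """Reference a session from a dataset without copying its files."""
        if session_id not in dataset.sessions:
            raise KeyError(session_id)
        self.included.append((dataset, session_id))

    def layer(self, session_id: str, filename: str) -> None:
        """Attach a study-specific file (e.g. a coding spreadsheet)."""
        self.layers.setdefault(session_id, []).append(filename)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files followed by this study's layered files."""
        for dataset, sid in self.included:
            if sid == session_id:
                return dataset.sessions[sid] + self.layers.get(sid, [])
        raise KeyError(session_id)


# Made-up example: a second study repurposes one session of a dataset.
slopes = Dataset("slopes", {"s1": ["baby-cam.mp4"], "s2": ["baby-cam.mp4"]})
reanalysis = Study("reanalysis")
reanalysis.include(slopes, "s1")
reanalysis.layer("s1", "coding-spreadsheet.opf")
```

Because the study only references the session, several studies can layer different files over the same raw video, which is the re-purposing case the slide describes.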

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102
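The trade-off on this slide (fixed metadata types, user-extensible variables) can be sketched as follows. This is only an illustration under assumed names: `FIXED_CATEGORIES`, `MetadataRecord`, and the sample values are hypothetical.

```python
from typing import Any, Dict

# The metadata types named in the deck; treated here as a closed set.
FIXED_CATEGORIES = frozenset(
    {"participant", "task", "condition", "group", "location"}
)


class MetadataRecord:
    """A record of a fixed type whose variables are open-ended."""

    def __init__(self, category: str, **variables: Any) -> None:
        if category not in FIXED_CATEGORIES:
            # Enforced structure: users cannot invent new metadata types.
            raise ValueError(f"unknown metadata category: {category!r}")
        self.category = category
        self.variables: Dict[str, Any] = dict(variables)

    def set_variable(self, name: str, value: Any) -> None:
        """Flexibility: any variable name is accepted within a category."""
        self.variables[name] = value


def is_allowed(category: str) -> bool:
    """True if a record of this category can be created."""
    try:
        MetadataRecord(category)
        return True
    except ValueError:
        return False


# Made-up values, for illustration only.
participant = MetadataRecord("participant", birthdate="2013-02-01", language="English")
participant.set_variable("handedness", "left")  # a user-defined variable
```

Moving a name between the fixed set and the free-form dictionary is the "continuum between enforced structure and flexibility" in code form.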

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)

104
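The three tag strategies above can be sketched in a few lines. Note the assumptions: `stem` is a naive suffix stripper standing in for real stemming or semantic search, and `suggest`, `merge_tags`, and `semantic_search` are hypothetical helper names, not part of any Databrary API.

```python
from typing import Dict, Iterable, List

# Naive suffix list; longer suffixes are checked before shorter ones.
SUFFIXES = ("ing", "ers", "er", "ed", "s")


def stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def suggest(prefix: str, existing_tags: Iterable[str]) -> List[str]:
    """Pop-up suggestions: existing tags that start with what was typed."""
    p = prefix.lower()
    return sorted(t for t in set(existing_tags) if t.lower().startswith(p))


def merge_tags(tags: Iterable[str], canonical: Dict[str, str]) -> List[str]:
    """Curator clean-up: map conceptually identical tags to one form."""
    merged: List[str] = []
    for tag in tags:
        target = canonical.get(tag, tag)
        if target not in merged:
            merged.append(target)
    return merged


def semantic_search(query: str, tags: Iterable[str]) -> List[str]:
    """Leave tags as-is and match on stems at query time."""
    target = stem(query)
    return [t for t in tags if stem(t) == target]
```

With this sketch, `semantic_search("crawl", ["crawling", "crawlers", "walking"])` matches the first two tags, which is the behavior the slide's example describes.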

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128


Thank you

130

[Diagram, slides 31-34] Roadmap
Stages: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press
Databrary actions: Upload As You Go; Upload After the Fact; Upload Videos to Share; View & Use Videos
Incentives: Use of excerpts (slide 32); Full access (slide 34)
Disincentives: None

[Diagram, slides 35-41] Timeline
Same stages and actions as the Roadmap diagram, with dates added across the builds: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42


Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta, Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public

47
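The release levels above amount to a small access rule, sketched here under stated assumptions. The `Viewer` categories, the function names, and the choice to treat public sessions as also allowing excerpts are illustrative guesses, not Databrary's actual authorization logic.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels in the order listed on the slide."""
    NOT_ASKED = 0
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


class Viewer(IntEnum):
    """Hypothetical viewer categories, least to most privileged."""
    ANYONE = 0
    AUTHORIZED = 1        # authorized Databrary investigator
    OWNER_OR_EDITOR = 2   # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Whether a viewer may see identifiable data at this release level."""
    if viewer == Viewer.OWNER_OR_EDITOR:
        return True
    if release == Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    # PRIVATE, and NOT_ASKED, which "functions as private".
    return False


def excerpts_allowed(release: Release) -> bool:
    """Whether excerpts may be shown in talks or teaching.

    Treating PUBLIC as also allowing excerpts is an assumption.
    """
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

For instance, `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is false, reflecting that an undocumented release falls back to private.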

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65


Break

67


Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72


Roadmap

DataCollection

Coding AnalysesWriting

Paper in Press

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Incentives Use of excerptsDisincentives None

IRB Ask Ss to Share

32

UploadAfter the Fact

UploadAs You Go

[Roadmap diagram: Videos to Share, Upload; workflow stages Grants & Contracts, Data Collection, Coding/Analyses/Writing, Paper in Press; Ask Ss to Share]

33

[Same diagram with IRB, Upload As You Go, Upload After the Fact, and View & Use Videos added. Incentives: Full access. Disincentives: None]

34

[Timeline builds of the same diagram, adding date markers step by step: 2014-04, 2014-09, 2015-04, 2015-09, 2016-04, 2016-09, 2017-07, 2018-04]

35-41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

43

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search; Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
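The release levels above form an ordered scale, which can be sketched in code. This is a minimal illustration only, not Databrary's implementation: the `Viewer` categories, the function names, and the assumption that public recordings may also be excerpted are all hypothetical.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; treated the same as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be shown
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories used only for this sketch."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if the viewer may see a full recording at this level."""
    if viewer is Viewer.OWNER:
        return True
    if viewer is Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release is Release.PUBLIC


def can_show_excerpt(release: Release) -> bool:
    """Assumption: excerpts are allowed at SHARED_EXCERPTS and above."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest level means an undocumented session can never be more visible than a private one.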

47

What data are being released
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after (it's clear what was recorded)

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos: Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
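The discovery steps listed above (searching, filtering, sorting, and summarizing) can be sketched over plain records. This is a rough illustration under stated assumptions: the record fields (`description`, `gender`, `age_days`), the function names, and the sample data are all made up for the example and do not reflect the actual Databrary API.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]

# Made-up sample sessions, for illustration only.
SAMPLE: List[Record] = [
    {"id": "s1", "description": "Infant crawling over slopes", "gender": "F", "age_days": 330},
    {"id": "s2", "description": "Toddler walking in lab", "gender": "M", "age_days": 420},
    {"id": "s3", "description": "Crawling at home", "gender": "M", "age_days": 300},
]


def discover(records: List[Record],
             keyword: Optional[str] = None,
             gender: Optional[str] = None,
             max_age_days: Optional[int] = None,
             sort_by: str = "age_days") -> List[Record]:
    """Search by keyword, filter by gender and age, then sort the results."""
    results = []
    for rec in records:
        if keyword and keyword.lower() not in rec["description"].lower():
            continue
        if gender and rec["gender"] != gender:
            continue
        if max_age_days is not None and rec["age_days"] > max_age_days:
            continue
        results.append(rec)
    return sorted(results, key=lambda r: r[sort_by])


def count_by(records: List[Record], field: str) -> Dict[Any, int]:
    """Summarize records as counts per value of one field."""
    counts: Dict[Any, int] = {}
    for rec in records:
        counts[rec[field]] = counts.get(rec[field], 0) + 1
    return counts
```

For example, `discover(SAMPLE, keyword="crawl")` returns the two crawling sessions ordered by age, and `count_by(SAMPLE, "gender")` gives a per-gender count.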

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
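The session metadata described on the preceding slides (release level, date, pilot, excluded, participants, conditions, tasks, groups, location) can be sketched as a small data structure. This is one possible reading, not the Databrary schema: the class names, field names, and the `age_in_days` helper are hypothetical, and only a few of the participant variables are shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: data collected at the same time under one release level."""
    release: str                           # e.g. "private", "shared"
    session_date: Optional[date] = None    # may be unknown for archival data
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was usable
    participants: List[Participant] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None          # e.g. "lab", "home"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


def age_in_days(participant: Participant, session: Session) -> Optional[int]:
    """Age at test, or None when either date is unknown."""
    if participant.birthdate is None or session.session_date is None:
        return None
    return (session.session_date - participant.birthdate).days
```

Keeping the birthdate and session date as optional fields reflects the slides' point that both are identifiable and that a session date may be unknown.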

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
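The relationship between datasets and studies can be sketched as follows. This is a simplified illustration under assumptions: the class and method names are hypothetical, and each study here draws on a single dataset, whereas the slide allows sessions from several datasets.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data files, grouped by session id."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Selects sessions from a dataset and layers its own files over them."""
    name: str
    dataset: Dataset
    session_ids: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files for a session followed by this study's layered files."""
        if session_id not in self.session_ids:
            raise KeyError(f"{session_id} is not part of study {self.name!r}")
        raw = self.dataset.sessions[session_id]
        return list(raw) + self.layers.get(session_id, [])


# Made-up example: two studies reuse the same raw session.
walking = Dataset("walking", {"s1": ["baby-cam.mp4", "overhead-cam.mp4"],
                              "s2": ["baby-cam.mp4"]})
original = Study("original", walking, ["s1", "s2"],
                 {"s1": ["coding-spreadsheet.opf"]})
reuse = Study("reuse", walking, ["s1"], {"s1": ["analysis.sav"]})
```

Because each study stores only its own layered files, repurposing a session in a new study leaves the raw dataset unchanged.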

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
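The idea that a search for "crawl" should also find "crawling" and "crawlers", and that conceptually identical tags could be merged, can be illustrated with a toy example. This is only a sketch: the suffix-stripping rule below is a crude stand-in for a real stemmer or semantic search engine, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Crude suffix list, checked in order; a real system would use a proper stemmer.
SUFFIXES = ("ing", "ers", "er", "ed", "s")


def stem(word: str) -> str:
    """Lowercase a tag and strip one common suffix, keeping at least 3 letters."""
    w = word.strip().lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def search_tags(tags: List[str], query: str) -> List[str]:
    """Return the tags (in input order) whose stem matches the query's stem."""
    target = stem(query)
    return [t for t in tags if stem(t) == target]


def merge_tags(tags: List[str]) -> Dict[str, int]:
    """Collapse tags that share a stem and count how often each occurs."""
    return dict(Counter(stem(t) for t in tags))
```

With the tags `["crawling", "crawlers", "Crawl", "walking"]`, a search for `"crawl"` matches the first three, and merging yields one `crawl` entry and one `walk` entry.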

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

UploadAfter the Fact

UploadAs You Go

Videos toShareUpload

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

Roadmap

Ask Ss to Share

33

Ask Ss to Share

DataCollection

Coding AnalysesWriting

Paper in PressIRB

UploadAs You Go

Upload

UploadAfter the Fact

Grants ampContracts

Incentives Full accessDisincentives None

Videos toShare

View ampUse

Videos

34

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2018-04

IRB

Timeline

Ask Ss to Share

35

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

36

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in Press

2017-07

IRB2014-04

Timeline

Ask Ss to Share

37

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072015-042014-04

Timeline

Ask Ss to Share

38

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86
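As a rough illustration of the session unit on the slide above, the following Python sketch models a session as a set of files collected at the same time under one release level. The class and field names (`Session`, `DataFile`, `offset_seconds`) and the `"Shared"` value are assumptions made for this example, not Databrary's actual schema; the file names follow the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DataFile:
    """One raw data file on a session's timeline (hypothetical type)."""
    name: str
    offset_seconds: float = 0.0  # where the file starts on the timeline


@dataclass
class Session:
    """Files collected at the same time, covered by one release level."""
    release_level: str
    session_date: Optional[date] = None  # may be unknown for archival data
    files: List[DataFile] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def videos(self) -> List[DataFile]:
        """Return only the video files on the timeline."""
        return [f for f in self.files if f.name.lower().endswith(".mp4")]


session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    files=[DataFile(n) for n in (
        "intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf",
        "overhead-cam.mp4", "analysis.sav", "survey.doc",
    )],
)
print([f.name for f in session.videos()])
```

Keeping the date optional reflects the point made elsewhere in the deck that a session date can be unknown.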

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically developing
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
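One way to picture the dataset/study split on the slide above is the Python sketch below: a dataset holds the raw files per session, and a study references sessions and layers its own analysis files over them without changing the dataset. The names `Dataset`, `Study`, `include`, `layer`, and `files` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data: session id -> raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: selected sessions plus files layered over them."""
    name: str
    included: List[Tuple[str, str]] = field(default_factory=list)
    layers: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        """Reference a session from a dataset without copying its files."""
        self.included.append((dataset.name, session_id))

    def layer(self, dataset: Dataset, session_id: str, filename: str) -> None:
        """Attach an analysis file to a session within this study only."""
        key = (dataset.name, session_id)
        self.layers.setdefault(key, []).append(filename)

    def files(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files followed by this study's layered files."""
        raw = dataset.sessions[session_id]
        return raw + self.layers.get((dataset.name, session_id), [])
```

Because layered files live on the study, two studies can reuse the same session with different coding files, which is the re-purposing case the slide mentions.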

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014

intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102
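The trade-off in the discussion slide above (fixed metadata types, user-extensible variables within them) can be sketched in a few lines of Python. The `MetadataRecord` class and the exact contents of `METADATA_TYPES` are assumptions for illustration; the type names are taken from the session metadata slides.

```python
# Fixed set of metadata types; users cannot add to this set.
METADATA_TYPES = frozenset(
    {"participant", "task", "condition", "group", "location"}
)


class MetadataRecord:
    """A record of a fixed type whose variables are open-ended."""

    def __init__(self, kind: str, **variables):
        if kind not in METADATA_TYPES:
            raise ValueError(f"unknown metadata type: {kind!r}")
        self.kind = kind
        self.variables = dict(variables)

    def set(self, name: str, value) -> None:
        """Add or overwrite a variable; any variable name is accepted."""
        self.variables[name] = value


participant = MetadataRecord("participant", birthdate="2013-01-15")
participant.set("handedness", "left")  # user-defined variable
print(participant.variables)
```

Restricting variables as well would amount to replacing the open `set` method with a per-type whitelist, which is the stricter end of the continuum the slide describes.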

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (Google "crawl", finds "crawling", "crawlers", etc.)

104
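To make the tag clean-up idea above concrete, here is a minimal Python sketch that groups tags sharing a crude stem, so that "crawl", "crawling", and "crawlers" fall together. The suffix list and the functions `normalize` and `merge_tags` are assumptions for this example; a real system would more likely use an established stemmer or curator judgment.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Longest suffixes first so "ers" is tried before "er" and "s".
SUFFIXES = ("ers", "ing", "er", "s")


def normalize(tag: str) -> str:
    """Lower-case a tag and strip one common English suffix."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group the original tags under their normalized form."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize(tag)].add(tag)
    return {stem: sorted(members) for stem, members in groups.items()}


print(merge_tags(["crawl", "crawling", "crawlers", "walking"]))
```

The same `normalize` function could be applied to a search query, so that a search for one form finds sessions tagged with the others.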

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

[Figure: workflow diagram. Labels: Grants & Contracts; IRB; Ask Ss to Share; Data Collection; Coding, Analyses, Writing; Paper in Press; Upload As You Go; Upload After the Fact; Upload; Videos to Share; View & Use Videos. Incentives: Full access. Disincentives: None.]

34

[Figure: the same workflow diagram with a Timeline, shown as a series of animation builds. Dates appearing on the builds: 2018-04 (slide 35); 2017-07 (slides 36-41); 2014-04, 2015-04, 2016-04 (slides 37-39); 2014-09, 2015-09, 2016-09 (slides 40-41).]

35-41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public

47
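The release levels above form an ordered scale, which the following Python sketch encodes as an enumeration with a simple access check. The `Viewer` roles, the function names, and the rule that public data also permits excerpts are assumptions made for illustration, not a statement of how Databrary enforces access.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, most to least restrictive."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # as SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1  # authorized Databrary investigator
    OWNER = 2       # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see data at this release level."""
    if viewer is Viewer.OWNER:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer >= Viewer.AUTHORIZED
    return False  # NOT_ASKED and PRIVATE stay with owners and editors


def excerpts_allowed(release: Release) -> bool:
    """Assumes excerpts are usable at SHARED_EXCERPTS and above."""
    return release >= Release.SHARED_EXCERPTS
```

Treating `NOT_ASKED` the same as `PRIVATE` in `can_view` mirrors the slide's note that undocumented sessions function as private.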

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Figure: access levels. Groups: EVERYBODY; REGISTERED USERS; AUTHORIZED USERS; COLLABORATORS; LAB; INTERESTED PARTIES. Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
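The three discovery operations on the slide above (searching, filtering, sorting) can be sketched over plain Python dictionaries. The function `search_sessions`, its parameters, the record fields, and the sample records are all invented for illustration; they do not describe the Databrary API.

```python
from typing import Iterable, List, Optional

# Invented sample records, for illustration only.
SESSIONS = [
    {"id": 1, "description": "Infant crawling over a bridge",
     "gender": "F", "age_months": 12},
    {"id": 2, "description": "Toddler walking down a slope",
     "gender": "M", "age_months": 18},
    {"id": 3, "description": "Crawling and cruising at home",
     "gender": "F", "age_months": 9},
]


def search_sessions(
    sessions: Iterable[dict],
    keyword: Optional[str] = None,
    gender: Optional[str] = None,
    min_age_months: Optional[int] = None,
    max_age_months: Optional[int] = None,
    sort_by: str = "age_months",
) -> List[dict]:
    """Keyword search, then filters, then a sort for visual scanning."""
    results = []
    for s in sessions:
        if keyword and keyword.lower() not in s["description"].lower():
            continue
        if gender and s["gender"] != gender:
            continue
        if min_age_months is not None and s["age_months"] < min_age_months:
            continue
        if max_age_months is not None and s["age_months"] > max_age_months:
            continue
        results.append(s)
    return sorted(results, key=lambda s: s[sort_by])


print([s["id"] for s in search_sessions(SESSIONS, keyword="crawling")])
```

Aggregated views such as counts or distributions could be computed from the same filtered list, for example by counting the results per gender.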

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public

47
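
The release levels on the preceding slide form an ordered scale from most to least restricted. A minimal Python sketch of that ordering follows. It is an illustration only, not Databrary's implementation: the names `ReleaseLevel`, `Viewer`, `can_view`, and `excerpts_allowed` are hypothetical, and the three viewer categories are a simplifying assumption.

```python
from enum import Enum, IntEnum


class ReleaseLevel(IntEnum):
    """Hypothetical encoding of the five release levels, least to most open."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, but excerpts may be shown
    PUBLIC = 4           # available to the public


class Viewer(Enum):
    """Hypothetical viewer categories (an assumption for this sketch)."""
    ANYONE = "anyone"
    AUTHORIZED_INVESTIGATOR = "authorized investigator"
    OWNER_OR_EDITOR = "owner or editor"


def can_view(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if the viewer may see the full recording at this level."""
    if viewer is Viewer.OWNER_OR_EDITOR:
        return True
    if viewer is Viewer.AUTHORIZED_INVESTIGATOR:
        return level >= ReleaseLevel.SHARED
    return level == ReleaseLevel.PUBLIC


def excerpts_allowed(level: ReleaseLevel) -> bool:
    """Excerpts may be shown in talks or teaching only from this level up."""
    return level >= ReleaseLevel.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest value means any check written as a threshold automatically handles undocumented sessions as private.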

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after (it's clear what was recorded)

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Figure: access levels. Groups: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES. Capabilities: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65


Break

67


Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
- Single presentation option
- Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases

77
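
The searching, filtering, sorting, and summarizing steps listed on the preceding slide can be sketched over a small in-memory list. This is a hedged illustration in Python, not the site's search code: `SessionRecord`, `find_sessions`, `count_by_gender`, and the sample records are all hypothetical and invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class SessionRecord:
    """Hypothetical flat record for one session, used only in this sketch."""
    title: str
    age_months: float
    gender: str
    description: str = ""


def find_sessions(records: Iterable[SessionRecord],
                  keyword: Optional[str] = None,
                  min_age: Optional[float] = None,
                  max_age: Optional[float] = None,
                  gender: Optional[str] = None) -> List[SessionRecord]:
    """Search by keyword, filter by age and gender, then sort by age."""
    hits = []
    for rec in records:
        text = f"{rec.title} {rec.description}".lower()
        if keyword is not None and keyword.lower() not in text:
            continue
        if min_age is not None and rec.age_months < min_age:
            continue
        if max_age is not None and rec.age_months > max_age:
            continue
        if gender is not None and rec.gender.lower() != gender.lower():
            continue
        hits.append(rec)
    return sorted(hits, key=lambda rec: rec.age_months)


def count_by_gender(records: Iterable[SessionRecord]) -> Dict[str, int]:
    """A simple aggregate view: number of sessions per gender value."""
    counts: Dict[str, int] = {}
    for rec in records:
        counts[rec.gender] = counts.get(rec.gender, 0) + 1
    return counts


# Invented sample data for illustration only.
records = [
    SessionRecord("Free play", 14.0, "female", "infant crawling on carpet"),
    SessionRecord("Slope task", 12.5, "male", "crawling down slopes"),
    SessionRecord("Book reading", 24.0, "female", "shared reading at home"),
]
crawlers = find_sessions(records, keyword="crawling")
```

Plain substring matching is used here for the keyword step; it would not treat "crawl" and "crawlers" as the same concept.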

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot

Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
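
The session model described on slides 86-96 (a session with a release level, a date, pilot and exclusion flags, participants, annotations, and files) can be sketched as a few Python dataclasses. This is a hedged illustration under stated assumptions, not Databrary's schema: every class and field name below is hypothetical, task, condition, group, and location are collapsed into one `Annotation` type for brevity, and the birthdate in the example is invented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Annotation:
    """A named piece of session metadata (hypothetical unified type)."""
    kind: str  # "task", "condition", "group", or "location"
    name: str
    description: str = ""


@dataclass
class Session:
    """The basic unit: data collected at the same time under one release level."""
    release_level: str = "not asked"
    session_date: Optional[date] = None     # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None   # None means the session is included
    participants: List[Participant] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Example mirroring the slide's session; the participant is invented.
infant = Participant(subject_id="S01", birthdate=date(2013, 4, 7))
session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=[infant],
    annotations=[Annotation(kind="location", name="lab")],
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
```

Keeping the session date and birthdate optional reflects the slide's note that a date can be unknown, in which case derived values such as age are simply unavailable.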

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
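
The idea that a study includes sessions from a dataset and "layers" its own files over them can be sketched as follows. This is one possible reading in Python, offered as an assumption rather than the actual design: `Dataset`, `Study`, `files_for`, and the dataset and study names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Selects sessions from a dataset and layers analysis files over them."""
    name: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files first, then this study's layered files."""
        if session_id not in self.included:
            raise KeyError(f"session {session_id!r} is not part of {self.name!r}")
        raw = self.dataset.sessions[session_id]
        return raw + self.layers.get(session_id, [])


# Invented example: two raw sessions, one of which is used by the study.
raw = Dataset("example-dataset", {
    "s1": ["baby-cam.mp4", "overhead-cam.mp4"],
    "s2": ["baby-cam.mp4"],
})
study = Study("example-study", raw, included=["s1"],
              layers={"s1": ["coding-spreadsheet.opf", "analysis.sav"]})
```

Because the study only references the dataset, the same raw sessions can be reused by a second study with different inclusions and layers, which is the re-purposing case the slide describes.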

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
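
Merging conceptually identical tags, as in the "crawl" example on the preceding slide, can be approximated with a very rough suffix stripper. The Python sketch below is a naive stand-in chosen for illustration: it is not how any search engine works, real systems would use a proper stemmer or curated synonym lists, and `naive_stem`, `merge_tags`, and the suffix list are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Checked in order, so longer suffixes are tried before shorter ones.
SUFFIXES = ("ers", "ing", "er", "s")


def naive_stem(tag: str) -> str:
    """Lowercase the tag and strip one common English suffix, if any."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that share a stem, preserving the original spellings."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[naive_stem(tag)].append(tag)
    return dict(groups)


merged = merge_tags(["crawl", "Crawling", "crawlers", "walking"])
```

A grouping like this could feed either option on the slide: pop-up suggestions at entry time, or a curator's after-the-fact merge list.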

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107


Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128


Thank you

130


UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
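The "crawl" finds "crawling" idea can be illustrated with a very small suffix-stripping matcher. This is a rough sketch only: the suffix list, the minimum stem length, and the function names `stem` and `search_tags` are assumptions made for illustration, and a real system would more likely use an established stemmer or search engine.

```python
from typing import List

# Longer suffixes are listed first so "crawlers" loses "ers" rather than "s".
SUFFIXES = ("ers", "ing", "er", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search_tags(query: str, tags: List[str]) -> List[str]:
    """Return the tags whose stem matches the stem of the query."""
    target = stem(query)
    return [tag for tag in tags if stem(tag) == target]


matches = search_tags("crawl", ["crawling", "crawlers", "walking", "Crawl"])
```

The same `stem` function could also back the "merge conceptually identical tags" option, by grouping tags that share a stem for a curator to review.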

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

[Figure, slides 36-41: research Timeline with stages Grants & Contracts, IRB, Data Collection, Coding / Analyses / Writing, and Paper in Press. Labels mark Ask Ss to Share, Upload As You Go, Upload After the Fact, Upload Videos to Share, and View & Use Videos. Successive builds add example dates: 2014-04, 2015-04, 2016-04, 2014-09, 2015-09, 2016-09, and 2017-07.]

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
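The release levels listed here behave like an ordered scale, which can be sketched as a small access check. The Python below is an illustrative assumption, not Databrary's implementation: the enum names, the three-way `Viewer` role, and the functions `can_view_session` and `excerpts_allowed` are hypothetical, and treating Public as also permitting excerpts is an inference rather than something the slide states.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0          # undocumented; functions as private
    PRIVATE = 1            # data owners and editors only
    SHARED = 2             # authorized Databrary investigators
    SHARED_EXCERPTS = 3    # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4             # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1         # authorized Databrary investigator
    OWNER = 2              # data owner or editor


def can_view_session(release: Release, viewer: Viewer) -> bool:
    """Decide whether a viewer may see the full session recording."""
    if viewer == Viewer.OWNER:
        return True
    if release <= Release.PRIVATE:       # NOT_ASKED functions as PRIVATE
        return False
    if release == Release.PUBLIC:
        return True
    return viewer == Viewer.AUTHORIZED   # SHARED and SHARED_EXCERPTS


def excerpts_allowed(release: Release) -> bool:
    """Excerpts may be shown only at SHARED_EXCERPTS or above (assumption)."""
    return release >= Release.SHARED_EXCERPTS
```

Using an ordered enum keeps "did not ask" from ever being more permissive than "private", which matches the slide's rule that undocumented sessions function as private.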

47

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after (it's clear what was recorded)

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Figure: nested access levels. Groups shown: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES. Permissions shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPTool)
Resource sharing plans

65

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
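The searching, filtering, sorting, and summarizing steps listed above can be sketched over a small in-memory list. Everything in this Python example is hypothetical: the session records, the field names (`description`, `age_months`, `gender`), and the functions `find_sessions` and `count_by` are made up for illustration and do not describe the Databrary API.

```python
from collections import Counter
from typing import Dict, List, Optional

# Made-up session records, for illustration only.
SESSIONS: List[Dict] = [
    {"id": "s01", "description": "infant crawling on ramp", "age_months": 11, "gender": "F"},
    {"id": "s02", "description": "toddler walking at home", "age_months": 18, "gender": "M"},
    {"id": "s03", "description": "infant crawling at home", "age_months": 9, "gender": "M"},
]


def find_sessions(
    sessions: List[Dict],
    keyword: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
    gender: Optional[str] = None,
    sort_by: str = "age_months",
) -> List[Dict]:
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for s in sessions:
        if keyword is not None and keyword.lower() not in s["description"].lower():
            continue
        if min_age is not None and s["age_months"] < min_age:
            continue
        if max_age is not None and s["age_months"] > max_age:
            continue
        if gender is not None and s["gender"] != gender:
            continue
        results.append(s)
    return sorted(results, key=lambda s: s[sort_by])


def count_by(sessions: List[Dict], variable: str) -> Counter:
    """Summarize sessions as counts per value of one metadata variable."""
    return Counter(s[variable] for s in sessions)


crawlers = find_sessions(SESSIONS, keyword="crawling")
young_at_home = find_sessions(SESSIONS, keyword="home", max_age=12)
gender_counts = count_by(SESSIONS, "gender")
```

A filter like this only works across datasets if the variables share names and units, which is the standardization trade-off the board is asked to discuss.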

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location
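The session metadata fields on this slide map naturally onto a single record type. The Python sketch below is illustrative only: the class name `SessionMetadata`, the field names and defaults, and the helper `in_target_sample` are assumptions, not the Databrary data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SessionMetadata:
    """One record per session, mirroring the fields listed on the slide."""
    release_level: str = "Not asked"          # Not asked, Private, Shared, ...
    session_date: Optional[date] = None       # may be unknown for archival data
    pilot: bool = False                       # methods not finalized or standard
    excluded_reason: Optional[str] = None     # None means the session is included
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None

    def in_target_sample(self) -> bool:
        """Assumed rule: pilot and excluded sessions are kept but not analyzed."""
        return not self.pilot and self.excluded_reason is None


usable = SessionMetadata(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=["P01"],
    tasks=["free play"],
    location="lab",
)
dropped = SessionMetadata(excluded_reason="Procedural/experimenter error")
```

Storing the exclusion as a reason rather than deleting the session keeps excluded and pilot data available for later re-purposing.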

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev.
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93



Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
- Register and request authorization
- Explore the site and tell us what you think
- We will authorize you temporarily through the beta if needed
- Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach
- Invites to ICIS, CDS, SRCD members
- Conference exhibits at major academic society meetings
- Workshops on sharing identifiable data
Timesavers as incentives
- Developing 'boilerplate' Databrary language for grant proposals and reports
- Data management plans (partner with DMPtool)
- Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
- Target simple coding tasks on-line
- Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
- Piled up into studies
- Minimal organization
"Static" views
- Single presentation option
- Limited flexibility

71

Development progress
Revised technical requirements
- Flexible discovery, browsing options
- Responsive, intuitive editing, uploading
- High-level visualizations, summaries
- API, scripting interface
Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
- Missing, multiple videos
- Don't fit cleanly in defined groups
- Excluded and later re-purposed data
- Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
- Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
- Searching: by keyword, text descriptions, names, etc.
- Filtering: limiting results by age, gender, numeric values, etc.
- Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
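The discovery steps on the preceding slide (searching, filtering, sorting, summarizing) can be sketched in a few lines. This is only an illustration under stated assumptions: the record fields (`age_months`, `gender`, `description`) and the functions `discover` and `summarize` are hypothetical and do not describe the actual Databrary API.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical session records; the field names are illustrative only.
SESSIONS: List[dict] = [
    {"id": "s1", "age_months": 12, "gender": "F", "description": "crawling over slopes"},
    {"id": "s2", "age_months": 18, "gender": "M", "description": "walking over bridges"},
    {"id": "s3", "age_months": 14, "gender": "F", "description": "free play at home"},
    {"id": "s4", "age_months": 11, "gender": "M", "description": "crawling at home"},
]


def discover(
    records: List[dict],
    keyword: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
    gender: Optional[str] = None,
    sort_by: str = "age_months",
) -> List[dict]:
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for rec in records:
        # Searching: case-insensitive keyword match on the text description.
        if keyword is not None and keyword.lower() not in rec["description"].lower():
            continue
        # Filtering: limit results by numeric and categorical values.
        if min_age is not None and rec["age_months"] < min_age:
            continue
        if max_age is not None and rec["age_months"] > max_age:
            continue
        if gender is not None and rec["gender"] != gender:
            continue
        results.append(rec)
    # Sorting: order results so they can be scanned visually.
    return sorted(results, key=lambda rec: rec[sort_by])


def summarize(records: List[dict]) -> Dict[str, int]:
    """Aggregate view: count sessions per gender."""
    return dict(Counter(rec["gender"] for rec in records))
```

For example, `discover(SESSIONS, keyword="crawling")` returns the two crawling sessions ordered by age, and `summarize(SESSIONS)` gives per-gender counts.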

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically dev
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

96
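The session-centered model described on the preceding slides (a session covered by a release level, holding raw files and annotated with participant, task, condition, group, and location metadata) could be represented roughly as below. This is a minimal sketch, not Databrary's actual schema: the class and field names are assumptions chosen to mirror the slide headings, and the subject ID "S001" is made up.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (hypothetical fields)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """The basic unit: data collected at the same time under one release level."""
    release_level: str                      # e.g. "private", "shared", "public"
    session_date: Optional[date] = None     # may be unknown for archival data
    pilot: bool = False                     # methods not yet finalized or standard
    excluded_reason: Optional[str] = None   # None means the session was usable
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None           # lab, home, classroom, ...
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None


# The example session shown on the "Session is the basic unit" slide.
session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S001", gender="F")],
    setting="lab",
    files=[
        "intake.mp4",
        "baby-cam.mp4",
        "coding-spreadsheet.opf",
        "overhead-cam.mp4",
        "analysis.sav",
        "survey.doc",
    ],
)
```

Keeping exclusion as an optional reason rather than a bare flag follows the "Excluded" slide, where the reason is itself part of the metadata.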

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
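One way to read "studies include sessions from datasets and layer files over them" is that a study references raw sessions without copying or modifying them and keeps its own analysis files alongside. The sketch below illustrates that reading only; the `Dataset` and `Study` classes are hypothetical, and for brevity a study draws on a single dataset here, whereas the slide allows several.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: maps a session id to the raw files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: references dataset sessions and layers files over them."""
    name: str
    dataset: Dataset
    session_ids: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, session_id: str) -> None:
        if session_id not in self.dataset.sessions:
            raise KeyError(session_id)
        if session_id not in self.session_ids:
            self.session_ids.append(session_id)

    def layer(self, session_id: str, filename: str) -> None:
        if session_id not in self.session_ids:
            raise KeyError(session_id)
        # Layered files live in the study, so the raw dataset stays untouched.
        self.layers.setdefault(session_id, []).append(filename)

    def files(self, session_id: str) -> List[str]:
        raw = list(self.dataset.sessions[session_id])
        return raw + self.layers.get(session_id, [])


raw = Dataset("example-dataset", {"2014-04-07": ["intake.mp4", "baby-cam.mp4"]})
study = Study("example-study", raw)
study.include("2014-04-07")
study.layer("2014-04-07", "coding-spreadsheet.opf")
```

Because layers are stored on the study, a second study can repurpose the same session with different analysis files.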

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
- Studies, no datasets
Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
- Publication abstract
- Excerpts only
- Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
- Subgroups can do this (emo-group)
Not standardizing codes
- Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
- Is this enough? Too much? Are there others?
- Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
- Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
- Video or session level searches
- Aggregating across datasets
But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
- Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
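Two of the options on the preceding slide, merging conceptually identical tags and letting a search for "crawl" also find "crawling" and "crawlers", can both be approximated by normalizing tags to a common stem. The sketch below uses a deliberately naive suffix stripper written for this illustration; it is not a real stemming algorithm and not what Databrary or any search engine actually uses. A production system would more likely rely on an established stemmer or a curated synonym list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Suffixes tried in order; longer ones first so "crawlers" loses "ers", not just "s".
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def naive_stem(tag: str) -> str:
    """Lowercase a tag and strip one common English suffix (illustrative only)."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that share a stem, as a curator might when cleaning up."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[naive_stem(tag)].append(tag)
    return dict(groups)


def search_tags(query: str, tags: Iterable[str]) -> List[str]:
    """Return every tag whose stem matches the stem of the query."""
    target = naive_stem(query)
    return [tag for tag in tags if naive_stem(tag) == target]
```

With these definitions, `search_tags("crawl", ["crawling", "crawlers", "walking"])` returns the two crawl-related tags. The approach fails on words with doubled consonants such as "running", which is one reason to treat it only as a sketch.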

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
- How is this done now?
- How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
- Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
- How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities
Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities
Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
- Cloud storage/data sharing
Offer new services via APIs
- Lab management
- Project/task management
- Data visualization
- Social media, communications
- Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Thank you

130

[Figure: research timeline, shown as animation builds across slides 37-41. Research stages: IRB; Grants & Contracts; Data Collection; Coding, Analyses, Writing; Paper in Press. Databrary touchpoints: Ask Ss to Share; Upload As You Go; Upload Videos to Share; Upload After the Fact; View & Use Videos. The timeline runs from 2014-04 to 2017-07, with intermediate markers (2014-09, 2015-04, 2015-09, 2016-04, 2016-09) appearing in individual builds.]

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
- Informed consent to share videos
- Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
- Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
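The release levels on the preceding slide amount to a small access rule set, which can be sketched as follows. This is an illustration under assumptions, not Databrary's implementation: the `ReleaseLevel` and `Viewer` enums and both functions are hypothetical names, the three viewer roles are a simplification, and treating public sessions as excerpt-eligible is an inference from "available to the public" rather than something the slide states.

```python
from enum import Enum


class ReleaseLevel(Enum):
    NOT_ASKED = "not asked"            # undocumented: functions as private
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    ANYONE = "member of the public"
    AUTHORIZED = "authorized Databrary investigator"
    OWNER = "data owner or editor"


def can_view(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if this viewer may access a session at this release level."""
    if viewer is Viewer.OWNER:
        return True                    # owners and editors always see their data
    if level is ReleaseLevel.PUBLIC:
        return True
    if level in (ReleaseLevel.SHARED, ReleaseLevel.SHARED_EXCERPTS):
        return viewer is Viewer.AUTHORIZED
    return False                       # NOT_ASKED and PRIVATE


def excerpts_allowed(level: ReleaseLevel) -> bool:
    """Return True if excerpts may be shown for informational or educational use."""
    # Assumption: public sessions also permit excerpts.
    return level in (ReleaseLevel.SHARED_EXCERPTS, ReleaseLevel.PUBLIC)
```

Modeling "did not ask" as its own level rather than folding it into private preserves the distinction between a participant who declined and one who was never asked.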

What data are being released?
Videos from a session (or part of a session)
- Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
- Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
- Consent to participate (in the study) before
- Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
- Tested on our own participants
Adaptable and flexible
- Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
- Clips for talks, teaching
- Demonstration of specific procedure or method
- Relations between behaviors and codes
Why ask for excerpts separately?
- Effectively public
- Needs to be clear to participants

54


[Figure: research timeline running from 2014-04 to 2017-07. Stages: Grants & Contracts, IRB, Data Collection, Coding/Analyses/Writing, Paper in Press. Databrary steps: Ask Ss to Share, Upload As You Go or Upload After the Fact, Upload Videos to Share, View & Use Videos. Successive builds mark 2015-04 and 2016-04, then 2014-09, 2015-09, and 2016-09.]

41

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

43

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
- Available restrictions on identifiable data
  - Did not ask: if undocumented, functions as private
  - Private: restricted to data owners and editors
  - Shared: restricted to authorized Databrary investigators
  - Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: available to the public
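
As a rough illustration of how the release levels could gate access, here is a minimal Python sketch. The `Viewer` roles, `can_view`, and `may_show_excerpt` are hypothetical names invented for this example, not Databrary's implementation; the only rules encoded are the ones listed on the slide.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented: functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer roles (an assumption for this sketch)."""
    PUBLIC = 0
    AUTHORIZED = 1       # authorized Databrary investigator
    OWNER = 2            # data owner or editor


def can_view(release: Release, viewer: Viewer) -> bool:
    """Return True if the viewer may see the full recording."""
    if viewer == Viewer.OWNER:
        return True
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        return False
    if release == Release.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS are limited to authorized investigators.
    return viewer == Viewer.AUTHORIZED


def may_show_excerpt(release: Release) -> bool:
    """Return True if excerpts may be shown for informational or educational use."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "Did not ask" as the lowest level keeps the safe default: a session with no documented decision behaves exactly like a private one.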

47

What data are being released?
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique?
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Figure: access levels. Audiences: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties. Capabilities: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

Break

67

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
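
To make the four discovery operations concrete, here is a small Python sketch over in-memory records. The record fields (`description`, `age_months`, `gender`) and the function names are assumptions made up for illustration; they are not the Databrary schema or API.

```python
from collections import Counter
from typing import Any, Dict, List

# Hypothetical flat session record; field names are assumptions for this sketch.
Record = Dict[str, Any]


def search(records: List[Record], keyword: str) -> List[Record]:
    """Searching: keep records whose description contains the keyword."""
    kw = keyword.lower()
    return [r for r in records if kw in str(r.get("description", "")).lower()]


def filter_age(records: List[Record], min_months: int, max_months: int) -> List[Record]:
    """Filtering: limit results to an inclusive age range in months."""
    return [r for r in records if min_months <= r["age_months"] <= max_months]


def sort_by(records: List[Record], field: str) -> List[Record]:
    """Sorting: order results by one field for visual scanning."""
    return sorted(records, key=lambda r: r[field])


def count_by(records: List[Record], field: str) -> Counter:
    """Aggregating: count records per value of a field."""
    return Counter(r[field] for r in records)
```

Each function returns a new list, so the steps can be chained, for example searching first, then filtering by age, then sorting the remainder.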

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Example: Session, April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Indicates that session was not usable for the target study
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

91

Participant
- Represents an individual depicted or represented in this session
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev.
- Language
- Country (of origin)
- SES

92

Task
- Represents a particular task, activity, or phase that occurred during all or a portion of the session
- Name
- Summary
- Description

93

Condition
- Represents an experimenter-determined manipulation (within or between sessions)
- Name
- Summary
- Description
- Other variables

94

Group
- Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)
- Name
- Summary
- Description
- Other variables

95

Location
- Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
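
The session, dataset, and study relationships described above can be sketched as plain data structures. This is only an illustration under assumed names: the `Session`, `Dataset`, and `Study` classes and their fields are hypothetical and do not reflect Databrary's actual schema.

```python
import datetime
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    """Basic unit: raw files collected at the same time under one release level."""
    id: str
    release: str                                   # e.g. "shared"
    session_date: Optional[datetime.date] = None   # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None
    files: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    """Raw data: a named collection of sessions."""
    name: str
    sessions: List[Session] = field(default_factory=list)


@dataclass
class Study:
    """Generated data: references sessions and layers extra files over them."""
    title: str
    session_ids: List[str] = field(default_factory=list)
    layered_files: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, session: Session, *extra_files: str) -> None:
        """Add a session to the study, optionally layering analysis files on it."""
        self.session_ids.append(session.id)
        self.layered_files.setdefault(session.id, []).extend(extra_files)

    def files_for(self, session: Session) -> List[str]:
        """Raw files plus this study's layered files; the session is not modified."""
        return session.files + self.layered_files.get(session.id, [])
```

Because a study stores its layered files separately, the same session can appear in several studies without its raw file list changing, which is what makes re-purposing possible.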

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Example: Session, April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing?
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
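
The three options above can be sketched in a few lines of Python. This is a toy illustration: `suggest_tags`, `stem`, and `group_tags` are hypothetical helpers, and the suffix-stripping stemmer is a deliberately crude stand-in for real semantic search, chosen only because it handles the "crawl" example.

```python
from typing import Dict, List

# Crude suffix list, checked in order; an assumption for this sketch only.
SUFFIXES = ("ing", "ers", "er", "s")


def suggest_tags(prefix: str, existing: List[str]) -> List[str]:
    """Pop-up suggestions: existing tags that start with what the user typed."""
    p = prefix.lower()
    return sorted((t for t in existing if t.lower().startswith(p)), key=str.lower)


def stem(word: str) -> str:
    """Strip one common suffix, keeping at least three leading characters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def group_tags(tags: List[str]) -> Dict[str, List[str]]:
    """Curator clean-up aid: group tags that share a stem as merge candidates."""
    groups: Dict[str, List[str]] = {}
    for tag in tags:
        groups.setdefault(stem(tag), []).append(tag)
    return groups
```

Matching on stems at search time corresponds to leaving the tags as-is, while `group_tags` gives curators a list of candidates to merge by hand.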

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated
- Weaknesses we neglected or overstated

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location
Represents a particular setting or other variable aspect of where/when/how session data were collected
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
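
The session metadata described on the preceding slides can be pictured as a small record type. The following is a minimal Python sketch, not Databrary's actual schema: the class names, field names, and types are assumptions chosen to mirror the slide headings.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    # Hypothetical subset of the participant variables listed on the slide.
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Session:
    # One session: data collected at the same time under one release level.
    release_level: str                       # e.g. "Shared"
    session_date: Optional[date] = None      # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None    # None means the session was not excluded
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None            # e.g. "lab", "home"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        # A session counts as excluded when a reason has been recorded.
        return self.excluded_reason is not None


session = Session(
    release_level="Shared",
    session_date=date(2014, 4, 7),
    participants=[Participant(subject_id="S01")],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
```

Keeping the exclusion reason as an optional field, rather than a separate flag, is just one possible design; it avoids a flag and a reason drifting out of sync.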

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
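
One way to read the dataset/study distinction is that a study holds references to sessions rather than copies, plus files layered on top of them. The sketch below is hypothetical: the `Dataset` and `Study` classes, their fields, and the session ids are illustrative assumptions, not Databrary's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    # Raw data: a named collection of session ids.
    name: str
    session_ids: List[str] = field(default_factory=list)


@dataclass
class Study:
    # A study references sessions and layers its own files over them.
    title: str
    included_sessions: List[str] = field(default_factory=list)
    layered_files: Dict[str, List[str]] = field(default_factory=dict)

    def layer(self, session_id: str, filename: str) -> None:
        # Only sessions already included in the study can carry layered files.
        if session_id not in self.included_sessions:
            raise ValueError(f"session {session_id} is not part of this study")
        self.layered_files.setdefault(session_id, []).append(filename)


dataset = Dataset(name="example dataset", session_ids=["s1", "s2", "s3"])

study_a = Study(title="original study", included_sessions=["s1", "s2"])
study_a.layer("s1", "coding-spreadsheet.opf")

# A second study repurposes some of the same sessions without copying them.
study_b = Study(title="repurposed study", included_sessions=["s2", "s3"])
shared = set(study_a.included_sessions) & set(study_b.included_sessions)
```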

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
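
As a rough illustration of the idea that a search for "crawl" should also find "crawling" and "crawlers", the sketch below matches tags by a naive suffix-stripping stem. This is a simple stand-in for real semantic search, and the suffix list and function names are assumptions made for the example.

```python
from typing import List

# Suffixes are checked in this order, longest first (an illustrative list only).
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query: str, tags: List[str]) -> List[str]:
    """Return the tags whose stem equals the stem of the query."""
    target = stem(query)
    return [tag for tag in tags if stem(tag) == target]


tags = ["crawling", "crawlers", "walking", "crawl"]
matches = search("crawl", tags)
```

Matching on stems leaves the tags themselves untouched, which fits the "leave the mess as-is" option; merging tags after the fact would instead rewrite the stored values.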

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109


Quick SWOT analysis
- Strengths
- Weaknesses
- Opportunities
- Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities: Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities: Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities: Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities: Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities: Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities: Beyond storing, tagging video
- Visualization
- Additional analysis streams

128


Thank you

130

[Figure, slides 39-41 (animation builds): project timeline ending 2017-07, with marked dates 2014-04, 2016-04, 2014-09, 2015-09, and 2016-09 appearing across builds. Labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press, Upload As You Go, Upload After the Fact, Upload Videos to Share, View & Use Videos.]

Upload as you go (Labnanny)
- We need to get the video data into Databrary
- We can't rely on researchers uploading data after their paper is submitted
- We need to make it easy to upload data while researchers are collecting it

42


Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta, upload and search, Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
- Did not ask: if undocumented, functions as private
- Private: restricted to data owners and editors
- Shared: restricted to authorized Databrary investigators
- Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: available to the public
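
The release levels above amount to a small access rule. The following Python sketch shows one way to express it; the `Release` and `Viewer` enums and both function names are assumptions for illustration and do not describe how Databrary actually enforces access. Who may see excerpts on the site is left out deliberately.

```python
from enum import Enum


class Release(Enum):
    NOT_ASKED = "not asked"
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    # Hypothetical viewer categories for this sketch.
    PUBLIC = "member of the public"
    AUTHORIZED = "authorized investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, viewer: Viewer) -> bool:
    """Whether a viewer may see a recording at the given release level."""
    if viewer is Viewer.OWNER:
        return True
    if release is Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer is Viewer.AUTHORIZED
    # PRIVATE, and NOT_ASKED (which functions as private).
    return False


def excerpts_allowed(release: Release) -> bool:
    """Whether excerpts may be used for informational or educational purposes."""
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

For example, `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is false because an undocumented release is treated as private.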

47

What data are being released?
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique?
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after (it's clear what was recorded)

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example
databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Figure: access levels. Groups shown: Everybody, Registered Users, Authorized Users, Interested Parties, Collaborators, Lab. Permissions shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65


Break

67


Datavyu current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
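
The searching, filtering, sorting, and summarizing steps listed above can be sketched over a list of session records. This is a minimal illustration with made-up records; the field names (`description`, `age_months`, `gender`) and the `find` function are assumptions for the example, not part of the Databrary API.

```python
from collections import Counter
from typing import Dict, List, Optional

# Made-up session records for illustration only.
sessions: List[Dict] = [
    {"id": "s1", "description": "infant crawling over slopes", "age_months": 12, "gender": "F"},
    {"id": "s2", "description": "toddler walking in lab", "age_months": 18, "gender": "M"},
    {"id": "s3", "description": "infant crawling at home", "age_months": 9, "gender": "M"},
]


def find(records: List[Dict], keyword: Optional[str] = None,
         min_age: Optional[int] = None, max_age: Optional[int] = None,
         sort_key: str = "age_months") -> List[Dict]:
    """Search by keyword, filter by age range, then sort the results."""
    results = []
    for record in records:
        if keyword and keyword.lower() not in record["description"].lower():
            continue
        if min_age is not None and record["age_months"] < min_age:
            continue
        if max_age is not None and record["age_months"] > max_age:
            continue
        results.append(record)
    return sorted(results, key=lambda record: record[sort_key])


hits = find(sessions, keyword="crawling", max_age=12)
hit_ids = [record["id"] for record in hits]

# A simple aggregate view: counts of sessions by gender.
gender_counts = Counter(record["gender"] for record in sessions)
```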

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82


UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-04 2016-04

Timeline

Ask Ss to Share

39

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-07

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  Target simple coding tasks on-line
  Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos
  Piled up into studies
  Minimal organization
"Static" views
  Single presentation option
  Limited flexibility

71

Development progress
Revised technical requirements
  Flexible discovery, browsing options
  Responsive, intuitive editing, uploading
  High-level visualizations, summaries
  API, scripting interface
Built new, completely dynamic JS interface
  Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data
  Missing, multiple videos
  Don't fit cleanly in defined groups
  Excluded and later re-purposed data
  Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
  Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
  Searching: by keyword, text, descriptions, names, etc.
  Filtering: limiting results by age, gender, numeric values, etc.
  Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
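The searching, filtering, sorting, and summarizing steps listed on the previous slide could be sketched roughly as follows. This is only an illustration in Python: the record fields (`description`, `age_days`, `gender`) and the function names `discover` and `counts_by` are assumptions made for the example, not Databrary's actual schema or API.

```python
from collections import Counter
from typing import Any, Dict, List, Optional

# Hypothetical flat session records; field names are illustrative only.
SESSIONS: List[Dict[str, Any]] = [
    {"id": "s1", "description": "Infant crawling on slopes", "age_days": 300, "gender": "F"},
    {"id": "s2", "description": "Toddler walking at home", "age_days": 450, "gender": "M"},
    {"id": "s3", "description": "Crawling over a gap", "age_days": 280, "gender": "M"},
]


def discover(
    sessions: List[Dict[str, Any]],
    keyword: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
    gender: Optional[str] = None,
    sort_by: str = "age_days",
) -> List[Dict[str, Any]]:
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for s in sessions:
        # Searching: case-insensitive keyword match on the description.
        if keyword is not None and keyword.lower() not in s["description"].lower():
            continue
        # Filtering: numeric range and categorical limits.
        if min_age is not None and s["age_days"] < min_age:
            continue
        if max_age is not None and s["age_days"] > max_age:
            continue
        if gender is not None and s["gender"] != gender:
            continue
        results.append(s)
    # Sorting: order results for visual scanning.
    return sorted(results, key=lambda s: s[sort_by])


def counts_by(sessions: List[Dict[str, Any]], key: str) -> Dict[Any, int]:
    """Summarize: count sessions per value of one field."""
    return dict(Counter(s[key] for s in sessions))
```

For example, `discover(SESSIONS, keyword="crawling")` would return the two crawling sessions ordered by age, and `counts_by(SESSIONS, "gender")` would give a per-gender count suitable for a simple summary view.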

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
  intake.mp4
  baby-cam.mp4
  coding-spreadsheet.opf
  overhead-cam.mp4
  analysis.sav
  survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
  Did not meet inclusion criteria
  Procedural/experimenter error
  Withdrew/fussy/tired
  Outlier

Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically developing
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
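One way to picture the session metadata walked through in the preceding slides is as a small set of record types. The sketch below is a hedged illustration in Python; the class names, field names, and the `age_in_days` helper are assumptions for the example and are not taken from the Databrary codebase.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    """An individual depicted or represented in a session (fields abridged)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None


@dataclass
class Session:
    """The basic unit: files collected at the same time plus metadata."""
    release: str = "not asked"             # release level covering the session
    session_date: Optional[date] = None    # may be unknown for archival data
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # e.g. "Outlier"; None if included
    participants: List[Participant] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None         # e.g. "lab", "home", "classroom"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded when a reason has been recorded."""
        return self.excluded_reason is not None

    def age_in_days(self, participant: Participant) -> Optional[int]:
        """Age at test, or None when either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days
```

Keeping the session date and birthdate optional reflects the slides' point that dates are identifiable and may be unknown; derived values such as age then have to tolerate missing inputs.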

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
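The idea that a study includes sessions from a dataset and "layers" files over them can be illustrated with a minimal sketch. The `Dataset` and `Study` classes and the `files_for` method below are hypothetical names chosen for this example; they only show how layering can leave the raw dataset untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data: maps a session id to the files collected in that session."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: picks sessions from a dataset and layers files on top."""
    title: str
    dataset: Dataset
    included: List[str] = field(default_factory=list)           # session ids
    layers: Dict[str, List[str]] = field(default_factory=dict)  # id -> analysis files

    def files_for(self, session_id: str) -> List[str]:
        """Raw files followed by this study's layered files for one session."""
        if session_id not in self.included:
            raise KeyError(f"session {session_id!r} is not part of this study")
        raw = self.dataset.sessions[session_id]
        # Concatenation builds a new list, so the dataset's own list is not modified.
        return raw + self.layers.get(session_id, [])
```

Because a study only references session ids, several studies can repurpose the same raw sessions with different layered analysis files.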

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
  intake.mp4
  baby-cam.mp4
  coding-spreadsheet.opf
  overhead-cam.mp4
  analysis.sav
  survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare
  Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing
  Excerpts only, displays only: studies, no datasets
  Repurposing data: studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
  Publication abstract
  Excerpts only
  Stimuli, displays

100

What's on the table for standardizing
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
  Subgroups can do this (emo-group)
Not standardizing codes
  Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
  Is this enough? Too much? Are there others?
  Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
  Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
  Video or session level searches
  Aggregating across datasets
But possible change in current practices
  More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
  Semantic search (Google: "crawl" finds "crawling", "crawlers", etc.)

104
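The "leave the mess as-is" option on the previous slide relies on search that treats related word forms as the same tag. A deliberately crude way to sketch that in Python is suffix stripping; the `normalize`, `merge_tags`, and `search` functions and the suffix list are assumptions for illustration only, and a real system would more likely use an established stemmer or curated synonyms.

```python
from collections import defaultdict
from typing import Dict, List

# Longer suffixes come first so "crawlers" loses "ers" rather than just "s".
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common English suffix (a toy stemmer)."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: List[str]) -> Dict[str, List[str]]:
    """Group tags that normalize to the same stem, as a curator might."""
    merged = defaultdict(set)
    for tag in tags:
        merged[normalize(tag)].add(tag)
    return {stem: sorted(variants) for stem, variants in merged.items()}


def search(query: str, tagged: Dict[str, List[str]]) -> List[str]:
    """Return items having any tag whose stem matches the query's stem."""
    stem = normalize(query)
    return sorted(
        item for item, tags in tagged.items()
        if any(normalize(t) == stem for t in tags)
    )
```

With this sketch, a query for "crawl" matches items tagged "crawling" or "crawlers" but not "walking", which is the behavior the slide describes.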

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
  How is this done now?
  How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
  Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
  How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
  TalkBank/CHILDES
  Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  Societies
    - Report on developmental society meeting

121

Opportunities
Potential for partnering
  Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      • Developmental science only
      • Psychology
      • Neuroscience

122

Opportunities
Potential for partnering
  Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
  Domain-focus, visibility
    - Your colleagues are here
  Policies to share identifiable research data, especially recordings
  University & Federal support
  Stability and long-term sustainability
  Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
  Cloud storage/data sharing
Offer new services via APIs
  Lab management
  Project/task management
  Data visualization
  Social media, communications
  Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
  Visualization
  Additional analysis streams

128

Thank you

130

[Figure, slides 40-41: research timeline running Grants & Contracts, IRB, Data Collection, Coding, Analyses, Writing, Paper in Press, annotated with Ask Ss to Share, Upload As You Go, Upload After the Fact, Upload Videos to Share, and View & Use Videos; successive builds add the dates 2014-09, 2015-09, 2016-09, and 2017-07]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search, Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: Identifiable data
Policies that enable open sharing of identifiable data
  Informed consent to share videos
  Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors
  Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data
  Did not ask: If undocumented, functions as private
  Private: Restricted to data owners and editors
  Shared: Restricted to authorized Databrary investigators
  Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  Public: Available to the public

47
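The release levels defined on the previous slide form an ordered scale, which suggests a simple access check. The following Python sketch is an assumption-laden illustration: the `Release` enum, `can_view`, and `excerpt_may_be_shown` are hypothetical names, and the rules encode only what the slide states, not Databrary's implemented permission logic.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED, plus excerpts for wider use
    PUBLIC = 4           # available to the public


def can_view(release: Release, is_owner: bool, is_authorized: bool) -> bool:
    """Return True if a viewer may see the full recording."""
    if is_owner or release == Release.PUBLIC:
        return True
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return is_authorized
    # NOT_ASKED and PRIVATE are both restricted to owners and editors.
    return False


def excerpt_may_be_shown(release: Release) -> bool:
    """Excerpts may be used for informational, scientific, and educational purposes."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest level keeps the default safe: a session with no documented release is handled exactly like a private one.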

What data are being released
Videos from a session (or part of a session)
  Can decline sharing for sessions or segments of sessions that contain sensitive information
Session- and participant-level metadata that are identifiable
  Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share
  Consent to participate (in the study) before
  Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
  Tested on our own participants
Adaptable and flexible
  Only local IRB must approve
  Can collect decisions of all depicted individuals on one page
Used in many contexts
  International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
  Clips for talks, teaching
  Demonstration of specific procedure or method
  Relations between behaviors and codes
Why ask for excerpts separately?
  Effectively public
  Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
  Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
  No requirement to contribute
Institutional sign-off by Authorizing Official
  Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
  Regardless of authorization
Four pre-selected options
  Lab and collaborators only
  Databrary data only
  Lab data and Databrary data
  Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turnMore threatsDifferent spin on them

120

OpportunitiesPotential for partnering

TalkBankCHILDESOther sharing organizations

- Research Data Alliance (RDA)- Center for Open Science (COS)- Stanford Meta-Research Center

Societies- Report on developmental society meeting

121

OpportunitiesPotential for partnering

Journals- Store supplemental data- Store extended materials for all submissions- Which fields

bull Developmental science onlybull Psychologybull Neuroscience

122

OpportunitiesPotential for partnering

Conferences (both society-affiliated and not)- Store share talk recordings- Store supplemental data

Sustainability component- Member or attendee fees-for-service

123

OpportunitiesEmphasize competitive advantages

Domain-focus visibility- Your colleagues are here

Policies to share identifiable research data especially recordingsUniversity amp Federal supportStability and long-term sustainabilitySharing is our business model

124

Your thoughtsPartnershipsJournalsConferencesSharpen our messageOther ideas

125

OpportunitiesIntegration with external services

Cloud storagedata sharingOffer new services via APIs

Lab managementProjecttask managementData visualizationSocial media communicationsElectronic CV profile management

126

Your thoughtsDo you use any of these servicesIs providing data sharing services enoughDoes integration with other tools help or hinder our goals

127

OpportunitiesBeyond storing tagging video

VisualizationAdditional analysis streams

128

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

129

Thank you

130

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2015-09

Timeline

Ask Ss to Share

40

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public
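
The release levels above form an ordered scale, which can be sketched as a small enumeration. This is only an illustration under stated assumptions: the names Release, Viewer, can_view_full and excerpt_allowed are hypothetical and are not taken from the Databrary code base, and the three viewer categories are a simplification of the access groups in the deck.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restricted."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like shared, plus excerpts for talks and teaching
    PUBLIC = 4           # available to the public


class Viewer(IntEnum):
    """Hypothetical viewer categories, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view_full(release: Release, viewer: Viewer) -> bool:
    """Return True if this viewer may see the full recording."""
    if viewer == Viewer.OWNER:
        return True
    if viewer == Viewer.AUTHORIZED:
        return release >= Release.SHARED
    return release == Release.PUBLIC


def excerpt_allowed(release: Release) -> bool:
    """Return True if excerpts may be shown for informational purposes."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "Did not ask" as the lowest value means an undocumented session is handled exactly like a private one, which matches the rule stated on the slide.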

47

What data are being released
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

Access levels (diagram)
Groups: Everybody; Registered users; Authorized users; Collaborators; Lab; Interested parties
Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing/multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching: by keyword, text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases
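
The four discovery operations named above (searching, filtering, sorting, summarizing) can be illustrated on a handful of in-memory records. This is a minimal sketch, not the Databrary API: the SessionRecord fields, the function names and the sample data are all made up for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class SessionRecord:
    """Hypothetical, simplified session record (not Databrary's schema)."""
    name: str
    description: str
    age_months: float
    gender: str


def search(records: List[SessionRecord], keyword: str) -> List[SessionRecord]:
    """Searching: case-insensitive keyword match on name or description."""
    kw = keyword.lower()
    return [r for r in records
            if kw in r.name.lower() or kw in r.description.lower()]


def filter_by_age(records: List[SessionRecord],
                  min_months: float, max_months: float) -> List[SessionRecord]:
    """Filtering: keep records whose age falls in an inclusive range."""
    return [r for r in records if min_months <= r.age_months <= max_months]


def sort_by_age(records: List[SessionRecord]) -> List[SessionRecord]:
    """Sorting: youngest first, for visual scanning."""
    return sorted(records, key=lambda r: r.age_months)


def count_by_gender(records: List[SessionRecord]) -> Counter:
    """Summarizing: a count per gender value."""
    return Counter(r.gender for r in records)


# Made-up example data.
records = [
    SessionRecord("s01", "infant crawling over a bridge", 12.0, "F"),
    SessionRecord("s02", "free play at home", 18.5, "M"),
    SessionRecord("s03", "crawling down slopes", 14.2, "M"),
]
```

The operations compose: sort_by_age(filter_by_age(records, 13, 19)) narrows the list first and then orders what is left.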

77

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data
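
One way to picture the dataset/study relationship described above is a study that references sessions in a dataset and layers its own files over them without modifying the raw data. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, and the split of the example file names into raw versus layered files is a guess made only for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    """Hypothetical session: a date, a release level, and raw data files."""
    date: str
    release: str
    files: List[str]


@dataclass
class Dataset:
    """Raw data collected in sessions, keyed by a session id."""
    name: str
    sessions: Dict[str, Session] = field(default_factory=dict)


@dataclass
class Study:
    """Selects sessions from a dataset and layers analysis files over them."""
    title: str
    dataset: Dataset
    session_ids: List[str]
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def files_for(self, session_id: str) -> List[str]:
        """Raw files plus this study's layered files; the dataset is untouched."""
        raw = self.dataset.sessions[session_id].files
        return raw + self.layers.get(session_id, [])


# Example values are illustrative only.
dataset = Dataset("example dataset")
dataset.sessions["s1"] = Session(
    date="2014-04-07",
    release="shared",
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
study = Study(
    title="example study",
    dataset=dataset,
    session_ids=["s1"],
    layers={"s1": ["coding-spreadsheet.opf", "analysis.sav"]},
)
```

Because a study only refers to sessions by id, a second study could reuse the same session with a different set of layered files, which is the re-purposing case the slide mentions.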

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication, abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
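
Merging conceptually identical tags, so that "crawl" also finds "crawling" and "crawlers", can be approximated with a very naive normalizer. The sketch below uses crude suffix stripping purely to show the idea; it is not how any real search engine or Databrary does it, the function names are hypothetical, and a production system would more likely use a proper stemmer plus curator review.

```python
from collections import defaultdict
from typing import Dict, List

# Suffixes tried in order; longer ones come first so "ers" wins over "s".
SUFFIXES = ("ing", "ers", "er", "ed", "s")


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common suffix (deliberately crude)."""
    t = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[:-len(suffix)]
    return t


def merge_tags(tags: List[str]) -> Dict[str, List[str]]:
    """Group the original tags under their normalized form."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)
```

For example, merge_tags(["crawl", "Crawling", "crawlers", "walking"]) groups the first three tags under "crawl" and the last under "walk". The same crudeness that makes this short also makes it wrong for many words, which is why curator clean-up remains on the list of options.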

104

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization

Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09

Timeline

Ask Ss to Share

41

UploadAs You Go

Upload Videos toShare

UploadAfter the Fact

View ampUse

Videos

Grants ampContracts

DataCollection

Coding AnalysesWriting

Paper in PressIRB

2017-072014-09 2016-09

Timeline

Ask Ss to Share

41

Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small

How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

  - Developmental science only
  - Psychology
  - Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Thank you

130

[Figure: Timeline of the research workflow. Labels: Grants & Contracts, IRB, Ask Ss to Share, Data Collection, Coding, Analyses, Writing, Paper in Press. Upload paths: Upload As You Go, Upload After the Fact, Upload Videos to Share, View & Use Videos. Dates shown: 2014-09, 2016-09, 2017-07.]

41

Upload as you go (Labnanny)
We need to get the video data into Databrary
We can't rely on researchers uploading data after their paper is submitted
We need to make it easy to upload data while researchers are collecting it

42

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search Datavyu 2.0
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

44

Policy framework
Special problem with video: identifiable data
Policies that enable open sharing of identifiable data

Informed consent to share videos
Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
Template release form
Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
Standardization across contributors

Developed to correspond directly with release levels

46

Release levels
Available restrictions on identifiable data

Did not ask: If undocumented, functions as private
Private: Restricted to data owners and editors
Shared: Restricted to authorized Databrary investigators
Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
Public: Available to the public

47
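The release levels on this slide behave like an ordered scale, which can be sketched as a small access check. This is an assumption-laden illustration rather than Databrary's actual authorization logic: the `Role` categories and the functions `can_view` and `can_show_excerpt` are hypothetical names, and the three-role split is a simplification of the access tiers described elsewhere in the deck.

```python
from enum import IntEnum


class Release(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    NOT_ASKED = 0        # undocumented; functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like shared, but excerpts may be shown
    PUBLIC = 4           # available to the public


class Role(IntEnum):
    """Hypothetical viewer roles (an assumption, not Databrary's role model)."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view(role: Role, release: Release) -> bool:
    """Return True if a viewer with this role may see the full session."""
    if role is Role.OWNER:
        return True  # owners and editors always see their own data
    if role is Role.AUTHORIZED:
        return release >= Release.SHARED
    return release is Release.PUBLIC


def can_show_excerpt(release: Release) -> bool:
    """Return True if excerpts may be shown in talks or teaching."""
    return release >= Release.SHARED_EXCERPTS
```

Treating "did not ask" as the lowest level means an undocumented session is never exposed beyond its owners, matching the slide's rule that it functions as private.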

What data are being released?
Videos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contain sensitive information

Session- and participant-level metadata that are identifiable

Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.

Codes of behaviors

48

Databrary release: What is unique?
Completely separates consent to participate from release to share

Consent to participate (in the study) before
Release to share after: it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants

Tested on our own participants
Adaptable and flexible

Only local IRB must approve
Can collect decisions of all depicted individuals on one page

Used in many contexts
International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case

Clips for talks, teaching
Demonstration of specific procedure or method
Relations between behaviors and codes

Why ask for excerpts separately?
Effectively public
Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?

Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

[Figure: Access levels. Labels: EVERYBODY, INTERESTED PARTIES, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB. Capabilities shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data.]

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

Break

67

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): Bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching by keyword, text, descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
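The filtering and sorting steps listed on this slide could look roughly like the following. It is a sketch under stated assumptions, not the Databrary API: the record fields `age_months` and `gender` and the function `filter_sessions` are hypothetical, chosen only to show filtering by age and gender followed by sorting for visual scanning.

```python
from typing import Any, Dict, List, Optional

SessionRecord = Dict[str, Any]  # hypothetical flat record for one session


def filter_sessions(
    sessions: List[SessionRecord],
    min_age_months: Optional[float] = None,
    max_age_months: Optional[float] = None,
    gender: Optional[str] = None,
) -> List[SessionRecord]:
    """Keep sessions matching every given criterion, sorted by participant age."""
    results = []
    for session in sessions:
        age = session["age_months"]
        if min_age_months is not None and age < min_age_months:
            continue
        if max_age_months is not None and age > max_age_months:
            continue
        if gender is not None and session["gender"] != gender:
            continue
        results.append(session)
    # Sorting by age gives a predictable order for scanning results visually.
    return sorted(results, key=lambda s: s["age_months"])
```

Criteria left as `None` are ignored, so the same function covers browsing everything, limiting by one variable, or combining several.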

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87
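The session metadata listed on this slide map naturally onto a record type. The sketch below is one possible shape, not the Databrary schema: the class name `Session`, its field names, and the choice of plain strings for participants, conditions, tasks, and groups are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Session:
    """Hypothetical record mirroring the session metadata on the slide."""
    release: str = "not asked"             # one of the five release levels
    session_date: Optional[date] = None    # identifiable; may be unknown
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # None means the session is usable
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)  # raw data files

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason has been recorded."""
        return self.excluded_reason is not None


# Example: the session shown on the slide, with two of its raw data files.
example = Session(
    release="shared",
    session_date=date(2014, 4, 7),
    files=["intake.mp4", "baby-cam.mp4"],
)
```

Defaulting the release level to "not asked" and the date to `None` reflects the slides' points that undocumented sessions function as private and that a session date may be unknown.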

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One to one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli, displays

100

What's on the table for standardizing?
Metadata: variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103


Upload as you go (Labnanny)We need to get the video data into DatabraryWe canrsquot rely on researchers uploading data after their paper is submittedWe need to make it easy to upload data while researchers are collecting it

42

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

43

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
- Excluded Ss, pilot Ss
- In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
- Studies, no datasets
- Repurposing data
- Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?
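The trade-off under discussion, a fixed set of metadata types with user-extensible variables inside each, can be sketched as follows. The names `METADATA_TYPES`, `MetadataRecord`, and `set_variable` are hypothetical, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Fixed set of metadata types, taken from the slides; users cannot add to it here.
METADATA_TYPES = frozenset({"participant", "task", "condition", "group", "location"})


@dataclass
class MetadataRecord:
    """A record of a fixed type whose variables are open-ended."""
    kind: str
    variables: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Enforced structure: reject types outside the fixed set.
        if self.kind not in METADATA_TYPES:
            raise ValueError(f"unknown metadata type: {self.kind!r}")

    def set_variable(self, name: str, value: Any) -> None:
        # Flexibility: any variable name is accepted within a known type.
        self.variables[name] = value


record = MetadataRecord("participant", {"birthdate": "2013-04-07"})
record.set_variable("handedness", "left")   # a user-defined variable

try:
    MetadataRecord("apparatus")              # not one of the fixed types
    rejected = False
except ValueError:
    rejected = True
```

Restricting variables as well would mean validating `name` against a per-type whitelist inside `set_variable`; letting users define their own types would mean dropping the check in `__post_init__`.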

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
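Two of these options, suggesting existing tags and merging conceptually identical ones, are easy to illustrate. The sketch below assumes a hypothetical curator-maintained merge table (`CURATED_MERGES`); it is a stand-in for real stemming or semantic search, which would need a proper language library.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical curator-maintained table: variant tag -> canonical tag.
CURATED_MERGES: Dict[str, str] = {
    "crawling": "crawl",
    "crawlers": "crawl",
    "crawler": "crawl",
}


def canonical_tag(tag: str) -> str:
    """Lower-case, collapse whitespace, then apply the curated merge table."""
    cleaned = " ".join(tag.lower().split())
    return CURATED_MERGES.get(cleaned, cleaned)


def suggest(prefix: str, known_tags: Iterable[str]) -> List[str]:
    """Pop-up style suggestions: existing tags that start with what was typed."""
    typed = prefix.lower().strip()
    return sorted(tag for tag in known_tags if tag.startswith(typed))


def matching_sessions(query: str, tags_by_session: Dict[str, Iterable[str]]) -> Set[str]:
    """Find sessions whose tags match the query after canonicalization."""
    wanted = canonical_tag(query)
    return {
        session_id
        for session_id, tags in tags_by_session.items()
        if wanted in {canonical_tag(tag) for tag in tags}
    }


tags = {"s1": ["Crawling"], "s2": ["walking"], "s3": ["crawlers "]}
hits = matching_sessions("crawl", tags)
```

Here the uncurated tags are left as entered and only reconciled at search time, which corresponds to cleaning up after the fact rather than constraining what contributors type.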

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109


Quick SWOT analysis
- Strengths
- Weaknesses
- Opportunities
- Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
  - Offer new services via APIs
    - Lab management
    - Project/task management
    - Data visualization
    - Social media, communications
    - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128


Thank you

130


Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search; Datavyu 2.0
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

44

Policy framework
- Special problem with video: identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

45

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

46

Release levels
- Available restrictions on identifiable data
  - Did not ask: if undocumented, functions as private
  - Private: restricted to data owners and editors
  - Shared: restricted to authorized Databrary investigators
  - Shared + excerpts: restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: available to the public
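These release levels amount to a simple access rule, sketched below. Assumptions beyond the slide: the three viewer categories are a simplification, owners and editors are taken to see everything, and a public release is taken to permit excerpts. The names `Release`, `Viewer`, `can_view`, and `excerpts_allowed` are illustrative.

```python
from enum import Enum


class Release(Enum):
    NOT_ASKED = "did not ask"
    PRIVATE = "private"
    SHARED = "shared"
    SHARED_EXCERPTS = "shared + excerpts"
    PUBLIC = "public"


class Viewer(Enum):
    PUBLIC = "member of the public"
    AUTHORIZED = "authorized Databrary investigator"
    OWNER = "data owner or editor"


def can_view(release: Release, viewer: Viewer) -> bool:
    """Decide whether a viewer may see identifiable data at a release level."""
    if release in (Release.NOT_ASKED, Release.PRIVATE):
        # An undocumented release functions as private.
        return viewer is Viewer.OWNER
    if release in (Release.SHARED, Release.SHARED_EXCERPTS):
        return viewer in (Viewer.AUTHORIZED, Viewer.OWNER)
    return True  # Release.PUBLIC


def excerpts_allowed(release: Release) -> bool:
    """Excerpts may be shown more widely only when the release says so.

    Treating PUBLIC as also permitting excerpts is an assumption.
    """
    return release in (Release.SHARED_EXCERPTS, Release.PUBLIC)
```

For example, `can_view(Release.NOT_ASKED, Viewer.AUTHORIZED)` is false, since only owners and editors can reach data whose release was never documented.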

47

What data are being released
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after: it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
- No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

Access levels (diagram)
- Groups: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties
- Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
  - Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65


Break

67


Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
- Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
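The discovery features listed here (keyword search, filtering, sorting, and simple aggregation) can be sketched over in-memory records. This is an illustration only: `SessionRow`, `search`, and the three example rows are invented, and a real site would run such queries against a database rather than a Python list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SessionRow:
    session_id: str
    description: str
    gender: str
    birthdate: date
    session_date: date

    @property
    def age_days(self) -> int:
        """Participant age at test, derived from birthdate and session date."""
        return (self.session_date - self.birthdate).days


def search(rows: List[SessionRow], keyword: Optional[str] = None,
           gender: Optional[str] = None,
           min_age_days: Optional[int] = None,
           max_age_days: Optional[int] = None) -> List[SessionRow]:
    """Keyword search plus filters, with results sorted by age for scanning."""
    hits = []
    for row in rows:
        if keyword and keyword.lower() not in row.description.lower():
            continue
        if gender and row.gender != gender:
            continue
        if min_age_days is not None and row.age_days < min_age_days:
            continue
        if max_age_days is not None and row.age_days > max_age_days:
            continue
        hits.append(row)
    return sorted(hits, key=lambda row: row.age_days)


rows = [
    SessionRow("a", "Infant crawling on slopes", "F", date(2013, 4, 7), date(2014, 4, 7)),
    SessionRow("b", "Toddler walking", "M", date(2012, 4, 7), date(2014, 4, 7)),
    SessionRow("c", "Crawling over gaps", "F", date(2013, 10, 7), date(2014, 4, 7)),
]
crawling = [row.session_id for row in search(rows, keyword="crawling")]
older_girls = [row.session_id for row in search(rows, gender="F", min_age_days=200)]
gender_counts = Counter(row.gender for row in rows)   # a simple aggregated view
```

Deriving age from birthdate and session date, rather than storing it, only works when both dates are known, which the metadata slides note is not always the case.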

77

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)
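Several of these stories (paste or import metadata, track what is still missing, export for analysis) map onto plain CSV handling. The sketch below uses Python's standard `csv` module; the column names in `FIELDS` and the sample rows are assumptions, not a Databrary format.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column layout for pasted session metadata.
FIELDS = ["session_id", "date", "condition", "video"]


def import_sessions(text: str) -> List[Dict[str, str]]:
    """Parse pasted or imported CSV text into one dict per session."""
    return list(csv.DictReader(io.StringIO(text)))


def missing_video(rows: List[Dict[str, str]]) -> List[str]:
    """Keep track of which sessions still have no video associated."""
    return [row["session_id"] for row in rows if not row.get("video")]


def export_sessions(rows: List[Dict[str, str]]) -> str:
    """Write previously entered metadata back out as CSV for analysis."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


pasted = (
    "session_id,date,condition,video\n"
    "s1,2014-04-07,slope,baby-cam.mp4\n"
    "s2,2014-04-08,flat,\n"
)
sessions = import_sessions(pasted)
todo = missing_video(sessions)       # sessions still waiting for an upload
exported = export_sessions(sessions)
```

A round trip through import and export leaves the text unchanged, so researchers could keep working in their existing spreadsheet tools alongside the site.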

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84


Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search Datavyu 20330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

44

Policy frameworkSpecial problem with video Identifiable dataPolicies that enable open sharing of identifiable data

Informed consent to share videosEnsuring adherence to a common set of practices and ethical principles

45

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location
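As a rough illustration of the session model described on these slides, the sketch below represents a session as a record with a release level, a date, the listed metadata fields, and a timeline of files. The class names, field names, and the `start_seconds` offset are assumptions for the example, not Databrary's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SessionFile:
    name: str                    # e.g. "baby-cam.mp4"
    start_seconds: float = 0.0   # hypothetical offset on the session timeline


@dataclass
class Session:
    release_level: Optional[str] = None    # None stands for "not asked"
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[SessionFile] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        return self.excluded_reason is not None

    def timeline(self) -> List[str]:
        """File names ordered by their offset on the session timeline."""
        ordered = sorted(self.files, key=lambda f: f.start_seconds)
        return [f.name for f in ordered]


session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=["S01"],
    files=[SessionFile("baby-cam.mp4", 30.0), SessionFile("intake.mp4", 0.0)],
)
```

Keeping every field optional or defaulted reflects the point made elsewhere in the deck that real data are irregular: dates can be unknown and release may never have been asked.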

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

Pilot
- Indicates that the methods used in the session were not finalized or standard

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier
- Indicates that session was not usable for the target study

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES
Represents an individual depicted or represented in this session

Task
- Name
- Summary
- Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

Condition
- Name
- Summary
- Description
- Other variables
Represents an experimenter-determined manipulation (within or between sessions)

Group
- Name
- Summary
- Description
- Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
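One way to picture the dataset/study split is that a study references sessions owned by a dataset and keeps its own analysis files in a separate layer, so the raw data are never modified. The sketch below is a minimal illustration under that reading; `Dataset`, `Study`, `include`, `layer`, and `files_for` are hypothetical names, not part of any Databrary API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    name: str
    # session id -> raw data files collected in that session
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    name: str
    sessions: List[Tuple[str, str]] = field(default_factory=list)
    # (dataset name, session id) -> analysis files layered over the raw data
    layers: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str) -> None:
        """Reference a session from a dataset without copying its files."""
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id!r} is not in dataset {dataset.name!r}")
        key = (dataset.name, session_id)
        if key not in self.sessions:
            self.sessions.append(key)

    def layer(self, dataset: Dataset, session_id: str, filename: str) -> None:
        """Attach a study-specific file to an included session."""
        key = (dataset.name, session_id)
        if key not in self.sessions:
            raise KeyError(f"session {key!r} is not included in this study")
        self.layers.setdefault(key, []).append(filename)

    def files_for(self, dataset: Dataset, session_id: str) -> List[str]:
        """Raw files from the dataset followed by this study's layered files."""
        key = (dataset.name, session_id)
        return list(dataset.sessions[session_id]) + self.layers.get(key, [])


raw = Dataset("locomotion", {"s1": ["baby-cam.mp4", "overhead-cam.mp4"]})
study = Study("recoding-2014")
study.include(raw, "s1")
study.layer(raw, "s1", "coding-spreadsheet.opf")
```

Because layers live on the study, two studies can repurpose the same session with different coding files.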

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)
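The "leave the mess as-is" option relies on search matching related word forms. A toy version of that idea is shown below: a naive suffix-stripping function stands in for a real stemmer, so that a query for "crawl" also matches "crawling" and "crawlers". The function names and suffix list are assumptions for illustration; a production system would use an established stemming or lemmatization library instead.

```python
def naive_stem(word):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    w = word.lower()
    for suffix in ("ers", "er", "ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def search_tags(query, tags):
    """Return the tags whose stem matches the stem of the query."""
    target = naive_stem(query)
    return [t for t in tags if naive_stem(t) == target]


tags = ["crawling", "crawlers", "walking", "crawl"]
matches = search_tags("crawl", tags)
```

The same stem comparison could drive the curator workflow of merging conceptually identical tags.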

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

Break

Quick SWOT analysis
- Strengths
- Weaknesses
- Opportunities
- Threats

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
- How to compete/cooperate with commercial entities

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

Threats
- Open science space moving quickly
- Attracting venture capital

Data-sharing

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

Your turn
- More threats?
- Different spin on them?

Opportunities: Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

Opportunities: Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

Opportunities: Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

Opportunities: Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

Opportunities: Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

Opportunities: Beyond storing, tagging video
- Visualization
- Additional analysis streams

Thank you

Policy framework
- Special problem with video: Identifiable data
- Policies that enable open sharing of identifiable data
  - Informed consent to share videos
  - Ensuring adherence to a common set of practices and ethical principles

Databrary release
- Template release form
- Informed consent: can share identifiable data if you tell participants (all depicted individuals) and they agree
- Standardization across contributors
  - Developed to correspond directly with release levels

Release levels
- Available restrictions on identifiable data
  - Did not ask: If undocumented, functions as private
  - Private: Restricted to data owners and editors
  - Shared: Restricted to authorized Databrary investigators
  - Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
  - Public: Available to the public
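The release levels form an ordered scale, which makes the access rule easy to express in code. The sketch below is an illustration only, assuming the rules exactly as worded on the slide; the names `ReleaseLevel`, `can_view`, and `excerpts_allowed` are hypothetical and do not come from the Databrary codebase.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Release levels from the slide, ordered from most to least restrictive."""
    PRIVATE = 1
    SHARED = 2
    SHARED_EXCERPTS = 3
    PUBLIC = 4


def effective_level(recorded):
    """An undocumented release ("did not ask") functions as private."""
    return ReleaseLevel.PRIVATE if recorded is None else recorded


def can_view(recorded, is_owner_or_editor=False, is_authorized=False):
    """Decide whether a user may view a session's identifiable data."""
    level = effective_level(recorded)
    if is_owner_or_editor or level is ReleaseLevel.PUBLIC:
        return True
    if level in (ReleaseLevel.SHARED, ReleaseLevel.SHARED_EXCERPTS):
        return is_authorized  # restricted to authorized investigators
    return False              # private: owners and editors only


def excerpts_allowed(recorded):
    """Excerpts may be shown only at 'shared + excerpts' or 'public'."""
    return effective_level(recorded) >= ReleaseLevel.SHARED_EXCERPTS
```

Modeling "did not ask" as a missing value rather than a fifth level keeps the default safe: anything undocumented falls back to private.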

What data are being released?
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

Databrary release: What is unique?
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after: it's clear what was recorded

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
- Can collect decisions of all depicted individuals on one page
  - Used in many contexts
- International, oral/written, diverse ethnic groups, children with disabilities

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
- No IRB needed

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

[Figure: access levels. Groups shown: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties. Permissions shown: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data.]

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

Registration demo

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
  - Allows universities to manage their authorizations on Databrary

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

Break

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

Databrary site

October 2013 board meeting
- Lots of videos
  - Piled up into studies
- Minimal organization
  - "Static" views
- Single presentation option
- Limited flexibility

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
- Front-end uses API

Databrary releaseTemplate release formInformed consent can share identifiable data if you tell participants (all depicted individuals) and they agree Standardization across contributors

Developed to correspond directly with release levels

46

Release levelsAvailable restrictions on identifiable data

Did not ask If undocumented functions as privatePrivate Restricted to data owners and editorsShared Restricted to authorized Databrary investigatorsShared + excerpts Restricted like shared but excerpts may be used for informational scientific and educational purposesPublic Available to the public

47

What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases

77
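The searching, filtering, and sorting steps listed on the slide above can be sketched in a few lines. This is only an illustration under stated assumptions: the record fields (`name`, `description`, `age_months`, `gender`), the sample data, and the function name `discover` are all hypothetical and are not taken from the Databrary software or its API.

```python
from typing import Any, Dict, List, Optional

# Hypothetical session records; field names and values are illustrative only.
SESSIONS: List[Dict[str, Any]] = [
    {"name": "free play", "description": "infant crawling on a ramp",
     "age_months": 12, "gender": "F"},
    {"name": "walking", "description": "toddler walking over a bridge",
     "age_months": 18, "gender": "M"},
    {"name": "reaching", "description": "infant reaching while crawling",
     "age_months": 9, "gender": "M"},
]


def discover(
    sessions: List[Dict[str, Any]],
    keyword: Optional[str] = None,
    max_age_months: Optional[int] = None,
    gender: Optional[str] = None,
    sort_by: str = "age_months",
) -> List[Dict[str, Any]]:
    """Search by keyword, filter by age and gender, then sort the results."""
    results = []
    for session in sessions:
        text = f"{session['name']} {session['description']}".lower()
        # Searching: the keyword must appear in the name or description.
        if keyword is not None and keyword.lower() not in text:
            continue
        # Filtering: drop records outside the requested age or gender.
        if max_age_months is not None and session["age_months"] > max_age_months:
            continue
        if gender is not None and session["gender"] != gender:
            continue
        results.append(session)
    # Sorting: order the surviving records for visual scanning.
    return sorted(results, key=lambda s: s[sort_by])


matches = discover(SESSIONS, keyword="crawl", max_age_months=12)
print([m["name"] for m in matches])
```

Keeping search, filter, and sort as separate optional arguments mirrors the three-way split on the slide; a real implementation would more likely push these into database queries.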

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84


Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87
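The session metadata fields listed above could be held in a simple record type. The sketch below is an assumption-laden illustration, not the Databrary data model: the class name `Session`, every attribute name, and the choice to store participants, conditions, tasks, and groups as plain strings are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Session:
    """One data collection session with the metadata named on the slide."""
    release_level: str = "not asked"
    session_date: Optional[date] = None      # the date may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None    # None means "not excluded"
    participants: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    location: Optional[str] = None
    files: List[str] = field(default_factory=list)  # raw data files on the timeline

    @property
    def excluded(self) -> bool:
        """A session counts as excluded once a reason has been recorded."""
        return self.excluded_reason is not None


# Hypothetical example values for illustration.
session = Session(
    release_level="shared",
    session_date=date(2014, 4, 7),
    participants=["S01"],
    location="lab",
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
print(session.excluded, len(session.files))
```

Storing the exclusion reason rather than a bare flag keeps the "Excluded" and "Reason" fields together, so an excluded session can still be shared and repurposed later.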

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One to one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
  - Excerpts only, displays only: studies, no datasets
  - Repurposing data: studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata: variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
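The idea of merging conceptually identical tags, using the slide's own "crawl", "crawling", "crawlers" example, can be illustrated with a rough sketch. The suffix-stripping rule below is a deliberately crude stand-in for real stemming or semantic search, and the function names `crude_stem` and `merge_tags` are hypothetical; nothing here reflects how Databrary implements tagging or search.

```python
from typing import Dict, Iterable, List


def crude_stem(tag: str) -> str:
    """Reduce a tag to a rough root by stripping a few English suffixes.

    This is an assumption made for illustration, not a real stemmer:
    it will mis-handle many words (for example irregular plurals).
    """
    word = tag.strip().lower()
    for suffix in ("ers", "ing", "er", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group user-entered tags under a shared root, keeping the originals."""
    merged: Dict[str, List[str]] = {}
    for tag in tags:
        merged.setdefault(crude_stem(tag), []).append(tag)
    return merged


print(merge_tags(["crawl", "crawling", "crawlers", "walking"]))
```

Grouping tags under a root while keeping the originals corresponds to the "clean up after the fact" option: users type what they like, and the merge happens at search time or during curation.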

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109


Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities: Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities: Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities: Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities: Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities: Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities: Beyond storing, tagging video
- Visualization
- Additional analysis streams

128


Thank you

130

Release levels
Available restrictions on identifiable data
- Did not ask: If undocumented, functions as private
- Private: Restricted to data owners and editors
- Shared: Restricted to authorized Databrary investigators
- Shared + excerpts: Restricted like shared, but excerpts may be used for informational, scientific, and educational purposes
- Public: Available to the public

47
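The release levels on this slide could be modeled roughly as follows. This is a minimal sketch under stated assumptions: the names `ReleaseLevel`, `Viewer`, and `can_view_full_session` are hypothetical, the three viewer categories are a simplification of the access groups discussed elsewhere in the deck, and the check covers full recordings only (excerpt handling is left out).

```python
from enum import Enum


class ReleaseLevel(Enum):
    """Release levels as listed on the slide, most to least restrictive."""
    NOT_ASKED = 0        # undocumented, so it functions as private
    PRIVATE = 1          # data owners and editors only
    SHARED = 2           # authorized Databrary investigators
    SHARED_EXCERPTS = 3  # like SHARED; excerpts may also be shown more widely
    PUBLIC = 4           # available to the public


class Viewer(Enum):
    """Hypothetical viewer categories, for illustration only."""
    PUBLIC = 0
    AUTHORIZED = 1
    OWNER = 2


def can_view_full_session(level: ReleaseLevel, viewer: Viewer) -> bool:
    """Return True if this viewer may see the full session recording."""
    if viewer is Viewer.OWNER:
        return True  # owners and editors always see their own data
    if level in (ReleaseLevel.NOT_ASKED, ReleaseLevel.PRIVATE):
        return False
    if level is ReleaseLevel.PUBLIC:
        return True
    # SHARED and SHARED_EXCERPTS: authorized investigators only.
    return viewer is Viewer.AUTHORIZED


print(can_view_full_session(ReleaseLevel.SHARED, Viewer.AUTHORIZED))
```

Treating "did not ask" exactly like "private" follows the slide's wording that undocumented sessions function as private.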

What data are being released?
- Videos from a session (or part of a session)
  - Can decline sharing for sessions or segments of sessions that contain sensitive information
- Session- and participant-level metadata that are identifiable
  - Birthdates, faces, names, interior of homes, classrooms, disabilities, self-reported health info, etc.
- Codes of behaviors

48

Databrary release: What is unique?
- Completely separates consent to participate from release to share
  - Consent to participate (in the study) before
  - Release to share after, when it's clear what was recorded

49

Our template works
- Multiple iterations of release form and procedure for talking with participants
  - Tested on our own participants
- Adaptable and flexible
  - Only local IRB must approve
  - Can collect decisions of all depicted individuals on one page
- Used in many contexts
  - International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51


Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Diagram: access levels]
Groups: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties
Permissions: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65


What data are being releasedVideos from a session (or part of a session)

Can decline sharing for sessions or segments of sessions that contains sensitive information

Session- and participant-level metadata that are identifiable

Birthdates faces names interior of homes classrooms disabilities self-reported health info etc

Codes of behaviors

48

Databrary release What is uniqueCompletely separates consent to participate from release to share

Consent to participate (in the study) beforeRelease to share aftermdashitrsquos clear what was recorded

49

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the data
Always exceptions, irregularities in real data
  Missing, multiple videos
  Don't fit cleanly in defined groups
  Excluded and later re-purposed data
  Always something non-rectangular
Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach
  Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets
  Searching: by keyword, text descriptions, names, etc.
  Filtering: limiting results by age, gender, numeric values, etc.
  Sorting: order to present results for scanning visually
View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
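
The three discovery steps on this slide (searching, filtering, sorting) can be sketched in a few lines of Python. This is only an illustration: the record fields (`description`, `age_months`, `gender`) and the function `find_sessions` are hypothetical names, not part of the Databrary site or its API.

```python
from typing import Dict, List, Optional

# Hypothetical session records; field names are illustrative only.
SESSIONS: List[Dict] = [
    {"id": 1, "description": "infant crawling over slopes", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "toddler walking in the lab", "age_months": 14, "gender": "M"},
    {"id": 3, "description": "crawling on a bridge", "age_months": 9, "gender": "M"},
]


def find_sessions(
    records: List[Dict],
    keyword: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
    gender: Optional[str] = None,
    sort_key: str = "age_months",
) -> List[Dict]:
    """Search by keyword, filter by age and gender, then sort the results."""
    matches = []
    for rec in records:
        # Searching: case-insensitive keyword match on the text description.
        if keyword is not None and keyword.lower() not in rec["description"].lower():
            continue
        # Filtering: limit results by numeric and categorical values.
        if min_age is not None and rec["age_months"] < min_age:
            continue
        if max_age is not None and rec["age_months"] > max_age:
            continue
        if gender is not None and rec["gender"] != gender:
            continue
        matches.append(rec)
    # Sorting: order the results so they are easy to scan visually.
    return sorted(matches, key=lambda rec: rec[sort_key])
```

For example, `find_sessions(SESSIONS, keyword="crawling")` returns the two crawling sessions ordered by age.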

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

86

Session metadata
Release level
Date
Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88
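
One way to picture the session unit and its release level is the Python sketch below. It is a sketch under stated assumptions, not the Databrary schema: the class and field names are hypothetical, and it assumes the release levels are listed from least to most open and that "Shared" or higher is what authorized users can see.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import IntEnum
from typing import List, Optional


class ReleaseLevel(IntEnum):
    """Release levels in slide order (assumed: least to most open)."""
    NOT_ASKED = 0
    PRIVATE = 1
    SHARED = 2
    SHARED_WITH_EXCERPTS = 3
    PUBLIC = 4


@dataclass
class Session:
    """One data-collection session: raw files plus session-level metadata."""
    release: ReleaseLevel
    session_date: Optional[date] = None    # the slides note the date could be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    files: List[str] = field(default_factory=list)

    def visible_to_authorized_users(self) -> bool:
        # Assumption: SHARED and anything more open is visible to authorized users.
        return self.release >= ReleaseLevel.SHARED


session = Session(
    release=ReleaseLevel.SHARED_WITH_EXCERPTS,
    session_date=date(2014, 4, 7),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Using an ordered enum keeps comparisons such as "at least Shared" to a single expression. Whether the real levels form a strict ordering is an assumption here.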

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded
Reason
  Did not meet inclusion criteria
  Procedural/experimenter error
  Withdrew/fussy/tired
  Outlier
Indicates that session was not usable for the target study

91

Participant
Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)
Disability/typically developing
Language
Country (of origin)
SES
Represents an individual depicted or represented in this session

92

Task
Name
Summary
Description
Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
Name
Summary
Description
Other variables
Represents an experimenter-determined manipulation (within or between sessions)

94

Group
Name
Summary
Description
Other variables
Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
Setting: lab, home, classroom, outdoors, clinic, museum
Language (of study materials)
State
Country
Represents a particular setting or other variable aspect of where/when/how session data were collected

96

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97
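
The dataset/study relationship described on this slide can be sketched as follows. The classes `Dataset` and `Study` and their methods are hypothetical, meant only to show how a study might reference sessions from a dataset and layer its own files over them without changing the raw data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw files."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, papers; references sessions and layers files on them."""
    name: str
    included: List[Tuple[Dataset, str]] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, dataset: Dataset, session_id: str, layered_files: Tuple[str, ...] = ()) -> None:
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id} is not in dataset {dataset.name}")
        self.included.append((dataset, session_id))
        # Layered files belong to the study; the dataset itself is untouched.
        self.layers.setdefault(session_id, []).extend(layered_files)

    def files_for(self, session_id: str) -> List[str]:
        # Simplification: assumes session ids are unique across included datasets.
        for dataset, sid in self.included:
            if sid == session_id:
                return dataset.sessions[sid] + self.layers.get(sid, [])
        raise KeyError(session_id)


raw = Dataset("locomotion", {"s01": ["baby-cam.mp4", "overhead-cam.mp4"]})
study = Study("slopes-paper")
study.include(raw, "s01", layered_files=("coding-spreadsheet.opf", "analysis.sav"))
```

Because the study only references the session, a second study can re-purpose the same raw files with a different layer.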

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session: April 7, 2014
intake.mp4
baby-cam.mp4
coding-spreadsheet.opf
overhead-cam.mp4
analysis.sav
survey.doc

98

Datasets and Studies
One to one mapping likely to be rare
  Excluded Ss, pilot Ss
  In interest of transparency and repurposing, should be no disincentive for sharing
Excerpts only, displays only
  Studies, no datasets
Repurposing data
  Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with
  Publication abstract
  Excerpts only
  Stimuli, displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks
  Subgroups can do this (emo-group)
Not standardizing codes
  Datavyu is agnostic

101

Discussion: How much to standardize?
Continuum between enforced structure and flexibility
Currently, types of metadata (participant, task, location, etc.) are fixed
  Is this enough? Too much? Are there others?
  Should we let users define their own?
Variables within metadata (birthdate, language, description) are user-extensible
  Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching
  Video or session level searches
  Aggregating across datasets
But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is
  Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
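
To make the "merge conceptually identical tags" and "crawl finds crawling, crawlers" ideas concrete, here is a toy Python sketch. It uses a deliberately naive suffix-stripping rule written for this example. The function names are hypothetical, and a real system would more likely use an established stemmer or curated synonym lists.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Suffixes are checked longest first so "crawlers" loses "ers", not just "s".
SUFFIXES = ("ers", "ing", "er", "ed", "s")


def naive_stem(tag: str) -> str:
    """Lower-case a tag and strip one common English suffix (toy rule)."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group user-entered tags that reduce to the same stem."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        groups[naive_stem(tag)].append(tag)
    return dict(groups)


merged = group_tags(["crawl", "Crawling", "crawlers", "walking"])
```

Here `merged` groups the three crawl variants under one key and leaves "walking" in its own group, so a curator (or a search box) can treat them as one concept without forcing users to pick from a fixed vocabulary.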

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session
  How is this done now?
  How can we make this non-invasive/simpler than current practices?
Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
Exclusions, groups, entire study organization can change later
Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
Easy to create, download excerpts
Last step: add publication citation, done

106

Taking advantage of your Databrary lab
Organization within labs requires less standardization
  Each can have their own standards
Search within your own data
Mine your data for subsequent analyses
Easy to disseminate examples
Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
Solution for sharing identifiable data
Growing awareness, visibility within content domain
Flexible data model poised for growth
University, agency support
Free, open-source video coding tool

112

Weaknesses
Data curation hard, slow
Fixing the past vs. fixing the future
Sustainability model undetermined
Small curation, development staff
Datavyu user base small
  How to compete/cooperate with commercial entities

113

Your thoughts
Strengths we neglected or overstated?
Weaknesses we neglected or overstated?

114

Threats
Open science space moving quickly
Attracting venture capital

115

Data-sharing

116

117

118

Threats
Databrary software not yet released
How can Databrary compete?
Why share with Databrary vs. alternatives?
What features will attract, keep users?
What features will ensure sustainability, change the culture of developmental science?

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering
  TalkBank/CHILDES
  Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  Societies
    - Report on developmental society meeting

121

Opportunities
Potential for partnering
  Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      • Developmental science only
      • Psychology
      • Neuroscience

122

Opportunities
Potential for partnering
  Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages
  Domain-focus, visibility
    - Your colleagues are here
  Policies to share identifiable research data, especially recordings
  University & Federal support
  Stability and long-term sustainability
  Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services
  Cloud storage/data sharing
Offer new services via APIs
  Lab management
  Project/task management
  Data visualization
  Social media, communications
  Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video
  Visualization
  Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Databrary release: What is unique?
Completely separates consent to participate from release to share
  Consent to participate (in the study) before
  Release to share after, when it's clear what was recorded

49

Our template works
Multiple iterations of release form and procedure for talking with participants
  Tested on our own participants
Adaptable and flexible
  Only local IRB must approve
  Can collect decisions of all depicted individuals on one page
Used in many contexts
  International, oral/written, diverse ethnic groups, children with disabilities

50

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
Asking to share does not obligate you to share
Can't share without asking
We'll store everything so long as participants are asked
We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
9 labs are currently requesting participant release
10 labs have pending IRB protocol amendments
17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
Most common (pre-Databrary) use case
  Clips for talks, teaching
  Demonstration of specific procedure or method
  Relations between behaviors and codes
Why ask for excerpts separately?
  Effectively public
  Needs to be clear to participants

54

Databrary and excerpts
Should Databrary make excerpts publicly available on the site?
Who decides which excerpts are public on Databrary?
  Only the data owner, or any PI on Databrary?

55

Authorization
Access to shared videos
Contribute and/or use data
  No requirement to contribute
Institutional sign-off by Authorizing Official
  Grants and contracts
  No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators
  Regardless of authorization
Four pre-selected options
  Lab and collaborators only
  Databrary data only
  Lab data and Databrary data
  Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS

LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed
  NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification
  Does not go to the IRB office for signing
  Further IRB approval not required for authorization
  Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible
  Where are researchers getting hung up?
  Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements
  Once schools are comfortable with Investigator Agreement
  No substantive changes to re-frame as an institutional agreement
Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link
  Register and request authorization
  Explore the site and tell us what you think
  We will authorize you temporarily through the beta if needed
  Please don't download Tajik videos

64

Our template worksMultiple iterations of release form and procedure for talking with participants

Tested on our own participantsAdaptable and flexible

Only local IRB must approveCan collect decisions of all depicted individuals on one page

Used in many contextsInternational oralwritten diverse ethnic groups children with disabilities

50

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
- Indicates that the methods used in the session were not finalized or standard

90

Excluded
- Reason
  - Did not meet inclusion criteria
  - Procedural/experimenter error
  - Withdrew/fussy/tired
  - Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
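The session model described on the preceding slides (release level, date, pilot, excluded, participants, tasks, conditions, groups, location, files) could be written down roughly as follows. This is a minimal sketch, not Databrary's schema: the class and field names are assumptions chosen to echo the slide labels, only a subset of participant variables is shown, and the `age_days` helper is an added illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class ReleaseLevel(Enum):
    """The five release levels listed on the slide."""
    NOT_ASKED = "Not asked"
    PRIVATE = "Private"
    SHARED = "Shared"
    SHARED_EXCERPTS = "Shared + excerpts"
    PUBLIC = "Public"


@dataclass
class Participant:
    """Subset of the participant variables from the slide (names assumed)."""
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Session:
    """One data collection session: the basic unit, covered by a release level."""
    release: ReleaseLevel
    session_date: Optional[date] = None       # may be unknown
    pilot: bool = False                       # methods not finalized or standard
    excluded_reason: Optional[str] = None     # e.g. "Outlier"; None if included
    participants: List[Participant] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)
    setting: Optional[str] = None             # e.g. "lab", "home", "classroom"
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        """A session counts as excluded when a reason has been recorded."""
        return self.excluded_reason is not None

    def age_days(self, participant: Participant) -> Optional[int]:
        """Age at test in days, or None if either date is unknown."""
        if self.session_date is None or participant.birthdate is None:
            return None
        return (self.session_date - participant.birthdate).days


# Made-up example values, for illustration only.
infant = Participant("P01", birthdate=date(2013, 4, 7), gender="F")
session = Session(
    release=ReleaseLevel.SHARED_EXCERPTS,
    session_date=date(2014, 4, 7),
    participants=[infant],
    setting="lab",
    files=["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
)
print(session.excluded, session.age_days(infant))  # False 365
```

Making both the session date and the birthdate optional reflects the slides' note that a date can be unknown.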

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data

97
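One way to picture the dataset/study split from the previous slide: a dataset owns the raw session files, and a study points at selected sessions and layers its own analysis files over them without touching the raw data. The sketch below is hypothetical. `Dataset`, `Study`, and their methods are illustrative names, and for brevity each study draws on a single dataset, whereas the slide allows several.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Raw data collected in sessions: session id -> raw file names."""
    name: str
    sessions: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Study:
    """Generated data, analyses, papers layered over sessions from a dataset."""
    name: str
    source: Dataset
    session_ids: List[str] = field(default_factory=list)
    layers: Dict[str, List[str]] = field(default_factory=dict)

    def include(self, session_id: str) -> None:
        """Include a session from the source dataset in this study."""
        if session_id not in self.source.sessions:
            raise KeyError(f"unknown session: {session_id}")
        if session_id not in self.session_ids:
            self.session_ids.append(session_id)

    def layer(self, session_id: str, filename: str) -> None:
        """Layer an analysis file over an included session."""
        if session_id not in self.session_ids:
            raise KeyError(f"session not included in study: {session_id}")
        self.layers.setdefault(session_id, []).append(filename)

    def files(self, session_id: str) -> List[str]:
        """Raw files plus layered files; the dataset itself is not modified."""
        raw = list(self.source.sessions[session_id])
        return raw + self.layers.get(session_id, [])


# Made-up example, reusing the file names shown on the session slides.
raw = Dataset("lab-recordings", {
    "2014-04-07": ["intake.mp4", "baby-cam.mp4", "overhead-cam.mp4"],
})
study = Study("locomotion-paper", source=raw)
study.include("2014-04-07")
study.layer("2014-04-07", "coding-spreadsheet.opf")
study.layer("2014-04-07", "analysis.sav")
print(study.files("2014-04-07"))
```

Because layers live on the study, a second study can re-purpose the same session with different analysis files.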

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One to one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
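The tag-merging and "crawl" finds "crawling" ideas on the previous slide can be approximated with a very crude suffix-stripping normalizer. This is only a sketch under loud assumptions: the suffix list and the function names are invented for illustration, real semantic search would use a proper stemmer or search engine, and nothing here describes how Databrary actually handles tags.

```python
from typing import Dict, Iterable, List

# Assumed, deliberately tiny suffix list; longer suffixes are tried first.
SUFFIXES = ("ers", "ing", "er", "s")


def stem(tag: str) -> str:
    """Lower-case a tag and strip one common suffix, keeping at least 3 letters."""
    word = tag.strip().lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def merge_tags(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Group tags that share a stem, as a curator might merge identical concepts."""
    merged: Dict[str, List[str]] = {}
    for tag in tags:
        merged.setdefault(stem(tag), []).append(tag)
    return merged


def find_tags(query: str, tags: Iterable[str]) -> List[str]:
    """Return every tag whose stem matches the stem of the query."""
    target = stem(query)
    return [tag for tag in tags if stem(tag) == target]


tags = ["crawl", "Crawling", "crawlers", "walking"]
print(merge_tags(tags))
print(find_tags("crawl", tags))  # ['crawl', 'Crawling', 'crawlers']
```

The same normalizer serves both options on the slide: curators could use the grouping to merge tags after the fact, or the mess could be left as-is and matched at search time.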

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Thank you

130

Video example

databrary.org/user-guide/getting-started/release-script/example-videos.html

51

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Figure: access tiers and capabilities]
Tiers: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES
Capabilities:
- Access publicly shared data
- Access and share Databrary data
- Upload data
- Access released data selectively shared
- Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turnMore threatsDifferent spin on them

120

OpportunitiesPotential for partnering

TalkBankCHILDESOther sharing organizations

- Research Data Alliance (RDA)- Center for Open Science (COS)- Stanford Meta-Research Center

Societies- Report on developmental society meeting

121

OpportunitiesPotential for partnering

Journals- Store supplemental data- Store extended materials for all submissions- Which fields

bull Developmental science onlybull Psychologybull Neuroscience

122

OpportunitiesPotential for partnering

Conferences (both society-affiliated and not)- Store share talk recordings- Store supplemental data

Sustainability component- Member or attendee fees-for-service

123

OpportunitiesEmphasize competitive advantages

Domain-focus visibility- Your colleagues are here

Policies to share identifiable research data especially recordingsUniversity amp Federal supportStability and long-term sustainabilitySharing is our business model

124

Your thoughtsPartnershipsJournalsConferencesSharpen our messageOther ideas

125

OpportunitiesIntegration with external services

Cloud storagedata sharingOffer new services via APIs

Lab managementProjecttask managementData visualizationSocial media communicationsElectronic CV profile management

126

Your thoughtsDo you use any of these servicesIs providing data sharing services enoughDoes integration with other tools help or hinder our goals

127

OpportunitiesBeyond storing tagging video

VisualizationAdditional analysis streams

128

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

129

Thank you

130

Video example

databraryorguser-guidegetting-startedrelease-scriptexample-videoshtml

51

Just do itAsking to share does not obligate you to shareCanrsquot share without askingWersquoll store everything so long as participants are askedWe need as many researchers as possible to incorporate Databrary releases

52

Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
- Did not meet inclusion criteria
- Procedural/experimenter error
- Withdrew/fussy/tired
- Outlier

Indicates that session was not usable for the target study

91

Participant
- Subject ID
- Birthdate
- Gender
- Race (NIH)
- Ethnicity (NIH)
- Disability/typically dev
- Language
- Country (of origin)
- SES

Represents an individual depicted or represented in this session

92

Task
- Name
- Summary
- Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition
- Name
- Summary
- Description
- Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group
- Name
- Summary
- Description
- Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location
- Setting: lab, home, classroom, outdoors, clinic, museum
- Language (of study materials)
- State
- Country

Represents a particular setting or other variable aspect of where/when/how session data were collected
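The session and metadata slides above can be read as a small data model. The following is only a minimal sketch under stated assumptions: the class and field names (`Session`, `Record`, `ReleaseLevel`, `records_of`) are hypothetical and are not Databrary's actual schema. It illustrates the idea that the metadata types are fixed while the variables inside each type are free-form.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class ReleaseLevel(Enum):
    """Release levels as listed on the slide (names are illustrative)."""
    NOT_ASKED = "Not asked"
    PRIVATE = "Private"
    SHARED = "Shared"
    SHARED_EXCERPTS = "Shared + excerpts"
    PUBLIC = "Public"


@dataclass
class Record:
    """One piece of metadata attached to a session.

    `category` is one of the fixed types from the slides (participant,
    task, condition, group, location); `variables` is a free-form dict.
    """
    category: str
    variables: dict = field(default_factory=dict)


@dataclass
class Session:
    release: ReleaseLevel
    session_date: Optional[date] = None    # the slides note the date may be unknown
    pilot: bool = False
    excluded_reason: Optional[str] = None  # None means the session was not excluded
    files: list = field(default_factory=list)
    records: list = field(default_factory=list)

    def records_of(self, category):
        """Return all metadata records of one type, e.g. 'participant'."""
        return [r for r in self.records if r.category == category]


# Hypothetical example values, loosely following the slide's sample session.
session = Session(
    release=ReleaseLevel.SHARED,
    session_date=date(2014, 4, 7),
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
    records=[
        Record("participant", {"subject_id": "S01", "gender": "F"}),
        Record("location", {"setting": "lab", "country": "US"}),
    ],
)
```

Keeping `variables` as a plain dictionary is one way to leave individual fields extensible while the set of record types stays fixed.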

96

Datasets and studies (provenance)
- Datasets: raw data collected in sessions
- Studies: generated data, analyses, papers
- Studies include sessions from datasets and "layer" files over them
- Allow re-purposing, re-presenting data
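One way to picture the dataset/study split is sketched below. This is an illustration under assumptions, not the Databrary implementation: `Dataset`, `Study`, `include`, and `files_for` are hypothetical names. The point is that a study refers to sessions that live in a dataset and adds its own analysis files on top, so the raw data are never copied or modified and can be reused by another study.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Raw data: maps a session id to the raw files collected in that session."""
    name: str
    sessions: dict = field(default_factory=dict)


@dataclass
class Study:
    """Generated data: references sessions in datasets and layers files over them."""
    name: str
    included: list = field(default_factory=list)  # (dataset name, session id) pairs
    layers: dict = field(default_factory=dict)    # same key -> analysis files

    def include(self, dataset, session_id, layer_files=()):
        if session_id not in dataset.sessions:
            raise KeyError(f"{session_id} is not in dataset {dataset.name}")
        key = (dataset.name, session_id)
        if key not in self.included:
            self.included.append(key)
        self.layers.setdefault(key, []).extend(layer_files)

    def files_for(self, dataset, session_id):
        """Raw files first, then this study's layered files."""
        key = (dataset.name, session_id)
        return list(dataset.sessions[session_id]) + self.layers.get(key, [])


# Hypothetical example: two studies reuse the same raw session.
raw = Dataset("locomotion", {"s1": ["baby-cam.mp4"], "s2": ["overhead-cam.mp4"]})
paper = Study("walking-paper")
paper.include(raw, "s1", ["analysis.sav"])
reuse = Study("gesture-recode")
reuse.include(raw, "s1", ["coding-spreadsheet.opf"])
```

Because each study keeps its own layer, repurposing a session for a new study leaves both the dataset and the earlier study untouched.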

97

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata
- Analysis files layered on

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

98

Datasets and Studies
- One-to-one mapping likely to be rare
  - Excluded Ss, pilot Ss
  - In interest of transparency and repurposing, should be no disincentive for sharing
- Excerpts only, displays only
  - Studies, no datasets
- Repurposing data
  - Studies and datasets

99

Simple after-the-fact cases
- Easy to upload studies with
  - Publication abstract
  - Excerpts only
  - Stimuli, displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
- More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)
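The "leave the mess as-is" option relies on search that treats related word forms as the same tag. The sketch below is a deliberately naive stand-in: it strips a few English suffixes, which is not how Google or any production search engine works, and the names `naive_stem` and `search_tags` are hypothetical. It only shows how "crawl" could match "crawling" and "crawlers" without anyone cleaning up the tags.

```python
# Suffixes are checked in this order; the list is illustrative, not a real stemmer.
SUFFIXES = ("ers", "er", "ing", "s")


def naive_stem(word):
    """Lowercase a word and strip one common suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search_tags(query, tags):
    """Return every tag whose stem equals the stem of the query."""
    target = naive_stem(query)
    return [t for t in tags if naive_stem(t) == target]


matches = search_tags("crawl", ["crawling", "crawlers", "walking", "Crawl"])
```

A real deployment would more likely use an established stemmer or a search engine's built-in analyzers; the trade-off is the same either way, since looser matching finds more tags at the cost of occasional false matches.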

104

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities?

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities
- Potential for partnering
  - TalkBank/CHILDES
  - Other sharing organizations
    - Research Data Alliance (RDA)
    - Center for Open Science (COS)
    - Stanford Meta-Research Center
  - Societies
    - Report on developmental society meeting

121

Opportunities
- Potential for partnering
  - Journals
    - Store supplemental data
    - Store extended materials for all submissions
    - Which fields?
      - Developmental science only
      - Psychology
      - Neuroscience

122

Opportunities
- Potential for partnering
  - Conferences (both society-affiliated and not)
    - Store, share talk recordings
    - Store supplemental data
  - Sustainability component
    - Member or attendee fees-for-service

123

Opportunities
- Emphasize competitive advantages
  - Domain-focus, visibility
    - Your colleagues are here
  - Policies to share identifiable research data, especially recordings
  - University & Federal support
  - Stability and long-term sustainability
  - Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities
- Integration with external services
  - Cloud storage/data sharing
  - Offer new services via APIs
    - Lab management
    - Project/task management
    - Data visualization
    - Social media, communications
    - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities
- Beyond storing, tagging video
  - Visualization
  - Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Just do it
- Asking to share does not obligate you to share
- Can't share without asking
- We'll store everything so long as participants are asked
- We need as many researchers as possible to incorporate Databrary releases

52

Progress on releases
- 9 labs are currently requesting participant release
- 10 labs have pending IRB protocol amendments
- 17 other labs in process of adding Databrary to IRB

53

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
  - No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

Access levels (diagram)
Tiers: EVERYBODY, REGISTERED USERS, AUTHORIZED USERS, COLLABORATORS, LAB, INTERESTED PARTIES
Labels: Access publicly shared data; Access and share Databrary data; Upload data; Access released data selectively shared; Access and edit all private data

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta: Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
- View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases
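The discovery steps listed above (search, filter, sort, summarize) can be sketched over a plain list of session records. Everything here is an assumption made for illustration: the function name `discover`, the dictionary keys (`description`, `age_months`, `gender`), and the sample values are hypothetical and do not describe the Databrary API.

```python
from collections import Counter


def discover(sessions, keyword=None, filters=None, sort_key=None):
    """Search, filter, and sort a list of session dicts.

    keyword  : case-insensitive substring match on the 'description' field
    filters  : dict mapping a field name to a predicate on that field's value
    sort_key : field name to order results by (assumes every remaining
               session has that field)
    """
    results = sessions
    if keyword:
        needle = keyword.lower()
        results = [s for s in results if needle in s.get("description", "").lower()]
    for name, predicate in (filters or {}).items():
        results = [s for s in results if name in s and predicate(s[name])]
    if sort_key:
        results = sorted(results, key=lambda s: s[sort_key])
    return results


# Hypothetical sample records.
sessions = [
    {"id": 1, "description": "Infant crawling on ramp", "age_months": 11, "gender": "F"},
    {"id": 2, "description": "Toddler walking", "age_months": 14, "gender": "M"},
    {"id": 3, "description": "Crawling over gaps", "age_months": 9, "gender": "M"},
]

hits = discover(
    sessions,
    keyword="crawling",
    filters={"age_months": lambda months: months < 12},
    sort_key="age_months",
)

# A simple aggregate view: counts of the matching sessions by gender.
gender_counts = Counter(s["gender"] for s in hits)
```

The aggregate at the end corresponds to the "counts, distributions" item: the same filtered result set feeds both the browsable list and the summary.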

77


Progress on releases9 labs are currently requesting participant release10 labs have pending IRB protocol amendments17 other labs in process of adding Databrary to IRB

53

Discussion re excerpts Most common (pre-Databrary) use case

Clips for talks teachingDemonstration of specific procedure or methodRelations between behaviors and codes

Why ask for excerpts separatelyEffectively publicNeeds to be clear to participants

54

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

What's on the table for standardizing
- Metadata variables describing sessions, participants, studies, datasets
- Not interfering with the science
- Not standardizing tasks
  - Subgroups can do this (emo-group)
- Not standardizing codes
  - Datavyu is agnostic

101

Discussion: How much to standardize?
- Continuum between enforced structure and flexibility
- Currently, types of metadata (participant, task, location, etc.) are fixed
  - Is this enough? Too much? Are there others?
  - Should we let users define their own?
- Variables within metadata (birthdate, language, description) are user-extensible
  - Should we restrict these? Add more?

102

Benefits and costs of standardization
- More powerful searching
  - Video or session level searches
  - Aggregating across datasets
- But possible change in current practices
  - More work for PIs to document measures

103

Potential solutions for flexibility
- Encourage use of existing tags (pop-up suggestions)
- Our curators clean up after the fact (merge conceptually identical tags)
- Leave the mess as-is
  - Semantic search (google "crawl" finds "crawling", "crawlers", etc.)

104
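The tag clean-up and semantic search ideas on the slide above both rest on treating word variants as one term. A minimal Python sketch of that idea follows, assuming a deliberately naive suffix-stripping rule; `naive_stem` and `group_tags` are hypothetical helpers, and a production system would more likely use an established stemmer or lemmatizer.

```python
from collections import defaultdict

# Suffixes are checked in this order so that "ers" wins over "s".
SUFFIXES = ("ers", "ing", "er", "s")


def naive_stem(word):
    """Lower-case a word and strip one common English suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_tags(tags):
    """Group user-entered tags that share the same naive stem."""
    groups = defaultdict(list)
    for tag in tags:
        groups[naive_stem(tag)].append(tag)
    return dict(groups)


merged = group_tags(["crawl", "Crawling", "crawlers", "walking"])
```

A curator could review each group with more than one member and decide whether to merge it, which keeps a human in the loop for false matches.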

Already doing stuff as you go
- Organizing, typing, writing data anyway after each data collection session
  - How is this done now?
  - How can we make this non-invasive/simpler than current practices?
- Uploading, transcoding, splitting videos anyway
- Exporting to SPSS/Excel/Datavyu anyway

105

Things that change later
- Exclusions, groups, entire study organization can change later
- Needs to be easy to do this in Databrary: drag around sessions, include/exclude, add to study, make new group
- Easy to create, download excerpts
- Last step: add publication citation, done

106

Taking advantage of your Databrary lab
- Organization within labs requires less standardization
  - Each can have their own standards
- Search within your own data
- Mine your data for subsequent analyses
- Easy to disseminate examples
- Need effective 'overview' and 'monitoring' tools

107

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

108

Break

109

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

Strengths
- Solution for sharing identifiable data
- Growing awareness, visibility within content domain
- Flexible data model poised for growth
- University, agency support
- Free, open-source video coding tool

112

Weaknesses
- Data curation hard, slow
- Fixing the past vs. fixing the future
- Sustainability model undetermined
- Small curation, development staff
- Datavyu user base small
  - How to compete/cooperate with commercial entities

113

Your thoughts
- Strengths we neglected or overstated?
- Weaknesses we neglected or overstated?

114

Threats
- Open science space moving quickly
- Attracting venture capital

115

Data-sharing

116

117

118

Threats
- Databrary software not yet released
- How can Databrary compete?
- Why share with Databrary vs. alternatives?
- What features will attract, keep users?
- What features will ensure sustainability, change the culture of developmental science?

119

Your turn
- More threats?
- Different spin on them?

120

Opportunities: Potential for partnering
- TalkBank/CHILDES
- Other sharing organizations
  - Research Data Alliance (RDA)
  - Center for Open Science (COS)
  - Stanford Meta-Research Center
- Societies
  - Report on developmental society meeting

121

Opportunities: Potential for partnering
- Journals
  - Store supplemental data
  - Store extended materials for all submissions
  - Which fields?
    - Developmental science only
    - Psychology
    - Neuroscience

122

Opportunities: Potential for partnering
- Conferences (both society-affiliated and not)
  - Store, share talk recordings
  - Store supplemental data
- Sustainability component
  - Member or attendee fees-for-service

123

Opportunities: Emphasize competitive advantages
- Domain-focus, visibility
  - Your colleagues are here
- Policies to share identifiable research data, especially recordings
- University & Federal support
- Stability and long-term sustainability
- Sharing is our business model

124

Your thoughts
- Partnerships
- Journals
- Conferences
- Sharpen our message
- Other ideas

125

Opportunities: Integration with external services
- Cloud storage/data sharing
- Offer new services via APIs
  - Lab management
  - Project/task management
  - Data visualization
  - Social media, communications
  - Electronic CV, profile management

126

Your thoughts
- Do you use any of these services?
- Is providing data sharing services enough?
- Does integration with other tools help or hinder our goals?

127

Opportunities: Beyond storing, tagging video
- Visualization
- Additional analysis streams

128

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

129

Thank you

130

Discussion re: excerpts
- Most common (pre-Databrary) use case
  - Clips for talks, teaching
  - Demonstration of specific procedure or method
  - Relations between behaviors and codes
- Why ask for excerpts separately?
  - Effectively public
  - Needs to be clear to participants

54

Databrary and excerpts
- Should Databrary make excerpts publicly available on the site?
- Who decides which excerpts are public on Databrary?
  - Only the data owner, or any PI on Databrary?

55

Authorization
- Access to shared videos
- Contribute and/or use data
  - No requirement to contribute
- Institutional sign-off by Authorizing Official
  - Grants and contracts
- No IRB needed

56

Access
- PIs can authorize and manage affiliates
- Can also add lab members and collaborators
  - Regardless of authorization
- Four pre-selected options
  - Lab and collaborators only
  - Databrary data only
  - Lab data and Databrary data
  - Proxy

57

[Figure: access levels. User groups shown: Everybody, Registered users, Authorized users, Collaborators, Lab, Interested parties. Permissions shown: access publicly shared data; access and share Databrary data; upload data; access released data selectively shared; access and edit all private data.]

58

Terms of the agreement
- Researcher upholds same ethical standards as with their own data
- Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
- Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
- 7 schools signed
  - NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
- Points of clarification
  - Does not go to the IRB office for signing
  - Further IRB approval not required for authorization
  - Only authorizing official can bind institution or enter agreement

60

Feedback
- Agreement written with NYU and PSU
- Piloted agreement on advisory board
- We want as much feedback as possible
  - Where are researchers getting hung up?
  - Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
- Shift to institutional agreements
  - Once schools are comfortable with Investigator Agreement
  - No substantive changes to re-frame as an institutional agreement
- Allows universities to manage their authorizations on Databrary

63

Beta release
- Sending email tomorrow (4/8) with private beta link
  - Register and request authorization
  - Explore the site and tell us what you think
  - We will authorize you temporarily through the beta if needed
  - Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
- Community outreach
  - Invites to ICIS, CDS, SRCD members
  - Conference exhibits at major academic society meetings
  - Workshops on sharing identifiable data
- Timesavers as incentives
  - Developing 'boilerplate' Databrary language for grant proposals and reports
  - Data management plans (partner with DMPtool)
  - Resource sharing plans

65

What we need from advisory board
- Advocate for data sharing and Databrary
- Lead by example
- Identify and share already-collected data
- Request participants to share their data
- Become authorized Databrary Investigators
- Explore the beta site and give us feedback
- Provide insights for designing upload and search

66

Break

67

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

68

Datavyu: current and future
- Released stable Datavyu 1.1 desktop version and user guide
- Addressing remaining bugs as they trickle in, but otherwise complete
- Next steps (to begin 2015+): Bring Datavyu into Databrary web framework
  - Target simple coding tasks on-line
  - Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
- Lots of videos
  - Piled up into studies
  - Minimal organization
- "Static" views
  - Single presentation option
  - Limited flexibility

71

Development progress
- Revised technical requirements
  - Flexible discovery, browsing options
  - Responsive, intuitive editing, uploading
  - High-level visualizations, summaries
  - API, scripting interface
- Built new, completely dynamic JS interface
  - Front-end uses API

72

Uploading after the fact
- Learned a lot from manually curating and organizing existing and new datasets
- Broadened types of possible standard metadata
- Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
- Always exceptions, irregularities in real data
  - Missing, multiple videos
  - Don't fit cleanly in defined groups
  - Excluded and later re-purposed data
  - Always something non-rectangular
- Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
- Better understanding of constraints, required flexibility
- Need responsive, iterative, "data-driven" approach
  - Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
- Improve discovery within and across datasets
  - Searching: by keyword, text descriptions, names, etc.
  - Filtering: limiting results by age, gender, numeric values, etc.
  - Sorting: order to present results for scanning visually
  - View summarized or aggregated data (counts, distributions, visualizations, etc.)
- Many existing user stories and use cases

77
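The three discovery operations on the slide above (searching, filtering, sorting) can be sketched as one small function over session records. This is a minimal Python illustration, not Databrary's search implementation: `find_sessions`, the record keys, and the sample records are all invented for the example.

```python
def find_sessions(sessions, keyword=None, gender=None,
                  min_age=None, max_age=None, sort_by="age_months"):
    """Return sessions matching every given criterion, sorted by one field."""
    results = []
    for session in sessions:
        # Searching: case-insensitive keyword match on the description.
        if keyword and keyword.lower() not in session["description"].lower():
            continue
        # Filtering: exact match on gender, inclusive range on age.
        if gender and session["gender"] != gender:
            continue
        if min_age is not None and session["age_months"] < min_age:
            continue
        if max_age is not None and session["age_months"] > max_age:
            continue
        results.append(session)
    # Sorting: order the matches for visual scanning.
    return sorted(results, key=lambda session: session[sort_by])


# Made-up sample records, for illustration only.
SAMPLE = [
    {"id": 1, "description": "Infant crawling over slopes", "gender": "F", "age_months": 12},
    {"id": 2, "description": "Toddler walking in lab", "gender": "M", "age_months": 18},
    {"id": 3, "description": "Crawling at home", "gender": "F", "age_months": 9},
]

crawling_ids = [s["id"] for s in find_sessions(SAMPLE, keyword="crawl")]
```

Criteria left as `None` are simply skipped, so the same call covers plain keyword search as well as combined filters.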

User stories: Teaching and talks
- Video clips for teaching
- Illustrate an idea
- Show the range of behaviors and exceptions
- Show an excerpt in a talk

78

User stories: Pre-research
- Browse the work in my field
- Decide whether a study is worth doing
- Preliminary data for grant proposal
- Ideas and inspiration
- Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
- Repurpose videos for new uses
- Replicate existing work by recoding videos
- Grow sample size
- Include participants from other contexts and populations
- Conduct integrative analyses
- Complete grant progress report

80

Next steps
- Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
- New focus: Upload as you go
- Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
- Describe study design (conditions, groups, etc.)
- Enter (type/paste/import) session metadata
- Upload new video, associate it with session
- Keep track of which data were entered
- Export previously entered data for analysis
- Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
- Focus group including developers, UI/UX experts, researchers
- Refine user stories
- Identify requirements and priorities
- Iterative process: test and improve interface

83

Inferring structure as you go
- Initial questions about study design are useful
- But things are always changing midcourse
- Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
- 1:00-1:15 Welcome and recent accomplishments
- 1:15-1:45 Roadmap
- 1:45-2:15 Policies
- 2:15-2:30 Break
- 2:30-3:30 Beta: Upload and search
- 3:30-4:00 Researcher asks and support
- 4:00-4:10 Break
- 4:10-5:00 Positioning Databrary for the future
- 5:00-5:30 Wrap up and discussion

85

Session is the basic unit
- Covered by a release level
- Data collected at same time
- Contains timeline of raw data files
- Annotated with metadata

Session: April 7, 2014
- intake.mp4
- baby-cam.mp4
- coding-spreadsheet.opf
- overhead-cam.mp4
- analysis.sav
- survey.doc

86

Session metadata
- Release level
- Date
- Pilot
- Excluded
- Participants
- Conditions
- Tasks
- Groups
- Location

87

Session release level
- Not asked
- Private
- Shared
- Shared + excerpts
- Public

88
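The five release levels on the slide above read as an ordered scale, which suggests an ordered enumeration. The Python sketch below assumes the slide order runs from most to least restrictive and treats "Not asked" as the most restrictive case; both are assumptions, and `ReleaseLevel` and `allows_excerpts` are hypothetical names rather than part of Databrary's software.

```python
from enum import IntEnum


class ReleaseLevel(IntEnum):
    """Session release levels, ordered from most to least restrictive."""
    NOT_ASKED = 0  # assumption: no permission recorded, so treated like private
    PRIVATE = 1
    SHARED = 2
    SHARED_WITH_EXCERPTS = 3
    PUBLIC = 4


def allows_excerpts(level):
    """True if excerpts from the session may be shown under this level."""
    return level >= ReleaseLevel.SHARED_WITH_EXCERPTS


def most_restrictive(levels):
    """A group of sessions is only as open as its most restricted member."""
    return min(levels)
```

Using `IntEnum` makes the levels directly comparable, so a check such as `level >= ReleaseLevel.SHARED` needs no lookup table.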

Session date
- Technically considered identifiable information
- Could be unknown (Gesell)

89

Pilot

Indicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turnMore threatsDifferent spin on them

120

OpportunitiesPotential for partnering

TalkBankCHILDESOther sharing organizations

- Research Data Alliance (RDA)- Center for Open Science (COS)- Stanford Meta-Research Center

Societies- Report on developmental society meeting

121

OpportunitiesPotential for partnering

Journals- Store supplemental data- Store extended materials for all submissions- Which fields

bull Developmental science onlybull Psychologybull Neuroscience

122

OpportunitiesPotential for partnering

Conferences (both society-affiliated and not)- Store share talk recordings- Store supplemental data

Sustainability component- Member or attendee fees-for-service

123

OpportunitiesEmphasize competitive advantages

Domain-focus visibility- Your colleagues are here

Policies to share identifiable research data especially recordingsUniversity amp Federal supportStability and long-term sustainabilitySharing is our business model

124

Your thoughtsPartnershipsJournalsConferencesSharpen our messageOther ideas

125

OpportunitiesIntegration with external services

Cloud storagedata sharingOffer new services via APIs

Lab managementProjecttask managementData visualizationSocial media communicationsElectronic CV profile management

126

Your thoughtsDo you use any of these servicesIs providing data sharing services enoughDoes integration with other tools help or hinder our goals

127

OpportunitiesBeyond storing tagging video

VisualizationAdditional analysis streams

128

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

129

Thank you

130

Databrary and excerpts Should Databrary make excerpts publicly available on the siteWho decides which excerpts are public on Databrary

Only the data owner or any PI on Databrary

55

AuthorizationAccess to shared videos Contribute andor use data

No requirement to contributeInstitutional sign-off by Authorizing Official

Grants and contractsNo IRB needed

56

AccessPIs can authorize and manage affiliatesCan also add lab members and collaborators

Regardless of authorizationFour pre-selected options

Lab and collaborators onlyDatabrary data onlyLab data and Databrary dataProxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

CO

LLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreementResearcher upholds same ethical standards as with their own dataResponsibility of researcher to get whatever approval is necessary when use constitutes human subjects researchInstitution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization7 schools signed

NYU PSU Rochester Indiana McMaster UVA RutgersPoints of clarification

Does not go to the IRB office for signingFurther IRB approval not required for authorizationOnly authorizing official can bind institution or enter agreement

60

FeedbackAgreement written with NYU and PSUPiloted agreement on advisory boardWe want as much feedback as possible

Where are researchers getting hung upWhere are institutions getting hung up

61

Registration demo

62

Next step for authorization policyShift to institutional agreements

Once schools are comfortable with Investigator AgreementNo substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta releaseSending email tomorrow (48) with private beta link

Register and request authorizationExplore the site and tell us what you thinkWe will authorize you temporarily through the beta if neededPlease donrsquot download Tajik videos

64

Outreach plan for Databrary 10Community outreach

Invites to ICIS CDS SRCD membersConference exhibits at major academic society meetingsWorkshops on sharing identifiable data

Timesavers as incentivesDeveloping lsquoboilerplatersquo Databrary language for grant proposals and reportsData management plans (partner with DMPtool)Resource sharing plans

65

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

66

Break

67

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

68

Datavyu current and futureReleased stable Datavyu 11 desktop version and user guideAddressing remaining bugs as they trickle in but otherwise completeNext steps (to begin 2015+) Bring Datavyu into Databrary web framework

Target simple coding tasks on-lineLater possibility for off-line version

69

Databrary site

70

October 2013 board meetingLots of videos

Piled up into studiesMinimal organization

ldquoStaticrdquo viewsSingle presentation optionLimited flexibility

71

Development progressRevised technical requirements

Flexible discovery browsing options Responsive intuitive editing uploadingHigh-level visualizations summariesAPI scripting interface

Built new completely dynamic JS interfaceFront-end uses API

72

Uploading after the factLearned a lot from manually curating and organizing existing and new datasetsBroadened types of possible standard metadataDeveloped better ways to group and relate videos metadata studies datasets

73

Lessons from the dataAlways exceptions irregularities in real data

Missing multiple videosDonrsquot fit cleanly in defined groupsExcluded and later re-purposed dataAlways something non-rectangular

Different types of data (classroom public longitudinal) have different needs

74

Updated modelBetter understanding of constraints required flexibilityNeed responsive iterative ldquodata-drivenrdquo approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next stepsImprove discovery within and across datasets

Searching by keyword text descriptions names etcFiltering limiting results by age gender numeric values etcSorting order to present results for scanning visually

View summarized or aggregated data (counts distributions visualizations etc)Many existing user stories and use cases

77

User stories Teaching and talks Video clips for teachingIllustrate an ideaShow the range of behaviors and exceptionsShow an excerpt in a talk

78

User stories Pre-researchBrowse the work in my fieldDecide whether a study is worth doingPreliminary data for grant proposalIdeas and inspirationReplicate expand on or review previous work based on the procedure or coding manual

79

User stories ResearchRepurpose videos for new usesReplicate existing work by recoding videosGrow sample sizeInclude participants from other contexts and populationsConduct integrative analysesComplete grant progress report

80

Next stepsPrimary focus was on browsing searching using (downloading coding commenting tagging)New focus Upload as you goBuild browser interface that also makes sense for entering and editing data

81

User stories Upload as you goDescribe study design (conditions groups etc)Enter (typepasteimport) session metadataUpload new video associate it with sessionKeep track of which data were enteredExport previously entered data for analysisCustomize presentation of dataset (title excerpts display)

82

Design process for upload as you goFocus group including developers UIUX experts researchersRefine user storiesIdentify requirements and prioritiesIterative process test and improve interface

83

Inferring structure as you goInitial questions about study design are usefulBut things are always changing midcourse Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

85

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadata

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

86

Session metadataRelease levelDate

PilotExcludedParticipantsConditionsTasksGroupsLocation

87

Session release levelNot askedPrivateSharedShared + excerptsPublic

88

Session dateTechnically considered identifiable informationCould be unknown (Gesell)

89

PilotIndicates that the methods used in the session were not finalized or standard

90

Excluded

ReasonDid not meet inclusion criteriaProceduralexperimenter errorWithdrewfussytiredOutlier

Indicates that session was not usable for the target study

91

Participant

Subject IDBirthdateGenderRace (NIH)Ethnicity (NIH)

Disabilitytypically devLanguageCountry (of origin)SES

Represents an individual depicted or represented in this session

92

Task

NameSummaryDescription

Represents a particular task activity or phase that occurred during all or a portion of the session

93

Condition

NameSummaryDescriptionOther variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

NameSummaryDescriptionOther variables

Represents a grouping of sessions determined by an aspect of the data (participant ability age experience measurements usedavailable)

95

Location

Settinglab home classroom outdoors clinic museum

Language (of study materials)StateCountry

Represents a particular setting or other variable aspect of wherewhenhow session data were collected

96

Datasets and studies (provenance)Datasets raw data collected in sessionsStudies generated data analyses papersStudies include sessions from datasets and ldquolayerrdquo files over themAllow re-purposing re-presenting data

97

Session is the basic unitCovered by a release levelData collected at same timeContains timeline of raw data filesAnnotated with metadataAnalysis files layered on

Session April 7 2014

intakemp4

baby-cammp4

coding-spreadsheetopf

overhead-cammp4

analysissav

surveydoc

98

Datasets and StudiesOne to one mapping likely to be rare

Excluded Ss pilot SsIn interest of transparency and repurposing should be no disincentive for sharing

Excerpts only displays onlyStudies no datasets

Repurposing dataStudies and datasets

99

Simple after-the-fact casesEasy to upload studies with

Publication abstractExcerpts onlyStimuli displays

100

Whatrsquos on the table for standardizingMetadata variables describing sessions participants studies datasetsNot interfering with the scienceNot standardizing tasks

Subgroups can do this (emo-group)Not standardizing codes

Datavyu is agnostic

101

Discussion How much to standardizeContinuum between enforced structure and flexibilityCurrently types of metadata (participant task location etc) are fixed

Is this enough Too much Are there othersShould we let users define their own

Variables within metadata (birthdate language description) are user-extensible

Should we restrict these Add more

102

Benefits and costs of standardizationMore powerful searching

Video or session level searchesAggregating across datasets

But possible change in current practicesMore work for PIs to document measures

103

Potential solutions for flexibilityEncourage use of existing tags (pop-up suggestions)Our curators clean up after the fact (merge conceptually identical tags)Leave the mess as-is

Semantic search (google ldquocrawlrdquo finds ldquocrawlingrdquo ldquocrawlersrdquo etc)

104

Already doing stuff as you goOrganizing typing writing data anyway after each data collection session

How is this done nowHow can we make this non-invasivesimpler than current practices

Uploading transcoding splitting videos anywayExporting to SPSSExcelDatavyu anyway

105

Things that change laterExclusions groups entire study organization can change laterNeeds to be easy to do this in Databrary drag around sessions includeexclude add to study make new groupEasy to create download excerptsLast step add publication citation done

106

Taking advantage of your Databrary labOrganization within labs requires less standardization

Each can have their own standardsSearch within your own dataMine your data for subsequent analysesEasy to disseminate examplesNeed effective lsquooverviewrsquo and lsquomonitoringrsquo tools

107

What we need from advisory boardAdvocate for data sharing and DatabraryLead by exampleIdentify and share already-collected data Request participants to share their dataBecome authorized Databrary InvestigatorsExplore the beta site and give us feedbackProvide insights for designing upload and search

108

Break

109

Meeting Agenda100-115 Welcome and recent accomplishments115-145 Roadmap145-215 Policies215-230 Break230-330 Beta Upload and search330-400 Researcher asks and support400-410 Break410-500 Positioning Databrary for the future500-530 Wrap up and discussion

110

Quick SWOT analysis

Strengths Weaknesses

Opportunities Threats

111

StrengthsSolution for sharing identifiable dataGrowing awareness visibility within content domainFlexible data model poised for growthUniversity agency supportFree open-source video coding tool

112

WeaknessesData curation hard slowFixing the past vs fixing the futureSustainability model undeterminedSmall curation development staffDatavyu user base small

How to competecooperate with commercial entities

113

Your thoughtsStrengths we neglected or overstatedWeaknesses we neglected or overstated

114

ThreatsOpen science space moving quicklyAttracting venture capital

115

Data-sharing

116

117

118

ThreatsDatabrary software not yet releasedHow can Databrary competeWhy share with Databrary vs alternativesWhat features will attract keep users What features will ensure sustainability change the culture of developmental science

119

Your turn
More threats?
Different spin on them?

120

Opportunities
Potential for partnering

TalkBank/CHILDES
Other sharing organizations

- Research Data Alliance (RDA)
- Center for Open Science (COS)
- Stanford Meta-Research Center

Societies
- Report on developmental society meeting

121

Opportunities
Potential for partnering

Journals
- Store supplemental data
- Store extended materials for all submissions
- Which fields?

• Developmental science only
• Psychology
• Neuroscience

122

Opportunities
Potential for partnering

Conferences (both society-affiliated and not)
- Store, share talk recordings
- Store supplemental data

Sustainability component
- Member or attendee fees-for-service

123

Opportunities
Emphasize competitive advantages

Domain-focus, visibility
- Your colleagues are here

Policies to share identifiable research data, especially recordings
University & Federal support
Stability and long-term sustainability
Sharing is our business model

124

Your thoughts
Partnerships
Journals
Conferences
Sharpen our message
Other ideas

125

Opportunities
Integration with external services

Cloud storage/data sharing
Offer new services via APIs

Lab management
Project/task management
Data visualization
Social media, communications
Electronic CV, profile management

126

Your thoughts
Do you use any of these services?
Is providing data sharing services enough?
Does integration with other tools help or hinder our goals?

127

Opportunities
Beyond storing, tagging video

Visualization
Additional analysis streams

128

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

129

Thank you

130

Authorization
Access to shared videos
Contribute and/or use data

No requirement to contribute
Institutional sign-off by Authorizing Official

Grants and contracts
No IRB needed

56

Access
PIs can authorize and manage affiliates
Can also add lab members and collaborators

Regardless of authorization
Four pre-selected options

Lab and collaborators only
Databrary data only
Lab data and Databrary data
Proxy

57

EVERYBODY

REGISTERED USERS

AUTHORIZED USERS

COLLABORATORS LAB

INTERESTED PARTIES

Access publicly shared data

Access and share Databrary data

Upload data

Access released data selectively shared

Access and edit all private data

58

Terms of the agreement
Researcher upholds same ethical standards as with their own data
Responsibility of researcher to get whatever approval is necessary when use constitutes human subjects research
Institution must verify that Investigator is eligible to be a Databrary Principal Investigator

59

Progress on authorization
7 schools signed

NYU, PSU, Rochester, Indiana, McMaster, UVA, Rutgers
Points of clarification

Does not go to the IRB office for signing
Further IRB approval not required for authorization
Only authorizing official can bind institution or enter agreement

60

Feedback
Agreement written with NYU and PSU
Piloted agreement on advisory board
We want as much feedback as possible

Where are researchers getting hung up?
Where are institutions getting hung up?

61

Registration demo

62

Next step for authorization policy
Shift to institutional agreements

Once schools are comfortable with Investigator Agreement
No substantive changes to re-frame as an institutional agreement

Allows universities to manage their authorizations on Databrary

63

Beta release
Sending email tomorrow (4/8) with private beta link

Register and request authorization
Explore the site and tell us what you think
We will authorize you temporarily through the beta if needed
Please don't download Tajik videos

64

Outreach plan for Databrary 1.0
Community outreach

Invites to ICIS, CDS, SRCD members
Conference exhibits at major academic society meetings
Workshops on sharing identifiable data

Timesavers as incentives
Developing 'boilerplate' Databrary language for grant proposals and reports
Data management plans (partner with DMPtool)
Resource sharing plans

65

What we need from advisory board
Advocate for data sharing and Databrary
Lead by example
Identify and share already-collected data
Request participants to share their data
Become authorized Databrary Investigators
Explore the beta site and give us feedback
Provide insights for designing upload and search

66

Break

67

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

68

Datavyu current and future
Released stable Datavyu 1.1 desktop version and user guide
Addressing remaining bugs as they trickle in, but otherwise complete
Next steps (to begin 2015+): bring Datavyu into Databrary web framework

Target simple coding tasks on-line
Later possibility for off-line version

69

Databrary site

70

October 2013 board meeting
Lots of videos

Piled up into studies
Minimal organization

"Static" views
Single presentation option
Limited flexibility

71

Development progress
Revised technical requirements

Flexible discovery, browsing options
Responsive, intuitive editing, uploading
High-level visualizations, summaries
API, scripting interface

Built new, completely dynamic JS interface
Front-end uses API

72

Uploading after the fact
Learned a lot from manually curating and organizing existing and new datasets
Broadened types of possible standard metadata
Developed better ways to group and relate videos, metadata, studies, datasets

73

Lessons from the data
Always exceptions, irregularities in real data

Missing, multiple videos
Don't fit cleanly in defined groups
Excluded and later re-purposed data
Always something non-rectangular

Different types of data (classroom, public, longitudinal) have different needs

74

Updated model
Better understanding of constraints, required flexibility
Need responsive, iterative, "data-driven" approach

Real data are always more nuanced

75

Walkthrough of live site

76

Next steps
Improve discovery within and across datasets

Searching by keyword: text descriptions, names, etc.
Filtering: limiting results by age, gender, numeric values, etc.
Sorting: order to present results for scanning visually

View summarized or aggregated data (counts, distributions, visualizations, etc.)
Many existing user stories and use cases

77
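The discovery steps on the slide above (keyword search, filtering, sorting, aggregated counts) can be illustrated with a minimal sketch. This is not Databrary's implementation or API; the `SessionRecord` fields, the `discover` function, and the sample records are hypothetical names invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    # Hypothetical record; field names are illustrative, not Databrary's schema.
    name: str
    description: str
    age_months: float
    gender: str


def discover(records, keyword=None, min_age=None, max_age=None,
             gender=None, sort_key="age_months"):
    """Keyword search over name and description, then filter, then sort."""
    results = []
    for r in records:
        text = f"{r.name} {r.description}".lower()
        if keyword is not None and keyword.lower() not in text:
            continue
        if min_age is not None and r.age_months < min_age:
            continue
        if max_age is not None and r.age_months > max_age:
            continue
        if gender is not None and r.gender != gender:
            continue
        results.append(r)
    # Sort so results can be scanned visually in a predictable order.
    return sorted(results, key=lambda r: getattr(r, sort_key))


# Made-up sample data for the sketch.
records = [
    SessionRecord("S01", "infant crawling down slopes", 12.0, "F"),
    SessionRecord("S02", "toddler walking over bridges", 18.5, "M"),
    SessionRecord("S03", "infant crawling over gaps", 9.5, "M"),
]

hits = discover(records, keyword="crawling", max_age=13)
# Aggregated view: counts per gender across all records.
gender_counts = Counter(r.gender for r in records)
```

Plain substring matching is used here only to keep the sketch short; a real search would likely need an index and a richer query language.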

User stories: Teaching and talks
Video clips for teaching
Illustrate an idea
Show the range of behaviors and exceptions
Show an excerpt in a talk

78

User stories: Pre-research
Browse the work in my field
Decide whether a study is worth doing
Preliminary data for grant proposal
Ideas and inspiration
Replicate, expand on, or review previous work based on the procedure or coding manual

79

User stories: Research
Repurpose videos for new uses
Replicate existing work by recoding videos
Grow sample size
Include participants from other contexts and populations
Conduct integrative analyses
Complete grant progress report

80

Next steps
Primary focus was on browsing, searching, using (downloading, coding, commenting, tagging)
New focus: Upload as you go
Build browser interface that also makes sense for entering and editing data

81

User stories: Upload as you go
Describe study design (conditions, groups, etc.)
Enter (type/paste/import) session metadata
Upload new video, associate it with session
Keep track of which data were entered
Export previously entered data for analysis
Customize presentation of dataset (title, excerpts, display)

82

Design process for upload as you go
Focus group including developers, UI/UX experts, researchers
Refine user stories
Identify requirements and priorities
Iterative process: test and improve interface

83

Inferring structure as you go
Initial questions about study design are useful
But things are always changing midcourse
Allow users to iterate easily by finding smart ways to expand structure as they go

84

Meeting Agenda
1:00-1:15 Welcome and recent accomplishments
1:15-1:45 Roadmap
1:45-2:15 Policies
2:15-2:30 Break
2:30-3:30 Beta Upload and search
3:30-4:00 Researcher asks and support
4:00-4:10 Break
4:10-5:00 Positioning Databrary for the future
5:00-5:30 Wrap up and discussion

85

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata

Session April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

86

Session metadata
Release level
Date

Pilot
Excluded
Participants
Conditions
Tasks
Groups
Location

87

Session release level
Not asked
Private
Shared
Shared + excerpts
Public

88

Session date
Technically considered identifiable information
Could be unknown (Gesell)

89

Pilot
Indicates that the methods used in the session were not finalized or standard

90

Excluded

Reason
Did not meet inclusion criteria
Procedural/experimenter error
Withdrew/fussy/tired
Outlier

Indicates that session was not usable for the target study

91

Participant

Subject ID
Birthdate
Gender
Race (NIH)
Ethnicity (NIH)

Disability/typically dev.
Language
Country (of origin)
SES

Represents an individual depicted or represented in this session

92

Task

Name
Summary
Description

Represents a particular task, activity, or phase that occurred during all or a portion of the session

93

Condition

Name
Summary
Description
Other variables

Represents an experimenter-determined manipulation (within or between sessions)

94

Group

Name
Summary
Description
Other variables

Represents a grouping of sessions determined by an aspect of the data (participant ability, age, experience, measurements used/available)

95

Location

Setting: lab, home, classroom, outdoors, clinic, museum

Language (of study materials)
State
Country

Represents a particular setting or other variable aspect of where/when/how session data were collected

96
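The session model described on the preceding slides (a session covered by a release level, holding raw files and annotated with participant, task, condition, group, and location metadata) could be sketched as a set of plain data classes. This is only an illustration built from the slide text, not Databrary's actual schema; the class names, the single `Record` type with a `kind` field, and the example values such as "free play" and "P001" are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class ReleaseLevel(Enum):
    # The five levels listed on the "Session release level" slide.
    NOT_ASKED = "Not asked"
    PRIVATE = "Private"
    SHARED = "Shared"
    SHARED_EXCERPTS = "Shared + excerpts"
    PUBLIC = "Public"


@dataclass
class Participant:
    subject_id: str
    birthdate: Optional[date] = None
    gender: Optional[str] = None
    language: Optional[str] = None


@dataclass
class Record:
    """A task, condition, group, or location (hypothetical unified type)."""
    kind: str
    name: str
    summary: str = ""
    description: str = ""


@dataclass
class Session:
    release: ReleaseLevel
    session_date: Optional[date] = None    # may be unknown
    pilot: bool = False                    # methods not finalized or standard
    excluded_reason: Optional[str] = None  # e.g. "Outlier"
    participants: List[Participant] = field(default_factory=list)
    records: List[Record] = field(default_factory=list)
    files: List[str] = field(default_factory=list)

    @property
    def excluded(self) -> bool:
        # A session counts as excluded once a reason has been recorded.
        return self.excluded_reason is not None


# Illustrative values only.
session = Session(
    release=ReleaseLevel.SHARED,
    session_date=date(2014, 4, 7),
    participants=[Participant("P001", gender="F")],
    records=[Record("location", "lab"), Record("task", "free play")],
    files=["intake.mp4", "baby-cam.mp4", "coding-spreadsheet.opf"],
)
```

Keeping the date and exclusion reason optional reflects the slides' points that a session date may be unknown and that exclusion carries a reason rather than a bare flag.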

Datasets and studies (provenance)
Datasets: raw data collected in sessions
Studies: generated data, analyses, papers
Studies include sessions from datasets and "layer" files over them
Allow re-purposing, re-presenting data

97

Session is the basic unit
Covered by a release level
Data collected at same time
Contains timeline of raw data files
Annotated with metadata
Analysis files layered on

Session April 7, 2014

intake.mp4

baby-cam.mp4

coding-spreadsheet.opf

overhead-cam.mp4

analysis.sav

survey.doc

98

Datasets and Studies
One-to-one mapping likely to be rare

Excluded Ss, pilot Ss
In interest of transparency and repurposing, should be no disincentive for sharing

Excerpts only, displays only
Studies, no datasets

Repurposing data
Studies and datasets

99

Simple after-the-fact cases
Easy to upload studies with

Publication abstract
Excerpts only
Stimuli displays

100

What's on the table for standardizing
Metadata variables describing sessions, participants, studies, datasets
Not interfering with the science
Not standardizing tasks

Subgroups can do this (emo-group)
Not standardizing codes

Datavyu is agnostic

101

Discussion: How much to standardize
Continuum between enforced structure and flexibility
Currently types of metadata (participant, task, location, etc.) are fixed

Is this enough? Too much? Are there others?
Should we let users define their own?

Variables within metadata (birthdate, language, description) are user-extensible

Should we restrict these? Add more?

102

Benefits and costs of standardization
More powerful searching

Video or session level searches
Aggregating across datasets

But possible change in current practices
More work for PIs to document measures

103

Potential solutions for flexibility
Encourage use of existing tags (pop-up suggestions)
Our curators clean up after the fact (merge conceptually identical tags)
Leave the mess as-is

Semantic search (Google "crawl" finds "crawling", "crawlers", etc.)

104
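Two of the flexibility options above, merging conceptually identical tags and a search for "crawl" that also finds "crawling" and "crawlers", can be illustrated with a deliberately naive suffix-stripping sketch. This is an assumption-laden toy, not how Databrary or Google implements search; the `naive_stem`, `search_tags`, and `merge_tags` helpers and the suffix list are hypothetical, and a production system would more plausibly use an established stemmer or a curated synonym list.

```python
# Hypothetical suffix list; longer suffixes are checked first.
SUFFIXES = ("ers", "ing", "er", "s")


def naive_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search_tags(query, tags):
    """Return every tag whose stem matches the stem of the query."""
    target = naive_stem(query)
    return [t for t in tags if naive_stem(t) == target]


def merge_tags(tags):
    """Group tags by stem, as a curator might when cleaning up afterwards."""
    merged = {}
    for t in tags:
        merged.setdefault(naive_stem(t), []).append(t)
    return merged


tags = ["crawling", "crawlers", "crawl", "walking"]
matches = search_tags("crawl", tags)
groups = merge_tags(tags)
```

The same stem-matching idea could drive pop-up suggestions while a user types a tag, which is the first option on the slide.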

Already doing stuff as you go
Organizing, typing, writing data anyway after each data collection session

How is this done now?
How can we make this non-invasive/simpler than current practices?

Uploading, transcoding, splitting videos anyway
Exporting to SPSS/Excel/Datavyu anyway

105

