TAIR: Bringing together data for the global plant biology community
TAIR funding and database sustainability Eva Huala...
TAIR funding and database sustainability
Eva Huala, Phoenix Bioinformatics

What is TAIR?
• Genome database for the model plant Arabidopsis thaliana, established in 1999
• Focus on manual curation of gene function (~33,000 genes)
• Was NSF-funded 1999-2013 (1.1M/yr direct costs)
• International usage (~53K unique visitors/month)
March 14, 2015

Example data sources
• PubMed – published articles about Arabidopsis
• Individual researchers – functional annotations
• UniProt – functional annotations
• NCBI – sequence data
• Araport – new genome release
• PANTHER – gene families

TAIR Usage in 2016
• 2.2 million sessions
• 14 million page views
• 53,800 unique visitors/month

Typical TAIR use cases
• Plant biology researcher needs information on an unfamiliar gene
  – If Arabidopsis gene, search may use symbol or locus ID
  – If non-Arabidopsis gene, user locates closest Arabidopsis homologs first via BLAST search
• Seed and DNA stock ordering

Top TAIR Pages - 2016

Labor and cost-intensive activities
• Extraction of gene function and other data from research articles
• Software maintenance (bug fixes, upgrades)
• Development of new interfaces or features
• Integration of new datasets (e.g. Araport11)
• Help desk support for researchers

TAIR Curation Statistics

| | 2014 | 2015 |
|---|---|---|
| Arabidopsis articles added | 4491 | 4107 |
| Subset with curatable gene information | 2094 (47% of total) | 2112 (51% of total) |
| Number of validated gene-article matches | 6972 | 10,144 |
| Number of new gene symbols added | 676 | 596 |

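
The percentages in the curation statistics can be re-derived from the raw counts. The sketch below is only an arithmetic check on the numbers given in the slide. The dictionary layout and function names are illustrative assumptions, not part of TAIR's tooling.

```python
# Figures copied from the TAIR curation statistics slide.
# The structure and names below are illustrative assumptions.
stats = {
    2014: {"articles": 4491, "curatable": 2094, "matches": 6972, "symbols": 676},
    2015: {"articles": 4107, "curatable": 2112, "matches": 10144, "symbols": 596},
}


def curatable_share(year: int) -> float:
    """Percentage of added articles that had curatable gene information."""
    row = stats[year]
    return 100 * row["curatable"] / row["articles"]


def percent_change(key: str, start: int = 2014, end: int = 2015) -> float:
    """Percentage change in one statistic between two years."""
    return 100 * (stats[end][key] - stats[start][key]) / stats[start][key]


if __name__ == "__main__":
    for year in stats:
        print(year, f"curatable share: {curatable_share(year):.0f}%")
    print(f"change in validated matches: {percent_change('matches'):+.1f}%")
```

Rounded to whole percentages, the shares agree with the 47% and 51% reported on the slide.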
Efforts to reduce curation costs
• Can we get authors to do more curation of their own work?
  – Developed TOAST community curation tool and partnerships with journals
• Can we decrease costs and increase community participation by recruiting (and paying) postdocs as external curators?
  – Experiment using 4 postdocs at UCB/PGEC
• Can we do more to automate the curation process?
  – Adoption of Textpresso text mining methods for cellular component curation
  – Starting collaboration with Stanford DeepDive group

Reining in software development and maintenance costs
• Best practices to avoid introducing new bugs
  – Continuous deployment, continuous integration, unit tests
• Agile methods and user-driven development process for increased efficiency and better outcomes

Subscription funding model
• Advantages
  – Stable revenue stream distributed over many institutions and countries
  – Those that value the resource asked to support it
  – Financial incentives are well aligned with our goal to make the data usable and discoverable
  – Support naturally scales up as usage and value to researchers grows
• Disadvantages
  – Potential for researchers, students and others to lose access
  – Barriers to data sharing and reuse

Nonprofit approach can mitigate disadvantages
Problem:
• Loss of access
  – Researchers in the relevant discipline who can't afford to pay
  – Researchers from other disciplines, public (infrequent users)
  – Students
Solutions:
• Emphasis on large subscriptions over small
  – countries, consortia provide access to large number of researchers
• Affordable pricing with sliding scale according to usage level
• Metered access (some free page views each month)
  – provides access for infrequent users
• Free access for lowest income countries
• Free accounts for students using the database in a course

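
The metered-access idea (some free page views each month, with unlimited access for subscribers) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not TAIR's implementation: the quota of 10 views, the `MeteredAccess` class and the visitor identifiers are all hypothetical, since the slides give neither a quota nor any implementation detail.

```python
from collections import defaultdict
from datetime import date

# Hypothetical quota: the slides say only "some free page views each month".
FREE_VIEWS_PER_MONTH = 10


class MeteredAccess:
    """Toy page-view meter: subscribers are unlimited, others get a monthly quota."""

    def __init__(self, free_views: int = FREE_VIEWS_PER_MONTH):
        self.free_views = free_views
        self.subscribers = set()
        # Maps (visitor_id, year, month) to the number of free views used.
        self._views = defaultdict(int)

    def add_subscriber(self, visitor_id: str) -> None:
        self.subscribers.add(visitor_id)

    def allow(self, visitor_id: str, today: date) -> bool:
        """Return True if the page view is allowed, counting it against the quota."""
        if visitor_id in self.subscribers:
            return True
        key = (visitor_id, today.year, today.month)
        if self._views[key] >= self.free_views:
            return False
        self._views[key] += 1
        return True


if __name__ == "__main__":
    meter = MeteredAccess(free_views=2)
    day = date(2016, 3, 1)
    print([meter.allow("visitor-1", day) for _ in range(3)])
```

Keying the counter on year and month means the quota resets at the start of each calendar month without any scheduled cleanup job, which suits infrequent users who return occasionally.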
Nonprofit approach can mitigate disadvantages
Problem:
• Barrier to data reuse
  – Researchers need to publish subsets of data in the course of their work
  – Other repositories can't display, reuse and further enhance the data
Solutions:
• Adopt academic license terms that allow researchers to publish and reuse limited excerpts
• Freely release all datasets after one year in the repository
  – TAIR does this via quarterly data release files

Sources of TAIR subscription funds
• Who pays for TAIR subscriptions?
  – University libraries and consortia (55%)
  – Countries (China, Switzerland) (27%)
  – Companies (16%)
  – Individual researchers (2%)
• How are they charged?
  – Invoiced for annual or multi-year subscription (similar to journal subscription)

[Figure: TAIR Monthly Subscription Revenue, 1/2014-12/2016, actual and projected; a second panel title covers 1/2014-7/2016. Vertical axis: $0 to $100,000; horizontal axis: months.]

Annual Revenue:

| 2014 | 2015 | 2016 |
|---|---|---|
| $627,000 | $919,000 | $1,035,000 |

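
The annual revenue figures imply a year-over-year growth rate that the slide does not state. The sketch below works it out from the three reported totals. The function names are illustrative, and the average monthly figure is a simple division of the annual total, not a value read off the chart.

```python
# Annual subscription revenue in US dollars, as reported on the slide.
annual_revenue = {2014: 627_000, 2015: 919_000, 2016: 1_035_000}


def growth_percent(prev_year: int, year: int) -> float:
    """Percentage growth in annual revenue from prev_year to year."""
    prev, cur = annual_revenue[prev_year], annual_revenue[year]
    return 100 * (cur - prev) / prev


def average_monthly(year: int) -> float:
    """Annual revenue spread evenly over 12 months (a derived figure)."""
    return annual_revenue[year] / 12


if __name__ == "__main__":
    print(f"2014 to 2015: {growth_percent(2014, 2015):+.1f}%")
    print(f"2015 to 2016: {growth_percent(2015, 2016):+.1f}%")
    print(f"2016 average per month: ${average_monthly(2016):,.0f}")
```

Growth slows from roughly 47% in the first year to roughly 13% in the second, and the derived 2016 monthly average falls within the chart's $0 to $100,000 axis range.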
New Funding Paradigm
• Grant-based funding for initial database development
• Subscription funding to support operations of mature database
  – Must be affordable, many subscription options
  – Access for a range of users must be considered (domain researchers, researchers from other fields, students)
  – Freely release data after minimal period required to provide subscription incentive
• Additional grant-based funding to develop major new features