Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
![Page 1: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/1.jpg)
Revolutionizing data dissemination.
www.gigasciencejournal.com
GSC13, Shenzhen
Scott Edmunds
![Page 2: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/2.jpg)
www.gigasciencejournal.com
Large-Scale Data Journal/Database
Editor-in-Chief: Laurie Goodman, PhD
Editor: Scott Edmunds, PhD
Assistant Editor: Alexandra Basford, PhD
Lead Curator: Tam Sneddon, DPhil
In conjunction with:
Now taking submissions…
![Page 3: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/3.jpg)
www.gigaDB.org
Associated Database
![Page 4: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/4.jpg)
BGI
Data Reuse
Funders
Databases
Journals
Data Producers
Users
…Data Flow
![Page 5: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/5.jpg)
Data Re-use
($)
Effort
Usability
![Page 6: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/6.jpg)
Need to lower the hurdles…
($)
Effort
Usability
![Page 7: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/7.jpg)
Need to lower the hurdles…
($)
Effort
Usability
![Page 8: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/8.jpg)
Need to lower the hurdles…
Cloud solutions?
Better tools for assessing data quality…
Better handling of metadata…
![Page 9: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/9.jpg)
Cloud?
Need to lower the hurdles…
More efficient handling of data…
Do we need to keep everything?
Compression?
![Page 10: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/10.jpg)
![Page 11: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/11.jpg)
![Page 12: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/12.jpg)
Better incentives?
($)
Effort
Usability
![Page 13: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/13.jpg)
?
New incentives/credit
Credit where credit is overdue:
“One option would be to provide researchers who release data to public repositories with a means of accreditation.”
“An ability to search the literature for all online papers that used a particular data set would enable appropriate attribution for those who share.”
Nature Biotechnology 27, 579 (2009)
Prepublication data sharing (Toronto International Data Release Workshop):
“Data producers benefit from creating a citable reference, as it can later be used to reflect impact of the data sets.”
Nature 461, 168-170 (2009)
![Page 14: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/14.jpg)
Data citation: DataCite and DOIs
Aims to:
“increase acceptance of research data as legitimate, citable contributions to the scholarly record”.
“data generated in the course of research are just as valuable to the ongoing academic discourse as papers and monographs”.
![Page 15: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/15.jpg)
For data citation to work, it needs:
• Proven utility/potential user base.
• Acceptance/inclusion by journals.
• Data+Citation: inclusion in the references.
• Tracking by citation indexes.
• Usage of the metrics by the community…
![Page 16: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/16.jpg)
>1.3 million DOIs since Dec 2009
Data citation: utility/user base.
![Page 17: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/17.jpg)
BGI Datasets Get DOI®s
doi:10.5524/100004
Plants
- Chinese cabbage
- Cucumber
- Foxtail millet
- Pigeonpea
- Potato
- Sorghum
Microbe
- E. coli O104:H4 TY-2482
Cell line
- Chinese Hamster Ovary
Human
- Asian individual (YH): DNA Methylome, Genome Assembly, Transcriptome
- Ancient DNA (coming soon): Saqqaq Eskimo, Aboriginal Australian
Vertebrates
- Giant panda
- Macaque: Chinese rhesus, Crab-eating
- Naked mole rat
- Penguin: Emperor penguin, Adelie penguin
- Pigeon, domestic
- Polar bear
- Sheep
- Tibetan antelope
Invertebrate
- Ant: Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant
- Roundworm
- Silkworm
Many released pre-publication…
![Page 18: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/18.jpg)
Our first DOI:
To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as:
Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001
To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
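As a rough illustration of how the dataset citation on this slide is assembled, the sketch below builds the same "authors (year) title. publisher. doi:… URL" string from its parts. The function names `doi_url` and `format_citation` are hypothetical and only for illustration; the resolver prefix is copied from the slide, and the author list in the usage note is shortened.

```python
def doi_url(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Turn 'doi:10.5524/100001' (or a bare DOI) into a resolver URL.

    The default resolver prefix is the one shown on the slide.
    """
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]  # drop the 'doi:' label, keep the identifier
    return resolver + doi


def format_citation(authors, year, title, publisher, doi):
    """Assemble a dataset citation in the layout used on the slide."""
    bare = doi_url(doi, resolver="")  # identifier without the 'doi:' label
    author_text = "; ".join(authors)
    return f"{author_text} ({year}) {title}. {publisher}. doi:{bare} {doi_url(doi)}"
```

For example, `format_citation(["Li, D", "Xi, F"], 2011, "Genomic data from Escherichia coli O104:H4 isolate TY-2482", "BGI Shenzhen", "doi:10.5524/100001")` yields a citation ending in `doi:10.5524/100001 http://dx.doi.org/10.5524/100001`.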
![Page 19: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/19.jpg)
![Page 20: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/20.jpg)
![Page 21: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/21.jpg)
![Page 22: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/22.jpg)
Data Citation: acceptance by journals
![Page 23: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/23.jpg)
Data Citation: acceptance by journals
![Page 24: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/24.jpg)
Data+Citation: inclusion in the references
![Page 25: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/25.jpg)
Published 21st November 2011
• Data submitted to NCBI databases:
- Raw data SRA:SRA046843
- Assemblies of 3 strains GenBank:AHAO00000000-AHAQ00000000
- SNPs dbSNP:1056306
- CNVs, InDels, SV dbGAP:nstd63
• Submission to public databases complemented by its citable form in GigaDB.
![Page 26: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/26.jpg)
![Page 27: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/27.jpg)
In the references…
![Page 28: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/28.jpg)
Is the DOI…
![Page 29: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/29.jpg)
![Page 30: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/30.jpg)
And now in Nature Biotech…
![Page 31: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/31.jpg)
Data citation: tracking?
![Page 32: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/32.jpg)
Data citation: tracking?
Plans in 2012 to link central metadata repository with WoS
- Will finally track and credit use!
To be continued…
DataCite metadata in harvestable form (OAI-PMH)
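To make "harvestable form (OAI-PMH)" concrete, here is a minimal sketch of how a harvester would build a `ListRecords` request. The `verb`, `metadataPrefix`, `set` and `from` parameters and the `oai_dc` format come from the OAI-PMH protocol itself; the endpoint URL and the helper name `list_records_url` are assumptions for illustration and are not taken from the slides.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; check the repository's own
# documentation for the real OAI-PMH base URL.
DATACITE_OAI = "https://oai.datacite.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    metadata_prefix: 'oai_dc' (Dublin Core) is the format every OAI-PMH
    repository is required to support.
    set_spec / from_date: optional selective-harvesting arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # e.g. '2011-01-01'
    return base_url + "?" + urlencode(params)
```

A citation index could fetch such a URL, parse the returned XML records, and follow any `resumptionToken` in the response to page through the full set.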
![Page 33: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/33.jpg)
www.gigasciencejournal.com
Thanks to:
Laurie Goodman, Alexandra Basford, Tam Sneddon, Shaoguang Liang, Tin-Lap Lee (CUHK), Qiong Luo (HKUST)
Follow us:
@gigascience
facebook.com/GigaScience
blogs.openaccesscentral.com/blogs/gigablog/
Contact us:
![Page 34: Scott Edmunds: Revolutionizing Data Dissemination: GigaScience](https://reader031.fdocuments.net/reader031/viewer/2022013100/54c8ae514a7959da358b4571/html5/thumbnails/34.jpg)
www.gigasciencejournal.com
Contact: [email protected]
GSC13 special series
• Rapid review - rolling publication after launch issue
• High-visibility - published/promoted by BMC/GigaScience
• Article Processing Charge covered by BGI
• Hosting of any test datasets in GigaDB
Seeking submissions highlighting best practice in genomics research:
• Discussion/comment/white papers
• Cloud computing, software for data handling
• Research highlighting best practice