Standing on the shoulders of giants, German Demidov, Bioinformatics Summer School 2017

Page 1: Standing on the shoulders of giants, German Demidov, …bioinformaticsinstitute.ru/sites/default/files/demidov... · 2020. 8. 31. · 1061–1073 (28 October 2010) > Phase 1: An integrated

Standing on the shoulders of giants, German Demidov, Bioinformatics Summer School 2017

Page 2

Biology and Big Data

> Discovering truth by building on previous discoveries

Page 3

Why is it useful?

Just one example:

Page 4

Using data from consortia

> Which types of data can you obtain from consortia? How do you access and download the data?

> How do you work as part of a consortium? Which problems may you face?

Page 5

Important Remark

> Workshops on “How to use consortium_name” usually take ~3 days (e.g. https://www.encodeproject.org/tutorials/encode-meeting-2016/); we will try to give an overview in 1 hour

> However, if you want to find more information, google “consortium_name workshop”

> There are separate papers (e.g. Ewan Birney, 2012, Nature, about ENCODE)

Page 6

GWAS Consortia

> http://www.wikigenes.org/e/art/e/185.html

> 500,000 genotyped people in the UK

Page 7

EWAS Consortia

Page 8

Genomics Consortia

> The Exome Aggregation Consortium

> 1000 Genomes

> Human Reference Genome

> International Cancer Genome Consortium

> The Cancer Genome Atlas

> PanCancer Analysis of Whole Genomes

> GTEx

Page 9

Epigenomics Consortia

> ENCODE

> Roadmap Epigenomics

> BluePrint

> International Human Epigenome Consortium

Page 10

ExAC Overview

> http://exac.broadinstitute.org/about

> First thing to do – find and read the flagship paper!

> The data set provided on this website spans 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies.

Page 11

ExAC: Why it is useful

It is used to

> calculate objective metrics of pathogenicity for sequence variants,

> identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete reduction in the number of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype,

> efficiently filter candidate disease-causing variants
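A minimal sketch of how a population panel can be used to filter candidate variants, assuming a simple allele-count lookup. The variant identifiers, counts, the `max_af` threshold and the function names are all made up for illustration and are not taken from ExAC or its tools.

```python
from typing import Dict, List, Tuple


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Return AC / AN, or 0.0 when no chromosomes were genotyped."""
    return allele_count / allele_number if allele_number else 0.0


def filter_rare(candidates: List[str],
                panel: Dict[str, Tuple[int, int]],
                max_af: float = 0.001) -> List[str]:
    """Keep candidates that are absent from the panel or rarer than max_af."""
    kept = []
    for variant in candidates:
        # Variants missing from the panel are treated as never observed.
        ac, an = panel.get(variant, (0, 0))
        if allele_frequency(ac, an) <= max_af:
            kept.append(variant)
    return kept


# Hypothetical panel: variant id -> (allele count, allele number).
panel = {
    "1-1000-A-G": (30000, 120000),  # common variant, AF = 0.25
    "2-2000-C-T": (3, 120000),      # rare variant, AF = 0.000025
}
candidates = ["1-1000-A-G", "2-2000-C-T", "3-3000-G-A"]

print(filter_rare(candidates, panel))  # ['2-2000-C-T', '3-3000-G-A']
```

The common variant is dropped because a variant seen in a quarter of reference chromosomes is an unlikely cause of a rare disease; the right threshold depends on the disease model.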

Page 12

ExAC: Results

•  ANNOVAR and ATAV were updated using ExAC data

•  CADD scores were re-calculated

•  Commercial tools such as Golden Helix and GeneTalk also incorporated ExAC data

Page 13

ExAC: Download

> Download

Page 14

ExAC: Methods

> Flagship paper – Methods: a short description, with detailed pipelines in the Supplementary Information

> 91,796 individual exomes drawn from a wide range of primarily disease-focused consortia

Page 15

ExAC Quality Assessment

> Comparison within trios: singleton transmission rate of 50.1% (~50%)

> >10,000 samples were checked with SNP arrays – 97-99% heterozygous concordance

> Platinum standard genome sequenced with 5 different technologies – 99.8% sensitivity, 0.056% FDR

> Comparison with 13 WGS (~30x, PCR-free)

> Indel FDR is higher (4.7%); singleton variants show higher FDR

> FDR differs between annotation classes (missense, synonymous, protein-truncating)

Page 16

ExAC Sample Filtering

> Only 60,706 samples out of 91,796 passed QC

> A set of common SNPs (5,400) was selected, and samples with outlier heterozygosity were removed prior to PCA

> Per-sample metrics: number of variants, transition/transversion (TiTv) ratio, alternate allele heterozygous/homozygous (Het/Hom) ratio and insertion/deletion (indel) ratio

> Close relatives were removed

> Final coverage: 80% of targeted bases >20x

> 77% were enriched with the Agilent kit (33 Mb target)
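Two of the per-sample metrics named above are easy to compute by hand. The sketch below is an illustration on toy data, not the ExAC pipeline: it assumes SNVs are given as (ref, alt) pairs with ref different from alt, and genotypes as VCF-style strings.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine<->purine or pyrimidine<->pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs):
    """Transitions divided by transversions for a list of (ref, alt) SNVs."""
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    return ti / tv if tv else float("inf")


def het_hom_ratio(genotypes):
    """Heterozygous calls divided by homozygous-alternate calls."""
    het = sum(1 for g in genotypes if g == "0/1")
    hom = sum(1 for g in genotypes if g == "1/1")
    return het / hom if hom else float("inf")


# Toy data for one sample.
snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
genotypes = ["0/1", "0/1", "1/1", "0/0"]

print(titv_ratio(snvs))          # 3.0
print(het_hom_ratio(genotypes))  # 2.0
```

A sample whose ratios sit far from the cohort distribution is a candidate for removal, which is the idea behind outlier-based sample filtering.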

Page 17

1000GP

> http://www.internationalgenome.org

Page 18

1000GP: Overview, goals

> http://www.internationalgenome.org/data-portal/sample

> A pretty convenient data portal that allows nice filtering!

> The goal of the 1000 Genomes Project was to find most genetic variants with frequencies of at least 1% in the populations studied.

> The project planned to sequence each sample to 4x genome coverage; at this depth, sequencing cannot discover all variants in each sample, but can allow the detection of most variants with frequencies as low as 1%.
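The trade-off between per-sample depth and cohort size can be illustrated with a deliberately simplified binomial model. The assumptions here are mine, not the project's: depth is fixed at exactly 4 reads, there are no sequencing errors, and a variant counts as "seen" when at least 2 reads carry the alternate allele.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


DEPTH = 4          # reads per site per sample, fixed for simplicity
MIN_ALT_READS = 2  # assumed evidence threshold, not from the project

# One heterozygous carrier: each read shows the alternate allele with p = 0.5.
single_sample = prob_at_least(MIN_ALT_READS, DEPTH, 0.5)

# Five heterozygous carriers pooled: 20 reads cover the site in total.
five_carriers = prob_at_least(MIN_ALT_READS, 5 * DEPTH, 0.5)

print(f"one carrier:   {single_sample:.4f}")  # 0.6875
print(f"five carriers: {five_carriers:.4f}")
```

Under this toy model a single low-coverage sample misses a heterozygous variant about a third of the time, while pooling evidence across several carriers makes detection nearly certain. That is why low depth across many samples can still find variants shared at the population level.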

Page 19

1000GP: Main Publications

> Pilot: A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (28 October 2010)

> Phase 1: An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (01 November 2012)

> Phase 3: A global reference for human genetic variation. Nature 526, 68–74 (01 October 2015)

> An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (01 October 2015)

Page 20

1000GP: Pipeline

Page 21

1000GP: Power of Detection, Heterozygous Discordance, Sequencing Depth

Page 22

1000GP: Results

Page 23

1000GP: Variant Calling

Page 24

1000GP: CNVs

Page 25

1000GP: CNVs concordance

Page 26

PanCancer Analysis Of WG

> https://dcc.icgc.org/pcawg

Page 27

PanCancer Analysis Of WG

1. Novel somatic mutation calling methods

2. Analysis of mutations in regulatory regions

3. Integration of the transcriptome and genome

4. Integration of the epigenome and genome

5. Consequences of somatic mutations on pathway and network activity

6. Patterns of structural variations, signatures, genomic correlations, retrotransposons and mobile elements

7. Mutation signatures and processes

8. Germline cancer genome

9. Inferring driver mutations and identifying cancer genes and pathways

10. Translating cancer genomes to the clinic

11. Evolution and heterogeneity

12. Portals, visualization and software infrastructure

13. Molecular subtypes and classification

14. Analysis of mutations in non-coding RNA

15. Mitochondrial

16. Pathogens

Page 28

PCAWG, WG8: Validation

> High-coverage validation

> 3 main callers: Broad Institute – HaplotypeCaller, Annai-RTG (private company), Freebayes (EMBL-DKFZ)

> 50 samples, 5000 sites per sample sequenced to a depth of ~1000x

> ~2300 SNVs, ~2700 indels

> SNP recall/PPV/concordance ~0.995

> Indels: 0.94 recall, 0.91 PPV, concordance 0.88
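Recall and PPV can be computed directly from a call set and a validated truth set. This is a minimal sketch on made-up variant identifiers; it does not reproduce how the working group defined or computed its concordance figure.

```python
from typing import Set, Tuple


def recall_ppv(called: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Recall = TP / |truth|, PPV = TP / |called| (0.0 if the set is empty)."""
    true_positives = len(called & truth)
    recall = true_positives / len(truth) if truth else 0.0
    ppv = true_positives / len(called) if called else 0.0
    return recall, ppv


# Hypothetical validated sites and calls from one pipeline.
truth = {"chr1:100", "chr1:200", "chr2:300", "chr3:400"}
called = {"chr1:100", "chr1:200", "chr2:300", "chr9:999"}

recall, ppv = recall_ppv(called, truth)
print(recall, ppv)  # 0.75 0.75
```

Recall drops when true variants are missed, PPV drops when false calls are added. Indels typically lose on both, as the figures above suggest.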

Page 29

PCAWG WG8, CNVs

> CNVs

Page 30

PCAWG WG8: Results

> Sensitivity: deletions only ~60%, duplications ~40%!

Page 31

Further Information

> The flagship paper is not informative :/

> 16 papers have been released on bioRxiv

Page 32

GTEx

> The Genotype-Tissue Expression project aims to provide to the scientific community a resource with which to study human gene expression and regulation and its relationship to genetic variation

> Variations in gene expression that are highly correlated with genetic variation can be identified as expression quantitative trait loci, or eQTLs
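The core idea of an eQTL test is a regression of expression on genotype dosage. The sketch below fits a plain least-squares line on invented numbers; real eQTL analyses add covariates, normalization and multiple-testing correction, none of which is shown here.

```python
from math import sqrt
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares slope, intercept and Pearson r of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Made-up data: alternate-allele dosage (0, 1 or 2) and expression per sample.
dosage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept, r = linear_fit(dosage, expression)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

A slope clearly different from zero means each extra copy of the allele shifts expression, which is what marks the variant as a candidate eQTL for that gene.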

Page 33

GTEx

> A lot of genetic changes associated with common human diseases, such as heart disease, cancer, diabetes, asthma, and stroke, lie outside of the protein-coding regions of genes

> The comprehensive identification of human eQTLs will greatly help to identify genes whose expression is affected by genetic variation

Page 34

GTEx Data Overview

Page 35

GTEx Scheme

Page 36

GTEx: Causes of Death

Page 37

ENCODE: Overview

> https://www.encodeproject.org

> Encyclopedia of DNA elements

> The goal of ENCODE is to build a comprehensive parts list of functional elements in the human (mouse/fly/worm) genome

Page 38

ENCODE Timeline

Page 39

ENCODE as of 2012

Page 40

ENCODE: Types of Data

> https://www.encodeproject.org

Page 41

ENCODE: Data Matrix

Page 42

ENCODE: Audit Category

Each sample can have multiple QC issues and can still be available for downloading!
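Because flagged data sets remain downloadable, it is worth filtering on audit flags before analysis. The sketch below uses invented experiment records and an assumed three-level severity scale; it does not call the ENCODE portal or mirror its metadata schema.

```python
from typing import Dict, List

# Assumed severity ordering for illustration only.
SEVERITY = {"WARNING": 1, "NOT_COMPLIANT": 2, "ERROR": 3}


def passes(audits: List[str], max_allowed: str = "WARNING") -> bool:
    """True if no audit flag is more severe than max_allowed."""
    limit = SEVERITY[max_allowed]
    return all(SEVERITY[flag] <= limit for flag in audits)


# Hypothetical experiments mapped to the audit flags raised against them.
experiments: Dict[str, List[str]] = {
    "exp1": [],
    "exp2": ["WARNING"],
    "exp3": ["WARNING", "ERROR"],
}

usable = [name for name, audits in experiments.items() if passes(audits)]
print(usable)  # ['exp1', 'exp2']
```

How strict to be is a judgment call: tolerating warnings keeps more data, while dropping anything flagged keeps the analysis cleaner.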

Page 43

ENCODE: Result of Analysis

Page 44

ENCODE: Ground Level

Page 45

ENCODE: Mid-level

Page 46

ENCODE: Top-Level

Page 47

ENCODE publications

> Of course, one of the products is publications!

[Figure: Cumulative ENCODE Publications Over Time. Y-axis: Number of Publications (0 to 600). Series: Papers from Non-ENCODE Authors; Papers from ENCODE 2 Production Groups.]

Page 48

ENCODE standards

> Data Standards

Page 49

BluePrint

> “BLUEPRINT is a large-scale research project receiving close to 30 million euro funding from the EU.”

> 42 leading European scientific centers

> The aim is to further the understanding of how genes are activated or repressed in both healthy and diseased human cells

> Focus on distinct types of haematopoietic cells from healthy individuals and on their malignant leukaemic counterparts

Page 50

BluePrint

> http://www.blueprint-epigenome.eu

> Publications (Cell Papers) & Data Portal

Page 51

BluePrint

> http://dcc.blueprint-epigenome.eu/#/home

Page 52

BluePrint

Page 53

BluePrint

Page 54

Roadmap Epigenomics

> The NIH Roadmap Epigenomics research aims to transform our understanding of how epigenetics contributes to disease

> The Consortium leverages experimental pipelines built around next-generation sequencing technologies to map DNA methylation, histone modifications, chromatin accessibility and small RNA transcripts in stem cells and primary ex vivo tissues selected to represent the normal counterparts of tissues and organ systems frequently involved in human disease

Page 55

Roadmap Epigenomics

Page 56

Roadmap Epigenomics

Page 57

Roadmap Epigenomics

It looks like we can get protocols by clicking on the link; however, there are not a lot of them there. The protocols are super outdated! (e.g. REMC STANDARDS AND GUIDELINES FOR CHIP-SEQ DEC. 2, 2011, V1.0)

Page 58

Roadmap Epigenomics

> If you want to work with these data, read the paper “Integrative analysis of 111 reference human epigenomes” (+16 ENCODE 2012; do not print the paper!)

> Go through the “Publications” list

Page 59

Roadmap Epigenomics

The most useful section is Methods:

> RNA-seq uniform processing and quantification for consolidated epigenomes

> ChIP-seq and DNase-seq uniform reprocessing for consolidated epigenomes

> Methylation data cross-assay standardization and uniform processing for consolidated epigenomes

> Chromatin state learning

> Etc.

Page 60

Roadmap Epigenomics

> Publications

Page 61

Roadmap Epigenomics

> Histone mark combinations show distinct levels of DNA methylation and accessibility, and predict differences in RNA expression levels that are not reflected in either accessibility or methylation.

> Megabase-scale regions with distinct epigenomic signatures show strong differences in activity, gene density and nuclear lamina associations, suggesting distinct chromosomal domains.

> Approximately 5% of each reference epigenome shows enhancer and promoter signatures, which are twofold enriched for evolutionarily conserved non-exonic elements on average.

> Epigenomic data sets can be imputed at high resolution from existing data, completing missing marks in additional cell types, and providing a more robust signal even for observed data sets.

> Dynamics of epigenomic marks in their relevant chromatin states allow a data-driven approach to learn biologically meaningful relationships between cell types, tissues and lineages.

Page 62

Working in Consortia

Page 63

Working with Data

•  Getting raw data

•  Working with data from different consortia simultaneously: different QCs, different data analysis pipelines

•  Tool versions missing, or outdated/unsupported tools – failure of replication!

Page 64

Working in Consortia I

•  When your server goes down or all your data are accidentally removed

•  Deadlines – add 3-6 months to the expected date!

•  Communication: teleconferences

•  Password renewal, access permissions

•  Efficient data sharing – speed, reliability, confidentiality

Page 65

Working in Consortia II

•  Different naming of the same samples in different working groups/labs

•  Wrong/missing identifiers (e.g. wrong cancer type or population) – case: normal and somatic were actually swapped

•  The same, but from clinicians

•  Different labs, different library preparation (e.g. coverage depths after PCR-free and PCR-based WGS)

•  Several tools can be used for the analysis – establishing the best tool or generating a joint call set

•  Multiple blacklists or outlier lists (every lab/group has its own and they do not completely overlap)
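Inconsistent sample names and non-overlapping blacklists can be handled together by normalizing identifiers first and then taking the union of the lists. The naming scheme and sample IDs below are invented; a real consortium would need an agreed mapping table rather than a string rule.

```python
from typing import Iterable, Set


def normalize_id(raw: str) -> str:
    """Map spelling variants of a sample name to one canonical form."""
    return raw.strip().upper().replace("_", "-")


def merged_blacklist(*lists: Iterable[str]) -> Set[str]:
    """Union of several blacklists after normalizing every identifier."""
    merged: Set[str] = set()
    for blacklist in lists:
        merged.update(normalize_id(sample) for sample in blacklist)
    return merged


# Hypothetical blacklists maintained separately by two labs.
lab_a = ["sample_001", "Sample-007"]
lab_b = ["SAMPLE-007 ", "sample_042"]
blacklist = merged_blacklist(lab_a, lab_b)

samples = ["sample-001", "SAMPLE_003", "sample_042", "Sample-100"]
kept = [s for s in samples if normalize_id(s) not in blacklist]
print(kept)  # ['SAMPLE_003', 'Sample-100']
```

Taking the union is the conservative choice: a sample flagged by any one group is excluded everywhere, at the cost of losing some usable data.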

Page 66

Working in Consortia III

•  Unbalanced population structure

•  Mix of different effects (e.g. Cancer vs. Population)

•  Is your germline really germline?

Page 67

Slide from AgENCODE, Ewan Birney

Page 68

Thank you for your attention!