BMS 353 Bioinformatics for Biomedical Science (opendsi.cc/bioinformatics/assets/Lecture_Wk7.pdf)
BMS353
BMS353 Bioinformatics for Biomedical Science
Module coordinator: Dr Marta Milo
Today's Outline
Part A: Presentation of the module
Break: question answering
Part B: Introduction
Part A: Presentation of the module
What is it all about?
This module will describe fundamental concepts and technologies underlying computational biology and bioinformatics.
Computational Biology is the development and application of data-driven mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data. Bioinformatics combines computer science, statistics, mathematics, and engineering to analyse and interpret biological data. (Adapted from Wikipedia)
What is a Bioinformatician?
What I do in my research:
• Next generation sequencing data analysis
• Noise deconvolution
• Modelling uncertainty
• Integration of data
• Modelling observed data for predictions
What am I going to teach you?
Some of that STUFF
What are the learning outcomes of this module?
This module aims to:
1. provide an understanding of the fundamental concepts and technologies underlying computational biology and bioinformatics
2. equip biology students with basic knowledge of mathematical concepts and with methods of bioinformatics and computational biology
3. use a multidisciplinary approach integrated with programming tools and statistical concepts underpinning advanced data analysis and methods that are suitable for high-throughput data analysis
4. provide new transferable skills
How will you be learning?
• Lectures on theoretical concepts
• Online resources from open source software
• Writing simple scripts for data analysis during practical classes
• Self-marking and formative feedback
• Group discussion and forum through the module website
• Small research project on real data
• Banging your head on the computer..
• Giving yourself time to adapt to this new way of thinking…
What will you gain from BMS353?
• Training in data analysis and basic programming skills, with the aim of being aware of the effects of experimental design on the data analysis
• A good understanding of technologies and methods for Bioinformatics and the use of workflows and pipelines for data analysis
• New qualifications that will increase your employability
• Deeper insight into the principles of conducting a research data analysis project
• A new set of transferable skills, like programming and awareness of cloud computing and data sharing
• Learning a new terminology and new interdisciplinary skills
Module Outline
The teaching consists of two hours of lectures and two hours of lab classes each week.
The lectures will be followed by practical classes.
In the lab classes we will use coding to turn theory into practice.
Lab classes are split into two groups to reduce class numbers.
Coding requires practice: the more the better.
Module Outline (cont.)
• Courseworks are essential to learn the coding skills: do them.
• Hit the deadlines for the self-assessment to monitor your progress and highlight problems you might have.
• Make sure you participate actively in the interactive sessions in the class and in the labs.
• Use the resources on the module website and
• Read the notebook carefully and follow the instructions
• Avoid catch-up! Remember BMS353 is different from biology teaching and can be overwhelming if left all to the end.
• Please note: if you email a question that can be answered by reading the module handbook or the instructions on MOLE or the module website, you will not receive an answer.
BMS353 website
The tools we will use
Jupyter notebook (originally IPython notebook): combines computer programs (code), text, data and results into one interactive document.
R: a popular programming language in areas such as bioinformatics, statistics and data analysis.
We will use a cloud computing environment called CoCalc (formerly SageMathCloud).
We will use our brain to create new knowledge.
Some mathematical concepts
BMS353 assessment
The exam for this module will be split into two parts:
Part A: a Multiple Choice Question test lasting 1 hour and 30 minutes, which will count for 30% of the final grade.
Part B: a notebook with the implementation of allocated projects, which will count for 70% of the final grade.
The project will be a collection of all the tools experienced in the practical labs, implemented on a set of real data. It will be developed in groups of three students, but each notebook will have to be handed in individually.
The lab practical notebooks handed in every week during the module will constitute formative feedback that can be used for the final project.
MCQ assessment
Each question will have 4 possible responses: A, B, C or D. ONLY ONE RESPONSE IS CORRECT IN EACH CASE. Each question is worth one mark: a correct answer will count as 1, an incorrect answer will count as -0.5. Unanswered questions will count as 0.
1. What is the main subject of BMS353: A. Phycology B. Statistics C. Computational biology D. Computer Science
2. What level students is BMS353 aimed at: A. Level 3 - BMS B. Postgraduate C. Master students D. Computer Science students
3. There will be no mathematics in BMS353: A. TRUE B. FALSE C. TRUE only on odd days D. TRUE only on even days
Student 1: 1C, 2B, 3A, mark = 0
Student 2: 1C, 2-, 3-, mark = 1
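The marking rule above can be sketched in a few lines of code. This is only an illustration, written in Python rather than the R used in the module. The answer key is an assumption: the two worked examples only imply that 1C is correct and that 2B and 3A are wrong, so the key for questions 2 and 3, and the names `ANSWER_KEY` and `mcq_mark`, are hypothetical.

```python
# Hypothetical answer key: the worked examples only imply that 1C is correct
# and that 2B and 3A are incorrect; the entries for 2 and 3 are assumed.
ANSWER_KEY = {1: "C", 2: "A", 3: "B"}


def mcq_mark(responses, key=ANSWER_KEY):
    """Total mark: +1 per correct answer, -0.5 per incorrect, 0 if unanswered."""
    total = 0.0
    for question, correct in key.items():
        given = responses.get(question)  # None means the question was not answered
        if given is None:
            continue
        total += 1.0 if given == correct else -0.5
    return total


student1 = {1: "C", 2: "B", 3: "A"}  # one correct, two incorrect
student2 = {1: "C"}                  # one correct, two not answered

print(mcq_mark(student1))  # 0.0
print(mcq_mark(student2))  # 1.0
```

Under this rule a wrong guess costs half a mark, so leaving a question blank scores higher than answering it incorrectly.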
Part B: Introduction
Cloud computing
Cloud computing, or simply "the cloud", also known as on-demand computing, is a model for enabling on-demand access to a shared pool of configurable resources.
Sam Johnston, from Wikipedia
The cloud metaphor: the network elements representing the services are invisible to the user, as if obscured by a cloud.
Advantages
• Cost efficient
• Large storage space
• Backup and recovery
• Easy access
• Quick to gain functionality
• Incentivises collaboration and data sharing
Disadvantages
• Technical issues
• Security in the cloud
• Prone to attack
Cloud computing: an example
A very effective use of cloud resources and their commercial exploitation is given by Amazon.
They used cloud computing to create the concept of Elastic Computing (EC2, the Elastic Compute Cloud). It is a key part of Amazon Web Services (AWS), which is composed of scalable elastic compute units (ECU) that were introduced as an abstraction of computer resources. A user can create, launch, and terminate server usage as needed. It is based on paying by the hour for active servers, which is why it is called "elastic". Its global feature allows users to control the geographical location of instances (server usage), optimising latency and redundancy.
First to allow companies to rent scalable computing resources.
Their retail e-commerce site is entirely based on cloud computing.
Big Data and Data Sharing
Big data is a very generic term to indicate data sets that are so large or complex that traditional data processing applications are inadequate for mining them.
Visualization of daily Wikipedia edits created by IBM. At multiple terabytes in size, the text and images of Wikipedia are an example of big data.
• High volume
• High velocity
• High variety
• Highly variable
• High variation in quality
• High complexity
Big Data and Data Sharing (cont.)
There are many challenges when dealing with big data. Some of them are:
• Data analysis
• Data curation
• Searching engines
• Data sharing
• Data storage and transfer
• Data visualization
• Information privacy
However, big data has a high predictive power and its accuracy may lead to more confident decision making.
In biology: With the advent of high-throughput genomics, life scientists are starting to grapple with massive data sets, encountering big data challenges. (Technology Feature, Nature 2013)
Analysing the large amount of genomic data with local infrastructure is impossible. The data is then moved to the cloud for analysis and storage. Data sharing is becoming crucial for biological data.
CoCalc: www.cocalc.com
We are using the cloud to learn in BMS353. The resources on the cloud are used as a teaching tool.
Jupyter Notebooks on CoCalc Cloud
We will use Jupyter Notebooks and their kernels on CoCalc for all our practical classes.
A Jupyter Notebook kernel is a "computational engine" that executes the code written in the Notebook document.
In this module (BMS353) we will use R kernels to implement our data analysis in the notebooks.
There will be an allocated folder and storage space for our project: BMS353.
You will access your assignments and data using CoCalc with a web browser.
Everything will be stored in the CoCalc folder allocated to you.
The cloud will back up and secure our work, as well as giving us computational time for the data analysis.
All the lab practicals and the final project will be marked and assessed from notebooks saved in the CoCalc folders.
Basic programming terminology
Programming language = a language formally designed to communicate instructions to a machine, i.e. a computer, to control its behavior or to express a mathematical construct in numerical form (perform operations, more or less complex)
Algorithm = a procedure or formula for solving a problem
Kernels = computational engine that is activated by a specific language (e.g. R, Python, C etc.)
Scripts = a list of instructions that represent the commands needed to carry out a task. A script has a logical structure and a defined structure for data input
Implementation = the process of putting into effect the list of instructions that are specified in the script. This is done by using numerical values as input. The implementation process will produce a final set of values.
Debug = process for identifying and removing errors from scripts
Object = virtual container of values stored in the working space. It is used to implement the instructions and to store values during the implementation and as the final set.
Programming Function = a procedure or a routine that encapsulates a "task". Many instructions are combined in one "word" (the name of the programming function), which will implement that "task" on a set of specified inputs.
Read and Write = the process of uploading data into the workspace and of downloading data from the working space into a local or remote archive (folder)
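To make these terms concrete, here is a minimal sketch of a script that reads data, stores it in an object, applies a programming function and writes the result. It is written in Python purely for illustration (the module itself uses R kernels). The gene names, the values and the function name `mean` are hypothetical, and in-memory text buffers stand in for files in a folder.

```python
import csv
import io


def mean(values):
    """Programming function: one name that encapsulates the task 'average'."""
    return sum(values) / len(values)


# Read: load data into the workspace. An in-memory buffer stands in for a
# file stored in a local or remote folder (hypothetical data).
source = io.StringIO("gene,expression\ngeneA,2.0\ngeneB,4.0\ngeneC,6.0\n")
rows = list(csv.DictReader(source))

# Object: a container of values held in the workspace.
expression = [float(row["expression"]) for row in rows]

# Implementation: the instructions are applied to numerical input and
# produce a final set of values.
result = mean(expression)

# Write: send the result from the workspace to an archive
# (again an in-memory buffer in this sketch).
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["mean_expression"])
writer.writerow([result])

print(output.getvalue())
```

Debugging, in this vocabulary, would be the process of finding out why `result` is wrong if, for example, the column name in the input did not match the one the script expects.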
Basic mathematics
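Basic mathematics notation
Single values and vectors
x and y are values from the real numbers: $x, y \in \mathbb{R}$
[Figure: three-dimensional axes X, Y, Z with the point A and its coordinates x, y, z]
$A \equiv (x, y, z)$
In general: $\mathbf{x} \equiv (x_1, x_2, \ldots, x_N)$, with $x_i \in \mathbb{R}$ and $i = 1, \ldots, N$
The values $x_i$ are called variables since they can assume a range of values. The parameters are fixed values that we indicate in mathematical notation with Greek letters: $\alpha, \beta, \mu, \sigma, \lambda, \ldots$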
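Basic mathematics notation (cont.)
Matrices are tables of values or letters that are organised in rows and columns. In common use they only have two dimensions; in more advanced use they can have three.
Vectors are special cases of matrices: they have N columns and only one row.
A = [3x4]
Operations with matrices
Sum and difference: the matrices must have the same dimensions.
Multiplication: the number of columns of the first matrix needs to be the same as the number of rows of the second matrix. Multiplication is done so that the entry in row i and column j of the product is: $(AB)_{ij} = \sum_{k} A_{ik} B_{kj}$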
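The dimension rules for matrix operations can be checked with a short sketch. It uses plain Python lists of rows for illustration only (the module itself uses R). The helper names `shape`, `mat_add` and `mat_mul` and the example matrices are hypothetical, and the multiplication follows the standard row-by-column rule.

```python
def shape(m):
    """Return (number of rows, number of columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def mat_add(a, b):
    """Element-wise sum; both matrices must have the same dimensions."""
    if shape(a) != shape(b):
        raise ValueError("sum needs two matrices of the same dimensions")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_mul(a, b):
    """Row-by-column product; columns of a must equal rows of b."""
    rows_a, cols_a = shape(a)
    rows_b, cols_b = shape(b)
    if cols_a != rows_b:
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


A = [[1, 2, 3], [4, 5, 6]]    # 2 rows x 3 columns
B = [[1, 0], [0, 1], [1, 1]]  # 3 rows x 2 columns
v = [[1, 2, 3]]               # a vector: one row, N = 3 columns

print(mat_add(A, A))  # [[2, 4, 6], [8, 10, 12]]
print(mat_mul(A, B))  # [[4, 5], [10, 11]]
print(mat_mul(v, B))  # [[4, 5]]
```

Here `mat_mul(B, v)` would raise an error, because B has 2 columns while v has only 1 row.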
Basic mathematics notation (cont.)
A way of writing a notation for large sums or multiplications is to use the Greek symbols $\sum$ (for summing) and $\prod$ (for multiplying).
For summing N values we will use the following notation: $\sum_{i=1}^{N} x_i / \sigma$
For multiplying N values we will use the following notation: $\prod_{i=1}^{N} x_i / \sigma$
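The two symbols correspond to a running sum and a running product over the index i. The sketch below is in Python for illustration only (the module uses R). It reads $x_i / \sigma$ as each value divided by a parameter σ, which is an assumption about the slide's notation; the values of `x` and `sigma` are hypothetical.

```python
import math

x = [2.0, 4.0, 6.0]  # hypothetical values x_1, ..., x_N with N = 3
sigma = 2.0          # hypothetical parameter, read here as a divisor (assumption)

# Sum over i = 1..N of x_i / sigma
total = sum(x_i / sigma for x_i in x)

# Product over i = 1..N of x_i / sigma
product = math.prod(x_i / sigma for x_i in x)

print(total)    # 1.0 + 2.0 + 3.0 = 6.0
print(product)  # 1.0 * 2.0 * 3.0 = 6.0
```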
Basic mathematics notation (cont.)
A function is a relation from a set of inputs to a set of possible outputs, where each input is related to exactly one output.
$f(x) = x / 2$, where $f(x)$ is the output and $x$ is the input (variable)
$f(x) = 4x + 4$
When the input is one variable we say a one-dimensional function.
$f(x, y) = 2x + 2y$
When the input is more than one variable we say a multi-dimensional function. With two variables we say a bi-dimensional function.
We can also have functions conditional on a parameter. In this case we call them conditional functions.
$f(x, y / \alpha) = 2x + 2y\alpha$
where $\alpha$ takes its value from the set of even numbers between 0 and 10
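The kinds of function described above map directly onto functions in code. This is a sketch in Python for illustration only (the module uses R), with hypothetical function names. Two points are assumptions: the last term of the conditional function is read as 2y multiplied by α, since the slide's formula is garbled in this transcript, and "between 0 and 10" is taken to include both ends.

```python
def f_half(x):
    """One-dimensional function f(x) = x / 2: one input, exactly one output."""
    return x / 2


def f_line(x):
    """One-dimensional function f(x) = 4x + 4."""
    return 4 * x + 4


def f_two(x, y):
    """Bi-dimensional function f(x, y) = 2x + 2y."""
    return 2 * x + 2 * y


# Allowed parameter values: even numbers from 0 to 10 (inclusive by assumption).
ALPHAS = range(0, 11, 2)


def f_cond(x, y, alpha):
    """Function of x and y conditional on alpha (form of the last term assumed)."""
    if alpha not in ALPHAS:
        raise ValueError("alpha must be an even number between 0 and 10")
    return 2 * x + 2 * y * alpha


print(f_half(3))        # 1.5
print(f_line(1))        # 8
print(f_two(1, 2))      # 6
print(f_cond(1, 2, 4))  # 2*1 + 2*2*4 = 18
```

The variables x and y can change from call to call, while the parameter α is fixed to one value from its allowed set for a given use of the function.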
Summary
• What BMS353 is about and what you can expect to learn and gain after taking BMS353
• How to get information about the module and where to find links to additional reading material, lecture content and practical classes (Website)
• How to interact for discussion and problem-solving
• How you will get assessed
• Tools we will be using in BMS353
• Cloud computing and Big Data
• Jupyter Notebooks and CoCalc Cloud
• Basic programming terminology
• Refreshed some basic mathematical notions and notations.