Computational Skills for Biology Research, Lecture 11
Uploaded by suk-namgoong
![Page 1: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/1.jpg)
Computational Skill for Modern Biology Research
Department of Biology, Chungbuk National University
11th Lecture 2015.11.24
NGS Analysis IV : Gene Set Analysis
![Page 2: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/2.jpg)
Syllabus
Week : Topic
Week 1 : Introduction : Why we need to learn this stuff?
Week 2 : Basics of Unix and running BLAST on your PC
Week 3 : Unix Command Prompt II and shell scripts
Week 4 : Basics of programming (Python programming)
Week 5 : Python Scripting II and sequence manipulations
Week 6 : IPython Notebook and Pandas
Week 7 : Basics of Next Generation Sequencing and Tutorial
Week 8 :
Week 9 : Next Generation Sequencing Analysis I
Week 10 : Next Generation Sequencing Analysis II
Week 11 : Next Generation Sequencing Analysis III
Week 12 : Next Generation Sequencing Analysis IV
Week 13 :
Week 14 :
![Page 3: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/3.jpg)
Differential Expression Data Mining
Sleuth
Analysis - Test Table
Download Table
![Page 4: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/4.jpg)
Data Mining with IPython Notebook
Read ‘test_table.csv’ as a DataFrame
P values
FDR (False Discovery Rate)
Mean expression level (log scale)
Fold change
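A minimal sketch of the loading step, assuming the table was exported from sleuth, whose results table normally carries `pval` (p value), `qval` (FDR), `mean_obs` (mean log expression) and `b` (effect size on a log scale). The mapping of slide labels to these columns is an assumption, and the rows below are made up so the example runs without the real file.

```python
import io
import pandas as pd

# Made-up excerpt standing in for 'test_table.csv' (values are illustrative only).
csv_text = """target_id,pval,qval,b,mean_obs
tx1,0.0001,0.002,2.5,4.1
tx2,0.2,0.45,0.3,3.0
tx3,,,,
"""

# With the real file this would be: pd.read_csv("test_table.csv")
test_table = pd.read_csv(io.StringIO(csv_text))

# Show the columns that the slide annotates.
print(test_table[["target_id", "pval", "qval", "mean_obs", "b"]])
```

Transcripts that sleuth could not test show up as rows of missing values, like `tx3` here.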
![Page 5: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/5.jpg)
Remove rows with missing data
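One way this step might look in pandas is `dropna`. The column names and values below are assumptions for illustration, not the lecture's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical test table; the last transcript has no test results.
test_table = pd.DataFrame({
    "target_id": ["tx1", "tx2", "tx3"],
    "pval": [1e-4, 0.2, np.nan],
    "qval": [0.002, 0.45, np.nan],
    "b": [2.5, 0.3, np.nan],
})

# Keep only rows where every statistic of interest is present.
cleaned = test_table.dropna(subset=["pval", "qval", "b"])
print(cleaned)
```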
![Page 6: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/6.jpg)
Filtering
Fold change is greater than 2
FDR is less than 0.01
Observation (expression level) is higher than 2
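The three criteria can be combined into one boolean mask. This sketch assumes sleuth-style column names (`b`, `qval`, `mean_obs`) and takes the absolute value of `b` so that both up- and down-regulated transcripts pass. The slide only says "greater than 2", so the use of `abs()` is an assumption.

```python
import pandas as pd

# Hypothetical cleaned test table.
table = pd.DataFrame({
    "target_id": ["tx1", "tx2", "tx3", "tx4"],
    "qval": [0.001, 0.001, 0.2, 0.005],
    "b": [2.5, 1.0, 3.0, -3.1],
    "mean_obs": [4.0, 5.0, 6.0, 1.5],
})

# Combine the three conditions; each comparison needs its own parentheses.
mask = (table["b"].abs() > 2) & (table["qval"] < 0.01) & (table["mean_obs"] > 2)
significant = table[mask]
print(significant)
```

Here only `tx1` passes: `tx2` fails the effect-size cutoff, `tx3` the FDR cutoff, and `tx4` the expression cutoff.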
![Page 7: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/7.jpg)
Read the abundance table for each sample
Save them as abundance.csv
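A sketch of stacking per-sample abundance tables into one long table. The sample names and values are hypothetical, and the `target_id`, `est_counts` and `tpm` columns are assumed to follow kallisto's `abundance.tsv` layout.

```python
import pandas as pd

# Hypothetical per-sample tables; in practice each would come from
# pd.read_csv(<path to that sample's abundance.tsv>, sep="\t").
samples = {
    "control_1": pd.DataFrame({"target_id": ["tx1", "tx2"],
                               "est_counts": [10.0, 200.0],
                               "tpm": [1.2, 30.5]}),
    "treated_1": pd.DataFrame({"target_id": ["tx1", "tx2"],
                               "est_counts": [95.0, 180.0],
                               "tpm": [11.8, 27.0]}),
}

# Tag every row with its sample name, then stack the tables.
frames = [df.assign(sample=name) for name, df in samples.items()]
abundance = pd.concat(frames, ignore_index=True)

# to_csv returns the CSV text when no path is given;
# abundance.to_csv("abundance.csv", index=False) would write the file instead.
csv_text = abundance.to_csv(index=False)
print(csv_text)
```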
![Page 8: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/8.jpg)
Read abundance Table in Pandas
Same Transcripts
Different Samples
![Page 9: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/9.jpg)
Extract the IDs of the differentially expressed transcripts
![Page 10: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/10.jpg)
Select the transcripts that meet the differential expression criteria
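Extracting the IDs and selecting the matching abundance rows might be done with `isin`. The frames below are hypothetical stand-ins for the filtered test table and the combined abundance table.

```python
import pandas as pd

# Hypothetical filtered test table (differentially expressed transcripts).
significant = pd.DataFrame({"target_id": ["tx1", "tx3"]})

# Hypothetical long-format abundance table: one row per transcript per sample.
abundance = pd.DataFrame({
    "target_id": ["tx1", "tx2", "tx3", "tx1", "tx2", "tx3"],
    "sample": ["control_1"] * 3 + ["treated_1"] * 3,
    "tpm": [1.0, 5.0, 2.0, 8.0, 5.5, 0.1],
})

# Extract the IDs, then keep only abundance rows for those IDs.
de_ids = significant["target_id"].unique()
selected = abundance[abundance["target_id"].isin(de_ids)]
print(selected)
```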
![Page 11: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/11.jpg)
Use ‘pivot’ to reshape the DataFrame
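A small sketch of the reshape, assuming a long table with `target_id`, `sample` and `tpm` columns (the names are assumptions). `pivot` turns it into one row per transcript and one column per sample.

```python
import pandas as pd

# Hypothetical long-format table of selected transcripts.
selected = pd.DataFrame({
    "target_id": ["tx1", "tx3", "tx1", "tx3"],
    "sample": ["control_1", "control_1", "treated_1", "treated_1"],
    "tpm": [1.0, 2.0, 8.0, 0.1],
})

# Rows become transcripts, columns become samples, cells hold TPM.
wide = selected.pivot(index="target_id", columns="sample", values="tpm")
print(wide)
```

`pivot` raises an error if a transcript/sample pair occurs more than once, which is a useful sanity check on the combined table.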
![Page 12: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/12.jpg)
Calculate the average TPM across the samples of each treatment group, then filter on it
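A sketch of the averaging and filtering step. The grouping into control and treated columns, the sample names, and the `min_tpm` cutoff are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical wide TPM table: one row per transcript, one column per sample.
wide = pd.DataFrame(
    {"control_1": [1.0, 0.1, 20.0],
     "control_2": [3.0, 0.3, 22.0],
     "treated_1": [9.0, 0.2, 5.0],
     "treated_2": [11.0, 0.0, 7.0]},
    index=pd.Index(["tx1", "tx2", "tx3"], name="target_id"),
)

control_cols = ["control_1", "control_2"]
treated_cols = ["treated_1", "treated_2"]

# Average TPM within each group.
wide["control_mean"] = wide[control_cols].mean(axis=1)
wide["treated_mean"] = wide[treated_cols].mean(axis=1)

# Keep transcripts that are expressed in at least one group.
min_tpm = 1.0  # hypothetical cutoff
expressed = wide[(wide["control_mean"] > min_tpm) | (wide["treated_mean"] > min_tpm)]
print(expressed)
```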
![Page 13: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/13.jpg)
Draw Clustermap
Use the ‘seaborn’ package
If it is not installed, run on the command line: conda install seaborn
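A hedged sketch of drawing the clustermap. The data are random placeholders, the log2 transform and the `RdBu_r` colour map are assumptions, and `draw_clustermap` is a hypothetical helper name. seaborn is imported inside the helper so that the data preparation runs even where seaborn is not installed yet.

```python
import numpy as np
import pandas as pd

# Placeholder TPM matrix: 6 transcripts x 4 samples (random, for illustration).
rng = np.random.default_rng(0)
tpm = pd.DataFrame(
    rng.gamma(2.0, 10.0, size=(6, 4)),
    index=[f"tx{i}" for i in range(1, 7)],
    columns=["control_1", "control_2", "treated_1", "treated_2"],
)

# Log-transform so that a few highly expressed transcripts do not dominate.
log_tpm = np.log2(tpm + 1)


def draw_clustermap(data, path="clustermap.png"):
    """Cluster rows and columns and save the heatmap to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import seaborn as sns

    # z_score=0 standardises each row; RdBu_r maps high values to red, low to blue.
    grid = sns.clustermap(data, z_score=0, cmap="RdBu_r")
    grid.savefig(path)
    return grid
```

Calling `draw_clustermap(log_tpm)` writes `clustermap.png`; in a notebook the figure is shown inline.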
![Page 14: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/14.jpg)
Clustermap
Red : overexpressed gene
Blue : downregulated gene
![Page 15: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/15.jpg)
Zoom in on these regions
![Page 16: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/16.jpg)
Find the gene names corresponding to the regions above
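To recover the names behind a region of the heatmap, the rows have to be read in clustered order rather than table order. The sketch below reproduces the row clustering with SciPy (average linkage on Euclidean distance, which are seaborn's defaults) on made-up data. With a seaborn clustermap object the same ordering should be available as `grid.dendrogram_row.reordered_ind`, stated here as an assumption rather than checked against the lecture's notebook.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage

# Made-up expression matrix with two obvious groups of transcripts.
data = pd.DataFrame(
    [[1, 1, 9, 9],
     [9, 9, 1, 1],
     [1, 2, 9, 8],
     [8, 9, 2, 1]],
    index=["txA", "txC", "txB", "txD"],
    columns=["control_1", "control_2", "treated_1", "treated_2"],
)

# Hierarchical clustering of the rows.
row_linkage = linkage(data.values, method="average", metric="euclidean")

# Leaf order = order of the rows as drawn in the heatmap.
ordered_ids = list(data.index[leaves_list(row_linkage)])

# Names in the first two rows of the drawn heatmap (one cluster here).
upper_region = ordered_ids[:2]
print(ordered_ids, upper_region)
```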
![Page 17: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/17.jpg)
Application of NGS technology
- DNA : Genome Sequencing
• Genome Sequence
• Personal Genomic Sequencing : Variant Discovery
- RNA : RNA-Seq
• Expression levels of mRNA
- Anything Else?
![Page 18: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/18.jpg)
Application of NGS technology
- Epigenetic States of the Cell
• DNA methylation
• Histone methylation
- Transcription Factor Binding : ChIP Sequencing
- Chromatin Status
- RIP-Seq : RNA-Protein Interactions
![Page 19: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/19.jpg)
Application of NGS on Epigenetics
Epigenetics : changes in gene expression without sequence changes
During the development of an organism, cells go through various differentiation stages
Although they share the same DNA, they have different expression patterns
How are these different expression patterns determined?
![Page 20: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/20.jpg)
Two Factors in Epigenetics
DNA Methylation
Histone Modification
![Page 21: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/21.jpg)
![Page 22: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/22.jpg)
Histone Modification
![Page 23: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/23.jpg)
![Page 24: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/24.jpg)
NGS and Epigenetics
- How can we detect DNA methylation or histone marks?
- DNA methylation : Bisulfite Sequencing
- Histone Mark : Chromatin Immunoprecipitation Sequencing (ChIP-Seq)
* How can we tell which C is methylated?
![Page 25: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/25.jpg)
Bisulfite Sequencing
• Bisulfite treatment of DNA converts cytosine to uracil (read as T)
• Methylated cytosine is resistant to bisulfite treatment
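The two rules above can be mimicked on a string. In this toy sketch `bisulfite_convert`, the example sequence, and the methylated positions are all hypothetical; it only shows what a sequencer would read after treatment.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment.

    Unmethylated C is read as T; methylated C (0-based positions given
    in `methylated_positions`) is protected and stays C.
    """
    converted = []
    for position, base in enumerate(seq):
        if base == "C" and position not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


dna = "ACGTCCGA"
methylated = {1}  # only the C at position 1 is methylated
converted = bisulfite_convert(dna, methylated)
print(converted)
```

Comparing `converted` with the original sequence, every C that survived marks a methylated position.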
![Page 26: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/26.jpg)
Genome Wide BS-Seq
![Page 27: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/27.jpg)
Analysis of Bisulfite Sequencing
1. Reference sequence
CGGGCGTGGTGGCGCGCGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGGAGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGG
2. C-to-T conversion, except at CG
3. Converted sequence
CGGGCGTGGTGGCGCGCGTTTGTAATTTTAGTTATTCGGGAGGTTGAGGTAGGAGAATCGTTTGAATTCGGGAGGCGGAGGTTGTAGTGAGTCGAGATCGCGTTATTGTATTTTAGTTTGG
4. Align the sequencing results to the converted sequence
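The conversion rule in this slide (every C becomes T unless it is followed by G) can be sketched with a regular expression, followed by a toy methylation call at each CG site. The reference is the first 30 bases of the sequence on the slide; the read is hypothetical and assumed to align without gaps at offset 0. `convert_reference` and `call_cpg_methylation` are illustrative names, not the API of any real bisulfite aligner.

```python
import re


def convert_reference(seq):
    """In-silico conversion: C -> T everywhere except in a CG dinucleotide."""
    return re.sub(r"C(?!G)", "T", seq)


def call_cpg_methylation(reference, read):
    """For each CG in the reference, classify the aligned read base."""
    calls = {}
    for match in re.finditer("CG", reference):
        position = match.start()
        base = read[position]
        if base == "C":
            calls[position] = "methylated"      # C was protected
        elif base == "T":
            calls[position] = "unmethylated"    # C was converted
        else:
            calls[position] = "unknown"         # mismatch or sequencing error
    return calls


reference = "CGGGCGTGGTGGCGCGCGCCTGTAATCCCA"  # first 30 bases from the slide
converted = convert_reference(reference)

# Hypothetical bisulfite read, already aligned to the reference.
read = "CGGGTGTGGTGGCGTGCGTTTGTAATTTTA"
calls = call_cpg_methylation(reference, read)
print(converted)
print(calls)
```

In this example the CG sites at positions 4 and 14 read as T and are called unmethylated, while the others remain C.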
![Page 28: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/28.jpg)
Analysis of a small number of sequences
http://services.ibc.uni-stuttgart.de/BDPC/index.php
![Page 29: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/29.jpg)
http://services.ibc.uni-stuttgart.de/BDPC/BISMA/examples_unique.php
![Page 30: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/30.jpg)
<- Sequencing data
<- Reference data
![Page 31: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/31.jpg)
![Page 32: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/32.jpg)
![Page 33: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/33.jpg)
![Page 34: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/34.jpg)
DNA Methylation in Genome Browser
![Page 35: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/35.jpg)
![Page 36: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/36.jpg)
- Histone Mark : Chromatin Immunoprecipitation Sequencing (ChIP-Seq)
Histone Methylation
Histone Acetylation
![Page 37: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/37.jpg)
-> Align to the reference genome
This shows which histone marks are present in which regions of the genome
![Page 38: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/38.jpg)
Histone Mark
![Page 39: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/39.jpg)
![Page 40: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/40.jpg)
After Sequencing
Quality Control
Align to reference Genome
Analysis of alignment file (Finding Peak)
Motif Discovery / Secondary Analysis
![Page 41: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/41.jpg)
ChIP results in Genome Browser
![Page 42: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/42.jpg)
H3K4me3 : Mark for active promoters
H3K27ac : Mark for active enhancers and promoters
Transcription Start
![Page 43: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/43.jpg)
H3K27me3 : Inactive chromatin
![Page 44: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/44.jpg)
ChIP with other factors
Transcription Factors
![Page 45: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/45.jpg)
“Yamanaka Factors”
- Oct4, Sox2, Klf4, c-Myc (OSKM)
- Transcription factors that are abundantly expressed in embryonic stem cells (ESC)
- Screened from 24 transcription factors expressed in ESC
- Retroviral expression of these 4 genes in embryonic/adult fibroblasts transforms the cells into ‘stem cell-like’ cells
![Page 46: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/46.jpg)
iPSC (induced Pluripotent Stem Cell)
![Page 47: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/47.jpg)
Molecular event of induced pluripotency
![Page 48: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/48.jpg)
Questions
How do we know which DNA a specific transcription factor binds?
Electrophoretic Mobility Shift Assay (EMSA)
Binding of a protein to DNA slows the migration of the DNA in a gel
Label the DNA with a radioisotope
Drawbacks : Low throughput; it cannot be applied at the genome-wide level
![Page 49: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/49.jpg)
![Page 50: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/50.jpg)
Genome Sequence
Target Site of Transcription Factor
Chromatin Immunoprecipitation
Sequencing
Genome Sequence
Transcription factor binding sites are identified by read depth (how many sequencing reads pile up at a given position?)
Sequence Mapping
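The read-depth idea can be illustrated with a toy pile-up: count how many mapped reads cover each position, then report stretches where the depth reaches a threshold. The read positions, read length, genome length, and `min_depth` are made up, and the fixed threshold is a deliberate simplification; real peak callers model the background statistically, usually against a control sample.

```python
import numpy as np


def read_depth(read_starts, read_length, genome_length):
    """Count how many reads cover each position of a toy genome."""
    depth = np.zeros(genome_length, dtype=int)
    for start in read_starts:
        depth[start:start + read_length] += 1
    return depth


def call_peaks(depth, min_depth):
    """Return (start, end) intervals, end exclusive, where depth >= min_depth."""
    peaks = []
    start = None
    for position, value in enumerate(depth):
        if value >= min_depth and start is None:
            start = position                    # a peak opens
        elif value < min_depth and start is not None:
            peaks.append((start, position))     # the peak closes
            start = None
    if start is not None:                       # peak runs to the end
        peaks.append((start, len(depth)))
    return peaks


# Hypothetical mapped read start positions: a pile-up around 20-28.
read_starts = [2, 20, 21, 22, 23, 24, 40]
depth = read_depth(read_starts, read_length=5, genome_length=50)
peaks = call_peaks(depth, min_depth=3)
print(peaks)
```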
![Page 51: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/51.jpg)
Transcription Factor
Gene Expressed by Estrogen Stimulations
Transcription Factor Binding
Transcription
![Page 52: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/52.jpg)
![Page 53: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/53.jpg)
Chromatin Status
“Identifying which regions are tightly packed and which are not”
![Page 54: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/54.jpg)
![Page 55: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/55.jpg)
![Page 56: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/56.jpg)
RIP-Seq
ChIP-Seq : Finds the DNA regions bound by a specific protein
Then, how about RNA?
How can we find the RNAs bound to a specific protein?
RIP-Seq : RNA Immunoprecipitation Sequencing
http://rbpdb.ccbr.utoronto.ca//
Higher organisms have about 200-400 RNA-binding proteins
![Page 57: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/57.jpg)
![Page 58: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/58.jpg)
http://cistrome.org/dc
![Page 59: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/59.jpg)
![Page 60: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/60.jpg)
![Page 61: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/61.jpg)
![Page 62: 생물학 연구를 위한 컴퓨터 활용기술 11강](https://reader036.fdocuments.net/reader036/viewer/2022062400/58a1d85c1a28abb6678b58b1/html5/thumbnails/62.jpg)