BNFO 602 Lecture 1
![Page 1: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/1.jpg)
BNFO 602 Lecture 1
Usman Roshan
![Page 2: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/2.jpg)
Bio background
• DNA
• Transcription and translation
• Proteins: folding and structure
• SNPs
• SNP genotyping, sequencing
![Page 3: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/3.jpg)
Representing DNA in a format manipulatable by computers
• DNA is a double-helix molecule built from nucleotides that carry one of four bases:
– Adenine (A)
– Cytosine (C)
– Thymine (T)
– Guanine (G)
• Since A (adenine) always pairs with T (thymine) and C (cytosine) always pairs with G (guanine), knowing only one side of the ladder is enough to reconstruct the other.
• We represent DNA as a sequence of letters where each letter is A, C, G, or T.
• For example, we would represent the helix shown here as CAGT.
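The pairing rule above is what makes a single strand sufficient: the other strand can be computed from it. A minimal sketch in Python, where the function names `complement` and `reverse_complement` are illustrative choices rather than anything from the lecture:

```python
# Base-pairing rule from the slide: A pairs with T, C pairs with G.
PAIRING = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Return the base that pairs with each position of the given strand."""
    return strand.upper().translate(PAIRING)

def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own direction (reversed complement)."""
    return complement(strand)[::-1]

print(complement("CAGT"))          # GTCA
print(reverse_complement("CAGT"))  # ACTG
```

Applying `complement` twice gives back the original string, which is another way of saying that one side of the ladder determines the other.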
![Page 4: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/4.jpg)
Transcription and translation
![Page 5: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/5.jpg)
Amino acids
Proteins are chains of amino acids. There are twenty different amino acids that chain in different ways to form different proteins.
For example, FLLVALCCRFGH (this is how we could store it in a file).
This sequence of amino acids folds to form a 3-D structure.
![Page 6: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/6.jpg)
Protein folding
![Page 7: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/7.jpg)
Protein folding
• The protein folding problem is to determine the 3-D protein structure from the sequence.
• Experimental techniques are very expensive.
• Computational methods are cheap, but the problem is difficult to solve.
• By comparing sequences we can deduce the evolutionarily conserved portions, which are also functional (most of the time).
![Page 8: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/8.jpg)
Protein structure
• Primary structure: sequence of amino acids.
• Secondary structure: parts of the chain organize themselves into alpha helices, beta sheets, and coils. Helices and sheets are usually evolutionarily conserved and can aid sequence alignment.
• Tertiary structure: 3-D structure of the entire chain.
• Quaternary structure: complex of several chains.
![Page 9: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/9.jpg)
Key points
• DNA can be represented as strings over four letters: A, C, G, and T. These strings can be very long, e.g. thousands or even millions of letters.
• Proteins are also represented as strings, over an alphabet of 20 letters (each letter is an amino acid). Their 3-D structure determines their function to a large extent.
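Both key points amount to saying that a sequence is a string over a fixed alphabet. A small sketch of that idea, assuming the standard one-letter codes for the 20 amino acids; the helper name `is_valid` is hypothetical:

```python
# The two alphabets: 4 DNA bases and the 20 standard amino-acid letters.
DNA_ALPHABET = set("ACGT")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")

def is_valid(sequence: str, alphabet: set) -> bool:
    """True if the sequence is non-empty and uses only letters from the alphabet."""
    return len(sequence) > 0 and set(sequence.upper()) <= alphabet

print(is_valid("CAGT", DNA_ALPHABET))              # True
print(is_valid("FLLVALCCRFGH", PROTEIN_ALPHABET))  # True
print(is_valid("FLLVALCCRFGH", DNA_ALPHABET))      # False
```

Note that every DNA letter is also an amino-acid letter, so a string such as CAGT passes both checks; the alphabet alone cannot tell the two kinds of sequence apart.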
![Page 10: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/10.jpg)
SNPs
• DNA sequence variations that occur when a single nucleotide is altered.
• Must be present in at least 1% of the population to be a SNP.
• Occur every 100 to 300 bases along the 3 billion-base human genome.
• Many have no effect on cell function but some could affect disease risk and drug response.
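The 1% rule and the spacing figure above can both be turned into small calculations. The sketch below assumes the frequency is measured over sampled chromosomes and uses made-up allele counts; `minor_allele_frequency` and `is_snp` are illustrative names, not part of the lecture:

```python
from collections import Counter

SNP_THRESHOLD = 0.01  # the 1% cutoff stated on the slide

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele among the sampled chromosomes."""
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return 0.0  # no variation observed at this position
    return counts[1][1] / len(alleles)

def is_snp(alleles):
    """Apply the 1% rule to one position."""
    return minor_allele_frequency(alleles) >= SNP_THRESHOLD

# Hypothetical samples of 1000 chromosomes each.
common_variant = ["A"] * 985 + ["G"] * 15  # G on 1.5% of chromosomes
rare_variant = ["A"] * 995 + ["G"] * 5     # G on 0.5% of chromosomes
print(is_snp(common_variant))  # True
print(is_snp(rare_variant))    # False

# One SNP every 100 to 300 bases over a 3-billion-base genome:
GENOME_LENGTH = 3_000_000_000
print(GENOME_LENGTH // 300, GENOME_LENGTH // 100)  # 10000000 30000000
```

The last line gives the rough range implied by the slide's numbers: on the order of 10 to 30 million SNPs across the genome.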
![Page 11: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/11.jpg)
Toy example
![Page 12: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/12.jpg)
SNPs on the chromosome
[Figure labels: SNP, Chromosome, Gene]
![Page 13: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/13.jpg)
Bi-allelic SNPs
• Most SNPs have one of two nucleotides at a given position
• For example:
– A/G denotes the varying nucleotide as either A or G. We call each of these an allele.
– Most SNPs have two alleles (bi-allelic).
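To make the A/G notation concrete, the sketch below scans a handful of aligned sequences and reports each position where more than one nucleotide appears. The sequences are invented for illustration, `variable_sites` is a hypothetical helper, and the sketch deliberately ignores the 1% population-frequency requirement:

```python
def variable_sites(sequences):
    """Map each varying position (0-based) to its alleles, written like 'A/G'."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    # zip(*sequences) walks the alignment column by column.
    for position, column in enumerate(zip(*sequences)):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            sites[position] = "/".join(alleles)
    return sites

# Made-up aligned sequences from four chromosomes.
samples = ["CAGTA", "CAGTA", "CGGTA", "CGGTC"]
print(variable_sites(samples))  # {1: 'A/G', 4: 'A/C'}
```

Both varying positions here show exactly two nucleotides, so both would be called bi-allelic.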
![Page 14: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/14.jpg)
SNP genotype
• We inherit two copies of each chromosome (one from each parent)
• For a given SNP the genotype defines the type of alleles we carry
• Example: for the SNP A/G one’s genotype may be
– AA if both copies of the chromosome have A
– GG if both copies of the chromosome have G
– AG or GA if one copy has A and the other has G
– The first two cases are called homozygous and the latter two are heterozygous
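The genotype cases above reduce to checking whether the two inherited alleles match. A minimal sketch, assuming a genotype is stored as a two-letter string; `classify_genotype` is an illustrative name:

```python
def classify_genotype(genotype: str, alleles=("A", "G")) -> str:
    """Label a two-letter genotype for a bi-allelic SNP such as A/G."""
    genotype = genotype.upper()
    if len(genotype) != 2 or any(allele not in alleles for allele in genotype):
        raise ValueError(f"expected two alleles from {alleles}, got {genotype!r}")
    first, second = genotype
    # Same allele on both chromosome copies means homozygous.
    return "homozygous" if first == second else "heterozygous"

for genotype in ["AA", "GG", "AG", "GA"]:
    print(genotype, classify_genotype(genotype))
```

AG and GA get the same label because the classification depends only on whether the two copies differ, not on which copy carries which allele.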
![Page 15: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/15.jpg)
SNP genotyping
![Page 16: BNFO 602 Lecture 1](https://reader036.fdocuments.net/reader036/viewer/2022081513/5681517b550346895dbfb617/html5/thumbnails/16.jpg)
Real SNPs
• SNP consortium: snp.cshl.org
• SNPedia: www.snpedia.com