9/10/2014 BCHB524 - 2014 - Edwards
Introduction to Python
BCHB524 2014
Lecture 5
Outline
- Review, Homework #2
- DNA as a string
  - Extracting codons in DNA
  - Counting in-frame codons in DNA
  - Reverse Complement
- Program Input/Output
  - raw_input, command-line arguments
  - standard-input, standard-output, redirection
Review
- Printing and execution
- Variables and basic data-types:
  - integers, floats, strings
  - Arithmetic with, conversion between
  - String characters and chunks, string methods
- Functions, using/calling and defining:
  - Use in any expression
  - Parameters as input, return for output
- Control Flow:
  - if statements – conditional execution
  - for statements – iterative execution
DNA as a string

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
seqlen = len(seq)

# set i to 0, 3, 6, 9, ..., 36
for i in range(0,seqlen,3):
    # extract the codon as a string
    codon = seq[i:i+3]
    print codon

print "Number of Met. amino-acids", seq.count("atg")
DNA as a string
- What about upper and lower case?
  - ATG vs atg?
- Differences between DNA and RNA sequence?
  - Substitute U for each T?
- How about ambiguous nucleotide symbols?
  - What should we do with ‘N’ and other ambiguity codes (R, Y, W, S, M, K, H, B, V, D)?
- Strings don’t know any biology!
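One possible way to act on these questions is sketched below, in Python 3 syntax (the slides use Python 2). The function names normalize and ambiguous_positions are made up for this illustration, and treating RNA as DNA by replacing U with T is only one of several reasonable choices.

```python
def normalize(seq):
    # Upper-case so that 'atg' and 'ATG' compare equal.
    seq = seq.upper()
    # Treat RNA as DNA by writing T for each U (an assumption of this sketch).
    seq = seq.replace('U', 'T')
    return seq

def ambiguous_positions(seq):
    # Collect the positions whose symbol is not one of A, C, G, T.
    positions = []
    for i in range(len(seq)):
        if seq[i] not in 'ACGT':
            positions.append(i)
    return positions

seq = normalize("augcnnrgta")
print(seq)
print(ambiguous_positions(seq))
```

The string methods do the mechanical work, but deciding what an N or an R should mean is still up to the programmer.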
DNA as a string

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"

def inFrameMet(seq):
    seqlen = len(seq)
    count = 0
    # step through the sequence one codon (3 nucleotides) at a time
    for i in range(0,seqlen,3):
        codon = seq[i:i+3]
        if codon.upper() == "ATG":
            count = count + 1
    return count

print "Number of Met. amino-acids", inFrameMet(seq)
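To see why the in-frame function differs from seq.count("atg"), the two can be compared on the lecture's sequence. This is a small sketch in Python 3 syntax (the slides use Python 2), and in_frame_met is simply a renamed copy of the slide's function. Here the only "atg" starts at offset 2, so it is not a codon in frame 0.

```python
def in_frame_met(seq):
    count = 0
    # Only look at codons starting at offsets 0, 3, 6, ...
    for i in range(0, len(seq), 3):
        if seq[i:i+3].upper() == "ATG":
            count = count + 1
    return count

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
print(seq.count("atg"))    # matches "atg" at any offset
print(seq.find("atg"))     # offset of the first match
print(in_frame_met(seq))   # matches only whole in-frame codons
```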
DNA as a string

input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    comp = complements[i]
    return comp

def reverseComplement(seq):
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
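One detail of this first version is worth checking by hand: find returns -1 when the symbol is not in 'ACGT', and index -1 selects the last character of 'TGCA'. The sketch below (Python 3 syntax; naive_complement is a name chosen here for illustration) shows that lower-case and ambiguous symbols all come back as 'A'.

```python
def naive_complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)   # -1 when nuc is not one of A, C, G, T
    return complements[i]       # index -1 silently picks the last character, 'A'

for nuc in ['A', 'a', 'c', 'N']:
    print(nuc, '->', naive_complement(nuc))
```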
DNA as a string

input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        comp = complements[i]
    else:
        # unknown symbol: leave it unchanged
        comp = nuc
    return comp

def reverseComplement(seq):
    seq = seq.upper()
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
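A quick sanity check for a function like this is that applying it twice returns the original sequence (in upper case). The sketch below restates the slide's two functions in Python 3 syntax, with reverse_complement as an illustrative rename, and runs that check on the lecture's sequence.

```python
def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        return complements[i]
    return nuc                  # leave unknown symbols (e.g. N) unchanged

def reverse_complement(seq):
    newseq = ""
    for nuc in seq.upper():
        # Prepending each complement reverses the order as we go.
        newseq = complement(nuc) + newseq
    return newseq

seq = "catgacgttattacgactctgtgtggcgtctgctgggg"
rc = reverse_complement(seq)
print(rc)
print(reverse_complement(rc) == seq.upper())
```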
Creating reusable programs
- Need to get input data and options from the user
  - …often us, but sometimes others, or us later.
- Sometimes, want completely new inputs
  - …but often, want the same or similar input.
- Sometimes, typing the input is OK
  - …but often, want to use data in a file.
- Sometimes, output to the screen is OK
  - …but often, want the result to go into a file.
Interactive input
input_seq = raw_input("Type your codon: ")

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        comp = complements[i]
    else:
        comp = nuc
    return comp

def reverseComplement(seq):
    seq = seq.upper()
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
Command-line input
import sys
input_seq = sys.argv[1]

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        comp = complements[i]
    else:
        comp = nuc
    return comp

def reverseComplement(seq):
    seq = seq.upper()
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
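If the program is run without an argument, sys.argv[1] does not exist and Python raises an IndexError. One way to guard against that is sketched below in Python 3 syntax. The helper get_sequence is hypothetical, and the lists passed to it stand in for what sys.argv would hold in a real run.

```python
def get_sequence(argv):
    # argv[0] is the script name; the first user-supplied value is argv[1].
    if len(argv) < 2:
        return None
    return argv[1]

# Stand-ins for sys.argv with and without a sequence on the command line.
print(get_sequence(["rc.py", "ATGC"]))
print(get_sequence(["rc.py"]))
```

In an actual script the call would be get_sequence(sys.argv), followed by a usage message when the result is None.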
Interactive and file input
import sys
input_seq = sys.stdin.read()

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        comp = complements[i]
    else:
        comp = nuc
    return comp

def reverseComplement(seq):
    seq = seq.upper()
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
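Text read with sys.stdin.read() usually ends with a newline, and because complement leaves unknown symbols unchanged, that newline ends up at the front of the reversed result. The sketch below (Python 3 syntax) uses io.StringIO as a stand-in for sys.stdin to show the effect, and strip() as one possible fix. The function names are illustrative renames of the slide's functions.

```python
import io

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        return complements[i]
    return nuc

def reverse_complement(seq):
    newseq = ""
    for nuc in seq.upper():
        newseq = complement(nuc) + newseq
    return newseq

stream = io.StringIO("ATGC\n")    # stands in for sys.stdin
raw = stream.read()
# repr makes the newline visible in the printed result.
print(repr(reverse_complement(raw)))
print(repr(reverse_complement(raw.strip())))
```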
File input only
import sys
seq_file = sys.argv[1]

# MAGIC: open file, read contents, and remove whitespace
input_seq = ''.join(open(seq_file).read().split())

def complement(nuc):
    nucleotides = 'ACGT'
    complements = 'TGCA'
    i = nucleotides.find(nuc)
    if i >= 0:
        comp = complements[i]
    else:
        comp = nuc
    return comp

def reverseComplement(seq):
    seq = seq.upper()
    newseq = ""
    for nuc in seq:
        newseq = complement(nuc) + newseq
    return newseq

print "Reverse complement:", reverseComplement(input_seq)
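The "MAGIC" line can be unpacked into separate steps. The sketch below (Python 3 syntax) is one way to write it, using a hypothetical helper read_sequence_file and a temporary file created only for the demonstration.

```python
import os
import tempfile

def read_sequence_file(filename):
    with open(filename) as handle:
        text = handle.read()      # the whole file as one string
    pieces = text.split()         # split on spaces, tabs and newlines
    return ''.join(pieces)        # glue the pieces back together

# Write a small multi-line sequence file to try the function on.
with tempfile.NamedTemporaryFile("w", suffix=".seq", delete=False) as tmp:
    tmp.write("catgacg ttattac\ngactctg\n")
    path = tmp.name

print(read_sequence_file(path))
os.remove(path)
```

Calling split() with no arguments splits on any run of whitespace, which is why line breaks inside the file disappear from the joined sequence.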
Input Summary
- raw_input provides interactive values from the user (also copy-and-paste)
- sys.stdin.read() provides interactive or file-based values from the user (also copy-and-paste)
- sys.argv[1] provides command-line values from the user (also copy-and-paste)
  - value can be a filename that provides user-input
- Terminal standard-input redirection "<" can be used to send a file's contents to raw_input or sys.stdin.read()
Output is easy…
- Just use print, right?
- Print statements go to the terminal's standard-output.
  - We can redirect to a file using ">"
  - Errors still get printed to the terminal.
- We can also link programs together – standard-output to standard-input using "|"
  - Also, cat just writes its file to standard out
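Errors stay on the terminal because they are written to a separate stream, standard-error, which ">" and "|" leave alone. A program can use the same split for its own messages. The sketch below (Python 3 syntax) uses a hypothetical function report to send results to sys.stdout and complaints to sys.stderr.

```python
import sys

def report(seq):
    if not seq:
        # Diagnostics go to standard-error, which ">" does not redirect.
        sys.stderr.write("error: empty sequence\n")
        return False
    # Results go to standard-output, which ">" and "|" do capture.
    sys.stdout.write(seq.upper() + "\n")
    return True

report("atgc")
report("")
```

Run with "> out.txt", only the upper-cased sequence should land in the file, while the error message still appears on the screen.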
Connect reverse complement w/ codon counting…
- Create and test rc.py from earlier slides:
  - Sequence from standard-input
  - Reverse complement sequence to standard-output
- Create and test codons.py from earlier slides:
  - Sequence from standard-input
  - Count to standard-output
- Place example sequence in file: test.seq
- Execute:
  cat test.seq | python rc.py | python codons.py
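A possible shape for codons.py is sketched below in Python 3 syntax. It is not the course's reference solution, and the names in_frame_met and main are chosen for illustration. Note that for the pipeline to work, rc.py would need to print only the sequence itself, without a label such as "Reverse complement:", since everything it prints becomes the input of codons.py.

```python
import io
import sys

def in_frame_met(seq):
    count = 0
    for i in range(0, len(seq), 3):
        if seq[i:i+3].upper() == "ATG":
            count += 1
    return count

def main(instream, outstream):
    # Read everything from the input stream and drop all whitespace.
    seq = ''.join(instream.read().split())
    outstream.write("Number of Met. amino-acids %d\n" % in_frame_met(seq))

# In the real script this call would be main(sys.stdin, sys.stdout);
# io.StringIO stands in for piped input here.
main(io.StringIO("ATGAAAATG\n"), sys.stdout)
```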
In general
- Windows and OS X have similar facilities
  - cmd in Windows, Terminal in OS X
- Powerful mechanism for making reusable programs
  - No knowledge of Python required for use!
- Most bioinformatics software is used from the command-line w/ command-line arguments:
  - Files provide sequence data, etc.
- I'll promote this style of program I/O.
Exercise 1
- Use UniSTS (“google UniSTS”) to look up PCR markers for your favorite gene
- Write a command-line program to compute the reverse complement sequence for the forward and reverse primer.
Exercise 2
- Write a command-line program to test whether a PCR primer is a reverse complement palindrome.
  - Such a primer might fold and self-hybridize!
- Test your program on the following primers:
  - TTGAGTAGACGCGTCTACTCAA
  - TTGAGTAGACGTCGTCTACTCAA
  - ATATATATATATATAT
  - ATCTATATATATGTAT
Homework 3
- Due Monday, September 15th.
- Submit using Blackboard
- Use only the techniques introduced so far.
- Make sure you can run the programs demonstrated in lecture(s).
- Exercises 4.1, 4.2 from Lecture 4
- Exercises 5.1, 5.2 from Lecture 5
- Rosalind exercises 6, 7