Uploaded by sierra-horton
Primitives
Simple Values
• logical (Boolean)
• integer
• float
• string
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'
– "AARHQGRGAPCGESFWHWALGADGGHGHAQP"
– """MKQLNFYKKN SLNNVQEVFS YFMETMIST"""
– '''MWNSNLPKPN AIYVYGVANA NITFFKGSDI'''
• None
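As a quick illustration of the value types listed above, the following sketch binds one literal of each kind to a name and prints its type. The names (flag, count, ratio, and so on) are made up for this example and do not come from the slides.

```python
# One literal of each simple type, plus None.
flag = True                                      # logical (Boolean)
count = 42                                       # integer
ratio = 0.25                                     # float
peptide = 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'       # string in single quotes
same_peptide = "MNKMDLVADVAEKTDLSKAKATEVIDAVFA"  # same string in double quotes
# Triple-quoted strings may span several lines.
multiline = """MKQLNFYKKN
SLNNVQEVFS"""
nothing = None

for value in (flag, count, ratio, peptide, multiline, nothing):
    print(type(value).__name__, repr(value))
```

Single and double quotes produce the same kind of string; only the triple-quoted forms can contain literal line breaks.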
Expressions
• Numeric Operators
• Logical Operations
– '' and "A"
– 0 and 1 or 2
– 'yes' if 2 - 1 else 'no'
• String Operations
– 'TATA' in 'TATATATATATATATATATATATA'
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[0]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[1:4]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[:8]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[9:]
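The expressions above can be evaluated directly. The sketch below binds the peptide string to a name, seq, purely for brevity (that name is an assumption of this example), and shows the value each expression produces.

```python
seq = 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'

# Logical operators return one of their operands, not necessarily True or False.
print(repr('' and "A"))          # '' (the empty string is false, so `and` stops there)
print(0 and 1 or 2)              # 2
print('yes' if 2 - 1 else 'no')  # yes

# String operations: membership, indexing, and slicing.
print('TATA' in 'TATATATATATATATATATATATA')  # True
print(seq[0])     # M
print(seq[1:4])   # NKM
print(seq[:8])    # MNKMDLVA
print(seq[9:])    # VAEKTDLSKAKATEVIDAVFA
```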
String indexes counted from the left (top row) and from the right (bottom row):
 0  1  2  3  4
 T  A  A  C  G
-5 -4 -3 -2 -1
• String Operations
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[9:-1]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[:]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[16:0:-4]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[16::-4]
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'[::-1]
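To make the index diagram and the step slices concrete, here is a small sketch using the five-character string from the diagram. The names dna and seq are chosen for this example only.

```python
dna = 'TAACG'

# Each position has a non-negative and a negative index.
print(dna[0], dna[-5])   # T T
print(dna[4], dna[-1])   # G G

# Slices stop just before the end index; a negative step walks backwards.
print(dna[1:-1])    # AAC
print(dna[3:0:-1])  # CAA  (index 0 is excluded)
print(dna[3::-1])   # CAAT (runs through index 0)
print(dna[::-1])    # GCAAT

seq = 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'
print(seq[16:0:-4])  # SKDD
print(seq[16::-4])   # SKDDM
```

The only difference between [16:0:-4] and [16::-4] is whether index 0 can be reached: an explicit end index is always excluded.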
Calls
• Function calls
– len('TATA')
– print('AAT', 'AAC', 'AAG', 'AAA')
• Method calls
– 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'.count('DL')
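A brief sketch of the two call forms: a function call passes its arguments in parentheses after the function name, while a method call is written after the object it operates on. The names seq, length, and dl_count are assumptions of this example.

```python
seq = 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'

# Function calls: name(arguments)
length = len('TATA')
print(length)                       # 4
print('AAT', 'AAC', 'AAG', 'AAA')   # AAT AAC AAG AAA

# Method call: object.name(arguments)
dl_count = seq.count('DL')
print(dl_count)                     # 2
```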
Running Python interactively
• Ctrl-A
– Go to the beginning of the line.
• Ctrl-E
– Go to the end of the line.
• Ctrl-B or left arrow
– Move one character to the left.
• Ctrl-F or right arrow
– Move one character to the right.
• Backspace
– Delete the preceding character.
• Ctrl-D
– Delete the next character.
• Ctrl-K
– Delete the rest of the line after the cursor.
• Ctrl-Y
– “Yank” the last killed text into the line at the location of the cursor.
• Ctrl-_ (underscore)
– Undo; can be repeated.
• Ctrl-R
– Search incrementally for a preceding input line.
• Ctrl-S
– Search incrementally for a subsequent input line.
Tracebacks
• NameError: name 'Non' is not defined
– Python doesn’t recognize a name (more on this in the next chapter).
• IndexError: string index out of range
– For a string of length N, an index (i.e., the value between square brackets) must be in the range -N <= index < N.
• SyntaxError
– Python syntax violation.
• ZeroDivisionError
– /, //, or % with 0 as the second operand.
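Each of these errors can be provoked on purpose. The sketch below uses a hypothetical helper, error_name (not part of the slides), that evaluates an expression given as a string and reports which exception, if any, it raises.

```python
def error_name(expression):
    """Evaluate the expression string; return the raised exception's name, or None."""
    try:
        eval(expression)
    except Exception as error:
        return type(error).__name__
    return None

print(error_name('Non'))           # NameError
print(error_name("'TAACG'[5]"))    # IndexError (valid indexes are -5 through 4)
print(error_name("'TAACG'[-5]"))   # None
print(error_name('1 +'))           # SyntaxError
print(error_name('1 % 0'))         # ZeroDivisionError
```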
Names, Functions, and Modules
Names bound to objects
Names in different namespaces bound to objects
Assigning Names
Defining Functions
Function calls
Function returns
Do-nothing statement
Example
def validate_base_sequence(base_sequence):
    """Return True if the string base_sequence contains only upper- or lowercase T, C, A, and G characters, otherwise False"""
    seq = base_sequence.upper()
    return len(seq) == \
        seq.count('A') + seq.count('G') + \
        seq.count('T') + seq.count('C')
GC content of a given DNA sequence
def gc_content(base_seq):
    """Return the fraction of G and C characters in base_seq"""
    seq = base_seq.upper()
    return (seq.count('G') + seq.count('C')) / len(seq)
Assertion
GC content of a given DNA sequence
def gc_content(base_seq):
    """Return the fraction of G and C characters in base_seq"""
    assert validate_base_sequence(base_seq), \
        'argument has invalid characters'
    seq = base_seq.upper()
    # Count on the uppercased copy so lowercase input is handled too.
    return (seq.count('G') + seq.count('C')) / len(seq)
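To see the assertion at work, the sketch below repeats both functions so that it runs on its own, then calls gc_content once with a valid sequence and once with an invalid one. It counts on the uppercased copy, which is an assumption about the intended behavior for lowercase input.

```python
def validate_base_sequence(base_sequence):
    """Return True if base_sequence contains only T, C, A, and G (any case)."""
    seq = base_sequence.upper()
    return len(seq) == (seq.count('A') + seq.count('G') +
                        seq.count('T') + seq.count('C'))

def gc_content(base_seq):
    """Return the fraction of G and C characters in base_seq."""
    assert validate_base_sequence(base_seq), 'argument has invalid characters'
    seq = base_seq.upper()
    return (seq.count('G') + seq.count('C')) / len(seq)

print(gc_content('acgt'))   # 0.5

try:
    gc_content('ACXT')      # X is not a valid base
except AssertionError as error:
    print('AssertionError:', error)
```

Note that assert statements are skipped when Python runs with the -O option, so they suit internal sanity checks better than validation of user input.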
Default Parameter Values
def validate_base_sequence(base_sequence, RNAflag=False):
    """Return True if the string base_sequence contains only upper- or lowercase T (or U, if RNAflag), C, A, and G characters, otherwise False"""
    seq = base_sequence.upper()
    return len(seq) == (seq.count('A') +
                        seq.count('G') +
                        seq.count('U' if RNAflag else 'T') +
                        seq.count('C'))
Using Modules
from random import randint

def random_base(RNAflag=False):
    # randint(0, 3) returns an integer from 0 through 3, inclusive.
    return ('UCAG' if RNAflag else 'TCAG')[randint(0, 3)]

def random_codon(RNAflag=False):
    return random_base(RNAflag) + random_base(RNAflag) + random_base(RNAflag)
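For comparison, here is a sketch of the two common import forms side by side. It swaps randint for random.choice, which is a substitution made for this example and not the approach in the code above; seeding the generator is optional and shown only to make runs repeatable.

```python
import random               # binds the module name; members are used as random.seed, etc.
from random import choice   # binds one function name directly

def random_base(RNAflag=False):
    """Return one randomly chosen base letter."""
    return choice('UCAG' if RNAflag else 'TCAG')

def random_codon(RNAflag=False):
    """Return a string of three randomly chosen bases."""
    return ''.join(random_base(RNAflag) for _ in range(3))

random.seed(0)              # optional: fixes the sequence of random choices
codon = random_codon(RNAflag=True)
print(codon)
```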
Python Files
def validate_base_sequence(base_sequence, RNAflag=False):
    """Return True if the string base_sequence contains only upper- or lowercase T (or U, if RNAflag), C, A, and G characters, otherwise False"""
    seq = base_sequence.upper()
    return len(seq) == (seq.count('U' if RNAflag else 'T') +
                        seq.count('C') +
                        seq.count('A') +
                        seq.count('G'))

def gc_content(base_seq):
    """Return the fraction of bases in base_seq that are C or G"""
    assert validate_base_sequence(base_seq), \
        'argument has invalid characters'
    seq = base_seq.upper()
    # Count on the uppercased copy so lowercase input is handled too.
    return (seq.count('G') + seq.count('C')) / len(seq)

def recognition_site(base_seq, recognition_seq):
    """Return the first position in base_seq where recognition_seq occurs, or -1 if not found"""
    return base_seq.find(recognition_seq)

def test():
    assert validate_base_sequence('ACTG')
    assert validate_base_sequence('')
    assert not validate_base_sequence('ACUG')
    # 'ACUG' contains U, so it is valid only as RNA.
    assert not validate_base_sequence('ACUG', False)
    assert validate_base_sequence('ACUG', True)
    # 'ACTG' contains T, so it is not valid RNA.
    assert not validate_base_sequence('ACTG', True)
    assert .5 == gc_content('ACTG')
    assert 1.0 == gc_content('CCGG')
    assert .25 == gc_content('ACTT')
    print('All tests passed.')

test()
Python Files
• http://shared.tistory.com/79
Collections
compound
• they group together multiple objects
• called collections or containers
• Some collection types can even contain items with a mixture of types, including other collections.
• sets - don’t allow individual access
• sequences - use numerical indexes
• mappings - use keys
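The three kinds of collection differ mainly in how their items are reached. The following sketch shows one of each; the names and the sample values are assumptions made for illustration.

```python
bases = {'T', 'C', 'A', 'G'}                                # set
codon = ('A', 'U', 'G')                                     # sequence (a tuple)
amino_acid_names = {'M': 'methionine', 'W': 'tryptophan'}   # mapping (a dictionary)

# A collection may mix types and hold other collections.
mixed = ['AUG', 3, codon, amino_acid_names]

print(codon[0])                # A (sequences use numerical indexes)
print(amino_acid_names['M'])   # methionine (mappings use keys)
print('A' in bases)            # True (sets support membership tests)

try:
    bases[0]                   # sets offer no access to individual items
except TypeError as error:
    print('TypeError:', error)
```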
Sets
• an unordered collection of items that contains no duplicates
• set('TCAGTTAT') -> {'A', 'C', 'G', 'T'} (the order in which elements are displayed may vary)
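A short sketch of the behavior described above: building a set from a string discards duplicate characters, and the result has no defined order, so the example sorts it before printing. The names dna_bases and rna_bases are illustrative.

```python
dna_bases = set('TCAGTTAT')
print(len(dna_bases))       # 4 (duplicates are dropped)
print(sorted(dna_bases))    # ['A', 'C', 'G', 'T']

rna_bases = set('UCAG')
print(sorted(dna_bases & rna_bases))   # ['A', 'C', 'G'] (intersection)
print(sorted(dna_bases - rna_bases))   # ['T'] (difference)
print('U' in dna_bases)                # False
```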
Sequences
• ordered collections that may contain duplicate elements
Sequences - String
• Testing
– str1.isalpha()
– str1.isalnum()
– str1.isdigit()
– str1.isnumeric()
– str1.isdecimal()
– str1.islower()
– str1.isupper()
– str1.istitle()
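Each testing method returns True or False. A few sample calls follow, with inputs chosen for this example.

```python
print('ACTG'.isalpha())         # True
print('AC-TG'.isalpha())        # False (the hyphen is not a letter)
print('ACTG1'.isalnum())        # True
print('2023'.isdigit())         # True
print('2023'.isdecimal())       # True
print('2023'.isnumeric())       # True
print('½'.isnumeric())          # True (a numeric character that is not a digit)
print('½'.isdigit())            # False
print('gc content'.islower())   # True
print('ACTG1'.isupper())        # True (only cased characters are considered)
print('Gc Content'.istitle())   # True
```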
Sequences - String
• Searching
– str1.startswith(str2[, startpos[, endpos]])
– str1.endswith(str2[, startpos[, endpos]])
– str1.find(str2[, startpos[, endpos]])
– str1.rfind(str2[, startpos[, endpos]])
– str1.index(str2[, startpos[, endpos]])
– str1.rindex(str2[, startpos[, endpos]])
– str1.count(str2[, startpos[, endpos]])
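The searching methods can be sketched on the peptide string used earlier, bound here to the name seq for this example. The main difference to notice is that find and rfind return -1 when nothing is found, while index and rindex raise ValueError.

```python
seq = 'MNKMDLVADVAEKTDLSKAKATEVIDAVFA'

print(seq.startswith('MNK'))   # True
print(seq.endswith('VFA'))     # True
print(seq.find('DL'))          # 4  (first occurrence)
print(seq.rfind('DL'))         # 14 (last occurrence)
print(seq.find('DL', 5))       # 14 (search starts at index 5)
print(seq.find('DL', 5, 10))   # -1 (no match between indexes 5 and 10)
print(seq.count('DL'))         # 2

try:
    seq.index('XX')            # index raises instead of returning -1
except ValueError as error:
    print('ValueError:', error)
```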
Sequences - String
• Replacing
– str1.replace(oldstr, newstr[, count])
– str1.translate(dictionary)
• Changing case
– str1.lower()
– str1.upper()
– str1.capitalize()
– str1.title()
– str1.swapcase()
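A minimal sketch of the replacing and case-changing methods. The dictionary passed to translate is built here with str.maketrans, which is one convenient way to create it; the name complement and the sample strings are assumptions of this example.

```python
# replace returns a new string; the optional count limits how many replacements are made.
print('TAACG'.replace('T', 'U'))        # UAACG
print('TATATA'.replace('TA', 'AT', 2))  # ATATTA

# translate maps characters one-for-one using a translation table.
complement = str.maketrans('TCAG', 'AGTC')
print('TAACG'.translate(complement))    # ATTGC

# Changing case
print('TAACG'.lower())             # taacg
print('taacg'.upper())             # TAACG
print('gc content'.capitalize())   # Gc content
print('gc content'.title())        # Gc Content
print('TaAcG'.swapcase())          # tAaCg
```

All of these methods leave the original string unchanged, since strings are immutable.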
Sequences - String
• Reformatting
– str1.lstrip([chars])
– str1.rstrip([chars])
– str1.strip([chars])
– str1.ljust(width[, fillchar])
– str1.rjust(width[, fillchar])
– fillchar is optional; the default fill character is a space.
– str1.center(width[, fillchar])
– str1.expandtabs([tabsize])
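The reformatting methods are easiest to follow with the results made visible, so this sketch prints each one with repr. The sample strings are made up for the example.

```python
# Stripping: with no argument, whitespace is removed; otherwise the given characters.
print(repr('  TAACG \n'.strip()))       # 'TAACG'
print(repr('xxTAACGxx'.lstrip('x')))    # 'TAACGxx'
print(repr('xxTAACGxx'.rstrip('x')))    # 'xxTAACG'
print(repr('xxTAACGxx'.strip('x')))     # 'TAACG'

# Justifying: pad to the given width with fillchar (a space by default).
print(repr('TAACG'.ljust(8, '-')))      # 'TAACG---'
print(repr('TAACG'.rjust(8)))           # '   TAACG'
print(repr('TAACG'.center(9, '*')))     # '**TAACG**'

# Tabs are expanded to the next multiple of tabsize.
print(repr('A\tC'.expandtabs(4)))       # 'A   C'
```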
Sequences - Tuples
• immutable sequence that can contain any type of element
• ('TCAG', 'UCAG')
• bases = 'TCAG', 'UCAG'
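The two notations above create the same tuple; the parentheses are optional in an assignment. The sketch below also shows unpacking and what happens on an attempt to modify a tuple. The names dna_bases and rna_bases are illustrative.

```python
bases = 'TCAG', 'UCAG'
print(bases == ('TCAG', 'UCAG'))   # True
print(bases[1])                    # UCAG

# Unpacking binds one name per element.
dna_bases, rna_bases = bases
print(dna_bases)                   # TCAG

try:
    bases[0] = 'ACGT'              # tuples cannot be changed in place
except TypeError as error:
    print('TypeError:', error)
```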
Sequences - Lists
• mutable sequence of any kind of element
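In contrast to strings and tuples, a list can be changed in place. A minimal sketch, with a made-up list of codons:

```python
codons = ['AUG', 'UUU']

codons.append('UGG')      # add at the end
codons[1] = 'UUC'         # replace an element by index
codons.insert(0, 'GCU')   # insert at the front
print(codons)             # ['GCU', 'AUG', 'UUC', 'UGG']

del codons[0]             # remove by index
print(codons)             # ['AUG', 'UUC', 'UGG']

# Lists may hold any kind of element, including other lists.
nested = [codons, 3, 'stop']
print(nested[0][2])       # UGG
```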
Mappings
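• mutable collection of key/value pairs, accessed by key rather than by position (Python dictionaries have preserved insertion order since version 3.7)
– also known as associative arrays, lookup tables, or hash tables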
The RNA amino acid translation table
Mappings - Dictionary
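A dictionary is a natural fit for a codon translation table. The sketch below uses only a small excerpt of the standard genetic code, with None marking stop codons; the names RNA_CODON_EXCERPT and translate_rna, and the choice of None for stops, are assumptions of this example and not the table from the slides.

```python
# Excerpt of the standard RNA codon table: codon -> one-letter amino acid code.
RNA_CODON_EXCERPT = {
    'AUG': 'M',                           # methionine (start)
    'UUU': 'F', 'UUC': 'F',               # phenylalanine
    'UUA': 'L', 'UUG': 'L',               # leucine
    'UGG': 'W',                           # tryptophan
    'UAA': None, 'UAG': None, 'UGA': None,  # stop codons
}

def translate_rna(rna_seq, table=RNA_CODON_EXCERPT):
    """Translate rna_seq codon by codon, stopping at the first stop codon."""
    protein = ''
    for start in range(0, len(rna_seq) - 2, 3):
        amino_acid = table[rna_seq[start:start + 3]]   # lookup by key
        if amino_acid is None:
            break
        protein += amino_acid
    return protein

print(RNA_CODON_EXCERPT['UGG'])        # W
print('GGG' in RNA_CODON_EXCERPT)      # False (not in this excerpt)
print(translate_rna('AUGUUUUGGUAA'))   # MFW
```

Looking up a codon that is missing from the excerpt raises KeyError; a full table would hold all 64 codons.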