Outline: General Introduction | Basic Types in Python | Programming | Exercises
Python course in Bioinformatics
Xiaohui Xie
March 31, 2009
Xiaohui Xie Python course in Bioinformatics
General Introduction
Basic Types in Python
Programming
Exercises
Why Python?
- Scripting language for rapid application development
- Minimalistic syntax
- Powerful
- Flexible data structures
- Widely used in bioinformatics and many other domains
Where to get Python and learn more?
- Main source of information: http://docs.python.org/
- Tutorial: http://docs.python.org/tutorial/index.html
- Biopython: http://biopython.org/wiki/Main_Page
Invoking Python
- To start: type python on the command line
- It will look like
Python 2.5.2 (r252:60911, Mar 25 2009, 00:12:33)
[GCC 4.1.2 (Gentoo 4.1.2 p1.0.2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
- You can now type commands on the line marked by >>>
- To leave: type the end-of-file character, Ctrl-D on Unix or Ctrl-Z followed by Enter on Windows
- This is called interactive mode
Appetizer Example
- Task: Print all numbers in a given file
- File: numbers.txt
2.1
3.2
4.3
- Code: print.py
# Note: the code lines begin in the first column of the file. In
# Python, indentation *is* syntactically relevant. The hash # is
# the comment symbol (everything after it on the current line is
# ignored) and here marks the first column of the code.
data = open("numbers.txt", "r")
for d in data:
    print d
data.close()
Appetizer Example cont’d
- Task: Print the sum of all the data in the file
- Code: sum.py
data = open("numbers.txt", "r")
s = 0
for d in data:
    s = s + float(d)
print s
data.close()
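The same sum can also be sketched with a with block, which closes the file automatically. This is only an illustration under assumptions: it uses Python 3 syntax (print as a function) rather than the Python 2.5 of the slides, the helper name sum_file is hypothetical, and a temporary file stands in for numbers.txt so the sketch is self-contained.

```python
import os
import tempfile

def sum_file(path):
    """Return the sum of the numbers in a file, one number per line."""
    total = 0.0
    with open(path, "r") as data:      # the file is closed when the block ends
        for line in data:
            if line.strip():           # skip blank lines
                total += float(line)
    return total

# Hypothetical demo: write the sample numbers to a temporary file.
tmp = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
tmp.write("2.1\n3.2\n4.3\n")
tmp.close()
total = sum_file(tmp.name)
print(total)
os.remove(tmp.name)
```

Returning the total instead of printing it inside the function keeps the helper reusable in later exercises.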
Interactive Mode
- the >>> prompt accepts a command
- a command is ended by a newline
- variables need no declaration; they are created by assignment
- a colon ":" opens a block
- the ... prompt indicates that a block is expected
- a line without a prompt is Python output
- a block is indented
- ending the indentation ends the block
Differences from Java or C
- can be used interactively, which makes it much easier to test and debug programs
- no declaration of variables
- no braces to delimit blocks, just indentation (Emacs supports this style)
- a comment begins with "#"; everything after it on that line is ignored
Numbers
- Example
>>> 2+2
4
>>> # This is a comment
... 2+2
4
>>> 2+2 # and a comment on the same line as code
4
>>> (50-5*6)/4
5
>>> # Integer division returns the floor:
... 7/3
2
>>> 7/-3
-3
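The session above shows Python 2 behavior, where / between two integers floors the result. As a hedged side note, assuming a Python 3 interpreter (not the 2.5 used on the slides), floor division is written // and / always returns a float:

```python
# Python 3 semantics (an assumption; the slides show Python 2.5).
floor_pos = 7 // 3      # floor division
floor_neg = 7 // -3     # floors toward negative infinity
true_div = 7 / 2        # true division always gives a float
print(floor_pos, floor_neg, true_div)
```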
Numbers cont’d
- Example
>>> width = 20
>>> height = 5*9
>>> width * height
900
>>> # Variables must be defined (assigned a value) before they can be
... # used, or an error will occur:
... # try to access an undefined variable
... n
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'n' is not defined
Strings
- Strings can be enclosed in single quotes or double quotes
- Example
>>> 'spam eggs'
'spam eggs'
>>> 'doesn\'t'
"doesn't"
>>> "doesn't"
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
Strings cont’d
- Strings can be surrounded by a pair of matching triple quotes: """ or '''. Ends of lines do not need to be escaped when using triple quotes, but they will be included in the string.
- Example
print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""
Strings cont’d
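- Strings can be concatenated (glued together) with the + operator, and repeated with *:
- Example
>>> word = 'Help' + 'A'
>>> word
'HelpA'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
>>> 'str' 'ing'                   # adjacent string literals are joined
'string'
>>> 'str'.strip() + 'ing'         # other expressions need an explicit +
'string'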
Strings cont’d
- Strings can be subscripted (indexed); as in C, the first character of a string has subscript (index) 0.
- There is no separate character type; a character is simply a string of size one.
- Substrings can be specified with the slice notation: two indices separated by a colon.
- Example
>>> word = 'Help' + 'A'
>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[:2]    # The first two characters
'He'
>>> word[2:]    # Everything except the first two characters
'lpA'
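Two further properties of indexing and slicing are worth a small sketch: negative indices count from the right, and a slice bound beyond the end is clipped while a single out-of-range index raises an error. The snippet assumes Python 3 syntax, unlike the Python 2.5 sessions on the slides.

```python
word = "HelpA"

last = word[-1]          # negative indices count from the right
tail = word[-2:]         # the last two characters
clipped = word[1:100]    # an out-of-range slice bound is clipped

try:
    word[100]            # an out-of-range single index is an error
    index_error = False
except IndexError:
    index_error = True

print(last, tail, clipped, index_error)
```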
Strings cont’d
- Unlike a C string, a Python string cannot be changed. Assigning to an indexed position in the string results in an error; build a new string instead:
- Example
>>> word[0] = 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> 'x' + word[1:]
'xelpA'
>>> 'Splat' + word[4]
'SplatA'
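Because strings are immutable, a "change" at one position is expressed by building a new string from slices. The following is a minimal sketch of that idea applied to a point mutation; the function name mutate is hypothetical and the code assumes Python 3 syntax.

```python
def mutate(seq, pos, base):
    """Return a copy of seq with the character at index pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]

dna = "gcatgacg"
mutant = mutate(dna, 2, "T")
print(dna, mutant)       # the original string is unchanged
```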
Strings cont’d
- Example
>>> from string import *
>>> dna = 'gcatgacgttattacgactctg'
>>> len(dna)
22
>>> 'n' in dna
False
>>> count(dna, 'a')
5
>>> replace(dna, 'a', 'A')
'gcAtgAcgttAttAcgActctg'
- Exercise: Calculate GC percent of dna
Strings cont’d
- Solution: Calculate GC percent
>>> gc = (count(dna, 'c') + count(dna, 'g')) / float(len(dna)) * 100
>>> "%.2f" % gc
'45.45'
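The GC computation can be wrapped in a reusable function. This is a sketch under assumptions: it uses Python 3 and string methods instead of the Python 2 string module functions, the name gc_percent is hypothetical, and returning 0.0 for an empty sequence is an arbitrary convention chosen here.

```python
def gc_percent(dna):
    """Return the percentage of G and C bases in a DNA string."""
    if not dna:
        return 0.0                      # assumed convention for empty input
    seq = dna.lower()
    return 100.0 * (seq.count("g") + seq.count("c")) / len(seq)

dna = "gcatgacgttattacgactctg"
print("%.2f" % gc_percent(dna))
```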
Strings cont’d
- Exercise: Calculate the complement of a DNA sequence
A - T
C - G
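One possible approach to this exercise, offered as a sketch rather than the course's own solution, maps each base through a small lookup table. It assumes Python 3 syntax and lowercase output; the names PAIRS and complement are hypothetical, and passing unknown characters through unchanged is an assumption.

```python
PAIRS = {"a": "t", "t": "a", "c": "g", "g": "c"}

def complement(dna):
    """Return the complementary strand, base by base, in the same order."""
    # Unknown characters such as 'n' are passed through unchanged (an assumption).
    return "".join(PAIRS.get(base, base) for base in dna.lower())

print(complement("gcatgacg"))
```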
Lists
- A list is a sequence of comma-separated values (items) between square brackets.
- List items need not all have the same type (compound data types)
>>> a = ['spam', 'eggs', 100, 1234]
>>> a[0]
'spam'
>>> a[3]
1234
>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boo!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']
Lists cont’d
- Unlike strings, which are immutable, it is possible to change individual elements of a list
- Assignment to slices is also possible, and this can even change the size of the list or clear it entirely
- Example
>>> a
['spam', 'eggs', 100, 1234]
>>> a[2] = a[2] + 23
>>> a[0:2] = [1, 12]    # Replace some items:
>>> a[0:2] = []         # Remove some:
>>> a
[123, 1234]
>>> a[1:1] = ['bletch', 'xyzzy']    # Insert some:
>>> a
[123, 'bletch', 'xyzzy', 1234]
Lists cont’d
- Functions returning a list
>>> range(3)
[0, 1, 2]
>>> range(10,20,2)
[10, 12, 14, 16, 18]
>>> range(5,2,-1)
[5, 4, 3]
>>> aas = "ALA TYR TRP SER GLY".split()
>>> aas
['ALA', 'TYR', 'TRP', 'SER', 'GLY']
>>> " ".join(aas)
'ALA TYR TRP SER GLY'
>>> list('atgatgcgcccacgtacga')
['a', 't', 'g', 'a', 't', 'g', 'c', 'g', 'c', 'c', 'c', 'a',
'c', 'g', 't', 'a', 'c', 'g', 'a']
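The range calls above return lists in Python 2. As a hedged note for readers on Python 3 (an assumption; the slides use 2.5), range returns a lazy range object there, and list() is needed to obtain an actual list:

```python
r = range(10, 20, 2)                 # lazy range object in Python 3
evens = list(r)                      # materialize it as a list
countdown = list(range(5, 2, -1))
print(evens, countdown, len(r), 14 in r)
```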
Dictionaries
- A dictionary is an unordered set of key: value pairs, with the requirement that the keys are unique
- A pair of braces creates an empty dictionary: {}
- Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary
- The main operations on a dictionary are storing a value with some key and extracting the value given the key
- Example
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
Dictionaries cont’d
- Example
>>> tel = {'jack': 4098, 'sape': 4139, 'guido': 4127}
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> tel.keys()
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
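Storing and looking up values by key is enough to count bases in a sequence. The sketch below illustrates this with a hypothetical helper base_counts, assuming Python 3 syntax; keys are sorted before printing so the output order does not depend on the dictionary.

```python
def base_counts(dna):
    """Count how often each character occurs in dna."""
    counts = {}
    for base in dna:
        counts[base] = counts.get(base, 0) + 1   # 0 if the key is new
    return counts

counts = base_counts("gcatgacgttattacgactctg")
for base in sorted(counts):                      # sort keys for a stable order
    print(base, counts[base])
```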
Programming
- Example
a, b = 3, 4
if a > b:
    print a + b
else:
    print a - b
Programming cont’d
- Example
>>> # Fibonacci series:
... # the sum of two elements defines the next
... a, b = 0, 1
>>> while b < 10:
...     print b
...     a, b = b, a+b
...
1
1
2
3
5
8
Programming features
- multiple assignment: the right-hand side is evaluated before anything on the left is assigned, and its expressions are evaluated from left to right
- the while loop executes as long as the condition is true (any value that is non-zero, non-empty, and not None)
- every line of a block must have the same indentation
- in interactive mode an empty line is needed to mark the end of a block (not required in a source file)
- use of print
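The first two points can be illustrated with a short sketch, assuming Python 3 syntax (the slides use 2.5); all variable names here are chosen for illustration only.

```python
a, b = 1, 2
a, b = b, a            # the right-hand side is evaluated first, so this swaps
swapped = (a, b)

# Values that count as false in a condition: zero, empty containers, None.
falsy = [0, 0.0, "", [], {}, None]
all_false = not any(falsy)

# A while loop that stops when the list becomes empty.
stack = ["a", "c", "g"]
popped = []
while stack:
    popped.append(stack.pop())

print(swapped, all_false, popped)
```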
Printing
- Example
>>> i = 256*256
>>> print 'The value of i is', i
The value of i is 65536
Flow control
- Example
x = 35
if x < 0:
    x = 0
    print 'Negative changed to zero'
elif x == 0:
    print 'Zero'
elif x == 1:
    print 'Single'
else:
    print 'More'
Iteration
- Python's for iterates over a sequence (string, list, generated sequence)
- Example
a = ['cat', 'window', 'defenestrate']
for x in a:
    print x, len(x)
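Since a string is also a sequence, the same loop form walks over the bases of a DNA string. The sketch below assumes Python 3 syntax and uses the built-in enumerate to obtain each position together with its base.

```python
dna = "gcat"
positions = []
for i, base in enumerate(dna):    # a string is a sequence of 1-character strings
    positions.append((i, base))
    print(i, base)
```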
Defining functions
- Example
def fib(n):    # write Fibonacci series up to n
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while b < n:
        print b,
        a, b = b, a+b

# Now call the function we just defined:
fib(2000)
# will print:
# 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
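The function above prints its results and returns nothing. A variant that returns the series as a list is often easier to reuse; the following is a sketch assuming Python 3 syntax, with the hypothetical name fib_list.

```python
def fib_list(n):
    """Return a list of the Fibonacci numbers smaller than n."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result

print(fib_list(100))
```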
Reverse Complement of DNA
- Exercise: Find the reverse complement of a DNA sequence
- Example
5' - ACCGGTTAATT - 3' : forward strand
3' - TGGCCAATTAA - 5' : reverse strand
So the reverse complement of ACCGGTTAATT is AATTAACCGGT
Reverse Complement of DNA
- Solution: Find the reverse complement of a DNA sequence
from string import *

def revcomp(dna):
    """ reverse complement of a DNA sequence """
    comp = dna.translate(maketrans("AGCTagct", "TCGAtcga"))
    lcomp = list(comp)
    lcomp.reverse()
    return join(lcomp, "")
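The solution above relies on the Python 2 string module. For comparison, here is a sketch of the same idea assuming Python 3, where the translation table comes from str.maketrans and the reversal is done with a slice of step -1; the constant name COMPLEMENT is hypothetical.

```python
# str.maketrans builds the translation table in Python 3.
COMPLEMENT = str.maketrans("AGCTagct", "TCGAtcga")

def revcomp(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]   # complement, then reverse

print(revcomp("ACCGGTTAATT"))
```

Applying revcomp twice returns the original sequence, which makes a convenient sanity check.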
Translate a DNA sequence
- Exercise: Translate a DNA sequence to an amino acid sequence
- Genetic code
standard = { 'ttt': 'F', 'tct': 'S', 'tat': 'Y', 'tgt': 'C',
             'ttc': 'F', 'tcc': 'S', 'tac': 'Y', 'tgc': 'C',
             'tta': 'L', 'tca': 'S', 'taa': '*', 'tga': '*',
             'ttg': 'L', 'tcg': 'S', 'tag': '*', 'tgg': 'W',
             'ctt': 'L', 'cct': 'P', 'cat': 'H', 'cgt': 'R',
             'ctc': 'L', 'ccc': 'P', 'cac': 'H', 'cgc': 'R',
             'cta': 'L', 'cca': 'P', 'caa': 'Q', 'cga': 'R',
             'ctg': 'L', 'ccg': 'P', 'cag': 'Q', 'cgg': 'R',
             'att': 'I', 'act': 'T', 'aat': 'N', 'agt': 'S',
             'atc': 'I', 'acc': 'T', 'aac': 'N', 'agc': 'S',
             'ata': 'I', 'aca': 'T', 'aaa': 'K', 'aga': 'R',
             'atg': 'M', 'acg': 'T', 'aag': 'K', 'agg': 'R',
             'gtt': 'V', 'gct': 'A', 'gat': 'D', 'ggt': 'G',
             'gtc': 'V', 'gcc': 'A', 'gac': 'D', 'ggc': 'G',
             'gta': 'V', 'gca': 'A', 'gaa': 'E', 'gga': 'G',
             'gtg': 'V', 'gcg': 'A', 'gag': 'E', 'ggg': 'G' }
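One way to approach the exercise, sketched here under assumptions and not presented as the course's solution, is to walk through the sequence three bases at a time and look each codon up in the table. To stay self-contained the sketch rebuilds the standard code compactly instead of reusing the dictionary above. It assumes Python 3; the names CODE and translate are hypothetical; incomplete trailing codons are dropped, unknown codons become "X", and translation does not stop at a stop codon.

```python
from itertools import product

BASES = "tcag"
# Amino acids of the standard code, listed for codons in t, c, a, g order
# (ttt, ttc, tta, ttg, tct, ...).
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODE = {"".join(codon): aa
        for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA string into one-letter amino acid codes."""
    seq = dna.lower()
    protein = []
    for i in range(0, len(seq) - 2, 3):           # complete codons only
        protein.append(CODE.get(seq[i:i + 3], "X"))  # "X" marks unknown codons
    return "".join(protein)

print(translate("atggcctaa"))
```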