Python workshop #1 at UGA

Python Workshop 2012
The essentials: getting started, program design, modules & I/O
Eric Talevich, Institute of Bioinformatics, University of Georgia
Nov. 8, 2012

description

An introduction to Python programming, covering Python's basic types and built-in collections, file handling, iteration, and designing basic scripts from scratch. Presented at the University of Georgia in Fall 2009, 2010, 2011 and 2012. The complete example script and sample input text can be downloaded at: https://gist.github.com/661869

Transcript of Python workshop #1 at UGA

Page 1: Python workshop #1 at UGA

Getting started

How to design programs

Python concepts & party tricks

Python Workshop 2012

The essentials: getting started, program design, modules & I/O

Eric Talevich

Institute of Bioinformatics, University of Georgia

Nov. 8, 2012

Eric Talevich Python Workshop 2012

Page 2: Python workshop #1 at UGA

1 Getting started

2 How to design programs: Data structures, Algorithms, Interfaces

3 Python concepts & party tricks: File-like objects, Iteration, The csv module

Page 3: Python workshop #1 at UGA

Getting started: Interpreter

Your interactive interpreter is one of:

python on the command line

IDLE — Integrated DeveLopment Environment for Python 1

We’ll use IDLE.

Run this in the interpreter:

import this

1 Included in the standard Python installation on Windows and Mac

Page 4: Python workshop #1 at UGA

Getting started: Scripts

1 Write this in a new, blank text file:
    print "Hello, Athens!"

2 Save it in plain-text format with the name hi.py in the same directory where your interpreter session is running. Run it:
    python hi.py

EXTRA CREDIT
Add a line at the top of hi.py to tell Unix this is a Python script:
    #!/usr/bin/env python

Make hi.py executable (from the terminal shell):
    chmod +x hi.py

Now you don't need the .py extension:
    mv hi.py hi
    ./hi

Page 7: Python workshop #1 at UGA

Getting started: Documentation

In the interpreter: help(function-or-module)

On the command line: pydoc function-or-module

In IDLE: <F1> (launches docs.python.org)

Also, type(obj), dir(obj), vars(obj) and auto-completion (tab or Ctrl+Space) are pretty handy.

Try it:

>>> help(str)

>>> s = 'Hello'

>>> type(s)

>>> dir(s)

>>> help(s.capitalize)

>>> help(help)

Page 8: Python workshop #1 at UGA

Example: string methods

Find the string (str) methods to. . .

Convert gOoFy-cAsE words to title case, upper case or lower case

Remove certain characters from either end

Align (justify) within a fixed width, padded by spaces
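
One possible set of answers to the exercise above, as a sketch. It is written for Python 3, where print is a function (the slides use the Python 2 print statement); the sample strings are made up for illustration.

```python
# Candidate answers to the exercise; each call is a standard str method.
goofy = "gOoFy-cAsE wOrDs"

titled = goofy.title()    # title case
shouted = goofy.upper()   # upper case
quiet = goofy.lower()     # lower case

# strip() removes any of the listed characters from both ends;
# lstrip() and rstrip() do the same on one end only.
trimmed = "--hello!?".strip("-!?")

# ljust(), rjust() and center() pad with spaces to a fixed width.
left = "abc".ljust(7)
right = "abc".rjust(7)
middle = "abc".center(7)

print(titled, shouted, quiet, trimmed)
print(repr(left), repr(right), repr(middle))
```

Running dir(str) or help(str) in the interpreter lists these methods alongside the rest.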

Page 9: Python workshop #1 at UGA

How to design programs

Data structure: Organizes a collection of data units

Algorithm: Defines operations for a specific task

Interface: I/O, how a program interacts with the world

Theorem

Program = Data Structures + Algorithms + Interfaces

Page 10: Python workshop #1 at UGA

Data structures

Organize a program's information

Page 11: Python workshop #1 at UGA

Units of data: Atoms

Class     int      float       bool          NoneType   str
Literal   1, -2    .1, 2.9e8   True, False   None       'abc'

Try it:

>>> num = 0.000000002

>>> num

>>> type(num)

Features:

Create from a data literal or by calling the class name (constructor)

Always immutable: assigning another value binds the name to a new object and leaves the original unchanged
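
The two features above can be sketched as follows. The variable names are illustrative only, and the syntax works in Python 3 as well as Python 2.

```python
# Each atom can come from a literal or from calling its class.
from_literal = 42
from_constructor = int("42")      # parse a string into an int

speed = float("2.9e8")            # same value as the literal 2.9e8
flag = bool(0)                    # zero converts to False
label = str(3.5)                  # number to text

# Immutability: "changing" a value creates a new object.
# The name is re-pointed; the old object is never modified.
word = "abc"
other_name = word
word = word + "d"
# other_name still refers to the original 'abc'
```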

Page 12: Python workshop #1 at UGA

Collections of data

Class   Literal   Description
list    [ ]       Ordered array of arbitrary objects
dict    { }       Unordered map of fixed to arbitrary objects
set               Unordered collection of fixed objects

Page 13: Python workshop #1 at UGA

Mutable vs. immutable objects

Some mutable objects have immutable counterparts:

list vs. tuple

set vs. frozenset

>>> x = (1, 2, 3)

>>> x[0] = 7

>>> y = list(x)

>>> y[0] = 7

>>> print y
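
For reference, a runnable sketch of what the session above demonstrates, extended to frozenset. It assumes Python 3 syntax; the dict at the end is an added illustration of why immutable counterparts are useful.

```python
x = (1, 2, 3)
try:
    x[0] = 7                 # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)

y = list(x)                  # mutable copy of the same items
y[0] = 7                     # fine: lists can be changed in place

frozen = frozenset([1, 2, 3])
can_add = hasattr(frozen, "add")   # frozenset has no add() method

# Immutable objects are hashable, so they can serve as dict keys.
lookup = {x: "a tuple key", frozen: "a frozenset key"}
```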

Page 14: Python workshop #1 at UGA

Variables are names, not values

Mutable collections can be shared:

>>> x = [1, 2, 3]

>>> y = x

>>> y[0] = 4

>>> x is y

>>> print 'x =', x, 'y =', y

Reassigning a name to a different immutable object rebinds the name (its id changes); the original object is untouched:

>>> x = (1, 2, 3)

>>> y = x

>>> y = (4, 2, 3)

>>> x is y

>>> print 'x =', x, 'y =', y
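
When sharing is not wanted, a copy breaks the link. A minimal sketch, with z as a made-up third name; list(x) makes a shallow copy, which is enough for a flat list of numbers.

```python
x = [1, 2, 3]
y = x            # second name for the same list
z = list(x)      # shallow copy: a new list with the same items

y[0] = 4         # visible through x as well, but not through z

same_object = x is y
copy_is_same = x is z
```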

Page 15: Python workshop #1 at UGA

a fairly short

Break

Page 16: Python workshop #1 at UGA

Algorithms

Specify operations

Page 17: Python workshop #1 at UGA

Control flow

Branching: The if statement chooses between code paths:

    x = -2
    if x >= 0:
        print x, "is non-negative"
    else:
        print x, "was negative"
        x = -x

Iteration: The for statement visits each element sequentially:

    for x in [-2, -1, 0, 1, 2]:
        if x >= 0:
            print x, "is non-negative"

Page 19: Python workshop #1 at UGA

    # Define a function
    # -----------------
    # This example shows how to define a simple function.
    # The function definition starts with keyword 'def',
    # then the name of the function and the arguments.
    # The 'docstring' in quotes is optional, but nice.
    # The rest of the code, indented, is the function body.
    #
    # (Lines that begin with a hash (#) are comments.)

    def nice_abs(x):
        """Return the absolute value of a number."""
        if x < 0:
            return -x
        else:
            return x

Page 20: Python workshop #1 at UGA

Algorithms

Definition: A precise set of instructions for accomplishing a task.

We'll treat an algorithm as one or more functions that operate on some input data to produce some output according to a specification.

Useful fact: The structure of a function will resemble the structure of the data it operates on.

Page 21: Python workshop #1 at UGA

The design recipe: a general method

Steps, in order, for developing a program from scratch: 2

Contract: Name the function; specify the types (i.e. atoms and data structures) of its input data and its output.
    abs: number → non-negative number

Purpose: Description of what the program does, in terms of inputs and output; sufficient for a function declaration and docstring.

Example: Function call with arguments, and expected result.
    abs(-1) → 1
    abs(1) → 1

Definition: The code!

Tests: Convert the examples to actual code by running the program on them.

2 http://htdp.org/2003-09-26/Book/curriculum-Z-H-1.html
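
The five steps can be sketched on a small absolute-value function, called nice_abs here. Turning the examples into assert statements is only one of several ways to do the Tests step, and the check for zero is an added example.

```python
def nice_abs(x):
    """Return the absolute value of a number.

    Contract: nice_abs: number -> non-negative number
    Examples: nice_abs(-1) -> 1, nice_abs(1) -> 1
    """
    # Definition: the code itself.
    if x < 0:
        return -x
    return x


# Tests: the examples, converted to executable checks.
assert nice_abs(-1) == 1
assert nice_abs(1) == 1
assert nice_abs(0) == 0   # extra edge case, not on the slide
```

If any assert fails, Python stops with an AssertionError that points at the broken example.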

Page 25: Python workshop #1 at UGA

Example: Counting words in text

Given a sample text, count the frequencies of each word.

Name: wordcount

Input: A string of text.

Output: A dictionary, with string keys and integer values.

Purpose: Create a dictionary associating each unique word in the text with the number of times it appears.

Example: wordcount("yes no yes no maybe")

→ {'yes': 2, 'no': 2, 'maybe': 1}

Page 28: Python workshop #1 at UGA

    def wordcount(text):
        """Count the occurrences of each word in text.

        Input: string of whitespace-delimited words
        Output: dict of strings and integers (>0)
        Example:

        >>> wordcount("yes no yes no maybe")
        {'maybe': 1, 'yes': 2, 'no': 2}
        """
        word_counts = {}
        for word in text.split():
            if word not in word_counts:
                # New word; start counting from 0
                word_counts[word] = 0
            word_counts[word] += 1
        return word_counts

Page 29: Python workshop #1 at UGA

Snack Time

Page 30: Python workshop #1 at UGA

Interfaces

Interact with the world

Page 31: Python workshop #1 at UGA

Modules

Think of a module as a bundle of features. When you import a module, you load an additional set of features that your program can use.

>>> import this

>>> type(this)

>>> help(this)

>>> dir(this)

A module may also be called a "package" or "library", or possibly "API". (It's nuanced.)

Python has many built-in modules. Together, these are called the "standard library".

Page 32: Python workshop #1 at UGA

Built-in I/O functions

You may already know:

print: Write a line to the command-line interface.

    >>> print 'Eureka!'
    Eureka!

raw_input: Read a line from the command-line interface.

    >>> name = raw_input('Name: ')
    Name: Jonas
    >>> print 'My name is', name
    My name is Jonas

Page 34: Python workshop #1 at UGA

The sys module

print and raw_input are wrappers for standard output and standard input:

    import sys

sys.stdout: File handle for writing to the program's output.

    >>> sys.stdout.write('Eureka!\n')
    Eureka!

sys.stdin: File handle for reading from the program's input.

    >>> sys.stdout.write('Name: '); \
    ... name = sys.stdin.readline()
    Name: Jonas
    >>> sys.stdout.write('My name is ' + name)
    My name is Jonas

Page 36: Python workshop #1 at UGA

Program arguments

Command-line arguments are stored in sys.argv:

    $ cat hi.py
    #!/usr/bin/env python
    print 'Hello, Athens!'

    $ echo 'Some text' > example.txt
    $ python hi.py example.txt

    >>> import sys
    >>> print sys.argv
    ['hi.py', 'example.txt']
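
A small sketch of how a script might pick sys.argv apart. The helper name describe_arguments is hypothetical and the code uses Python 3 print calls; the point is only that element 0 is the script name and the remaining elements are the arguments.

```python
import sys


def describe_arguments(argv):
    """Split an argv-style list into (script_name, other_arguments)."""
    return argv[0], argv[1:]


if __name__ == "__main__":
    script, args = describe_arguments(sys.argv)
    print("script:", script)
    print("arguments:", args)
```

Passing the list in as a parameter, instead of reading sys.argv inside the function, keeps the function easy to try from the interpreter.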

Page 39: Python workshop #1 at UGA

More types of interfaces

Pipelines (sys.stdin/stdout/stderr)

API — calling functions from other programs & libraries

IPC — inter-process communication (shm, subprocess)

GUI widgets

Web forms & Javascript events

(Don't worry about the details here. Just keep this stuff separate from the main algorithm if you encounter it.)

Page 40: Python workshop #1 at UGA

What is a file?

Consider:

A local file larger than available RAM

Reading data over an internet connection

Problem: Data can’t be viewed all at once

Solution: Retrieve bytes on demand

Page 41: Python workshop #1 at UGA

File handles

A handle points to a location in a byte stream (e.g. the first byte of the file to be read/written).

>>> myfile = open('example.txt')

>>> print myfile.read()

>>> myfile.seek(0)

>>> print myfile.readlines()

>>> myfile.close()
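
The session above can be traced step by step with a throwaway file. This sketch assumes Python 3 and creates its own example.txt in a temporary directory so that it runs anywhere; the file contents are invented.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as outfile:
    outfile.write("Some text\nMore text\n")

myfile = open(path)
everything = myfile.read()       # reads to the end of the stream
nothing_left = myfile.read()     # empty: the handle is now at the end
myfile.seek(0)                   # move the handle back to the first byte
lines = myfile.readlines()       # list of lines, newlines included
myfile.close()
```

The second read() returning an empty string is what makes seek(0) necessary before readlines().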

Page 42: Python workshop #1 at UGA

File-like objects

Text (ASCII or Unicode) file on disk

Binary file on disk

Standard input, output, error

Devices, pretty much everything else on Unix

Network connection

Memory map (inter-process communication)

Any Python object supporting the right methods (e.g. StringIO)

Page 43: Python workshop #1 at UGA

Features of Python file objects

Modes:

    infile = open(myfile, 'r')

r, w: read, write

b: binary, to keep Windows from clobbering CR/LF bytes

U: universal newlines, so that \n, \r and \r\n are all read as line endings

Iteration:

    for line in infile:
        print line.split()

Page 44: Python workshop #1 at UGA

another short

Break

Page 45: Python workshop #1 at UGA

Iteration

Iterable objects "know" how they can be looped over. Python has statements and functions that operate on each item in an iterable:

"for" loop: for x in iterable: ...

zip: for left, right in zip(iter1, iter2): ...

enumerate: for index, x in enumerate(iterable): ...

List comprehension: [x * x for x in range(20) if x % 2 == 0]
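
Filled in with concrete data, the patterns above might look like this. The names and sample lists are made up for illustration.

```python
names = ["yes", "no", "maybe"]
counts = [2, 2, 1]

# zip pairs up items from two iterables, position by position.
pairs = [(name, count) for name, count in zip(names, counts)]

# enumerate yields (index, item) pairs, counting from 0.
numbered = [(index, name) for index, name in enumerate(names)]

# List comprehension: squares of the even numbers below 20.
even_squares = [x * x for x in range(20) if x % 2 == 0]
```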

Page 46: Python workshop #1 at UGA

Example: Counting words in a file

Count the words in a file, and print a table sorted by word frequency.

Let's expand the wordcount function into a complete script.

Separate algorithms from interfaces: Don't do any I/O inside wordcount.

Independent output formatting: Use a separate function to print the dictionary wordcount produces.

Flexible file input: Take a file name as a program argument, or read from standard input.

Page 47: Python workshop #1 at UGA

Enhancing wordcount

1 Read from an open file handle, not a string
  Don't worry about opening or closing it here:
      def count_words(infile): ...

2 Remove punctuation, ignore case when counting words
  str.strip and str.lower will work, but could make '' (an empty string)

3 Save a line of code with collections.defaultdict
  Write what we want more directly:
      >>> counts = collections.defaultdict(int)
      >>> counts['new']
      0
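
Points 2 and 3 can be seen together in a short sketch. The sample sentence and the PUNCTUATION constant are made up, and the "if cleaned" guard is an added assumption about how one might handle the empty-string case; the slides do not say how to deal with it.

```python
import collections

PUNCTUATION = ',.;:?!-()"\''

counts = collections.defaultdict(int)
for word in "Yes, no -- yes! (Maybe.)".split():
    cleaned = word.strip(PUNCTUATION).lower()
    if cleaned:                 # skip tokens that were only punctuation
        counts[cleaned] += 1    # missing keys start at int() == 0

plain = dict(counts)
print(plain)
```

Without the guard, the token "--" would be stripped down to '' and counted as a word.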

Page 50: Python workshop #1 at UGA

    import collections

    def count_words(infile):
        """Count occurrences of each word in infile.

        Input: File handle open for reading
        Output: dict of {string: integer}
        Example: (infile contains "yes no yes no maybe")

        >>> count_words(infile)
        {'maybe': 1, 'yes': 2, 'no': 2}
        """
        w_counts = collections.defaultdict(int)
        for line in infile:
            for word in line.split():
                # Ignore case and adjacent punctuation
                word = word.strip(',.;:?!-()"\'').lower()
                w_counts[word] += 1
        return w_counts

Page 51: Python workshop #1 at UGA

Output formatting

Given the dictionary produced by count_words . . .

1 Sort entries by counts, in descending order
  Use the key argument to select values for comparison:
      >>> pairs = [('one',1), ('three',3), ('two',2)]
      >>> select_second = lambda x: x[1]
      >>> pairs.sort(key=select_second); print pairs
      [('one', 1), ('two', 2), ('three', 3)]

2 Justify the printed words in a fixed-width column
  Use the length of the longest word as the column width.
      >>> words = 'thanks for all the fish'.split()
      >>> max(len(word) for word in words)
      6
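
Both steps combined, as a Python 3 sketch on an invented dictionary. It uses sorted() with reverse=True to get descending order; that is one way to do it, not necessarily the only one.

```python
counts = {"yes": 2, "no": 2, "maybe": 1, "perhaps": 5}

# sorted() returns a new list; reverse=True gives descending order
# while keeping ties in their original relative order.
by_count = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

# Right-justify every word to the width of the longest one.
width = max(len(word) for word in counts)
rows = [word.rjust(width) + "\t" + str(n) for word, n in by_count]
print("\n".join(rows))
```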

Page 53: Python workshop #1 at UGA

    def print_dictionary(dct):
        """Print the keys and values in dct.

        Sorts items by decreasing value; lines up columns.

        Input: dict with string keys and any-type values
        """
        # Width needed for displaying the key column
        key_width = max(len(key) for key in dct)
        kv_pairs = dct.items()
        # Sort by the second item in the pair, decreasing
        kv_pairs.sort(key=lambda kv: kv[1], reverse=True)
        for key, value in kv_pairs:
            # Align both columns against a strip of tabs
            print key.rjust(key_width), '\t', value

Page 54: Python workshop #1 at UGA

File I/O and scaffolding

Let's put it all together.

1 Document usage at the top of the script
  Several ways to view this: pydoc, help, __doc__

2 Check the given program arguments
  If none: read from standard input
  If one: open the file and read it
  Otherwise: print a helpful error message

3 Exit with proper Unix return codes
  Use sys.exit: 0 means OK, otherwise error
      $ python -c 'import sys; sys.exit(1)' && echo OK
      $ python -c 'import sys; sys.exit()' && echo OK
      OK
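
To see what sys.exit does without leaving the interpreter, one can catch the SystemExit exception it raises. The helper exit_code below is hypothetical and exists only for this demonstration; a real script would simply call sys.exit.

```python
import sys


def exit_code(*args):
    """Call sys.exit(*args) and return the code it would exit with."""
    try:
        sys.exit(*args)
    except SystemExit as exc:
        return exc.code


no_argument = exit_code()          # None, which the shell sees as status 0
explicit_error = exit_code(1)      # 1, reported to the shell as status 1
message = exit_code("usage: wordcount.py [filename]")
# When left uncaught, a string code is printed to standard error
# and the process exits with status 1.
```

This is why passing a usage string to sys.exit both shows the message and signals an error to the shell.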

Page 57: Python workshop #1 at UGA

    #!/usr/bin/env python

    """Count the words in a text file or stream, and
    print a table of word counts sorted by frequency.

    Usage:
        wordcount.py [filename]
    """

    import sys
    import collections

    def count_words(infile):
        ...

    def print_dictionary(dct):
        ...

Page 58: Python workshop #1 at UGA

    if __name__ == '__main__':
        if len(sys.argv) == 1:
            # Text stream input, e.g. from a pipe
            infile = sys.stdin
        elif len(sys.argv) == 2:
            # Read text from the given file
            infile = open(sys.argv[1], 'r')
        else:
            # Too many arguments! Print usage & quit
            sys.exit(__doc__)

        # Now, do everything
        word_counts = count_words(infile)
        print_dictionary(word_counts)

Page 59: Python workshop #1 at UGA

CSV: Comma Separated Values

Python can read and write spreadsheet files in CSV format. 3

(It’s handy if you don’t have a database set up.)

KEY one three five

two 2 6 10

four 4 12 20

six 6 18 30

Let's convert this to a dictionary-of-dictionaries in Python: the outer dictionary is keyed by the row labels (from the KEY column), and the inner dictionary is the row data, keyed by column label.

3 Conversions seem to be easier in OpenOffice/LibreOffice Calc than in Excel.

Page 60: Python workshop #1 at UGA

    # Save the previous table as 'mtable.csv'
    # Here, we convert it to a dict-of-dicts
    import csv
    infile = open('mtable.csv')
    csv_reader = csv.DictReader(infile)
    table = {}
    for rowdata in csv_reader:
        key = rowdata.pop('KEY')
        table[key] = rowdata

    # Pretty-print table with nicely aligned rows
    from pprint import pprint
    pprint(table)

    # {'four': {'five': '20', 'one': '4', 'three': '12'},
    #  'six': {'five': '30', 'one': '6', 'three': '18'},
    #  'two': {'five': '10', 'one': '2', 'three': '6'}}

    print table['four']['five']  # '20'
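
Note that csv hands every cell back as a string, as the quoted '20' above shows. A self-contained Python 3 sketch that converts the cells to integers follows; it reads the table from an in-memory string (CSV_TEXT, a stand-in for mtable.csv) and assumes the file is comma-delimited.

```python
import csv
import io

# Stand-in for the 'mtable.csv' file, assuming comma delimiters.
CSV_TEXT = """KEY,one,three,five
two,2,6,10
four,4,12,20
six,6,18,30
"""

table = {}
for rowdata in csv.DictReader(io.StringIO(CSV_TEXT)):
    key = rowdata.pop("KEY")
    # csv yields strings; convert the cells to integers here.
    table[key] = {column: int(cell) for column, cell in rowdata.items()}

print(table["four"]["five"])
```

io.StringIO also illustrates the earlier point about file-like objects: DictReader accepts any object that yields lines.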

Page 61: Python workshop #1 at UGA

Thanks

'Preciate it.

Gracias

Page 62: Python workshop #1 at UGA

Further reading

The official Python tutorial is quite good:
    http://docs.python.org/tutorial/index.html

This is the book I recommend for learning Python:
    http://learnpythonthehardway.org/

This presentation on Slideshare, and the example script on GitHub:
    http://www.slideshare.net/etalevich/python-workshop-1-uga-bioinformatics
    https://gist.github.com/661869
