Expanded I/O options


Page 1: Expanded I/O options

Expanded I/O options

Page 2: Expanded I/O options

Building on basics
• We had
  – Input from the keyboard
      nameIn = raw_input("What is your name?")
  – and output to the console
      print "Hello", nameIn
• Additions:
  – a default for a value that was not input:
      nameIn = raw_input("What is your name?")
      if not nameIn:    # no input provided
          nameIn = "Anonymous"
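The default-value test can be tried without typing anything at a prompt. The sketch below is an illustration rather than part of the slides: `name_or_default` is a hypothetical helper that takes the typed text as an argument, and `print(...)` is called with a single string so the code runs under both Python 2 and Python 3.

```python
def name_or_default(typed, default="Anonymous"):
    """Return what the user typed, or the default when nothing was typed."""
    # An empty string counts as false, so "not typed" detects a bare Enter.
    if not typed:
        typed = default
    return typed


print("Hello " + name_or_default(""))       # user pressed Enter only
print("Hello " + name_or_default("Grace"))  # user typed a name
```

In the interactive version, the value passed in would come from `raw_input`.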

Page 3: Expanded I/O options

More additions
• Printing a comma-separated list of items puts a space between each pair.
  – Unwanted space between the team name and the colon
  – To fix this, use concatenation of strings (+ operator). Numbers must be converted to strings explicitly. You gain full control of the spacing.

    >>> team = "Wildcats"
    >>> rank = 5
    >>> print team, ": ranked", rank, "this week."
    Wildcats : ranked 5 this week.
    >>> print team + ": ranked " + str(rank) + " this week."
    Wildcats: ranked 5 this week.
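The two spacing behaviours can be compared as plain strings. This is a sketch under one assumption: `" ".join(...)` is used here only to imitate the single space that the comma-separated `print` statement inserts, so that the code runs under both Python 2 and Python 3.

```python
team = "Wildcats"
rank = 5

# Imitates print team, ": ranked", rank, "this week."
# (one space is placed between every pair of items).
comma_style = " ".join([team, ": ranked", str(rank), "this week."])

# Concatenation: every space is placed by hand, and rank needs str().
concat_style = team + ": ranked " + str(rank) + " this week."

# Forgetting str() is an error: a string and an integer cannot be added.
try:
    team + rank
    mixing_failed = False
except TypeError:
    mixing_failed = True

print(comma_style)
print(concat_style)
```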

Page 4: Expanded I/O options

Formatting Strings
• Further control of how individual fields of output will be presented.
  – % marks a formatting code inside the string, and also joins the format string to the tuple of items to be formatted
  – %s for strings
  – %d for integers (d for decimal)
  – %f for floats (numbers with decimal parts)
    • %.3f displays three decimal places

Page 5: Expanded I/O options

Formatted Strings (continued)
• Can write the previous statement using formatting strings like this:

    >>> print '%s: ranked %d this week.' % (team, rank)
    Wildcats: ranked 5 this week.

• Format codes are:
  – %s is for strings
  – %d is for integers
  – %f is for floats. %.2f gives two decimal places.

Notice the quotes around the whole specification of the formatting.
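A short sketch of the three format codes working together. The value `27.456` and the wording of the second sentence are made up for illustration; the results are stored in variables and printed with single-argument `print(...)` so the code runs under both Python 2 and Python 3.

```python
team = "Wildcats"
rank = 5
points = 27.456   # hypothetical value, used only to show %f rounding

# Two items to format, so they are supplied as a tuple.
line = "%s: ranked %d this week." % (team, rank)

# %.2f rounds to two decimal places.
detail = "%s averages %.2f points." % (team, points)

# A single item does not need a tuple.
padded = "%.3f" % 2.5

print(line)
print(detail)
print(padded)
```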

Page 6: Expanded I/O options

Formatting details
• Further options
  – %10s -- string in a field of at least 10 columns
  – %4d -- integer in a field of at least 4 columns
  – %5.2f -- float in a field of at least 5 columns, two of them decimal positions

    >>> print 'Rank %5.2f as a float.' % rank
    Rank  5.00 as a float.
    >>> print 'Rank %10.2f as a float.' % rank
    Rank       5.00 as a float.
    >>> rank = 100
    >>> print "Rank %3.2f with field too small" % rank
    Rank 100.00 with field too small

Note: %n.df sets the minimum total number of columns for the number to n (decimal point included), of which d are for the decimal places.

%3.2f means 3 columns in total: one for the decimal point and two for the decimal digits, leaving none for the whole-number part. The field is automatically expanded to fit the actual value.
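Padding is hard to see in printed output, so this sketch wraps each formatted value in square brackets. The brackets and the sample string "cat" are additions for illustration only.

```python
rank = 5

samples = [
    "%10s" % "cat",     # string, right-aligned in 10 columns
    "%4d" % rank,       # integer in 4 columns
    "%5.2f" % rank,     # 5 columns: one pad, "5", ".", "00"
    "%10.2f" % rank,    # 10 columns: six pads, then "5.00"
    "%3.2f" % 100,      # needs 6 columns, so the field expands
]

for s in samples:
    print("[" + s + "]")
```

The width is a minimum, never a limit: a value that needs more columns than requested is printed in full.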

Page 7: Expanded I/O options

Working with Files
• Information stored in RAM (main memory) goes away (is volatile) when the computer is shut off.
• Information stored on disk is non-volatile (does not go away when the computer is turned off).
• Writing to and reading from a file can help preserve information between different executions of a program.

Page 8: Expanded I/O options

Python File Type
• Creating a new file instance is accomplished in the same way that a new list object is made.

    fileObj = file(filename)

Page 9: Expanded I/O options

File Operations

Syntax          Semantics
close()         disconnects the file from the Python file variable and saves the file.
flush()         flushes the buffer of written characters.
read()          returns a string with the remaining contents of the file.
read(size)      returns a string with the next size bytes of the file (fewer if fewer remain).
readline()      returns a string that contains the next line in the file.

Page 10: Expanded I/O options

File Operations (continued)

Syntax            Semantics
readlines()       returns a list of strings of the remaining lines in the file.
write(s)          writes s to the file. No newlines are added.
writelines(seq)   writes the lines in seq to the file.
for line in f:    iterates through the file f, one line at a time.
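The operations in the two tables can be exercised on a small scratch file. This is a sketch with assumptions: it uses the built-in `open()` in place of `file()` so that it runs under both Python 2 and Python 3, and the file name `demo.txt` and its contents are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

out = open(path, "w")
out.write("alpha\n")                    # write adds no newline of its own
out.writelines(["beta\n", "gamma\n"])   # neither does writelines
out.close()

src = open(path)
first_three = src.read(3)       # the next 3 characters
rest_of_line = src.readline()   # the remainder of the current line
remaining = src.readlines()     # every line that is left, as a list
at_end = src.read()             # nothing remains, so an empty string
src.close()

# Iterating over the file object yields one line at a time.
src = open(path)
stripped = [line.rstrip("\n") for line in src]
src.close()

print(first_three)
print(remaining)
print(stripped)
```

Each read picks up where the previous one stopped, which is why the tables describe results in terms of what "remains" in the file.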

Page 11: Expanded I/O options

Reading from a File: Counting lines, words, and characters

version 1 – corrected typos and added formatting

    filename = raw_input('What is the filename? ')
    source = file(filename)
    text = source.read()    # Read entire file as one string
    numchars = len(text)
    numwords = len(text.split())
    numlines = len(text.split('\n'))
    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()

    What is the filename? citeseertermcount.txt
         30002 Lines
        156521 Words
        920255 Characters

Note – this version reads the whole file at once, as a single string

Page 12: Expanded I/O options

Reading from a File: Counting lines, words, and characters

version 2

    # The file is opened as in version 1.
    filename = raw_input('What is the filename? ')
    source = file(filename)
    numlines = numwords = numchars = 0
    line = source.readline()
    while line:    # line length is not zero
        numchars += len(line)
        numwords += len(line.split())
        numlines += 1
        # Done with current line. Read the next
        line = source.readline()

    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()

Now, we read one line at a time, process it, and read the next.

    What is the filename? citeseertermcount.txt
         30001 Lines
        156521 Words
        920255 Characters

Note the different number of lines

Page 13: Expanded I/O options

Reading from a File: Counting lines, words, and characters

version 3

    filename = raw_input('What is the filename? ')
    source = file(filename)
    numlines = numwords = numchars = 0
    for line in source:    # reads one line at a time until no more.
        numchars += len(line)
        numwords += len(line.split())
        numlines += 1

    print '%10d Lines\n%10d Words\n%10d Characters' % (numlines, numwords, numchars)
    source.close()

         30001 Lines
        156521 Words
        920255 Characters

Note that "for line in source" actually does the read of a line. No explicit readline is used.

Note the number of lines

Page 14: Expanded I/O options

Spot check 1
• Why was there a difference in the number of lines found by the three versions of the program?
• Discuss on the Blackboard forum, then enter your answer. Consultation and collaboration are good, but write your own answer and be sure you understand it.

Page 15: Expanded I/O options

Writing to a File
• Create a new file object that can be written to in Python, with a file name of filename:

    result = file(filename, 'w')

• If a file named filename already exists, it will be overwritten.
• Only strings can be written to a file

    pi = 3.14159
    result.write(pi)         # this is illegal
    result.write(str(pi))    # this is legal
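The strings-only rule can be checked directly. In this sketch, `open()` stands in for `file()` so that it runs under both Python 2 and Python 3, and the scratch file name `pi.txt` is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "pi.txt")

pi = 3.14159
result = open(path, "w")

# Writing a number directly is rejected with a TypeError.
try:
    result.write(pi)
    rejected = False
except TypeError:
    rejected = True

result.write(str(pi))         # convert first: legal
result.write("\n")            # newlines must be written explicitly
result.write("%.2f\n" % pi)   # a format string also produces a string
result.close()

check = open(path)
saved = check.read()
check.close()
print(saved)
```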

Page 16: Expanded I/O options

Writing to a File
• When is the information actually written to a file?
• File writing is time expensive, so files may not be written immediately.
• A file can be forced to be written in two ways:
  – flush(): file written but not closed.
  – close(): file written and then closed.
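A sketch of the difference between the two calls, assuming a scratch file named `log.txt` (hypothetical) and using `open()` so that it runs under both Python 2 and Python 3. A second file object reads the file back to see what has reached it.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

out = open(path, "w")
out.write("first line\n")
out.flush()    # buffered characters are written; out stays open

peek = open(path)
seen_after_flush = peek.read()
peek.close()

out.write("second line\n")   # still possible, because out was not closed
out.close()                  # written and closed; no more writes allowed

peek = open(path)
seen_after_close = peek.read()
peek.close()

print(seen_after_flush)
print(seen_after_close)
```

What a reader sees before any flush or close depends on buffering, so the sketch makes no claim about that moment.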

Page 17: Expanded I/O options

File Write Danger
• Note that there is no built-in protection against destroying a file that already exists!
• If you want to safeguard against accidentally overwriting an existing file, what would you do?
  – Discuss

Page 18: Expanded I/O options

Trying to Read a File That Doesn't Exist

• What if a file is opened for reading and no file with that name exists? An IOError is raised, which crashes the program. To avoid this, catch the exception.

    filename = raw_input('Enter filename: ')
    try:
        source = file(filename)
    except IOError:
        print 'Sorry, unable to open file', filename
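The same try/except pattern can be wrapped in a function and tried on one missing and one existing file. This is a sketch: `open_or_none` and the file names are hypothetical, and `open()` is used in place of `file()` so the code runs under both Python 2 and Python 3 (where `IOError` is still accepted).

```python
import os
import tempfile


def open_or_none(filename):
    """Return an open file object, or None if the file cannot be opened."""
    try:
        return open(filename)
    except IOError:
        print("Sorry, unable to open file " + filename)
        return None


# Hypothetical scratch directory with one real file and one missing name.
scratch = tempfile.mkdtemp()
missing = os.path.join(scratch, "no_such_file.txt")
existing = os.path.join(scratch, "notes.txt")

made = open(existing, "w")
made.write("hello\n")
made.close()

first = open_or_none(missing)     # prints the apology and returns None
second = open_or_none(existing)   # returns an open file object
contents = second.read()
second.close()
```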

Page 19: Expanded I/O options

File Utilities

    # Prompt for filename until file is successfully opened.
    def openFileReadRobust():
        source = None
        while not source:
            filename = raw_input('Input filename: ')
            try:
                source = file(filename)
            except IOError:
                print 'Sorry, unable to open file', filename
        return source

Page 20: Expanded I/O options

File Utilities (continued)

    def openFileWriteRobust(defaultName):
        """Repeatedly prompt user for filename until successfully opening with write access.

        Return a newly open file object with write access.

        defaultName    a suggested filename. This will be offered within the prompt
                       and used when the return key is pressed without specifying
                       another name.
        """
        writable = None
        while not writable:    # still no successfully opened file
            prompt = 'What should the output be named [%s]? ' % defaultName
            filename = raw_input(prompt)
            if not filename:               # user gave blank response
                filename = defaultName     # try the suggested default
            try:
                writable = file(filename, 'w')
            except IOError:
                print 'Sorry. Unable to write to file', filename
        return writable
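The retry loop is awkward to try out because it waits on the keyboard. The sketch below is a variation, not the book's code: `open_write_robust` takes an extra, hypothetical `ask` argument that stands in for `raw_input`, so a scripted list of answers can drive the loop. It uses `open()` so that it runs under both Python 2 and Python 3.

```python
import os
import tempfile


def open_write_robust(default_name, ask):
    """Prompt through ask() until a file opens for writing; return it."""
    writable = None
    while not writable:                  # still no successfully opened file
        filename = ask('What should the output be named [%s]? ' % default_name)
        if not filename:                 # blank response
            filename = default_name      # fall back to the suggested default
        try:
            writable = open(filename, 'w')
        except IOError:
            print('Sorry. Unable to write to file ' + filename)
    return writable


# Scripted "user": first a name inside a directory that does not exist,
# then a blank response that selects the default.
scratch = tempfile.mkdtemp()
answers = iter([os.path.join(scratch, "missing_dir", "out.txt"), ""])
default_path = os.path.join(scratch, "annotated.txt")

out = open_write_robust(default_path, lambda prompt: next(answers))
opened_name = out.name
out.write("ok\n")
out.close()
```

Passing `raw_input` as `ask` in Python 2 would give back the interactive behaviour of the original function.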

Page 21: Expanded I/O options

Testing the File Utilities

    from FileUtilities import *

    sourceFile = openFileReadRobust()
    if sourceFile is not None:
        print "Successful read of ", sourceFile

    filenone = "anyname"
    outFile = openFileWriteRobust(filenone)
    if outFile is not None:
        print "File ", outFile, " opened for writing"

    What is the filename? citeseertermcount.txt
    Successful read of  <open file 'citeseertermcount.txt', mode 'r' at 0x60f9d0>
    What should the output be named [anyname]? abc.txt
    File  <open file 'abc.txt', mode 'w' at 0x60fa20>  opened for writing

Page 22: Expanded I/O options

Numbering lines in a file

    # Program: annotate.py
    # Authors: Michael H. Goldwasser
    #          David Letscher
    #
    # This example is discussed in Chapter 8 of the book
    # Object-Oriented Programming in Python
    #
    from FileUtilities import openFileReadRobust, openFileWriteRobust

    print 'This program annotates a file, by adding'
    print 'Line numbers to the left of each line.\n'

    source = openFileReadRobust()
    annotated = openFileWriteRobust('annotated.txt')

    # process the file
    linenum = 1
    for line in source:
        annotated.write('%4d %s' % (linenum, line))
        linenum += 1
    source.close()
    annotated.close()
    print 'The annotation is complete.'
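The numbering step can be looked at on its own, away from the file handling. In this sketch `annotate` is a hypothetical helper that works on a list of strings instead of an open file; the format string is the one used by the program.

```python
def annotate(lines):
    """Return the lines, each with a line number in a 4-column field."""
    numbered = []
    linenum = 1
    for line in lines:
        # Each line keeps its own trailing newline, so none is added here.
        numbered.append('%4d %s' % (linenum, line))
        linenum += 1
    return numbered


sample = ["first\n", "\n", "third\n"]
for text in annotate(sample):
    print(repr(text))
```

Because `%4d` sets only a minimum width, files longer than 9999 lines still number correctly, although the text no longer lines up.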

Page 23: Expanded I/O options

Running the annotation program

    This program annotates a file, by adding
    Line numbers to the left of each line.

    What is the filename? fileUtilities.py
    What should the output be named [annotated.txt]? annotatedUtilities.txt
    The annotation is complete.

Directory after the program runs:

    FileUtilities.pyc       citeseertermcount.txt   readfile1.py
    abc.txt                 fileUtilTest.py         readfile2.py
    annotate.py             fileUtilities.py        readfile3.py
    annotatedUtilities.txt  readexception.py

Page 24: Expanded I/O options

The annotated file

       1 # Program: FileUtilities.py
       2 # Authors: Michael H. Goldwasser
       3 #          David Letscher
       4 #
       5 # This example is discussed in Chapter 8 of the book
       6 # Object-Oriented Programming in Python
       7 #
       8 """A few utility functions for opening files."""
       9 def openFileReadRobust():
      10     """Repeatedly prompt user for filename until successfully opening with read access.
      11
      12     Return the newly open file object.
      13     """
      14     source = None
      15     while not source: # still no successfully opened file
      16         filename = raw_input('What is the filename? ')
      17         try:
      18             source = file(filename)
      19         except IOError:
      20             print 'Sorry. Unable to open file', filename
      21     return source
      22
      23 def openFileWriteRobust(defaultName):
      24     """Repeatedly prompt user for filename until successfully opening with write access.
      25
      26     Return a newly open file object with write access.

Rest not shown for space limitations

Page 25: Expanded I/O options

Spot Check 2
• Run the annotate program against a file of your choosing and get the line numbers added.
  – Be careful not to overwrite the original file.
• What would be the effect if you added line numbers to a program file?
• How would you remove the line numbers if you got them into the wrong file?

Page 26: Expanded I/O options

Tally
• Read through the case study of constructing a tally sheet class.
• Compare what you see here to the frequency distribution content that you saw in the NLTK book.

Page 27: Expanded I/O options

NLTK chapter 3
• That chapter is written very much as a tutorial, and I don't think I can do much with slides and no narration.
• Please read through that chapter and do the "Your turn" exercises. Use the Discussion board to comment on what you do, to share observations, and to ask questions.

Page 28: Expanded I/O options

Assignment
• In two weeks:
  – Do either exercise 8.18 or exercise 8.21
  – (Do you prefer to work with numbers or words?)
  – Be sure to design good test cases for your program.
• For chapter review (and quiz preparation) be sure you can do exercises 8.7 – 8.9