Chapter10 Chapter12 pyhton Python Programming: An Introduction To Computer Science

kristr1 9 views 109 slides Oct 19, 2025

About This Presentation

power point python


Slide Content

Python Programming, 4/e 1 Python Programming: An Introduction To Computer Science Chapter 10 Persistent Data

Python Programming, 4/e 2 Objectives To understand basic file-processing concepts and techniques for opening, reading, and writing files in Python. To understand the structure of text files and be able to write programs that use them. To become familiar with the basic organization of file systems, including the role that absolute and relative paths play in locating files, and be able to write Python programs that process collections of files.

Python Programming, 4/e 3 Objectives To understand binary data and the bytes data type and be able to create programs that store and load Python objects from files using the pickle module. To recognize the similarity between working with local files and working with network resources.

Python Programming, 4/e 4 Text Files In all of the examples so far, data has either been embedded in the program code or entered by the user when the program runs. We lack a mechanism for entering data and having that data persist from one run of the program to the next.

Python Programming, 4/e 5 Text Files Persistent data is a critical component of any modern computing system. Your word processor needs to save the paper you’re working on. Your programming environment needs to be able to save and reload your Python code. Typically, such information is stored in files.

Python Programming, 4/e 6 Text Files A file is a sequence of data that is stored in secondary memory (usually on a disk drive of some sort). Files can contain any data type, but the easiest files to work with are those that contain text. Files of text have the advantage that they can be read and understood by humans, and they are easily created and edited using general purpose text editors, like IDLE.

Python Programming, 4/e 7 Multi-line Strings You can think of a text file as a (possibly long) string that happens to be stored on disk. A special character or sequence of characters is used to mark the end of each line. While this convention varies by operating system, Python takes care of these different conventions for us and just uses the regular newline character (\n).

Python Programming, 4/e 8 Multi-line Strings
    Hello
    World
    
    Goodbye 32
When stored to a file, you get this: Hello\nWorld\n\nGoodbye 32\n Notice that the blank line becomes a bare newline.

Python Programming, 4/e 9 Multi-line Strings This is no different than when we embed newline characters into output strings to produce multiple lines of output with a single print statement.
    print("Hello\nWorld\n\nGoodbye 32\n")
Remember, if you simply evaluate a string containing newline characters in the shell, you will just get the embedded newline representation back.
    "Hello\nWorld\n\nGoodbye 32\n"
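The difference between printing a string and evaluating it can be checked directly. The short sketch below reuses the slide's sample text; the variable names text and lines are illustrative only.

```python
text = "Hello\nWorld\n\nGoodbye 32\n"

# print interprets each \n as a line break.
print(text, end="")

# repr shows the string the way the shell echoes it, escapes and all.
print(repr(text))

# splitlines makes the line structure explicit: the blank line
# shows up as an empty string.
lines = text.splitlines()
print(lines)
```

The empty string in the resulting list is the "bare newline" from the blank line.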

Python Programming, 4/e 10 File Processing Outline Virtually all programming languages share certain underlying file manipulation concepts. We need some way to associate a file on disk with an object in a program – this is called opening a file. We need a set of operations that can manipulate the file object. At the very least, we need to be able to read the information from a file and to write new information to a file. Lastly, when we are done we need to close the file.

Python Programming, 4/e 11 File Processing Outline This idea of opening and closing files is closely related to how you might work with files in an application program such as IDLE. When you open a file for editing in IDLE, the file is actually read from disk and stored in RAM. At this point, the file is closed (in the programming sense). As you edit the file, you are really making changes to the data in memory, not the file itself. Changes will not show up on disk until you “save” it.

Python Programming, 4/e 12 File Processing Outline The process of saving a file in IDLE is also a multi-step process. The original file on the disk is opened, this time in a mode that allows it to store information (opened for writing ). Doing this actually erases the old contents of the file! File writing operations are then used to copy the current contents of the in-memory file into the new file on disk.

Python Programming, 4/e 13 File Processing Outline Working with text files in Python is easy! Create a file object that corresponds to a file on disk:
    <variable> = open(<path>, <mode>)
Here, path is a string that provides the location of the file on disk. For a text file, mode is either "r" or "w" depending on whether the file is intended to be read from or written to. If the mode is omitted, the file is opened for reading.

Python Programming, 4/e 14 File Processing Outline
    # printfile.py
    # Prints a file to the screen.
    def main():
        fname = input("Enter a filename: ")
        infile = open(fname, "r")
        data = infile.read()
        infile.close()
        print(data)

Python Programming, 4/e 15 File Processing Outline The program first prompts the user for a file name and then opens the file for reading through the variable infile. While any identifier works, here the name serves to remind us that the object is a file and it is being used for input. The entire contents of the file is then read as one multi-line string and stored in the variable data. Printing data causes the file contents to be displayed.

Python Programming, 4/e 16 File Processing Outline This process illustrates the basic three-step process for working with a file: Open the file. Use file operations to read or write data. Close the file. Any file that is opened should be closed when the program is done using it. Technically, all files get closed when the program terminates, but doing it explicitly is good programming style.

Python Programming, 4/e 17 File Processing Outline In order to make sure that necessary actions such as closing a file occur, Python has a powerful feature called a context manager.
    # printfile2.py
    # Prints a file to the screen.
    def main():
        fname = input("Enter a filename: ")
        with open(fname, "r") as infile:
            data = infile.read()
        print(data)

Python Programming, 4/e 18 File Processing Outline The with statement associates the variable with the file object created by open. The file object acts as a context manager for executing the instructions in the indented body of the with. When the body has completed, the file will be closed automatically, even if control leaves the body due to an exception or return statement.
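The claim that the file is closed even when an exception interrupts the body can be verified with a small experiment. This is a sketch under assumptions: the temporary file, its contents, and the deliberately raised RuntimeError are all made up for the demonstration.

```python
import os
import tempfile

# Create a small throwaway file to read (name and contents are illustrative).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    demo_path = tmp.name

try:
    with open(demo_path, "r") as infile:
        first = infile.readline()
        # Simulate something going wrong partway through the body.
        raise RuntimeError("simulated failure inside the with body")
except RuntimeError:
    pass

# The variable still refers to the file object, but the file is closed.
closed_after_error = infile.closed
print(closed_after_error)

os.remove(demo_path)
```

The closed attribute of a file object reports whether it has been closed; here it is checked after the exception has been handled.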

Python Programming, 4/e 19 Reading from a File read is just one of several options that can be used to access the contents of a file.
<file>.read() – Returns the entire remaining contents of the file as a single (potentially large, multi-line) string.
<file>.readline() – Returns the next line of the file, i.e. all text up to and including the newline character.
<file>.readlines() – Returns a list of the remaining lines in the file. Each list item is a string of a single line including the newline character at the end.

Python Programming, 4/e 20 Reading from a File Text files are read sequentially – the system keeps track of what has been read since a file has been opened, so that a later read will pick up where the previous one left off. If you want to read a previous line, you need to close and reopen the file.

Python Programming, 4/e 21 Reading from a File Successive calls to readline() read successive lines from the file. The string returned by readline() normally ends with a newline character; the exceptions are a final line that has no trailing newline and the empty string returned at the end of the file. Use slicing to strip off the newline character at the end of the line, otherwise the output will look double-spaced. Or, you could also tell print to not add its own newline, e.g. print(line, end="").

Python Programming, 4/e 22 Reading from a File
    with open(someFile, "r") as infile:
        for _ in range(5):
            line = infile.readline()
            print(line[:-1])
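To see sequential reading in action, the following sketch mixes readline and readlines on one open file. The temporary file and its four lines are assumptions made for the demonstration; the last line intentionally has no trailing newline.

```python
import os
import tempfile

# Build a small sample file; note there is no newline after "four".
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\nfour")
    demo_path = tmp.name

with open(demo_path, "r") as infile:
    first = infile.readline()    # reads the first line
    rest = infile.readlines()    # picks up where readline left off
    at_end = infile.readline()   # nothing left, so this is the empty string

os.remove(demo_path)
print(repr(first), rest, repr(at_end))
```

Because the final line may lack a newline, line.rstrip("\n") is a safer way to drop it than line[:-1], which would chop off a real character in that case.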

Python Programming, 4/e 23 Reading from a File One way to loop through the entire contents of a file is to read in all of the file using readlines, then loop through the resulting list.
    with open(someFile, "r") as infile:
        for line in infile.readlines():
            # process the line here
What happens if the file is too large to fit in your computer’s memory?

Python Programming, 4/e 24 Reading from a File Python treats a file as a sequence of lines, so looping through the lines can be done directly:
    with open(someFile, "r") as infile:
        for line in infile:
            # process the line here

Python Programming, 4/e 25 Reading from a File Let’s improve our statistics library from last chapter. One disadvantage of the previous version is that getNumbers() gets numbers from the user interactively. What if you are trying to average one hundred numbers and you make a mistake on number 98? Doh! You’d need to start over again.

Python Programming, 4/e 26 Reading from a File A better approach – type all the numbers into a file. We can then edit the data before sending it to the program. This file-oriented approach is typically used for data-processing applications. We can improve the usefulness of our library by adding a getNumbersFromFile function that takes the name of a file as a parameter and returns a list of numbers read from the file.

Python Programming, 4/e 27 Reading from a File Suppose our numbers are in a text file, with each line containing a single number.
    def getNumbersFromFile(fname):
        nums = []
        with open(fname, "r") as infile:
            for line in infile:
                nums.append(float(line))
        return nums

Python Programming, 4/e 28 Reading from a File We could also do this more succinctly with a list comprehension:
    def getNumbersFromFile(fname):
        nums = []
        with open(fname, "r") as infile:
            nums = [float(line) for line in infile]
        return nums

Python Programming, 4/e 29 Reading from a File Using this approach, we need to be very careful with the format of the input file – there must be exactly one number on each line. A common error is to introduce an extra blank line at the bottom that may go unnoticed. This would cause an error such as:
    in <listcomp>
        nums = [float(line) for line in infile]
    ValueError: could not convert string to float: ''

Python Programming, 4/e 30 Reading from a File We could make our function more flexible by having it accept multiple numbers on the same line. A single line can easily be turned into a list of numbers using split in the list comprehension, similar to what we did when we had multiple numbers on a single line of interactive input:
    nums = [float(num) for num in line.split()]

Python Programming, 4/e 31 Reading from a File To get all the numbers across multiple lines, we simply wrap this up in an accumulator loop that processes the lines of the input file:
    def getNumbersFromFile(fname):
        nums = []
        with open(fname, "r") as infile:
            for line in infile:
                newnums = [float(num) for num in line.split()]
                nums.extend(newnums)
        return nums

Python Programming, 4/e 32 Reading from a File Here the accumulator is called nums and the list created from each line is called newnums. The final line in the loop body appends the numbers from the current line to the end of the accumulator using the list extend method introduced in chapter 9. This version of the stats program appears in stats3.py.

Python Programming, 4/e 33 Reading from a File Using this approach has several benefits: It allows you to create a data file with as many numbers on each line as you want. The program will also be more robust by handling accidental blank lines (Do you see how?).
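One way to answer the "Do you see how?" question is to run the split-based parsing on lines that include blanks. In this sketch the helper parse_numbers and the sample list are hypothetical stand-ins for the lines of a data file.

```python
def parse_numbers(lines):
    """Collect every whitespace-separated number from an iterable of lines."""
    nums = []
    for line in lines:
        # split() on a blank or whitespace-only line returns [],
        # so such a line simply contributes nothing.
        nums.extend(float(num) for num in line.split())
    return nums

sample = ["3 4.5 10\n", "\n", "7\n", "   \n"]
result = parse_numbers(sample)
print(result)
```

Calling split() with no arguments discards all surrounding whitespace, including the newline, which is why float never sees an empty string.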

Python Programming, 4/e 34 Writing to a File Opening a file for writing prepares that file to receive data. If no file with the given name exists, a new file will be created. If a file with the given name does exist, Python will delete it and create a new, empty file.
    with open("mydata.out", "w") as outfile:
        # do things with outfile here

Python Programming, 4/e 35 Writing to a File The easiest way to write information into a text file is to use the print function. To do this, simply add an extra keyword parameter that specifies the file:
    print(..., file=<outputfile>)
This behaves exactly like a normal print, except the result is sent to outputfile rather than the screen.

Python Programming, 4/e 36 Writing to a File Here’s a program to create a text file with a haiku about programming:
    # haiku.py
    def main():
        haiku = ["White space and syntax",
                 "Python code flows like water",
                 "Solutions emerge"]
        print("I have a haiku for you.")

Python Programming, 4/e 37 Writing to a File
        fname = input("Enter a file name to receive the haiku: ")
        with open(fname, "w") as haikufile:
            for line in haiku:
                print(line, file=haikufile)
        print(f"Look in {fname} to see your haiku")

Python Programming, 4/e 38 Batch Processing To see how these pieces fit together in a larger example, let’s redo the username generation program from Chapter 8. Our previous version created usernames interactively by having the user type in his or her name. If we were setting up accounts for a large number of users, this process would probably not be done interactively, but in batch mode, where program input and output are done through files.

Python Programming, 4/e 39 Batch Processing Each line of the input file will contain the first and last names of a new user separated by one or more spaces. The program produces an output file containing a line for each generated username.

Python Programming, 4/e 40 Batch Processing
    # userfile.py
    # Program to create a file of usernames in batch mode.
    def main():
        print("This program creates a file of usernames from a")
        print("file of names.")
        # get the file names
        infileName = input("What file are the names in? ")
        outfileName = input("What file should the usernames go in? ")

Python Programming, 4/e 41 Batch Processing
        # open the files
        with open(infileName, "r") as infile, open(outfileName, "w") as outfile:
            # process each line of the input file
            for line in infile:
                # get the first and last names from line
                first, last = line.split()
                # create the username
                uname = (first[0]+last[:7]).lower()
                # write it to the output file
                print(uname, file=outfile)
        print("Usernames have been written to", outfileName)

Python Programming, 4/e 42 Batch Processing A couple of things are worth noticing: Two files are open at the same time, one for input (infile) and one for output (outfile). This is accomplished in the with by including two open(…) as <variable> clauses separated by a comma. It’s not unusual for a program to act on multiple files simultaneously. When creating the username, the lower string method was used to ensure that the username is all lowercase, even if the input names are mixed case.
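The username rule itself (first initial plus up to seven letters of the last name, lowercased) can be tried without any files. In this sketch the function make_username and the sample names are illustrative and not part of the original program.

```python
def make_username(line):
    """Build a username from a line holding a first and a last name."""
    # split() tolerates one or more spaces and the trailing newline.
    first, last = line.split()
    return (first[0] + last[:7]).lower()

names = ["Ada   Lovelace\n", "Grace Hopper\n", "Alan Turing\n"]
usernames = [make_username(line) for line in names]
print(usernames)
```

Note that the two-variable unpacking assumes exactly two names per line; a line with a middle name or a blank line would raise a ValueError, so real input files may need extra checking.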

Python Programming, 4/e 43 File Names and Paths So far in our examples we’ve indicated the file to be opened by supplying the name of the file as a string. Using this approach, files end up in the folder where the programs live. This might be OK for assignments, but in the real world we’d like users to be able to select files from anywhere in secondary memory.

Python Programming, 4/e 44 Absolute and Relative Paths Way back in Chapter 1 we looked at how a computer’s operating system generally organizes secondary memory as a hierarchical collection of directories (also called folders) that can contain files as well as other directories. The directory at the top of this hierarchy is called the root directory. A file is located by specifying a path from the root directory down through the hierarchy of directories.

Python Programming, 4/e 45 Absolute and Relative Paths E.g., the text of this chapter is in a file having the path /home/zelle/Books/cs1book/cs1book4e/textbook/chapter10.tex The top-level directory on Dr. Zelle’s computer is designated with a /. His computer’s root directory contains around 20 subdirectories, including one called home. A slash (/) is also used to separate the directory names along the path.

Python Programming, 4/e 46 Absolute and Relative Paths You can think of the path from the root as representing the “full name” of any given file. The name has to be so complex because a typical computer contains millions of files; there must be a way to uniquely identify each of these files. This complete path to a given directory or file is called the absolute path . Anywhere in Python where a file path is needed, an absolute path can be used.

Python Programming, 4/e 47 Absolute and Relative Paths Working with absolute paths can be a pain! They’re long. Moving a file or folder changes the absolute paths of files and folders! Any path that begins with something other than the root directory is considered a relative path.

Python Programming, 4/e 48 Absolute and Relative Paths When we used just the name of a file in our examples, those names were relative paths. A running program always has an associated working directory, which is the directory that it is currently working in. Typically, this is the directory where your program file is located.

Python Programming, 4/e 49 Absolute and Relative Paths Suppose we have a program data_analyzer.py stored in /home/zelle/python. When this program is run its working directory will be /home/zelle/python.
    path = input("What file should I analyze? ")
    with open(path, "r") as infile:
        # process the file
If the user enters nums.txt, the program will look for /home/zelle/python/nums.txt.

Python Programming, 4/e 50 Absolute and Relative Paths Suppose the user instead enters data/nums.txt. Python will treat this as a path starting at the current working directory: /home/zelle/python/data/nums.txt. The characters “.” and “..” have special meanings for relative paths. “.” indicates the current working directory. “..” indicates the parent of the current working directory. In our previous example, an equivalent path would be ./data/nums.txt
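How relative paths combine with a working directory can be explored without touching the real file system. The sketch below assumes the working directory /home/zelle/python from the example; the helper resolve is hypothetical and uses PurePosixPath and posixpath.normpath so that it behaves the same on any operating system.

```python
import posixpath
from pathlib import PurePosixPath

# Assumed working directory, taken from the data_analyzer.py example.
cwd = PurePosixPath("/home/zelle/python")

def resolve(relative):
    """Join a relative path onto cwd and simplify any '.' and '..' parts."""
    return posixpath.normpath(str(cwd / relative))

for entered in ["nums.txt", "data/nums.txt", "./data/nums.txt", "../data/nums.txt"]:
    print(entered, "->", resolve(entered))
```

The output shows that a leading "./" changes nothing, while a leading "../" moves the starting point up to /home/zelle before descending into data.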

Python Programming, 4/e 51 Absolute and Relative Paths Dr. Zelle’s laptop is running Linux. While the ideas are the same, the details differ among operating systems. On macOS, a user’s home directory is in /Users, e.g. /Users/zelle/data/nums.txt On Windows, the path notation is a little different: C:\Users\zelle\data\nums.txt Each hard drive (C:, D:) has its own file system with its own root directory. Windows uses \ rather than / in paths.

Python Programming, 4/e 52 Absolute and Relative Paths Python always allows paths to be separated using a regular slash (/) on any OS for interoperability. It’s best practice to avoid “\” in Windows paths in Python since the backslash is used in string literals to indicate special characters, e.g. \t and \n. To use an actual backslash in a literal, you’d need to escape it (\\) or prefix the string with r to indicate it is a “raw” string (don’t interpret).

Python Programming, 4/e 53 Absolute and Relative Paths Three ways to open the same file in Windows:
    with open("data/nums.txt") as infile:      # generic Python notation
    with open("data\\nums.txt") as infile:     # Windows notation using an escaped backslash
    with open(r"data\nums.txt") as infile:     # Windows notation using a raw string
The best one? Number one – it will work on other operating systems besides Windows.
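The reason an unescaped backslash is risky becomes clear when the string lengths are compared. This is a small sketch; the variable names are illustrative, and PureWindowsPath is used only to compare notations, so it runs on any operating system.

```python
from pathlib import PureWindowsPath

plain = "data\nums.txt"      # \n is read as a newline, not backslash + n
escaped = "data\\nums.txt"   # backslash escaped by doubling it
raw = r"data\nums.txt"       # raw string: backslashes are left alone

print(len(plain), len(escaped), len(raw))
print(escaped == raw)

# Windows path rules treat / and \ as the same separator.
same_path = PureWindowsPath("data/nums.txt") == PureWindowsPath(raw)
print(same_path)
```

The plain version is one character shorter because the two characters \ and n collapsed into a single newline, which silently produces the wrong file name.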

Python Programming, 4/e 54 Using pathlib Files are a ubiquitous part of the computing landscape, and just about every program has to manipulate them in one way or another. Python provides a library called pathlib to help with some of the common but tedious tasks. The main tool is the Path object. Path is a sort of “wrapper” around a path string that gives it some convenient superpowers.

Python Programming, 4/e 55 Using pathlib Let’s improve our batch-oriented username program so that it checks if the intended output file exists. If it does, create a backup of that file so that the contents aren’t lost when the new usernames are written.

Python Programming, 4/e 56 Using pathlib
    # userfile2.py
    from pathlib import Path

    def main():
        print("This program creates a file of usernames from a")
        print("file of names.")
        # get the file names
        inPath = Path(input("What file are the names in? "))
        outPath = Path(input("What file should the usernames go in? "))

Python Programming, 4/e 57 Using pathlib
        # backup the output file if it already exists
        if outPath.exists():
            backupPath = outPath.with_suffix(".bak")
            print(f"Renaming existing {outPath.name} to {backupPath.name}")
            outPath.rename(backupPath)

Python Programming, 4/e 58 Using pathlib
        # open the files
        with open(inPath, "r") as infile, open(outPath, "w") as outfile:
            # process each line of the input file
            for line in infile:
                # get the first and last names from line
                first, last = line.split()
                # create the username
                uname = (first[0]+last[:7]).lower()
                # write it to the output file
                print(uname, file=outfile)
        print("Usernames have been written to", outPath)

Python Programming, 4/e 59 Using pathlib You can extract different parts of a path using simple attributes from a Path object.
    >>> path = Path("/home/zelle/python/data.txt")
    >>> path.name
    'data.txt'
    >>> path.stem
    'data'
    >>> path.suffix
    '.txt'

Python Programming, 4/e 60 Using pathlib We can create a slightly modified path by using with_<part> methods to replace specific parts in an existing path.
    backupPath = outPath.with_suffix(".bak")
This creates a new Path that is just like outPath, except it has the extension (suffix) “.bak” instead of its original extension. Our program’s output will look something like
    Renaming existing usernames.txt to usernames.bak

Python Programming, 4/e 61 Using pathlib The actual renaming of the file happens with
    outPath.rename(backupPath)
The rename method is one of a number of Path object methods that can be used to make changes in the underlying file system. The necessary commands differ by operating system, but the Path object handles the differences in a transparent way!
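The backup step can be tried safely inside a temporary directory. The sketch below reuses the names outPath and backupPath from the program above; the temporary directory and the file contents are assumptions made for the demonstration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    # Pretend an earlier run already produced usernames.txt.
    outPath = Path(tmpdir) / "usernames.txt"
    outPath.write_text("old contents\n")

    # Same backup logic as in userfile2.py.
    backupPath = outPath.with_suffix(".bak")
    if outPath.exists():
        outPath.rename(backupPath)

    original_gone = not outPath.exists()
    backup_name = backupPath.name
    backup_text = backupPath.read_text()

print(original_gone, backup_name, repr(backup_text))
```

One caveat worth knowing: if a backup file already exists, rename replaces it on Unix-like systems but raises an error on Windows; Path.replace is the method that overwrites the target on both.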

Python Programming, 4/e 62 Iterating over Directories Another task that programs often need to do is to process a whole batch of files at a time. For example, a photo management app might allow the user to load all the images in a given directory. If you have a Path object that points to a directory on your hard disk, there are a couple of methods that allow you to loop over the contents of that directory.

Python Programming, 4/e 63 Iterating over Directories The simplest of these methods is iterdir. It produces a sequence of Path objects, one for each file or directory contained in the original directory.
    >>> path = Path(".")
    >>> for p in path.iterdir():
    ...     print(p)
    ...
    names.txt
    stats3.py
    …

Python Programming, 4/e 64 Iterating over Directories
    >>> list(path.iterdir())
    [PosixPath('names.txt'), PosixPath('test.txt'), PosixPath('stats3.py'),
     PosixPath('data'), PosixPath('nums1.txt'), PosixPath('usernames.bak'),
     PosixPath('nums2.txt'), PosixPath('usernames.txt'), PosixPath('userfile2.py'),
     PosixPath('userfile.py'), PosixPath('haiku.py')]

Python Programming, 4/e 65 Iterating over Directories Notice that each item in the sequence produced by iterdir() is itself a Path object. This means we can make use of the various Path methods on these items. The is_file method returns True if the path is a file (as opposed to a directory).
    files = [p for p in path.iterdir() if p.is_file()]

Python Programming, 4/e 66 Iterating over Directories If we wanted just the Python program files, we could grab just the items that had a .py suffix.
    python_files = [p for p in path.iterdir() if p.suffix == ".py"]
This last example could have been handled more simply using a technique known as file globbing. You can select a subset of files that match a pattern using the glob method:
    path.glob(pattern)

Python Programming, 4/e 67 Iterating over Directories The pattern looks like a regular path string except that it can contain certain “wildcard” characters.
“?” matches any single character
“*” matches any sequence of characters
    python_files = list(path.glob("*.py"))
The glob "*.py" will match any file that ends with .py
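Both wildcards can be tried against a throwaway directory. In this sketch the file names are borrowed from the directory listing shown earlier, and the empty files are created only for the demonstration; the results are sorted because glob does not promise any particular order.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    base = Path(tmpdir)
    # Create empty files to match against.
    for name in ["stats3.py", "haiku.py", "nums1.txt", "nums2.txt", "names.txt"]:
        (base / name).touch()

    # "*" matches any sequence of characters.
    python_files = sorted(p.name for p in base.glob("*.py"))
    # "?" matches exactly one character.
    nums_files = sorted(p.name for p in base.glob("nums?.txt"))

print(python_files)
print(nums_files)
```

Note that "nums?.txt" does not match names.txt: the literal characters in the pattern still have to line up.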

Python Programming, 4/e 68 Iterating over Directories Our last addition was a getNumbersFromFile(path) function that can be used to get a data set from a specific file. Suppose we have a number of data sets, each stored in a separate file in our data directory. It would be handy to have a getNumbersFromFiles function making use of file globbing to accumulate all the data across the set of files.

Python Programming, 4/e 69 Iterating over Directories Let’s write a function with two parameters.
basedir gives the directory containing the data
pattern is a pattern for which files to look in
To get the numbers from all the files in a data directory, we could do
    data = getNumbersFromFiles("data", "*")
To get data from all files having “exam” in the name,
    data = getNumbersFromFiles("data", "*exam*")
To write this you need an accumulator to build a list of all the numbers.

Python Programming, 4/e 70 Iterating over Directories
    def getNumbersFromFiles(basedir, pattern):
        path = Path(basedir)
        nums = []
        for filepath in path.glob(pattern):
            newnums = getNumbersFromFile(filepath)
            nums.extend(newnums)
        return nums

Python Programming, 4/e 71 Iterating over Directories Notice how basedir was turned into a Path object at the start – that ensures that you can call glob in the loop heading. This function will work when basedir is passed as either a string or a Path object.
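Putting the two functions together, a complete run might look like the following sketch. The file names and scores are invented for the demonstration, and the call to sorted is an addition that is not in the slide version: it makes the order of the combined list predictable, since glob may yield files in any order.

```python
import tempfile
from pathlib import Path

def getNumbersFromFile(fname):
    """Read all whitespace-separated numbers from one file."""
    nums = []
    with open(fname, "r") as infile:
        for line in infile:
            nums.extend(float(num) for num in line.split())
    return nums

def getNumbersFromFiles(basedir, pattern):
    """Accumulate the numbers from every file in basedir matching pattern."""
    nums = []
    # sorted() is an assumption added here for a repeatable order.
    for filepath in sorted(Path(basedir).glob(pattern)):
        nums.extend(getNumbersFromFile(filepath))
    return nums

with tempfile.TemporaryDirectory() as tmpdir:
    data = Path(tmpdir)
    (data / "exam1.txt").write_text("90 85\n")
    (data / "exam2.txt").write_text("70\n\n100\n")
    (data / "quiz1.txt").write_text("10\n")

    exam_scores = getNumbersFromFiles(tmpdir, "*exam*")   # basedir as a string
    all_scores = getNumbersFromFiles(data, "*")           # basedir as a Path

print(exam_scores)
print(all_scores)
```

The two calls pass basedir once as a string and once as a Path to show that both forms are accepted.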

Python Programming, 4/e 72 File Dialogs Some operating systems (e.g. Windows and macOS) will, by default, show only the main stem of the filename and not the type suffix, making it hard to know the full filename for performing file operations. This situation is even more complicated when the file exists somewhere other than the current working directory. In order to operate on these far-flung files, we need the complete path to them! Do you know how to find the complete path to an arbitrary file on your computer?

Python Programming, 4/e 73 File Dialogs One solution to this problem is to allow users to browse the file system visually and navigate their way to a particular file or directory. The usual technique incorporates a dialog box that allows a user to click around in the file system and either select or type in the name of a file. Fortunately for us, the tkinter GUI library included with (most) standard Pythons has these kinds of functions!

Python Programming, 4/e 74 File Dialogs To ask the user for the name of a file to open, you can use the askopenfilename function found in the tkinter.filedialog module.
    from tkinter.filedialog import askopenfilename
The reason for the dot notation is that tkinter is a package composed of multiple modules. To get the name of the user names file:
    infileName = askopenfilename()

Python Programming, 4/e 75 File Dialogs

Python Programming, 4/e 76 File Dialogs The dialog box allows the user to either type in the name of the file or to simply select it with the mouse. When the user clicks the “Open” button, the complete path name of the file is returned as a string and saved into the variable infileName. If the user clicks the “Cancel” button, the function will simply return the empty string, "".

Python Programming, 4/e 77 File Dialogs
    from tkinter.filedialog import asksaveasfilename
    ...
    outfileName = asksaveasfilename()
You could, of course, import both at once:
    from tkinter.filedialog import askopenfilename, asksaveasfilename

Python Programming, 4/e 78 File Dialogs

Python Programming, 4/e 79 File Dialogs If you need to get a directory path from the user, there’s also an askdirectory function. All these functions have numerous optional parameters that allow a program to customize the resulting dialogs.

Python Programming, 4/e 80 Binary Files and Pickling Files can store any kind of data, even though we’ve focused on string data so far. Files on disk are really just a sequence of bytes, so arbitrary data can be encoded into the bytes stored in a particular file. You undoubtedly have files on your computer that store images, audio, video, etc.

Python Programming, 4/e 81 Strings and Bytes There is a close correspondence between characters of a string and bytes. Before Unicode, each character in a string was treated as a single byte of data. When a string that contains only characters from the original ASCII alphabet is encoded as bytes, each character is stored as a single byte.

Python Programming, 4/e 82 Strings and Bytes
    >>> s = "Hello, Bytes!"
    >>> b = s.encode()
    >>> type(b)
    <class 'bytes'>
Here, we created a string, s, then encoded it into bytes, storing it into variable b.

Python Programming, 4/e 83 Strings and Bytes A byte is 8 bits, which means there are 256 different byte values. Typically, bytes are stored as unsigned integers in the range 0-255, inclusive.
    >>> b[0]
    72
    >>> b[1]
    101

Python Programming, 4/e 84 Strings and Bytes The first byte of b is 72, because that is the Unicode value of “H”. In other words, it is ord("H").
    >>> len(s)
    13
    >>> len(b)
    13
    >>> b
    b'Hello, Bytes!'

Python Programming, 4/e 85 Strings and Bytes s has 13 characters and b has 13 bytes. The last line shows a string literal prefaced with b (for bytes), which is a compact way of showing the byte sequence, exploiting the standard ASCII mapping of byte values to characters. What if our string contains non-ASCII characters? Let’s concatenate some Unicode characters with values greater than 127 to our string.

Python Programming, 4/e 86 Strings and Bytes
    >>> sx = s + chr(128) + chr(256) + chr(512) + chr(1024)
    >>> bx = sx.encode()
    >>> len(sx)
    17
    >>> len(bx)
    21
    >>> bx
    b'Hello, Bytes!\xc2\x80\xc4\x80\xc8\x80\xd0\x80'

Python Programming, 4/e 87 Strings and Bytes We added four characters, so the length of the string is now 17 (characters). The encoding of the string, though, is now 21 bytes. Each of the non-ASCII characters was encoded into a pair of bytes, which are displayed in hexadecimal (base 16) notation. We can also convert a bytes object back into a string.
    >>> b.decode()
    'Hello, Bytes!'
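The character and byte counts above can be recomputed in a few lines. This sketch relies on the fact that encode() and decode() default to UTF-8; the names sizes and round_trip, and the euro sign example, are additions for illustration.

```python
s = "Hello, Bytes!"
sx = s + chr(128) + chr(256) + chr(512) + chr(1024)
bx = sx.encode()   # UTF-8 is the default encoding

# How many bytes does each of the four added characters need?
sizes = [len(ch.encode()) for ch in sx[-4:]]
print(len(sx), len(bx), sizes)

# Decoding reverses the encoding exactly.
round_trip = bx.decode()
print(round_trip == sx)

# Characters with larger code points can need more than two bytes.
euro_size = len("\u20ac".encode())
print(euro_size)
```

In UTF-8 the number of bytes per character varies, so len of a string and len of its encoding agree only when every character is ASCII.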

Python Programming, 4/e 88 Strings and Bytes In fact, when we work with a text file in Python, this is exactly what’s happening behind the scenes! When reading from a file, Python reads in a sequence of bytes from the file and decodes them into a string. To write to a text file, Python encodes the string as a sequence of bytes and streams the bytes into the file.

Python Programming, 4/e 89 Binary Mode and Pickling Python also allows byte-level access to files. We can read and write data as sequences of bytes rather than strings. Let’s assume the haiku we wrote earlier is stored in the file haiku_out.txt.

Python Programming, 4/e 90 Binary Mode and Pickling
>>> with open("haiku_out.txt", "r") as infile:
        data = infile.read()
        print(data)
White space and syntax
Python code flows like water
Solutions emerge

Python Programming, 4/e 91 Binary Mode and Pickling To treat the file as a sequence of bytes instead of text, we just append a 'b' (for binary) to the mode string when opening the file. Notice the difference in our next interaction! Using the mode 'rb' opens the file for reading in binary mode. Reading the file in this mode gets back a bytes object instead of a string.

Python Programming, 4/e 92 Binary Mode and Pickling
>>> with open("haiku_out.txt", "rb") as infile:
        data = infile.read()
        print(data)
b'White space and syntax\nPython code flows like water\nSolutions emerge\n'

Python Programming, 4/e 93 Binary Mode and Pickling If we want a string back, we must explicitly decode it.
>>> with open("haiku_out.txt", "rb") as infile:
        data = infile.read()
        print(data.decode())
White space and syntax
Python code flows like water
Solutions emerge
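The relationship between the two modes can be checked directly. The sketch below writes a small file and then reads it both ways. The file name demo.txt, the temporary directory, and the sample text are illustrative assumptions; the encoding is given explicitly as UTF-8 and newline translation is turned off so the comparison does not depend on the platform.

```python
import os
import tempfile

text = "caf\u00e9\n"  # sample text with one non-ASCII character (an assumption)

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text-mode write: Python encodes the string into bytes for us.
with open(path, "w", encoding="utf-8", newline="") as outfile:
    outfile.write(text)

# Binary-mode read: we get the raw bytes that are actually in the file.
with open(path, "rb") as infile:
    raw = infile.read()

# Text-mode read: Python decodes the bytes back into a string.
with open(path, "r", encoding="utf-8", newline="") as infile:
    decoded = infile.read()

print(raw)
print(raw.decode("utf-8") == decoded)
```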

Python Programming, 4/e 94 Binary Mode and Pickling We can also open a file for binary writing using the mode 'wb'. To write to a file in this mode, we must write bytes, not strings.
with open("bytes.out", "wb") as outfile:
    outfile.write(b"Hello, Bytes!")
Notice we didn't use print, since print turns its arguments into strings.

Python Programming, 4/e 95 Binary Mode and Pickling To output bytes to a file, use the file method write . The binary mode is really for manipulating non-text data. Doing so requires some sort of binary encoding to represent the data as a raw sequence of bytes. Usually, we can use existing libraries that handle whatever specialized data format we need.
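As one illustration of what a binary encoding looks like, the standard library's struct module packs numbers into a fixed byte layout. This is only a sketch of the idea; the record layout (one integer followed by one float) and the sample values are assumptions chosen for the example.

```python
import struct

# Format string: "<" = little-endian with no padding,
# "i" = 4-byte signed integer, "d" = 8-byte floating-point number.
record_format = "<id"

# Encode an int and a float as a raw sequence of bytes.
packed = struct.pack(record_format, 42, 3.5)
print(len(packed))  # 12

# Decode the same bytes back into Python values.
count, value = struct.unpack(record_format, packed)
print(count, value)  # 42 3.5
```

The resulting bytes object could be written to a file opened in 'wb' mode and read back in 'rb' mode.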

Python Programming, 4/e 96 Binary Mode and Pickling One standard library that’s handy for storing binary data is pickle . The purpose of the library is to preserve your arbitrary Python objects as a sequence of bytes in a file. The process of turning an object into a sequence of bytes is called serialization .

Python Programming, 4/e 97 Binary Mode and Pickling Suppose we have created a data set, such as a list of numbers, and would like to save it so that it can be loaded back up again later. If we quit our program, our list of numbers will be lost unless we somehow write it to a file! We could do this with a text file, e.g. writeNumbersToFile() (left as an exercise for you). But what's the fun of that?

Python Programming, 4/e 98 Binary Mode and Pickling Let's have two functions: one that serializes the list into a binary file and another that reads it back in again.
import pickle

def storeData(nums, path):
    with open(path, "wb") as outfile:
        pickle.dump(nums, outfile)

Python Programming, 4/e 99 Binary Mode and Pickling In this function, nums is the list of numbers that we want to save and path is the path string (or Path object) for the file to save the list into. With the dump method, our list is pickled for storage and later consumption, with no loops or futzing around.

Python Programming, 4/e 100 Binary Mode and Pickling Python uses its own binary format to do the serialization. To load the list back in again will require another use of pickle. The inverse of dump is load . All we need to do is open up the file for reading in binary mode and call pickle.load . Python will read in the bytes and decode them back into whatever was pickled in the first place.

Python Programming, 4/e 101 Binary Mode and Pickling
def loadData(path):
    with open(path, "rb") as infile:
        nums = pickle.load(infile)
    return nums

>>> storeData([3, 1, 4, 1, 5, 9], "test.pkl")
>>> nums = loadData("test.pkl")
>>> nums
[3, 1, 4, 1, 5, 9]

Python Programming, 4/e 102 Binary Mode and Pickling You can use pickle to save the state of a game so that users can pick up where they left off, or your AI application might serialize a trained neural network so that you can distribute it to thousands of users.
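Pickle is not limited to lists of numbers. As a rough sketch of the game-state idea, the example below pickles a nested dictionary. The keys and values in game_state are invented for illustration, and pickle.dumps and pickle.loads are used as the in-memory counterparts of dump and load so that no file is needed.

```python
import pickle

# Hypothetical game state: a dictionary mixing strings, ints, a list, and a tuple.
game_state = {
    "player": "Ada",
    "level": 3,
    "inventory": ["key", "map"],
    "position": (10, 4),
}

# Serialize the whole structure into one bytes object.
blob = pickle.dumps(game_state)
print(type(blob))

# Deserialize it into a new, equal object.
restored = pickle.loads(blob)
print(restored == game_state)  # True
print(restored is game_state)  # False: it is a copy, not the same object
```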

Python Programming, 4/e 103 Binary Mode and Pickling But there are some downsides: The resulting file is binary, so it is not in a human-readable format. In many cases (like configuration files) it is a better idea to keep the data human readable. While pickle works for lots of objects, including all of Python's built-in types, it won't work for all object types. Loading a pickle file can cause the execution of arbitrary (and potentially nefarious) Python code. Never load a pickle from an untrusted source!
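Two of these downsides are easy to demonstrate. The sketch below uses the standard json module as one possible human-readable alternative (the slides do not name a specific one), with a made-up config dictionary, and then shows a generator as an example of an object that pickle refuses to serialize.

```python
import json
import pickle

# Hypothetical configuration data, stored as readable text instead of a pickle.
config = {"title": "Stats", "precision": 2, "files": ["nums.txt", "nums2.txt"]}
text = json.dumps(config, indent=2)
print(text)  # plain text that can be inspected or edited by hand
print(json.loads(text) == config)

# A generator is one kind of object that pickle cannot handle.
gen = (x * x for x in range(3))
try:
    pickle.dumps(gen)
    pickled_ok = True
except (TypeError, pickle.PicklingError) as err:
    print("cannot pickle:", err)
    pickled_ok = False
```

Unlike pickle, loading JSON only rebuilds basic data such as dictionaries, lists, strings, and numbers, so it does not run code stored in the file.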

Python Programming, 4/e 104 Remote Files A lot of the data that our programs might use is not stored on the local computer, but is accessed over the Internet. Sometimes this is referred to as storing data "in the cloud." The supporting web site for this textbook has all the code and data files from the book. You can locate those files by typing their Uniform Resource Locator (URL) into your favorite browser.

Python Programming, 4/e 105 Remote Files https://mcsp.wartburg.edu/zelle/python/ppics4/code/chapter10/nums2.txt Assuming you have an Internet connection, this will direct your OS to send a request to another computer asking for the specified data. You’ll notice that this looks like a path…

Python Programming, 4/e 106 Remote Files You could use your browser to save this data to your computer, but wouldn’t it be more convenient if we had a program fetch the data directly off the web for us? Let’s add one more data fetching function to our statistics library. Python provides a function that allows us to open a remote file in a fashion analogous to opening a file on the local computer.

Python Programming, 4/e 107 Remote Files
from urllib.request import urlopen

def getNumbersFromURL(url):
    nums = []
    with urlopen(url) as infile:
        for line in infile:
            line = line.decode()
            newnums = [float(x) for x in line.split()]
            nums.extend(newnums)
    return nums

Python Programming, 4/e 108 Remote Files There are really only two slight changes from getNumbersFromFile. Instead of using the standard open function, it uses urlopen, which is imported from the module urllib.request. The urlopen function sends out a network request for the given URL and provides a file-like object from which we can read the data coming back over the network. This object acts like a file that has been opened in 'rb' mode since the URL may not point to textual data.

Python Programming, 4/e 109 Remote Files After opening the URL, we loop over the resulting data line by line. Since this is binary data, each line is initially a bytes object. The first line in the loop body decodes it into a string so that we can then turn the string into a list of numbers, newnums, and accumulate those numbers into the complete list, nums.
>>> data = getNumbersFromURL("https://mcsp.wartburg.edu/zelle/python ... ")
>>> data
[26.0, 53.0, 5.0, 89.0, 79.0, 32.0, 38.0, 46.0]
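The decode-then-split loop can be tried without a network connection. In this sketch, an io.BytesIO object stands in for the file-like object that urlopen returns, since both yield bytes lines when iterated. The function name numbers_from_byte_lines and the sample bytes (including how the numbers are split across lines) are assumptions made for the example, not the actual contents of the remote file.

```python
import io

def numbers_from_byte_lines(infile):
    """Collect floats from a binary file-like object, one bytes line at a time."""
    nums = []
    for line in infile:
        line = line.decode()  # bytes -> str, as in the loop body above
        nums.extend(float(x) for x in line.split())
    return nums

# Stand-in for a network response; the line layout here is invented.
fake_response = io.BytesIO(b"26 53 5\n89 79\n32 38 46\n")
data = numbers_from_byte_lines(fake_response)
print(data)  # [26.0, 53.0, 5.0, 89.0, 79.0, 32.0, 38.0, 46.0]
```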