5.9. Iterating over lines in a file

We will now use a file as input in a program that will do some data processing. In the program, we will examine each line of the file and print it out to the console with some additional text. Because readlines() returns a list of lines of text, we can use the for loop to iterate through each line of the file.

A line of a file is defined to be a sequence of characters up to and including a special character called the newline character. If you evaluate a string that contains a newline character, you will see the character represented as \n. If you print a string that contains a newline, you will not see the \n; you will just see its effect (the text after it starts on a new line). To see this in action, run the following code:
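A minimal sketch of the difference, using a made-up string (the variable name greeting is only for illustration):

```python
# A made-up string containing one newline character.
greeting = "Hello\nWorld"

# repr() gives the form you see when you evaluate the string: the \n is visible.
print(repr(greeting))

# print() shows the effect of the newline instead: the text appears on two lines.
print(greeting)

# The newline counts as a single character: 5 + 1 + 5.
print(len(greeting))
```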

The \n uses the backslash character as an ‘escape’ character. The backslash tells the Python interpreter that the next character is not a normal part of the string, but means something special. In this case, the n after the \ tells the Python interpreter that a new line is needed. When you open a .txt file and there are separate lines of text, there is essentially an invisible \n at the end of each line.

When reading a file line by line, the readlines() method detects each \n and uses it to separate the text into separate strings. As the for loop iterates through each line of the file, the loop variable will contain the current line of the file as a string of characters. The general pattern for processing each line of a text file is as follows:

for line in myFile.readlines():
    ...  # statements that process line go here
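To see what the loop variable holds, the sketch below first writes a small sample file so that it can run on its own. The file name sample_lines.txt and its contents are made up for illustration. Each string returned by readlines() still ends with its \n, which is why the example strips it before printing.

```python
import os
import tempfile

# Create a small, made-up file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample_lines.txt")
out_file = open(path, "w")
out_file.write("first line\nsecond line\nthird line\n")
out_file.close()

myFile = open(path, "r")
lines = myFile.readlines()   # a list with one string per line
myFile.close()

print(lines)                 # every string still ends with "\n"

for line in lines:
    # rstrip("\n") removes the trailing newline so print() does not add a blank line
    print("Line:", line.rstrip("\n"))
```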

To process all of our Olympics data, we will use a for loop to iterate over the lines of the file. Using the split method, we can break each line into a list containing all the fields of interest about the athlete. We can then take the values corresponding to name, team and event to construct a simple sentence. Note that in such a loop, calling the split() method on the string aline breaks the string into a series of smaller strings and stores them in a list. Specific items can then be taken out of the list and printed to the console. We’ll cover the split() method in detail in Chapter 10. The important thing to see here is that the instructions inside the for loop execute, operating on each line of the file in turn.
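A minimal sketch of this pattern follows. The real Olympics data file is not reproduced here, so the example writes two made-up rows to a temporary file and assumes each row is comma-separated with the fields in the order name, team, event. The actual file's column layout may differ, in which case the index values would need to change.

```python
import os
import tempfile

# Hypothetical stand-in for the Olympics data: made-up rows, assumed column order.
path = os.path.join(tempfile.mkdtemp(), "olympics_sample.txt")
sample = open(path, "w")
sample.write("Athlete One,Team A,Swimming\n")
sample.write("Athlete Two,Team B,Fencing\n")
sample.close()

sentences = []
olympicsfile = open(path, "r")
for aline in olympicsfile.readlines():
    # strip() drops the trailing newline; split(",") breaks the line into fields
    values = aline.strip().split(",")
    # pick out name, team and event by position and build a sentence
    sentence = values[0] + " is from " + values[1] + " and competes in " + values[2]
    sentences.append(sentence)
    print(sentence)
olympicsfile.close()
```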

To make the code a little simpler, and to allow for more efficient processing, Python provides a built-in way to iterate through the contents of a file one line at a time, without first reading them all into a list. Some students find this confusing initially, so we don’t recommend doing it this way until you are a little more comfortable with Python. This idiom is preferred by Python programmers, however, so you should be prepared to read it. And when you start dealing with big files, you may notice the efficiency gains of using it.
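A sketch of that idiom, again using a made-up temporary file so the example can run on its own: the file object itself appears in the for statement, and no call to readlines() is needed.

```python
import os
import tempfile

# Create a small, made-up file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample_lines.txt")
out_file = open(path, "w")
out_file.write("alpha\nbeta\ngamma\n")
out_file.close()

seen = []
myFile = open(path, "r")
for line in myFile:          # the file object yields one line at a time
    seen.append(line)        # each line still ends with "\n"
    print(line.rstrip("\n"))
myFile.close()
```

Because the lines are produced one at a time, the whole file never has to be held in a list, which is where the efficiency gain for large files comes from.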

Check your Understanding

Sad upset blue down melancholy somber bitter troubled
Angry mad enraged irate irritable wrathful outraged infuriated
Happy cheerful content elated joyous delighted lively glad
Confused disoriented puzzled perplexed dazed befuddled
Excited eager thrilled delighted
Scared afraid fearful panicked terrified petrified startled
Nervous anxious jittery jumpy tense uneasy apprehensive
  1. Write code to find out how many lines are in the file emotion_words.txt as shown above. Save this value to the variable num_lines. Do not use the len function.
