5.9. Iterating over lines in a file
We will now use a file as input in a program that will do some data processing. In the program, we will
examine each line of the file and print it out to the console with some additional text. Because readlines()
returns a list of lines of text, we can use the for loop to iterate through each line of the file.
A line of a file is defined to be a sequence of characters up to and including a special character called
the newline character. If you evaluate a string that contains a newline character, you will see the character
represented as \n. If you print a string that contains a newline, you will not see the \n; you will just
see its effect (a line break). To see this in action run the following code:
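The book's interactive example is not reproduced here, so the following is only a minimal sketch of the idea. The variable names text and newline_count and the sample sentences are assumptions made for illustration.

```python
# A string containing one newline character between two sentences.
text = "This is the first line.\nThis is the second line."

# repr() shows the string the way the interpreter evaluates it,
# so the newline is visible as the two characters \n.
print(repr(text))

# print() shows the effect of the newline: the text appears on two lines.
print(text)

# The newline is a single character stored inside the string.
newline_count = text.count("\n")
print(newline_count)
```

The first print call displays the \n, while the second displays its effect.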
The \n uses the backslash character as an ‘escape’ character. The backslash tells the Python interpreter that the
next character is not a normal part of the string, but means something special. In this case, the n after the \ tells
the Python interpreter that a new line is needed. When you open a .txt file and there are separate lines of text, there is
essentially an invisible \n at the end of each line.
To read a file line by line, the readlines() method detects the \n
and uses that to separate the text into separate strings. Each string keeps the \n at its end.
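As a rough illustration of how readlines() separates the text, the sketch below writes a small file and reads it back. The file name sample.txt, the temporary directory and the three lines of content are assumptions chosen only so the example can run on its own.

```python
import os
import tempfile

# Assumed sample file, created in a temporary directory for this sketch only.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as outfile:
    outfile.write("first line\nsecond line\nthird line\n")

# readlines() returns one string per line of the file.
# Each string still ends with the \n that marked the end of its line.
with open(path, "r") as infile:
    lines = infile.readlines()

print(lines)
print(len(lines))
```

Printing the list, rather than each string, makes the \n at the end of every item visible.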
As the for loop iterates through each line of the file, the loop variable will contain the current line of the
file as a string of characters. The general pattern for processing each line of a text file is as follows:
for line in myFile.readlines():
    statement1
    statement2
    ...
To process all of our Olympics data, we will use a for loop to iterate over the lines of the file. Using
the split method, we can break each line into a list containing all the fields of interest about the
athlete. We can then take the values corresponding to name, team and event to
construct a simple sentence. Note that in the example below, the code on line 5 uses the split() method on the string aline, which
breaks the string into a series of smaller strings, and stores the smaller strings in a list. The code on line 6 gets
specific items out of the list and prints them out to the console. We’ll cover the split() method in detail in Chapter 10.
The important thing to see here is that the instructions inside the for loop execute, operating on each line of the file in turn.
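The sketch below shows one way the loop described above might look. It is not the book's own listing, so the line numbers cited above will not match it. The file name olympics_sample.txt, the temporary directory, and the names values and sentences are assumptions; the rows are copied from the data shown in this section, and skipping the header row is a choice made here.

```python
import os
import tempfile

# Sample data: the header and three rows copied from the Olympics data.
sample_rows = (
    "Name,Sex,Age,Team,Event,Medal\n"
    "A Dijiang,M,24,China,Basketball,NA\n"
    "A Lamusi,M,23,China,Judo,NA\n"
    "Edgar Lindenau Aabye,M,34,Denmark/Sweden,Tug-Of-War,Gold\n"
)

# Assumed file name and location, used only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "olympics_sample.txt")
with open(path, "w") as outfile:
    outfile.write(sample_rows)

sentences = []
olympicsfile = open(path, "r")
# readlines() returns a list, so slicing with [1:] skips the header row.
for aline in olympicsfile.readlines()[1:]:
    # strip() removes the trailing \n; split(",") breaks the line into fields.
    values = aline.strip().split(",")
    # Fields 0, 3 and 4 hold the name, team and event.
    sentence = values[0] + " is from " + values[3] + " and is on the roster for " + values[4]
    sentences.append(sentence)
    print(sentence)
olympicsfile.close()
```

Collecting the sentences in a list is not required for printing; it is done here only so the result can be inspected after the loop finishes.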
To make the code a little simpler, and to allow for more efficient processing, Python provides a built-in way to iterate through the contents of a file one line at a time, without first reading them all into a list. Some students find this confusing initially, so we don’t recommend doing it this way until you are a little more comfortable with Python. But this idiom is preferred by Python programmers, so you should be prepared to read it. And when you start dealing with big files, you may notice the efficiency gains of using it.
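A minimal sketch of this idiom follows, assuming the same kind of comma-separated data as the rest of this section. The file name olympics_sample.txt, the temporary directory and the names header, values and sentences are assumptions for illustration, and reading the header with readline() before the loop is a choice made here.

```python
import os
import tempfile

# Sample data: the header and three rows copied from the Olympics data.
sample_rows = (
    "Name,Sex,Age,Team,Event,Medal\n"
    "A Dijiang,M,24,China,Basketball,NA\n"
    "A Lamusi,M,23,China,Judo,NA\n"
    "Edgar Lindenau Aabye,M,34,Denmark/Sweden,Tug-Of-War,Gold\n"
)

# Assumed file name and location, used only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "olympics_sample.txt")
with open(path, "w") as outfile:
    outfile.write(sample_rows)

sentences = []
with open(path, "r") as olympicsfile:
    # Read the header line first so the loop starts at the first athlete.
    header = olympicsfile.readline()
    # Looping over the file object yields one line at a time,
    # without building a list of every line first.
    for aline in olympicsfile:
        values = aline.strip().split(",")
        sentences.append(values[0] + " is from " + values[3])

for sentence in sentences:
    print(sentence)
```

The only change from the readlines() pattern is the loop header: the file object itself takes the place of the list of lines.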
Name,Sex,Age,Team,Event,Medal
A Dijiang,M,24,China,Basketball,NA
A Lamusi,M,23,China,Judo,NA
Gunnar Nielsen Aaby,M,24,Denmark,Football,NA
Edgar Lindenau Aabye,M,34,Denmark/Sweden,Tug-Of-War,Gold
Christine Jacoba Aaftink,F,21,Netherlands,Speed Skating,NA
Christine Jacoba Aaftink,F,25,Netherlands,Speed Skating,NA
Christine Jacoba Aaftink,F,25,Netherlands,Speed Skating,NA
Christine Jacoba Aaftink,F,27,Netherlands,Speed Skating,NA
Per Knut Aaland,M,31,United States,Cross Country Skiing,NA
Per Knut Aaland,M,33,United States,Cross Country Skiing,NA
John Aalberg,M,31,United States,Cross Country Skiing,NA
John Aalberg,M,33,United States,Cross Country Skiing,NA
"Cornelia ""Cor"" Aalten (-Strannood)",F,18,Netherlands,Athletics,NA
"Cornelia ""Cor"" Aalten (-Strannood)",F,18,Netherlands,Athletics,NA
Antti Sami Aalto,M,26,Finland,Ice Hockey,NA
"Einar Ferdinand ""Einari"" Aalto",M,26,Finland,Swimming,NA
Jorma Ilmari Aalto,M,22,Finland,Cross Country Skiing,NA
Jyri Tapani Aalto,M,31,Finland,Badminton,NA
Minna Maarit Aalto,F,30,Finland,Sailing,NA
Minna Maarit Aalto,F,34,Finland,Sailing,NA
Pirjo Hannele Aalto (Mattila-),F,32,Finland,Biathlon,NA
Timo Antero Aaltonen,M,31,Finland,Athletics,NA
Win Valdemar Aaltonen,M,54,Finland,Art Competitions,NA
Check your Understanding
Sad upset blue down melancholy somber bitter troubled
Angry mad enraged irate irritable wrathful outraged infuriated
Happy cheerful content elated joyous delighted lively glad
Confused disoriented puzzled perplexed dazed befuddled
Excited eager thrilled delighted
Scared afraid fearful panicked terrified petrified startled
Nervous anxious jittery jumpy tense uneasy apprehensive
Write code to find out how many lines are in the file emotion_words.txt as shown above. Save this value
to the variable num_lines. Do not use the len function.