# 19.7. Finding the Pollution for a State

This section uses the same data file that we have been using.

To get the average for particular states, we need to be able to identify which state a record belongs to. Currently that is hard because the string that has the state also has the city: "Pocatello,ID". We need to separate the two. Fortunately, we can easily use split to do so.

We have already been using split to chop up the entire line of data into a list of the three values it contains - turning "Pocatello,ID:15:9" into ["Pocatello,ID", "15", "9"]. In the same way, we can split "Pocatello,ID" at the “,” to turn it into a list like ["Pocatello", "ID"] where the first value is the city and the second the state.
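As a minimal sketch of the two splits described above, the snippet below works on the single example record from the text rather than on the full data file. The variable names city and state are added here for illustration.

```python
# One record in the format described above
line = "Pocatello,ID:15:9"

# First split: break the line into its three values at each ":"
values = line.split(":")
print(values)

# Second split: break the city/state string at the ","
cityState = values[0].split(",")
city = cityState[0]    # the part before the comma
state = cityState[1]   # the part after the comma
print(city)
print(state)
```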

Step 6 in the sample below is where we do the key work. We grab values[0], which is the city/state string, make a list out of it using split, and call that list cityState. We can then use cityState[0] to get the city name and cityState[1] to get the state.

Activity: CodeLens 19.7.1 (csppythondata_findpollstate1)

Now that we have the state isolated, we can use it to find just the records with the state code we want. Let’s try that and look for records from Oregon. We will loop through all of the records, split each line into values, split the city/state string into a list, and then test the state value against the state code “OR”, printing only the records that have that code.
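The loop just described might look roughly like the sketch below. To keep it self-contained, it loops over a short list named sampleLines instead of reading the data file. Apart from the Pocatello record, the records and their numbers are made up for illustration and are not taken from the real data.

```python
# Illustrative records only: the numbers (and the short city names) are
# invented for this sketch and do not come from the real data file.
sampleLines = [
    "Pocatello,ID:15:9",
    "Eugene,OR:20:10",
    "Portland,OR-WA:25:12",
    "Medford,OR:30:14",
]

targetState = "OR"
matches = []   # keep the matching records so we can inspect them later

for line in sampleLines:
    values = line.split(":")            # ["City,ST", number, number]
    cityState = values[0].split(",")    # ["City", "ST"]
    state = cityState[1]
    if state == targetState:            # exact match on the state code
        matches.append(line)
        print(line)
```

With these sample records, the loop prints the Eugene and Medford lines.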

It seems to work. But if you look closely at the records it produces, there is nothing for Portland, OR. Looking at the data file, we can see why - Portland is listed as part of a metro area that extends into Washington, so its state code is listed as “OR-WA”. For this program to work correctly, we need to accept any state code that has “OR” anywhere in it. That is an easy fix: we just need to change the == operator to the in operator, which checks whether targetState appears anywhere in the state code from the line we are working with. Try doing that and make sure Portland appears in the output.
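To see the difference between the two operators, here is a small sketch that first compares them on the “OR-WA” code and then runs the filtering loop with in. As before, sampleLines is a hypothetical stand-in for the data file, and the numbers in it are invented.

```python
targetState = "OR"

# == only matches when the two strings are identical
exactMatch = (targetState == "OR-WA")
# in matches when targetState appears anywhere inside the other string
containsMatch = (targetState in "OR-WA")
print(exactMatch)
print(containsMatch)

# Illustrative records only: values are invented for this sketch.
sampleLines = [
    "Pocatello,ID:15:9",
    "Eugene,OR:20:10",
    "Portland,OR-WA:25:12",
    "Medford,OR:30:14",
]

matches = []
for line in sampleLines:
    values = line.split(":")
    cityState = values[0].split(",")
    state = cityState[1]
    if targetState in state:    # now "OR-WA" is accepted as well as "OR"
        matches.append(line)
        print(line)
```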

Now, we can merge our average logic into that code. We will only count records that are in the target state.
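One way the merged program could look is sketched below. It makes two assumptions that are not confirmed by the text: that the number to average is the second field of each record (values[1]), and that a short made-up list, sampleLines, can stand in for the data file. Adjust the index if you are averaging the other pollution value.

```python
# Illustrative records only: values are invented for this sketch.
sampleLines = [
    "Pocatello,ID:15:9",
    "Eugene,OR:20:10",
    "Portland,OR-WA:25:12",
    "Medford,OR:30:14",
]

targetState = "OR"
total = 0
count = 0

for line in sampleLines:
    values = line.split(":")
    cityState = values[0].split(",")
    state = cityState[1]
    # Only count records that are in the target state
    if targetState in state:
        # Assumption: values[1] holds the reading we want to average
        total = total + float(values[1])
        count = count + 1

# Guard against dividing by zero when no records match
if count > 0:
    average = total / count
    print("Average for", targetState, "is", average)
else:
    print("No records found for", targetState)
```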

The following program finds the highest PM10 value in a particular state. Arrange and indent the blocks so it works correctly.

We only want to check the PM10 values for cities that are located in the target state. Make sure to do the state check before worrying about checking the