5.3. Filtering the data

Let’s start by looking only at films that cost over a million dollars to make.

Create a variable called budget_df that contains all columns for the movies whose budget was over a million dollars.

budget_df = []  # placeholder: replace [] with your filtering expression
budget_df.shape

Q-1: How many movies have a budget over 1 million dollars?
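As a hedged illustration of the kind of filtering this exercise calls for, here is a minimal sketch of boolean-mask selection on a small made-up table. The DataFrame toy_df, its rows, and the column names 'title' and 'budget' are assumptions for illustration only; the real dataset may use different names.

```python
import pandas as pd

# Made-up data; the column names 'title' and 'budget' are assumptions.
toy_df = pd.DataFrame({
    "title": ["Tiny Indie", "Mid Feature", "Big Blockbuster"],
    "budget": [250_000, 1_000_000, 150_000_000],
})

# Boolean mask: True for rows whose budget is strictly greater than one million.
mask = toy_df["budget"] > 1_000_000

# Indexing with the mask keeps every column but only the rows marked True.
toy_budget_df = toy_df[mask]

print(toy_budget_df.shape)  # (number of rows, number of columns)
```

Note that a strict comparison (>) leaves out a film whose budget is exactly one million dollars, which matches the wording "over a million dollars".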

With this more manageable list of 7000+ movies, I’d like to have a way to look up the budget of a particular movie.

Create a Series object called budget_lookup such that you are able to use a call to budget_lookup['Dead Presidents'] to find the budget of that movie.

budget_lookup = []  # placeholder: replace [] with a Series indexed by movie title
budget_lookup['Dead Presidents']

Q-2: What was the budget for Dead Presidents?
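One possible way to build a lookup of this kind, sketched on made-up data, is to move the title column into the index and then select the budget column. The names toy_df and toy_lookup, and the column names 'title' and 'budget', are assumptions for illustration and are not taken from the lesson's dataset.

```python
import pandas as pd

# Made-up data; the column names 'title' and 'budget' are assumptions.
toy_df = pd.DataFrame({
    "title": ["Alpha", "Bravo", "Charlie"],
    "budget": [2_000_000, 5_000_000, 10_000_000],
})

# set_index makes the titles the row labels; selecting one column
# of the result gives a Series whose index is the titles.
toy_lookup = toy_df.set_index("title")["budget"]

# A label lookup now returns the budget for that title.
print(toy_lookup["Bravo"])
```

If two movies share a title, a lookup by that label returns a Series with both entries rather than a single number.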

I have figured out that, alphabetically, the first movie whose title starts with an ‘A’ is ‘A Bag of Hammers’ and the last movie whose title starts with a ‘B’ is ‘Byzantium’.

budget_lookup[budget_lookup.index.str.startswith('A')].sort_index().iloc[[0]]
title
A Bag of Hammers    2000000
dtype: int64
budget_lookup[budget_lookup.index.str.startswith('B')].sort_index().iloc[[-1]]
title
Byzantium    10000000
dtype: int64

Use that knowledge to create a series that contains budget information for all the movies that start with an ‘A’ or a ‘B’.

HINT: There is no need to use startswith as I did above; just use the movie titles to do a slice.

budget_lookup_as_and_bs = []  # placeholder: replace [] with a slice of budget_lookup
budget_lookup_as_and_bs.shape

Q-3: How many movies have a budget of over a million dollars and a title that starts with an ‘A’ or a ‘B’?
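The hint relies on label-based slicing, which can be sketched on a made-up Series as follows. The titles and budgets below are invented for illustration, and the name toy_lookup is an assumption. The sketch sorts the index first so that the slice follows alphabetical order.

```python
import pandas as pd

# Made-up Series of budgets indexed by title (not the lesson's data).
toy_lookup = pd.Series(
    [3_000_000, 2_000_000, 8_000_000, 4_000_000, 6_000_000],
    index=pd.Index(
        ["Brazil", "Alien", "Casablanca", "Amadeus", "Babe"], name="title"
    ),
)

# Sort the labels so that a slice covers an alphabetical range.
sorted_lookup = toy_lookup.sort_index()

# Label slices include BOTH endpoints, unlike positional slices.
a_and_b = sorted_lookup["Alien":"Brazil"]

print(a_and_b.shape)
```

Because both endpoint labels are included, slicing from the first ‘A’ title to the last ‘B’ title captures every title in between without any call to startswith.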
