5.5. Dealing with multiple DataFrames

Forget about budget or runtimes as criteria for selecting a movie, let’s take a look at popular opinion. Our dataset has two relevant columns: vote_average and vote_count.

Let’s create a variable called df_high_rated that only contains movies that have received more than 20 votes and whose average score is greater than 8.

df_highly_voted = []
df_high_rated = []
df_high_rated[['title', 'vote_average', 'vote_count']]

Here we have some high-quality movies, at least according to some people.

Q-1: How many highly-rated movies are in this dataset?

But what about my opinion?

Here are my favorite movies and their relative scores. Create a DataFrame called compare_votes that contains the title as an index and both the vote_average and my_vote as its columns. Also only keep the movies that are both my favorites and popular favorites.

HINT: You’ll need to create two Series, one for my ratings and one that maps titles to vote_average.

    "Star Wars": 9,
    "Paris is Burning": 8,
    "Dead Poets Society": 7,
    "The Empire Strikes Back": 9.5,
    "The Shining": 8,
    "Return of the Jedi": 8,
    "1941": 8,
    "Forrest Gump": 7.5,

There should be only 6 movies remaining.

Now add a column to compare_votes that measures the percentage difference between my rating and the popular rating for each movie. You’ll need to take the different between the vote_average and my_vote and divide it by my_vote.


Q-2: What’s the percentage difference between my rating for Star Wars and its popular rating?

Q-3: Make up 3 questions you would like to answer about this movie data using the techniques you have learned in this lesson and write them in the box.

Q-4: Summarize the answers to your questions here.

Lesson Feedback

    During this lesson I was primarily in my...
  • Comfort Zone
  • Learning Zone
  • Panic Zone
    Completing this lesson took...
  • Very little time
  • A reasonable amount of time
  • More time than is reasonable
    Based on my own interests and needs, the things taught in this lesson...
  • Don't seem worth learning
  • May be worth learning
  • Are definitely worth learning
    For me to master the things taught in this lesson feels...
  • Definitely within reach
  • Within reach if I try my hardest
  • Out of reach no matter how hard I try
You have attempted of activities on this page
Next Section - 6. CIA World Factbook Data