7.2. Finding the Cheaters

In this lesson you are going to do a simplified version of the analysis outlined in The Tennis Racket article. I have prepared an anonymized data file for you that contains a numeric identifier instead of a name, along with the starting odds and the ending odds of a number of tennis matches. Your goal is to identify the cheaters. You can get the anonymized data from puzzle_anon.csv.
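One possible starting point is to measure how far the odds moved in each match and then count, per identifier, how often the move was unusually large. The sketch below is only an illustration under stated assumptions: the column names `player_id`, `start_odds`, and `end_odds`, the toy rows, and the `THRESHOLD` cutoff are all hypothetical and are not taken from puzzle_anon.csv.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for puzzle_anon.csv. The column names and values are
# assumptions for illustration, not the contents of the real file.
TOY_CSV = """player_id,start_odds,end_odds
1,2.0,2.1
1,1.8,1.7
2,2.0,4.0
2,3.0,6.5
2,1.5,3.2
3,2.2,2.2
3,1.9,2.0
"""

THRESHOLD = 0.5  # hypothetical cutoff for a "large" relative odds move


def suspicious_share(csv_file, threshold=THRESHOLD):
    """Return {player_id: share of matches whose odds moved more than threshold}."""
    total = defaultdict(int)
    flagged = defaultdict(int)
    for row in csv.DictReader(csv_file):
        start = float(row["start_odds"])
        end = float(row["end_odds"])
        # Relative change between the starting and ending odds.
        move = abs(end - start) / start
        # Identifiers stay as strings so long numeric ids are never rounded.
        total[row["player_id"]] += 1
        if move > threshold:
            flagged[row["player_id"]] += 1
    return {pid: flagged[pid] / total[pid] for pid in total}


shares = suspicious_share(io.StringIO(TOY_CSV))
# List identifiers from most to least suspicious.
for pid, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(pid, round(share, 2))
```

To try this on the real data, you could open puzzle_anon.csv with `open(..., newline="")` and pass the file object to `suspicious_share`, after changing the column names to match the file's header. What counts as a suspicious move is a judgment call that you should settle by looking at the distribution of odds changes.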

    Q-1: Check the ids of all of the cheaters here.

  • 7002589994262270000: Cheater number 1
  • 2416068425895370000: Cheater number 2
  • 1547483661413490000: Cheater number 3
  • 6228119144908420000: Cheater number 4
  • 1718561694846000000: Cheater number 5
  • 4643766977283540000: You accuse an innocent person
  • 1693568023468290000: You accuse an innocent person
  • All are cheaters: No, not everyone is a cheater
  • All are honest: No, not everyone is honest

Now that you have identified the cheaters, can you match them with their real names? Here is a dataset that contains their names: puzzle_real.csv.
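One way to approach the matching is to look for rows in the two files that agree on everything except the identifier. The sketch below assumes, without confirmation from the lesson, that both files carry the same starting and ending odds for each match and that the values are formatted identically. The column names, the toy rows, and the helper `match_ids_to_names` are hypothetical.

```python
import csv
import io

# Made-up stand-ins for puzzle_anon.csv and puzzle_real.csv. Column names
# and rows are assumptions for illustration only.
ANON_CSV = """player_id,start_odds,end_odds
101,2.0,4.0
102,1.8,1.7
"""

REAL_CSV = """name,start_odds,end_odds
alice,1.8,1.7
carol,2.0,4.0
"""


def match_ids_to_names(anon_file, real_file):
    """Return {player_id: set of names} for rows that share the same odds pair."""
    # Index the named rows by their (start, end) odds pair.
    by_odds = {}
    for row in csv.DictReader(real_file):
        by_odds[(row["start_odds"], row["end_odds"])] = row["name"]

    # Look up each anonymized row in that index.
    matches = {}
    for row in csv.DictReader(anon_file):
        key = (row["start_odds"], row["end_odds"])
        if key in by_odds:
            matches.setdefault(row["player_id"], set()).add(by_odds[key])
    return matches


matched = match_ids_to_names(io.StringIO(ANON_CSV), io.StringIO(REAL_CSV))
print(matched)
```

If one identifier ends up with more than one candidate name, the chosen columns are not distinctive enough on their own, and you would need more matches or more columns to settle the question.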

    Q-2: Match the numeric identifiers from the first part of the project with the real names. Please keep going with your analysis.
  • bob
  • 7002589994262270000
  • jane
  • 2416068425895370000
  • john
  • 1547483661413490000
  • sally
  • 6228119144908420000
  • sue
  • 1718561694846000000
  • don
  • 4643766977283540000
  • hill
  • 1693568023468290000

Q-3: What are three main points that you take away from the Tennis Racket article?

Q-4: What are three main points that you take away from the unmasking article? What ethical considerations are important to you when considering de-anonymizing some other data set?
