# 7.2. Finding the Cheaters

In this lesson you are going to do a simplified version of the analysis outlined in The Tennis Racket article. I have prepared an anonymized data file for you that contains a numeric identifier instead of a name, along with the starting odds and the ending odds of a number of tennis matches. Your goal is to identify the cheaters. You can get the anonymized data from puzzle_anon.csv.
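If you are not sure where to start, here is a minimal sketch of one possible approach: count, for each identifier, how many matches show a large swing between the starting and ending odds. It is not the article's exact method. The column names `player_id`, `start_odds`, and `end_odds`, the assumption that the odds are decimal odds, and the two thresholds are all hypothetical, so check them against the real header of puzzle_anon.csv. The sample rows are made up so that the example runs on its own.

```python
import csv
import io
from collections import Counter


def implied_probability(decimal_odds):
    """Convert decimal odds to an implied win probability (assumes decimal odds)."""
    return 1.0 / decimal_odds


def flag_suspicious(rows, min_shift=0.10, min_matches=3):
    """Return ids with at least `min_matches` matches whose implied
    probability moved by `min_shift` or more between start and end."""
    counts = Counter()
    for row in rows:
        start = implied_probability(float(row["start_odds"]))
        end = implied_probability(float(row["end_odds"]))
        if abs(end - start) >= min_shift:
            counts[row["player_id"]] += 1
    return sorted(pid for pid, n in counts.items() if n >= min_matches)


# Made-up sample data; the column names are hypothetical.
sample = """player_id,start_odds,end_odds
A,2.0,4.0
A,1.5,3.0
A,2.0,2.1
B,2.0,2.05
B,1.8,1.9
"""

rows = list(csv.DictReader(io.StringIO(sample)))
suspects = flag_suspicious(rows, min_shift=0.10, min_matches=2)
print(suspects)
```

To try this on the real file, replace the `io.StringIO(sample)` part with an opened puzzle_anon.csv. Reading the identifiers as text, as `csv.DictReader` does, avoids any loss of precision in the very long numeric ids.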

Q-1: Check the ids of all of the cheaters here.

• 7002589994262270000

• Cheater number 1

• 2416068425895370000

• Cheater number 2

• 1547483661413490000

• Cheater number 3

• 6228119144908420000

• Cheater number 4

• 1718561694846000000

• Cheater number 5

• 4643766977283540000

• You accuse an innocent person

• 1693568023468290000

• You accuse an innocent person

• All are cheaters

• No, not everyone is a cheater

• All are honest

• No, not everyone is honest

Now that you have identified the cheaters, can you match them with their real names? Here is a dataset that contains their names: puzzle_real.csv.
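One way to think about the matching is as a join on the columns the two files have in common. The sketch below assumes, hypothetically, that both files carry the same `start_odds` and `end_odds` values, that the anonymized file has a `player_id` column, and that the real file has a `name` column. It builds a "fingerprint" of odds pairs for each identifier and each name, then pairs each identifier with the name whose fingerprint overlaps the most. The sample rows and names are made up.

```python
import csv
import io
from collections import defaultdict


def fingerprints(rows, key):
    """Map each value of `key` to the set of (start_odds, end_odds) pairs seen with it."""
    prints = defaultdict(set)
    for row in rows:
        prints[row[key]].add((row["start_odds"], row["end_odds"]))
    return prints


def match_ids_to_names(anon_rows, real_rows):
    """Pair each anonymous id with the name sharing the most odds pairs."""
    anon = fingerprints(anon_rows, "player_id")
    real = fingerprints(real_rows, "name")
    matches = {}
    for pid, anon_print in anon.items():
        best = max(real, key=lambda name: len(anon_print & real[name]))
        if anon_print & real[best]:  # skip ids with no overlap at all
            matches[pid] = best
    return matches


# Made-up sample data; column names and values are hypothetical.
anon_csv = """player_id,start_odds,end_odds
101,2.0,4.0
101,1.5,3.0
202,1.8,1.9
"""
real_csv = """name,start_odds,end_odds
player_x,2.0,4.0
player_x,1.5,3.0
player_y,1.8,1.9
"""

anon_rows = list(csv.DictReader(io.StringIO(anon_csv)))
real_rows = list(csv.DictReader(io.StringIO(real_csv)))
matches = match_ids_to_names(anon_rows, real_rows)
print(matches)
```

Note that this sketch compares the odds as text, so `2.0` and `2.00` would not match. If the two real files format their numbers differently, convert the values to rounded floats before building the fingerprints.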

Q-2: Match the numeric identifiers from the first part of the project with the real names. Please keep going with your analysis.
• bob
• 7002589994262270000
• jane
• 2416068425895370000
• john
• 1547483661413490000
• sally
• 6228119144908420000
• sue
• 1718561694846000000
• don
• 4643766977283540000
• hill
• 1693568023468290000

Q-3: What are three main points that you take away from the Tennis Racket article?

Q-4: What are three main points that you take away from the unmasking article? What ethical considerations are important to you when considering de-anonymizing some other data set?

