14.6. The Tournament

The Tournament class encapsulates the details of the PD competition:

payoffs = {('C', 'C'): (3, 3),
           ('C', 'D'): (0, 5),
           ('D', 'C'): (5, 0),
           ('D', 'D'): (1, 1)}

num_rounds = 6

def play(self, agent1, agent2):
    agent1.reset()
    agent2.reset()

    for i in range(self.num_rounds):
        resp1 = agent1.respond(agent2)
        resp2 = agent2.respond(agent1)

        pay1, pay2 = self.payoffs[resp1, resp2]

        agent1.append(resp1, pay1)
        agent2.append(resp2, pay2)

    return agent1.score, agent2.score

payoffs is a dictionary that maps from the agents’ choices to their rewards. For example, if both agents cooperate, they each get 3 points. If one defects and the other cooperates, the defector gets 5 and the cooperator gets 0. If they both defect, each gets 1. These are the payoffs Axelrod used in his tournaments.

The play method runs several rounds of the PD game. It uses the following methods from the Agent class:

After the given number of rounds, play returns the total score for each agent. We chose num_rounds=6 so that each element of the genotype is accessed with roughly the same frequency. The first element is only accessed during the first round, or one sixth of the time. The next two elements are only accessed during the second round, or one twelfth each. The last four elements are accessed four of six times, or one sixth each, on average.

Tournament provides a second method, melee, that determines which agents compete against each other:

def melee(self, agents, randomize=True):
    if randomize:
        agents = np.random.permutation(agents)

    n = len(agents)
    i_row = np.arange(n)
    j_row = (i_row + 1) % n

    totals = np.zeros(n)

    for i, j in zip(i_row, j_row):
        agent1, agent2 = agents[i], agents[j]
        score1, score2 = self.play(agent1, agent2)
        totals[i] += score1
        totals[j] += score2

    for i in i_row:
        agents[i].fitness = totals[i] / self.num_rounds / 2

melee takes a list of agents and a boolean, randomize, that determines whether each agent fights the same neighbors every time, or whether the pairings are randomized.

i_row and j_row contain the indices of the pairings. totals contains the total score of each agent.

Inside the loop, we select two agents, invoke play, and update totals. At the end, we compute the average number of points each agent got, per round and per opponent, and store the results in the fitness attribute of each agent.

You have attempted of activities on this page