Creating a simple max damage player
The corresponding complete source code can be found here.
Note
A similar example using gen 7 mechanics is available here.
The goal of this example is to explain how to create a first custom agent. This agent will follow simple rules:
If the active pokemon can attack, it will attack and use the move with the highest base power
Otherwise, it will perform a random switch
Creating a player
The player that we are going to implement does not need to be trained: we can therefore directly inherit from the Player
class.
Let’s create the base class:
from poke_env.player import Player
class MaxDamagePlayer(Player):
pass
Player
’s has one abstract method, choose_move
. Once implemented, we will be able to instantiate and use our player.
Creating a choose_move
method
Method signature
The signature of choose_move
is choose_move(self, battle: Battle) -> str
: it takes a Battle
object representing the game state as argument, and returns a move order encoded as a string. This move order must be formatted according to the showdown protocol. Fortunately, poke-env
provides utility functions allowing us to directly format such orders from Pokemon
and Move
objects.
We therefore have to take care of two things: first, reading the information we need from the battle
parameter. Then, we have to return a properly formatted response, corresponding to our move order.
Selecting a move
The battle
parameter is an object of type Battle
which encodes the agent’s current knowledge of the game state. It offers several properties that make accessing the game state easy. Some of the most notable are active_pokemon
, available_moves
, available_switches
, opponent_active_pokemon
, opponent_team
and team
.
In this example, we are going to use available_moves
: it returns a list of Move
objects which are available this turn.
We can therefore test if at least one move can be used with if battle.available_moves:
. We are interested in the base power of available_moves, which can be accessed with the base_power
property of Move
objects.
class MaxDamagePlayer(Player):
def choose_move(self, battle):
# If the player can attack, it will
if battle.available_moves:
# Finds the best move among available ones
best_move = max(battle.available_moves, key=lambda move: move.base_power)
Returning a choice
Now that we have selected a move, we need to return a corresponding order, which takes the form of a string. Fortunately, Player
provides a method designed to craft such strings directly: create_order
. It takes a Pokemon
(for switches) or Move
object as argument, and returns a string corresponding to the order. Additionally, you can use its mega
, z_move
and dynamax
parameters to mega evolve, use a z-move, dynamax or gigantamax, if possible this turn.
We also have to return an order corresponding to a random switch if the player cannot attack. Player
objects incorporate a choose_random_move
method, which we will use if no attacking move is available.
class MaxDamagePlayer(Player):
def choose_move(self, battle):
# If the player can attack, it will
if battle.available_moves:
# Finds the best move among available ones
best_move = max(battle.available_moves, key=lambda move: move.base_power)
return self.create_order(best_move)
# If no attack is available, a random switch will be made
else:
return self.choose_random_move(battle)
Running and testing our agent
We can now test our agent by making it battle a random agent. The complete code is:
import asyncio
import time
from poke_env.player import Player, RandomPlayer
class MaxDamagePlayer(Player):
def choose_move(self, battle):
# If the player can attack, it will
if battle.available_moves:
# Finds the best move among available ones
best_move = max(battle.available_moves, key=lambda move: move.base_power)
return self.create_order(best_move)
# If no attack is available, a random switch will be made
else:
return self.choose_random_move(battle)
async def main():
start = time.time()
# We create two players.
random_player = RandomPlayer(
battle_format="gen8randombattle",
)
max_damage_player = MaxDamagePlayer(
battle_format="gen8randombattle",
)
# Now, let's evaluate our player
await max_damage_player.battle_against(random_player, n_battles=100)
print(
"Max damage player won %d / 100 battles [this took %f seconds]"
% (
max_damage_player.n_won_battles, time.time() - start
)
)
if __name__ == "__main__":
asyncio.get_event_loop().run_until_complete(main())
Running it should take a couple of seconds and print something similar to this:
Max damage player won 92 / 100 battles [this took 6.320682 seconds]
If you want to use Reinforcement Learning, take a look at the Reinforcement learning with the OpenAI Gym wrapper example.