Action Mapping and Strict/Fake Modes
SinglesEnv and DoublesEnv expose helpers to convert between encoded
actions and Showdown orders:
- action_to_order
- order_to_action
- get_action_mask
Both conversion methods (action_to_order and order_to_action) support two important flags:
- strict (default True): raises ValueError when a conversion is invalid.
- fake (default False): allows best-effort conversions, even if not legal.
Prerequisites
- You need a live Battle or DoubleBattle object, usually from a player callback or an environment step.
- This guide is most useful when building custom RL policies, wrappers, or action encoders.
Why This Matters
During RL training or debugging, invalid actions happen frequently. You can choose between hard-failing (strict) and fallback behavior (non-strict).
- Use strict=True while validating your policy and action encoding.
- Use strict=False when you prefer robustness and a fallback to random legal orders.
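The trade-off between the two modes can be sketched with a small wrapper that keeps strict conversion but records each failure instead of crashing. This is only an illustration: `convert_or_fallback`, `toy_strict_converter`, and the fallback callable are hypothetical names that are not part of poke-env. In real code, `convert` would wrap a call such as `SinglesEnv.action_to_order(action, battle, strict=True)`.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def convert_or_fallback(
    convert: Callable[[], T], fallback: Callable[[], T]
) -> Tuple[T, bool]:
    """Run a strict conversion; on ValueError, use the fallback instead.

    Returns (result, used_fallback) so callers can count invalid actions.
    """
    try:
        return convert(), False
    except ValueError:
        return fallback(), True


# Stand-in converter for illustration only: treats actions 0-3 as valid.
def toy_strict_converter(action: int) -> str:
    if not 0 <= action < 4:
        raise ValueError(f"invalid action {action}")
    return f"order-{action}"


result_ok = convert_or_fallback(
    lambda: toy_strict_converter(2), lambda: "order-random"
)
result_bad = convert_or_fallback(
    lambda: toy_strict_converter(9), lambda: "order-random"
)
print(result_ok)   # ('order-2', False)
print(result_bad)  # ('order-random', True)
```

Counting how often the fallback fires gives a cheap signal of how often a policy emits invalid actions, which is hidden when strict=False silently substitutes an order.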
Note
The Gymnasium action_space spans the full encoded action range.
Use get_action_mask to determine which actions are currently legal.
Sentinel values like default/forfeit are accepted by the converters, but
they are not emitted by the standard sampled action spaces.
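As a rough sketch of how a mask can be used, the helper below samples uniformly among the actions a mask marks as legal. It assumes the mask is a flat sequence of 0/1 flags indexed by encoded action; check the return type of get_action_mask in your installed version. `sample_legal_action` and `toy_mask` are hypothetical names used only for illustration.

```python
import random
from typing import Optional, Sequence


def sample_legal_action(
    mask: Sequence[int], rng: Optional[random.Random] = None
) -> int:
    """Pick uniformly among the indices whose mask entry is truthy."""
    rng = rng or random.Random()
    legal = [index for index, allowed in enumerate(mask) if allowed]
    if not legal:
        raise ValueError("mask marks no action as legal")
    return rng.choice(legal)


# Toy mask for illustration; in practice it would come from
# get_action_mask(battle).
toy_mask = [0, 0, 1, 0, 1, 1, 0]
sampled = sample_legal_action(toy_mask, random.Random(0))
print(sampled)
```

Sampling from the mask rather than from the full action_space avoids feeding known-illegal actions into the converters in the first place.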
Singles Example
import numpy as np
from poke_env.environment import SinglesEnv

# `battle` is a live Battle object (see Prerequisites).
mask = SinglesEnv.get_action_mask(battle)

# Convert action -> order
order = SinglesEnv.action_to_order(
    np.int64(6),
    battle,
    fake=False,
    strict=True,
)

# Convert order -> action
action = SinglesEnv.order_to_action(
    order,
    battle,
    fake=False,
    strict=True,
)
Doubles Example
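import numpy as np
from poke_env.environment import DoublesEnv

# `battle` is a live DoubleBattle object (see Prerequisites).
mask = DoublesEnv.get_action_mask(battle)

# Convert action -> order (non-strict: invalid actions do not raise)
action = np.array([7, 27], dtype=np.int64)
order = DoublesEnv.action_to_order(action, battle, fake=False, strict=False)

# Convert order -> action
recovered_action = DoublesEnv.order_to_action(
    order,
    battle,
    fake=False,
    strict=False,
)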
Recommended Workflow
- Start with strict=True and unit-test your action encoding.
- If you need fault tolerance during long runs, switch to strict=False.
- Keep fake=False unless you explicitly need best-effort conversion behavior.
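One possible shape for such a unit test is a round-trip check: convert every legal action to an order and back, and collect the actions that fail or do not come back unchanged. The sketch below assumes the encoding is one-to-one for legal actions, which may not hold for every encoder. `find_round_trip_failures` and the toy converters are hypothetical and stand in for strict calls to action_to_order and order_to_action.

```python
from typing import Callable, List, Sequence

# Toy encoding for illustration only; not poke-env's real action layout.
TOY_ORDERS = {0: "switch 1", 1: "move 1", 2: "move 2"}
TOY_ACTIONS = {order: action for action, order in TOY_ORDERS.items()}


def toy_action_to_order(action: int) -> str:
    if action not in TOY_ORDERS:
        raise ValueError(f"invalid action {action}")
    return TOY_ORDERS[action]


def toy_order_to_action(order: str) -> int:
    if order not in TOY_ACTIONS:
        raise ValueError(f"invalid order {order!r}")
    return TOY_ACTIONS[order]


def find_round_trip_failures(
    mask: Sequence[int],
    to_order: Callable[[int], str],
    to_action: Callable[[str], int],
) -> List[int]:
    """Return legal actions that raise or do not survive a round trip."""
    failures = []
    for action, allowed in enumerate(mask):
        if not allowed:
            continue  # only actions marked legal are expected to convert
        try:
            if to_action(to_order(action)) != action:
                failures.append(action)
        except ValueError:
            failures.append(action)
    return failures


good = find_round_trip_failures(
    [1, 1, 1], toy_action_to_order, toy_order_to_action
)
# A mask that wrongly marks action 3 as legal exposes the mismatch.
bad = find_round_trip_failures(
    [1, 1, 1, 1], toy_action_to_order, toy_order_to_action
)
print(good)  # []
print(bad)   # [3]
```

Against a real battle, the two callables would wrap the strict converters shown in the examples above, with the mask taken from get_action_mask.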