TrainingMonitor
Logs episode statistics and plots training curves. Automatically used by Runner, but can also be used standalone.
from tinyrl import TrainingMonitor
monitor = TrainingMonitor(window=50)
Constructor
TrainingMonitor(window=50)
Args:
window: rolling-average window size used to smooth the plots (default 50)
Methods
log(reward, length, entropy=None)
Record one episode's statistics.
Args:
reward: total episode reward
length: number of steps in the episode
entropy: mean policy entropy over the episode (optional)
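The entropy argument expects a single number per episode. The following is a minimal sketch of how that number might be computed, assuming a discrete action space and a policy that exposes its action probabilities at each step. The helpers `step_entropy` and `episode_mean_entropy` are hypothetical and are not part of tinyrl.

```python
import math

def step_entropy(probs):
    """Shannon entropy (in nats) of one discrete action distribution."""
    # Skip zero-probability actions: p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def episode_mean_entropy(step_probs):
    """Mean per-step entropy over an episode; None if no steps were recorded."""
    if not step_probs:
        return None
    return sum(step_entropy(p) for p in step_probs) / len(step_probs)

# Two steps: a uniform policy over 2 actions, then a deterministic one.
entropy = episode_mean_entropy([[0.5, 0.5], [1.0, 0.0]])
```

The resulting value can be passed as the third argument of log; when the helper returns None, that matches the documented default of entropy=None.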
plot()
Display training curves side by side:
- Episode Reward
- Episode Length
- Policy Entropy — only shown if entropy data was logged
Each plot shows the raw data (transparent) and a rolling average (solid line).
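The solid line is a rolling average over window episodes. This page does not say how tinyrl computes it, so the sketch below is only one plausible reading: a trailing mean that averages over fewer points until the window fills. The function name `rolling_mean` is hypothetical.

```python
def rolling_mean(values, window=50):
    """Trailing mean: entry i averages the last `window` values up to and including i."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        # Early entries average over fewer than `window` values.
        count = min(i + 1, window)
        smoothed.append(running_sum / count)
    return smoothed

smoothed = rolling_mean([1.0, 2.0, 3.0, 4.0], window=2)
```

A larger window gives a smoother curve that reacts more slowly to recent changes in reward.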
Attributes
| Attribute | Type | Description |
|---|---|---|
| rewards | list[float] | All logged episode rewards |
| lengths | list[int] | All logged episode lengths |
| entropies | list[float] | Logged mean entropies (only for episodes that provided entropy) |
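Because the attributes are plain lists, they can be inspected without plotting. The sketch below uses a hypothetical helper, `summarize`, to report recent averages. It takes the lists as arguments so it runs without tinyrl, and it allows for entropies being empty or shorter than rewards, since entropy is only stored for episodes that supplied it.

```python
from statistics import mean

def summarize(rewards, lengths, entropies, last_n=50):
    """Average the most recent `last_n` entries of each list."""
    return {
        "episodes": len(rewards),
        "mean_reward": mean(rewards[-last_n:]) if rewards else None,
        "mean_length": mean(lengths[-last_n:]) if lengths else None,
        # entropies may be empty or shorter than rewards.
        "mean_entropy": mean(entropies[-last_n:]) if entropies else None,
    }

# Stand-in lists; with a real monitor, pass monitor.rewards,
# monitor.lengths, and monitor.entropies instead.
summary = summarize([10.0, 20.0, 30.0], [5, 7, 9], [], last_n=2)
```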
Standalone usage
from tinyrl import TrainingMonitor
monitor = TrainingMonitor()
for episode in range(500):
    # ... your training loop ...
    monitor.log(total_reward, steps, entropy)  # entropy is optional
monitor.plot()