TEACHING MACHINES THROUGH GAMES

Simulation environments, datasets, and evals for training and testing agents, built for LLM/VLM and robotics teams.

Reinforcement learning environments to improve models

  • Long-horizon, physics-based, resource-allocation, and other cognitive tasks
  • Skills that transfer to real-world tasks like math, reasoning, and tool use
  • Augmented by human interaction data from our consumer platform

Benchmarks that measure real-world performance

  • Evals for cognition and reasoning in embodied systems
  • Multi-agent scenarios that test persuasion, deception, and coordination
  • Dynamic environments that stress adaptation and exploration

Research

From Game Replays to Generalization

We ran RL on a small LLM across interactive game environments to study how skills learned in games transfer out of distribution. This work is grounded in our thesis that minds are scaffolded by the environments they act in and that intelligence lies in the loop between an agent and its world. We created environments inspired by classic arcade games and traditional RL benchmarks, then evaluated the trained models both in-game and on downstream math reasoning benchmarks.
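
To make the agent-environment loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `CatchEnv`, `scripted_policy`, and `run_episode` are hypothetical names, the toy game is not one of our environments, and the scripted policy only marks the place where a language model's output would go. It collects one episode of (observation, action, reward) steps and performs no RL update.

```python
import random
from dataclasses import dataclass


@dataclass
class CatchEnv:
    """Hypothetical toy game: move a paddle to catch a falling ball."""
    width: int = 5
    height: int = 5
    seed: int = 0

    def reset(self) -> str:
        rng = random.Random(self.seed)
        self.ball_x = rng.randrange(self.width)
        self.ball_y = 0
        self.paddle_x = self.width // 2
        return self._observe()

    def _observe(self) -> str:
        # Text observation, so a language model could read it as a prompt.
        return f"ball=({self.ball_x},{self.ball_y}) paddle={self.paddle_x}"

    def step(self, action: str):
        # Actions: "left", "right", or "stay"; anything else counts as "stay".
        dx = {"left": -1, "right": 1}.get(action, 0)
        self.paddle_x = min(self.width - 1, max(0, self.paddle_x + dx))
        self.ball_y += 1
        done = self.ball_y == self.height - 1
        reward = 0.0
        if done:
            # Sparse reward: only the final step says whether the ball was caught.
            reward = 1.0 if self.paddle_x == self.ball_x else -1.0
        return self._observe(), reward, done


def scripted_policy(observation: str) -> str:
    """Stand-in for the model: parse the text and move toward the ball."""
    ball_part, paddle_part = observation.split(" ")
    ball_x = int(ball_part[len("ball=("):].split(",")[0])
    paddle_x = int(paddle_part.split("=")[1])
    if ball_x < paddle_x:
        return "left"
    if ball_x > paddle_x:
        return "right"
    return "stay"


def run_episode(env, policy):
    """Play one episode; return the total reward and the recorded steps."""
    observation = env.reset()
    trajectory, total, done = [], 0.0, False
    while not done:
        action = policy(observation)
        next_observation, reward, done = env.step(action)
        trajectory.append((observation, action, reward))
        total += reward
        observation = next_observation
    return total, trajectory


if __name__ == "__main__":
    total, trajectory = run_episode(CatchEnv(seed=0), scripted_policy)
    print(total, len(trajectory))
```

In a training setup along these lines, the recorded steps would feed a policy-gradient or similar update, and the same loop with a frozen model would serve as the in-game evaluation. Those parts are left out here because the text above does not specify them.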