Official competition rules and policies. Please read carefully before submitting.
We adopt a permissive stance. Any approach is permitted unless explicitly prohibited below. This includes but is not limited to:
Cerebras Fast-Reasoning Track: Track 2 submissions must use the direct Cerebras-hosted gpt-oss inference setup described in the starter kit. Agents must expose the same dockerized A2A interface as Track 1. Multi-pass reasoning, private planning, self-verification, retries, ensembles, and parallel calls are allowed, but the agent must respect the Track 2 inference-compute constraints and must not exploit evaluator internals. For each baseline LLM step, participants may make up to 5 sequential LLM calls; parallel calls inside a step do not count toward this sequential limit. Average token usage may not exceed 500k tokens per task, counting input, reasoning, and output tokens. Token usage must be tracked through the A2A turn_metrics token fields and may be cross-checked. Track 2 reports should include an architecture diagram so that the sequential-call constraint can be audited. Cerebras will provide higher rate limits than a free personal account offers; access details will follow soon.
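The two numeric Track 2 limits can be checked locally before submitting. The following is a minimal sketch and not part of the official tooling: the StepUsage record, its stages layout, and the check_budget helper are hypothetical names introduced here for illustration, and the real turn_metrics field names should be taken from the starter kit. It also assumes that a group of calls issued in parallel counts as one sequential call, which is one reading of the rule above.

```python
from dataclasses import dataclass, field
from statistics import mean

# Limits taken from the Track 2 rules above.
MAX_SEQUENTIAL_CALLS_PER_STEP = 5
MAX_AVG_TOKENS_PER_TASK = 500_000


@dataclass
class StepUsage:
    """Usage for one baseline LLM step (hypothetical record layout).

    Each inner list is one sequential stage. Its entries are the token
    counts (input + reasoning + output) of the calls issued in parallel
    during that stage.
    """
    stages: list[list[int]] = field(default_factory=list)

    @property
    def sequential_calls(self) -> int:
        # Assumption: one parallel group counts as one sequential call.
        return len(self.stages)

    @property
    def total_tokens(self) -> int:
        return sum(sum(stage) for stage in self.stages)


def check_budget(tasks: dict[str, list[StepUsage]]) -> list[str]:
    """Return human-readable violations; an empty list means none found."""
    violations: list[str] = []

    # Per-step check: at most 5 sequential calls.
    for task_id, steps in tasks.items():
        for index, step in enumerate(steps):
            if step.sequential_calls > MAX_SEQUENTIAL_CALLS_PER_STEP:
                violations.append(
                    f"{task_id} step {index}: {step.sequential_calls} "
                    f"sequential calls exceeds {MAX_SEQUENTIAL_CALLS_PER_STEP}"
                )

    # Across-task check: the token limit applies to the per-task average.
    if tasks:
        average = mean(
            sum(step.total_tokens for step in steps)
            for steps in tasks.values()
        )
        if average > MAX_AVG_TOKENS_PER_TASK:
            violations.append(
                f"average tokens per task {average:.0f} exceeds "
                f"{MAX_AVG_TOKENS_PER_TASK}"
            )
    return violations


if __name__ == "__main__":
    example = {
        "task-a": [StepUsage(stages=[[1200, 1100], [900]])],
        "task-b": [StepUsage(stages=[[2000]])],
    }
    print(check_budget(example) or "within budget")
```

Because the token limit is an average, a single expensive task does not by itself break the rule as long as the mean across tasks stays under the cap. The sequential-call limit, by contrast, is checked for every step.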
The following are strictly prohibited and will result in disqualification:
scenario.toml, using the official organizer-published evaluator image and their agent-under-test runtime config.task_split = "hidden" and -1 for each task-count field.