Trajrl-bench v3.3.0 rolling out to validators
Trajectory RL's evaluation framework is being upgraded with stricter codebase-fix scoring. Hidden tests now cover concurrency, throughput, and memory churn, not only boundary cases. Bug hints have been removed from the documentation and the visible tests. The no_repeat_mistake credit is no longer awarded for episodes that have no prior failures. Submissions that relied on telegraphed boundary-case fixes or on vacuous credit may see their scores drop.
- Hidden tests expanded to concurrency, throughput, memory churn
- Bug-class hints stripped from docs and visible tests
- no_repeat_mistake credit omitted when no prior failures exist
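To illustrate the no_repeat_mistake change, here is a minimal sketch of how a "no vacuous credit" rule could work. It is not the actual trajrl-bench scoring code: the function name, the failure labels, and the 0.0/1.0 values are all hypothetical, chosen only to show the idea that an episode with no prior failures has nothing to avoid repeating and so gets no credit entry at all.

```python
from typing import Optional, Sequence


def no_repeat_mistake_credit(
    prior_failures: Sequence[str],
    current_failures: Sequence[str],
) -> Optional[float]:
    """Hypothetical scoring rule, for illustration only.

    Returns None (credit omitted) when there are no prior failures,
    1.0 when no prior failure is repeated, and 0.0 otherwise.
    """
    # No prior failures: the criterion is vacuous, so omit the credit
    # instead of awarding it by default.
    if not prior_failures:
        return None

    # Any overlap between earlier and current failures counts as a repeat.
    repeated = set(prior_failures) & set(current_failures)
    return 0.0 if repeated else 1.0


if __name__ == "__main__":
    print(no_repeat_mistake_credit([], []))
    print(no_repeat_mistake_credit(["off_by_one"], []))
    print(no_repeat_mistake_credit(["off_by_one"], ["off_by_one"]))
```

Under these assumptions, a first-attempt episode no longer collects the credit simply because it had no earlier mistake to repeat.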
Distilled from 3 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.