SN15 oro · Sunday, April 26, 2026

Scoring Bug Fixed, Forge Agent Doubles Performance

A bug in local testing was identified: when `problem_id` fields were missing, every problem in a run collapsed into a single log, which artificially depressed reasoning scores on multi-problem suites (0.7–0.85 on single-problem runs vs. 0.13–0.31 on 30-problem runs). Mainnet is unaffected. Separately, the Forge agent's score rose from 0.29 to 0.57 after its search result pool was expanded and its rescoring limits were raised; product-category tasks now score 0.6, while shop and voucher tasks lag at 0.3 and 0.2, respectively.

• Local test scoring bug: missing `problem_id` collapses multiple problems into one request log
• Mainnet unaffected; public API populates UUID `problem_id` values correctly
• Forge agent 2x improvement: expanded candidate pool and raised rescoring limits
• Category performance varies: products 0.6, shops 0.3, vouchers 0.2
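
The collapse described above can be illustrated with a minimal Python sketch. It is not the subnet's actual logging code: the record layout, the `group_by_problem` helper, and the `strict` flag are all hypothetical, chosen only to show how grouping on a missing key merges unrelated problems and how a guard would surface the issue early.

```python
from collections import defaultdict


def group_by_problem(records, strict=True):
    """Group log records by their problem_id (hypothetical record layout).

    With strict=False, records that lack a problem_id all share the key None,
    so every problem ends up in one group. With strict=True, a missing id
    raises instead of being merged silently.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        problem_id = record.get("problem_id")
        if problem_id is None and strict:
            raise ValueError(f"record {index} has no problem_id")
        groups[problem_id].append(record)
    return dict(groups)


# Three unrelated problems logged without a problem_id (illustrative data).
records = [
    {"step": "search", "query": "red shoes"},
    {"step": "search", "query": "discount voucher"},
    {"step": "search", "query": "nearby shop"},
]

# Without the guard, all three problems collapse into a single group.
collapsed = group_by_problem(records, strict=False)
print(len(collapsed))  # 1

# Once each record carries its own id, the problems stay separate.
tagged = [dict(record, problem_id=f"p{i}") for i, record in enumerate(records)]
print(len(group_by_problem(tagged)))  # 3
```

Under these assumptions, a scorer that reads one log per problem would judge 30 interleaved problems as a single incoherent trace, which is consistent with the gap between single-problem and 30-problem scores reported above.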

Distilled from 5 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.

View original messages

Discord message 1497621363752702022
Discord message 1497673602739208323
Discord message 1497729467660111973
Discord message 1497730999126196245
Discord message 1497731270468305017
