Eval system recovers from deploy rollback; king cycling confirmed
Distil's evaluation pipeline recovered from a mid-round outage and state corruption following a deploy rollback. King UID 47 is actively re-evaluated each round: its KL scores vary between 1.45 and 1.56, which confirms fresh evals rather than cached results. Challenger selection logic was temporarily reset and loaded stale model commits from about 11 days earlier; this has been identified and flagged for a state rebuild. The team published comprehensive benchmark weightings, sample counts (300 prompts for KL axes, 6–18 items for v31 procedurals), and training guidance for closing gaps on weak axes.
- King re-eval confirmed working: UID 47 scores vary per round and are not cached.
- The deploy rollback corrupted queue state; 256 total models were listed instead of the filtered set for the round.
- Sample count bumps are blocked until 8×B200 timings are locked in for stability.
- A 17-item flagged backlog is queued for the next deploy (no deploys today).
- The team provided axis weights, training data mixes, and temperature tuning guidance.
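
The cached-versus-fresh reasoning behind the king re-eval check can be sketched in a few lines of Python. This is only an illustration, not Distil's actual code: the function name `looks_freshly_evaluated`, the `tolerance` parameter, and the sample values are all assumptions. The idea is simply that a cached score would repeat exactly across rounds, while a score recomputed each round should show some spread.

```python
from typing import Sequence


def looks_freshly_evaluated(scores: Sequence[float], tolerance: float = 1e-6) -> bool:
    """Return True if per-round scores differ by more than `tolerance`.

    Hypothetical helper: identical scores across rounds would suggest a
    cached result being replayed, while any spread suggests re-scoring.
    """
    if len(scores) < 2:
        # A single round cannot distinguish a cached score from a fresh one.
        return False
    return max(scores) - min(scores) > tolerance


# Illustrative values only: the two endpoints of the range reported for UID 47.
king_scores = [1.45, 1.56]
print(looks_freshly_evaluated(king_scores))         # True
print(looks_freshly_evaluated([1.50, 1.50, 1.50]))  # False
```

A spread check like this is a heuristic: it can show that scores are not being replayed verbatim, but it says nothing about whether each evaluation was otherwise correct.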
Distilled from 92 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
View original messages
- Discord message 1504665720024465410
- Discord message 1504694994978603019
- Discord message 1504703703142109254
- Discord message 1504703755097083915
- Discord message 1504704020244074596
- Discord message 1504751329413955685
- Discord message 1504752337632366612
- Discord message 1504758156725850112
- Discord message 1504776572127936512
- Discord message 1504776583523860540
- Discord message 1504776989251338342
- Discord message 1504776993764278323
- Discord message 1504778382590742548
- Discord message 1504778409211859048