Distil v31.3: Goodhart safeguards, axis reallocation, autonomous ops
Distil deployed anti-overfitting measures (seeded RNG items, topical distractors, and first-name rotation) to prevent test-set memorization. After a bug fix and a weight reallocation, the composite ranking formula is 0.85 × worst_5_mean + 0.15 × weighted across 19 axes. The weight of top_k_overlap was halved from 0.18 to 0.09 because it correlated negatively with held-out gsm8k performance (r = -0.481), and the freed 0.09 went to on_policy_rkl (0.30 → 0.39). An autonomous healthcheck watchdog stack went live to enable 30+ days of unattended operation. The current king is UID 229 (composite.final 0.558).
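As a rough illustration of the composite formula quoted above, here is a minimal Python sketch. It assumes that worst_5_mean is the plain mean of the five lowest per-axis scores and that weighted is a weight-normalized mean over all axes. Neither definition is confirmed by the source messages, and the function name `composite_final` is hypothetical.

```python
from typing import Mapping

ALPHA = 0.85  # weight on the worst-K term, as quoted in the digest
K = 5         # number of lowest-scoring axes that are averaged


def composite_final(
    scores: Mapping[str, float],
    weights: Mapping[str, float],
    alpha: float = ALPHA,
    k: int = K,
) -> float:
    """Blend the mean of the k lowest axis scores with a weighted mean.

    Assumption: this mirrors "0.85 x worst_5_mean + 0.15 x weighted";
    the real Distil implementation may define both terms differently.
    """
    if len(scores) < k:
        raise ValueError(f"need at least {k} axis scores, got {len(scores)}")

    # Mean of the k lowest per-axis scores (assumed meaning of worst_5_mean).
    worst_k = sorted(scores.values())[:k]
    worst_k_mean = sum(worst_k) / k

    # Weight-normalized mean over all axes (assumed meaning of "weighted").
    total_weight = sum(weights[axis] for axis in scores)
    if total_weight <= 0:
        raise ValueError("axis weights must sum to a positive value")
    weighted = sum(weights[axis] * scores[axis] for axis in scores) / total_weight

    return alpha * worst_k_mean + (1 - alpha) * weighted
```

Under these assumptions, the worst-K term dominates the ranking, so a model cannot compensate for a few weak axes with one very strong axis. The reallocation itself leaves the combined weight of the two affected axes unchanged: 0.18 + 0.30 and 0.09 + 0.39 both equal 0.48.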
- Anti-Goodhart: seeded RNG, GSM-NoOp perturbations, first-name rotation within gender pools
- top_k_overlap cut 50% (r = -0.481 vs. held-out gsm8k); on_policy_rkl bumped to 0.39
- Autonomous healthcheck timer + runbook live; detection ceiling 3–5 min
- Announcement formula bug fixed; dashboard now shows α=0.85, K=5 dynamically
- Two v31 generators still broken (gsm_symbolic, ifeval_verifiable); GitHub roadmap last updated 2026-04-26
Distilled from 22 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
View original messages
- Discord message 1502848241438425178
- Discord message 1502860592011284542
- Discord message 1502863110602752092
- Discord message 1502963769876418651
- Discord message 1502963770564149299
- Discord message 1502993658960412752
- Discord message 1503018151024136346
- Discord message 1503021508178481282
- Discord message 1503022528040276099
- Discord message 1503022933633929317
- Discord message 1503030171874099321
- Discord message 1503031882315468901
- Discord message 1503035511936122972
- Discord message 1503070148595024033