Validator overhaul fixes Goodhart trap, UID 185 takes the crown
Distil v30.6 was released to fix a composite score that was anti-correlated with real-world performance, so that a higher composite predicted worse results. Held-out public benchmarks (gsm8k, humaneval, bbh, ifeval, mmlu_pro) now carry 50% of the composite score; skill groups were reweighted, and aggregation switched to a bottom-half mean to prevent saturation gaming. UID 185 (const0312/zeus_03) dethroned UID 213 with a final composite of 0.537. Eval GPU utilization was raised to 0.92 after the chat migration.
- Validator now anchors 50% of the weight on held-out evalscope canaries vs. procedural probes
- Reasoning skill group weight cut from 0.24 to 0.06; other groups rebalanced by correlation
- Most DQs stem from the one_eval_per_registration policy; reuse requires a fresh hotkey per eval
- UID 33 disqualified for weights/config fraud: a ~16B model falsely claimed as 1.13B
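To illustrate why a bottom-half mean resists saturation gaming, here is a minimal Python sketch. It is not the validator's actual code: the function names, the rounding rule for odd counts (ceil), the plain mean over held-out benchmarks, and the omission of per-group weights are all assumptions made for illustration. Only the 50% held-out weight and the bottom-half aggregation idea come from the summary above.

```python
import math


def bottom_half_mean(scores):
    """Mean of the lowest ceil(n/2) scores (rounding rule is an assumption)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    k = math.ceil(len(scores) / 2)
    lowest = sorted(scores)[:k]
    return sum(lowest) / k


def composite(held_out_scores, procedural_scores, held_out_weight=0.5):
    """Blend held-out and procedural scores (hypothetical formula).

    held_out_scores: dict of benchmark name -> score in [0, 1]
    procedural_scores: list of procedural probe scores in [0, 1]
    """
    # Assumption: held-out benchmarks are combined with a plain mean.
    held = sum(held_out_scores.values()) / len(held_out_scores)
    # Assumption: procedural probes use the bottom-half mean.
    procedural = bottom_half_mean(procedural_scores)
    return held_out_weight * held + (1 - held_out_weight) * procedural


if __name__ == "__main__":
    # Illustrative numbers only, not real miner scores.
    held_out = {"gsm8k": 0.6, "humaneval": 0.4}
    probes = [1.0, 1.0, 0.2, 0.4]  # two saturated probes, two weak ones
    print(composite(held_out, probes))
```

In this sketch the two saturated probes do not raise the procedural term at all, because only the weaker half of the scores is averaged. A miner gains more from lifting its worst skills than from maxing out ones it already handles well.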
Distilled from 17 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
View original messages
- Discord message 1502524528075735160
- Discord message 1502540635775696946
- Discord message 1502578570151727135
- Discord message 1502683909354422424
- Discord message 1502693178560479262
- Discord message 1502696243783274647
- Discord message 1502698324501729442
- Discord message 1502706064066478090
- Discord message 1502708556896403477
- Discord message 1502712680237240520
- Discord message 1502712681457516565
- Discord message 1502713171125993663
- Discord message 1502753870340821292
- Discord message 1502754482071666778