Quasar switches to tail-bucket KL evaluation metric
Quasar has migrated its scoring from top-128 KL to tail-bucket KL. The new metric also counts the probability mass a model dumps into the tail of its distribution, which addresses models being optimized to game the old metric. The temporary gates are a KL threshold of 6.3 and a quality floor of 0.06; the old values of 2.1 and 0.20 are not comparable. Full-vocab KL remains under calibration. GitHub and dashboard updates are rolling out, and a large evaluation queue is still being processed, with 20 models in the current round.
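To make the difference between the two metrics concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Quasar's actual scoring code: it assumes the old metric renormalized both distributions over the teacher's top-k tokens, and that the new one adds a single extra bucket holding all remaining mass. The function names `topk_renorm_kl` and `tail_bucket_kl` are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def _teacher_top_indices(p: Sequence[float], k: int) -> List[int]:
    """Indices of the k tokens the teacher assigns the most probability to."""
    return sorted(range(len(p)), key=lambda i: p[i], reverse=True)[:k]


def topk_renorm_kl(teacher_logits, student_logits, k=128, eps=1e-12) -> float:
    """ASSUMED form of a top-k KL: both distributions are renormalized over
    the teacher's top-k tokens, so mass outside the top-k is invisible."""
    p, q = softmax(teacher_logits), softmax(student_logits)
    top = _teacher_top_indices(p, k)
    p_sum = sum(p[i] for i in top)
    q_sum = sum(q[i] for i in top)
    return sum(
        (p[i] / p_sum) * math.log((p[i] / p_sum) / max(q[i] / q_sum, eps))
        for i in top
        if p[i] > 0
    )


def tail_bucket_kl(teacher_logits, student_logits, k=128, eps=1e-12) -> float:
    """ASSUMED form of a tail-bucket KL: the teacher's top-k tokens keep their
    true probabilities, and everything else is lumped into one tail bucket."""
    p, q = softmax(teacher_logits), softmax(student_logits)
    top = _teacher_top_indices(p, k)
    p_b = [p[i] for i in top]
    q_b = [q[i] for i in top]
    # The extra bucket carries whatever mass falls outside the top-k.
    p_b.append(max(1.0 - sum(p_b), 0.0))
    q_b.append(max(1.0 - sum(q_b), 0.0))
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p_b, q_b) if pi > 0
    )


# Toy vocabulary of 6 tokens with k=2. The student keeps the teacher's
# top-2 ratio but shifts most of its probability mass into the tail.
teacher = [4.0, 3.0, 0.0, 0.0, 0.0, 0.0]
student = [4.0, 3.0, 3.5, 3.5, 3.5, 3.5]
print("top-k renormalized KL:", topk_renorm_kl(teacher, student, k=2))
print("tail-bucket KL:       ", tail_bucket_kl(teacher, student, k=2))
```

In this toy case the renormalized top-k score is essentially zero even though the student has moved most of its mass into the tail, while the tail-bucket score is clearly positive. That is the kind of gaming the summary describes.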
- Switched from top-128 KL to tail-bucket KL, which accounts for tail probability mass
- Temporary gates: KL threshold 6.3, quality floor 0.06 (these are the live values, which the dashboard text may not yet reflect)
- When no model clears the gates, 100% of emissions are burned and no miner weights are set, which protects subnet emissions
- Full-vocab KL is deferred because of compute cost (~100GB+ of dense logits) and the need for calibration
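The gate and burn rule in the points above can be sketched as follows. This is a hedged illustration only: the comparison directions (KL at or below the threshold, quality at or above the floor), the inclusive bounds, and the names `EvalResult`, `clears_gates`, and `gate_round` are all assumptions, and the sketch deliberately does not model how weights are split among models that do clear the gates.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

KL_GATE = 6.3         # temporary KL threshold from the announcement
QUALITY_FLOOR = 0.06  # temporary quality floor from the announcement


@dataclass
class EvalResult:
    """Hypothetical per-model evaluation record."""
    uid: int
    kl: float
    quality: float


def clears_gates(r: EvalResult) -> bool:
    # ASSUMPTION: lower KL is better and higher quality is better,
    # with both bounds inclusive.
    return r.kl <= KL_GATE and r.quality >= QUALITY_FLOOR


def gate_round(results: List[EvalResult]) -> Dict[str, Union[List[int], bool]]:
    """Return the UIDs that clear both gates, plus a flag saying whether the
    round should burn 100% because no model qualified."""
    eligible = [r.uid for r in results if clears_gates(r)]
    return {"eligible_uids": eligible, "burn_all": not eligible}


# Made-up scores purely for demonstration.
round_results = [
    EvalResult(uid=1, kl=7.1, quality=0.09),  # fails the KL gate
    EvalResult(uid=2, kl=5.8, quality=0.03),  # fails the quality floor
]
print(gate_round(round_results))
```

With the made-up scores above, no model clears both gates, so `burn_all` is `True` and the eligible list is empty.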
Distilled from 18 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
View original messages
- Discord message 1509005363716821143
- Discord message 1509005994846588978
- Discord message 1509006524922466377
- Discord message 1509007143666323496
- Discord message 1509009684147998760
- Discord message 1509012672715427961
- Discord message 1509013232566669602
- Discord message 1509158825507160104
- Discord message 1509222566924124190
- Discord message 1509223180286296195
- Discord message 1509231256414458098
- Discord message 1509237236665811094
- Discord message 1509245169566744639
- Discord message 1509269581380718662