Dual LLM Judge System Launches, Three Kings Crowned
SN66 deployed a dual-judge scoring mechanism that uses OpenAI GPT-5.4 and Claude Sonnet 4.6 with blinded deliberation, replacing the old similarity-based system. Three new kings were crowned within 24 hours despite the increased difficulty. The validator also shipped three critical fixes: it closed re-evaluation exploits by limiting evaluations to one per repo, removed stale-king-hash checks, and corrected the dashboard's earnings calculation (from $547/hr to ~$240/hr per king). A Discord outage briefly affected the dashboard, which is now recovering.
- Dual LLM judge live with blinded deliberation; margin 3 and solver minimax-m2.7 unchanged
- Dashboard earnings corrected: ~44% of gross subnet emission, not 100%
- PR-based miner entry system requires exact SHA match between on-chain commit and GitHub head
- Agent timeout: min(max(cursor_elapsed × 2 + 1s, 120s), 600s) hard wall
- Miners can test locally via validator_harness_v5.py; aim for ≥65% win rate on 20 tasks
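The agent timeout rule in the list above can be checked with a few lines of arithmetic. The sketch below is an illustration only, not the validator's code: the function name `agent_timeout_seconds` is hypothetical, and it assumes `cursor_elapsed` is the reference run's wall-clock time in seconds.

```python
def agent_timeout_seconds(cursor_elapsed: float) -> float:
    """Timeout per the stated rule: min(max(cursor_elapsed * 2 + 1, 120), 600).

    Assumption: cursor_elapsed is measured in seconds. The name and the
    signature are illustrative, not taken from the validator source.
    """
    if cursor_elapsed < 0:
        raise ValueError("cursor_elapsed must be non-negative")
    scaled = cursor_elapsed * 2 + 1           # twice the reference time, plus 1s
    return min(max(scaled, 120.0), 600.0)     # 120s floor, 600s hard wall


if __name__ == "__main__":
    for elapsed in (10, 100, 400):
        print(elapsed, agent_timeout_seconds(elapsed))
```

Under this reading, a 10s reference run falls back to the 120s floor, a 100s run gets 201s, and anything at or above 299.5s is capped at the 600s wall.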
Distilled from 181 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.