SN3 Debugging Eval Discrepancies, Infrastructure Updates
The τeuτonic team spent the day troubleshooting a persistent gap between local and remote model evaluation scores, with dashboard-to-validator mismatches of 0.1–0.2 loss. On the infrastructure side, the team spun up a more consistent machine and tested the Lium and Targon providers. Model cache contamination and multi-card eval issues were identified; reruns are underway and are considered faster now that the fix is in.
- Local eval scores diverge by 0.1–0.2 loss from validator/dashboard results
- Model cache contamination is the suspected cause of incorrect eval behavior
- King model list is not syncing properly in dashboard.json despite UI updates
- Improved infrastructure is being spun up; Lium and Targon machines are being tested
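
To illustrate the kind of check the first point implies, here is a minimal sketch that compares per-model losses from a local run against remotely reported ones and flags gaps above a tolerance. It is not the team's tooling: the function name `flag_loss_gaps`, the model IDs, the loss values, and the 0.05 tolerance are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def flag_loss_gaps(
    local: Dict[str, float],
    remote: Dict[str, float],
    tolerance: float = 0.05,  # hypothetical threshold, not from the source
) -> List[Tuple[str, float]]:
    """Return (model_id, gap) pairs where |local - remote| exceeds tolerance.

    Only models present in both mappings are compared; results are sorted
    with the largest gap first.
    """
    gaps = []
    for model_id in sorted(local.keys() & remote.keys()):
        gap = abs(local[model_id] - remote[model_id])
        if gap > tolerance:
            gaps.append((model_id, gap))
    gaps.sort(key=lambda item: item[1], reverse=True)
    return gaps


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    local_losses = {"model-a": 2.50, "model-b": 2.61}
    remote_losses = {"model-a": 2.65, "model-b": 2.60}
    for model_id, gap in flag_loss_gaps(local_losses, remote_losses):
        print(f"{model_id}: local and remote differ by {gap:.2f} loss")
```

Running a comparison like this before and after clearing the model cache would show whether the gap shrinks once stale weights are ruled out.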
Distilled from 61 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
View original messages
- Discord message 1501039179466014802
- Discord message 1501062774657126544
- Discord message 1501072958322512073
- Discord message 1501090654875291809
- Discord message 1501091319005712414
- Discord message 1501113570270646423
- Discord message 1501118528538017892
- Discord message 1501118565863133237
- Discord message 1501126825802207345
- Discord message 1501130833644359821
- Discord message 1501130995162939412
- Discord message 1501131525377490976
- Discord message 1501132076790054973
- Discord message 1501132117881651271