Ninja Upgrades to LLM Judges, API Schema Bug Fixed
Ninja upgraded its evaluation model from Sonnet-4 to Sonnet-4.6 with expanded research tools and is testing a transition to scoring solely via LLM judges. GPT-5.4 was deployed as a judge for lower latency and smarter judgments. The team fixed an API schema bug and resolved an earlier hotkey ban issue related to a previous UID owner's commitment flag.
- Model upgrade: Sonnet-4.6 with full repo research tools for smarter evaluation.
- Scoring migration: testing LLM-only judging and moving away from code-based scoring.
- Judge upgrade: GPT-5.4 deployed; the team is monitoring costs before adding Claude Sonnet.
- Dethrone verifier: in development to validate a new king against a hidden 50-task set.
- API schema bug: identified and fixed; the validator was paused for updates.
Distilled from 60 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
Original messages:
- Discord message 1501386707684298803
- Discord message 1501394517318959225
- Discord message 1501395163224871005
- Discord message 1501398814156652658
- Discord message 1501410096062791802
- Discord message 1501410822201802762
- Discord message 1501410884327702598
- Discord message 1501411106948907038
- Discord message 1501411159557931139
- Discord message 1501413713847128145
- Discord message 1501413756238958743
- Discord message 1501413997491130498
- Discord message 1501414172997718139
- Discord message 1501414335635918899