Oro Race Scoring Inconsistencies Under Review
The team raised concerns about race scoring thresholds and inconsistent agent rankings across recent races. The top score fell from 70.0% in Race #10 to 53.3% in Race #11, and the top agent in Race #12, `seventeen` (58.3%), failed to rank even though its score exceeded the previous threshold. The team is investigating whether pre-race, score-based thresholds for competition entry are appropriate, given that the problems vary from day to day. Separately, downtime of the GPT-OSS model on chutes infrastructure caused agents to record zero scores.
- Race #10–#12 top scores: 70.0%, 53.3%, and 58.3%, respectively
- Race #13 entry threshold set at 59.4%; the calculation method is unclear to the team
- Agent `seventeen` failed to rank because its hotkey was deregistered, despite its high score
- GPT-OSS model downtime caused zero-score submissions; the team questioned whether affected agents are eligible to resubmit
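
To make the threshold concern concrete, the reported figures can be compared directly. The sketch below is only an illustration: it checks the three reported top scores against the Race #13 entry threshold. It does not reproduce how the 59.4% value was calculated, since that method is not described in the source, and the names `top_scores`, `RACE_13_THRESHOLD`, and `races_below_threshold` are hypothetical.

```python
# Top scores reported for Races #10 to #12 (percent), taken from the summary.
top_scores = {10: 70.0, 11: 53.3, 12: 58.3}

# Entry threshold reported for Race #13 (percent). How it was derived is
# unknown, so it is treated here as a given constant.
RACE_13_THRESHOLD = 59.4


def races_below_threshold(scores, threshold):
    """Return the race numbers whose top score falls short of the threshold."""
    return sorted(race for race, score in scores.items() if score < threshold)


below = races_below_threshold(top_scores, RACE_13_THRESHOLD)

# Gap between the best and worst top score, in percentage points.
spread = round(max(top_scores.values()) - min(top_scores.values()), 1)

print(f"Races whose top score misses {RACE_13_THRESHOLD}%: {below}")
print(f"Spread between best and worst top score: {spread} points")
```

Under these figures, the top scores of Races #11 and #12 both fall below 59.4%, which is consistent with the team's question about whether a fixed score-based entry threshold suits problem sets that change daily.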
Distilled from 14 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.
Original messages:
- Discord message 1497774638971883693
- Discord message 1497785291363192872
- Discord message 1497785698269397173
- Discord message 1497787156209340580
- Discord message 1497788200570191912
- Discord message 1497788696412291165
- Discord message 1497789081663569972
- Discord message 1497789588771704892
- Discord message 1497790181108093049
- Discord message 1497790571773693993
- Discord message 1497792540315750611
- Discord message 1497965639552335903
- Discord message 1497981678616838329
- Discord message 1497992006666817598