Mind Miners Overfitting Holdouts, Team Plans Harder Benchmarks
Miners reached 99.30% accuracy last round, which suggests potential overfitting to the holdout datasets. The team found that the holdout sets were partially derived from previously used datasets, which let miners learn portions of them. Going forward, the team plans to keep holdouts diverse and unique, and is considering additional evaluation criteria such as inference time, model size, and robustness to new generators to maintain competitive pressure.
- Last round: 99.30% accuracy on 24,933 samples signals overfitting to holdouts
- Root cause: holdouts reused semantic variations from prior datasets
- Incentive concerns: gen miner rewards reduced from 20% to 10% combined
- Future evals may include inference time, model size, robustness metrics
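
To put the headline figure in absolute terms, the short sketch below works out how many misclassified samples are consistent with 99.30% accuracy on 24,933 samples. It assumes the reported accuracy was rounded to two decimal places; the function name and structure are illustrative and do not come from the team's evaluation code.

```python
# Figures taken from the summary; everything else here is illustrative.
TOTAL_SAMPLES = 24_933
REPORTED_ACCURACY_PCT = 99.30  # assumed to be rounded to two decimal places


def implied_error_range(total: int, accuracy_pct: float) -> tuple[int, int]:
    """Return the (min, max) misclassified counts consistent with the rounded accuracy."""
    target = round(accuracy_pct * 100)  # compare in integer hundredths of a percent
    matches = [
        total - correct
        for correct in range(total + 1)
        if round(correct / total * 10_000) == target
    ]
    return min(matches), max(matches)


low, high = implied_error_range(TOTAL_SAMPLES, REPORTED_ACCURACY_PCT)
print(f"Implied misclassifications: {low} to {high} out of {TOTAL_SAMPLES}")
```

Under that rounding assumption, the reported figure corresponds to only 174 or 175 misclassified samples in the whole round, which helps show why the team read the result as a sign of overfitting.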
Distilled from 6 team messages in the official Bittensor Discord. Generated by Claude Haiku 4.5.