It's surprising that it still manages to output such a consistent ranking. Are they simply evaluating on a fraction of the dataset, effectively?