DeepSWE was designed to make all of this impossible. Tasks written from scratch. Not pulled from public commits. No contamination. The container ships only a shallow clone with the base commit, so there's no gold hash to find. Hand-written verifiers. Solutions require over 5x
DeepSWE designed to prevent dataset contamination and cheating