It might strengthen your argument if you provided examples where AI fails consistently over time, rather than failure cases that expire so quickly as DL improves. So far, all the failures you used in 2020 to demonstrate GPT-2's lack of language understanding are now easily solved.
AI Capability Evaluation: Moving Beyond Expiring Failure Cases
