We don't actually disagree; we all know that Transformers don't fit generalizable algorithms, they fit instance-based patterns. That doesn't change the fact that the crux of the problem is familiar vs. unfamiliar (at the instance level, not at the abstract "task" level). E.g. adding
Transformers fit instance patterns, not generalizable algorithms