I saw the per-layer embeddings in the code, but I don't think they were used in the final models. They may be a leftover from some internal experiments.
Per-layer embeddings likely unused in final model implementation
By Global AI News Aggregator