A beautiful "family tree" of large Transformer models descended from Vaswani et al.'s 2017 design. So cool to see the lineages from GPT, BERT, T5, and PaLM develop, and diffusion models starting from a different place but converging with CLIP and ViT.
Family Tree of Transformer Models: Evolution and Convergence