@yitayml - AI Dynamics

Flan-T5 vs GPT-3.5: Fine-tuning and Zero-shot Comparison

By

–

21 February 2023 13h43

Great post! Just a clarification: was Flan-T5 further fine-tuned on any data, or was it few/zero-shot for both GPT-3.5 and Flan-T5?

→ View original post on X — @yitayml, 21 February 2023

Few Organizations Can Train 100B+ Parameter Models Estimate

By

@yitayml

–

21 February 2023 11h03

Definitely way less than 200. There is a wide spectrum of what it means to "train 100B+ parameter models", but I would estimate this number to be <50, optimistically.

→ View original post on X — @yitayml, 21 February 2023

Inductive Bias and Data Shape Emergence of AI Abilities

By

@yitayml

–

21 February 2023 8h30

Inductive bias, data, and other changes do influence the point where emergent abilities emerge. We showed this in the UL2R paper: https://arxiv.org/abs/2210.11399 I also wrote a blogpost:

→ View original post on X — @yitayml, 21 February 2023