> be google researcher
> exec says we need to beat mistral 7b
> “reclaim the narrative"
> train 7b model
> run eval. check plots
> cant do it. maybe with 8b
> exec says no dice
> “we'll look like chumps"
> ok
> maybe we can tie embeddings and linear_f, save some params
> not
Google researcher struggles to beat Mistral 7B model performance
By
–
