Feels like a lot of fertile ground is left in managing the "attention" of an LLM during its training via a meta-learning policy, instead of the typical "memorize the dataset uniformly at random" strategy. And in giving it a calculator and a scratchpad.
LLMs
-
Training Strategies: Skimming, Filtering Noise, and Revisiting Content
By
–
More generally, a few remarkable strategies people use during their own training:
1) skim text because they already know it
2) ignore text because it's clearly noise (e.g. they won't memorize SHA256 hashes. LLMs will.)
3) revisit parts that are learnable but not yet learned
-
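The three strategies above can be sketched as a loss-based selection policy. This is purely illustrative: the function name, thresholds, and loss-history format are all assumptions, not part of any real training setup.

```python
import numpy as np

def select_batch(losses, loss_history, low=0.5, high=8.0):
    """Toy selection policy over per-example LM losses (illustrative
    thresholds, not tuned values). Mirrors the three strategies:
      1) skim: drop examples the model already knows (loss < `low`)
      2) filter noise: drop examples whose loss stays near `high`
         across presentations (a SHA256 hash never gets easier)
      3) revisit: keep everything learnable but not yet learned
    """
    losses = np.asarray(losses, dtype=float)
    keep = losses > low                    # 1) already learned -> skim past it
    for i, hist in enumerate(loss_history):
        if len(hist) >= 2 and hist[-1] > high and hist[-1] >= hist[0] - 0.5:
            keep[i] = False                # 2) unlearnable noise -> ignore it
    return keep                            # 3) the rest gets revisited

# ex0 is already memorized, ex1 is learnable, ex2 looks like a random hash
mask = select_batch([0.1, 3.2, 9.5],
                    [[0.4, 0.1], [5.0, 3.2], [9.6, 9.5]])
```

Only the middle example survives: the memorized one is skimmed and the hash-like one is filtered out.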
Examples vs. Presentations: Spaced Repetition in LLM Training
By
–
Is it the number of examples that matters, or the number of presentations to the model during training? E.g. humans use spaced repetition to memorize facts, but there is no equivalent technique in LLMs, where the typical training regime is uniform random sampling.
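One way to make the question concrete: a toy Leitner-style scheduler over training examples, where an example's revisit interval doubles whenever the model "recalled" it (its loss improved) and resets when it didn't. Everything here is a sketch of what such a regime could look like, not an existing LLM training technique.

```python
import heapq

class SpacedRepetitionSampler:
    """Toy spaced-repetition scheduler for training examples.

    Instead of sampling uniformly at random, each example carries a
    revisit interval: recalled examples wait twice as long before
    their next presentation; forgotten ones come back immediately.
    """
    def __init__(self, example_ids):
        self.step = 0
        self.interval = {i: 1 for i in example_ids}
        self.queue = [(0, i) for i in example_ids]  # (due_step, example_id)
        heapq.heapify(self.queue)

    def next_batch(self, size):
        """Return up to `size` examples that are due at this step."""
        self.step += 1
        due = []
        while (self.queue and self.queue[0][0] <= self.step
               and len(due) < size):
            due.append(heapq.heappop(self.queue)[1])
        return due

    def report(self, example_id, improved):
        # Double the gap if the model "recalled" it, else reset to 1.
        self.interval[example_id] = (
            self.interval[example_id] * 2 if improved else 1)
        heapq.heappush(self.queue,
                       (self.step + self.interval[example_id], example_id))
```

Under this scheduler the number of presentations per example diverges from the number of examples, which is exactly the distinction the question is after.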
-
LangChain 0.0.13 Release: Vector DB QA and Documentation Updates
By
–
LangChain Version 0.0.13:
– Question/Answering w/ a vector DB chain (demo coming tmrw)
– Loading a prompt from a text file (@edmarferreira's first commit!)
– Misc cleanup (Eugene x4!!!)
– Big documentation overhaul (w/ Eugene again)
-
GPT-2’s Inscrutable Internal Matrices Remain Poorly Understood
By
–
Nobody's ever even going to understand how GPT-2 worked, except that there sure were a lot of inscrutable matrices in there.
-
Hybrid AI Systems More Practical Than Full LLM Retraining
By
–
I love the dramatic futurism here! But it's more likely they will be hybrid systems. You don't want to retrain an LLM every time a Wikipedia page is changed or a news item is published. (Plus, Google would be in the best place to deliver this new system, incrementally.)
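A minimal sketch of the hybrid idea: keep the LLM frozen and feed it fresh facts through retrieved context, so editing a document requires no retraining. The keyword-overlap retriever below is a deliberately toy stand-in for a real search index; the function names are made up for illustration.

```python
def retrieve(query, documents, k=2):
    """Toy keyword-overlap retriever; a stand-in for a real search
    index. Editing `documents` updates answers with no retraining."""
    q = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # The frozen LLM reads fresh facts from retrieved context
    # instead of from its weights.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

When a Wikipedia page changes, only the document store changes; the model and its weights stay put.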
-
When We Realize We’re Just Stochastic Parrots
By
–
what happens when we realize we were just stochastic parrots all along?
-
LangChain 0.0.12 Release with AI21Labs and Manifest Integrations
By
–
LangChain version 0.0.12: Two super exciting integrations!
– @AI21Labs integration (from a friend of @YuvalinTheDeep)
– Integration with @HazyResearch's manifest library (with help from @laurel_orr1)
-
AGI Parameter Count vs. Brain Synapses: Current Models at 1/1000 Scale
By
–
A common view is that human-level AGI will require a parameter count on the order of magnitude of the brain's 100 trillion synapses. The large language models and image generators are only about 1/1000 of that, but they already contain more information than a single human.
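The arithmetic behind the 1/1000 figure works out as follows (the GPT-3 comparison is my addition, not from the tweet):

```python
synapses = 100e12             # ~100 trillion synapses in the human brain
llm_params = synapses / 1000  # "about 1/1000 of that"
# 1e14 / 1e3 = 1e11: roughly 100 billion parameters, which matches
# the scale of current large models (GPT-3, for instance, has 175B).
```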
-
LangChain 0.0.11 Release: New Embeddings and Text Splitting Features
By
–
LangChain version 0.0.11:
– @CohereAI embedding support from @abdrahman_issam
– @NLTK_org and @spacy_io support for text splitting from @deliprao
– "Optimized Prompts" from @sjwhitmore
– misc cleanup from @deliprao and @nlarusstone
https://github.com/hwchase17/langchain