@aymericroucher - AI Dynamics

Meta Releases Llama-3.1 with 405B Model and Expanded Context

By

–

23 July 2024 17h17

Mark Zuckerberg just announced the GPT-4 killer: Llama-3.1. Two breakthroughs:
New 405B, possibly the strongest LLM ever, slightly above GPT-4o in many domains
Improved 8B & 70B models, with a much larger context size of 128k vs 8k ⇒ game-changer for RAG and Agents.

→ View original post on X — @aymericroucher

23 July 2024

DoLa: New AI Decoding Method Reduces Hallucinations

By

@aymericroucher

–

10 July 2024 16h01

DoLa was just merged in transformers! This new decoding method works by contrasting token logits between the final layer and earlier layers, with the premise that high level knowledge builds in top layers rather than the first ones. It significantly reduces hallucinations!
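To make the contrast idea concrete, here is a minimal pure-Python sketch of the scoring step. It is not the transformers implementation: the function names, the toy logits, and the `alpha` threshold are all assumptions made for illustration, and it contrasts against a single fixed early layer, whereas the DoLa paper picks the early layer dynamically.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def dola_contrast(final_logits, early_logits, alpha=0.1):
    """Score tokens by how much the final layer raised them over an early layer.

    Tokens whose final-layer probability is below alpha * (top probability)
    are masked out, so an implausible token cannot win on contrast alone.
    """
    final_lp = log_softmax(final_logits)
    early_lp = log_softmax(early_logits)
    cutoff = max(final_lp) + math.log(alpha)
    return [
        f - e if f >= cutoff else float("-inf")
        for f, e in zip(final_lp, early_lp)
    ]

def pick_token(scores):
    """Greedy choice: index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)

# Toy 4-token vocabulary (made-up numbers). Token 0 is already favoured by
# the early layer; token 1 only becomes likely in the final layer.
final_logits = [2.0, 1.8, -3.0, -5.0]
early_logits = [2.0, 0.0, -3.0, -1.0]

greedy_choice = pick_token(final_logits)
dola_choice = pick_token(dola_contrast(final_logits, early_logits))
```

In this toy case plain greedy decoding picks token 0, while the contrast picks token 1, the token whose probability grew the most between the early and the final layer.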

→ View original post on X — @aymericroucher

10 July 2024

Using Transformers ReactCodeAgent with Llama-3-70B for Data Analysis

By

@aymericroucher

–

09 July 2024 19h13

I'm currently exploring Transformers' ReactCodeAgent as a data analyst on Kaggle's Titanic dataset, using Llama-3-70B-Instruct as the engine. The results are insane! The code and plot below were all generated by the agent.

→ View original post on X — @aymericroucher

9 July 2024

AI Agent for Self-Correcting Text-to-SQL Queries

By

@aymericroucher

–

09 July 2024 14h14

One more cookbook:
Agent for self-correcting Text-to-SQL. What if the query generated by your Text-to-SQL pipeline is correct SQL but returns wrong results? We need to add a critique step. That's very simple with an agent!
Check out the notebook!
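The critique step can be sketched as a plain retry loop. This is not the notebook's code: `self_correcting_sql`, `fake_generate_sql` and `fake_critique` are hypothetical names, and the two stubs stand in for what would be LLM calls in a real agent. The table and its rows are made up.

```python
import sqlite3

def self_correcting_sql(question, conn, generate_sql, critique, max_attempts=3):
    """Generate a query, run it, critique the result, and retry with feedback."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        query = generate_sql(question, feedback)
        try:
            rows = conn.execute(query).fetchall()
        except sqlite3.Error as err:
            # Invalid SQL: feed the error message back to the generator.
            feedback = f"Query failed: {err}"
            continue
        problem = critique(question, query, rows)
        if problem is None:
            return query, rows, attempt
        # Valid SQL but wrong result: feed the critique back instead.
        feedback = problem
    raise RuntimeError(f"No acceptable query after {max_attempts} attempts")

# Toy in-memory database (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (customer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO receipts VALUES (?, ?)",
    [("Alice", 12.5), ("Bob", 48.0), ("Carol", 30.0)],
)

def fake_generate_sql(question, feedback):
    """Stub for the LLM: the first draft is valid SQL with the wrong sort order."""
    order = "ASC" if feedback is None else "DESC"
    return f"SELECT customer FROM receipts ORDER BY price {order} LIMIT 1"

def fake_critique(question, query, rows):
    """Stub for the critique: returns a complaint, or None if the result looks right."""
    if "most" in question.lower() and "DESC" not in query.upper():
        return "The question asks for the largest value; sort in descending order."
    return None

query, rows, attempts = self_correcting_sql(
    "Which customer paid the most?", conn, fake_generate_sql, fake_critique
)
```

The first draft runs without error but returns the cheapest receipt, which is exactly the case a syntax check alone would miss; the critique sends it back and the second attempt is accepted.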

→ View original post on X — @aymericroucher

9 July 2024

Agentic RAG with Transformers Agents Improves Retrieval Performance

By

@aymericroucher

–

08 July 2024 14h52

New cookbook! I show how to make agentic RAG using Transformers Agents. Compared to vanilla RAG, agentic RAG can:
Reformulate the query
Critique the retrieved content to re-retrieve if needed
Score increase of 8.5%! (Llama-3-70B-judge)
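As a rough illustration of the reformulate-and-re-retrieve loop, here is a toy sketch in pure Python. It is not the cookbook's code: the keyword-overlap retriever, the two documents, and the `reformulate` / `is_sufficient` callables are all assumptions standing in for a real vector store and LLM calls.

```python
def retrieve(query, docs, k=2):
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def agentic_retrieve(question, docs, reformulate, is_sufficient, max_steps=3):
    """Retrieve, critique the hits, and retry with a reformulated query if needed."""
    query = question
    tried = []
    for _ in range(max_steps):
        tried.append(query)
        hits = retrieve(query, docs)
        if is_sufficient(question, hits):
            return hits, tried
        query = reformulate(question, tried)
    return [], tried

docs = [
    "push_to_hub uploads a model to the Hugging Face Hub",
    "gradient checkpointing trades compute for memory",
]

# Stubs: in a real agent both of these would be LLM decisions.
def fake_reformulate(question, tried):
    return "upload model hub"

def fake_is_sufficient(question, hits):
    return len(hits) > 0

hits, tried = agentic_retrieve(
    "How do I share my network online?", docs, fake_reformulate, fake_is_sufficient
)
```

The user's wording shares no keywords with either document, so vanilla retrieval would come back empty; the reformulated query is what finds the relevant passage.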

→ View original post on X — @aymericroucher

8 July 2024

Free AI Super-Resolution Model with 600M Parameters Released

By

@aymericroucher

–

02 July 2024 18h06

I remember when the "super resolution" filter used by Jack Bauer in 24 seemed like sci-fi bullshit.

But now you have free models, 600M parameters, that can do precisely that 🤯 https://huggingface.co/fal/AuraSR
- m_ric (@AymericRoucher), 2 July 2024

→ View original post on X — @aymericroucher

2 July 2024

Effectiveness of Plain Prompting vs Fine-Tuned Models for AI Agents

By

@aymericroucher

–

02 July 2024 14h32

It's trendy to share models "fine-tuned for function calling", e.g. Command-R-Plus or Mixtral-8x22B. But you don't need this to make good agents. Cf. graph:
The count of incorrectly formatted actions is already close to 0 with plain prompting! (GPT-4o, GAIA validation run)

→ View original post on X — @aymericroucher

2 July 2024

Google Releases Gemma-2: Leading Open-Source LLM

By

@aymericroucher

–

27 June 2024 18h24

Google just released Gemma-2. The 27B version:
Directly becomes the best open-source LLM as per Chatbot Arena
Punches wayyy above its weight: I plotted Arena Elo vs model size below, it's crazy

→ View original post on X — @aymericroucher

27 June 2024

Code Agent Built with Transformers Agents Tops GAIA Leaderboard

By

@aymericroucher

–

27 June 2024 14h54

With @sergeipetrov we built a Code agent with Transformers Agents to beat the GAIA leaderboard. It worked well! Our submission scores #2 overall on the test set and #1 on the validation set. On both sets we are #1 on the hardest Level 3 questions, reaching nearly 20%.

→ View original post on X — @aymericroucher

27 June 2024

Exploring Concept Emergence in Large Language Models

By

@aymericroucher

–

07 June 2024 14h46

LLMs are huge piles of neurons that somehow give useful outputs, but at which point do real concepts emerge from this mathematical mess? The @Anthropic team did fascinating work on that: read my summary here.

→ View original post on X — @aymericroucher

7 June 2024