@aymericroucher - AI Dynamics

L-Mul algorithm slashes transformer computation costs by 80%

By

–

08 October 2024 17h10

L-Mul: Addition-Only Multiplication can slash computational costs by 80%! Researchers at @MSFTResearch dropped a groundbreaking technique that could slash the energy use in transformer computations: their novel "linear-complexity multiplication" (L-Mul) algorithm

→ View original post on X — @aymericroucher

8 October 2024

Simplified old-school RNNs rival modern transformers

By

@aymericroucher

–

04 October 2024 11h52

Old-school RNNs can actually rival fancy transformers! Remember good old RNNs (Recurrent Neural Networks)? Well, researchers from Mila and @BorealisAI have just shown that simplified versions of decade-old RNNs can match the performance of today's transformers. They took a

→ View original post on X — @aymericroucher

4 October 2024

Chinese AI models expanding globally but underrated

By

@aymericroucher

–

03 October 2024 15h54

出海: Chinese AI is expanding globally. Chinese LLMs are heavily underrated. I regularly feel like Chinese AI releases do not get the recognition they deserve, for instance the recent excellent DeepSeek-V2.5 or Qwen models. Luckily for us, @AdinaYakup just wrote an

→ View original post on X — @aymericroucher

3 October 2024

Everyday Transformers Optimizations: KV Cache, FlashAttention, PagedAttention

By

@aymericroucher

–

02 October 2024 14h25

This blog post is really cool for understanding everyday Transformers optimizations like KV cache, FlashAttention or PagedAttention: https://astralord.github.io/posts/transformer-inference-optimization-toolset/
The image below is the interactive visualization for KV cache!

→ View original post on X — @aymericroucher

2 October 2024

Emu3: single model handles text, images, and videos

By

@aymericroucher

–

01 October 2024 17h48

> Emu3: Next-token prediction conquers multimodal tasks. This is the most important research in months: we’re now very close to having a single architecture to handle all modalities. The folks at BAAI just released Emu3, a single model that handles text, images, and videos all

→ View original post on X — @aymericroucher

1 October 2024

Add source highlighting to your RAG system for trust

By

@aymericroucher

–

01 October 2024 14h42

> Add source highlighting to your RAG system! 📄💡

RAG systems are supposed to make your LLM's answer more trustworthy, by inserting in the prompt some supporting documents from a knowledge base: we say that we're "adding some context".

👎 But if you don't know which part of… pic.twitter.com/5KmE7wcMua
— m_ric (@AymericRoucher) 1 October 2024
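The "adding some context" step described above can be sketched in a few lines. This is a minimal illustration, not the method from the post: the function name `build_rag_prompt`, the prompt wording, and the sample documents are all hypothetical. Numbering each document is one simple way to let an answer point back at its sources.

```python
def build_rag_prompt(question, documents):
    """Insert numbered supporting documents into the prompt, ahead of the question.

    Hypothetical helper for illustration only; the prompt wording is an assumption.
    """
    # Number each document so the answer can refer to "[1]", "[2]", ...
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the context below, "
        "and cite the numbers of the documents you relied on.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up knowledge-base snippets, standing in for retrieved documents.
docs = [
    "The support desk is open from 9am to 5pm on weekdays.",
    "Refund requests must be filed within 30 days of purchase.",
]
prompt = build_rag_prompt("When can I reach the support desk?", docs)
print(prompt)
```

The retrieval step itself (choosing which documents to pass in) is left out here; the sketch only shows how retrieved text ends up in the prompt.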

→ View original post on X — @aymericroucher

1 October 2024

Transformers v4.45.0: lightning-fast method to build tools

By

@aymericroucher

–

26 September 2024 12h11

Transformers v4.45.0 released: includes a lightning-fast method to build tools! During user research with colleagues @MoritzLaurer and Joffrey Thomas, we discovered that the class definition currently used to define a Tool in transformers.agents is a bit tedious to use,

→ View original post on X — @aymericroucher

26 September 2024

Understanding Attention: K and V matrices, masking, and -inf for softmax

By

@aymericroucher

–

25 September 2024 13h58

This is a must-watch to understand how attention works! Great visualization, explaining:
– Why the K and V matrices, and what do they represent?
– Why mask the lower left part of the QK product?
– Why apply -inf to the lower left part of the QK product before softmax rather than just
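
The effect of writing -inf into a masked score before softmax can be checked on a single row of scores. The sketch below is an illustration under assumptions, not taken from the video: the scores are made up, and writing 0 into the masked slot is just one alternative picked for comparison.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores (-inf entries allowed)."""
    peak = max(x for x in row if x != float("-inf"))
    exps = [math.exp(x - peak) for x in row]  # math.exp(-inf) is exactly 0.0
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 3.0]    # made-up attention scores for one query
keep = [True, True, False]  # the last position must not be attended to

# Alternative chosen for comparison: overwrite the masked score with 0.
masked_with_zero = [s if k else 0.0 for s, k in zip(scores, keep)]
# Usual approach: overwrite the masked score with -inf.
masked_with_inf = [s if k else float("-inf") for s, k in zip(scores, keep)]

w_zero = softmax(masked_with_zero)
w_inf = softmax(masked_with_inf)

print("masked with 0:   ", [round(w, 3) for w in w_zero])
print("masked with -inf:", [round(w, 3) for w in w_inf])
```

With 0, the masked position still receives some weight, because exp(0) is 1 rather than 0. With -inf, its weight is exactly zero and the remaining weights still sum to 1.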

→ View original post on X — @aymericroucher

25 September 2024

IBM and NASA release open-source AI model for weather and climate

By

@aymericroucher

–

25 September 2024 11h18

Read the announcement post: https://newsroom.ibm.com/2024-09-23-ibm-and-nasa-release-open-source-ai-model-on-hugging-face-for-weather-and-climate-applications
Model on the Hub: https://huggingface.co/Prithvi-WxC

→ View original post on X — @aymericroucher

25 September 2024

First Foundation Weather Model Prithvi WxC Enables Life-Saving Predictions

By

@aymericroucher

–

25 September 2024 11h18

> The first ever Foundation weather model: Prithvi WxC enables life-saving weather predictions! Hurricane Katrina killed hundreds of people as it made landfall on New Orleans in 2005 – many of these deaths could have been avoided if alerts had been given one day earlier.

→ View original post on X — @aymericroucher

25 September 2024