@aymericroucher - AI Dynamics

ShowUI: Small End-to-End AI Agent Navigates Any UI

By

@aymericroucher

–

04 December 2024 15h44

ShowUI: a small end-to-end agent that can navigate any UI and outperforms much bigger systems! A team from NUS and Microsoft just released an agent that can act on any UI (Desktop, Android, Web)

→ View original post on X — @aymericroucher

4 December 2024

Adobe’s Code-Generating Agent Tops GAIA Leaderboard

By

@aymericroucher

–

02 December 2024 17h54

Adobe's code-generating agent reaches the top of the GAIA leaderboard – and they cite "Roucher, 2024" in the paper! Reminder: Broadly defined, an "Agent" is a system where an LLM is augmented with the ability to run

→ View original post on X — @aymericroucher

2 December 2024

Release of Qwen-QwQ-32B Preview Model on Hugging Face

By

@aymericroucher

–

29 November 2024 17h41

https://huggingface.co/chat/models/Qwen/QwQ-32B-Preview…

→ View original post on X — @aymericroucher

29 November 2024

Recommendation to try QwQ on HuggingChat

By

@aymericroucher

–

29 November 2024 17h41

Just go try QwQ on HuggingChat, it's

→ View original post on X — @aymericroucher

29 November 2024

Original MNIST dataset updated on Hugging Face

By

@aymericroucher

–

26 November 2024 20h48

MNIST original dataset updated by the himself on Hugging Face!

→ View original post on X — @aymericroucher

26 November 2024

The State of Generative AI in the Enterprise

By

@aymericroucher

–

25 November 2024 15h34

https://menlovc.com/2024-the-state-of-generative-ai-in-the-enterprise/…

→ View original post on X — @aymericroucher

25 November 2024

State of Enterprise AI 2024: Market Trends and Agent Adoption

By

@aymericroucher

–

25 November 2024 15h33

State of Enterprise AI 2024: Anthropic eating up OpenAI, Agents ramp up to 12% of use-cases, open models make 19% of usage. @MenloVentures surveyed 600 enterprise IT decision-makers

→ View original post on X — @aymericroucher

25 November 2024

Discussion on fine-tuning Llama-based models

By

@aymericroucher

–

23 November 2024 10h26

Yes, it's a fine-tune of Llama, so I hesitated to include it, but they do build great models

→ View original post on X — @aymericroucher

23 November 2024

New app shows no European company in top 10 LLM rankings

By

@aymericroucher

–

22 November 2024 14h12

Made a new app to visualize the LLM race ⇒ No European company in the top 10. I've adapted an app by @andrewrreed that tracks progress of LLMs on the Chatbot Arena leaderboard, to compare companies from different countries. The outcome is quite

→ View original post on X — @aymericroucher

22 November 2024

New leaderboard ranks LLMs for LLM-as-a-judge; Llama-3.1-70B tops

By

@aymericroucher

–

21 November 2024 17h00

New leaderboard ranks LLMs for LLM-as-a-judge: Llama-3.1-70B tops the rankings! Evaluating systems is critical during prototyping and in production, and LLM-as-a-judge has become a standard technique to do it. First, what is "LLM-as-a-judge"? It's a very useful technique

→ View original post on X — @aymericroucher

21 November 2024