agreed – automated AI slop doesn’t quite help either. But it’s good to stop every now and then and see where the field is headed too.
@reach_vb
-
Training Data Mix and Domain-Specific Evals Drive AI Model Performance
My intuition is two things:
1. Carefully curated pre-training data & finding the right data mix 2. Domain/ task specific evals for downstream use-cases In the end training data is still king! – finding which combination to go for is where the real moneys at. In addition- Post -
AI Personalization Gaming Disruption Unique Player Experiences
Unique experiences for each player – AI is going to disrupt gaming so hard! https://t.co/2RkQ7gche1
— Vaibhav (VB) Srivastav (@reach_vb) November 1, 2024
-
SmolLM2: Faster, Better, Cheaper Language Model
SmolLM2 – faster, better and cheaper! Intelligence is definitely too cheap to meter.
-
SmolLM2 1.7B Beats Larger Models with Apache 2.0 License
Fuck it – it’s raining smol LMs – SmolLM2 1.7B – beats Qwen 2.5 1.5B & Llama 3.2 1B, Apache 2.0 licensed, trained on 11 trillion tokens
> 135M, 360M, 1.7B parameter models
> Trained on FineWeb-Edu, DCLM, The Stack, along w/ new mathematics and coding datasets
> Specialises in -
VRAM Requirements for AI Models Across Hardware Architectures
It should work on CPU/CUDA/MPS across backends. W.r.t. hardware requirements:
1B should take roughly 2GB VRAM to load in fp16/bf16
600M should take ~1.2GB VRAM
350M – ~700MB VRAM
125M – ~250MB VRAM
Of course, at lower quants (Q4/Q8) you reduce this even further.
-
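The per-size numbers above follow a simple rule of thumb – weights-only VRAM ≈ parameter count × bytes per parameter (2 bytes in fp16/bf16, ~1 at Q8, ~0.5 at Q4). A minimal sketch of that arithmetic (the helper name is mine, not from the thread, and it ignores KV cache and activation memory, so real usage runs a bit higher):

```python
def estimate_vram_gb(n_params: float, bytes_per_param: float = 2.0) -> float:
    """Weights-only memory estimate: params * bytes per param, in GB.

    Default 2.0 bytes/param matches fp16/bf16; use ~1.0 for Q8, ~0.5 for Q4.
    Does not account for KV cache or activations.
    """
    return n_params * bytes_per_param / 1e9


# Matches the fp16/bf16 numbers in the thread:
print(estimate_vram_gb(1e9))    # 2.0  GB for a 1B model
print(estimate_vram_gb(600e6))  # 1.2  GB for 600M
print(estimate_vram_gb(350e6))  # 0.7  GB (~700MB) for 350M
print(estimate_vram_gb(125e6))  # 0.25 GB (~250MB) for 125M
```

At Q4 (`bytes_per_param=0.5`) the same 1B model drops to roughly 0.5GB, which is why low-bit quants make these models fit on nearly anything.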
Tiny AI Models Running on Everyday Devices
Models so smol that they'd even run on your toaster!