Yeah! But that's okay, no? I see MoEs more for actual production usage.
@reach_vb
-
DeepSeek-VL2: Enhanced Multimodal Model with Dynamic Tiling
Improvements from v1:
> 2x high-quality training data vs DeepSeek-VL1
> Dynamic image tiling for flexible resolutions + efficient DeepSeek-MoE for LM
> 3-stage training + new multi-modal parallel strategies for efficiency
-
DeepSeek-VL2 Achieves SoTA Vision Performance with Fewer Parameters
The whale strikes again! DeepSeek-VL2
> DeepSeek-VL2-Tiny, DeepSeek-VL2-Small, and DeepSeek-VL2, with 1.0B, 2.8B, and 4.5B activated parameters
> SoTA perf with similar or fewer activated parameters compared to Qwen 2 VL
> Excels at visual question answering, optical
-
Open AI ML Advances Span Multiple Modalities and Research Labs
It's been an ABSOLUTELY smashing last couple of weeks – across modalities, sizes and research labs! There's no stopping the open AI/ML train!
-
Model Release Support: Hugging Face Weights Distribution
Congrats on the release! – let me know if you need any help with putting the weights on Hugging Face!
-
Qwen 2.5 72B Outperforms GPT-4o and Claude Sonnet
Wait WTF, that's Qwen 2.5 72B absolutely nailing GPT-4o & Claude Sonnet!
-
Hermes 3 Llama 3.2 3B Fine-Tuned for Coding
Nous dropped Hermes 3 Llama 3.2 3B: a full fine-tune of L3.2 3B focused on coding, structured outputs and function calling! Handles long-context, multi-turn conversations and reasoning brilliantly too.
-
AI Starter Pack: $1000 to Explore Cutting-Edge AI
Quite excited to announce the AI Starter Pack! If you've been thinking about getting into AI – this is your cue – get $1000 to try out the bleeding edge of AI. End the year with a bang! – APPLY today!
-
AI Tool for Research and Topic Understanding Workflows
ngl, this shit slaps! i'd totes use something like this for my research/understanding-a-topic workflows! pic.twitter.com/aWOotvIuAe
Vaibhav (VB) Srivastav (@reach_vb), December 11, 2024