Thanks for covering this Simon, @tqchenml & @charlie_ruan are the real MVPs for developing Web-LLM – for LLMs and WebGPU it is literally the superior choice!
@reach_vb
-
Structured JSON Generation with SmolLM2 in Browser
By
–
Fuck it! Structured Generation w/ SmolLM2 running in browser & WebGPU 🔥
— Vaibhav (VB) Srivastav (@reach_vb) November 28, 2024
Powered by MLC Web-LLM & XGrammar ⚡
Define a JSON schema, Input free text, get structured data right in your browser – profit!!
To showcase how much you can do with just a 1.7B LLM, you pass free text, … pic.twitter.com/x5GYWdmTe3
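The contract that structured generation gives you can be sketched in plain Python. This is an illustration, not the WebLLM/XGrammar API: the actual constraint enforcement happens inside XGrammar during decoding, and the schema and sample output below are made up for the sketch.

```python
import json

# A hypothetical JSON schema of the kind you would hand to the engine.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Under grammar-constrained decoding, the model's raw text is guaranteed
# to parse against the schema -- e.g. an output like this:
raw_output = '{"name": "Ada Lovelace", "age": 36}'

data = json.loads(raw_output)  # never raises when decoding is schema-constrained
assert all(key in data for key in schema["required"])
print(data["name"], data["age"])  # structured fields, ready to use
```

The point of the technique is that the `json.loads` call and the `required`-keys check can never fail: invalid tokens are masked out at decode time, so free text in always yields schema-valid JSON out.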
-
Optimizing AWQ Deployment for Flexible Model Distribution
By
–
Think a more likely one would be to help people create the best, most optimised AWQs and then they can deploy wherever they want. Ofc you can have direct deployment options, but I'm 100% sure there's value in this.
-
AGI Accessible: Install QwQ with Two Lines of Code
By
–
You too can have AGI in just a couple lines of code! `pip install transformers` & QwQ is all you need
-
Open Source Model Challenges OpenAI o1 Moat
By
–
That's an Apache 2.0 licensed model competing with OpenAI o1 preview – the moat never existed!
-
Qwen QwQ-32B Preview Model Now Available on Hugging Face
By
–
Try it out here: https://huggingface.co/spaces/Qwen/QwQ-32B-preview
-
QwQ-32B Model Now Available on Hugging Face Hub
By
–
Model on the hub, try it out: https://huggingface.co/Qwen/QwQ-32B-Preview
-
Qwen QwQ 32B Outperforms o1 Mini Model
By
–
WTF! Qwen COOKED – QwQ 32B beats o1 mini and competes with preview!
-
UV: Fast Python package manager and environment tool
By
–
mate have you looked at uv? just do `brew install uv` followed by: `uv venv --python 3.12` that's it https://docs.astral.sh/uv/
-
Model weights and inference code now available
By
–
check out the model weights and inference code here: