this should not happen – can you try with the latest tagged image (3.0.1)? Happy to flag it to the team if it still doesn't work! Sorry for the inconvenience! https://github.com/huggingface/text-generation-inference/releases/tag/v3.0.1
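For reference, trying the tagged image as suggested might look like the following — a minimal sketch, assuming the standard TGI container workflow; the model id, port, and volume path here are illustrative assumptions, not from the thread:

```shell
# Pull the v3.0.1 tagged TGI image from the official registry
docker pull ghcr.io/huggingface/text-generation-inference:3.0.1

# Serve a model with it (model id, port, and cache volume are examples)
docker run --gpus all -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:3.0.1 \
  --model-id HuggingFaceH4/zephyr-7b-beta
```

Pinning the tag (rather than `latest`) is what lets you confirm whether the reported issue is actually fixed in this specific release.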
…
@reach_vb
-
Text Generation Inference v3.0.1 release fixes reported issue
-
Frontier Model Development Cost Drops to 5.5 Million USD
Scarcity breeds Innovation – the cost to build a frontier model: 5.5 million USD. In a way, that's the maximum it'd be (note: H800s have ~2x slower chip-to-chip data transfer). This cost will only go down further and further as we continue to find newer walls to scale!
-
Qwen and Meta GPU Mobilization in AI Competition
I wouldn't discount Qwen or Meta either – at least the latter is mobilising a metric fk ton of GPUs
-
Open Science Momentum Builds Into 2025
4 days to the end of 2024, and open science is winning! Bring it on in 2025!
-
TGI Native Support Available Through Version 2.5
We do support it natively in TGI (till v2.5) – top of the list
-
DeepSeek API Direct Consumption Performance Evaluation
That said, what’s wrong w/ consuming directly from the DeepSeek API? It looks pretty fkn fast to me.
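Consuming the API directly is straightforward if it follows the OpenAI-compatible chat-completions format. Below is a minimal sketch of building such a request body; the endpoint URL and model name are assumptions for illustration – check the provider's docs before relying on them:

```python
import json

# Assumed OpenAI-compatible chat-completions endpoint (illustrative).
API_URL = "https://api.deepseek.com/chat/completions"

def build_chat_request(prompt: str, model: str = "deepseek-chat") -> str:
    """Serialize an OpenAI-style chat-completions request body as JSON."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(payload)

# The resulting JSON body would be POSTed to API_URL with an
# "Authorization: Bearer <your-api-key>" header.
body = build_chat_request("Hello!")
```

Because the request shape matches the OpenAI format, most existing OpenAI client libraries can also be pointed at such an endpoint by overriding the base URL.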
-
GPU Deployment Challenges for AI Model Infrastructure
Not sure if we have the spare GPUs atm, doesn’t look as likely in the short term tho – since it might need some updates in TGI. Folks from @hyperbolic_labs mentioned (on shared slack) they might deploy it!
-
Meta Llama’s Acceptable Use Policy: Comparable to Industry Standards
Note: Meta Llama comes with the same, if not a harsher, AUP
-
DeepSeek V3 License More Liberal Than Llama Series
The license allows commercial usage *without* any revenue caps and only asks one to respect the Use Policy! This makes DeepSeek V3 even more liberal than the Llama 3.x series