Hey @michelleyhbn @LucSGeorges, just wanted to share that there seem to be errors when using 4x A100 80GB GPUs to host Bixtral on HF endpoints… That seems like more than enough GPU memory, but it still crashed. I tried GPTQ etc. No dice, all failed. Not a huge deal for me, but wanted to make sure
Bixtral deployment errors on A100 GPUs with Hugging Face endpoints