CodeLlama PR merged with 4-bit quantization inference benchmarks

AI Dynamics

Global AI News Aggregator


01 September 2023 14h30

The CodeLlama PR just got merged: https://github.com/Lightning-AI/lit-gpt/pull/472 … When I tried it with bnb's (bitsandbytes) 4-bit NormalFloat (NF4) quantization, the 34B Instruct and Python variants used about 20 GB for inference.

Original post on X by @rasbt, 1 September 2023
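
As a rough sanity check on the roughly 20 GB figure, the memory needed for the weights alone can be estimated from the parameter count and the bits stored per parameter. The sketch below is a back-of-the-envelope calculation, not a measurement: the helper `weight_memory_gb` is a hypothetical name, the parameter count is assumed to be a nominal 34 billion, and 1 GB is taken as 10^9 bytes.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: "34B" is treated as exactly 34e9 parameters.
N_PARAMS = 34e9

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights, for comparison
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{nf4_gb:.0f} GB")
```

Under these assumptions the 4-bit weights come to about 17 GB, against about 68 GB at 16 bits. The gap between 17 GB and the reported figure would plausibly be taken up by quantization constants, activations, and other runtime overhead, although the post does not break this down.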

