OK, can confirm that it works on an M2 with 64GB of RAM. I had to quit Firefox and VS Code to free up enough RAM for it to run. Meta-Llama-3-70B-Instruct.Q4_0.llamafile now uses 38GB of RAM and runs at about 7.5 tokens a second.
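As a rough sanity check on that memory figure, here is a back-of-the-envelope sketch of how large a Q4_0 model's weights would be. It assumes a parameter count of about 70.6 billion and that every weight is stored in Q4_0 blocks (32 weights per block: one 2-byte fp16 scale plus 16 bytes of packed 4-bit values). Real files may use other quantization types for some tensors, and runtime memory also includes the KV cache and working buffers, so treat the result as an estimate only. The function names are made up for illustration.

```python
# Back-of-the-envelope size estimate for a Q4_0-quantized model.
# Assumptions (not from the original post): ~70.6e9 parameters,
# and every weight stored in Q4_0 format.

Q4_0_BLOCK_WEIGHTS = 32  # weights per Q4_0 block
Q4_0_BLOCK_BYTES = 18    # 2-byte fp16 scale + 16 bytes of packed 4-bit values


def bits_per_weight() -> float:
    """Effective storage cost of one weight in Q4_0, in bits."""
    return Q4_0_BLOCK_BYTES * 8 / Q4_0_BLOCK_WEIGHTS


def q4_0_size_bytes(n_params: float) -> float:
    """Approximate bytes needed to store n_params weights in Q4_0."""
    return n_params / Q4_0_BLOCK_WEIGHTS * Q4_0_BLOCK_BYTES


if __name__ == "__main__":
    n_params = 70.6e9  # assumed parameter count
    size = q4_0_size_bytes(n_params)
    print(f"{bits_per_weight():.1f} bits per weight")
    print(f"{size / 1e9:.1f} GB ({size / 2**30:.1f} GiB) for weights alone")
```

Under these assumptions the weights alone come to roughly 37 to 40 GB depending on whether you count in GiB or GB, which is in the same range as the 38GB reported above and explains why closing other applications was needed on a 64GB machine.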
Meta-Llama-3-70B runs on M2 Mac with 64GB RAM
