Running 7B Parameter Models Locally on Apple Silicon

I am increasingly excited about local ML on Apple Silicon. Core ML in particular is becoming a very nice stack to build on. Do you want a 7B-parameter model running at 30+ tokens/second, using less than 4 GB of memory, on your Mac? Then you need to
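The sub-4 GB figure is plausible if the weights are quantized to 4 bits; the post doesn't state the precision, so that is an assumption. A quick back-of-the-envelope check:

```python
# Rough weight-memory estimate for a 7B-parameter model at various precisions.
# Assumption: weights dominate; KV cache and activation overhead are ignored.
PARAMS = 7_000_000_000

for name, bits in [("float16", 16), ("int8", 8), ("int4", 4)]:
    gib = PARAMS * bits / 8 / 1024**3
    print(f"{name}: {gib:.2f} GiB")
```

Only the 4-bit variant fits the stated sub-4 GB budget; float16 weights alone would need roughly 13 GiB.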