Run Meta’s Leaked LLaMA on Mac M1/M2 (llama.cpp)

On March 3rd, the user ‘llamanon’ leaked Meta’s LLaMA model weights on 4chan’s technology board /g/, making them available to anyone via torrent. A troll even attempted to add the torrent link to Meta’s official LLaMA GitHub repo.

Here’s how I got LLaMA set up on my Mac M1 with 64GB of RAM. A Windows guide is available here.

1. Get Model Weights

Here’s the torrent magnet link:

magnet:?xt=urn:btih:b8287ebfa04f879b048d4d4404108cf3e8014352&dn=LLaMA&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce

If you don’t have a torrent client, I recommend qBittorrent.

You don’t have to download all of the models. I recommend starting with the 7B version.
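
If you prefer working from the Terminal, a command-line torrent client can fetch the same magnet link. Here’s a sketch using aria2 (my choice for this example, not something the guide depends on), installable via Homebrew:

# Install aria2, then download the LLaMA torrent from the command line.
brew install aria2
# --seed-time=0 exits once the download completes instead of seeding.
aria2c --seed-time=0 'magnet:?xt=urn:btih:b8287ebfa04f879b048d4d4404108cf3e8014352&dn=LLaMA&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce'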

2. Get llama.cpp

Open your Terminal and enter these commands one by one:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
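
To sanity-check the build, confirm the compiled binaries are there. The names below (main and quantize) are what llama.cpp produced at the time of writing; newer versions may differ:

# List the binaries produced by make in the repo root.
ls -l main quantize
# Print the usage text to confirm the main binary runs.
./main --help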

3. Set up Python environment

pipenv shell --python 3.10
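
If pipenv isn’t installed yet, grab it with pip first; a plain virtual environment works just as well. This is an optional sketch, not a required step:

# Install pipenv if it isn't already on your system.
pip3 install pipenv
# Or skip pipenv entirely and use a standard venv instead:
python3 -m venv venv && source venv/bin/activate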

You need to create a models/ folder inside your llama.cpp directory that directly contains the 7B folder (or whichever model sizes you downloaded) plus the tokenizer files from the LLaMA download, as shown in the example below.
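
For example, assuming the torrent landed in ~/Downloads/LLaMA (adjust the paths to wherever you actually saved it):

# Create the models/ folder inside llama.cpp.
mkdir -p models
# Copy the 7B weights folder and the tokenizer files into it.
cp -R ~/Downloads/LLaMA/7B models/
cp ~/Downloads/LLaMA/tokenizer.model ~/Downloads/LLaMA/tokenizer_checklist.chk models/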

Next, install the dependencies needed by the Python conversion script.

pip install torch numpy sentencepiece
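
For reference, the conversion script is then run roughly like this. The script name and arguments match the llama.cpp version current when this was written and may have changed since:

# Convert the 7B PyTorch checkpoint to ggml format (the trailing 1 selects f16 output).
python3 convert-pth-to-ggml.py models/7B/ 1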
