Played around with running local LLMs this weekend. First I took the harder path of setting up Ollama in Docker alongside n8n. I still have a lot to learn about n8n, but it has a ton of potential for doing cool things.
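For anyone curious, the Docker setup is roughly this, following Ollama's documented Docker instructions (the volume name and model tag here are just examples, not necessarily what I used):

```shell
# Run Ollama in Docker, persisting downloaded models in a named volume.
# 11434 is Ollama's default API port.
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull and chat with a model inside the container
# (model tag is an example).
docker exec -it ollama ollama run qwen2.5-coder:32b
```

With the container running, n8n can talk to Ollama's API at `http://localhost:11434` (or the container's hostname when both run on the same Docker network).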

Then I switched gears slightly and went with the easy route of trying LM Studio. That has been nice so far. Using a few sample prompts about Swift, SwiftUI, and TCA, I’m finding that Qwen2.5-Coder 32B is a pretty effective model to work with.
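One handy thing: LM Studio can serve loaded models over an OpenAI-compatible local API (port 1234 by default), so you can script against it too. A rough sketch, with the model identifier as a placeholder for whatever LM Studio shows for the loaded model:

```shell
# Ask the locally loaded model a question via LM Studio's
# OpenAI-compatible chat completions endpoint (default port 1234).
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder-32b-instruct",
    "messages": [
      {"role": "user", "content": "Summarize what a Reducer does in TCA."}
    ]
  }'
```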

I started by installing on an MBP M1 Max, and that was okay. However, the Mac Studio M3 Ultra generates roughly three times as many tokens per second, and it is a much more pleasant experience.