Run 16+ AI coding agents locally — in parallel, on your hardware.
Run MLX and llama.cpp simultaneously on Apple Silicon, or vLLM on Linux/CUDA — across three coordinator modes (Flat, Pipeline, Router). No cloud, no API keys, no data leaves your machine.
npm i @keepdevops/matrix
Matrix Swarm is a local-first orchestration layer for open-weight LLMs. Spin up a fleet of role-specialized agents — Architect, Programmer, Security, DevOps, and more — then broadcast one prompt to all of them (Flat), pipe it through an agent sequence (Pipeline), or let a Router model dispatch each task to the best-fit agent (Router).
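The three coordinator modes can be sketched in a few lines. This is an illustrative stand-in, not the `@keepdevops/matrix` API: agents here are stubbed as plain functions, and the router's model call is replaced by a keyword check.

```typescript
// Illustrative sketch of the three coordinator modes.
// NOT the @keepdevops/matrix API — agents and routing are stubbed.
type Agent = { role: string; run: (prompt: string) => string };

const agents: Agent[] = [
  { role: "Architect", run: (p) => `[Architect] plan for: ${p}` },
  { role: "Programmer", run: (p) => `[Programmer] code for: ${p}` },
  { role: "Security", run: (p) => `[Security] audit of: ${p}` },
];

// Flat: broadcast one prompt to every agent; collect all answers.
function flat(prompt: string): string[] {
  return agents.map((a) => a.run(prompt));
}

// Pipeline: feed each agent's output into the next agent in sequence.
function pipeline(prompt: string): string {
  return agents.reduce((acc, a) => a.run(acc), prompt);
}

// Router: a dispatcher picks the best-fit agent per task.
// (Here a keyword check stands in for the Router model.)
function router(prompt: string): string {
  const pick = /vuln|exploit|audit/i.test(prompt) ? "Security" : "Programmer";
  return agents.find((a) => a.role === pick)!.run(prompt);
}
```

In the real system each `run` would call a locally served model (MLX, llama.cpp, or vLLM), and Flat mode would issue the calls in parallel rather than via a sequential `map`.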
Models run entirely on your hardware: MLX + llama.cpp on Apple Silicon, vLLM on Linux/CUDA. Nothing leaves the box. Built for engineers who want Cursor-class productivity without sending their codebase to the cloud.