On-device AI coding.
No cloud. No limits.

A complete coding agent that executes entirely on your machine. No API calls. No usage caps.

The Problem

You don't own your AI.
And you're being watched.

Monitoring Active
Data Extraction · 001

They train on your code.

Every prompt. Every file. Every fix. It flows through infrastructure you don't control — improving systems they want to use to replace you.

Artificial Scarcity · 002

They meter your ambition.

Slowdowns, overages, caps. Right when you're deep in a sprint, the meter decides you've had enough.

Silent Downgrades · 003

They change the model.

They silently downgrade to cheaper models during peak load. Full price, degraded experience.

Cloud Dependency · 004

They control your flow.

Every completion makes a round trip across the internet. Thousands of tiny interruptions, every single day.

Introducing Rig

Everything local.
Own your AI.

A complete AI coding agent running entirely on your own hardware. No usage limits. No cloud dependency.

[Diagram: Cloud vs. Your Machine. Your code (keystrokes, files) → Rig local inference (GPU, index, model) → response in <300ms, on device. Rig model active on your machine; nothing leaves, and no telemetry reaches cloud servers.]

Offline

Work offline

Flights. Spotty Wi-Fi. Network outages. Nothing stops your flow.

Unlimited

Remove the meter

Refactor the whole codebase. Riff on an idea all day. Run agent loops without thinking about cost.

Privacy

Sever the connection

Your code, keystrokes, and files never leave your machine. Not anonymized. Not aggregated. Not sent.

Latency

Stop waiting

No round-trip to a data center. Inference happens on your machine, with responses in under 300ms.

Our Approach

Purpose beats scale.

Rig is a closed system — model, context, tools, and inference — engineered together for one job: real coding work.

Step 01

A focused model, trained specifically for coding.

Every parameter in the model is dedicated to coding, planning, tool use, and structured edits. The entire training process is focused on engineering work.

By narrowing the domain, we concentrate intelligence where it matters — deeper reasoning, better code, sharper tool use.

Step 02

Full intelligence, compressed to fit your machine.

The model is compressed to run efficiently on consumer machines — carefully preserving the reasoning patterns that matter most.

The result is an 8 GB model that fits comfortably in memory on a MacBook. Full reasoning. Local execution. Zero cost per token.

Step 03

A custom runtime, engineered for Apple Silicon.

The model runs through a custom inference engine optimized specifically for Apple Silicon. Model, context engine, and tools are designed as a single coordinated system.

That tight integration is what makes local execution fast, reliable, and practical.

Capabilities

Your machine, unleashed.

[ 01 ]

Understands your architecture.

Builds a connected model of modules, dependencies, and relationships so reasoning happens across files and aligns with your architecture.

[ 02 ]

Tracks relationships, prevents breakage.

Edits that respect function contracts, type boundaries, and dependency graphs — reducing bugs and regressions.

[ 03 ]

Strategizes before acting.

Explore → Plan → Execute workflows ensure multiple steps are reasoned out before changes occur.

[ 04 ]

Executes complex coding workflows.

From refactors to test generation to feature builds, Rig coordinates tools, code edits, web search, and commands as needed.

[ 05 ]

Isolates agent sandboxes.

Each agent runs in its own workspace so experiments are safe, parallel workflows don't clash, and code changes stay isolated until you merge them.

[ 06 ]

Runs at full speed.

Custom Rust inference engine optimized for CUDA and Metal — delivering up to 144 tokens per second on consumer hardware.

Latency: 0ms. No round-trip required.
Privacy: 100%. Air-gapped by design.
Cost / Token: $0. Your GPU, your tokens.
Uptime: Local. No dependency on cloud.

Engineered Intelligence

Built for control freaks

RIG://LOCALHOST · OFFLINE
λ rig init
██████╗  ██╗ ██████╗
██╔══██╗ ██║ ██╔════╝
██████╔╝ ██║ ██║  ███╗
██╔══██╗ ██║ ██║   ██║
██║  ██║ ██║ ╚██████║
╚═╝  ╚═╝ ╚═╝  ╚═════╝
> Scanning hardware...
> Found M4 · 16GB RAM
> Loading RIG Model OK
> Indexing 2,418 files · 87,102 symbols
Ready. Network: OFF · Telemetry: OFF
λ
Early Access

Rig is almost ready.

We're inviting engineers to run it on real code and help shape what ships.

FAQ

Frequently asked questions.

Rig is a local-first AI coding assistant that runs entirely on your machine. It uses a modified open-source model post-trained exclusively for code, executed by a custom Rust inference engine optimized for Apple Silicon. Rig delivers fast, low-latency agentic coding, requires no API calls, has no usage caps, collects zero telemetry, and costs $0 per token. All code and files stay on your machine. Rig currently supports macOS, with Linux and Windows support planned.

Rig uses a customized open-source model. We modified it to work exclusively with the Rig agent harness, context engine, and tools. This lets us shrink the model's total footprint without losing intelligence or coding capability.

Rig is currently optimized to run on Apple Silicon devices with an M2 or later chip and at least 32GB of RAM. We hope to keep reducing the memory requirements so that Rig one day works well with only 16GB of RAM. Support for Windows and Linux is coming soon.

Rig's model is still in development, so we do not have benchmarks available yet. Our early tests indicate the Rig model will be on par with state-of-the-art models, thanks to the combination of our context engine and post-training pipeline.

Rig has all the tools you'd expect from a coding agent, including web search, file read/write, plan mode, and more.

Rig's pricing is planned as a flat monthly or annual subscription, priced on par with other coding agents but completely unlimited and offline.

Rig is committed to being the most secure and private coding agent available. Our telemetry will be limited to a license check with a grace period. Your code and conversations will never leave your machine.

We are rolling out closed beta access now. Keep an eye on your email for an invite to the test builds and the Slack community. Wider release is planned for Q3 2026. We're focused on creating the best possible coding assistant, one capable of supporting real software engineers on their most important projects.

Break free
from big AI

Request Early Access ↵

No credit card. No usage meter.