How to Set Up Local AI Coding with Continue and Ollama (Updated Feb 2026)
A step-by-step guide to running AI code completions and chat entirely on your machine using Continue and Ollama---no cloud API keys, no data leaving your computer.
Editorial Team
The AI Coding Tools Directory editorial team researches and reviews AI-powered development tools to help developers find the best solutions for their workflows.
Continue and Ollama together provide AI-assisted coding without sending any code to the cloud. Continue is an open-source VS Code and JetBrains extension, and Ollama runs large language models locally on your machine. This setup gives you completions and chat at zero ongoing cost with full privacy. This guide walks through installation, model selection, configuration, and troubleshooting.
TL;DR
- Install Ollama (`brew install ollama`) and pull a code model like `deepseek-coder-v2`, then install the Continue extension in VS Code.
- Configure Continue to use Ollama in `~/.continue/config.yaml` -- no API keys or cloud accounts needed.
- The setup costs $0 beyond hardware: all inference runs locally on your machine.
- You need at least 8 GB RAM (16 GB recommended) for code-capable models; GPU helps but is not required.
- You can mix local and cloud models in Continue, using Ollama for routine work and a cloud API for tasks that need frontier reasoning.
What You Need
| Requirement | Details |
|---|---|
| Machine | Mac, Linux, or Windows with at least 8 GB RAM (16 GB recommended for larger models) |
| Editor | VS Code or a JetBrains IDE |
| Time | About 10--15 minutes for initial setup |
| Cost | $0 (everything is free and open-source) |
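To see which model class your machine can handle, you can check total RAM from a terminal. A minimal sketch for Linux (macOS users can run `sysctl -n hw.memsize` instead, which reports bytes):

```shell
# Read total RAM from /proc/meminfo (kB) and convert to GB.
total_kb=$(grep MemTotal /proc/meminfo | awk '{print $2}')
total_gb=$(( total_kb / 1024 / 1024 ))
echo "Total RAM: ${total_gb} GB"
if [ "$total_gb" -ge 16 ]; then
  echo "OK for 16 GB-class models like deepseek-coder-v2"
else
  echo "Stick to 7B-class models like codellama"
fi
```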
Step 1 --- Install Ollama
Ollama runs LLMs locally on your machine. Download it from ollama.com or install via Homebrew on macOS:
```shell
brew install ollama
```
Start the Ollama service (it may auto-start after installation):
```shell
ollama serve
```
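You can confirm the server is up before pulling any models. This sketch assumes a default install listening on Ollama's standard local port, 11434:

```shell
# Check whether the Ollama server is answering on its default port (11434).
# A running server responds to a plain GET on /.
if curl -fsS http://localhost:11434/ >/dev/null 2>&1; then
  echo "ollama server: up"
else
  echo "ollama server: down (start it with 'ollama serve')"
fi
```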
In a separate terminal, pull a code-capable model. Recommended options for 2026:
| Model | Size | RAM Needed | Best For |
|---|---|---|---|
| `deepseek-coder-v2` | ~16 GB | 16 GB+ | Strong code generation and refactoring |
| `codellama` | ~7 GB | 8 GB+ | Fast completions on lower-end hardware |
| `llama3.1` | ~8--70 GB | 16--64 GB+ | General coding + reasoning (larger variants) |
```shell
ollama pull deepseek-coder-v2
```
Verify Ollama is working:
```shell
ollama run deepseek-coder-v2 "Write a Python function to reverse a string"
```
You should see a generated response. If so, Ollama is ready.
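Ollama also exposes a local HTTP API on the same port, which is what editor integrations like Continue talk to. You can exercise it directly with `curl` (the model name should match the one you pulled):

```shell
# Call Ollama's generate endpoint directly -- the same local HTTP API
# that Continue uses under the hood. "stream": false returns one JSON object.
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder-v2",
  "prompt": "Write a Python function to reverse a string",
  "stream": false
}'
```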
Step 2 --- Install Continue
- Open VS Code.
- Go to Extensions (`Cmd+Shift+X` / `Ctrl+Shift+X`).
- Search for Continue and install the extension by Continue Dev.
- After installation, move the Continue icon to the right sidebar for a better layout.
- Open the Continue sidebar---you may be prompted to sign in to Mission Control or configure locally.
Step 3 --- Connect Continue to Ollama
Open Continue's configuration. You can access it by:
- Clicking the gear icon in the Continue sidebar, or
- Using the command palette: Continue: Open Config
Your config is stored at `~/.continue/config.yaml`. Add an Ollama model:
```yaml
models:
  - title: "DeepSeek Coder (Local)"
    provider: ollama
    model: deepseek-coder-v2
```
Save the file. Continue will automatically detect the change and start using your local model.
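Continue can also route inline autocomplete through the same local model. The exact schema varies between Continue versions, so treat the `roles` field below as a sketch and check the docs for your version:

```yaml
models:
  - title: "DeepSeek Coder (Local)"
    provider: ollama
    model: deepseek-coder-v2
    roles:
      - chat
      - autocomplete
```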
Step 4 --- Start Coding
Inline Completions
As you type, Continue suggests code completions from your local model. Press Tab to accept.
Chat
Open the Continue sidebar and type a message. Ask for explanations, refactors, or new code. The model runs entirely on your machine---no API calls are made.
Context-Aware Responses
Continue uses your open files and selected code as context. Select a block of code and ask a question to get relevant, targeted answers.
Optional: Add a Cloud Model for Hard Tasks
You can mix local and cloud models. Use Ollama for fast, free completions and add a cloud API key for tasks that need frontier-level reasoning:
```yaml
models:
  - title: "DeepSeek Coder (Local)"
    provider: ollama
    model: deepseek-coder-v2
  - title: "Claude Sonnet 4.6 (Cloud)"
    provider: anthropic
    model: claude-sonnet-4-6-20260217
    apiKey: "<your-anthropic-key>"
```
Switch between models using the model selector in the Continue chat panel.
Troubleshooting
No completions appearing?
- Confirm Ollama is running: `ollama serve`
- Confirm the model is pulled: `ollama list`
- Check that `config.yaml` points to the correct model name
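These checks can be combined into one diagnostic. This sketch assumes Ollama's default port and the model name used earlier; swap in whatever model your `config.yaml` references:

```shell
# Quick diagnostic: is the server up, and is the expected model pulled?
MODEL="deepseek-coder-v2"   # change to the model name in your config.yaml
if ! curl -fsS http://localhost:11434/api/tags >/dev/null 2>&1; then
  echo "Ollama server is not responding -- run 'ollama serve'"
elif ollama list | grep -q "$MODEL"; then
  echo "Server up and '$MODEL' is available"
else
  echo "Server up, but '$MODEL' is missing -- run 'ollama pull $MODEL'"
fi
```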
Slow responses?
- Smaller models such as `codellama` (7B-class) respond faster on limited hardware
- If you have limited RAM, stick to 7B-class models
- Close other memory-intensive applications while using AI completions
Ollama not found?
- Add Ollama to your PATH if needed
- Restart VS Code after installing Ollama
- On some systems, you may need to run `ollama serve` manually
Continue sidebar not showing?
- Check that the extension is installed and enabled
- Try reloading VS Code: `Ctrl+Shift+P` > "Developer: Reload Window"
Summary
| Component | Role |
|---|---|
| Ollama | Runs LLMs locally on your machine |
| Continue | Connects your IDE to Ollama (and optionally cloud APIs) |
| Together | Private, offline-friendly AI coding with no per-request costs |
This setup is ideal for privacy-sensitive work, air-gapped environments, or anyone who wants to avoid ongoing API costs.
Related
- Continue --- Full review in our directory
- Aider --- Terminal-based alternative that also supports Ollama
- Tabnine --- Enterprise option with on-premise deployment
- Free AI Coding Tools --- More no-cost options