# ai-agent

A fully local AI coding agent for the terminal -- powered by Ollama and small models, with intelligent routing, cross-session memory, and MCP tool integration.

```
╭──────────────────────────────────────────╮
│                 ai-agent                 │
│   100% local. Your data never leaves.    │
│                                          │
│           ASK -- PLAN -- BUILD           │
│           0.8B    4B      9B             │
╰──────────────────────────────────────────╯
```

---

## What is ai-agent?

- **100% local** -- runs entirely on your machine via Ollama. No API keys, no cloud, no data leaving your device.
- **Small model optimized** -- intelligent routing across Qwen 3.5 variants (0.8B / 2B / 4B / 9B) based on task complexity.
- **Three operational modes** -- ASK for quick answers, PLAN for design and reasoning, BUILD for full execution with tools.
- **MCP native** -- first-class Model Context Protocol support (STDIO, SSE, Streamable HTTP) for extensible tool integration.
- **Beautiful TUI** -- built with Charm's BubbleTea v2, Lip Gloss v2, and Glamour for rich markdown rendering in the terminal.
- **Infinite Context Engine (ICE)** -- cross-session vector retrieval that surfaces relevant past conversations automatically.
- **Auto-Memory Detection** -- the LLM extracts facts, decisions, preferences, and TODOs from conversations and persists them.
- **Thinking/CoT extraction** -- chain-of-thought reasoning is captured and displayed in collapsible blocks.
- **Skills system** -- load `.md` skill files with YAML frontmatter to inject domain-specific instructions into the system prompt.
- **Agent profiles** -- configure per-project agents with custom system prompts, skills, and MCP servers.
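
As a rough illustration of routing by task complexity, the sketch below picks a model tag from word count and keyword hits, the two signals this README attributes to the router. The keyword list, the thresholds, and the `pickModel` function are assumptions made for this example; the project's actual router may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// complexKeywords is a hypothetical keyword list, chosen only for illustration.
var complexKeywords = []string{"debug", "refactor", "architecture", "review"}

// pickModel returns a model tag for a query based on keyword hits and word count.
// The thresholds (15 and 40 words, 1 and 2 hits) are assumptions, not project values.
func pickModel(query string) string {
	words := strings.Fields(strings.ToLower(query))
	hits := 0
	for _, w := range words {
		for _, k := range complexKeywords {
			if strings.Contains(w, k) {
				hits++
			}
		}
	}
	switch {
	case hits >= 2 || len(words) > 40:
		return "qwen3.5:9b" // complex: multi-step reasoning
	case hits == 1 || len(words) > 15:
		return "qwen3.5:4b" // medium: refactoring, explanations
	default:
		return "qwen3.5:2b" // simple: quick answers
	}
}

func main() {
	fmt.Println(pickModel("what does this flag do?"))
	fmt.Println(pickModel("refactor this function"))
	fmt.Println(pickModel("debug the race in the registry and review the locking"))
}
```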

---

## Quick Start

### Prerequisites

- [Go 1.25+](https://go.dev/dl/)
- [Ollama](https://ollama.ai/) running locally
- [Task](https://taskfile.dev/) (optional, for build commands)

### Install

Pull the required model, then install:

```bash
ollama pull qwen3.5:2b
go install github.com/abdul-hamid-achik/ai-agent/cmd/ai-agent@latest
```

For the full model routing suite (optional):

```bash
ollama pull qwen3.5:0.8b
ollama pull qwen3.5:4b
ollama pull qwen3.5:9b
ollama pull nomic-embed-text  # for ICE vector embeddings
```

### Configure

Create a config file (optional -- defaults work out of the box):

```bash
mkdir -p ~/.config/ai-agent
cp config.example.yaml ~/.config/ai-agent/config.yaml
```

### Run

```bash
ai-agent
```

Or from source:

```bash
task dev
```

---

## Features

### Model Routing

ai-agent automatically selects the right model size for the task at hand. Simple questions go to the fast 2B model; complex multi-step reasoning escalates to the 9B model. The router analyzes query complexity using keyword heuristics and word count.

| Complexity | Model         | Speed  | Use Cases                                    |
|------------|---------------|--------|----------------------------------------------|
| Simple     | qwen3.5:2b    | 2.5x   | Quick answers, simple tool use, single edits |
| Medium     | qwen3.5:4b    | 1.5x   | Code completion, refactoring, explanations   |
| Complex    | qwen3.5:9b    | 1.0x   | Multi-step reasoning, debugging, code review |

The fallback chain ensures graceful degradation if a model is not available: `2b -> 4b -> 9b`.

### Three Modes: ASK / PLAN / BUILD

Cycle between modes with `shift+tab`. Each mode configures a different system prompt and preferred model tier.

- **ASK** -- Direct, concise answers. Routes to the fastest available model. Tools available for file reads and searches.
- **PLAN** -- Design and planning. Breaks tasks into steps. Reads and explores with tools but does not modify files.
- **BUILD** -- Full execution mode. Uses the most capable model.
  All tools enabled including writes and modifications.

### MCP Tool Integration

Connect any MCP-compatible tool server. Supports all three transport protocols:

- **STDIO** -- Launch tools as subprocesses (default).
- **SSE** -- Connect to Server-Sent Events endpoints.
- **Streamable HTTP** -- Connect to HTTP-based MCP servers.

Tool calls execute in parallel when possible. The registry handles graceful failure if a server becomes unavailable.

### Infinite Context Engine (ICE)

ICE embeds each conversation turn using `nomic-embed-text` and stores them persistently. On every new message, it retrieves the most relevant past conversations via cosine similarity and injects them into the system prompt -- giving the agent memory that spans across sessions.

### Auto-Memory Detection

After each conversation turn, a background process analyzes the exchange and extracts structured memories:

- **FACT** -- objective information the user shared
- **DECISION** -- choices made during the conversation
- **PREFERENCE** -- user preferences and working styles
- **TODO** -- action items and follow-ups

Memories are stored in `~/.config/ai-agent/memories.json` with tag-weighted search scoring (tags weighted 3x over content).

### Thinking/CoT Display

When the model produces chain-of-thought reasoning, ai-agent captures it and renders it in collapsible blocks. Toggle the display with `ctrl+t`.

### Skills System

Drop `.md` files with YAML frontmatter into the skills directory to inject domain-specific instructions:

```
~/.config/ai-agent/skills/
```

Manage active skills with `/skill list`, `/skill activate `, and `/skill deactivate `.

### Agent Profiles

Create per-project or per-domain agent profiles:

```
~/.agents//
  AGENT.md     # System prompt additions
  SKILL.md     # Agent-specific skills
  mcp.yaml     # Agent-specific MCP servers
```

Switch profiles with `/agent ` or `/agent list`.

---

## Configuration

### File Locations

Config is searched in order (first match wins):

1. `./ai-agent.yaml` (project-local)
2. `~/.config/ai-agent/config.yaml` (user-global)

### Annotated Example

```yaml
ollama:
  model: "qwen3.5:2b"                  # Default model
  base_url: "http://localhost:11434"   # Ollama API endpoint
  num_ctx: 262144                      # Context window size

# Skills directory (default: ~/.config/ai-agent/skills/)
# skills_dir: "/path/to/custom/skills"

# MCP tool servers
servers:
  # STDIO transport (default)
  - name: noted
    command: noted
    args: [mcp]

  # SSE transport
  # - name: remote-server
  #   transport: sse
  #   url: "http://localhost:8811"

  # Streamable HTTP transport
  # - name: streamable-server
  #   transport: streamable-http
  #   url: "http://localhost:8812/mcp"

# ICE configuration
# ice:
#   enabled: true
#   embed_model: "nomic-embed-text"
#   store_path: "~/.config/ai-agent/conversations.json"
```

### Environment Variables

| Variable                 | Description                  | Overrides            |
|--------------------------|------------------------------|----------------------|
| `OLLAMA_HOST`            | Ollama API base URL          | `ollama.base_url`    |
| `LOCAL_AGENT_MODEL`      | Default model name           | `ollama.model`       |
| `LOCAL_AGENT_AGENTS_DIR` | Path to agents directory     | `agents.dir`         |

---

## Keyboard Shortcuts

### Input

| Key             | Action                        |
|-----------------|-------------------------------|
| `enter`         | Send message                  |
| `shift+enter`   | Insert new line               |
| `up` / `down`   | Browse input history          |
| `shift+tab`     | Cycle mode (ASK/PLAN/BUILD)   |
| `ctrl+m`        | Quick model switch            |

### Navigation

| Key              | Action                       |
|------------------|------------------------------|
| `pgup` / `pgdown`| Scroll conversation          |
| `ctrl+u`         | Half-page scroll up          |
| `ctrl+d`         | Half-page scroll down        |

### Display

| Key             | Action                        |
|-----------------|-------------------------------|
| `?`             | Toggle help overlay           |
| `t`             | Expand/collapse tool calls    |
| `space`         | Toggle last tool details      |
| `ctrl+t`        | Toggle thinking/CoT display   |
| `ctrl+y`        | Copy last response            |

### Control

| Key             | Action                            |
|-----------------|-----------------------------------|
| `esc`           | Cancel streaming / close overlay  |
| `ctrl+c`        | Quit                              |
| `ctrl+l`        | Clear screen                      |
| `ctrl+n`        | New conversation                  |

---

## Slash Commands

| Command                                      | Description                       |
|----------------------------------------------|-----------------------------------|
| `/help`                                      | Show help overlay                 |
| `/clear`                                     | Clear conversation history        |
| `/new`                                       | Start a fresh conversation        |
| `/model [name\|list\|fast\|smart]`           | Show or switch models             |
| `/models`                                    | Open model picker                 |
| `/agent [name\|list]`                        | Show or switch agent profile      |
| `/load `                                     | Load markdown file as context     |
| `/unload`                                    | Remove loaded context             |
| `/skill [list\|activate\|deactivate] [name]` | Manage skills                     |
| `/servers`                                   | List connected MCP servers        |
| `/ice`                                       | Show ICE engine status            |
| `/sessions`                                  | Browse saved sessions             |
| `/exit`                                      | Quit                              |

---

## Architecture

```
cmd/ai-agent/       Entry point
internal/
  agent/            ReAct loop orchestration
  llm/              LLM abstraction (OllamaClient, ModelManager)
  mcp/              MCP server registry
  config/           YAML config, env overrides, Router
  ice/              Infinite Context Engine
  memory/           Persistent key-value store
  skill/            Skill file loader
  command/          Slash command registry
  tui/              BubbleTea v2 terminal UI
  logging/          Structured logging
```

### Request Flow

```
User Input
    |
    v
agent.AddUserMessage()
    |
    v
ICE embeds message, retrieves relevant past context
    |
    v
System prompt assembled (tools + skills + context + ICE + memory)
    |
    v
Router selects model based on task complexity
    |
    v
LLM streams response via ChatStream()
    |
    v
Tool calls routed through MCP registry (parallel execution)
    |
    v
ReAct loop continues (up to 10 iterations) until final text
    |
    v
Conversation compacted if token budget exceeded
Auto-memory detection runs in background
```

### Key Interfaces

- `llm.Client` -- pluggable LLM provider (`ChatStream`, `Ping`, `Embed`)
- `agent.Output` -- streaming callbacks for TUI rendering
- `command.Registry` -- extensible slash command dispatch

### Concurrency

`sync.RWMutex` protects shared state in `ModelManager`, `mcp.Registry`, and `memory.Store`.
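
The `sync.RWMutex` pattern mentioned above can be sketched as follows. `Store`, `NewStore`, `Get`, and `Set` are hypothetical names used only for illustration; they are not the project's actual types or API, which may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical stand-in for shared state such as a registry or memory store.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

// NewStore returns an empty Store ready for concurrent use.
func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

// Get takes the read lock, so many readers can proceed at the same time.
func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Set takes the write lock, which excludes readers and other writers.
func (s *Store) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

func main() {
	s := NewStore()
	var wg sync.WaitGroup
	// Four background goroutines write concurrently; the mutex serializes them.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Set(fmt.Sprintf("k%d", i), "v")
		}(i)
	}
	wg.Wait()
	_, ok := s.Get("k3")
	fmt.Println(ok)
}
```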
Auto-memory detection and MCP connections run as background goroutines. Tool calls execute in parallel when independent.

---

## Comparison

| Feature                            | ai-agent    | opencode   | crush |
|------------------------------------|:-----------:|:----------:|:-----:|
| 100% local (no API keys)           | Yes         | No         | Yes   |
| Model routing by task complexity   | Yes         | No         | No    |
| Operational modes (ASK/PLAN/BUILD) | Yes         | No         | No    |
| Cross-session memory (ICE)         | Yes         | No         | No    |
| Auto-memory detection              | Yes         | No         | No    |
| Thinking/CoT extraction            | Yes         | Yes        | No    |
| MCP tool support                   | Yes         | Yes        | Yes   |
| Skills system                      | Yes         | No         | No    |
| Plan form overlay                  | Yes         | No         | No    |
| Small model optimized              | Yes         | No         | No    |
| TUI chat interface                 | Yes         | Yes        | Yes   |
| Language                           | Go          | TypeScript | Go    |

---

## Building

This project uses [Task](https://taskfile.dev/) as its build tool.

```bash
task build    # Compile to bin/ai-agent
task run      # Build and run
task dev      # Quick run via go run ./cmd/ai-agent
task test     # Run all tests: go test ./...
task lint     # Run golangci-lint run ./...
task clean    # Remove bin/ directory
```

Run a single test:

```bash
go test ./internal/agent/ -run TestFunctionName
```

---

## License

MIT