# ai-agent
A fully local AI coding agent for the terminal -- powered by Ollama and small models, with intelligent routing, cross-session memory, and MCP tool integration.
```
╭──────────────────────────────────────────╮
│                 ai-agent                 │
│   100% local. Your data never leaves.    │
│                                          │
│         ASK  --  PLAN  --  BUILD         │
│         0.8B     4B      9B              │
╰──────────────────────────────────────────╯
```
---
## What is ai-agent?
- **100% local** -- runs entirely on your machine via Ollama. No API keys, no cloud, no data leaving your device.
- **Small model optimized** -- intelligent routing across Qwen 3.5 variants (0.8B / 2B / 4B / 9B) based on task complexity.
- **Three operational modes** -- ASK for quick answers, PLAN for design and reasoning, BUILD for full execution with tools.
- **MCP native** -- first-class Model Context Protocol support (STDIO, SSE, Streamable HTTP) for extensible tool integration.
- **Beautiful TUI** -- built with Charm's BubbleTea v2, Lip Gloss v2, and Glamour for rich markdown rendering in the terminal.
- **Infinite Context Engine (ICE)** -- cross-session vector retrieval that surfaces relevant past conversations automatically.
- **Auto-Memory Detection** -- the LLM extracts facts, decisions, preferences, and TODOs from conversations and persists them.
- **Thinking/CoT extraction** -- chain-of-thought reasoning is captured and displayed in collapsible blocks.
- **Skills system** -- load `.md` skill files with YAML frontmatter to inject domain-specific instructions into the system prompt.
- **Agent profiles** -- configure per-project agents with custom system prompts, skills, and MCP servers.
---
## Quick Start
### Prerequisites
- [Go 1.25+](https://go.dev/dl/)
- [Ollama](https://ollama.ai/) running locally
- [Task](https://taskfile.dev/) (optional, for build commands)
### Install
Pull the required model, then install:
```bash
ollama pull qwen3.5:2b
go install github.com/abdul-hamid-achik/ai-agent/cmd/ai-agent@latest
```
For the full model routing suite (optional):
```bash
ollama pull qwen3.5:0.8b
ollama pull qwen3.5:4b
ollama pull qwen3.5:9b
ollama pull nomic-embed-text # for ICE vector embeddings
```
### Configure
Create a config file (optional -- defaults work out of the box):
```bash
mkdir -p ~/.config/ai-agent
cp config.example.yaml ~/.config/ai-agent/config.yaml
```
### Run
```bash
ai-agent
```
Or from source:
```bash
task dev
```
---
## Features
### Model Routing
ai-agent automatically selects the right model size for the task at hand. Simple questions go to the fast 2B model; complex multi-step reasoning escalates to the 9B model. The router analyzes query complexity using keyword heuristics and word count.
| Complexity | Model      | Relative Speed | Use Cases                                    |
|------------|------------|----------------|----------------------------------------------|
| Simple     | qwen3.5:2b | 2.5x           | Quick answers, simple tool use, single edits |
| Medium     | qwen3.5:4b | 1.5x           | Code completion, refactoring, explanations   |
| Complex    | qwen3.5:9b | 1.0x           | Multi-step reasoning, debugging, code review |
The fallback chain ensures graceful degradation if a model is not available: `2b -> 4b -> 9b`.
### Three Modes: ASK / PLAN / BUILD
Cycle between modes with `shift+tab`. Each mode configures a different system prompt and preferred model tier.
- **ASK** -- Direct, concise answers. Routes to the fastest available model. Tools available for file reads and searches.
- **PLAN** -- Design and planning. Breaks tasks into steps. Reads and explores with tools but does not modify files.
- **BUILD** -- Full execution mode. Uses the most capable model. All tools enabled including writes and modifications.
### MCP Tool Integration
Connect any MCP-compatible tool server. Supports all three transport protocols:
- **STDIO** -- Launch tools as subprocesses (default).
- **SSE** -- Connect to Server-Sent Events endpoints.
- **Streamable HTTP** -- Connect to HTTP-based MCP servers.
Tool calls execute in parallel when possible. The registry handles graceful failure if a server becomes unavailable.
### Infinite Context Engine (ICE)
ICE embeds each conversation turn using `nomic-embed-text` and stores them persistently. On every new message, it retrieves the most relevant past conversations via cosine similarity and injects them into the system prompt -- giving the agent memory that spans across sessions.
### Auto-Memory Detection
After each conversation turn, a background process analyzes the exchange and extracts structured memories:
- **FACT** -- objective information the user shared
- **DECISION** -- choices made during the conversation
- **PREFERENCE** -- user preferences and working styles
- **TODO** -- action items and follow-ups
Memories are stored in `~/.config/ai-agent/memories.json` with tag-weighted search scoring (tags weighted 3x over content).
### Thinking/CoT Display
When the model produces chain-of-thought reasoning, ai-agent captures it and renders it in collapsible blocks. Toggle the display with `ctrl+t`.
### Skills System
Drop `.md` files with YAML frontmatter into the skills directory to inject domain-specific instructions:
```
~/.config/ai-agent/skills/
```
Manage active skills with `/skill list`, `/skill activate <name>`, and `/skill deactivate <name>`.
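A skill file might look like this (the frontmatter fields shown are illustrative -- check the source for the exact schema):

```markdown
---
name: go-reviewer
description: Conventions for reviewing Go code
---

When reviewing Go code in this project, prefer table-driven tests,
check error wrapping with %w, and flag exported identifiers that
lack doc comments.
```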
### Agent Profiles
Create per-project or per-domain agent profiles:
```
~/.agents/<name>/
  AGENT.md    # System prompt additions
  SKILL.md    # Agent-specific skills
  mcp.yaml    # Agent-specific MCP servers
```
Switch profiles with `/agent <name>` or `/agent list`.
---
## Configuration
### File Locations
Config is searched in order (first match wins):
1. `./ai-agent.yaml` (project-local)
2. `~/.config/ai-agent/config.yaml` (user-global)
### Annotated Example
```yaml
ollama:
  model: "qwen3.5:2b"                 # Default model
  base_url: "http://localhost:11434"  # Ollama API endpoint
  num_ctx: 262144                     # Context window size

# Skills directory (default: ~/.config/ai-agent/skills/)
# skills_dir: "/path/to/custom/skills"

# MCP tool servers
servers:
  # STDIO transport (default)
  - name: noted
    command: noted
    args: [mcp]
  # SSE transport
  # - name: remote-server
  #   transport: sse
  #   url: "http://localhost:8811"
  # Streamable HTTP transport
  # - name: streamable-server
  #   transport: streamable-http
  #   url: "http://localhost:8812/mcp"

# ICE configuration
# ice:
#   enabled: true
#   embed_model: "nomic-embed-text"
#   store_path: "~/.config/ai-agent/conversations.json"
```
### Environment Variables
| Variable | Description | Overrides |
|--------------------------|------------------------------|----------------------|
| `OLLAMA_HOST` | Ollama API base URL | `ollama.base_url` |
| `LOCAL_AGENT_MODEL` | Default model name | `ollama.model` |
| `LOCAL_AGENT_AGENTS_DIR` | Path to agents directory | `agents.dir` |
---
## Keyboard Shortcuts
### Input
| Key | Action |
|-----------------|-------------------------------|
| `enter` | Send message |
| `shift+enter` | Insert new line |
| `up` / `down` | Browse input history |
| `shift+tab` | Cycle mode (ASK/PLAN/BUILD) |
| `ctrl+m` | Quick model switch |
### Navigation
| Key | Action |
|------------------|------------------------------|
| `pgup` / `pgdown`| Scroll conversation |
| `ctrl+u` | Half-page scroll up |
| `ctrl+d` | Half-page scroll down |
### Display
| Key | Action |
|-----------------|-------------------------------|
| `?` | Toggle help overlay |
| `t` | Expand/collapse tool calls |
| `space` | Toggle last tool details |
| `ctrl+t` | Toggle thinking/CoT display |
| `ctrl+y` | Copy last response |
### Control
| Key | Action |
|-----------------|-------------------------------|
| `esc` | Cancel streaming / close overlay |
| `ctrl+c` | Quit |
| `ctrl+l` | Clear screen |
| `ctrl+n` | New conversation |
---
## Slash Commands
| Command | Description |
|--------------------------------------|-----------------------------------|
| `/help` | Show help overlay |
| `/clear` | Clear conversation history |
| `/new` | Start a fresh conversation |
| `/model [name\|list\|fast\|smart]` | Show or switch models |
| `/models` | Open model picker |
| `/agent [name\|list]` | Show or switch agent profile |
| `/load <path>` | Load markdown file as context |
| `/unload` | Remove loaded context |
| `/skill [list\|activate\|deactivate] [name]` | Manage skills |
| `/servers` | List connected MCP servers |
| `/ice` | Show ICE engine status |
| `/sessions` | Browse saved sessions |
| `/exit` | Quit |
---
## Architecture
```
cmd/ai-agent/   Entry point
internal/
  agent/        ReAct loop orchestration
  llm/          LLM abstraction (OllamaClient, ModelManager)
  mcp/          MCP server registry
  config/       YAML config, env overrides, Router
  ice/          Infinite Context Engine
  memory/       Persistent key-value store
  skill/        Skill file loader
  command/      Slash command registry
  tui/          BubbleTea v2 terminal UI
  logging/      Structured logging
```
### Request Flow
```
User Input
|
v
agent.AddUserMessage()
|
v
ICE embeds message, retrieves relevant past context
|
v
System prompt assembled (tools + skills + context + ICE + memory)
|
v
Router selects model based on task complexity
|
v
LLM streams response via ChatStream()
|
v
Tool calls routed through MCP registry (parallel execution)
|
v
ReAct loop continues (up to 10 iterations) until final text
|
v
Conversation compacted if token budget exceeded
Auto-memory detection runs in background
```
### Key Interfaces
- `llm.Client` -- pluggable LLM provider (`ChatStream`, `Ping`, `Embed`)
- `agent.Output` -- streaming callbacks for TUI rendering
- `command.Registry` -- extensible slash command dispatch
### Concurrency
`sync.RWMutex` protects shared state in `ModelManager`, `mcp.Registry`, and `memory.Store`. Auto-memory detection and MCP connections run as background goroutines. Tool calls execute in parallel when independent.
---
## Comparison
| Feature | ai-agent | opencode | crush |
|----------------------------------|:-----------:|:--------:|:-----:|
| 100% local (no API keys) | Yes | No | Yes |
| Model routing by task complexity | Yes | No | No |
| Operational modes (ASK/PLAN/BUILD) | Yes | No | No |
| Cross-session memory (ICE) | Yes | No | No |
| Auto-memory detection | Yes | No | No |
| Thinking/CoT extraction | Yes | Yes | No |
| MCP tool support | Yes | Yes | Yes |
| Skills system | Yes | No | No |
| Plan form overlay | Yes | No | No |
| Small model optimized | Yes | No | No |
| TUI chat interface | Yes | Yes | Yes |
| Language | Go | TypeScript | Go |
---
## Building
This project uses [Task](https://taskfile.dev/) as its build tool.
```bash
task build   # Compile to bin/ai-agent
task run     # Build and run
task dev     # Quick run via go run ./cmd/ai-agent
task test    # Run all tests (go test ./...)
task lint    # Lint (golangci-lint run ./...)
task clean   # Remove bin/ directory
```
Run a single test:
```bash
go test ./internal/agent/ -run TestFunctionName
```
---
## License
MIT