Brain OS 🧠
Stop giving your AI amnesia.
Brain OS is a biologically-inspired, central cognitive engine written in pure Rust. Instead of every script, coding assistant, and chat UI keeping its own isolated, fragmented context, Brain OS acts as your single source of truth.
It routes intents through a Thalamus, scores importance via an Amygdala, and stores everything in a unified Hippocampus (FTS5 + HNSW Vector Search). Whether you connect via HTTP, WebSocket, gRPC, or MCP, your AI tools now share one localized, ever-growing memory that runs 24/7 on your machine.
Your data never leaves your hardware. Your AI never forgets.
How It Works
Every input, regardless of protocol, flows through the same pipeline:
Input → Intent Classification → Importance Scoring → Memory Store/Recall → LLM Response
The memory engine combines vector search (HNSW) with full-text search (BM25 FTS5), fuses results via Reciprocal Rank Fusion, and reranks by importance and recency. A forgetting curve runs every 24 hours to prune low-value memories and promote reinforced episodes to permanent semantic facts.
Beyond memory: the kernel it grew into
Memory is the hook, but the same daemon also mediates what your AI tools can do. Every capability it exposes (search the web, run a sandboxed command, send a notification, probe a host, audit its own config) is a typed entry in one capability manifest, each tagged with a safety tier and routed through the same consent, audit, and budget gates. Whether a request comes from your terminal, an MCP client, or Brain's own resident reasoner, it sees the same manifest and is held to the same rules.
Design Principles
| Principle | Description |
|---|---|
| Local-first | Runs on your machine. No cloud, no telemetry, no account. |
| Protocol-agnostic | HTTP, WebSocket, gRPC, MCP: one memory behind every surface. |
| Memory that earns its place | Importance scoring + forgetting curve keep the signal sharp. |
| Open to any LLM | Ollama, OpenAI, OpenRouter, or any OpenAI-compatible endpoint. |
| Fail safe, never silently | Degraded-but-functional is the target state. |
Installation
Requirements
- Ollama (or any OpenAI-compatible API)
- Rust 1.91+ (only for building from source)
- Docker (optional, for the SearXNG web search backend)
From crates.io (recommended)
cargo install brainos # requires Rust 1.91+
brain init # creates ~/.brain/ with config, database, vector index
ollama pull qwen2.5-coder:7b
ollama pull nomic-embed-text
brain deps up # optional: upgrade web search from DuckDuckGo (default) to SearXNG
From source
git clone https://github.com/keshavashiya/brain.git && cd brain
cargo install --path crates/cli
brain init
One-liner (pre-built binary)
curl -fsSL https://raw.githubusercontent.com/keshavashiya/brain/main/scripts/install.sh | sh
This downloads a pre-built binary when available, falling back to cargo install from source.
External services & auto-start
Docker (optional web search):
brain deps up # Start SearXNG
brain deps status # Check if running
brain deps down # Stop
Auto-start on login:
brain service install # launchd (macOS) / systemd (Linux) / Task Scheduler (Windows)
brain service uninstall # Remove
Verify your install
brain doctor # verify Ollama, models, ports; fix anything red
brain start # wake the daemon
brain status # check daemon health
Quick Start
Recommended setup order
# 1. Initialize (one-time)
brain init
# 2. Quick test: direct daemon
brain start
brain status
brain stop
# 3. Production: auto-start on login
brain service install # registers launchd/systemd/Task Scheduler
# Brain now wakes automatically on every login
Lifecycle commands
brain start # Start daemon
brain stop # Stop daemon
brain status # Check daemon status
brain tail # Stream BrainEvent bus (observability tap for headless/SSH)
Interactive usage
brain chat # Interactive chat
brain chat "remember that I use bun" # One-shot message
Foreground mode (development)
brain serve # All adapters (foreground)
brain serve --http # HTTP only
brain serve --http --ws # HTTP + WebSocket
brain serve --grpc # gRPC only
brain serve --mcp # MCP HTTP only
Checking memory
# Search memory
brain chat "what do I know about Rust?"
# Store a fact
brain chat "remember that my favorite editor is Neovim"
# List grants
brain chat "show me my grants"
Architecture Overview
Brain OS is built as a collection of specialized crates, each modelling a biological brain structure. The system is organized around a single SignalProcessor that all adapters share.
Crate Map
brain/
├── crates/
│   ├── core/ # BrainConfig + shared config types
│   ├── signal/ # SignalProcessor, the single shared engine
│   ├── thalamus/ # Intent classification (regex + LLM fallback)
│   ├── amygdala/ # Importance scoring
│   ├── hippocampus/ # Memory engine (episodic + semantic + search)
│   ├── cortex/ # Reasoning core (LLM, context, action dispatch)
│   ├── cerebellum/ # Procedural memory (trigger → action patterns)
│   ├── ganglia/ # Proactivity / habit engine
│   ├── audit/ # Append-only audit trail
│   ├── confirm/ # Confirmation engine (nonce-based)
│   ├── budget/ # Cost/token budget enforcement
│   ├── sandbox/ # Command execution sandbox
│   ├── vault/ # Credential vault
│   ├── orchestrate/ # Task decomposition + execution
│   ├── delegate/ # External agent delegation
│   ├── channel/ # Channel routing + presets
│   ├── observe/ # Observability bus + BrainEvent
│   ├── identity/ # Principal, tier, authorization
│   ├── intent/ # Intent Token + capability routing
│   ├── mcphost/ # MCP host for external servers
│   ├── reflex/ # Reactive signal sources
│   ├── resilience/ # Circuit breaker, retry, rate limit
│   ├── storage/ # SQLite pool + migrations
│   ├── backends/ # World-touching backends
│   ├── selfmodel/ # Self-model (host, capability, connectivity)
│   ├── metrics/ # Performance metrics
│   ├── bridge/ # Bridge library for external relays
│   ├── adapters/ # Transport adapters (HTTP, WS, gRPC, MCP, Terminal)
│   └── cli/ # CLI binary (thin wrapper over backends)
├── docs/ # Documentation (mdBook)
├── scripts/ # Build + release scripts
└── docker/ # Docker compose for SearXNG
Design Principle: One Capability, Many Faces
A capability is a typed entry (id, safety tier, preconditions) in a single registry. All transports (CLI, HTTP, WS, gRPC, MCP) and the resident reasoner are faces over that one registry. They hold no private capabilities and no business logic.
Signal Pipeline
Every input to Brain flows through a single pipeline:
Input → Intent Classification → Authorization → Importance Scoring
→ Memory Store/Recall → LLM Response → Output
Processing stages
- Signal Ingestion: signals arrive via any adapter (HTTP, WS, gRPC, MCP, CLI) as a typed Signal carrying content, namespace, principal, and metadata.
- Intent Classification: the Thalamus classifies each signal into one of 31 intent variants using a regex fast-path with an async LLM fallback and timeout.
- Authorization: the IdentityStore enforces tier-based authorization on every signal. The pipeline gate runs after classification, checking the principal's rights against the required tier.
- Importance Scoring: the Amygdala scores memories on a [0, 1] scale using keyword heuristics plus per-process novelty detection. No LLM cost.
- Memory: the Hippocampus handles storage and recall, combining BM25 FTS5 full-text search with HNSW vector search, fused via Reciprocal Rank Fusion.
- LLM Response: the Cortex builds a token-budgeted prompt, invokes the configured LLM provider (with failover chain), and streams the response.
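The fast-path-then-fallback shape of the classification stage can be sketched in a few lines. This is a simplified illustration, not the Thalamus implementation: plain prefix matching stands in for the regex rules, the fallback is a synchronous closure rather than an async LLM call with a timeout, only three of the 31 intents are modelled, and defaulting to Chat when the fallback gives up is an assumption.

```rust
/// A tiny subset of intents, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Intent {
    StoreFact,
    Recall,
    Chat,
}

/// Fast path: cheap pattern checks that need no model call.
/// Plain prefix matching stands in for regex rules here (assumption).
pub fn fast_path(content: &str) -> Option<Intent> {
    let text = content.trim().to_lowercase();
    if text.starts_with("remember ") {
        Some(Intent::StoreFact)
    } else if text.starts_with("what do i know") {
        Some(Intent::Recall)
    } else {
        None
    }
}

/// Try the fast path first and call the slower fallback only when nothing matches.
/// If the fallback returns `None` (for example after a timeout), default to `Chat`.
pub fn classify<F>(content: &str, fallback: F) -> Intent
where
    F: FnOnce(&str) -> Option<Intent>,
{
    fast_path(content)
        .or_else(|| fallback(content))
        .unwrap_or(Intent::Chat)
}
```

The point of the split is cost: inputs such as "remember that I use bun" never reach the model, and only ambiguous inputs pay for the fallback.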
The Capability Loop
For autonomous actions (tool calls), Brain runs a consent-gated tool loop:
LLM Response → Tool Call Request → Authorization → Confirmation
→ Execution → Audit → Result → LLM (next turn)
Each tool is a registered capability with a safety tier. Destructive/external actions require user confirmation via nonce-based approval flow.
Memory Model
Brain stores three kinds of memory, mirroring human memory structure:
| Type | What it stores | Storage |
|---|---|---|
| Episodic | Timestamped conversation history | SQLite + FTS5 |
| Semantic | Subject-predicate-object facts | SQLite + HNSW vector |
| Procedural | Trigger → action patterns | SQLite |
Retrieval
Memory retrieval is hybrid: it combines vector similarity (HNSW) with keyword matching (BM25 FTS5) and fuses the two rankings via Reciprocal Rank Fusion (RRF). Results are reranked by importance and recency before being used as LLM context.
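As a rough illustration of the fusion step, the sketch below implements plain Reciprocal Rank Fusion over ranked lists of memory ids. The function name rrf_fuse, the string ids, and the constant k = 60 (a commonly used default) are assumptions made for this example and are not taken from Brain's source. The importance and recency rerank is left out.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of memory ids with Reciprocal Rank Fusion.
/// Each list is ordered best-first; `k` damps the influence of the top ranks.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the best hit contributes 1 / (k + 1).
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

For example, fusing a vector ranking of m3, m1, m2 with a keyword ranking of m1, m4 at k = 60.0 puts m1 first, because it is the only id that appears in both lists.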
Forgetting Curve
retention = importance × e^(-decay_rate × hours_since_last_access)
High-importance, frequently-accessed memories persist indefinitely. Low-importance, stale memories decay and are pruned during nightly consolidation.
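The formula translates directly into code. In the sketch below, the function names and the pruning threshold are hypothetical, chosen only to show how a consolidation pass might use the retention value.

```rust
/// retention = importance × e^(-decay_rate × hours_since_last_access)
pub fn retention(importance: f64, decay_rate: f64, hours_since_last_access: f64) -> f64 {
    importance * (-decay_rate * hours_since_last_access).exp()
}

/// A memory becomes a pruning candidate once its retention drops below `threshold`.
/// The threshold is a hypothetical parameter for this sketch.
pub fn should_prune(importance: f64, decay_rate: f64, hours: f64, threshold: f64) -> bool {
    retention(importance, decay_rate, hours) < threshold
}
```

With decay_rate = 0.01, a memory of importance 0.2 left untouched for 240 hours retains about 0.018, while a memory of importance 0.9 accessed a day ago retains about 0.71. Each access resets hours_since_last_access to zero, which is why frequently used memories stay above any reasonable threshold.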
Namespaces
Memory can be scoped to a namespace (default: "personal"). Namespaces allow project-specific facts, clean separation of domains (work, personal, codebase), and residency policies (local_only vs any).
Consolidation
A background loop runs every 24 hours to:
- Prune low-retention memories using the forgetting curve
- Promote reinforced episodes into semantic facts (with idempotency guards)
- Apply data-residency enforcement
Memory Trust
Every memory stores its source agent. Brain supports provenance-weighted recall: per-agent trust weights determine how much agent-written memories influence context assembly. Unattested agent writes land quarantined until reviewed.
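One way to picture provenance-weighted recall is as a multiplier on each memory's retrieval score. The sketch below is illustrative only: the Memory struct, its field names, and the default trust weight for unknown agents are assumptions, not Brain's actual types.

```rust
use std::collections::HashMap;

pub struct Memory {
    pub content: String,
    pub source_agent: String,
    /// Score from retrieval, assumed to lie in [0, 1].
    pub relevance: f64,
    /// True for unattested writes that are still awaiting review.
    pub quarantined: bool,
}

/// Scale each memory's relevance by its writer's trust weight, skip quarantined
/// entries, and return references ordered by weighted score (best first).
pub fn weighted_recall<'a>(
    memories: &'a [Memory],
    trust: &HashMap<String, f64>,
    default_trust: f64,
) -> Vec<(&'a Memory, f64)> {
    let mut scored: Vec<(&Memory, f64)> = memories
        .iter()
        .filter(|m| !m.quarantined)
        .map(|m| {
            let weight = trust.get(&m.source_agent).copied().unwrap_or(default_trust);
            (m, m.relevance * weight)
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Under this scheme a highly relevant memory written by a low-trust agent can rank below a moderately relevant one written by the user, and quarantined writes never reach context assembly at all.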
Capability System
Brain's capability system is the unified substrate for everything the system can do.
Capability Registry
All capabilities are registered in a single ToolRegistry. Three producers feed into it:
- Native backends: built-in capabilities declared by each backend module
- MCP mounts: external MCP servers registered at runtime
- Skill packs: (future) declarative capability bundles
Every capability has:
- ID: unique identifier
- Safety tier: Read / Write / Execute / Destructive / External
- Preconditions: what must be true for it to work
- When-to-use: guidance for the LLM
- Embedding: semantic descriptor for hybrid retrieval
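The five fields above map naturally onto a struct, and the single-registry idea onto one collection that every producer writes into. The following is a minimal sketch under assumptions: the type and field names are invented for illustration (the real type is ToolRegistry, whose API is not shown here), and rejecting duplicate ids is inferred from "unique identifier".

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SafetyTier {
    Read,
    Write,
    Execute,
    Destructive,
    External,
}

/// One typed capability entry (field names are hypothetical).
pub struct Capability {
    pub id: String,
    pub tier: SafetyTier,
    pub preconditions: Vec<String>,
    pub when_to_use: String,
    pub embedding: Vec<f32>,
}

/// A single registry shared by native backends, MCP mounts, and skill packs.
#[derive(Default)]
pub struct Registry {
    entries: Vec<Capability>,
}

impl Registry {
    /// Register a capability; ids must be unique across all producers.
    pub fn register(&mut self, cap: Capability) -> Result<(), String> {
        if self.entries.iter().any(|c| c.id == cap.id) {
            return Err(format!("duplicate capability id: {}", cap.id));
        }
        self.entries.push(cap);
        Ok(())
    }

    /// Look up a capability by id.
    pub fn get(&self, id: &str) -> Option<&Capability> {
        self.entries.iter().find(|c| c.id == id)
    }
}
```

Because every face reads from the same registry, a capability registered once (for example memory.store with tier Write) is visible to the CLI, HTTP, and MCP alike, with the same tier attached.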
Capability Discovery
The IntentRouter and CapabilityIndex route intents to registered capabilities using hybrid (cosine + keyword) scoring. A learned CapabilityFitnessStore tracks per-tool success/failure and provides a bounded tiebreaker in ranking.
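A hybrid score of this kind can be sketched as a weighted blend of cosine similarity and keyword overlap, plus a clamped fitness bonus. The blend weight alpha, the bonus cap, and the word-overlap measure are all assumptions for illustration; the source does not specify how Brain combines the signals.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Fraction of query words that appear in the capability description.
pub fn keyword_overlap(query: &str, description: &str) -> f32 {
    let desc = description.to_lowercase();
    let words: Vec<String> = query.split_whitespace().map(|w| w.to_lowercase()).collect();
    if words.is_empty() {
        return 0.0;
    }
    let hits = words.iter().filter(|w| desc.contains(w.as_str())).count();
    hits as f32 / words.len() as f32
}

/// Blend the two signals, then add a fitness bonus clamped to ±`max_bonus`
/// (which must be non-negative) so learned success rates can break ties
/// without dominating the ranking.
pub fn hybrid_score(cos: f32, kw: f32, fitness: f32, alpha: f32, max_bonus: f32) -> f32 {
    alpha * cos + (1.0 - alpha) * kw + fitness.clamp(-max_bonus, max_bonus)
}
```

The clamp is what makes the tiebreaker "bounded": however large the recorded success rate, it can shift a score by at most max_bonus.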
Runtime Health
Capabilities carry runtime health state:
- Verified: working normally
- Degraded: dependency unavailable (e.g., embedder down for memory.store)
- BreakerOpen: circuit breaker tripped
- PreconditionFailed: missing prerequisite
The capability digest renders all health state in chat, and tools/list annotates per-tool status.
Intent Taxonomy
Brain exposes 31 intent variants covering all user-facing actions. Intents are classified by the Thalamus using a regex fast-path with LLM fallback.
Intent Categories
| Category | Intents |
|---|---|
| Inspection | Recall, MemorySummary, SystemStatus, ProactivityStatus, BudgetStatus, List, TaskStatus, QueryAgents, QueryAudit, ChannelPreferences |
| Memory | StoreFact, Forget |
| Action | ExecuteCommand, WebSearch, SendMessage, DelegateTask |
| Lifecycle | Schedule, DecomposeTask, Cancel, OpenTerminalSession, CloseTerminalSession, MountMcpServer, UnmountMcpServer, ReconsentMcpServer |
| Governance | RespondToApproval, ApproveMemoryWriter, PruneAudit, SetChannelPreference, SetProactivity |
| Capability | ToolCall |
| Conversation | Chat |
Standardized Intent Tokens (SIT)
Every intent can be expressed as a typed IntentToken: a JSON object with a verb and optional object/modifiers. This enables programmatic intent dispatch alongside natural language classification.
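To make the token shape concrete, here is a small sketch that models a token and renders it as JSON. Only the presence of a verb and optional object/modifiers comes from the text above. The exact key names, the verb strings, and the choice to encode modifiers as a string-to-string object are assumptions. A real implementation would more likely use a serialization library; the hand-written encoder just keeps the example dependency-free.

```rust
/// Escape the characters that must be escaped inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical token shape: a verb plus optional object and modifiers.
pub struct IntentToken {
    pub verb: String,
    pub object: Option<String>,
    pub modifiers: Vec<(String, String)>,
}

impl IntentToken {
    /// Render the token as a compact JSON object, omitting empty optional parts.
    pub fn to_json(&self) -> String {
        let mut json = format!("{{\"verb\":\"{}\"", json_escape(&self.verb));
        if let Some(obj) = &self.object {
            json.push_str(&format!(",\"object\":\"{}\"", json_escape(obj)));
        }
        if !self.modifiers.is_empty() {
            let pairs: Vec<String> = self
                .modifiers
                .iter()
                .map(|(k, v)| format!("\"{}\":\"{}\"", json_escape(k), json_escape(v)))
                .collect();
            json.push_str(&format!(",\"modifiers\":{{{}}}", pairs.join(",")));
        }
        json.push('}');
        json
    }
}
```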
Verb Vocabulary
The verb vocabulary is a compile-time constant set, cross-checked by tests that ensure every Intent variant resolves through the registry and matches its declared tier hint. Adding a verb requires code changes to the intent enum, auth mapping, handler, and tier hint, so a TOML registry would add friction without flexibility.
HTTP API
Brain exposes a REST API on port 19789 (default) for all operations.
Authentication
All endpoints (except /health) require an API key via the Authorization: Bearer <key> header. The key is generated at brain init and stored in ~/.brain/config.yaml.
Endpoints
Health
GET /health
Returns 200 OK when the daemon is running.
Signals
POST /v1/signals
Content-Type: application/json
Authorization: Bearer <api_key>
{
"content": "Remember I like Rust",
"namespace": "personal",
"source": "curl",
"channel": "http",
"sender": "tester"
}
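For clients that cannot shell out to curl, the same request can be assembled with the Rust standard library alone. The sketch below builds the raw HTTP/1.1 text for POST /v1/signals and sends it over a plain TCP socket. The helper names are hypothetical, and the code assumes the daemon is listening on the default 127.0.0.1:19789 without TLS. Most real clients would use an HTTP library instead.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Build the raw HTTP/1.1 request text for POST /v1/signals.
pub fn signal_request(host: &str, port: u16, api_key: &str, json_body: &str) -> String {
    format!(
        "POST /v1/signals HTTP/1.1\r\n\
         Host: {host}:{port}\r\n\
         Authorization: Bearer {api_key}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {len}\r\n\
         Connection: close\r\n\
         \r\n\
         {json_body}",
        len = json_body.len()
    )
}

/// Send a prepared request and return the raw response (status line, headers, body).
pub fn send(host: &str, port: u16, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((host, port))?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    // `Connection: close` lets us read until the server closes the socket.
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

A caller would pass the JSON body shown above to signal_request together with the API key from ~/.brain/config.yaml, then hand the result to send.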
Memory
POST /v1/memory/search
Authorization: Bearer <api_key>
{
"query": "Rust",
"namespace": "personal",
"top_k": 5
}
POST /v1/memory/store
Authorization: Bearer <api_key>
{
"content": "User prefers dark mode",
"namespace": "personal",
"kind": "fact"
}
Webhooks
POST /v1/webhooks/:id
UI
GET /ui # Live dashboard
GET /ui/approvals
GET /ui/memory
GET /ui/audit
Metrics
GET /metrics # Prometheus-formatted metrics
Adapter ports
| Adapter | Port | Host |
|---|---|---|
| HTTP | 19789 | 127.0.0.1 |
| WebSocket | 19790 | 127.0.0.1 |
| MCP HTTP | 19791 | 127.0.0.1 |
| gRPC | 19792 | 127.0.0.1 |
| Terminal | 19793 | 127.0.0.1 |
WebSocket API
Brain exposes a WebSocket API on port 19790 for streaming interactions.
Connection
const ws = new WebSocket('ws://localhost:19790');
Message format
Messages are JSON with the following structure:
{
"content": "remember my favorite editor is Neovim",
"namespace": "personal",
"source": "web-client"
}
The server streams responses as JSON-encoded SignalResponse messages.
gRPC API
Brain exposes a gRPC API on port 19792 for the memory service.
The gRPC adapter provides the same memory operations available via HTTP, with the efficiency of binary protobuf serialization. This is the recommended transport for programmatic clients doing high-volume memory operations.
MCP Integration
Any MCP-compatible client can connect to Brain as a stdio MCP server.
Configuration
Add to your MCP client config:
{
"mcpServers": {
"brain": {
"command": "brain",
"args": ["mcp"]
}
}
}
Tools exposed via MCP
| Tool | Arguments | Description |
|---|---|---|
| memory_search | query, top_k?, namespace? | Semantic search of Brain memory (facts & episodes) |
| memory_store | subject, predicate, object | Store a structured semantic fact |
| memory_facts | subject | Retrieve all stored facts about a subject |
| memory_episodes | limit?, namespace? | Retrieve recent conversation episodes |
| user_profile | none | Retrieve user profile & Brain OS configuration |
| memory_procedures | action, trigger?, steps? | Manage learned procedures (list/store/delete) |
| brain_capabilities | none | List Brain's live capability manifest |
MCP Host
Brain can also mount external MCP servers as capabilities. Configure servers in ~/.brain/config.yaml:
mcphost:
servers:
- name: "filesystem"
transport: "stdio"
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem"]
Mounted servers' tools are registered in the capability manifest alongside native tools, available to every face (CLI, chat, HTTP, etc.).
Configuration
Brain's configuration lives in ~/.brain/config.yaml. Config precedence (highest wins):
- Env vars prefixed BRAIN_ with __ as the nesting separator (e.g. BRAIN_LLM__API_KEY=…)
- ~/.brain/config.yaml
- Embedded defaults (crates/core/default.yaml)
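The precedence rules can be illustrated with two small helpers. Both function names are hypothetical, and lowercasing the path segments is an assumption about how env var names map onto YAML keys. Brain's actual config loader is not shown in this document.

```rust
/// Map an env var such as `BRAIN_LLM__API_KEY` to the config path `["llm", "api_key"]`.
/// Returns `None` for variables without the `BRAIN_` prefix.
pub fn env_to_path(var: &str) -> Option<Vec<String>> {
    let rest = var.strip_prefix("BRAIN_")?;
    if rest.is_empty() {
        return None;
    }
    // `__` separates nesting levels; single underscores stay inside a key.
    Some(rest.split("__").map(|seg| seg.to_lowercase()).collect())
}

/// Resolve one setting using the documented precedence:
/// env var first, then the user config file, then the embedded default.
pub fn resolve<'a>(env: Option<&'a str>, file: Option<&'a str>, default: &'a str) -> &'a str {
    env.or(file).unwrap_or(default)
}
```

Under these assumptions, BRAIN_LLM__API_KEY addresses the api_key field under llm, and a value set there overrides whatever ~/.brain/config.yaml or the embedded defaults provide.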
LLM Configuration
Single provider:
llm:
provider: "ollama"
base_url: "http://localhost:11434"
model: "qwen2.5-coder:7b"
Multi-provider pool (actual default):
llm:
temperature: 0.7
max_tokens: 4096
context_window: 8192
providers:
- name: ollama
kind: ollama
base_url: "http://localhost:11434"
model: "qwen2.5-coder:7b"
preferred_models: ["qwen2.5-coder:7b", "llama3.1:8b"]
Model tiers (per-task routing):
llm:
tiers:
fast: ["local"] # classification, importance, compaction
deep: ["cloud", "local"] # chat, decompose, tool loop
Unset tiers alias the default chain. Unknown names fail closed at startup.
Adapter Configuration
adapters:
http: { enabled: true, host: "127.0.0.1", port: 19789, cors: true }
ws: { enabled: true, port: 19790 }
mcp: { enabled: true, port: 19791 }
grpc: { enabled: true, port: 19792 }
terminal: { enabled: true, port: 19793 }
Memory Namespaces
memory:
namespaces:
personal: { residency: any }
private: { residency: local_only } # never leaves your machine
work: { residency: any }
Monitoring
monitoring:
services: [] # external health-check endpoints
connectivity:
enabled: true
interval_secs: 60
timeout_secs: 5
power:
enabled: true
interval_secs: 60
defer_maintenance: true
manifest_health:
enabled: true
interval_secs: 120
Running & Deployment
Daemon lifecycle
brain start # Start (or via service if installed)
brain stop # Stop
brain status # Health check
brain tail # Stream events (observability)
Service management
brain service install # launchd (macOS) / systemd (Linux) / Task Scheduler (Windows)
brain service uninstall # Remove auto-start
Docker
Brain can run with Docker for the optional SearXNG web search backend:
brain deps up # Start SearXNG
brain deps status # Check
brain deps down # Stop
Data layout
Paths created at brain init:
- ~/.brain/config.yaml (user config)
- ~/.brain/db/brain.db (SQLite database)
- ~/.brain/ruvector/ (HNSW vector store)
- ~/.brain/vault/ (encrypted credentials)
- ~/.brain/logs/brain.log
Encryption
Enable encryption-at-rest with brain init --encrypt. This uses AES-256-GCM with Argon2id key derivation. Encrypted exports are the default when at-rest encryption is enabled.
Security
Brain's security model is built on layered guarantees.
Authentication
Every API request requires an API key (generated at brain init). Keys can be scoped to specific agents with limited permissions.
Authorization tiers
All capabilities are tagged with a safety tier:
| Tier | Examples | Requires confirmation? |
|---|---|---|
| Read | Memory search, status, audit query | No |
| Write | Store fact, set preference | No |
| Execute | Run command, web search | Yes (nonce-based) |
| Destructive | Delete memory, prune audit | Yes + budget check |
| External | Send message, delegate task | Yes + cost check |
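The table can be read as a pure function from tier to required gates. The sketch below encodes exactly the rows above; the Gates struct and the function name are invented for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SafetyTier {
    Read,
    Write,
    Execute,
    Destructive,
    External,
}

/// Which gates an action must pass before it runs (hypothetical struct).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Gates {
    pub confirmation: bool,
    pub budget_check: bool,
    pub cost_check: bool,
}

/// Encode the tier table: Read and Write run without confirmation,
/// Execute needs confirmation, and the last two tiers add one extra check each.
pub fn gates_for(tier: SafetyTier) -> Gates {
    let none = Gates { confirmation: false, budget_check: false, cost_check: false };
    match tier {
        SafetyTier::Read | SafetyTier::Write => none,
        SafetyTier::Execute => Gates { confirmation: true, ..none },
        SafetyTier::Destructive => Gates { confirmation: true, budget_check: true, ..none },
        SafetyTier::External => Gates { confirmation: true, cost_check: true, ..none },
    }
}
```

Keeping the mapping in one exhaustive match means that adding a tier fails to compile until its gates are decided.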
Confirmation engine
Actions in the Execute, Destructive, and External tiers require a nonce-based approval flow, as listed in the tier table above. The engine supports:
- Standing approvals (with optional TTL and scope)
- Confirmation timeouts (pauses when user is away)
- Cross-channel confirmation correlation
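The core of a nonce-based flow is that each pending action gets a one-time token, and approval only succeeds for a token that is still pending. The sketch below shows that mechanism only: the type and method names are hypothetical, the counter-based nonce is a stand-in for an unpredictable random value, and standing approvals, timeouts, and cross-channel correlation are not modelled.

```rust
use std::collections::HashMap;

/// Minimal pending-approval store (hypothetical; not Brain's confirm crate).
#[derive(Default)]
pub struct ConfirmationEngine {
    next: u64,
    /// Maps each outstanding nonce to the capability id awaiting approval.
    pending: HashMap<String, String>,
}

impl ConfirmationEngine {
    /// Issue a fresh nonce for a pending action.
    /// A counter keeps the sketch dependency-free; a real engine would use
    /// an unpredictable random value instead.
    pub fn request(&mut self, capability_id: &str) -> String {
        self.next += 1;
        let nonce = format!("nonce-{:08x}", self.next);
        self.pending.insert(nonce.clone(), capability_id.to_string());
        nonce
    }

    /// Approve by nonce and return the capability id to execute.
    /// Each nonce can be redeemed once; unknown or reused nonces yield `None`.
    pub fn approve(&mut self, nonce: &str) -> Option<String> {
        self.pending.remove(nonce)
    }
}
```

Removing the entry on approval is what makes replay harmless: presenting the same nonce a second time finds nothing to approve.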
Audit trail
Every action is recorded in an append-only SQLite audit trail, with triggers that enforce immutability. The audit covers who did what, when, and the authorization decision.
Sandbox
Command execution runs in a sandbox with:
- Process-group SIGKILL on timeout
- Binary allowlist
- rlimits (CPU, address space, file count, file size)
- macOS sandbox-exec / Linux unshare network isolation
Data residency
Namespaces can be marked local_only, preventing their data from reaching any non-local LLM provider. Enforcement happens at every egress point: recall, embedding, and export.
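An egress check of this kind reduces to a small decision over two facts: the namespace's residency and where the data is headed. The sketch below is a minimal version under assumptions: the enum and function names are invented, and a single local/remote distinction stands in for whatever provider metadata Brain actually consults.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Residency {
    Any,
    LocalOnly,
}

/// Where a piece of memory content is about to be sent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Destination {
    LocalProvider,
    RemoteProvider,
}

/// Returns `Ok(())` when the egress is allowed, or an explanatory error otherwise.
/// The same check would be called from recall, embedding, and export paths.
pub fn check_egress(
    namespace: &str,
    residency: Residency,
    dest: Destination,
) -> Result<(), String> {
    match (residency, dest) {
        (Residency::LocalOnly, Destination::RemoteProvider) => Err(format!(
            "namespace '{namespace}' is local_only; refusing to send to a non-local provider"
        )),
        _ => Ok(()),
    }
}
```

Returning an error rather than silently dropping the data matches the "fail safe, never silently" principle: the caller has to handle the refusal explicitly.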
Credential vault
Secrets are stored in the OS-native keychain (macOS Keychain, Linux Secret Service) with an encrypted-file fallback (Argon2id + AES-256-GCM).
Export & Import
Export
brain export # Plaintext JSON export
brain export --encrypt # Encrypted envelope (default when at-rest encryption is on)
brain export --output file.json
The export envelope includes:
- All episodic and semantic memories
- Procedural memory (trigger patterns)
- Memory namespaces and residency markers
- Config metadata (no secrets)
Encrypted exports are self-contained envelopes using AES-256-GCM with a fresh Argon2id-derived key and embedded salt β portable across machines.
Import
brain import file.json # Import plaintext
brain import file.enc # Import encrypted (auto-detected)
Imports preserve namespace boundaries. Encrypted imports prompt for passphrase.
Contributing
We welcome contributions! The project is organized as a Rust workspace with 29 crates.
Getting started
git clone https://github.com/keshavashiya/brain.git
cd brain
cargo build --workspace
cargo test --workspace
Development tools
just build # Build workspace (debug)
just test # Run all tests
just ci # fmt + clippy + tests
just fmt # Format code
just lint # Clippy
just serve-dev # Start with debug logging
PR checklist
- cargo fmt --all --check
- cargo clippy --workspace --all-targets -- -D warnings
- cargo test --workspace
- One intent per PR
- Conventional commits (feat:, fix:, docs:, etc.)
Key conventions
- Single-word crate names matching folder names
- brainos- package prefix for crates.io
- Every capability lives in its backend crate, not in cli
- Operator commands (init, doctor, service, vault, config) stay CLI-only
- CI parity enforced before every push
Documentation
- Public docs are at keshavashiya.github.io/brain
- Root ARCHITECTURE.md covers the high-level design
- CHANGELOG.md tracks user-facing changes
Codebase Conventions
Crate naming
| Layer | Rule | Example |
|---|---|---|
| Folder (crates/<name>/) | Single word, lowercase | crates/mcphost/ |
| Package name | brainos-<word> | brainos-mcphost |
| Workspace alias | Single word, matches folder | mcphost = { workspace = true } |
| Rust import | brainos_<word> | use brainos_mcphost::RmcpHost; |
Workspace deps
- All internal crates declared in root [workspace.dependencies]
- Consumers always use <name> = { workspace = true }
- Workspace version locked across all crates
Comments + docs
- No internal labels (Phase / PR references) in source code
- No stale feature references in docs
- PR memos and commit messages may use internal labels freely
CI parity
Every push runs: cargo fmt --all --check + cargo clippy --workspace --all-targets -- -D warnings + cargo test --workspace + cargo check --workspace --no-default-features.
Release Process
Versioning
Brain follows semantic versioning. The workspace version is locked across all 29 crates.
Release pipeline
Releases are driven by scripts/release.sh (local) and .github/workflows/release.yml (CI):
Local (human-driven)
scripts/release.sh X.Y.Z # Validate → CI check → publish → tag
scripts/release.sh X.Y.Z --dry-run # Dry run (no publish)
scripts/release.sh X.Y.Z --skip-ci # Skip CI check (re-runs)
Steps:
- Validates clean tree, version match, populated CHANGELOG
- Runs CI parity (fmt + clippy + tests)
- Publishes all crates in dependency order via scripts/publish-order.sh
- Creates annotated vX.Y.Z tag and pushes
CI (automated off the pushed tag)
Triggered by pushing a v* tag:
- Builds brain-<target>.tar.gz + .sha256 for macOS/Linux (x86_64 + aarch64)
- Generates SPDX SBOM
- Creates GitHub Release with binaries + checksums
Changelog
Every release requires an updated CHANGELOG.md with the [X.Y.Z] section populated. Extract release notes with:
scripts/changelog-extract.sh X.Y.Z
Product Vision
Brain OS is a biologically-inspired, central cognitive engine: the persistent memory and mediation layer between AI tools and the user's world.
The vision in one sentence
Your AI tools share one localized, ever-growing memory that runs 24/7 on your machine, and they all play by the same rules.
Core principles
- Local-first, always: Your data never leaves your hardware. There is no account, no cloud, no telemetry.
- One capability ontology: A capability is a typed entry in one registry. CLI, HTTP, MCP, and the reasoner are faces over it.
- Memory that earns its place: Importance scoring, forgetting curves, and consolidation keep the signal sharp.
- Fail safe, never silently: Every error path is explicit. Degraded-but-functional is the target.
- Open to any LLM: Ollama, OpenAI, OpenRouter, or any OpenAI-compatible endpoint.
Version arc
| Version | Focus | Status |
|---|---|---|
| v0.1.0 | Memory layer (episodic + semantic + procedural) | Released |
| v0.2.0 | Autonomous agent layer (safety, isolation, orchestration, delegation) | Released |
| v0.3.0 | Natural language interface (chat, intents, approvals) | Released |
| v0.4.0 | Wire pillars + fix stubs (all 30+ crates wired) | Released |
| v0.5.0 | Structural polish (release automation, capability coherence) | Released |
| v0.6.0 | The Connector (SDKs, connector protocol, situated kernel) | In progress |
| v0.7.0 | Skill packs (declarative capability bundles) | Planned |
| v0.8.0 | Multi-device (cuttable) | Planned |
| v0.9.0 | Brain Studio (trust console) | Planned |
| v1.0.0 | Launch: the private home for your AI life | Planned |
What Brain is not
- A cloud service
- A platform integration hub
- A consumer product with a polished GUI
- A replacement for tool-specific context windows
The web UI at /ui is a diagnostic tool for power users, not the primary interface.
Roadmap
This page summarizes the public-facing plan.
Active: Close the Loops
The current development focus is closing feedback loops across the system. Derived from a four-lens architecture review, the work is organized into tracks:
Track 0: Ship & hygiene (done)
Tag and publish v0.5.0, fence tool outputs as untrusted, quarantine MCP servers on hash change, add clock to prompt, fix architecture doc drift.
Track 1: Trustworthy substrate
Namespace data-residency enforcement, memory-trust (provenance-weighted recall + unattested-writer quarantine), grants ledger, TTL/scoped standing approvals, encrypted export, semantic capability retrieval. This track is a prerequisite for inviting third-party connectors and writers.
Track 2: Situated kernel
Hardware self-model, model tiers with per-task routing, connectivity and power awareness as kernel state, observation-to-graph mirroring, live manifest health, per-turn telemetry.
Track 3: Knowing companion
Discovery reflex, learned-normal monitoring, answer-quality fitness, project/workspace model, composable config.
Track 4: Human surface
Studio as trust console, goal model, presence awareness.
After v1.0.0
- Multi-device CRDT sync (v1.1)
- IDE integration (v1.1)
- Dual-memory unification (v1.1)