Configuration

Brain’s configuration lives in ~/.brain/config.yaml, generated by brain init. Every key has a safe default, so you only set what you want to change.

Precedence (highest wins):

  1. Env vars prefixed BRAIN_ with __ as the section separator (e.g. BRAIN_LLM__API_KEY=…, BRAIN_ADAPTERS__HTTP__PORT=8080)
  2. ~/.brain/config.yaml
  3. Embedded defaults (crates/core/default.yaml)
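
To make the naming rule in item 1 concrete, here is a small Python sketch that maps BRAIN_-prefixed variables onto nested config keys. It is not Brain's actual loader: `env_overrides` is a hypothetical helper, and both the lowercasing of key names and the choice to keep values as raw strings are assumptions.

```python
def env_overrides(environ, prefix="BRAIN_", sep="__"):
    """Turn BRAIN_A__B__C=value into {"a": {"b": {"c": "value"}}}.

    Assumptions (not confirmed by the docs): keys are lowercased and
    values are kept as raw strings without type coercion.
    """
    overrides = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable, ignore it
        path = name[len(prefix):].lower().split(sep)
        node = overrides
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return overrides


example = env_overrides({
    "BRAIN_LLM__API_KEY": "secret",
    "BRAIN_ADAPTERS__HTTP__PORT": "8080",
    "HOME": "/home/me",  # no BRAIN_ prefix, so it is skipped
})
print(example)
```

The resulting dictionary would then be layered over ~/.brain/config.yaml, which in turn is layered over the embedded defaults.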

This page is a complete reference to every section. Each block is independent — copy the ones you want to experiment with into your config and leave the rest out.


LLM

Brain probes each provider entry at startup, picks the first reachable one, and fails over to the next on rate-limit or error.

llm:
  temperature: 0.7
  max_tokens: 4096
  # The active model's input context window, in tokens. Drives how much
  # file/attachment + memory content the prompt assembler packs in. Raise to
  # your model's real size (e.g. 32768, 128000) so large-window models read in
  # detail instead of clipping to the conservative 8k default.
  context_window: 8192

  providers:
    - name: ollama
      kind: ollama                 # ollama | groq | openai | openrouter | deepseek | together | gemini-compat
      base_url: "http://localhost:11434"
      model: "qwen2.5-coder:7b"
      preferred_models: ["qwen2.5-coder:7b", "llama3.1:8b"]
    # - name: groq
    #   kind: groq
    #   api_key: "gsk_..."
    #   model: "llama-3.3-70b-versatile"
    #   preferred_models: ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"]
    # - name: openrouter
    #   kind: openrouter
    #   api_key: "sk-or-..."
    #   model: "meta-llama/llama-3.1-8b-instruct:free"

  # Legacy single-provider fallback — only used when `providers` is empty.
  provider: "ollama"
  model: "qwen2.5-coder:7b"
  base_url: "http://localhost:11434"
  api_key: ""
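
The failover behaviour described above can be pictured with the following Python sketch. It only illustrates the idea of walking an ordered provider list: `ProviderError`, `complete_with_failover`, and the `call` callback are hypothetical names, and the real startup probing is not modelled.

```python
class ProviderError(Exception):
    """Stands in for a rate-limit response or any other provider failure."""


def complete_with_failover(providers, call):
    """Try each provider in list order and return the first success.

    `providers` is a list of dicts with at least a "name" key, in the same
    order as `llm.providers`. `call` is a hypothetical function that sends
    the request to one provider and raises ProviderError on failure.
    """
    last_error = None
    for provider in providers:
        try:
            return provider["name"], call(provider)
        except ProviderError as err:
            last_error = err  # remember why this one failed, try the next
    raise RuntimeError("all providers failed") from last_error
```

List order therefore matters: put the provider you want used first at the top of `providers`.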

Model tiers (per-task routing)

Each tier is an ordered failover chain of provider names from the list above. Kernel chores (classification fallback, importance scoring, history compaction, web-search synthesis, background nudges) use fast; chat and task decomposition use deep; everything else uses balanced.

llm:
  tiers:
    fast: ["ollama"]               # keep chores fully local
    deep: ["openrouter", "ollama"] # cloud for chat, local fallback

An empty or omitted tier aliases the default chain, so leaving the block out changes nothing. Putting a local provider in fast guarantees those chores never leave your machine even when chat rides a cloud provider. Unknown provider names fail closed at startup.
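
The aliasing and fail-closed rules can be sketched in Python as follows. `resolve_tiers` is a hypothetical helper, and treating the default chain as "all providers in list order" is an assumption, not something this page states.

```python
def resolve_tiers(providers, tiers, tier_names=("fast", "balanced", "deep")):
    """Resolve each tier to an ordered failover chain of provider names.

    Assumption: the default chain is every configured provider, in order.
    """
    default_chain = list(providers)
    known = set(providers)
    resolved = {}
    for tier in tier_names:
        # An empty or omitted tier aliases the default chain.
        chain = tiers.get(tier) or default_chain
        unknown = [name for name in chain if name not in known]
        if unknown:
            # Fail closed: refuse to start rather than guess.
            raise ValueError(f"tier {tier!r} names unknown providers: {unknown}")
        resolved[tier] = list(chain)
    return resolved


print(resolve_tiers(
    ["ollama", "openrouter"],
    {"fast": ["ollama"], "deep": ["openrouter", "ollama"]},
))
```

Here `balanced` is omitted, so it falls back to the default chain while `fast` stays local-only.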


Embedding

Run ollama pull nomic-embed-text before first start. dimensions must match the model’s actual output size exactly.

embedding:
  model: "nomic-embed-text"
  dimensions: 768

Memory

memory:
  semantic:
    similarity_threshold: 0.65     # min cosine similarity for a semantic hit
    max_results: 20
  search:
    rrf_k: 60                      # Reciprocal Rank Fusion constant
    pre_fusion_limit: 50           # candidates from each source (BM25, ANN) before fusion
    importance_weight: 0.3         # weight for importance in final reranking
    recency_weight: 0.2            # weight for recency in final reranking
    decay_rate: 0.01               # forgetting-curve decay (higher = faster forgetting)
  consolidation:
    enabled: true
    interval_hours: 24
    forgetting_threshold: 0.05     # memories below this strength are dropped
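
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formula, where each document scores the sum of 1 / (rrf_k + rank) over the rankings it appears in. It is a generic illustration in Python, not Brain's search code; `rrf_fuse` is a hypothetical name, and the later importance and recency reranking is deliberately left out because this page does not specify how those weights are combined.

```python
def rrf_fuse(rankings, k=60, limit=50):
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion.

    `k` plays the role of rrf_k and `limit` the role of pre_fusion_limit.
    Ranks start at 1, as in the usual formulation.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking[:limit], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


bm25_hits = ["a", "b", "c"]  # keyword ranking
ann_hits = ["b", "d"]        # vector ranking
for doc_id, score in rrf_fuse([bm25_hits, ann_hits]):
    print(doc_id, round(score, 5))
```

A larger `rrf_k` flattens the difference between top and lower ranks, so agreement between sources matters more than position within one source.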

Namespaces (data residency)

A namespace marked local_only never reaches a non-local provider: its memories are withheld from prompts bound for remote LLMs, embedded only by a loopback embedder, and marked in exports. An entry also covers its name/… sub-namespaces. Store into one with namespace: private on any client, or via a transport’s namespace setting.

memory:
  namespaces:
    private:
      residency: local_only        # any | local_only
    # work:
    #   residency: any
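
The sub-namespace rule can be illustrated with a short Python sketch. `residency_for` is a hypothetical helper; the longest-match tie-break and the `any` default for unlisted namespaces are assumptions.

```python
def residency_for(namespace, namespaces):
    """Return the residency that applies to `namespace`.

    An entry covers its own name and anything under "name/".
    Assumptions: the longest matching entry wins, and a namespace with no
    matching entry is treated as "any".
    """
    best = None
    for name in namespaces:
        if namespace == name or namespace.startswith(name + "/"):
            if best is None or len(name) > len(best):
                best = name
    return namespaces[best]["residency"] if best is not None else "any"


config = {"private": {"residency": "local_only"}}
print(residency_for("private/journal", config))  # covered by "private"
print(residency_for("work", config))             # no entry
```

Matching on `name + "/"` rather than a bare prefix keeps an unrelated namespace such as `privateer` from inheriting the `private` policy.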

Per-agent memory trust

Recall scoring multiplies each memory’s score by the trust weight [0–1] of the agent that wrote it, so a low-trust agent’s memory cannot dominate context assembly no matter how its content is crafted. Memories from your own input always weigh 1.0.

memory:
  trust:
    default_agent_trust: 1.0
    # agents:
    #   some-external-agent: 0.4
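
As a rough Python illustration of the multiplication described above (not Brain's scoring code), the hypothetical `trusted_score` below scales a recall score by the author's trust weight. Representing the user's own input as `author=None` is an assumption made for the sketch.

```python
def trusted_score(score, author, agents=None, default_agent_trust=1.0):
    """Scale a recall score by the trust weight of the agent that wrote it.

    `author` is None for the user's own input, which always weighs 1.0.
    Agents without an explicit entry get `default_agent_trust`.
    """
    if author is None:
        return score
    weight = (agents or {}).get(author, default_agent_trust)
    if not 0.0 <= weight <= 1.0:
        raise ValueError(f"trust weight for {author!r} must be within [0, 1]")
    return score * weight


agents = {"some-external-agent": 0.4}
print(trusted_score(0.9, "some-external-agent", agents))
print(trusted_score(0.9, None, agents))
```

Because the weight is a multiplier of at most 1, a low-trust agent can only lose ranking relative to trusted sources, never gain.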

Encryption

At-rest encryption of the local stores. Run brain init --encrypt to generate a salt and enable it.

encryption:
  enabled: false

Security

The sandbox that governs Brain’s own command execution and filesystem reads.

security:
  # Binaries the sandbox may execute. Intentionally narrow — read-only
  # inspection plus the toolchain. To run anything else (docker, brew, ssh,
  # custom scripts), add it here explicitly.
  exec_allowlist:
    - ls
    - cat
    - grep
    - git
    - cargo
    # `sh` enables the shell-wrapped tier for commands with pipes/redirects.
    # When used via that tier the per-binary allowlist is bypassed for the
    # wrapped command; rlimits, network deny, timeout, and the forbidden list
    # still apply.
    - sh
  exec_timeout_seconds: 30
  # Roots that read-only filesystem inspection may touch. Empty defaults to
  # $HOME. Set explicit entries like ["~/code", "~/work"] to restrict further.
  # Paths outside any allowed root (after canonicalization) are rejected.
  allowed_paths: []
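
The `allowed_paths` check can be pictured with the Python sketch below. It is an illustration of "canonicalize, then compare against the roots", not the sandbox's implementation; `is_allowed` is a hypothetical helper.

```python
from pathlib import Path


def is_allowed(path, allowed_paths):
    """Return True if `path` canonicalizes to somewhere under an allowed root.

    An empty list falls back to the home directory, mirroring the
    documented default.
    """
    roots = allowed_paths or ["~"]
    # resolve() collapses ".." segments and follows symlinks, so a path
    # cannot escape a root by traversal.
    target = Path(path).expanduser().resolve()
    for root in roots:
        base = Path(root).expanduser().resolve()
        if target == base or base in target.parents:
            return True
    return False


print(is_allowed("/tmp/x/notes.txt", ["/tmp/x"]))
print(is_allowed("/tmp/x/../../etc/passwd", ["/tmp/x"]))
```

Comparing resolved paths rather than raw strings is what makes the second call fail even though its text starts with an allowed root.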

Actions

What Brain is allowed to do in the world. Most actions are off by default.

actions:
  web_search:
    enabled: true
    provider: "duckduckgo"         # duckduckgo | searxng | tavily | custom
    endpoint: "http://localhost:8888"  # searxng/custom only
    api_key: ""                    # required for tavily
    timeout_ms: 3000
    default_top_k: 5

  scheduling:
    enabled: false                 # WRITE axis: lets Brain create/persist scheduled
                                   # intents. Firing them is the FIRE axis — see
                                   # `reflex.cron` below. Both are required to run.
    mode: "persist_only"

  messaging:
    enabled: false
    timeout_ms: 3000
    channels: {}
    # Webhook channels work for Discord, Telegram, Slack, or any HTTP endpoint.
    # Template vars: {{channel}} {{recipient}} {{content}} {{namespace}} {{timestamp}}
    #
    #   discord:
    #     url: "https://discord.com/api/webhooks/<ID>/<TOKEN>"
    #     body: '{"content": "{{content}}"}'
    #     headers: {}
    #   telegram:
    #     url: "https://api.telegram.org/bot<TOKEN>/sendMessage"
    #     body: '{"chat_id": "<CHAT_ID>", "text": "{{content}}", "parse_mode": "Markdown"}'
    #     headers: {}

  resilience:
    max_retries: 2
    retry_base_ms: 500
    circuit_breaker_threshold: 5
    circuit_breaker_cooldown_secs: 60
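
To show how the `resilience` settings might interact, here is a minimal Python sketch of a retry schedule and a circuit breaker. Both are generic patterns under stated assumptions: the doubling backoff, the "consecutive failures" counting, and the names `backoff_delays_ms` and `CircuitBreaker` are not taken from Brain's source.

```python
def backoff_delays_ms(max_retries=2, retry_base_ms=500):
    """Delay before each retry, doubling every attempt (assumed schedule)."""
    return [retry_base_ms * 2 ** attempt for attempt in range(max_retries)]


class CircuitBreaker:
    """Opens after `threshold` consecutive failures, then cools down."""

    def __init__(self, threshold=5, cooldown_secs=60):
        self.threshold = threshold
        self.cooldown_secs = cooldown_secs
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None if closed

    def allow(self, now):
        """Return True if a call may proceed at time `now` (seconds)."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown_secs:
            # Cooldown elapsed: close the breaker and let calls through.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok, now):
        """Record the outcome of a call made at time `now`."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now


print(backoff_delays_ms())
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would feed it a monotonic clock.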

Proactivity

Whether and how Brain reaches out to you on its own.

proactivity:
  enabled: true
  max_per_day: 2
  min_interval_minutes: 60
  quiet_hours:
    start: "20:00"
    end: "10:00"
    timezone: "UTC"                # IANA timezone, e.g. "America/New_York"
  delivery:
    outbox: true
    broadcast: true
    webhook_channels: []           # channel keys from actions.messaging.channels
    max_outbox_age_days: 7
  open_loop:                       # detect unresolved threads and follow up
    enabled: true
    scan_window_hours: 72
    resolution_window_hours: 24
    check_interval_minutes: 120
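
The default quiet-hours window runs from 20:00 to 10:00 and so crosses midnight. The Python sketch below shows one way to test membership in such a window. `in_quiet_hours` is a hypothetical helper, the caller is assumed to pass a time already converted to the configured timezone, and treating `start` as inclusive and `end` as exclusive is an assumption.

```python
from datetime import time


def in_quiet_hours(now, start="20:00", end="10:00"):
    """Return True if local time `now` falls inside the quiet window."""
    begin = time.fromisoformat(start)
    finish = time.fromisoformat(end)
    if begin <= finish:
        # Same-day window, e.g. 13:00 to 15:00.
        return begin <= now < finish
    # Window wraps past midnight, e.g. 20:00 to 10:00.
    return now >= begin or now < finish


print(in_quiet_hours(time(23, 0)))
print(in_quiet_hours(time(12, 0)))
```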

Adapters

The network surfaces Brain exposes. Disable any you don’t use.

adapters:
  http:     { enabled: true, host: "127.0.0.1", port: 19789, cors: true }
  ws:       { enabled: true, port: 19790 }
  mcp:      { enabled: true, port: 19791 }
  grpc:     { enabled: true, port: 19792 }
  terminal: { enabled: true, port: 19793 }

Reflex (reactive signal sources)

Default is empty — no reflex tasks spawn unless configured here. Each firing becomes a Signal and flows through the normal pipeline (identity, confirmation, dispatch).

reflex:
  fs: []                           # filesystem watchers, one entry per path set
  # fs:
  #   - name: project-watch
  #     paths: ["~/notes", "~/projects"]
  #     recursive: true
  #     debounce_ms: 200

  cron:
    enabled: false                 # FIRE axis: fires due scheduled_intents through
                                   # the pipeline. Required for actions.scheduling
                                   # intents to ever run.
    poll_interval_seconds: 60

  sys:
    enabled: false                 # edge-triggered system state
    poll_interval_seconds: 30
    rules: []
    # Kinds (all edge-triggered — they fire on a transition, not a level):
    #   battery_below (needs `threshold`), on_ac_changed — platform power source
    #     (pmset on macOS, /sys on Linux).
    #   network_changed — the kernel's online/offline view; needs
    #     `monitoring.connectivity` enabled with targets, else it never flips.
    #   lock_changed — systemd-logind (`LockedHint`) on Linux, CoreGraphics
    #     (`CGSSessionScreenIsLocked`) on macOS; inert where no GUI session is
    #     reachable (headless / ssh).
    # rules:
    #   - kind: battery_below
    #     threshold: 20
    #   - kind: on_ac_changed
    #   - kind: network_changed
    #   - kind: lock_changed

Logging

Drives the tracing subscriber. RUST_LOG still overrides the computed filter at runtime. Long-running services (serve, mcp) log to a rotating file at ~/.brain/logs/brain.log; one-shot commands log to stderr.

logging:
  level: "info"                    # base level for the `brain` target
  format: "pretty"                 # "pretty" (human) or "json" (structured)
  rotation: "daily"                # "daily" | "hourly" | "never"
  targets: {}                      # per-subsystem overrides
  # targets:
  #   hippocampus: "debug"
  #   signal: "info"

Learning (capability fitness)

Brain records whether each tool succeeds or fails, decays those observations under the forgetting curve, and uses them as a tie-breaker when ranking the tools it offers the chat model. Awareness only — execution stays consent-gated.

learning:
  capability_fitness:
    enabled: true
    half_life_days: 30             # how long an observation keeps half its weight
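
A half-life of 30 days means an observation's weight halves every 30 days. The Python sketch below applies that to a list of tool outcomes. The exponential form follows from the definition of a half-life, but combining observations into a weighted success rate is an assumption, and `observation_weight` and `fitness` are hypothetical names.

```python
def observation_weight(age_days, half_life_days=30):
    """Weight of an observation that is `age_days` old."""
    return 0.5 ** (age_days / half_life_days)


def fitness(observations, half_life_days=30):
    """Decay-weighted success rate for one tool.

    `observations` is a list of (succeeded, age_days) pairs.
    Returns None when there is nothing to go on.
    """
    total = sum(observation_weight(age, half_life_days) for _, age in observations)
    if total == 0:
        return None
    good = sum(
        observation_weight(age, half_life_days)
        for succeeded, age in observations
        if succeeded
    )
    return good / total


# A fresh success outweighs a failure from one half-life ago.
print(fitness([(True, 0), (False, 30)]))
```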

Observability

A background task samples process RSS, CPU, open SQLite connections, and ~/.brain disk usage; crossing a ceiling emits a ResourcePressure event (visible in brain tail, brain doctor --deep, and /status). Ceilings are generous and fail-safe — set any threshold to 0 to disable it.

observability:
  resource_sample_secs: 30
  thresholds:
    rss_mb: 2048                   # resident-set-size ceiling (MiB)
    cpu_pct: 90.0                  # process CPU ceiling (% single-core basis)
    disk_mb: 10240                 # ~/.brain disk-usage ceiling (MiB)
    open_fds: 1024                 # open file-descriptor ceiling (fd-leak warning)
  log_sampling:
    high_volume_1_in_n: 1          # emit 1 in N high-volume log lines; 1 = log all

Monitoring

External service health

Each entry spawns one bounded probe loop (HTTP GET or raw TCP connect). Probes are edge-triggered: a notification fires only when a service crosses between reachable and unreachable, never once per interval. Empty by default.

monitoring:
  services: []
  # - name: ollama
  #   kind: http                   # http | tcp
  #   target: "http://localhost:11434/api/tags"
  #   interval_secs: 60
  #   timeout_secs: 10
  #   expect_status: 200           # http only; omit to accept any 2xx
  # - name: postgres
  #   kind: tcp
  #   target: "127.0.0.1:5432"
  #   interval_secs: 30
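
Edge-triggering is easiest to see in code. In this Python sketch a notification is produced only when the reachable state changes between two probes. `EdgeTrigger` is a hypothetical class, and treating the first probe as a silent baseline is an assumption.

```python
class EdgeTrigger:
    """Report transitions between reachable and unreachable, not levels."""

    def __init__(self):
        self.last = None  # no probe result seen yet

    def observe(self, reachable):
        """Feed one probe result; return a transition label or None."""
        previous, self.last = self.last, reachable
        if previous is None or previous == reachable:
            return None  # first sample or no change: stay quiet
        return "recovered" if reachable else "went down"


trigger = EdgeTrigger()
for result in [True, True, False, False, True]:
    print(result, trigger.observe(result))
```

A service that stays down for an hour therefore produces one notification, not one per interval.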

Connectivity

The kernel’s online / degraded / offline view. Targets default to the configured remote LLM provider endpoints, so probing never adds a new egress destination; a fully-local install has nothing to probe and stays online. While offline, chat rides the first fully-local model tier and web search degrades with an honest explanation instead of timing out.

monitoring:
  connectivity:
    enabled: true
    interval_secs: 60
    timeout_secs: 5                # per-target TCP-connect timeout
    targets: []                    # host:port overrides; empty = derive from llm.providers

Power

The kernel’s external / battery view, read from the platform (pmset on macOS, /sys/class/power_supply on Linux; no network). While on battery, heavy background maintenance holds until external power returns. Desktops and platforms without a readable power source stay pinned external.

monitoring:
  power:
    enabled: true
    interval_secs: 60
    defer_maintenance: true        # hold consolidation/sweeps while on battery

Manifest health

A periodic sweep that stamps each registered capability verified / degraded / breaker-open by probing what it depends on (the embedding model, network connectivity, per-tool circuit breakers). The capability digest and tools/list annotate unhealthy tools so the reasoner never promises a faculty that is dead right now.

monitoring:
  manifest_health:
    enabled: true
    interval_secs: 120

Channel (relays & transports)

Bidirectional gateways that connect Brain to chat platforms. Unlike one-shot actions.messaging webhooks, these are long-lived connections — approval responses from any channel are correlated automatically.

channel:
  # Long-lived WebSocket gateways.
  relays: []
  # - id: telegram
  #   label: "Telegram"
  #   url: "ws://127.0.0.1:7000/brain"
  #   namespace: "personal"
  #   api_key: ""
  #   initial_backoff_ms: 1000
  #   max_backoff_ms: 60000

  # Generic preset-driven transports (http_polled / webhook_inbound /
  # webhook_outbound). Each names a preset that ships embedded under
  # crates/channel/presets/ (discord, slack, telegram) or lives at
  # ~/.brain/presets/<id>.yaml.
  transports: []
  # - id: chat-main
  #   label: "Telegram"
  #   preset: telegram
  #   namespace: personal
  #   credential: "<bot-token-or-webhook-url>"   # plugged into the preset's templates
  #   signing_secret: "<hmac-or-pubkey>"         # webhook_inbound presets only

Agents (delegation)

Specialist CLI agents the orchestrator hands multi-step work to. Auto-discovery (on by default) finds well-known agents on $PATH without manual entries; use delegates for bespoke binaries.

agents:
  delegates: []
  fallbacks: []
  retry_on_timeout: true
  auto_discovery: true             # find claude_code, aider, cursor, … on $PATH
  # delegates:
  #   - name: script
  #     kind: subprocess
  #     binary: "/usr/local/bin/my-agent"
  #     args: ["--task", "{task_id}"]
  #     prompt_via_stdin: true
  #     tags: ["custom"]
  #
  # Per-agent overrides for the auto-discovered registry, keyed by canonical id
  # (claude_code, aider, cursor, …). Every field is optional.
  # discovery_overrides:
  #   claude_code:
  #     binary: "/opt/homebrew/bin/claude"
  #     args: ["--print", "--task", "{task_id}"]
  #     prompt_via_stdin: true
  #     disabled: false
  #     capabilities:
  #       tags: ["code-edit", "plan", "rust"]
  #       languages: ["rust", "typescript"]
  #       max_concurrency: 2
  #       needs_network: true

Access (API keys & rate limiting)

A random key is generated on brain init and printed once to stdout.

access:
  api_keys: []
  # - key: "brk_..."                  # `brain init` generates one in this format
  #   name: "laptop"
  #   permissions: ["read", "write"]   # read | write | export | admin
  #   agent_id: "my-laptop"            # binds the key to an identity principal
  rate_limit:
    enabled: true
    tokens_per_refill: 60
    refill_interval_ms: 60000
    burst_capacity: 20

Scopes do not imply each other — write does not grant read; list both if needed. admin is an implicit superset of every scope.
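
The scope rule reduces to a one-line check, sketched here in Python. `has_scope` is a hypothetical helper written only to illustrate the rule.

```python
def has_scope(granted, required):
    """Return True if a key with `granted` permissions may use `required`.

    Scopes are independent of each other; only "admin" implies the rest.
    """
    return "admin" in granted or required in granted


print(has_scope(["write"], "read"))           # write alone does not grant read
print(has_scope(["read", "write"], "read"))
print(has_scope(["admin"], "export"))
```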


Confirm (standing approvals)

Pre-authorize specific (agent_id, verb_ns, verb_action) triples so they don’t prompt for human confirmation every time. Entries are loaded at startup into the standing-approval store, and loading is idempotent across launches: an existing active grant for the same triple is left alone.

confirm:
  standing_approvals: []
  # - agent_id: "my-laptop"
  #   verb_ns: "net"
  #   verb_action: "http"
  #   note: "trusted automation"

Identity (principals & authorization)

Binds requests to a principal and constrains what each may do. Default is empty — signals carry no principal and the identity gate is silently skipped.

Risk tiers, ordered by escalating risk: read < write < execute < destructive < external. Only destructive and external block on explicit human approval.

identity:
  user_id: ""
  principals: []
  # - agent_id: "my-laptop"
  #   tier: execute               # read | write | execute | destructive | external
  #   scopes: ["memory", "net"]
  #   # Path prefixes the principal may read/write. Empty = no path-scoped ops.
  #   path_allowlist: ["~/code", "~/work"]
  #   # General per-(verb, modifier) allowlists. Empty = unconstrained beyond
  #   # tier/scope/path. Opt-in.
  #   constraints:
  #     - verb: "net.http"         # exact, "net.*" wildcard, or "*"
  #       modifier: "host"         # request modifier key, e.g. host / command
  #       match_kind: host_suffix  # exact | prefix | host_suffix
  #       allow: ["example.com"]   # empty = deny everything for this (verb, modifier)
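
The constraint fields can be read as a small matching procedure, sketched below in Python. This is an interpretation, not Brain's authorization code: the function names are hypothetical, and matching `host_suffix` only on a dot boundary is an assumption.

```python
def verb_matches(verb, pattern):
    """Match a verb against an exact name, a "ns.*" wildcard, or "*"."""
    if pattern == "*":
        return True
    if pattern.endswith(".*"):
        return verb.startswith(pattern[:-1])  # keep the trailing dot
    return verb == pattern


def value_matches(value, allowed, match_kind):
    """Compare one modifier value against one allowlist entry."""
    if match_kind == "exact":
        return value == allowed
    if match_kind == "prefix":
        return value.startswith(allowed)
    if match_kind == "host_suffix":
        # Assumed: match the host itself or any subdomain of it.
        return value == allowed or value.endswith("." + allowed)
    raise ValueError(f"unknown match_kind: {match_kind!r}")


def modifier_allowed(value, allow, match_kind):
    """An empty allowlist denies everything for this (verb, modifier)."""
    return any(value_matches(value, entry, match_kind) for entry in allow)


print(verb_matches("net.http", "net.*"))
print(modifier_allowed("api.example.com", ["example.com"], "host_suffix"))
print(modifier_allowed("evilexample.com", ["example.com"], "host_suffix"))
```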

Storage

Internal defaults — safe to leave unchanged.

brain:
  version: "0.5.0"
  data_dir: "~/.brain"

storage:
  ruvector_path: "~/.brain/ruvector/"
  sqlite_path: "~/.brain/db/brain.db"
  hnsw:
    ef_construction: 200
    m: 16
    ef_search: 50
    # HNSW pre-allocates the index graph for max_elements up-front, so this is
    # a real memory cost. 100k covers personal-scale installs; raise to
    # 1_000_000+ for a team or large corpus.
    max_elements: 100000