
Zenii Architecture

Table of Contents


System Architecture

Zenii Architecture

Data Flow

Crate Dependency Graph

Project Structure

zenii/
├── Cargo.toml                  # Workspace root (5 members)
├── CLAUDE.md                   # AI assistant instructions
├── README.md                   # Project documentation
├── scripts/
│   └── build.sh                # Cross-platform build script
├── docs/
│   ├── architecture.md         # This file
│   └── processes.md            # Process flow diagrams
├── crates/
│   ├── zenii-core/             # Shared library (NO Tauri dependency)
│   │   ├── src/
│   │   │   ├── lib.rs          # Module exports + Result<T> alias
│   │   │   ├── error.rs        # ZeniiError enum (30 variants, thiserror)
│   │   │   ├── boot.rs         # init_services() -> Services -> AppState, single boot entry point
│   │   │   ├── config/         # TOML config (schema + load/save + OS paths)
│   │   │   ├── db/             # rusqlite pool + WAL + migrations + spawn_blocking
│   │   │   ├── event_bus/      # EventBus trait + TokioBroadcastBus (13 events)
│   │   │   ├── memory/         # Memory trait + SqliteMemoryStore (FTS5 + vectors) + InMemoryStore
│   │   │   ├── credential/     # CredentialStore trait + KeyringStore + FileCredentialStore + InMemoryCredentialStore
│   │   │   ├── security/       # SecurityPolicy + AutonomyLevel + rate limiter + audit log
│   │   │   ├── tools/          # Tool trait + ToolRegistry (DashMap) + 19 built-in tools (16 base + 3 feature-gated)
│   │   │   ├── ai/             # AI agent (rig-core), providers, session manager, tool adapter, context engine, delegation
│   │   │   │   └── delegation/ # Coordinator, SubAgent, DelegationTask, dependency-wave execution
│   │   │   ├── workflows/      # WorkflowRegistry, WorkflowExecutor, StepRuntime, templates (feature-gated)
│   │   │   ├── gateway/        # axum HTTP+WS gateway (105 base + 28 feature-gated = 133 routes, auth middleware, error mapping, ZENII_VALIDATION)
│   │   │   ├── identity/       # SoulLoader + PromptComposer + defaults (SOUL/IDENTITY/USER.md)
│   │   │   ├── skills/         # SkillRegistry + bundled/user skills (markdown + YAML frontmatter)
│   │   │   ├── user/           # UserLearner + SQLite observations + privacy controls
│   │   │   ├── channels/       # Channel traits + registry + 3 adapters (Telegram/Slack/Discord, feature-gated)
│   │   │   │   ├── mod.rs      # Module exports with feature gates
│   │   │   │   ├── traits.rs   # Channel, ChannelLifecycle, ChannelSender traits
│   │   │   │   ├── message.rs  # ChannelMessage with builder pattern
│   │   │   │   ├── registry.rs # ChannelRegistry (DashMap-backed)
│   │   │   │   ├── protocol.rs # ConnectorFrame wire protocol
│   │   │   │   ├── telegram/   # TelegramChannel + config + formatting
│   │   │   │   ├── slack/      # SlackChannel + API helpers + formatting
│   │   │   │   └── discord/    # DiscordChannel + config
│   │   │   └── scheduler/      # Cron + scheduled tasks, feature-gated (Phase 8)
│   │   └── tests/              # Integration tests
│   ├── zenii-desktop/          # Tauri 2.10 shell (desktop)
│   │   ├── Cargo.toml          # tauri 2.10, 4 plugins, devtools feature
│   │   ├── build.rs            # tauri_build::build()
│   │   ├── tauri.conf.json     # 1280x720, CSP, com.sprklai.zenii
│   │   ├── capabilities/default.json
│   │   ├── icons/              # 7 icon files
│   │   └── src/
│   │       ├── main.rs         # Entry + Linux WebKit DMA-BUF fix
│   │       ├── lib.rs          # Builder: plugins, tray, IPC, close-to-tray
│   │       ├── commands.rs     # 4 IPC + boot_gateway() + 7 tests
│   │       └── tray.rs         # Show/Hide/Quit menu + 1 test
│   ├── zenii-cli/              # clap CLI
│   ├── zenii-tui/              # ratatui TUI
│   └── zenii-daemon/           # Headless daemon (full gateway server)
└── web/                        # Svelte 5 frontend (SPA)
    ├── src/
    │   ├── app.css             # Tailwind v4 + shadcn theme tokens
    │   ├── app.html            # SPA shell
    │   ├── lib/
    │   │   ├── api/            # HTTP client + WebSocket manager
    │   │   ├── tauri.ts        # isTauri detection + 4 invoke wrappers
    │   │   ├── components/
    │   │   │   ├── ai-elements/ # svelte-ai-elements (9 component sets)
    │   │   │   ├── ui/         # shadcn-svelte primitives (14 component sets)
    │   │   │   ├── AuthGate.svelte
    │   │   │   ├── ChatView.svelte
    │   │   │   ├── Markdown.svelte
    │   │   │   ├── SessionList.svelte
    │   │   │   └── ThemeToggle.svelte
    │   │   ├── stores/         # 7 Svelte 5 rune stores ($state, includes channels)
    │   │   ├── paraglide/      # i18n (paraglide-js, 8 locales, 577 keys)
    │   │   └── utils.ts        # shadcn utility helpers
    │   └── routes/             # 9 SPA routes
    │       ├── +page.svelte            # Home
    │       ├── chat/+page.svelte       # New chat
    │       ├── chat/[id]/+page.svelte  # Existing session
    │       ├── memory/+page.svelte     # Memory browser
    │       ├── schedule/+page.svelte   # Placeholder (Phase 8)
    │       ├── settings/+page.svelte   # General settings
    │       ├── settings/providers/     # Provider config
    │       ├── settings/channels/      # Channel credential + connection management
    │       └── settings/persona/       # Identity + skills editor
    ├── package.json
    └── vitest.config.ts        # 26 unit tests (vitest)

Default Paths by OS

Resolved via directories::ProjectDirs::from("com", "sprklai", "zenii").

Source: crates/zenii-core/src/config/mod.rs

| OS | Config Path | Data Dir / DB Path |
|---|---|---|
| Linux | ~/.config/zenii/config.toml | ~/.local/share/zenii/zenii.db |
| macOS | ~/Library/Application Support/com.sprklai.zenii/config.toml | ~/Library/Application Support/com.sprklai.zenii/zenii.db |
| Windows | %APPDATA%\sprklai\zenii\config\config.toml | %APPDATA%\sprklai\zenii\data\zenii.db |

Override in config.toml:

data_dir = "/custom/data/path"     # overrides default data directory
db_path = "/custom/path/zenii.db"  # overrides database file directly

Feature Flag Composition

Trait-Driven Architecture

All major subsystems are abstracted behind traits, allowing swappable implementations for testing, migration, and scaling.

All binary crates receive these traits via AppState (Clone + Arc<T>), never concrete types.

Credential System

Per-Binary Keyring Access

| Binary | Keyring Access | Notes |
|---|---|---|
| Desktop | Direct | Tauri 2 has full OS access |
| Mobile | Direct | Tauri 2 mobile has keychain access |
| CLI | Direct | Runs as user process |
| TUI | Via gateway | Connects to daemon over HTTP |
| Daemon | Direct | Headless, runs as service |

Fallback Chain

At boot, the credential store is selected via a three-tier fallback:

  1. KeyringStore — OS keyring (macOS Keychain, Windows Credential Manager, Linux Secret Service). Preferred when available.
  2. FileCredentialStore — AES-256-GCM encrypted JSON file at {data_dir}/credentials.enc. Key derived from SHA-256 of machine characteristics (hostname, username, data_dir, service_id). Activated when keyring is unavailable (e.g., macOS after binary recompilation changes code signature).
  3. InMemoryCredentialStore — Volatile RAM-only store. Last resort when both persistent stores fail.

All credential values are wrapped with zeroize for secure memory cleanup.
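The fallback selection above can be sketched as a simple chain over the `CredentialStore` abstraction. This is a minimal, std-only illustration: the trait methods and constructor shapes here are assumptions, not the crate's real signatures.

```rust
use std::collections::HashMap;

// Simplified stand-in for the real CredentialStore trait (assumed shape).
trait CredentialStore {
    fn set(&mut self, key: &str, value: &str) -> Result<(), String>;
    fn get(&self, key: &str) -> Option<String>;
}

// Tier 3: volatile RAM-only store, always available as a last resort.
struct InMemoryCredentialStore {
    map: HashMap<String, String>,
}

impl CredentialStore for InMemoryCredentialStore {
    fn set(&mut self, key: &str, value: &str) -> Result<(), String> {
        self.map.insert(key.to_string(), value.to_string());
        Ok(())
    }
    fn get(&self, key: &str) -> Option<String> {
        self.map.get(key).cloned()
    }
}

// Boot-time selection: prefer the keyring, fall back to the encrypted file
// store, and only then to the in-memory store.
fn select_store(
    keyring: Option<Box<dyn CredentialStore>>,
    file: Option<Box<dyn CredentialStore>>,
) -> Box<dyn CredentialStore> {
    keyring
        .or(file)
        .unwrap_or_else(|| Box::new(InMemoryCredentialStore { map: HashMap::new() }))
}
```

In the real system each tier's availability is probed at boot (e.g. a keyring round-trip test); here the `Option` arguments stand in for that probe result.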

Provider Registry

The ProviderRegistry manages AI provider configurations (OpenAI, Anthropic, Gemini, OpenRouter, Vercel AI Gateway, Ollama, and custom providers). It is DB-backed with 6 built-in providers seeded on first boot. Any OpenAI-compatible API endpoint can be added as a custom provider — just supply a base_url and API key.

Credential Key Naming Convention

| Scope | Pattern | Examples |
|---|---|---|
| AI Provider API Keys | api_key:{provider_id} | api_key:openai, api_key:tavily, api_key:brave |
| Channel Credentials | channel:{channel_id}:{field} | channel:telegram:token, channel:slack:bot_token |
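As a sketch, the two patterns reduce to trivial formatting helpers (these function names are illustrative, not the crate's real API):

```rust
// Hypothetical helpers showing the credential key naming convention.

/// Key for an AI provider API key: api_key:{provider_id}
fn provider_key(provider_id: &str) -> String {
    format!("api_key:{provider_id}")
}

/// Key for a channel credential field: channel:{channel_id}:{field}
fn channel_key(channel_id: &str, field: &str) -> String {
    format!("channel:{channel_id}:{field}")
}
```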

Messaging Channels System

The channels module provides trait-based messaging integration with external platforms. Each channel is feature-gated and managed through a concurrent ChannelRegistry.

Feature Flags

| Feature | Depends On | Adds |
|---|---|---|
| channels | (none) | Core channel traits + registry + gateway routes |
| channels-telegram | channels | TelegramChannel + teloxide dependency |
| channels-slack | channels | SlackChannel (uses existing reqwest/tungstenite) |
| channels-discord | channels | DiscordChannel + serenity dependency |
| workflows | (none) | WorkflowRegistry + WorkflowExecutor + petgraph + minijinja + 7 gateway routes |

Identity / Soul System

Identity defines the AI assistant's personality, tone, and behavior through 3 markdown files with YAML frontmatter. All prompt content comes from .md files — zero hardcoded prompt strings in Rust code.

Identity File Format (IDENTITY.md)

---
name: Zenii
version: "2.0"
description: AI-powered assistant
---

# Identity details...
  • Storage: data_dir/identity/ (configurable via identity_dir in config.toml)
  • Bundled defaults: embedded via include_str!() at compile time, written to disk on first run
  • Reload: manual via POST /identity/reload endpoint (no notify dependency)
  • API: GET /identity, GET /identity/{name}, PUT /identity/{name}, POST /identity/reload

Skills System

Skills are instructional markdown documents loaded into the agent's context. They follow the Claude Code model — pure markdown with YAML frontmatter metadata, no parameter substitution.

Skill File Format (Claude Code model)

---
name: system-prompt
description: Generates effective system prompts for AI agents
category: meta
---

# System Prompt Generator

When creating system prompts, follow these principles:
...
  • No Tera/comrak: Skills are pure markdown context documents, not parameterized templates
  • 2 tiers: Bundled (compile-time) + User (disk). User skills with same id override bundled.
  • API: GET /skills, GET /skills/{id}, POST /skills, PUT /skills/{id}, DELETE /skills/{id}, POST /skills/reload
  • Bundled skills cannot be deleted — only user skills support DELETE

User Profile + Progressive Learning

Zenii learns user preferences over time via explicit observation API. Observations are stored in SQLite with category-based organization and confidence scoring.

  • USER.md: static user context template (part of identity system)
  • UserLearner: SQLite-backed observation store with CRUD operations
  • Observations: stored in user_observations table with category, key, value, confidence, timestamps
  • Privacy: learning toggled via config, denied categories block specific observation types, TTL auto-expires old observations
  • API: GET /user/observations, POST /user/observations, GET /user/observations/{key}, DELETE /user/observations/{key}, DELETE /user/observations, GET /user/profile

Context-Aware Agent System

The context engine provides 3-tier adaptive context injection that reduces token usage while keeping the agent contextually grounded.

Context Level Determination

| Condition | Level | Content |
|---|---|---|
| New session (0 messages) | Full | Boot + runtime + identity + user + capabilities |
| Continuing (recent messages, within gap) | Minimal | One-liner: "Zenii — AI assistant \| date \| OS \| model" |
| Gap exceeded (> N minutes since last msg) | Full | Same as new session |
| Message count threshold exceeded | Full | Same as new session |
| Resumed session with prior messages | Summary | Full + prior conversation summary |
| Toggle disabled | Fallback | Config agent_system_prompt or default preamble |
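The decision table above amounts to a small ordered check. A minimal sketch, assuming hypothetical field names and stand-in values for the configurable gap and message-count thresholds:

```rust
#[derive(Debug, PartialEq)]
enum ContextLevel { Full, Minimal, Summary, Fallback }

// Hypothetical session snapshot; the real engine reads this from config + DB.
struct SessionState {
    enabled: bool,              // context engine toggle
    message_count: u32,
    minutes_since_last: u32,
    resumed_with_history: bool, // resumed session with prior messages
}

// Stand-in thresholds for the configurable limits.
const GAP_MINUTES: u32 = 30;
const COUNT_THRESHOLD: u32 = 20;

fn determine_level(s: &SessionState) -> ContextLevel {
    if !s.enabled {
        ContextLevel::Fallback
    } else if s.resumed_with_history {
        ContextLevel::Summary
    } else if s.message_count == 0
        || s.minutes_since_last > GAP_MINUTES
        || s.message_count > COUNT_THRESHOLD
    {
        ContextLevel::Full
    } else {
        ContextLevel::Minimal
    }
}
```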

Prompt Strategy System

The prompt strategy system (Phase 8.13) replaces the dual-compose pipeline with a plugin-based architecture that reduces preamble tokens by ~65%:

PromptStrategyRegistry (implements PromptStrategy)
├── base: CompactStrategy or LegacyStrategy
│   └── Layers 0 + 1 + 3 (identity, runtime, overrides)
└── plugins: Vec<Arc<dyn PromptPlugin>>
    ├── MemoryPlugin (always)
    ├── UserObservationsPlugin (always)
    ├── SkillsPlugin (always)
    ├── LearnedRulesPlugin (if self_evolution)
    ├── ChannelContextPlugin (feature: channels)
    └── SchedulerContextPlugin (feature: scheduler)

Handlers call state.prompt_strategy.assemble(&AssemblyRequest) -- a single entry point that:

  1. Base strategy produces Layer 0 (identity), Layer 1 (runtime), Layer 3 (overrides)
  2. Plugins contribute Layer 2 fragments with domain filtering and priority
  3. Registry merges all fragments and applies token budget trimming

Config: prompt_compact_identity (default true) selects CompactStrategy vs LegacyStrategy. prompt_max_preamble_tokens (default 1500) controls the overflow budget.
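Step 3 (merge plus token budget trimming) can be sketched as follows. The `Fragment` fields and the greedy trim policy here are assumptions for illustration; the real registry may weigh fragments differently.

```rust
// Hypothetical Layer-2 fragment as a plugin might contribute it.
struct Fragment {
    priority: u8,   // lower value = injected earlier / kept first
    tokens: usize,  // estimated token cost
    text: String,
}

// Merge fragments in priority order, dropping anything that would
// push the preamble past the token budget (prompt_max_preamble_tokens).
fn merge_fragments(mut frags: Vec<Fragment>, budget: usize) -> String {
    frags.sort_by_key(|f| f.priority);
    let mut used = 0;
    let mut out = Vec::new();
    for f in frags {
        if used + f.tokens > budget {
            break; // trim overflow, keeping higher-priority fragments
        }
        used += f.tokens;
        out.push(f.text);
    }
    out.join("\n")
}
```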

DB Schema (migration v5)

  • context_summaries — cached AI-generated summaries with hash-based change detection
  • skill_proposals — human-in-the-loop skill change approval workflow
  • sessions.summary — conversation summary column for session resume

Self-Evolving Framework

The agent can learn user preferences and propose skill changes, all subject to human approval.

Gateway Routes

All clients communicate via the HTTP+WebSocket gateway at localhost:18981. Routes are grouped by subsystem (105 base + 28 feature-gated = 133 total).

Health (1 route, no auth)

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |

Sessions & Chat (10 routes)

| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create new chat session |
| GET | /sessions | List all sessions |
| GET | /sessions/{id} | Get session details |
| PUT | /sessions/{id} | Update session |
| DELETE | /sessions/{id} | Delete session |
| POST | /sessions/{id}/generate-title | Auto-generate session title via AI |
| GET | /sessions/{id}/messages | Get messages for a session |
| POST | /sessions/{id}/messages | Send message to session |
| DELETE | /sessions/{id}/messages/{message_id}/and-after | Delete message and all after it |

Chat (1 route)

| Method | Path | Description |
|---|---|---|
| POST | /chat | Chat with AI agent |

Memory (5 routes)

| Method | Path | Description |
|---|---|---|
| POST | /memory | Create memory entry |
| GET | /memory | Recall/search memories |
| GET | /memory/{key} | Get memory by key |
| PUT | /memory/{key} | Update memory by key |
| DELETE | /memory/{key} | Delete memory by key |

Configuration (3 routes)

| Method | Path | Description |
|---|---|---|
| GET | /config | Get current configuration (auth token redacted) |
| PUT | /config | Update configuration |
| GET | /config/file | Get raw config file content |

Setup / Onboarding (1 route)

| Method | Path | Description |
|---|---|---|
| GET | /setup/status | Check if first-run setup is needed (missing location/timezone) |

Credentials (5 routes)

| Method | Path | Description |
|---|---|---|
| POST | /credentials | Set a credential (key + value) |
| GET | /credentials | List all credential keys (values hidden) |
| DELETE | /credentials/{key} | Delete a credential |
| GET | /credentials/{key}/value | Get credential value (explicit retrieval) |
| GET | /credentials/{key}/exists | Check if credential exists |

Providers & Models (12 routes)

| Method | Path | Description |
|---|---|---|
| GET | /providers | List all providers |
| POST | /providers | Create user-defined provider |
| GET | /providers/with-key-status | List providers with API key status |
| GET | /providers/default | Get default model |
| PUT | /providers/default | Set default model |
| GET | /providers/{id} | Get provider details |
| PUT | /providers/{id} | Update provider |
| DELETE | /providers/{id} | Delete user-defined provider |
| POST | /providers/{id}/test | Test provider connection (with latency) |
| POST | /providers/{id}/models | Add model to provider |
| DELETE | /providers/{id}/models/{model_id} | Delete model from provider |
| GET | /models | List all available models across providers |

Tools (2 routes)

| Method | Path | Description |
|---|---|---|
| GET | /tools | List available tools |
| POST | /tools/{name}/execute | Execute a tool by name |

Permissions (4 routes)

| Method | Path | Description |
|---|---|---|
| GET | /permissions | List all known surfaces (desktop, cli, tui, telegram, slack, discord) |
| GET | /permissions/{surface} | List tool permissions for a surface |
| PUT | /permissions/{surface}/{tool} | Set a permission override for a tool on a surface |
| DELETE | /permissions/{surface}/{tool} | Remove an override (fall back to risk-level default) |

System (1 route)

| Method | Path | Description |
|---|---|---|
| GET | /system/info | System information |

WebSocket Channels (1 route)

| Path | Description |
|---|---|
| /ws/chat | Streaming chat responses |

Identity (4 routes)

| Method | Path | Description |
|---|---|---|
| GET | /identity | List all identity files |
| GET | /identity/{name} | Get identity file content |
| PUT | /identity/{name} | Update identity file content |
| POST | /identity/reload | Force reload all identity files |

Skills (6 routes)

| Method | Path | Description |
|---|---|---|
| GET | /skills | List all skills (optional ?category= filter) |
| GET | /skills/{id} | Get full skill definition |
| POST | /skills | Create user skill |
| PUT | /skills/{id} | Update skill content |
| DELETE | /skills/{id} | Delete user skill (bundled cannot be deleted) |
| POST | /skills/reload | Force reload all skills |

Skill Proposals (4 routes)

| Method | Path | Description |
|---|---|---|
| GET | /skills/proposals | List pending skill proposals |
| POST | /skills/proposals/{id}/approve | Approve and execute a proposal |
| POST | /skills/proposals/{id}/reject | Reject a proposal |
| DELETE | /skills/proposals/{id} | Delete a proposal |

User Profile + Learning (6 routes)

| Method | Path | Description |
|---|---|---|
| GET | /user/observations | List observations (optional ?category= filter) |
| POST | /user/observations | Add observation |
| GET | /user/observations/{key} | Get observation by key |
| DELETE | /user/observations/{key} | Delete observation by key |
| DELETE | /user/observations | Clear all observations |
| GET | /user/profile | Get computed user context string |

Channels (10 routes, 9 feature-gated)

| Method | Path | Feature | Description |
|---|---|---|---|
| POST | /channels/{name}/test | always | Test channel credentials |
| GET | /channels | channels | List registered channels with status |
| GET | /channels/{name}/status | channels | Get channel status |
| POST | /channels/{name}/send | channels | Send message via channel |
| POST | /channels/{name}/connect | channels | Connect channel |
| POST | /channels/{name}/disconnect | channels | Disconnect channel |
| GET | /channels/{name}/health | channels | Health check |
| POST | /channels/{name}/message | channels | Webhook message endpoint |
| GET | /channels/sessions | channels | List channel sessions |
| GET | /channels/sessions/{id}/messages | channels | List channel session messages |

Scheduler (7 routes, feature-gated)

| Method | Path | Description |
|---|---|---|
| POST | /scheduler/jobs | Create scheduled job |
| GET | /scheduler/jobs | List all jobs |
| PUT | /scheduler/jobs/{id} | Update job |
| DELETE | /scheduler/jobs/{id} | Delete job |
| PUT | /scheduler/jobs/{id}/toggle | Toggle job enabled/disabled |
| GET | /scheduler/jobs/{id}/history | Get job execution history |
| GET | /scheduler/status | Scheduler status |

Embeddings (5 routes)

| Method | Path | Description |
|---|---|---|
| GET | /embeddings/status | Current embedding provider and model info |
| POST | /embeddings/test | Test embedding generation |
| POST | /embeddings/embed | Embed arbitrary text |
| POST | /embeddings/download | Download local embedding model |
| POST | /embeddings/reindex | Re-embed all stored memories |

Plugins (9 routes)

| Method | Path | Description |
|---|---|---|
| GET | /plugins | List all installed plugins |
| POST | /plugins/install | Install plugin from git URL or local path |
| GET | /plugins/available | List available plugins from registry |
| DELETE | /plugins/{name} | Remove installed plugin |
| GET | /plugins/{name} | Get plugin info and manifest |
| PUT | /plugins/{name}/toggle | Enable or disable a plugin |
| POST | /plugins/{name}/update | Update plugin to latest version |
| GET | /plugins/{name}/config | Get plugin configuration |
| PUT | /plugins/{name}/config | Update plugin configuration |

Agent Delegation (2 routes)

| Method | Path | Description |
|---|---|---|
| GET | /agents/active | List active delegation runs |
| POST | /agents/{id}/cancel | Cancel a delegation run |

Approvals (3 routes)

| Method | Path | Description |
|---|---|---|
| GET | /approvals/rules | List approval rules |
| DELETE | /approvals/rules/{id} | Delete an approval rule |
| POST | /approvals/{id}/respond | Respond to a pending approval |

Workflows (10 routes, feature-gated)

| Method | Path | Feature | Description |
|---|---|---|---|
| POST | /workflows | workflows | Create workflow from TOML |
| GET | /workflows | workflows | List all workflows |
| GET | /workflows/{id} | workflows | Get workflow definition |
| PUT | /workflows/{id} | workflows | Update workflow definition |
| DELETE | /workflows/{id} | workflows | Delete workflow |
| GET | /workflows/{id}/raw | workflows | Get raw TOML source |
| POST | /workflows/{id}/run | workflows | Execute workflow |
| POST | /workflows/{id}/cancel | workflows | Cancel running workflow |
| GET | /workflows/{id}/history | workflows | Get run history |
| GET | /workflows/{id}/runs/{run_id} | workflows | Get run details with step results |

WebSocket Endpoints (2 routes)

| Path | Feature | Description |
|---|---|---|
| /ws/chat | always | Streaming chat responses |
| /ws/notifications | always | Push notifications to clients |

API Docs (2 routes, feature-gated)

| Method | Path | Feature | Description |
|---|---|---|---|
| GET | /api-docs | api-docs | Scalar interactive documentation UI |
| GET | /api-docs/openapi.json | api-docs | OpenAPI 3.1 JSON specification |

Desktop App Architecture

The desktop app is a Tauri 2.10 shell wrapping the SvelteKit SPA frontend. It embeds the gateway server by default, so no separate daemon process is required.

Tauri Plugins

| Plugin | Version | Purpose |
|---|---|---|
| tray-icon | built-in | System tray with Show/Hide/Quit menu |
| window-state | 2.4.1 | Persist window size, position, maximized state |
| single-instance | 2.4.0 | Enforce single running instance, focus existing |
| opener | 2.5.3 | Open data directory in OS file manager |
| devtools | 2.0.1 | WebView inspector (feature-gated, dev only) |

IPC Commands

| Command | Description |
|---|---|
| close_to_tray | Hide window to system tray |
| show_window | Show and focus the main window |
| get_app_version | Return app version string |
| open_data_dir | Open Zenii data directory in OS file manager |

Desktop Boot Flow

Hybrid Gateway Architecture

The desktop app supports two gateway modes:

  1. Embedded (default): The gateway server starts in a background Tokio task during setup(). A oneshot channel provides graceful shutdown. This is the zero-configuration path -- users launch the desktop app and everything works.

  2. External: If ZENII_GATEWAY_URL is set to a valid URL, the desktop app connects to an external daemon instead of starting its own gateway. Useful for multi-device setups or when running the daemon as a system service.

Frontend Integration

The frontend detects the Tauri environment via window.__TAURI__ and provides typed wrappers in web/src/lib/tauri.ts:

  • isTauri -- boolean flag for environment detection
  • closeToTray() -- invoke close_to_tray IPC command
  • showWindow() -- invoke show_window IPC command
  • getAppVersion() -- invoke get_app_version IPC command
  • openDataDir() -- invoke open_data_dir IPC command

All wrappers are no-ops when running in a browser (non-Tauri) context, so the same frontend works for both desktop and web.

Frontend i18n

  • paraglide-js v2 for compile-time, type-safe translations
  • 8 locales auto-detected from project.inlang/settings.json (EN, ZH, ES, JA, HI, PT, KO, FR)
  • Locale store (locale.svelte.ts) mirrors theme store pattern
  • messages/{locale}.json flat-key files with _meta_label for native names
  • 577 message keys across 40+ components
  • Language switcher in Settings > General

Scheduler Notification Flow (Stage 8.6.1)

The PayloadExecutor (scheduler/payload_executor.rs) handles 4 payload types dispatched by the scheduler tick loop. The TokioScheduler and AppState have a circular dependency resolved via OnceCell — the scheduler is constructed first, then wired to AppState post-construction via wire().

Key Design Decisions

| Decision | Rationale |
|---|---|
| OnceCell wiring | TokioScheduler needs AppState for agent/channel access, but AppState contains the scheduler — OnceCell breaks the cycle |
| WS /ws/notifications | Dedicated endpoint for push notifications, separate from /ws/chat |
| svelte-sonner toasts | Frontend subscribes to WS notifications and displays via toast library |
| tauri-plugin-notification | Desktop OS-level notifications when app is in tray |
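The wiring pattern can be sketched with std's `OnceLock` (an equivalent of `OnceCell` for this purpose). The stub types below are illustrative stand-ins for `TokioScheduler` and `AppState`, not the real definitions:

```rust
use std::sync::{Arc, OnceLock};

struct AppStateStub {
    name: &'static str, // stand-in for the real AppState contents
}

struct TokioSchedulerStub {
    // Empty at construction time; filled in exactly once after AppState exists.
    app_state: OnceLock<Arc<AppStateStub>>,
}

impl TokioSchedulerStub {
    fn new() -> Self {
        Self { app_state: OnceLock::new() }
    }

    // Called once, post-construction, to break the circular dependency.
    fn wire(&self, state: Arc<AppStateStub>) {
        let _ = self.app_state.set(state);
    }

    fn state_name(&self) -> Option<&'static str> {
        self.app_state.get().map(|s| s.name)
    }
}
```

The scheduler is constructed first (so AppState can own it), and `wire()` injects the back-reference once AppState is fully built.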

Channel Router Pipeline (Stage 8.7)

The ChannelRouter struct orchestrates the full message processing pipeline from inbound channel message to outbound response. It runs as a background task with an mpsc receiver and watch stop signal.

Gateway Integration

| Route | Description |
|---|---|
| POST /channels/{name}/message | Webhook endpoint — injects message into ChannelRouter pipeline |

Frontend: Session Source

Channel-originated sessions carry a source field displayed as a platform badge (Telegram/Slack/Discord icon) in the session list UI.

Channel Lifecycle Hooks (Stage 8.8)

Lifecycle hooks run at key points in the ChannelRouter pipeline. They are best-effort — failures are logged but do not block the pipeline.

| Platform | on_agent_start | on_tool_use | on_agent_complete |
|---|---|---|---|
| Telegram | Send status message | Refresh typing indicator (4s) | Stop typing refresh |
| Slack | Post ephemeral "thinking..." | Update ephemeral message | Delete ephemeral message |
| Discord | Start typing indicator | (no-op) | (typing auto-expires) |

Test Debt and Hardening (Stage 8.9)

Stage 8.9 addressed test coverage gaps and hardened critical modules.

ProcessTool Kill Action

The ProcessTool gained a kill action using sysinfo-based process lookup. Kill requires Full autonomy level — lower autonomy levels are denied with ZeniiError::PolicyDenied.

Context Engine Tests (52 tests)

Comprehensive unit test coverage for:

  • ContextEngine — level determination, compose output, config toggles
  • BootContext — OS/arch/hostname/locale detection
  • Context summaries — hash-based cache invalidation, DB storage/retrieval
  • Tier injection — Full/Minimal/Summary content verification

Agent Tool Loop Tests (5 tests)

Integration tests verifying RigToolAdapter dispatch — agent correctly invokes tools during the chat loop and feeds results back to the LLM.

Agent Action Tools (Phase 8.10)

Four new agent-callable tools give the AI agent direct control over system functions.

Autonomous Reasoning Engine (Phase 8.11)

The ReasoningEngine provides an extensible pipeline for autonomous multi-step agent operation, with per-request tool call deduplication to prevent redundant API calls.

Key components:

  • ReasoningEngine -- orchestrates agent calls with pluggable strategy pipeline
  • ToolCallCache -- per-request DashMap<u64, CachedResult> keyed by hash(tool_name + args_json). Shared across all RigToolAdapters via Arc. Caches both successes and errors. Tracks execution count via AtomicU32. Controlled by tool_dedup_enabled config (default true)
  • ContinuationStrategy -- tool-aware continuation detection. If tool_calls_made > 0, skips the text heuristic entirely (prevents false positives like "Let me tell you about..."). Falls back to planning/refusal language detection only when no tools were called. Respects agent_max_continuations limit (default 1)
  • BootContext -- system environment discovery (OS, arch, hostname, home dir, desktop, downloads, shell, username)

Deduplication defaults

| Config | Default | Range | Description |
|---|---|---|---|
| agent_max_turns | 8 | 1-32 | Max rig-core agentic turns per agent.chat() |
| agent_max_continuations | 1 | 0-5 | Max ReasoningEngine continuation rounds |
| tool_dedup_enabled | true | -- | Enable per-request tool call cache |
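The cache described above can be sketched std-only: key on `hash(tool_name + args_json)`, cache both successes and errors, and count real executions. A plain `HashMap` stands in for the `DashMap` and the struct shape is an assumption:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Cache key: hash of tool name + serialized arguments.
fn cache_key(tool_name: &str, args_json: &str) -> u64 {
    let mut h = DefaultHasher::new();
    tool_name.hash(&mut h);
    args_json.hash(&mut h);
    h.finish()
}

struct ToolCallCacheSketch {
    entries: HashMap<u64, String>, // cached results, successes and errors alike
    executions: u32,               // real executions (AtomicU32 in the original)
}

impl ToolCallCacheSketch {
    fn call(&mut self, tool: &str, args: &str, run: impl FnOnce() -> String) -> String {
        let key = cache_key(tool, args);
        if let Some(hit) = self.entries.get(&key) {
            return hit.clone(); // duplicate within this request: no re-execution
        }
        self.executions += 1;
        let result = run();
        self.entries.insert(key, result.clone());
        result
    }
}
```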

Semantic Memory and Embeddings (Phase 8.11)

Hybrid search combines FTS5 full-text search with vector similarity.

Gateway embedding routes (5):

  • GET /embeddings/status -- current provider and model info
  • POST /embeddings/test -- test embedding generation
  • POST /embeddings/embed -- embed arbitrary text
  • POST /embeddings/download -- download local model
  • POST /embeddings/reindex -- re-embed all stored memories

Memory Quality Improvements

Three quality enhancements applied at the SqliteMemoryStore layer:

BM25 Field Weighting (M5)

FTS5 query uses per-field BM25 weights so key matches rank higher than content or category:

bm25(memories_fts, 2.0, 1.0, 0.5)
-- key=2.0, content=1.0, category=0.5

Configurable via memory_bm25_key_weight, memory_bm25_content_weight, memory_bm25_category_weight.

Temporal Decay (M1)

Recall scores are multiplied by an exponential decay factor based on how many days have passed since the memory was last updated:

final_score = raw_score × exp(-λ × days_since_update)

Default λ=0.01 gives approximately a 70-day half-life. Configurable via memory_decay_enabled (default true) and memory_decay_lambda (default 0.01).
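The formula above is a one-liner in code; note that with λ = 0.01 the half-life is ln(2)/0.01 ≈ 69.3 days, hence the "approximately 70-day" figure:

```rust
/// final_score = raw_score × exp(-λ × days_since_update)
fn decayed_score(raw_score: f64, days_since_update: f64, lambda: f64) -> f64 {
    raw_score * (-lambda * days_since_update).exp()
}
```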

Semantic Deduplication (M4)

Before inserting a new memory, store() checks whether any existing entry has a vector cosine similarity ≥ threshold. If so, the existing entry's content is updated in-place instead of creating a duplicate. Configurable via memory_dedup_enabled (default true) and memory_dedup_threshold (default 0.92). Only active when an embedding provider is configured.

Phase 18 Hardening

Phase 18 addressed 51 issues from two code audits across 8 parallel work streams:

  • ArcSwap config -- runtime config hot-reload via arc_swap::ArcSwap<AppConfig> replacing manual TOML write + reload
  • Security -- CORS origin validation improvements, path traversal protection in file tools
  • Concurrency -- eliminated data races in scheduler, security, and tools modules
  • Channel reliability -- UTF-8 safe message splitting, Slack echo loop prevention
  • Frontend -- svelte-check warnings reduced from 19 to 0
  • CI/CD -- all-features testing added to CI pipeline
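The UTF-8 safe splitting fix can be illustrated with a char-boundary chunker: a naive byte-index split can land inside a multi-byte character and panic or corrupt text, so chunks are built character by character. This is a sketch, not the crate's actual implementation:

```rust
// Split text into chunks of at most max_bytes, never cutting a
// multi-byte UTF-8 character in half (relevant for platform message
// size limits such as Telegram's).
fn split_utf8_safe(text: &str, max_bytes: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        if !current.is_empty() && current.len() + ch.len_utf8() > max_bytes {
            chunks.push(std::mem::take(&mut current));
        }
        current.push(ch);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```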

Workflow Audit Hardening

A whole-app workflow audit addressing security, agent safety, session lifecycle, event bus hygiene, and frontend resilience. 16 new tests added (1,306 total).

Security Hardening

9 additional commands added to BLOCKED_COMMANDS in security/policy.rs: eval, exec, nc, ncat, socat, docker, systemctl, xdg-open, open. Pipe-to-shell patterns (e.g., curl | sh) were already caught by | in INJECTION_PATTERNS.

Agent Execution Safety

Key changes in gateway/handlers/ws.rs:

  • Agent timeout: tokio::time::timeout() with configurable agent_timeout_secs (default 300s)
  • Client disconnect abort: JoinHandle stored, tokio::select! detects WS close and aborts the agent task
  • Tool event lag handling: when tool_rx lags, sends WsOutbound::Warning with count of dropped events
  • DB persistence retry: one retry after 100ms on failure, WsOutbound::Warning sent to client on final failure

Session Lifecycle

SessionManager::cleanup_old_sessions() added in ai/session.rs. Deletes sessions older than session_max_age_days (default 90). Runs automatically on boot during init_services().

Event Bus Cleanup

  • 10 never-published AppEvent variants removed from event_bus/mod.rs: SessionCreated, SessionDeleted, MessageReceived, StreamChunk, StreamDone, ToolExecutionStarted, ToolExecutionCompleted, ProviderChanged, MemoryStored, GatewayStarted
  • Event bus capacity now reads from config.event_bus_capacity (default 256) instead of being hardcoded

Notification Routing

heartbeat_alert field added to NotificationRouting (backend routing.rs + frontend notifications.svelte.ts). Frontend now uses hasTarget("heartbeat_alert", ...) instead of piggybacking on scheduler_job_completed.

Frontend Resilience

  • activeToolCalls array capped at 50 entries in messages.svelte.ts
  • Session store retry replaced with exponential backoff (3 attempts: 1s, 2s, 4s) in sessions.svelte.ts
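The retry schedule above (3 attempts at 1s, 2s, 4s) is a doubling backoff; as a sketch:

```rust
// Delays for a doubling exponential backoff: base, 2×base, 4×base, ...
fn backoff_delays_ms(attempts: u32, base_ms: u64) -> Vec<u64> {
    (0..attempts).map(|i| base_ms << i).collect()
}
```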

Scheduler Validation

add_job() validates start_hour != end_hour in active hours configuration (tokio_scheduler.rs).

New Config Fields

| Field | Type | Default | Description |
|---|---|---|---|
| agent_timeout_secs | u64 | 300 | Maximum seconds for agent execution before timeout |
| event_bus_capacity | usize | 256 | Capacity of the tokio broadcast event bus |
| session_max_age_days | u32 | 90 | Days before old sessions are cleaned up on boot |

Plugin Architecture (Phase 9)

Plugin Lifecycle

  • Discovery: On boot, PluginRegistry scans plugins_dir for installed plugins
  • Registration: Each plugin's tools are wrapped in PluginToolAdapter and registered in ToolRegistry
  • Execution: When a tool is called, PluginProcess spawns the plugin binary, communicates via JSON-RPC 2.0 over stdio
  • Recovery: Crashed plugins are automatically restarted up to plugin_max_restart_attempts times
  • Idle Shutdown: Inactive plugin processes are terminated after plugin_idle_timeout_secs

Client Interfaces

Plugin management is available across all interfaces:

  • CLI: zenii plugin <cmd> (list, install, remove, update, enable, disable, info) -- HTTP calls to gateway
  • Web/Desktop: PluginsSettings.svelte component with full install/remove/enable/disable UI via pluginsStore
  • TUI: PluginList mode (press p from session list) with keybindings: j/k navigate, e toggle enable/disable, d remove, i install, r refresh, Esc back

Plugin Manifest Format (plugin.toml)

[plugin]
name = "weather"
version = "1.0.0"
description = "Weather forecast tool"
author = "example"

[permissions]
network = true
filesystem = false

[[tools]]
name = "get_weather"
binary = "weather-tool"
description = "Get weather for a location"

[[skills]]
name = "weather-prompt"
file = "skills/weather.md"

Context-Driven Auto-Discovery

The context engine automatically detects which feature domains are relevant to the user's message and injects only pertinent context and agent rules.

Domain Detection

Domain-to-Category Mapping

| Domain | Agent Rule Category |
|---|---|
| Channels | channel |
| Scheduler | scheduling |
| Skills / Tools | tool_usage |
| Always included | general |

Key files: ai/context.rs (ContextDomain enum, detect_relevant_domains(), domains_to_rule_categories())
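As a rough sketch of what `detect_relevant_domains()` and `domains_to_rule_categories()` do, here is a hypothetical keyword-based version; the real heuristics in ai/context.rs may differ substantially:

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ContextDomain { Channels, Scheduler, Tools }

// Hypothetical keyword heuristic: which domains does a user message touch?
fn detect_relevant_domains(message: &str) -> Vec<ContextDomain> {
    let msg = message.to_lowercase();
    let mut domains = Vec::new();
    if ["telegram", "slack", "discord", "channel"].iter().any(|k| msg.contains(k)) {
        domains.push(ContextDomain::Channels);
    }
    if ["schedule", "cron", "remind"].iter().any(|k| msg.contains(k)) {
        domains.push(ContextDomain::Scheduler);
    }
    if ["tool", "skill"].iter().any(|k| msg.contains(k)) {
        domains.push(ContextDomain::Tools);
    }
    domains
}

// Map detected domains to agent rule categories; "general" is always included.
fn domains_to_rule_categories(domains: &[ContextDomain]) -> Vec<&'static str> {
    let mut cats = vec!["general"];
    for d in domains {
        cats.push(match d {
            ContextDomain::Channels => "channel",
            ContextDomain::Scheduler => "scheduling",
            ContextDomain::Tools => "tool_usage",
        });
    }
    cats
}
```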


AgentSelfTool

The agent_notes tool allows the agent to learn, recall, and forget behavioral rules that persist across conversations and get auto-injected into context.

Data Model

  • Table: agent_rules (DB migration v10)
  • Schema: id, content, category, created_at, active
  • Categories: general, channel, scheduling, user_preference, tool_usage

Tool Actions

| Action | Description | Required Params |
|---|---|---|
| learn | Create a new behavioral rule | content, optional category |
| rules | List active rules | optional category filter |
| forget | Soft-delete a rule by ID | id |

Integration

Control: Gated by self_evolution_enabled config flag (runtime toggle via Arc<AtomicBool>).

Key file: tools/agent_self_tool.rs


OpenAPI Documentation

Interactive API documentation via Scalar UI, feature-gated behind api-docs.

Stack

  • utoipa -- OpenAPI 3.1 spec generation from Rust handler annotations
  • scalar -- Interactive documentation UI served at /api-docs
  • Feature gate: api-docs (enabled by default in daemon and desktop)

Endpoints

Path                          Description
GET /api-docs                 Scalar interactive UI
GET /api-docs/openapi.json    Raw OpenAPI 3.1 JSON spec

Build

The spec is assembled at runtime from #[utoipa::path] annotations on handler functions. Feature-gated handlers (channels, scheduler) are conditionally merged into the spec.

Key file: gateway/openapi.rs


Onboarding Flow

Multi-step onboarding wizard that collects AI provider setup (provider selection, API key, model), optional channel credentials (Telegram, Slack, Discord), and user profile (name, location, timezone). Available across Desktop, CLI, and TUI interfaces.

SetupStatus

The check_setup_status() function (in onboarding.rs) determines whether onboarding is needed:

  • needs_setup: bool -- true if user_name, user_location, or API key is missing
  • missing: Vec<String> -- list of missing fields (e.g., ["user_name", "api_key"])
  • detected_timezone: Option<String> -- auto-detected IANA timezone via iana-time-zone crate
  • has_usable_model: bool -- true if at least one provider has a stored API key
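The fields above suggest logic along these lines. This is an illustrative reconstruction: the field names match the SetupStatus description, but the function signature and body are assumptions, not the code in onboarding.rs.

```rust
// Illustrative reconstruction of check_setup_status(); the real
// implementation in onboarding.rs reads from config and credentials.
struct SetupStatus {
    needs_setup: bool,
    missing: Vec<String>,
    detected_timezone: Option<String>,
    has_usable_model: bool,
}

fn check_setup_status(
    user_name: Option<&str>,
    user_location: Option<&str>,
    has_api_key: bool,
    detected_timezone: Option<String>,
) -> SetupStatus {
    let mut missing = Vec::new();
    if user_name.is_none() { missing.push("user_name".to_string()); }
    if user_location.is_none() { missing.push("user_location".to_string()); }
    if !has_api_key { missing.push("api_key".to_string()); }
    SetupStatus {
        needs_setup: !missing.is_empty(),
        missing,
        detected_timezone,
        has_usable_model: has_api_key,
    }
}
```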

Desktop (OnboardingWizard)

CLI (Interactive Flow)

The zenii onboard command runs an interactive onboarding:

  1. Fetch providers from GET /providers/with-key-status
  2. User selects provider via dialoguer::Select
  3. Prompt for API key via dialoguer::Password, save to POST /credentials
  4. Refresh providers to get updated models
  5. User selects model, set default via PUT /providers/default
  6. Optional: dialoguer::Confirm to set up a messaging channel (Telegram/Slack/Discord), save credentials to POST /credentials
  7. Prompt for name, location, timezone (auto-detected default)
  8. Save profile to PUT /config

TUI (5-Step Overlay Modal)

Centered ratatui modal (60% x 70%) with step indicator:

  1. ProviderSelect -- list providers, j/k navigate, Enter select
  2. ApiKey -- masked password input, Enter save, Esc back
  3. ModelSelect -- list models for selected provider, Enter select
  4. Channels (optional) -- Tab to switch between Telegram/Slack/Discord, j/k navigate credential fields, Enter save, s to skip
  5. Profile -- three text fields (Name/Location/Timezone), Tab switch, Enter save

Detection

  • Timezone (server): iana-time-zone crate (Rust) -- returned in SetupStatus.detected_timezone
  • Timezone (browser): Intl.DateTimeFormat().resolvedOptions().timeZone -- fallback in AuthGate
  • Location: Manual user input (e.g., "Toronto, Canada")

Config Fields

  • user_name: Option<String> -- display name for greetings
  • user_timezone: Option<String> -- IANA format (e.g., "America/New_York")
  • user_location: Option<String> -- human-readable (e.g., "New York, US")

Key files: onboarding.rs, gateway/handlers/config.rs (setup_status), web/src/lib/components/OnboardingWizard.svelte, web/src/lib/components/AuthGate.svelte, crates/zenii-cli/src/commands/onboard.rs, crates/zenii-tui/src/ui/onboard.rs


LLM-Based Auto Fact Extraction

Automatically extracts structured facts about the user from conversation exchanges and persists them via UserLearner::observe(). Fire-and-forget design -- errors are logged, never propagated to the user.

Flow

Extraction Prompt

The LLM receives the user prompt and assistant response and is asked to extract facts in category|key|value format (one per line). Valid categories: preference, knowledge, context, workflow. If there are no meaningful facts, the LLM outputs NONE.
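Parsing that output format can be sketched as below. The actual parser in ai/context.rs may handle edge cases differently; this version simply drops malformed or unknown-category lines.

```rust
// Sketch of parsing the LLM's `category|key|value` output.
const VALID_CATEGORIES: [&str; 4] = ["preference", "knowledge", "context", "workflow"];

fn parse_facts(llm_output: &str) -> Vec<(String, String, String)> {
    // The sentinel NONE means the LLM found nothing worth persisting.
    if llm_output.trim() == "NONE" {
        return Vec::new();
    }
    llm_output
        .lines()
        .filter_map(|line| {
            let mut parts = line.splitn(3, '|');
            match (parts.next(), parts.next(), parts.next()) {
                (Some(cat), Some(key), Some(val))
                    if VALID_CATEGORIES.contains(&cat.trim()) =>
                {
                    Some((cat.trim().into(), key.trim().into(), val.trim().into()))
                }
                _ => None, // malformed or unknown-category lines are dropped
            }
        })
        .collect()
}
```

Each extracted triple would then be passed to UserLearner::observe(); since the whole path is fire-and-forget, a parse yielding zero facts is simply a no-op.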

Config Fields

Field                        Type    Default        Purpose
context_auto_extract         bool    true           Enable/disable fact extraction
context_extract_interval     usize   3              Extract every N messages
context_summary_provider_id  String  "openai"       LLM provider for extraction
context_summary_model_id     String  "gpt-4o-mini"  LLM model for extraction

Integration Points

  • HTTP chat (gateway/handlers/chat.rs): called after reasoning_engine.chat(), before storing assistant message
  • WebSocket chat (gateway/handlers/ws.rs): called after streaming completes

Key files: ai/context.rs (ContextBuilder::extract_facts), user/learner.rs (UserLearner::observe), config/schema.rs


Tool Permission System (Phase 19)

Per-surface, risk-based tool permission system. Each tool declares a RiskLevel (Low, Medium, High) via the Tool trait. Permissions are resolved hierarchically: per-surface per-tool override > risk-level default.

Risk Level Defaults

Risk Level  Default  Examples
Low         Allowed  web_search, system_info
Medium      Allowed  config, learn, memory, skill_proposal, agent_self, channel_send, scheduler
High        Denied   shell, file_read, file_write, file_list, file_search, patch, process

Surface Overrides

Local surfaces (desktop, cli, tui) override all high-risk tools to Allowed by default. Remote surfaces (telegram, slack, discord) use risk-level defaults -- high-risk tools are denied unless explicitly overridden.

Permission States

State       Behavior
allowed     Tool can execute
denied      Tool is blocked
ask_once    Prompt user once, remember answer (Phase 2)
ask_always  Prompt user every time (Phase 2)
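The hierarchical resolution (per-surface per-tool override first, then the risk-level default) can be sketched as follows. The type and method names echo the real components but the bodies are assumptions, and only the allowed/denied states are modeled.

```rust
use std::collections::HashMap;

// Sketch of hierarchical permission resolution; names are illustrative,
// not the real API in security/permissions.rs.
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Debug)]
enum PermissionState { Allowed, Denied, AskOnce, AskAlways }

#[derive(Clone, Copy)]
enum RiskLevel { Low, Medium, High }

struct PermissionResolver {
    // (surface, tool) -> explicit override, e.g. ("desktop", "shell") -> Allowed
    overrides: HashMap<(String, String), PermissionState>,
}

impl PermissionResolver {
    fn resolve(&self, surface: &str, tool: &str, risk: RiskLevel) -> PermissionState {
        // 1. A per-surface per-tool override always wins.
        if let Some(s) = self.overrides.get(&(surface.to_string(), tool.to_string())) {
            return *s;
        }
        // 2. Otherwise fall back to the risk-level default.
        match risk {
            RiskLevel::Low | RiskLevel::Medium => PermissionState::Allowed,
            RiskLevel::High => PermissionState::Denied,
        }
    }
}
```

Under this model, a local surface like desktop carries overrides that allow high-risk tools, while a remote surface like telegram has none and falls through to the Denied default for shell.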

Resolution Flow

Key Files

File                                                        Purpose
security/permissions.rs                                     ToolPermissions, PermissionResolver, PermissionState
tools/traits.rs                                             risk_level() method on Tool trait
config/schema.rs                                            tool_permissions field in AppConfig
gateway/handlers/permissions.rs                             REST API (4 routes)
web/src/lib/components/settings/PermissionsSettings.svelte  Settings UI
web/src/lib/stores/permissions.svelte.ts                    Frontend store

Model Capability Validation

Pre-agent-dispatch check that prevents tool-calling errors with incompatible models.

Flow

Data

  • Field: ModelInfo.supports_tools: bool (default true)
  • Storage: ai_models.supports_tools column (DB migration v8)
  • API: POST /providers/{id}/models accepts supports_tools flag

Key file: ai/agent.rs (capability check in get_or_build_agent())


Agent Delegation

The delegation system allows the main agent to decompose complex tasks into independent sub-tasks, execute them in parallel via isolated sub-agents, and aggregate the results into a unified response.

Key Components

Component         File                          Description
DelegationConfig  ai/delegation/mod.rs          Config: max sub-agents, token budget, timeout, decomposition model
DelegationTask    ai/delegation/task.rs         Task definition with id, description, tool_allowlist, depends_on
TaskResult        ai/delegation/task.rs         Per-task outcome: status, output, usage, duration, session_id
DelegationResult  ai/delegation/task.rs         Aggregated result: all task results + synthesized response + total usage
TaskStatus        ai/delegation/task.rs         Enum: Pending, Running, Completed, Failed, Cancelled, TimedOut
SubAgent          ai/delegation/sub_agent.rs    Isolated agent with own session, filtered tools, timeout enforcement
Coordinator       ai/delegation/coordinator.rs  Orchestrator: decompose, validate, execute waves, cancel, aggregate

Execution Model

  • Dependency waves: Tasks are partitioned into waves based on depends_on fields. Each wave runs in parallel via JoinSet. Wave N+1 starts only after wave N completes.
  • Isolated sessions: Each sub-agent gets a dedicated session with source: "delegation" for traceability.
  • Tool filtering: Sub-agents can be restricted to a tool allowlist, or inherit the surface's full permission set.
  • Timeout: Per-agent timeout via tokio::time::timeout, configurable via delegation_per_agent_timeout_secs.
  • Cancellation: Coordinator::cancel(id) aborts all sub-agent JoinHandles for a delegation run. cancel_all() aborts everything.
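The wave partitioning can be sketched with plain data: wave 0 holds tasks with no dependencies, and each later wave holds tasks whose depends_on entries are all satisfied by earlier waves. This is an illustrative algorithm, not the Coordinator's actual code, which also performs validation and cycle handling.

```rust
use std::collections::HashSet;

// Partition (task_id, depends_on) pairs into dependency waves.
// Illustrative only; the real Coordinator validates tasks first.
fn partition_into_waves(tasks: &[(String, Vec<String>)]) -> Vec<Vec<String>> {
    let mut done: HashSet<String> = HashSet::new();
    let mut remaining: Vec<(String, Vec<String>)> = tasks.to_vec();
    let mut waves = Vec::new();
    while !remaining.is_empty() {
        // A task is ready when every dependency is already completed.
        let (ready, rest): (Vec<_>, Vec<_>) = remaining
            .into_iter()
            .partition(|(_, deps)| deps.iter().all(|d| done.contains(d)));
        if ready.is_empty() {
            break; // unresolved dependency or cycle: stop rather than spin
        }
        for (id, _) in &ready {
            done.insert(id.clone());
        }
        waves.push(ready.into_iter().map(|(id, _)| id).collect());
        remaining = rest;
    }
    waves
}
```

In the real executor, each wave's tasks would be spawned into a JoinSet and awaited before the next wave starts.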

Config Fields

Field                              Type    Default  Description
delegation_max_sub_agents          usize   4        Maximum sub-tasks per delegation
delegation_per_agent_token_budget  usize   4000     Token budget per sub-agent
delegation_per_agent_timeout_secs  u64     120      Timeout per sub-agent in seconds
delegation_decomposition_model     Option  None     Model override for decomposition LLM call

Gateway Integration

The ChatRequest struct has an optional delegation: Option<bool> field. When true, the chat handler delegates to Coordinator::delegate() instead of the standard agent flow. Two management routes are always available:

  • GET /agents/active -- list active delegation run IDs
  • POST /agents/{id}/cancel -- cancel a delegation run by ID

Delegation System Flow

End-to-end sequence from client WebSocket request through decomposition, parallel execution, and aggregated response. Everything runs on the daemon -- clients are thin renderers of streamed events.

Key points:

  • LLM does the decomposition -- the Coordinator sends a meta-prompt to the configured model asking it to break the task into sub-tasks. The LLM decides how many agents, what each does, and what tools each needs.
  • WebSocket protocol defines 4 message types: delegation_started, agent_progress, agent_completed, delegation_completed -- enabling real-time visualization in all clients.
  • Oneshot channel delivers the final DelegationResult back to the WS handler for the aggregated text response.

Workflow Engine

The workflow engine provides multi-step automation pipelines defined in TOML, with DAG-based execution ordering, template resolution between steps, retry/timeout policies, and DB-persisted run history. Feature-gated behind workflows.

Key Components

Component          File                     Description
Workflow           workflows/definition.rs  Workflow definition: id, name, steps, optional schedule
WorkflowStep       workflows/definition.rs  Step with name, type, depends_on, retry config, failure policy, timeout
StepType           workflows/definition.rs  Enum: Tool, Llm, Condition, Parallel, Delay
FailurePolicy      workflows/definition.rs  Enum: Stop, Continue, Fallback with step reference
RetryConfig        workflows/definition.rs  max_retries + retry_delay_ms
StepOutput         workflows/definition.rs  Per-step result: output string, success, duration, error
WorkflowRun        workflows/definition.rs  Run record: status, step results, timestamps
WorkflowRunStatus  workflows/definition.rs  Enum: Running, Completed, Failed, Cancelled
WorkflowRegistry   workflows/mod.rs         DashMap-backed CRUD + TOML persistence to disk
WorkflowExecutor   workflows/executor.rs    DAG builder, topological execution, DB persistence, retry/timeout
dispatch_step      workflows/runtime.rs     Step type dispatcher with template resolution
resolve            workflows/templates.rs   Minijinja template engine for inter-step data flow

Step Types

Type       Description                                      Template Support
Tool       Execute a registered tool with JSON args         Args are template-resolved
Llm        Send prompt to LLM                               Prompt is template-resolved
Condition  Evaluate expression, branch to if_true/if_false  Expression is template-resolved
Parallel   Meta-step referencing parallel sub-steps         N/A
Delay      Sleep for N seconds                              N/A

Template Resolution

Inter-step data flow uses minijinja templates. Completed step outputs are available via {{ steps.step_name.output }}, {{ steps.step_name.success }}, and {{ steps.step_name.error }}.

Example workflow TOML:

id = "daily-report"
name = "Daily Report"
description = "Fetch news and summarize"

[[steps]]
name = "fetch"
type = "tool"
tool = "web_search"
[steps.args]
query = "latest tech news"

[[steps]]
name = "summarize"
type = "llm"
prompt = "Summarize: {{ steps.fetch.output }}"
depends_on = ["fetch"]
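The inter-step substitution in the example above can be approximated with a toy resolver. The real engine is minijinja and supports full template expressions; this sketch only handles literal {{ steps.NAME.output }} placeholders.

```rust
use std::collections::HashMap;

// Toy stand-in for the minijinja-based resolver in workflows/templates.rs:
// substitutes `{{ steps.NAME.output }}` from completed step outputs.
fn resolve(template: &str, outputs: &HashMap<String, String>) -> String {
    let mut result = template.to_string();
    for (name, output) in outputs {
        let pattern = format!("{{{{ steps.{}.output }}}}", name);
        result = result.replace(&pattern, output);
    }
    result
}
```

For the daily-report workflow, once "fetch" completes, its output would be spliced into the "summarize" step's prompt before the LLM call.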

Failure Policies

Policy             Behavior
Stop (default)     Workflow fails immediately on step failure
Continue           Skip failed step, continue to next
Fallback { step }  Execute named fallback step on failure

Config Fields

Field                       Type    Default  Description
workflow_dir                Option  None     Workflow TOML directory (default: data_dir/workflows)
workflow_max_concurrent     usize   5        Max concurrent workflow runs
workflow_max_steps          usize   50       Max steps per workflow
workflow_step_timeout_secs  u64     300      Default step timeout in seconds
workflow_step_max_retries   u32     3        Default step retry count

DB Schema

  • workflow_runs -- run history: id, workflow_id, workflow_name, status, started_at, completed_at, error
  • workflow_step_results -- per-step results: id, run_id, step_name, output, success, duration_ms, error, executed_at

MCP Integration

Zenii supports the Model Context Protocol (MCP) as both a server (exposing its tools to external AI agents) and a client (consuming tools from external MCP servers).

MCP Server Architecture

Key components:

  • ZeniiMcpServer — implements rmcp::ServerHandler manually (tools are dynamic from ToolRegistry, not static)
  • convert module — bidirectional conversion between Zenii ToolInfo/ToolResult and rmcp Tool/CallToolResult
  • Security enforcement — every call_tool goes through SecurityPolicy::validate_tool_execution()
  • Tool filtering — configurable mcp_server_exposed_tools (allowlist) and mcp_server_hidden_tools (denylist)
  • Tool prefix — all tools exposed with zenii_ prefix (configurable via mcp_server_tool_prefix)
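One plausible reading of the allowlist/denylist/prefix rules is sketched below: an empty allowlist exposes everything, the denylist is subtracted, and exposed names receive the prefix. The precedence between the two lists in the real server is an assumption here.

```rust
// Illustrative sketch of mcp_server_exposed_tools / mcp_server_hidden_tools
// / mcp_server_tool_prefix interaction; actual precedence may differ.
fn exposed_tools(
    all: &[&str],
    allow: &[&str],
    deny: &[&str],
    prefix: &str,
) -> Vec<String> {
    all.iter()
        .copied()
        // empty allowlist = expose all tools
        .filter(|t| allow.is_empty() || allow.contains(t))
        // denylist always hides a tool
        .filter(|t| !deny.contains(t))
        // exposed names get the configured prefix (default "zenii_")
        .map(|t| format!("{}{}", prefix, t))
        .collect()
}
```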

Files:

  • crates/zenii-core/src/mcp/server.rsZeniiMcpServer handler
  • crates/zenii-core/src/mcp/convert.rs — type conversions
  • crates/zenii-mcp-server/src/main.rs — thin binary (~75 lines)

MCP Client Architecture

Key components:

  • McpClientManager — spawns external MCP servers as child processes and connects via stdio (HTTP transport is planned but not yet implemented)
  • Persistent sessions — each server's Peer<RoleClient> is stored and reused; sessions are kept alive in background tasks (no per-call respawn)
  • Timeouts: 15 s connect, 10 s tool discovery, 60 s per tool call
  • McpClientTool — wraps a remote MCP tool as a Zenii Tool (RiskLevel::Medium), forwarding call_tool via the live session and returning ToolResult
  • Tool prefixing — optional tools_prefix per server (e.g., "github/""github/list_repos")
  • Resilient startup — servers that fail to connect are skipped with a warning; the agent still starts
  • Configuration via mcp_client_servers array in config.toml

Config schema:

[[mcp_client_servers]]
id = "github"
tools_prefix = "github/"
enabled = true

[mcp_client_servers.transport]
type = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]

Note: HTTP/SSE transport is defined in the schema but not yet implemented. Attempts to use it return an error.

Session lifecycle:

boot → McpClientManager::connect_all()
├─ spawn child process (stdio)
├─ rmcp handshake (15s timeout)
├─ list_all_tools() (10s timeout)
├─ store Peer<RoleClient> in HashMap
└─ background task keeps session alive indefinitely
call → peer.call_tool() via stored Peer (60s timeout)
shutdown → tokio aborts background tasks → child process exits

Files:

  • crates/zenii-core/src/mcp/client.rsMcpClientManager
  • crates/zenii-core/src/tools/mcp_client_tool.rsMcpClientTool

Settings UI:

  • web/src/lib/components/settings/McpSettings.svelte — MCP tab in Settings with server sub-tab (connection snippets + tool visibility) and clients sub-tab (add/edit/delete/toggle servers with 5 quick-add presets)

A2A Agent Card

The GET /.well-known/agent.json endpoint serves an A2A Agent Card for agent-to-agent discovery. This is a public endpoint (no auth required), served before the auth middleware layer alongside /health.

File: crates/zenii-core/src/gateway/handlers/agent_card.rs

Feature Gates

Feature     What It Enables                              New Deps
mcp-server  ZeniiMcpServer, convert module               rmcp, schemars
mcp-client  McpClientManager, rig-core rmcp integration  rmcp (+ rig-core/rmcp)

Neither feature is in the default set — zero size impact on existing binaries.


Concurrency Rules

These rules are enforced across the entire codebase to prevent async runtime issues.

Rule                                  Rationale
No std::sync::Mutex in async paths    Blocks the tokio runtime; use tokio::sync::Mutex or DashMap
No block_on() anywhere                Panics inside tokio runtime; use tokio::spawn or .await
All SQLite ops via spawn_blocking     rusqlite is synchronous; blocking in async context starves tasks
All errors are ZeniiError             No Result<T, String>; use thiserror enum with typed variants
AppState is Clone + Arc<T>            Shared across axum handlers without lifetime issues
EventBus uses tokio::sync::broadcast  Lock-free fan-out to all subscribers
Never hold async locks across .await  Prevents deadlocks; acquire, use, drop before yielding

Lessons Learned from v1

Key architectural mistakes from Zenii v1 and how v2 prevents them.

v1 Mistake                      v2 Prevention
std::sync::Mutex in async code  tokio::sync::Mutex or DashMap exclusively
block_on() in event loop        Zero block_on() calls; tokio::spawn for sync work
Result<T, String> everywhere    ZeniiError enum with thiserror
Custom AI layer (1400 LOC)      rig-core (battle-tested, 18 providers)
21 Zustand stores               6 Svelte 5 rune stores ($state), single WS connection
165 IPC commands (Tauri v1)     Gateway-only architecture (~40 HTTP routes)
OKLCH color functions in CSS    Pre-computed hex values only
useEffect soup (React)          Single $effect per Svelte component, reactive stores
13-phase boot sequence          Single init_services() in boot.rs