Architecture

Version: 0.1.0

DecisionBox has three services, one database, and a plugin system for extensibility. There are no message queues, caches, or event streams — just MongoDB.

System Overview

┌─────────────────────────────────────────────────────────┐
│                   Dashboard (Next.js 16)                 │
│                  http://localhost:3000                    │
│                                                          │
│   - Project management (create, edit, delete)            │
│   - Discovery results (insights table, recommendations)  │
│   - Live progress (real-time step feed)                  │
│   - Prompt editor (markdown, per-project)                │
│   - Settings (warehouse, LLM, secrets, schedule)         │
│   - Feedback (like/dislike insights + recommendations)   │
│                                                          │
│   All /api/* requests proxied to API via Next.js         │
│   middleware (server-side, API never exposed publicly)    │
└──────────────────────────┬──────────────────────────────┘
                           │ HTTP proxy (runtime, not build-time)
                           ▼
┌─────────────────────────────────────────────────────────┐
│                      API (Go, net/http)                   │
│                  http://localhost:8080                    │
│                                                          │
│   - REST endpoints (projects, discoveries, prompts,      │
│     feedback, pricing, secrets, health)                   │
│   - Spawns agent as subprocess (local) or K8s Job (prod) │
│   - Reads provider metadata for dynamic UI forms         │
│   - Seeds pricing from registered providers              │
│   - No authentication (open-source, internal use)        │
└──────┬──────────────────────────────────────┬───────────┘
       │ exec.Command / K8s Job               │ MongoDB driver
       ▼                                      ▼
┌──────────────────────┐              ┌──────────────────┐
│   Agent (Go binary)  │              │    MongoDB 7+    │
│                      │──────write──▶│                  │
│   Autonomous AI      │              │  Collections:    │
│   data explorer      │              │  - projects      │
│                      │              │  - discoveries   │
│   Components:        │              │  - discovery_runs│
│   - LLM provider     │              │  - feedback      │
│   - Warehouse prov.  │              │  - secrets       │
│   - Domain pack      │              │  - pricing       │
│   - Secret provider  │              │  - project_ctx   │
│   - Prompts          │              │  - debug_logs    │
└──────────┬───────────┘              └──────────────────┘
           │ SQL queries
           ▼
┌──────────────────────┐
│   Data Warehouse     │
│                      │
│   BigQuery           │
│   Amazon Redshift    │
│   (read-only access) │
└──────────────────────┘

Components

Dashboard

The web UI. Built with Next.js 16, React 19, TypeScript, and Mantine 8.

Key design decision: The dashboard proxies all /api/* requests to the API via Next.js middleware. The API is never exposed publicly. This means:

No CORS issues
Single ingress point (only the dashboard needs a public URL)
API URL is a runtime environment variable (API_URL), not baked at build time
One Docker image works across all environments

API

The REST API. Built with Go's standard net/http package (no frameworks). Handles:

Project CRUD — Create, read, update, delete projects
Discovery management — Trigger runs, list results, get status
Agent spawning — Starts the agent as a subprocess or Kubernetes Job
Provider metadata — Returns available LLM/warehouse providers with config field definitions for dynamic UI forms
Prompts — Read/write per-project prompt overrides
Secrets — Per-project encrypted key storage
Feedback — Like/dislike on insights and recommendations
Health — Liveness and readiness probes

The API has no authentication in v0.1.0. It's designed for internal use — the dashboard sits in front of it.
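
To make the framework-free approach concrete, here is a minimal sketch of liveness and readiness probes on a plain `http.ServeMux`. It is an illustration, not the project's actual handler code: the paths `/health/live` and `/health/ready`, the helper names, and the `isReady` callback (standing in for something like a MongoDB ping) are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthStatus maps a readiness flag to an HTTP status code and a JSON payload.
func healthStatus(ready bool) (int, map[string]string) {
	if ready {
		return http.StatusOK, map[string]string{"status": "ok"}
	}
	return http.StatusServiceUnavailable, map[string]string{"status": "unavailable"}
}

// writeJSON sends body as JSON with the given status code.
func writeJSON(w http.ResponseWriter, code int, body any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(body)
}

// newMux wires both probes onto a standard-library mux.
// The paths are hypothetical; isReady stands in for a dependency check.
func newMux(isReady func() bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health/live", func(w http.ResponseWriter, r *http.Request) {
		// Liveness only reports that the process is serving requests.
		writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
	})
	mux.HandleFunc("/health/ready", func(w http.ResponseWriter, r *http.Request) {
		code, body := healthStatus(isReady())
		writeJSON(w, code, body)
	})
	return mux
}

func main() {
	// Exercise the mux in-process instead of binding a port.
	mux := newMux(func() bool { return false })
	for _, path := range []string{"/health/live", "/health/ready"} {
		rec := httptest.NewRecorder()
		mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		fmt.Println(path, rec.Code)
	}
}
```

In a real service the same mux would be passed to `http.ListenAndServe`. Separating the two probes lets an orchestrator restart a hung process without routing traffic to one whose database is unreachable.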

Agent

The autonomous AI data explorer. A standalone Go binary that:

Loads project configuration from MongoDB
Initializes providers (LLM, warehouse, secrets, domain pack)
Discovers warehouse table schemas
Runs autonomous exploration (AI writes SQL, executes, iterates)
Analyzes results per analysis area
Validates insights against warehouse data
Generates recommendations
Saves results to MongoDB
Updates run status throughout

The agent is stateless — it reads everything from MongoDB and the domain pack files. It can run as:

A subprocess spawned by the API (local development, RUNNER_MODE=subprocess)
A Kubernetes Job created by the API (production, RUNNER_MODE=kubernetes)
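
The two run modes suggest a small abstraction selected from `RUNNER_MODE`. The sketch below shows one way this could look. The `Runner` interface, the `./agent` binary path, the `--project-id` flag, and the fallback to subprocess mode when the variable is unset are assumptions made for illustration; the Kubernetes branch is a stub because creating a Job requires a cluster client.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// Runner starts one agent run for a project (hypothetical interface).
type Runner interface {
	Start(ctx context.Context, projectID string) error
}

// subprocessRunner launches the agent binary as a child process.
type subprocessRunner struct{ binary string }

func (r subprocessRunner) Start(ctx context.Context, projectID string) error {
	// The flag name is an assumption; the real agent CLI may differ.
	cmd := exec.CommandContext(ctx, r.binary, "--project-id", projectID)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	// Start returns immediately; production code would also Wait on the
	// process in a goroutine to reap it and record its exit status.
	return cmd.Start()
}

// kubernetesRunner would create a Job per run; stubbed in this sketch.
type kubernetesRunner struct{}

func (kubernetesRunner) Start(ctx context.Context, projectID string) error {
	return errors.New("kubernetes runner is not implemented in this sketch")
}

// newRunner picks an implementation by mode name.
func newRunner(mode string) (Runner, error) {
	switch mode {
	case "", "subprocess": // treating unset as subprocess is an assumption
		return subprocessRunner{binary: "./agent"}, nil
	case "kubernetes":
		return kubernetesRunner{}, nil
	default:
		return nil, fmt.Errorf("unknown RUNNER_MODE %q", mode)
	}
}

func main() {
	runner, err := newRunner(os.Getenv("RUNNER_MODE"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("selected runner: %T\n", runner)
}
```

Because the agent is stateless, the caller only needs to hand over a project identifier; everything else is read from MongoDB by the agent itself.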

MongoDB

The only infrastructure dependency. Stores:

Collection	Purpose
`projects`	Project configuration (name, warehouse, LLM, schedule, profile, prompts)
`discoveries`	Discovery results (insights, recommendations, logs, validation)
`discovery_runs`	Live run status (phase, progress, steps, errors)
`feedback`	User feedback on insights and recommendations
`secrets`	Encrypted per-project secrets (API keys, credentials)
`pricing`	LLM and warehouse pricing configuration
`project_context`	Rolling context (previous insights, patterns)
`discovery_debug_logs`	Detailed debug logs (TTL: 30 days)

All collections and indexes are created automatically on API startup (idempotent).

Plugin Architecture

DecisionBox is built on four plugin systems. Each uses the same pattern: providers register themselves via init() functions, and services select them by name at runtime.

How Registration Works

// In a provider package (e.g., providers/llm/claude/provider.go)
func init() {
    // Runs once at program startup and registers the factory under the name "claude".
    llm.Register("claude", func(cfg llm.ProviderConfig) (llm.Provider, error) {
        return NewClaudeProvider(cfg["api_key"], cfg["model"])
    })
}

// In a service (e.g., services/agent/main.go)
import _ "github.com/decisionbox-io/decisionbox/providers/llm/claude" // blank import: triggers init()

// Later, inside the service's startup code: look up the factory by name and build a provider.
provider, err := llm.NewProvider("claude", llm.ProviderConfig{
    "api_key": "sk-ant-...",
    "model":   "claude-sonnet-4-20250514",
})

Services import provider packages with blank imports (_). The init() function runs at startup and registers the provider factory. The service then creates providers by name.
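
For readers who want to see what sits behind `Register` and `NewProvider`, the following self-contained sketch shows a typical factory registry. It is not the project's `llm` package: the `Provider` interface is reduced to a single method, the `echo` provider is invented for the demo, and panicking on duplicate registration is a design choice assumed here.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// ProviderConfig mirrors the string-keyed config used in the snippet above.
type ProviderConfig map[string]string

// Provider is a stand-in interface with a single method for this sketch.
type Provider interface {
	Name() string
}

// Factory builds a provider from its configuration.
type Factory func(cfg ProviderConfig) (Provider, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register stores a factory under a name; intended to be called from init().
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := factories[name]; dup {
		panic("provider registered twice: " + name)
	}
	factories[name] = f
}

// registered returns the sorted names of all known providers.
func registered() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(factories))
	for name := range factories {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// NewProvider looks up a factory by name and invokes it.
func NewProvider(name string, cfg ProviderConfig) (Provider, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q (registered: %v)", name, registered())
	}
	return f(cfg)
}

// echoProvider is a hypothetical provider used only for this demo.
type echoProvider struct{ model string }

func (p echoProvider) Name() string { return "echo:" + p.model }

func init() {
	Register("echo", func(cfg ProviderConfig) (Provider, error) {
		if cfg["model"] == "" {
			return nil, fmt.Errorf("echo: model is required")
		}
		return echoProvider{model: cfg["model"]}, nil
	})
}

func main() {
	p, err := NewProvider("echo", ProviderConfig{"model": "demo"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name())

	_, err = NewProvider("missing", nil)
	fmt.Println(err)
}
```

The benefit of this pattern is that adding a provider touches only two places: the new package and one blank import in the service that should ship it.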

Four Plugin Types

Plugin	Interface	Purpose	Shipped Implementations
LLM	`llm.Provider`	AI model access	claude, openai, ollama, vertex-ai, bedrock
Warehouse	`warehouse.Provider`	Data warehouse access	bigquery, redshift
Secrets	`secrets.Provider`	Encrypted key storage	mongodb, gcp, aws
Domain Pack	`domainpack.DiscoveryPack`	Domain-specific analysis	gaming (match-3)

For details on implementing each, see:

Data Flow

Discovery Run

1. User clicks "Run discovery" in Dashboard
   ↓
2. Dashboard sends POST /api/v1/projects/{id}/discover
   ↓
3. API creates a run record in MongoDB (status: pending)
   ↓
4. API spawns agent (subprocess or K8s Job)
   ↓
5. Agent loads project config, secrets, prompts from MongoDB
   ↓
6. Agent initializes LLM provider, warehouse provider, domain pack
   ↓
7. Agent discovers warehouse schemas (LIST TABLES, GET SCHEMA)
   ↓
8. Agent runs exploration:
   a. Sends schema + prompt to LLM
   b. LLM generates SQL query
   c. Agent executes query against warehouse
   d. Agent sends results back to LLM
   e. LLM generates next query based on results
   f. Repeat for N steps (default: 100)
   g. Each step written to run record in MongoDB (live progress)
   ↓
9. Agent runs analysis per area:
   a. Loads area-specific prompt (e.g., analysis_churn.md)
   b. Feeds relevant exploration results to LLM
   c. LLM generates insights (JSON)
   d. Agent parses and assigns IDs
   ↓
10. Agent validates insights:
    a. For each insight with affected_count
    b. Generates verification SQL
    c. Executes against warehouse
    d. Compares claimed vs verified count
    ↓
11. Agent generates recommendations:
    a. Feeds all validated insights to LLM
    b. LLM generates recommendations with related_insight_ids
    ↓
12. Agent saves DiscoveryResult to MongoDB
    ↓
13. Agent updates run status to "completed" (or "failed")
    ↓
14. Dashboard polls for status, shows completed results
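
The exploration loop in step 8 can be sketched as follows. This is a simplified illustration under stated assumptions, not the agent's code: the `LLM` and `Warehouse` interfaces, the `Step` type, the `done` signal for early termination, and the choice to feed query errors back to the model are all hypothetical, and the scripted LLM and fake warehouse exist only so the example runs without external services. Context handling and timeouts are omitted for brevity.

```go
package main

import "fmt"

// Step records one exploration iteration (hypothetical shape).
type Step struct {
	SQL    string
	Result string
	Err    string
}

// LLM proposes the next query given the schema and everything seen so far.
type LLM interface {
	NextQuery(schema string, history []Step) (sql string, done bool, err error)
}

// Warehouse executes a read-only query and returns a textual result.
type Warehouse interface {
	Query(sql string) (string, error)
}

// explore runs at most maxSteps iterations and reports each step via onStep,
// which is where a real agent could persist live progress.
func explore(llm LLM, wh Warehouse, schema string, maxSteps int, onStep func(Step)) ([]Step, error) {
	var history []Step
	for i := 0; i < maxSteps; i++ {
		sql, done, err := llm.NextQuery(schema, history)
		if err != nil {
			return history, fmt.Errorf("step %d: %w", i+1, err)
		}
		if done {
			break
		}
		step := Step{SQL: sql}
		if res, err := wh.Query(sql); err != nil {
			step.Err = err.Error() // kept in history so the model can correct itself
		} else {
			step.Result = res
		}
		history = append(history, step)
		if onStep != nil {
			onStep(step)
		}
	}
	return history, nil
}

// scriptedLLM replays a fixed list of queries, then signals completion.
type scriptedLLM struct{ queries []string }

func (s scriptedLLM) NextQuery(_ string, history []Step) (string, bool, error) {
	if len(history) >= len(s.queries) {
		return "", true, nil
	}
	return s.queries[len(history)], false, nil
}

// fakeWarehouse echoes the query instead of touching a database.
type fakeWarehouse struct{}

func (fakeWarehouse) Query(sql string) (string, error) {
	return "rows for: " + sql, nil
}

// demo runs the loop against the fakes with the given step budget.
func demo(maxSteps int) ([]Step, error) {
	llm := scriptedLLM{queries: []string{
		"SELECT COUNT(*) FROM users",
		"SELECT COUNT(*) FROM sessions",
		"SELECT COUNT(*) FROM purchases",
	}}
	return explore(llm, fakeWarehouse{}, "users, sessions, purchases", maxSteps, nil)
}

func main() {
	steps, err := demo(100)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, s := range steps {
		fmt.Printf("%d. %s -> %s\n", i+1, s.SQL, s.Result)
	}
}
```

The step budget bounds both LLM cost and warehouse load, which is why the loop is written around `maxSteps` rather than running until the model decides to stop.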

Prompt Flow

Domain Pack provides template files (.md)
  ↓
Project-level overrides stored in MongoDB (editable via dashboard)
  ↓
Agent loads prompts (project overrides take priority)
  ↓
Agent substitutes template variables:
  {{PROFILE}}          → JSON-encoded project profile
  {{PREVIOUS_CONTEXT}} → Previous discoveries + feedback
  {{SCHEMA_INFO}}      → Discovered table schemas
  {{DATASET}}          → Dataset names
  {{FILTER}}           → WHERE clause for multi-tenant filtering
  {{QUERY_RESULTS}}    → Exploration query results (per area)
  ...
  ↓
Rendered prompt sent to LLM

See Prompts for the full variable reference.
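
The override and substitution steps above can be sketched in a few lines of Go. This is an assumed implementation for illustration only: `pickTemplate` and `renderPrompt` are hypothetical helper names, and the real agent may handle missing variables differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pickTemplate returns the project override when one exists,
// otherwise the domain pack's template.
func pickTemplate(packTemplate, projectOverride string) string {
	if projectOverride != "" {
		return projectOverride
	}
	return packTemplate
}

// renderPrompt replaces each {{NAME}} placeholder with its value.
// Placeholders without a value are left untouched.
func renderPrompt(template string, vars map[string]string) string {
	pairs := make([]string, 0, len(vars)*2)
	for name, value := range vars {
		pairs = append(pairs, "{{"+name+"}}", value)
	}
	// NewReplacer makes a single pass over the template, so placeholder-like
	// text inside a substituted value is not expanded again.
	return strings.NewReplacer(pairs...).Replace(template)
}

func main() {
	profile, err := json.Marshal(map[string]string{"genre": "match-3"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	tmpl := pickTemplate("Profile: {{PROFILE}}\nTables: {{SCHEMA_INFO}}", "")
	fmt.Println(renderPrompt(tmpl, map[string]string{
		"PROFILE":     string(profile),
		"SCHEMA_INFO": "events(user_id, ts)",
	}))
}
```

The single-pass behaviour matters here because values such as query results come from the warehouse and could contain text that looks like a placeholder.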

Deployment Models

Local Development

Dashboard (npm run dev)  →  API (go run .)  →  Agent (subprocess)
                                ↕
                            MongoDB (Docker)

Docker Compose

Dashboard (container)  →  API (container)  →  Agent (subprocess inside API container)
                               ↕
                           MongoDB (container)

Kubernetes (Production)

Dashboard (Deployment)  →  API (Deployment)  →  Agent (K8s Job per discovery)
                                ↕
                           MongoDB (StatefulSet or external)

In Kubernetes mode (RUNNER_MODE=kubernetes), the API creates a K8s Job for each discovery run instead of spawning a subprocess. The agent runs as an isolated container with configurable CPU/memory limits.

Security Model

v0.1.0 (Current)

No authentication — Designed for internal/single-user deployment
API not publicly exposed — Dashboard proxies all requests
Secrets encrypted at rest — AES-256-GCM when using MongoDB provider with SECRET_ENCRYPTION_KEY
Warehouse read-only — Agent only executes SELECT queries
Per-project isolation — Each project has its own secrets, prompts, discoveries
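
As an illustration of what AES-256-GCM encryption at rest involves, here is a minimal sketch using only Go's standard library. It is not the MongoDB secrets provider's code: the function names, the layout that stores the nonce in front of the ciphertext, and the assumption that `SECRET_ENCRYPTION_KEY` decodes to exactly 32 raw bytes are all hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM cipher; the key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext || tag, using a fresh random nonce.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst prepends it to the sealed output.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and verifies the authentication tag.
func decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	// In a deployment the key would come from SECRET_ENCRYPTION_KEY;
	// a random key is generated here so the example is self-contained.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, []byte("warehouse-password"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM is an authenticated mode, so a stored secret that has been modified fails to decrypt instead of yielding corrupted plaintext.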

Future

Authentication (OAuth2 / Auth0)
Multi-user RBAC
API key authentication for external integrations
