Turn every discovery
into training data.
DecisionBox records everything the agent does — SQL queries, schema lookups, validation retries, errors and their fixes, every finding, every feedback thumb you've given. Export it, fine-tune an open model on it, and the next version of the agent knows your warehouse like a senior analyst does.
Everything the agent does is written down
Not just the findings that ship — the full trace. Eight distinct record types, all structured, all exportable.
Discovery runs
Every step of every run — explore, query, analyze, decide — with the agent's reasoning attached.
Insights
Structured findings with severity, indicators, affected counts, and validation state.
Recommendations
Priority, expected impact, target segment, and the numbered action steps behind each one.
Schema exploration
Which tables and columns the agent inspected, in what order, and why it chose them.
SQL queries
Every query executed, with the results it returned and how they were used downstream.
SQL fix history
Errors paired with their warehouse-specific rewrites — a rich corpus for teaching SQL generation.
Debug logs
Agent-internal reasoning per step: hypotheses formed, rejected, refined.
User feedback
Thumbs-up and thumbs-down on insights and recommendations — natural preference data.
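All eight record types export as structured JSON. As a rough illustration, here is what a single run-step record might look like — every field name below is hypothetical, not DecisionBox's actual schema:

```python
import json

# Hypothetical shape of one exported discovery-run step record.
# All field names are illustrative, not DecisionBox's actual schema.
record = {
    "record_type": "discovery_run_step",
    "run_id": "run_0042",
    "step": "query",
    "reasoning": "Order totals look skewed; checking for duplicate rows.",
    "sql": "SELECT order_id, COUNT(*) FROM fact_orders "
           "GROUP BY order_id HAVING COUNT(*) > 1",
    "result_rows": 17,
    "warehouse_type": "snowflake",
}

# One JSON object per line (JSONL) is the usual interchange shape.
line = json.dumps(record)
print(json.loads(line)["step"])  # → query
```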
Not just logs — curated training data
Typical fine-tuning datasets are scraped or synthetic. Yours is neither. Every record DecisionBox exports has been validated against your warehouse, feedback-rated by your team, or produced by an agent run that completed successfully.
Every insight's numbers are re-queried against your warehouse before the insight ships. The training set is confirmed ground truth, not model output.
Thumbs-up and thumbs-down on every insight and recommendation. Your team's preferences are attached — DPO-ready out of the box.
Every SQL query ran against your warehouse. Every fix was confirmed to work. Your model learns what actually executes, not what looks right.
Domain pack, schema, and warehouse type are attached to every record. The model learns your data, not just abstractions.
Training on this is training on confirmed ground truth. Not noise, not hallucinated chatter, not unverified model output.
Your team's thumbs-ups go two places
Into the next discovery
Feedback is part of the agent's context on the next run. It learns which patterns your team cares about, which findings you've already shipped, and which cuts you don't find useful.
Into your training set
Every thumbs-up is a labeled positive example; every thumbs-down, a counter-example. This is natural preference data — exactly what modern fine-tuning methods (DPO, ORPO, KTO) are built to consume.
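In DPO terms, a thumbs-up and a thumbs-down on responses to the same prompt fold into one prompt/chosen/rejected triple. A stdlib-only sketch — the feedback field names are assumptions, not DecisionBox's export schema:

```python
import json

# Two hypothetical feedback-rated insights on the same prompt:
# one thumbs-up, one thumbs-down. Field names are illustrative.
feedback = [
    {"prompt": "Summarize anomalies in fact_orders for last week.",
     "response": "Duplicate order_ids inflated revenue by 3%; dedupe on order_id.",
     "rating": "up"},
    {"prompt": "Summarize anomalies in fact_orders for last week.",
     "response": "Revenue is up. No issues found.",
     "rating": "down"},
]

def to_dpo(pos, neg):
    """Fold one up/down pair into the DPO triple format."""
    assert pos["prompt"] == neg["prompt"]
    return {"prompt": pos["prompt"],
            "chosen": pos["response"],
            "rejected": neg["response"]}

triple = to_dpo(feedback[0], feedback[1])
print(json.dumps(triple, indent=2))
```

The resulting triples can be written straight to JSONL and consumed by any DPO-style trainer.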
Works with the open model you already run
Export is framework-agnostic JSON/Parquet. Any model family that accepts instruction-tuning data works — here's the short list we've validated against.
Anything on HuggingFace that accepts a standard SFT or DPO dataset will work — these are just the families we've seen customers choose.
Two ways to actually fine-tune
Pick the workflow that matches how your team operates.
Train it yourself
Export the dataset as JSON or Parquet. Run your preferred stack — TRL, Unsloth, Axolotl, with LoRA or QLoRA for parameter-efficient tuning — on your own compute. Best if you already have ML infra and a flow your team likes.
- JSON / Parquet export
- SFT + DPO dataset formats
- Your GPUs, your timeline
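The self-serve path boils down to: read the export, map records into the shape your trainer expects, train. A minimal sketch of the SFT side using only the standard library — the record fields and output filename here are assumptions for illustration:

```python
import json

# Hypothetical exported SQL fix-history records (error paired
# with its warehouse-specific rewrite). Field names are illustrative.
export = [
    {"record_type": "sql_fix",
     "failed_sql": "SELECT * FROM fact_orders LIMIT 10 OFFSET",
     "error": "syntax error at end of input",
     "fixed_sql": "SELECT * FROM fact_orders LIMIT 10"},
]

def to_sft(rec):
    """Map one error-fix pair into a prompt/completion SFT example."""
    prompt = (f"The following SQL failed with: {rec['error']}\n"
              f"{rec['failed_sql']}\n"
              f"Rewrite it so it executes.")
    return {"prompt": prompt, "completion": rec["fixed_sql"]}

with open("sft_dataset.jsonl", "w") as f:
    for rec in export:
        if rec["record_type"] == "sql_fix":
            f.write(json.dumps(to_sft(rec)) + "\n")
```

A JSONL file of prompt/completion pairs like this loads into TRL, Unsloth, or Axolotl with little or no extra glue.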
Train it from the UI you already use
Pick a base model, click train. Dataset assembly, LoRA / QLoRA orchestration, evaluation against your held-out feedback, and model versioning — all automated. You still bring the compute; we handle the pipeline.
- One-click dataset build
- LoRA / QLoRA out of the box
- Eval against held-out feedback
- Versioned weights, swappable at runtime
The model that knows your warehouse beats the one that knows the internet
SQL tuned to your schema
Generic models guess your columns. A model fine-tuned on your run history already knows what fact_orders looks like and which joins matter.
Run it on your hardware
A 7B–14B fine-tuned model can match a frontier API for your specific task. Lower latency, lower cost, fewer surprises in the bill.
Training data stays with you
The dataset exports into your infrastructure. The fine-tuning job runs on your compute. The resulting weights are yours.
Self-healing SQL, learned
Every error-fix pair the agent logged becomes training signal. Your model learns your warehouse's quirks without being told.
What we do and don't handle
We capture the data, structure it, and export it. We don't supply compute — you bring your own GPU or cloud. We don't host the fine-tuned model — it's yours to deploy wherever your agent runs (locally via Ollama, vLLM, or a managed endpoint). Until the training tool ships, you'll convert the export to your framework's format — a short adapter script for most stacks.
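The adapter script is typically a few lines: map each exported record into your framework's chat-message format, then let the tokenizer's chat template do the rest. A sketch assuming a generic record shape — the field names are ours, not DecisionBox's:

```python
import json

# Hypothetical exported insight record; field names are illustrative.
record = {
    "question": "Which segment drives churn?",
    "finding": "Monthly-plan users under 3 months tenure churn at 2x baseline.",
}

# Convert to the role/content messages shape most fine-tuning stacks
# accept when applying a tokenizer's chat template.
example = {"messages": [
    {"role": "user", "content": record["question"]},
    {"role": "assistant", "content": record["finding"]},
]}

print(json.dumps(example))
```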