# Analytics & learnings

Two surfaces, one purpose: close the loop between the AI and the humans correcting it.

## Analytics (`/analytics`)

The dashboard for queue health. The views you'll actually use:

### Volume

Items per day, broken down by project, status, and reviewer. Use this to spot:

* Sudden volume spikes (pipeline bug? attack? campaign?)
* Reviewer load imbalance (one person doing 80% of decisions?)
* Project drift (one project consuming all the reviewer time?)
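Reviewer load imbalance can also be checked outside the dashboard. A minimal sketch, assuming a hypothetical list of decision records that each carry a `reviewer_id` (the field name is borrowed from the learning record; the `decisions` list and the function name are illustrative only):

```python
from collections import Counter

def reviewer_load_share(decisions):
    """Return (reviewer_id, share_of_decisions) pairs, largest share first."""
    counts = Counter(d["reviewer_id"] for d in decisions)
    total = sum(counts.values())
    # most_common() is empty when there are no decisions, so no division by zero.
    return [(reviewer, n / total) for reviewer, n in counts.most_common()]

# Hypothetical sample: one reviewer handles 8 of 10 decisions.
decisions = (
    [{"reviewer_id": "alice"}] * 8
    + [{"reviewer_id": "bob"}]
    + [{"reviewer_id": "carol"}]
)
shares = reviewer_load_share(decisions)
print(shares[0])
```

A top share near 0.8 is the "one person doing 80% of decisions" pattern described above.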

### Outcomes

Approval / edit / rejection / escalation rates. A healthy support-reply project might run 70% approve, 20% edit, 8% reject, 2% escalate. A mix far from the one you expect points to model issues or a mistuned policy.
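One way to make "far from your expected mix" concrete is to compare observed rates against a target mix with a tolerance. A rough sketch: the expected mix reuses the example figures above, while the 5-point tolerance and the function name are assumptions, not product defaults.

```python
from collections import Counter

# Example mix from the text above; adjust per project.
EXPECTED_MIX = {"approve": 0.70, "edit": 0.20, "reject": 0.08, "escalate": 0.02}

def outcome_drift(outcomes, expected=EXPECTED_MIX, tolerance=0.05):
    """Return {outcome: observed - expected} for outcomes beyond the tolerance."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    drift = {}
    for outcome, expected_rate in expected.items():
        delta = counts[outcome] / total - expected_rate
        if abs(delta) > tolerance:  # tolerance is an assumed threshold
            drift[outcome] = round(delta, 4)
    return drift

# Hypothetical week: approvals dropped, edits climbed.
outcomes = ["approve"] * 50 + ["edit"] * 35 + ["reject"] * 12 + ["escalate"] * 3
print(outcome_drift(outcomes))
```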

### Time-to-decision

The p50, p90, and p99 latency from item creation to terminal decision. Track this against your SLA. Auto-approved items skew these percentiles low; filter to queued items to get a meaningful number.
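The filtering step matters enough to sketch. The example below assumes hypothetical item records with `created_at`, `decided_at`, and an `auto_approved` boolean; only `created_at` appears in the documented learning record, the other two field names are illustrative.

```python
from datetime import datetime, timedelta
from statistics import quantiles

def decision_latency_percentiles(items):
    """p50/p90/p99 in seconds, computed over human-reviewed (queued) items only."""
    latencies = [
        (item["decided_at"] - item["created_at"]).total_seconds()
        for item in items
        if not item["auto_approved"]  # drop auto-approvals, which skew latency low
    ]
    if len(latencies) < 2:
        return None  # not enough data for percentile cut points
    cuts = quantiles(latencies, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}

# Hypothetical data: 101 queued items (0..100 s) plus 100 instant auto-approvals.
start = datetime(2026, 5, 20, 12, 0, 0)
queued = [
    {"created_at": start, "decided_at": start + timedelta(seconds=s), "auto_approved": False}
    for s in range(101)
]
auto = [
    {"created_at": start, "decided_at": start, "auto_approved": True}
    for _ in range(100)
]
result = decision_latency_percentiles(queued + auto)
print(result)
```

Without the filter, the 100 zero-latency auto-approvals would pull the median to roughly zero and hide the real reviewer wait time.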

### Calibration plot

Override rate by confidence bin. The most important plot in the product. See [Tuning thresholds](/hitl-docs/guides/tuning-thresholds.md) for how to read it.
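For intuition about what the plot aggregates, here is a minimal sketch of override rate per confidence bin. It assumes items carry the documented `confidence` field plus a hypothetical `overridden` boolean; the ten equal-width bins are an assumption, not necessarily what the dashboard uses.

```python
def override_rate_by_bin(items, n_bins=10):
    """Return (bin_low, bin_high, n_items, override_rate) for each non-empty bin."""
    totals = [0] * n_bins
    overrides = [0] * n_bins
    for item in items:
        # Clamp so that confidence == 1.0 lands in the top bin.
        idx = min(int(item["confidence"] * n_bins), n_bins - 1)
        totals[idx] += 1
        overrides[idx] += item["overridden"]  # True counts as 1
    return [
        (i / n_bins, (i + 1) / n_bins, totals[i], overrides[i] / totals[i])
        for i in range(n_bins)
        if totals[i]
    ]

# Hypothetical sample items.
items = (
    [{"confidence": 0.95, "overridden": False}] * 3
    + [{"confidence": 0.95, "overridden": True}]
    + [{"confidence": 1.0, "overridden": False}]
    + [{"confidence": 0.55, "overridden": True}]
    + [{"confidence": 0.55, "overridden": False}]
)
for row in override_rate_by_bin(items):
    print(row)
```

In a well-calibrated project the override rate should fall as the confidence bin rises.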

### Risk-flag effectiveness

For each flag: how often it fires, and how often items with that flag get overridden vs. items without. A flag that fires often but doesn't correlate with overrides is noise — drop it.
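The comparison described above can be sketched as follows, assuming items expose the documented `risk_flags` list and a hypothetical `overridden` boolean (the function name is illustrative):

```python
def flag_effectiveness(items, flag):
    """Compare override rates for items with and without a given risk flag."""
    with_flag = [i for i in items if flag in i["risk_flags"]]
    without_flag = [i for i in items if flag not in i["risk_flags"]]

    def rate(group):
        # None signals "no data" rather than a misleading 0.0.
        return sum(i["overridden"] for i in group) / len(group) if group else None

    return {
        "fire_rate": len(with_flag) / len(items) if items else None,
        "override_rate_with": rate(with_flag),
        "override_rate_without": rate(without_flag),
    }

# Hypothetical sample: the flag fires on 40% of items but overrides are equally
# likely with or without it.
items = (
    [{"risk_flags": ["new_user"], "overridden": True}] * 2
    + [{"risk_flags": ["new_user"], "overridden": False}] * 2
    + [{"risk_flags": [], "overridden": True}] * 3
    + [{"risk_flags": [], "overridden": False}] * 3
)
stats = flag_effectiveness(items, "new_user")
print(stats)
```

When the two override rates are close, as in this sample, the flag adds no signal and is a candidate for removal.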

## Learnings (`/learnings`)

Every reviewer override is a learning. The view lets you:

* **Browse** — most recent first, filterable by project / reviewer / diff kind / risk flag
* **Cluster** — group by reason or by AI suggestion pattern to spot recurring failure modes
* **Tag** — annotate clusters with action items ("update guideline X", "add few-shot Y")
* **Export** — CSV or JSONL from `/export`
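Outside the UI, a crude stand-in for the Cluster view is to count exported learnings by normalized `reason`. This is only an illustration of the idea; it is not how the product clusters, and the function name is hypothetical.

```python
from collections import Counter

def top_override_reasons(learnings, limit=5):
    """Count learnings by lowercased, trimmed reason; most frequent first."""
    reasons = Counter(
        learning["reason"].strip().lower()
        for learning in learnings
        if learning.get("reason")  # skip missing or empty reasons
    )
    return reasons.most_common(limit)

# Hypothetical exported learnings.
learnings = [
    {"reason": "Tone too casual for enterprise customer"},
    {"reason": "tone too casual for enterprise customer "},
    {"reason": "Wrong refund policy cited"},
    {"reason": None},
]
top = top_override_reasons(learnings)
print(top)
```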

### What's in a learning

```json
{
  "id": "...",
  "item_id": "...",
  "project_id": "...",
  "original_suggestion": "...",
  "corrected_decision": "...",
  "diff_kind": "edit",
  "reason": "Tone too casual for enterprise customer",
  "guideline_id": null,
  "reviewer_id": "...",
  "confidence": 0.87,
  "risk_flags": ["new_user"],
  "input": { ... },
  "created_at": "2026-05-20T12:00:00Z"
}
```

The `input` is included to make the learning a self-contained training example — you don't need to join back to `items` to use it.
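Because each record is self-contained, a JSONL export can be turned into training examples with no joins. A minimal sketch using the documented fields; the `chosen` / `rejected` preference-pair shape is an assumed target format, so adapt it to whatever your pipeline expects.

```python
import json

def to_training_example(learning):
    """Map one learning record to a preference pair (assumed target format)."""
    return {
        "input": learning["input"],
        "rejected": learning["original_suggestion"],
        "chosen": learning["corrected_decision"],
    }

def load_examples(jsonl_text):
    """Parse JSONL text, one learning per non-blank line."""
    return [
        to_training_example(json.loads(line))
        for line in jsonl_text.splitlines()
        if line.strip()
    ]

# Hypothetical one-line export.
sample = json.dumps({
    "original_suggestion": "Hey! No worries, we'll sort it.",
    "corrected_decision": "Thank you for reaching out. We will resolve this today.",
    "diff_kind": "edit",
    "input": {"message": "Our invoice is wrong."},
})
examples = load_examples(sample + "\n")
print(examples[0]["chosen"])
```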

## Export (`/export`)

* **CSV** — for spreadsheets and BI tools
* **JSONL** — for ML pipelines, eval suites, and fine-tuning workflows

Exports are project-scoped and respect role permissions. Admins can export everything; reviewers can only export from projects in their pool.
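The permission rule reads as a simple predicate. The sketch below only illustrates the stated rule; the function, the role strings, and the `pool_project_ids` parameter are hypothetical and not part of any documented API.

```python
def can_export(role, project_id, pool_project_ids):
    """Admins may export any project; reviewers only projects in their pool."""
    if role == "admin":
        return True
    if role == "reviewer":
        return project_id in pool_project_ids
    return False  # Assumption: any other role is denied.

print(can_export("reviewer", "proj-a", {"proj-a", "proj-b"}))
print(can_export("reviewer", "proj-c", {"proj-a", "proj-b"}))
```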

## Operating cadence

A loop that nobody looks at is just a queue. Recommended cadence:

* **Daily** — glance at volume + outcomes, scan for anomalies
* **Weekly** — review learnings, cluster overrides, ship one guideline/prompt update
* **Monthly** — re-check calibration plot, prune dead risk flags, re-tune thresholds if needed
* **Quarterly** — export learnings as eval set, re-test current model version against it

The queue should shrink over time. If it doesn't, the loop isn't closing.
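For the quarterly re-test, a rough sketch of scoring a model against exported learnings. `predict` is a hypothetical callable wrapping your current model, and exact-match comparison is an assumed (and deliberately crude) metric; free-text edits usually need a softer comparison.

```python
def agreement_rate(learnings, predict):
    """Fraction of learnings where the model now matches the reviewer's correction."""
    if not learnings:
        return None
    hits = sum(
        predict(learning["input"]) == learning["corrected_decision"]
        for learning in learnings
    )
    return hits / len(learnings)

# Hypothetical eval set and a stand-in model that always approves.
eval_set = [
    {"input": {"message": "a"}, "corrected_decision": "approve"},
    {"input": {"message": "b"}, "corrected_decision": "reject"},
    {"input": {"message": "c"}, "corrected_decision": "approve"},
    {"input": {"message": "d"}, "corrected_decision": "approve"},
]
rate = agreement_rate(eval_set, lambda _input: "approve")
print(rate)
```

A rate that rises from one quarter to the next suggests the guideline and prompt updates are absorbing past corrections.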


