
Section 4: Audit & Observability

Status: Draft v0.1
Last updated: 2026-03-25


4.1 Audit Philosophy

Cappella's audit model rests on two rules: every write to Hippo carries provenance, and every Cappella operation that changes state produces a structured log entry. The audit trail is not a separate system; it is the Hippo provenance event model, extended with Cappella-specific context.

In v0.1, operational audit data (sync run results, resolution run results, reconciliation findings) is written to structured JSON logs. In v0.2, these records become Hippo entities so that they can be queried through the standard HippoClient API.


4.2 Provenance Context

Every entity write made by Cappella carries a structured context in the Hippo provenance event:

{
  "cappella_version": "0.1.0",
  "source_system": "starlims",
  "adapter": "starlims",
  "adapter_version": "1.0.0",
  "run_id": "uuid-run-123",
  "trigger": "nightly_starlims_sync",
  "fetched_at": "2026-03-25T02:00:01Z",
  "trust_level": 80
}
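
As an illustration of how such a context might be assembled, here is a minimal Python sketch. The `ProvenanceContext` dataclass and the `build_context` helper are hypothetical names, not part of Cappella; only the field names and the timestamp format are taken from the example above. How the context is attached to a Hippo write is not shown, because this section does not specify that API.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceContext:
    """Hypothetical container mirroring the fields of the example context."""
    cappella_version: str
    source_system: str
    adapter: str
    adapter_version: str
    run_id: str
    trigger: str
    fetched_at: str  # ISO 8601 in UTC with a trailing "Z"
    trust_level: int


def build_context(cappella_version, source_system, adapter, adapter_version,
                  run_id, trigger, fetched_at, trust_level):
    """Return the context as a plain dict, ready for JSON serialization."""
    if fetched_at.tzinfo is None:
        # Refuse naive datetimes so the "Z" suffix is never a guess.
        raise ValueError("fetched_at must be timezone-aware")
    stamp = fetched_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return asdict(ProvenanceContext(
        cappella_version=cappella_version,
        source_system=source_system,
        adapter=adapter,
        adapter_version=adapter_version,
        run_id=run_id,
        trigger=trigger,
        fetched_at=stamp,
        trust_level=trust_level,
    ))


context = build_context(
    "0.1.0", "starlims", "starlims", "1.0.0", "uuid-run-123",
    "nightly_starlims_sync",
    datetime(2026, 3, 25, 2, 0, 1, tzinfo=timezone.utc), 80,
)
```

Building the context once per run and reusing it for every write in that run would keep `run_id` and `trigger` consistent across the resulting provenance events.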

For collection resolution runs, artifact writes made by Canon carry Canon's own provenance context. Cappella adds no context of its own to those writes; Canon is responsible for its own provenance.


4.3 Structured Log Events

Cappella emits a structured JSON log event to stdout (or a configured log sink) for each of the operations listed below. The events are machine-parseable and can be ingested by log aggregation tools such as Datadog, CloudWatch, or ELK.

Event Types

Event | When emitted
adapter_run_started | Sync run begins
adapter_run_completed | Sync run ends (success, partial, failure)
record_transform_failed | A record could not be transformed
record_upsert_conflict | Conflict detected during upsert
resolution_run_started | Collection resolution begins
resolution_run_completed | Collection resolution ends
canon_resolve_failed | Canon returned an error for a sample
reconciliation_started | Reconciliation run begins
reconciliation_finding | A discrepancy was detected
trigger_fired | A trigger executed
trigger_failed | A trigger encountered an error
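
A minimal sketch of an emitter for these events, in Python. The `format_event` and `emit` helpers are hypothetical and are not Cappella's actual logging code; the sketch also assumes one compact JSON object per line, a common convention for log sinks that this section does not itself mandate.

```python
import json
import sys
from datetime import datetime, timezone

# Event names copied from the table above.
EVENT_TYPES = frozenset({
    "adapter_run_started", "adapter_run_completed",
    "record_transform_failed", "record_upsert_conflict",
    "resolution_run_started", "resolution_run_completed",
    "canon_resolve_failed", "reconciliation_started",
    "reconciliation_finding", "trigger_fired", "trigger_failed",
})


def format_event(event, timestamp=None, **fields):
    """Serialize one log event as a single-line JSON string."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event}")
    # Default to the current time; pass a timezone-aware datetime to override.
    moment = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "event": event,
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        **fields,
    }
    return json.dumps(record, separators=(",", ":"))


def emit(event, stream=sys.stdout, **fields):
    """Write one event line to the stream (stdout by default) and flush."""
    stream.write(format_event(event, **fields) + "\n")
    stream.flush()


emit("trigger_fired", trigger="nightly_starlims_sync")
```

Rejecting unknown event names at the call site keeps the set of emitted events aligned with the table, which matters if downstream dashboards filter on the `event` field.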

Example: adapter_run_completed

{
  "event": "adapter_run_completed",
  "timestamp": "2026-03-25T02:00:47Z",
  "run_id": "uuid-run-123",
  "adapter": "starlims",
  "trigger": "nightly_starlims_sync",
  "mode": "incremental",
  "since": "2026-03-24T02:00:00Z",
  "fetched": 150,
  "transformed": 149,
  "upserted": 23,
  "skipped_identical": 126,
  "failed_transform": 1,
  "conflicts_detected": 2,
  "duration_seconds": 46.2,
  "status": "partial_success"
}
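
The counters in this example satisfy two identities: fetched = transformed + failed_transform (150 = 149 + 1) and transformed = upserted + skipped_identical (149 = 23 + 126). Treating these as invariants is an assumption drawn from the example rather than a documented guarantee, but under that assumption a consumer could sanity-check events with a sketch like the following. The `count_mismatches` function is a hypothetical helper.

```python
def count_mismatches(event):
    """Return a list of counter identities that the event violates.

    Both identities are inferred from the example event; they are an
    assumption, not a rule stated by the spec.
    """
    problems = []
    if event["fetched"] != event["transformed"] + event["failed_transform"]:
        problems.append("fetched != transformed + failed_transform")
    if event["transformed"] != event["upserted"] + event["skipped_identical"]:
        problems.append("transformed != upserted + skipped_identical")
    return problems


example = {
    "fetched": 150,
    "transformed": 149,
    "upserted": 23,
    "skipped_identical": 126,
    "failed_transform": 1,
}
mismatches = count_mismatches(example)  # empty list for the example above
```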

Example: reconciliation_finding

{
  "event": "reconciliation_finding",
  "timestamp": "2026-03-25T03:00:12Z",
  "finding_id": "uuid-finding-456",
  "check": "field_conflict",
  "entity_type": "Donor",
  "entity_id": "uuid-donor-789",
  "field": "diagnosis",
  "source_a": {"system": "starlims", "value": "CTE", "updated_at": "2026-03-24T..."},
  "source_b": {"system": "redcap", "value": "Probable CTE", "updated_at": "2026-03-25T..."},
  "severity": "warning",
  "suggested_action": "manual_review"
}
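
Because findings exist only as log events in v0.1, reviewing them means filtering the log stream. The sketch below shows one way to do that in Python, assuming each event is written as a single JSON object per line (the examples in this section are pretty-printed for readability). The `iter_findings` function is a hypothetical helper.

```python
import json


def iter_findings(lines, severity=None):
    """Yield reconciliation_finding events from an iterable of log lines.

    Lines that are blank, not valid JSON, or not JSON objects are skipped,
    since stdout may carry other output alongside structured events.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        if record.get("event") != "reconciliation_finding":
            continue
        if severity is not None and record.get("severity") != severity:
            continue
        yield record


sample_log = [
    '{"event": "reconciliation_started", "timestamp": "2026-03-25T03:00:00Z"}',
    'plain text noise',
    '{"event": "reconciliation_finding", "finding_id": "uuid-finding-456", '
    '"check": "field_conflict", "field": "diagnosis", "severity": "warning"}',
]
warnings = list(iter_findings(sample_log, severity="warning"))
```

In practice the `lines` argument could be an open log file or `sys.stdin`, so the same function works in a pipeline behind the Cappella process.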

4.4 HarmonizationConflict Events

When validate() returns errors or a field conflict is detected during upsert, Cappella records a HarmonizationConflict provenance event on the affected Hippo entity. The event is queryable through the standard Hippo provenance API.

{
  "event_type": "HarmonizationConflict",
  "entity_id": "uuid-donor-789",
  "conflict_type": "field_conflict",
  "field": "diagnosis",
  "existing_value": "CTE",
  "incoming_value": "Probable CTE",
  "incoming_source": "redcap",
  "resolution": "existing_wins",
  "resolution_reason": "starlims (trust=80) > redcap (trust=60)",
  "cappella_run_id": "uuid-run-123"
}

This creates a permanent, queryable audit trail of every conflict and how it was resolved.
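
The `resolution_reason` in the example suggests that the source with the higher trust level wins. The Python sketch below builds an event of this shape under that reading. The `resolve_field_conflict` function, the input dict layout, the `incoming_wins` label, and the rule that ties keep the existing value are all assumptions made for illustration; only `existing_wins` and the event fields appear in this section.

```python
def resolve_field_conflict(entity_id, field, existing, incoming, run_id):
    """Build a HarmonizationConflict event for two competing field values.

    `existing` and `incoming` are dicts with "source", "value", and "trust"
    keys (a hypothetical layout chosen for this sketch).
    """
    if incoming["trust"] > existing["trust"]:
        resolution = "incoming_wins"  # hypothetical label, not in the spec
        winner, loser = incoming, existing
    else:
        resolution = "existing_wins"  # assumption: ties keep the existing value
        winner, loser = existing, incoming
    relation = ">" if winner["trust"] > loser["trust"] else "="
    reason = (
        f"{winner['source']} (trust={winner['trust']}) {relation} "
        f"{loser['source']} (trust={loser['trust']})"
    )
    return {
        "event_type": "HarmonizationConflict",
        "entity_id": entity_id,
        "conflict_type": "field_conflict",
        "field": field,
        "existing_value": existing["value"],
        "incoming_value": incoming["value"],
        "incoming_source": incoming["source"],
        "resolution": resolution,
        "resolution_reason": reason,
        "cappella_run_id": run_id,
    }


conflict = resolve_field_conflict(
    "uuid-donor-789",
    "diagnosis",
    existing={"source": "starlims", "value": "CTE", "trust": 80},
    incoming={"source": "redcap", "value": "Probable CTE", "trust": 60},
    run_id="uuid-run-123",
)
```

With the inputs shown, the returned dict matches the example event field for field.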


4.5 ReconciliationFinding Entity (v0.2)

Opinion (mark for review): In v0.1, reconciliation findings are structured log events. In v0.2, they become ReconciliationFinding Hippo entities, queryable via HippoClient and surfaced in Aperture as a "data quality" view. The log structure in §4.3 is designed to map directly to the future entity schema, so migrating from logs to entities should be straightforward.


4.6 Health Endpoint

GET /status returns Cappella's operational health:

{
  "cappella_version": "0.1.0",
  "hippo": {"status": "ok", "version": "0.3.1", "url": "http://localhost:8000"},
  "canon": {"status": "ok", "version": "0.2.0", "url": "http://localhost:8001"},
  "adapters": {
    "starlims": {"status": "ok", "last_sync": "2026-03-25T02:00:47Z"},
    "halo": {"status": "stub", "last_sync": null},
    "manual": {"status": "ok", "last_sync": null}
  },
  "triggers": {
    "nightly_starlims_sync": {"status": "scheduled", "next_run": "2026-03-26T02:00:00Z"}
  }
}
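
A monitoring script might reduce this payload to a short summary. The Python sketch below is one way to do so. The `fetch_status` and `summarize_status` helpers are hypothetical, the base URL passed to `fetch_status` is whatever address Cappella is deployed at (not stated in this section), and the rule that overall health depends only on Hippo and Canon is an assumption.

```python
import json
import urllib.request


def fetch_status(base_url, timeout=5.0):
    """GET {base_url}/status and decode the JSON body (not called below)."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
        return json.load(resp)


def summarize_status(status):
    """Condense a /status payload into the facts an operator checks first."""
    # Assumption: Cappella is healthy only when both dependencies report "ok".
    dependencies_down = [
        name for name in ("hippo", "canon")
        if status.get(name, {}).get("status") != "ok"
    ]
    adapters = status.get("adapters", {})
    return {
        "healthy": not dependencies_down,
        "dependencies_down": dependencies_down,
        "adapters_not_ok": sorted(
            name for name, info in adapters.items()
            if info.get("status") != "ok"
        ),
        "never_synced": sorted(
            name for name, info in adapters.items()
            if info.get("last_sync") is None
        ),
    }


payload = json.loads("""
{
  "cappella_version": "0.1.0",
  "hippo": {"status": "ok", "version": "0.3.1", "url": "http://localhost:8000"},
  "canon": {"status": "ok", "version": "0.2.0", "url": "http://localhost:8001"},
  "adapters": {
    "starlims": {"status": "ok", "last_sync": "2026-03-25T02:00:47Z"},
    "halo": {"status": "stub", "last_sync": null},
    "manual": {"status": "ok", "last_sync": null}
  }
}
""")
summary = summarize_status(payload)
```

For the example payload, the summary reports a healthy service, flags `halo` as the only adapter whose status is not `ok`, and lists `halo` and `manual` as never synced.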