# 7. Auditability and Observability

Audit is a product feature here, not an ops afterthought. The single-egress invariant (§2.1) makes the log *complete*, because every action passes through one point. This section makes it *trustworthy* (tamper-evident), *explainable* (provenance back to the exact model inputs), and *usable* (the owner UI and the auditor export serve different consumers with different requirements).

#### 7.1 Immutable action log

**Schema.** One entry per gateway transit; entries are append-only and cover actions, approval decisions (§6.3), configuration events (threshold overrides, delegation grants — §6.2/6.3), vault accesses (§4.3), and compensations (§5.5). Abridged entry:

```json
{
  "seq": 84210,
  "ts": "2026-07-02T14:31:07.412Z",
  "tenant": "t_481",
  "kind": "action",                          // action | approval | config | vault_access | compensation
  "action": {
    "tool": "gmail.message.send",
    "risk_class": "external_communication",
    "instance_score": 0.71,
    "idempotency_key": "t_481:inv_fu#2026-07-02:send:QB-10442:2",
    "args_hash": "sha256:9f2c…",
    "plan_ref": "sha256:b41a…",              // content-addressed plan (§5.2)
    "trigger": { "type": "webhook", "vendor_event_id": "qbo:evt:7731…" }
  },
  "decision": { "path": "approved", "approval_ref": 84195 },   // links the approval entry
  "dispatch": {
    "vendor": "google", "status": 200,
    "request_digest": "sha256:…", "response_digest": "sha256:…",
    "payload_ref": "pstore://t_481/84210"    // full bodies in payload store, own retention class
  },
  "cost": { "llm": null, "vendor_quota": {"gmail_units": 100} },
  "prev": "sha256:e08d…",
  "entry_hash": "sha256:44c1…"               // H(prev || canonical(entry))
}
```

**Hash chaining.** `entry_hash = H(prev_hash ‖ canonical_serialization(entry))` per tenant chain, where the serialization necessarily excludes the `entry_hash` field itself. Periodic signed checkpoints follow the transparency-log construction (RFC 6962 \[1]), with Merkle-tree batching over entries, as in Trillian-class implementations, so that inclusion proofs cost O(log n) rather than a full-chain replay. Checkpoint heads are additionally anchored daily to an external write-once store outside the primary trust domain: tamper-evidence against an attacker who controls the datastore requires the verifier to hold a head the attacker cannot rewrite. The reason for chaining at all is that the log is the substantiation artifact for §6.4's "the human decided before the effect" and for §12.2's FTC claims; an ordinary mutable table proves nothing to a party that does not already trust Fridays' operators.
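
The chaining rule can be sketched in a few lines. This is a minimal illustration, not the as-built log: it assumes SHA-256 for `H`, sorted-key JSON as a stand-in for the canonical serialization (a production log would need a fully specified canonical form such as RFC 8785), and an all-zero genesis value. The helper names `append` and `first_break` are hypothetical.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # assumed starting value for an empty chain


def canonical(entry: dict) -> bytes:
    """Deterministic bytes for hashing: sorted keys, no whitespace.
    entry_hash is excluded, since an entry cannot hash its own hash."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    return json.dumps(body, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def entry_hash(prev: str, entry: dict) -> str:
    """H(prev_hash || canonical_serialization(entry)), with H = SHA-256."""
    digest = hashlib.sha256(prev.encode("utf-8") + canonical(entry)).hexdigest()
    return "sha256:" + digest


def append(chain: list, fields: dict) -> dict:
    """Append one entry, linking it to the current head."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    entry = dict(fields, seq=len(chain), prev=prev)
    entry["entry_hash"] = entry_hash(prev, entry)
    chain.append(entry)
    return entry


def first_break(chain: list, trusted_head: str):
    """Replay the chain. Return the seq of the first inconsistent entry,
    len(chain) if the chain is self-consistent but ends at a head other
    than the one the verifier holds, or None if everything checks out."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["entry_hash"] != entry_hash(prev, entry):
            return entry["seq"]
        prev = entry["entry_hash"]
    return None if prev == trusted_head else len(chain)
```

The `trusted_head` argument is the part that matters against a datastore-level attacker: a chain rewritten end to end replays cleanly on its own and is only caught by comparison with a head held outside the attacker's reach.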

**Storage and retention.** Entries land in WORM-configured object storage (S3 Object-Lock-class, compliance mode — the same mechanism used for SEC 17a-4-style recordkeeping \[2]); default retention 7 years for `money_movement`-touching chains (aligned with common tax-record horizons), shorter classes per Appendix C.

**The GDPR tension, solved structurally.** Immutable log vs. Art. 17 erasure (§9.4) is the classic conflict. Resolution: entry *structure* (hashes, classes, timestamps, decisions) is chained; entry *payloads* (args, bodies) live in the payload store encrypted under per-tenant DEKs (§4.3), and the chain hashes cover the **ciphertext**. Tenant deletion crypto-shreds the DEK: payloads become unrecoverable, while chain integrity — and the ability to prove *that* an approval preceded *that* dispatch — survives. Deletion and tamper-evidence stop being enemies.
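
A small sketch of why hashing the ciphertext lets deletion and integrity coexist. Everything here is illustrative: `PayloadStore` and its methods are hypothetical names, and the SHAKE-256 keystream XOR is a toy stand-in for a real authenticated cipher (for example AES-GCM under the per-tenant DEK). It is not suitable for protecting data.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHAKE-256 keystream XOR). Stand-in only;
    NOT a substitute for an authenticated cipher."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()


class PayloadStore:
    """Hypothetical payload store: ciphertext by reference, one DEK per tenant."""

    def __init__(self):
        self.deks = {}   # tenant -> data-encryption key
        self.blobs = {}  # payload_ref -> (nonce, ciphertext)

    def put(self, tenant: str, ref: str, plaintext: bytes) -> str:
        dek = self.deks.setdefault(tenant, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        self.blobs[ref] = (nonce, keystream_xor(dek, nonce, plaintext))
        return self.stored_digest(ref)  # this digest goes into the chain entry

    def get(self, tenant: str, ref: str) -> bytes:
        nonce, ciphertext = self.blobs[ref]
        # Raises KeyError once the tenant's DEK has been shredded.
        return keystream_xor(self.deks[tenant], nonce, ciphertext)

    def stored_digest(self, ref: str) -> str:
        """Digest over the stored ciphertext; needs no key to recompute."""
        nonce, ciphertext = self.blobs[ref]
        return digest(nonce + ciphertext)

    def shred(self, tenant: str) -> None:
        """Crypto-shred: destroy the key, leave the ciphertext in place."""
        del self.deks[tenant]
```

After `shred`, `stored_digest` still matches the value recorded in the chain, so a verifier can confirm the entry is intact, while `get` can no longer recover the plaintext.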

#### 7.2 Per-action provenance

Every action reconstructs to its full causal chain, each link content-addressed:

* **Trigger** — verified webhook event ID or scheduler tick (§3.4), so "why did this run at all" has one answer.
* **Model inputs, exactly as seen.** The prompt archive stores the planner's assembled context — playbook version, tenant-snapshot projection, the *pseudonymized* form of untrusted content (§5.7) — keyed by hash and referenced from the plan. "Exactly as seen" matters: post-hoc reconstruction from live systems is unreliable (the invoice has since been paid; the cache has rolled). Reproducing a bad plan requires the inputs at plan time, not the world at investigation time.
* **Plan** — the content-addressed DAG (§5.2); `plan_ref` in every child action.
* **Model calls** — provider, model ID and version string, sampling params, token counts, response hash. Model version pinning in provenance is what makes §11's regression analysis possible when a provider silently updates a snapshot.
* **Dispatch** — request/response digests inline, full bodies in the payload store. Digest-inline/body-by-reference keeps chain entries small and lets payload retention (and crypto-shredding) differ from chain retention.
* **Human decision** — approval/denial entry with deciding identity, device, and the approved record hash (byte-identity, §6.3).

Worked use: owner disputes "Fridays sent my customer the wrong amount." Resolution path is mechanical — approval entry shows the card hash the owner tapped; args under that hash show the amount; plan shows which tool read produced it; the read's response digest shows what QuickBooks returned. The defect localizes to (a) vendor data, (b) extraction (model), or (c) owner approved it — in minutes, from records, with no log-grepping archaeology.
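
The dispute walk above can be sketched as a lookup over content-addressed records. This is a simplified illustration under stated assumptions: the field `approved_hash`, the flat `store` dict, and the function `trace_amount` are hypothetical, the plan lookup is skipped by passing in the read entry the plan would name, and the content address is SHA-256 over sorted-key JSON.

```python
import hashlib
import json


def address(obj) -> str:
    """Content address: SHA-256 over sorted-key JSON (assumed canonical form)."""
    raw = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def trace_amount(action: dict, approval: dict, read: dict, store: dict) -> dict:
    """Walk approval -> dispatched args -> source read and report the evidence.

    `store` maps content addresses to archived objects; `read` is the tool-read
    entry that the plan identifies as the source of the amount.
    """
    args = store[action["args_hash"]]
    vendor_body = store[read["response_digest"]]
    checks = {
        # The bytes the owner tapped are the bytes that were dispatched.
        "approved_matches_dispatch": approval["approved_hash"] == action["args_hash"],
        # Archived objects still hash to the addresses recorded in the chain.
        "args_match_hash": address(args) == action["args_hash"],
        "vendor_body_matches_digest": address(vendor_body) == read["response_digest"],
    }
    if not all(checks.values()):
        return {"verdict": "integrity_failure", "checks": checks}
    if args["amount"] != vendor_body["amount"]:
        verdict = "extraction"  # the amount changed between read and dispatch
    else:
        verdict = "vendor_data_or_owner_decision"  # system relayed what it read
    return {"verdict": verdict, "sent": args["amount"],
            "vendor_returned": vendor_body["amount"], "checks": checks}
```

Separating the last two cases (the vendor returned a wrong figure, or the owner approved a correct one and later disagreed) needs a ground-truth amount from outside the records, which is why the sketch reports them together.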

#### 7.3 Owner-facing audit vs. compliance export

Two consumers, deliberately different surfaces:

* **Owner UI:** human-readable timeline per playbook/counterparty ("Reminder #2 sent to Acme — approved by you, Jul 2, 2:31 pm"), one tap from any outcome to its provenance chain rendered legibly. Design target is comprehension, not completeness — raw hashes are behind a disclosure, not in the default view.
* **Compliance export:** NDJSON of raw chain entries plus checkpoint signatures, consumed by an **open-source verifier** that checks chain continuity, checkpoint signatures, and inclusion proofs against the externally anchored heads. The point of open-sourcing the verifier is that an auditor (a SOC 2 assessor per §8.6, a tenant's accountant, or opposing counsel) can validate integrity without trusting any Fridays-rendered surface. A verification story that terminates at "our dashboard says so" is not a verification story.

Access to the export is itself a logged event, and auditor access is scoped per tenant (partitioning, §8.3).
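
One of the verifier's three checks, the inclusion proof, can be sketched directly from the Merkle tree definitions in RFC 6962 \[1] (leaf prefix `0x00`, node prefix `0x01`, split at the largest power of two below the tree size). This is an illustrative sketch, not the planned verifier: function names are hypothetical, and checkpoint-signature and chain-continuity checks are out of scope here.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split(n: int) -> int:
    """Largest power of two strictly less than n (n >= 2)."""
    return 1 << ((n - 1).bit_length() - 1)


def merkle_root(leaves: list) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: list) -> list:
    """Sibling hashes from the leaf up to the root (log side)."""
    if len(leaves) == 1:
        return []
    k = split(len(leaves))
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def root_from_path(leaf: bytes, index: int, size: int, path: list) -> bytes:
    """Recompute the root implied by a leaf hash and its audit path."""
    if size == 1:
        if path:
            raise ValueError("audit path too long")
        return leaf
    if not path:
        raise ValueError("audit path too short")
    k = split(size)
    sibling, rest = path[-1], path[:-1]
    if index < k:
        return node_hash(root_from_path(leaf, index, k, rest), sibling)
    return node_hash(sibling, root_from_path(leaf, index - k, size - k, rest))


def verify_inclusion(data: bytes, index: int, size: int, path: list,
                     anchored_root: bytes) -> bool:
    """Auditor side: needs only the entry, the path, and the anchored root."""
    if not 0 <= index < size:
        return False
    try:
        return root_from_path(leaf_hash(data), index, size, path) == anchored_root
    except ValueError:
        return False
```

The auditor never sees the other entries: a path of roughly log2(n) hashes and a root obtained from the external anchor are enough to confirm that one exported entry belongs to the checkpointed log.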

#### 7.4 Metering

Same choke point, second product: every gateway transit records its marginal cost.

* **LLM spend:** per model call — provider, tier (§5.6), input/output tokens, and the **price-sheet snapshot ID** in effect at call time. Snapshotting prices is the non-obvious requirement: provider pricing changes; historical unit economics recomputed against current prices are fiction.
* **Vendor quota:** per dispatch in the vendor's native unit — Gmail quota units, QuickBooks per-realm request count, Zoho credits (§3.5) — because dollar cost is zero but *scarcity* cost is real: quota consumption is what the Rate-Limit Governor rations, and attribution answers "which playbook is eating this tenant's Zoho budget."
* **Aggregation:** action → playbook instance → playbook → tenant. Direct feeds: §13.1's cost-per-action bound is *measured* here, not modeled; plan-cache hit rates (§5.2) are validated against realized token spend; tier-routing changes (§5.6) ship with before/after cost evidence; and per-tenant gross margin (§13.4) is a query, not a spreadsheet.
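
A sketch of the metering arithmetic described above, assuming a hypothetical record shape. The price sheets, model name, snapshot IDs, and field names are invented for illustration; only the idea (price each call against the sheet in effect at call time, then roll up the hierarchy) comes from the text.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical price sheets keyed by snapshot ID; USD per million tokens.
PRICE_SHEETS = {
    "ps_2026_06": {"model-a": {"input": Decimal("3.00"), "output": Decimal("15.00")}},
    "ps_2026_07": {"model-a": {"input": Decimal("2.00"), "output": Decimal("10.00")}},
}


def llm_cost(call: dict) -> Decimal:
    """Price a model call against the sheet recorded with it, never the
    current sheet, so historical unit economics stay reproducible."""
    rates = PRICE_SHEETS[call["price_sheet"]][call["model"]]
    return (call["input_tokens"] * rates["input"]
            + call["output_tokens"] * rates["output"]) / Decimal(1_000_000)


def rollup(calls: list, level: int) -> dict:
    """Sum LLM cost by a prefix of (tenant, playbook, instance).
    level=1 groups by tenant, level=2 by playbook, level=3 by instance."""
    totals = defaultdict(Decimal)
    for call in calls:
        key = (call["tenant"], call["playbook"], call["instance"])[:level]
        totals[key] += llm_cost(call)
    return dict(totals)


def quota_by_playbook(dispatches: list, unit: str) -> dict:
    """Attribute vendor quota, in the vendor's native unit, to playbooks."""
    totals = defaultdict(int)
    for dispatch in dispatches:
        totals[dispatch["playbook"]] += dispatch["vendor_quota"].get(unit, 0)
    return dict(totals)
```

Two calls with identical token counts priced under different snapshots yield different costs, which is the behaviour the snapshot ID exists to preserve.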

#### 7.5 Telemetry, tracing, and SLOs

* **Tracing:** OpenTelemetry with W3C Trace Context propagation \[3]; one trace spans webhook receipt → planner → executor → gateway → vendor dispatch, and the trace ID is recorded in the corresponding log entries — the debugging view (traces, retention weeks) and the evidentiary view (chain, retention years) cross-reference but remain separate systems with separate access control. Traces are operational telemetry; they are *not* the audit record and never contain payload plaintext.
* **SLOs**, with error budgets per the SRE formulation \[4]. The chosen SLIs are the ones whose silent failure this paper has repeatedly indicted:
  * **Connector liveness** per (tenant, vendor): token freshness (§4.4) and webhook-subscription validity (§3.4). This is the anti-fail-open SLO — the §1.1 Zapier failure, made a paged condition. Target: staleness detected within one refresh cycle.
  * **Playbook objective attainment:** §5.1's objective predicates evaluated per tenant per day (e.g., % of >14-day invoices with an active cadence). The product's headline promise, expressed as an SLI — degradation here is a defect even when every RPC succeeded.
  * **False-approval-request rate (FAR)** (§6.2, measured per §11.2): the automation-bias and alarm-fatigue budget. A rising FAR burns budget exactly like availability loss, because approval requests that owners learn to ignore or rubber-stamp dismantle the §6 control (§6.2) \[6].
  * **Approval-feed latency:** suspension → card visible on device; interactive-priority dispatch latency post-approval (§3.5 priority classes).
  * **Digest integrity:** external-anchor freshness and verifier pass rate on sampled chains — the audit system monitoring itself.
* **Alerting philosophy:** symptom-based on SLO burn rates \[4], not cause-based on component noise; a tenant-visible symptom (objective attainment dropping) pages even when all components report healthy, which is precisely the failure shape of a lapsed webhook subscription.
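
Burn-rate alerting on any of these SLIs reduces to a small calculation. The sketch below assumes the multi-window pattern from \[4], using that book's example paging threshold of 14.4 (2% of a 30-day budget consumed in one hour, confirmed by a short window). The 0.99 target in the usage note is purely illustrative, since this section does not state targets, and the function names are hypothetical.

```python
def burn_rate(bad: int, total: int, slo_target: float) -> float:
    """Observed bad-event rate divided by the error budget (1 - target).
    A burn rate of 1.0 spends exactly the whole budget over the SLO period."""
    if total == 0:
        return 0.0
    return (bad / total) / (1.0 - slo_target)


def should_page(long_window: tuple, short_window: tuple,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Page only when both windows exceed the threshold: the long window
    shows the burn is significant, the short one shows it is still happening.
    Each window is a (bad_events, total_events) pair."""
    return (burn_rate(*long_window, slo_target) >= threshold
            and burn_rate(*short_window, slo_target) >= threshold)
```

For playbook objective attainment, a "bad event" could be an invoice older than 14 days with no active cadence. With a 0.99 target, `should_page((150, 1000), (15, 100), 0.99)` pages, because both windows burn at roughly 15 times the sustainable rate, even if every component reports healthy.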

***

#### References (Section 7)

\[1] IETF RFC 6962, *Certificate Transparency* — Merkle-tree append-only logs, inclusion/consistency proofs; Trillian as the reference implementation class. <https://datatracker.ietf.org/doc/html/rfc6962>

\[2] WORM object storage: AWS S3 Object Lock (compliance mode) documentation; SEC Rule 17a-4(f) as the regulatory archetype for non-rewritable, non-erasable electronic records.

\[3] OpenTelemetry specification; W3C *Trace Context* Recommendation. <https://opentelemetry.io> · <https://www.w3.org/TR/trace-context/>

\[4] B. Beyer et al. (eds.), *Site Reliability Engineering* and *The Site Reliability Workbook* (O'Reilly, 2016/2018) — SLI/SLO/error-budget methodology; multi-window burn-rate alerting.

\[5] NIST SP 800-92, *Guide to Computer Security Log Management* — retention-class separation and log-integrity guidance.

\[6] R. Parasuraman, V. Riley, "Humans and Automation: Use, Misuse, Disuse, Abuse," *Human Factors* 39(2), 1997. Cited per §6.2; basis for treating false-approval-request rate as an SLO.

***

*Next: Section 8 (Security Architecture) — injection threat model and defenses, tenant isolation, key management, MCP supply chain, SOC 2 mapping.*

***

### As-Built Reconciliation — V1

*Legend/sources as in §1's addendum.*

| §                                                                                                                       | Status                  | As-built (source)                                                                                                                                                                                                                                                                                                                       |
| ----------------------------------------------------------------------------------------------------------------------- | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **7.1 Immutable log** (hash chaining / RFC 6962 Merkle, WORM/S3 Object-Lock, external anchoring, crypto-shred payloads) | **MOSTLY ASPIRATIONAL** | Backend has `activityLog` with regex redaction (EXISTS), but it is **application-convention append-only — not DB-enforced, not hash-chained, not WORM, not externally anchored**. SC §7, AO §8. DB-enforced immutability is **PLANNED** (Phase 3); the transparency-log construction is aspirational. Schema → `data-model-v1.md` §1.7. |
| 7.2 Per-action provenance                                                                                               | **EXISTS (substrate)**  | `heartbeat_runs` + run-log-store + `cost_events` + `issue_execution_decisions` reconstruct the timeline. AO §8. The **prompt archive** (context-as-seen, pseudonymized) is PLANNED.                                                                                                                                                     |
| 7.3 Owner UI vs compliance export (open-source verifier)                                                                | **PARTIAL**             | Admin UI + activity log EXIST; consumer plain-language log + NDJSON export + **open-source verifier** are PLANNED.                                                                                                                                                                                                                      |
| 7.4 Metering                                                                                                            | **EXISTS (substrate)**  | `cost_events` per run (tokens/model/provider/**USD**). AO §6. Native **vendor-quota-unit** attribution + **price-sheet-snapshot** IDs are PLANNED.                                                                                                                                                                                      |
| 7.5 Telemetry / tracing / SLOs                                                                                          | **PLANNED**             | OTel tracing + SLIs/SLOs not built; substrate = pino, `heartbeat_runs`, silent-run watchdog, live events. → `observability-and-slos-v1.md`.                                                                                                                                                                                             |

**Flag:** the hash-chained / WORM / externally-anchored audit log is the largest aspirational delta in this section — today's log is app-convention append-only and an operator with DB access can alter history (SC §7 states this plainly). This matters because §6.4 (Art. 22 evidence) and §12.2 (FTC substantiation) lean on the log as *tamper-evident proof*; **DB-enforced immutability is the minimum** needed to make those downstream claims stand.

