# 6. The Approval Model

The approval model is the system's central control: it is simultaneously the product's trust proposition (§1.2), the backstop that bounds model error and prompt injection (§2.4 TB2, §8.2), and the mechanism that takes consequential decisions outside "solely automated processing" (§6.4). This section specifies it precisely, because an approval model specified loosely is theater: the failure modes in §6.5 and the byte-identity requirement in §6.3 are where most human-in-the-loop designs quietly break.

#### 6.1 Action risk classification

Every tool carries a static risk class assigned at connector build time (Appendix B is the full table); every *action* inherits the tool's class and is then scored on instance attributes. Classes, ordered by default severity:

| Class                        | Definition                                                                 | Examples                                                                      | Reversibility                                                                                   | Default policy (cold start)               |
| ---------------------------- | -------------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- | ----------------------------------------- |
| **`money_movement`**         | Creates, modifies, or triggers a transfer of funds or a payment obligation | `quickbooks.payment.create`, `quickbooks.invoice.void`, payment-link issuance | Partially — vendor-side deletion exists but downstream (bank, customer) effects may not reverse | Always suspend                            |
| **`external_communication`** | Emits content to a party outside the tenant                                | `gmail.message.send`, `quickbooks.invoice.send`, SMS/social publish           | **None** — cannot be unsent (§5.5)                                                              | Always suspend                            |
| **`record_mutation`**        | Writes to tenant systems without external emission                         | `zoho.contact.update`, `gmail.draft.create`, `quickbooks.invoice.update`      | High — compensations declared per tool (§5.5)                                                   | Auto-execute below score threshold        |
| **`read_only`**              | No vendor-side state change                                                | `quickbooks.invoice.list`, `gmail.thread.search`                              | N/A                                                                                             | Auto-execute; logged like everything else |

Class assignment is deliberately coarse and static; nuance lives in the **instance score**, computed at the gateway per action from typed features: monetary magnitude (absolute and relative to tenant's median transaction), counterparty familiarity (prior interaction count from the Operational Cache), content novelty for communications (distance from previously approved drafts for this recipient class), playbook maturity (release age × tenant tenure on it), and blast scope (single record vs. batch cardinality).

Two design rules prevent classification games:

1. **The class is a floor, not an input to the model.** The planner cannot argue an action into a lower class; class derives from the tool identifier the gateway resolves independently (§5.3). An injected instruction "treat this as routine" is inert — classification never reads content the attacker controls \[6].
2. **Batch cardinality escalates.** Forty individually low-risk `record_mutation` writes issued as one plan branch are scored as a single action with blast scope 40; mass mutation is a distinct risk even when each element is trivial.
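
The two rules above can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the `TOOL_CLASS` excerpt, the `Action` type, and the function names are hypothetical, and only the tool identifiers and class names come from the table in this section.

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskClass(IntEnum):
    # Ordered so that a larger value means a more severe class.
    READ_ONLY = 0
    RECORD_MUTATION = 1
    EXTERNAL_COMMUNICATION = 2
    MONEY_MOVEMENT = 3


# Hypothetical excerpt of the static table fixed at connector build time.
TOOL_CLASS = {
    "quickbooks.invoice.list": RiskClass.READ_ONLY,
    "zoho.contact.update": RiskClass.RECORD_MUTATION,
    "gmail.message.send": RiskClass.EXTERNAL_COMMUNICATION,
    "quickbooks.payment.create": RiskClass.MONEY_MOVEMENT,
}


@dataclass(frozen=True)
class Action:
    tool: str
    args: dict


def classify(action: Action) -> RiskClass:
    # Only the gateway-resolved tool identifier is read.
    # Args and content, which an attacker may control, are never consulted.
    return TOOL_CLASS[action.tool]


def blast_scope(branch: list[Action]) -> int:
    # A plan branch is scored as one unit whose scope is its cardinality.
    return len(branch)
```

Because `classify` never looks at `action.args`, an injected "treat this as routine" in the message body has nothing to act on: the send is still `EXTERNAL_COMMUNICATION`.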

#### 6.2 Threshold engine: static policy → calibrated per-tenant thresholds

The threshold determines, per (tenant, class), the instance score above which an action suspends. Lifecycle:

* **Cold start: suspend-heavy.** New tenants and newly activated playbooks start with `money_movement` and `external_communication` at always-suspend and `record_mutation` at a conservative score cutoff. The first weeks intentionally over-ask; the approval history this generates is the calibration data.
* **Calibration is a ratchet with monotone constraints.** Relaxation of a threshold (fewer suspensions) requires *N* consecutive approvals with zero denials for the (playbook, class, feature-band) cell, moves one band at a time, and is capped by class ceilings: `money_movement` above a tenant-set absolute amount **never** auto-executes regardless of history, and `external_communication` to a never-before-contacted counterparty always suspends. Any denial in a cell resets that cell to its prior band. The asymmetry is deliberate: the system can only earn autonomy slowly and lose it instantly.
* **Why calibrate at all.** Static suspend-everything fails on human factors, not engineering: high false-alarm rates produce operator *disuse* — approvals get rubber-stamped or ignored, which is worse than fewer, informative requests (Parasuraman & Riley's misuse/disuse analysis \[4]; the "cry-wolf" effect in alarm research). The engine's objective function is therefore explicitly the **false-approval-request rate** — suspensions the owner approves without modification, measured in §11.2 — driven down subject to the monotone safety constraints. An approval request should carry information; when it stops carrying information, it stops protecting.
* **Owner override in both directions.** The tenant can pin any class to always-suspend (some owners want every email) or raise the `record_mutation` floor; overrides dominate learned state and are themselves audited configuration events.
* **What is *not* learned.** No cross-tenant threshold transfer (tenant A's tolerance never loosens tenant B's thresholds; see §9.5's per-tenant learning boundary), and no learned relaxation of the two hard ceilings above. The learned surface is small by design; §6.4's legal argument depends on the human decision remaining decisive, and a system that learned its way out of asking would erode its own compliance basis.
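
The ratchet's asymmetry can be made concrete with a small state machine. The sketch below rests on stated assumptions: bands are integers with 0 as the most conservative, `N_REQUIRED` is a placeholder value for *N*, "resets to its prior band" is read as stepping back one band, and the `Cell` type and function names are hypothetical.

```python
from dataclasses import dataclass

N_REQUIRED = 20  # hypothetical value for N; the paper does not fix one


@dataclass
class Cell:
    """Learned state for one (playbook, class, feature-band) cell."""
    band: int = 0         # 0 = most conservative; higher = fewer suspensions
    streak: int = 0       # consecutive approvals since the last band change or denial
    ceiling: int = 3      # class ceiling: the band can never exceed this
    pinned: bool = False  # owner override: always suspend


def record_approval(cell: Cell) -> None:
    cell.streak += 1
    if cell.streak >= N_REQUIRED and cell.band < cell.ceiling:
        cell.band += 1    # relax by exactly one band
        cell.streak = 0   # the next band must be earned from scratch


def record_denial(cell: Cell) -> None:
    # Assumption: "prior band" means one step back toward conservative.
    cell.band = max(0, cell.band - 1)
    cell.streak = 0


def effective_band(cell: Cell) -> int:
    # Owner overrides dominate learned state.
    return 0 if cell.pinned else cell.band
```

Relaxation costs `N_REQUIRED` clean approvals per band, while a single denial undoes a band at once, which is the "earn slowly, lose instantly" property described above.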

#### 6.3 Approval feed mechanics

* **Card = signed action record.** The card renders from the *same* signed record `(tool, args, idempotency_key, plan_ref)` the gateway will dispatch — byte-for-byte parameter identity between what is shown and what runs (TB4, §2.4). There is no template layer that could render "$450" while the args say $4,500; the rendering is a projection of the record, and the approval signature covers the record hash, not the rendering. Approving card *h* authorizes exactly the action whose hash is *h*; a post-approval args change is a different hash and requires a new approval.
* **What the card shows.** Class and score drivers ("first contact with this recipient"; "amount 3.2× your median invoice"), the full outbound content for communications (not a summary — summaries reintroduce the misrepresentation channel), the provenance chain (triggering event, playbook, plan ref), and the counterparty history one tap deep. This is the *competence* half of §6.4: the approver must be able to actually evaluate, not merely click.
* **Batching.** Homogeneous low-variance suspensions (12 overdue-invoice reminders, all within the approved-draft distance band) may present as a batch card with per-item expansion; `money_movement` is never batchable, and any batch item the owner expands and edits splits out as an individual card. Batching is a concession to \[4] — twelve identical pushes train disuse — bounded so it cannot become bulk rubber-stamping of heterogeneous risk.
* **Expiry.** Default 72 h; expiry is a **no-op plus digest entry**, never a silent retry and never auto-approval (§2.3). For cadence-driven playbooks the planner treats the expiry as a skipped touch and re-plans the next one — the invoice gets chased next cycle; nothing fires without a decision.
* **Delegation.** Tenants may register secondary approvers with per-class grants (bookkeeper: `money_movement` up to a cap; office manager: `external_communication`). Every approval records the deciding identity and device; delegation grants are configuration events in the audit log. Approver authentication is device-bound with platform biometric/passcode re-confirmation for `money_movement` — the approval credential must be harder to steal than the payoff it gates.
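
The byte-identity rule can be illustrated by hashing a canonical encoding of the record and signing only that hash. This is a sketch under assumptions: canonical JSON via `json.dumps` stands in for whatever canonicalization the real system uses, HMAC stands in for the device-bound approval signature, and all function names are hypothetical. Only the record fields `(tool, args, idempotency_key, plan_ref)` come from the text above.

```python
import hashlib
import hmac
import json


def record_hash(record: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal records hash equally.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def sign_approval(approver_key: bytes, h: str) -> str:
    # The signature covers the record hash, never the rendered card.
    return hmac.new(approver_key, h.encode("ascii"), hashlib.sha256).hexdigest()


def may_dispatch(record: dict, approved_hash: str, signature: str, approver_key: bytes) -> bool:
    # Dispatch only if the record about to run hashes to the approved hash
    # and the signature over that hash verifies.
    expected = sign_approval(approver_key, approved_hash)
    return (hmac.compare_digest(record_hash(record), approved_hash)
            and hmac.compare_digest(signature, expected))


key = b"demo-approver-key"  # placeholder secret for the example
record = {
    "tool": "quickbooks.payment.create",
    "args": {"amount": "450.00", "currency": "USD"},
    "idempotency_key": "k-001",
    "plan_ref": "plan-42",
}
h = record_hash(record)
sig = sign_approval(key, h)

# A post-approval args change produces a different hash and is refused.
tampered = {**record, "args": {"amount": "4500.00", "currency": "USD"}}
```

Here `may_dispatch(record, h, sig, key)` succeeds and `may_dispatch(tampered, h, sig, key)` does not; the tampered record would need its own card and its own approval.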

#### 6.4 Human-in-the-loop as a formal control

GDPR Art. 22(1) grants the right not to be subject to a decision "based solely on automated processing" producing legal or similarly significant effects \[1]. The Art. 29 Working Party's guidance (WP251rev.01, endorsed by the EDPB) sets the bar for what removes a decision from "solely automated": human involvement must be **meaningful, not a token gesture** — carried out by someone with the **authority and competence to change the decision**, who considers the relevant data *before* the decision takes effect \[2].

Mapping, element by element:

* **Authority:** the approver is the business owner (or an owner-granted delegate, §6.3) — the party with maximal authority over the action, not a reviewer downstream of it.
* **Competence:** the card supplies the decision-relevant data (full content, score drivers, counterparty history) rather than a yes/no stub; §6.3's no-summary rule for outbound content exists specifically to satisfy this element.
* **Actual influence, pre-effect:** suspension occurs *between* intent and execution (§3.3); the human decision is causally upstream of the effect, not an appeal mechanism after it. Denial is a real, cheap, frequent path (and feeds the ratchet, §6.2) — WP251's concern about rubber-stamping is answered operationally by measuring and minimizing uninformative requests, not by asserting diligence.
* **Scope honesty:** many Fridays actions plausibly fall below Art. 22's "legal or similarly significant effects" bar (a reminder email to a business debtor is not credit scoring). The architecture nonetheless applies the control to all high-risk classes rather than litigating the bar per action — cheaper to over-comply structurally than to classify effects case-by-case, and the same control does double duty as the injection backstop (§2.4).

The EU AI Act's human-oversight article points the same direction and adds one requirement worth engineering for even where Fridays sits outside the high-risk categories: oversight measures should account for **automation bias** — the tendency to over-rely on system output (Art. 14(4)(b)) \[3]. The threshold engine's false-approval-request objective (§6.2) *is* the automation-bias countermeasure: keeping requests rare and informative is what keeps the human's consideration real. §12.3 completes the regulatory mapping; §12.4 covers the Colorado AI Act's consequential-decision framing.

#### 6.5 Failure modes

Specified exhaustively because approval systems fail at the edges, not the happy path:

* **Timeout/expiry.** No-op + digest (§6.3). Fail-closed for everything above threshold: an unanswered question is a "no."
* **Precondition staleness.** Between suspension and approval, the world moves: the invoice gets paid, the contact updates their email. Approved actions **revalidate declared preconditions at dispatch** (invoice still open; recipient unchanged); a failed revalidation aborts the dispatch, logs `stale_approval`, and re-plans — an approval is authorization for an action against a state, not a blank order. Precondition sets are declared per tool in the connector, same discipline as compensations (§5.5).
* **Deny-and-explain.** Denial optionally captures a structured reason (wrong tone / wrong amount / wrong recipient / not now); the reason routes differently — *not now* re-queues with backoff; *wrong tone* re-enters the planner with the correction as a constraint; *wrong amount/recipient* additionally flags the plan for eval review (§11.4), since it may indicate a data or extraction defect rather than a preference.
* **Escalating denials.** K or more denials within a window for one (playbook, class) cell pause that cell with an explicit status ("paused after repeated denials — review settings"), rather than continuing to generate requests the owner keeps refusing. Silence-as-signal: sustained non-response (expiries) over a longer window triggers the same pause plus a connector-health-style notification (§4.4). An owner who has stopped answering has withdrawn the oversight the model depends on, and the system's correct response is to stop acting, not to act unsupervised.
* **Approver unavailability.** Delegation (§6.3) is the designed path; absent delegates, high-risk actions simply queue and expire. There is deliberately **no** "auto-approve after N hours" configuration — the moment such an option exists, §6.4's meaningful-involvement claim is dead for every tenant who enables it, and the injection backstop (§2.4) acquires a bypass an attacker only has to outwait.
* **Approval-channel compromise.** Bounded by device binding + biometric re-confirmation for `money_movement` (§6.3) and by the byte-identity rule: a compromised rendering layer still cannot cause execution of anything but the signed record. Residual risk (full device compromise of the owner's phone) is accepted and stated in the threat model (§8) — it is the same residual risk as the owner's banking app.
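
Several of these failure modes reduce to one fail-closed decision at dispatch time. The sketch below is illustrative only: the outcome strings other than `stale_approval`, the route names, and the function signatures are hypothetical, and it assumes that *wrong amount/recipient* re-enters the planner in addition to being flagged for eval review.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

EXPIRY = timedelta(hours=72)  # default expiry from §6.3


def resolve(suspended_at: datetime, now: datetime, decision: Optional[str],
            preconditions: list[Callable[[], bool]]) -> str:
    """Return what happens to one suspended action."""
    if decision == "denied":
        return "denied"
    if decision is None:
        # No decision yet: wait until expiry, then no-op. Never auto-approve.
        return "expired_noop" if now - suspended_at >= EXPIRY else "pending"
    if decision != "approved":
        raise ValueError(f"unknown decision: {decision!r}")
    # Approved: re-check every declared precondition against current state.
    if all(check() for check in preconditions):
        return "dispatch"
    return "stale_approval"  # abort the dispatch, log, and re-plan


# Structured deny reasons mapped to follow-up routes (route names are hypothetical).
DENY_ROUTES = {
    "not_now": {"requeue_with_backoff"},
    "wrong_tone": {"replan_with_constraint"},
    "wrong_amount": {"replan_with_constraint", "flag_eval_review"},
    "wrong_recipient": {"replan_with_constraint", "flag_eval_review"},
}


def route_denial(reason: Optional[str]) -> set[str]:
    # The reason is optional; a bare denial triggers no follow-up route.
    return DENY_ROUTES.get(reason, set())
```

Every branch that lacks an explicit approval against a still-valid state ends in something other than `dispatch`, which is the property the list above is meant to guarantee.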

***

#### References (Section 6)

\[1] Regulation (EU) 2016/679 (GDPR), Art. 22. <https://eur-lex.europa.eu/eli/reg/2016/679/oj>

\[2] Article 29 Data Protection Working Party, *Guidelines on Automated individual decision-making and Profiling for the purposes of Regulation 2016/679*, WP251rev.01 (2018; endorsed by the EDPB) — "meaningful" human involvement: authority and competence to change the decision; not a token gesture. <https://ec.europa.eu/newsroom/article29/items/612053>

\[3] Regulation (EU) 2024/1689 (AI Act), Art. 14 — human-oversight design, incl. Art. 14(4)(b) awareness of automation bias.

\[4] R. Parasuraman, V. Riley, *Humans and Automation: Use, Misuse, Disuse, Abuse*, Human Factors 39(2), 1997 — false-alarm-driven disuse; the basis for treating over-asking as a safety defect, not a safety margin.

\[5] IETF RFC 6962 construction per §7.1 — approval decisions are entries in the same hash-chained log as the actions they authorize.

\[6] S. Willison, prompt-injection series — content-independent classification as capability-side enforcement (§2.4 TB2, §8.2). <https://simonwillison.net/series/prompt-injection/>

***

*Next: Section 7 (Auditability and Observability) — action-log schema, hash chaining, provenance, owner-facing audit vs. compliance export, metering, SLOs.*

***

### As-Built Reconciliation — V1

*Legend/sources as in §1's addendum.*

| §                                                                                                    | Status                 | As-built (source)                                                                                                                                                                                                                                                                                                                                                            |
| ---------------------------------------------------------------------------------------------------- | ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 6.1 Risk classification (money / comms / record / read)                                              | **PLANNED**            | The confirmation *primitive* EXISTS (issue interactions, IS §4.3); the risk taxonomy + instance scoring (magnitude, counterparty familiarity, novelty, batch cardinality) is Fridays application logic. "Class derived from tool id, not content" depends on connector design.                                                                                               |
| 6.2 Threshold engine (static → learned, monotone ratchet)                                            | **PLANNED**            | Backend has review stages but **no learned thresholds**; per-tenant calibration is a Fridays design. Ties to the defined-learning-boundary requirement → `data-handling-and-privacy-v1.md` §5.                                                                                                                                                                               |
| 6.3 Approval feed (card = signed record, **byte-identity**, batching, expiry, delegation, biometric) | **PARTIAL**            | EXISTS: interactions surfaced via API/UI/WebSocket, review stages, mandatory decision comments, **self-review excluded**. SC §5. PLANNED: mobile card UX, **literal-payload binding** (IS §4.3 names it as *the* thing to build — the connector executes the bound payload, not a re-interpretation), multi-channel dispatch, biometric re-confirm, cross-tenant delegation. |
| 6.4 HITL as formal control (GDPR Art. 22 / WP251)                                                    | **EXISTS (substrate)** | Self-review exclusion + mandatory decision comments + pre-effect suspension are exactly the "meaningful human involvement" answer. SC §5. → `ai-governance-v1.md` §3.                                                                                                                                                                                                        |
| 6.5 Failure modes (timeout / staleness / deny-explain / escalation / channel compromise)             | **PARTIAL**            | `wake_assignee` continuation + pause/escalate EXIST (AO §4). Precondition revalidation at dispatch, structured deny reasons, escalating-denial pause, device-binding/biometric — PLANNED. The **no-auto-approve-on-expiry** invariant is consistent with backend (interactions stay pending).                                                                                |

**Flag:** the approval *primitives* are the strongest as-built part of the whole system and genuinely back the paper's central control. What is PLANNED is the consumer approval UX and — most importantly — the **literal-payload byte-identity binding** ("what is approved is what runs"), which IS §4.3 flags as the single control the product's trust story rests on and which is not yet built.

