
# 4. Identity, Auth, and Secrets

Every connector acts through a delegated credential. This section covers how those credentials are acquired (4.1–4.2), stored (4.3), kept alive (4.4), and minimized (4.5). The governing principle repeats §2.1: connectors never hold credentials; the Action Gateway injects tokens from the vault at dispatch time, so credential exposure is confined to one component with one audit trail.

#### 4.1 OAuth flows per vendor class

All vendors in the catalog use OAuth 2.0 authorization-code grant; Fridays implements it uniformly per current best practice — RFC 9700 (BCP 240) \[2]: PKCE on every flow regardless of client confidentiality, exact-match redirect URIs, no implicit grant, no resource-owner-password grant, and `state` for CSRF defense in addition to PKCE (RFC 6749 §10.12 \[1]). This matches the OAuth 2.1 consolidation the MCP authorization spec already mandates (§3.2.1), so REST wrappers and vendor MCP servers share one auth implementation.
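
The PKCE and `state` parameters described above can be sketched as follows. This is a minimal illustration, not the Fridays implementation: the function names and the example endpoint are hypothetical, and the challenge derivation follows the S256 method of RFC 7636.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method of RFC 7636."""
    # 32 random bytes encode to a 43-character base64url verifier,
    # which sits inside the 43..128 character range the RFC allows.
    verifier = secrets.token_urlsafe(32)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scopes, state, challenge):
    """Assemble an authorization-code request carrying both state and PKCE."""
    params = {
        "response_type": "code",          # authorization-code grant only
        "client_id": client_id,
        "redirect_uri": redirect_uri,     # must match the registered URI exactly
        "scope": " ".join(scopes),
        "state": state,                   # CSRF defense (RFC 6749 §10.12)
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    }
    return f"{authorize_endpoint}?{urlencode(params)}"
```

The verifier stays server-side; only the challenge travels in the browser redirect, so an intercepted authorization code cannot be exchanged without it.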

Uniform grant, divergent token semantics. The differences below are not trivia — each one dictates a piece of §4.3–4.4 infrastructure:

| Vendor class            | Access-token lifetime | Refresh-token behavior                                                                                                                                                            | Design consequence                                                                                                                                             |
| ----------------------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Intuit (QuickBooks)** | \~60 min              | \~100-day rolling expiry; **refresh-token value rotates** — Intuit may issue a new refresh token on each refresh \[4]                                                             | Persist-before-use atomicity in the cron farm (§4.4); losing the rotated token = tenant re-auth                                                                |
| **Google**              | \~60 min              | Long-lived, but: per-(client, account) token cap with **silent oldest-token invalidation**; revoked on password change for Gmail scopes; revoked after \~6 months inactivity \[4] | One refresh token per tenant per Google service, tracked centrally; password-change revocation makes revocation *detection* (not prevention) the design target |
| **Microsoft (Graph)**   | 60–90 min             | \~90-day sliding window; Continuous Access Evaluation can revoke mid-lifetime on risk signals \[4]                                                                                | Treat 401 + claims challenge as a first-class re-auth path, not an error                                                                                       |
| **Zoho**                | \~60 min              | Non-expiring but capped count per client; **data-center-partitioned endpoints** (`accounts.zoho.com` / `.eu` / `.in` / `.com.au`) \[4]                                            | Vault records carry the DC of issuance; a token replayed against the wrong DC fails opaquely                                                                   |

Two flow-level details are worth stating:

* **Mix-up defense.** With \~50 authorization servers configured, an attacker-controlled or compromised AS returning a code for replay at another vendor is a live threat class. Per RFC 9700, Fridays uses per-issuer redirect URIs and validates the `iss` authorization-response parameter (RFC 9207) where the vendor emits it \[2]\[3].
* **Incremental authorization.** Where supported (Google), scopes are requested per playbook activation, not up front — a tenant enabling only `invoice_followup` never consents to calendar access. This is UX (shorter consent screens convert better) and compliance surface control (§4.5) simultaneously.
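
The `iss` check from the mix-up defense might look like the sketch below. The `VendorAuthConfig` type, its `emits_iss` flag, and the `MixUpError` exception are hypothetical names introduced for illustration; the comparison rule (exact string match, no normalization) is the one RFC 9207 specifies.

```python
from dataclasses import dataclass


class MixUpError(Exception):
    """The authorization response did not come from the expected issuer."""


@dataclass(frozen=True)
class VendorAuthConfig:
    issuer: str       # issuer identifier expected for this vendor
    emits_iss: bool   # assumption: per-vendor flag for RFC 9207 support


def check_issuer(config, callback_params):
    """Reject a callback whose `iss` does not match the vendor it was sent to."""
    received = callback_params.get("iss")
    if received is None:
        if config.emits_iss:
            raise MixUpError("iss expected from this vendor but missing")
        # Vendor does not emit iss: the per-issuer redirect URI is the
        # only mix-up defense available for this flow.
        return
    # Simple string comparison, deliberately without URL normalization.
    if received != config.issuer:
        raise MixUpError(f"unexpected issuer: {received!r}")
```

The check runs before the code exchange, so a code minted by the wrong authorization server is never sent to any token endpoint.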

#### 4.2 OAuth callback service

One public set of redirect endpoints (one URI per issuer, §4.1), multi-tenant by construction. Responsibilities:

* **State binding.** The `state` value is a single-use, TTL-bounded token minted server-side and bound to (tenant, vendor, initiating user, PKCE verifier hash). On callback, the service resolves tenant identity *from state*, never from cookies or heuristics — the tenant-mixing failure (§2.2, component 2) where user A's callback lands tokens in tenant B's vault is structurally impossible if state is the only join key and it is unforgeable and single-use.
* **PKCE completion.** Verifier is held server-side against the state record; code exchange happens from the callback service only, over mTLS to internal services, and the resulting tokens go directly to the vault write API. Authorization codes and tokens never appear in logs (URL scrubbing at the ingress LB; codes are single-use within seconds regardless, per RFC 9700's short-lifetime guidance \[2]).
* **Vendor quirk normalization.** Intuit returns `realmId` as a separate callback parameter — captured atomically with the token or the grant is unusable; Zoho returns the DC location that must be persisted with the token (§4.1). The callback service is where per-vendor callback shape is absorbed so the vault schema stays uniform.
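
The state-binding rule can be illustrated with a small in-memory store. This is a sketch under stated assumptions: `StateStore`, `StateRecord`, and the 10-minute default TTL are hypothetical, and a real service would back this with a shared datastore rather than a process-local dict.

```python
import hashlib
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateRecord:
    tenant_id: str
    vendor: str
    user_id: str
    verifier_hash: str   # SHA-256 of the PKCE verifier
    expires_at: float


class StateStore:
    """Single-use, TTL-bounded state tokens minted server-side."""

    def __init__(self, ttl_seconds=600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._records = {}

    def mint(self, tenant_id, vendor, user_id, pkce_verifier):
        state = secrets.token_urlsafe(32)   # unguessable join key
        self._records[state] = StateRecord(
            tenant_id, vendor, user_id,
            hashlib.sha256(pkce_verifier.encode("ascii")).hexdigest(),
            self._clock() + self._ttl,
        )
        return state

    def consume(self, state) -> Optional[StateRecord]:
        # pop() removes the record, so a replayed callback finds nothing.
        record = self._records.pop(state, None)
        if record is None or self._clock() >= record.expires_at:
            return None
        return record
```

On callback, the tenant is whatever `consume` returns; a `None` result ends the flow, and no cookie or heuristic is consulted.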

#### 4.3 Token vault

Envelope encryption, standard construction \[5]:

* **Data keys.** Each credential record is encrypted with AES-256-GCM under a per-tenant data-encryption key (DEK); DEKs are encrypted under a key-encryption key (KEK) held in the cloud KMS, backed by FIPS 140-3 validated HSMs. Compromise of the datastore alone yields only ciphertext; recovering plaintext requires both the datastore and KMS-decrypt permission, which is the narrowest IAM grant in the system.
* **Per-tenant DEKs, not global.** Two reasons: (a) blast-radius — a leaked DEK exposes one tenant; (b) crypto-shredding — tenant deletion (§9.4) is completed by destroying the DEK, rendering residual ciphertext in backups unrecoverable without hunting every copy. This is the practical answer to "deletion across backups," which physical deletion cannot honestly promise.
* **Rotation.** KEK rotation is KMS-native (re-wrap DEKs, no data re-encryption); DEK rotation re-encrypts a tenant's records on a schedule and on suspicion. Rotation schedules and key-lifetime rationale follow NIST SP 800-57 \[5].
* **Access shape.** The vault exposes no `read_token` API to general services. Two callers exist: the Action Gateway (dispatch-time injection, token used in-memory, never persisted downstream — §2.1) and the Renewal Cron Farm (refresh flows). Both authenticate via short-lived workload identity, and every vault access is itself an audit-log event (§7.1). A hypothetical compromised planner or executor cannot exfiltrate tokens because it never receives them — it emits *intents*; the gateway attaches credentials.
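
The access shape can be sketched as a vault whose read path admits only two workload identities and records every attempt. The class, the identity strings, and the audit-entry fields are hypothetical; encryption and workload-identity verification are out of scope here, and the stored value is treated as an opaque blob.

```python
ALLOWED_READERS = frozenset({"action-gateway", "renewal-cron-farm"})


class VaultAccessDenied(Exception):
    """Raised when a caller other than the two permitted readers asks for a token."""


class Vault:
    def __init__(self):
        self._records = {}    # (tenant_id, vendor) -> opaque encrypted blob
        self.audit_log = []   # one entry per read attempt in this sketch

    def put(self, tenant_id, vendor, blob):
        self._records[(tenant_id, vendor)] = blob

    def fetch_for(self, workload_identity, tenant_id, vendor):
        allowed = workload_identity in ALLOWED_READERS
        # Log before deciding, so denied attempts are visible too.
        self.audit_log.append({
            "caller": workload_identity,
            "tenant": tenant_id,
            "vendor": vendor,
            "allowed": allowed,
        })
        if not allowed:
            raise VaultAccessDenied(workload_identity)
        return self._records[(tenant_id, vendor)]
```

A planner or executor calling `fetch_for` gets an exception and leaves an audit entry; it never gets the blob.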

#### 4.4 Renewal cron farm

Refresh is treated as a scheduled production workload, not a lazy on-demand path, because on-demand refresh converts vendor auth hiccups into user-visible action latency and races under concurrency.

* **Proactive refresh with jitter.** Access tokens are refreshed ahead of expiry on a jittered schedule, which avoids a thundering herd of refreshes at the top of the hour across tenants. The Rate-Limit Governor meters refresh calls like any other vendor traffic (§3.5).
* **Rotation atomicity (the Intuit case).** Where the refresh token itself rotates \[4], the sequence is: obtain new pair → persist new pair to vault → mark old as superseded. A crash between obtain and persist is the dangerous window; the farm writes the vendor response to the vault *before* acknowledging the refresh job, and the vault keeps the prior token until the new one is confirmed live, so a replayed job within the vendor's grace behavior recovers rather than strands the tenant.
* **Revocation detection.** `invalid_grant` on refresh is classified: expired vs. revoked (user password change on Google, admin revocation on Microsoft, app disconnect on Intuit). Detection triggers the re-auth path immediately — this is the direct countermeasure to the fail-open mode of §1.1: the connector's death is *observed* within one refresh cycle, not discovered weeks later via missing outcomes.
* **Re-auth UX.** Revocation produces a push notification and an Approval-Feed card ("QuickBooks connection expired — reconnect") deep-linking into the OAuth flow, and affected playbooks pause with an explicit status rather than silently no-op. Connector health (token freshness, subscription liveness from §3.4) is a first-class per-tenant SLO (§7.5).
* **Shared lifecycle for webhook subscriptions.** The same scheduler re-arms Graph subscriptions (≤3-day expiry) and Gmail `watch()` registrations (7-day) (§3.4); token refresh and subscription renewal are one workload class with one health model.
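
Two of the behaviors above, jittered scheduling and persist-before-acknowledge rotation, can be sketched together. All names, the 10-minute lead, and the 5-minute jitter window are assumptions for illustration; the in-memory `VaultSlot` stands in for a durable vault write.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenPair:
    access_token: str
    refresh_token: str
    expires_at: float


@dataclass
class VaultSlot:
    current: TokenPair
    previous: Optional[TokenPair] = None   # kept until the new pair is confirmed live


def next_refresh_at(expires_at, lead_seconds=600.0, jitter_seconds=300.0, rng=random):
    """Refresh ahead of expiry, spread across a jitter window."""
    return expires_at - lead_seconds - rng.uniform(0.0, jitter_seconds)


def refresh_job(slot, call_vendor, ack):
    new_pair = call_vendor(slot.current.refresh_token)     # 1. obtain new pair
    slot.previous, slot.current = slot.current, new_pair   # 2. persist, keep old
    ack()                                                  # 3. acknowledge last


def confirm_live(slot):
    """Drop the superseded pair once the new one has been used successfully."""
    slot.previous = None
```

If the worker dies between steps 2 and 3, the job is redelivered but the rotated token is already stored, so the tenant is not stranded.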

#### 4.5 Least-privilege scope policy per playbook

Scopes are declared in the playbook spec (§1.4), aggregated per tenant, and requested incrementally (§4.1). Policy rules:

1. **Narrowest functional scope.** Example — `invoice_followup` on Google: `gmail.readonly` (thread history) + `gmail.send` (dispatch approved drafts). Not `gmail.modify`, not full `mail.google.com`. Rationale beyond principle: Gmail restricted-scope surface directly sizes the annual CASA security assessment (§3.7) \[6]; every unnecessary scope is recurring audit cost.
2. **Read/write asymmetry.** Read scopes may be requested speculatively at connector setup (they feed the Operational Cache, §2.2); write scopes are requested only on activation of a playbook that declares the corresponding tool. A tenant's consent screen is therefore an accurate statement of what Fridays *does*, not what it might do.
3. **Scope-to-tool consistency check in CI.** A playbook release that declares a tool whose vendor operation exceeds its declared scopes fails the build; the inverse (scopes broader than any declared tool needs) fails review. This keeps §3.3's gateway allow-lists, the consent screens, and the partner-program filings (which enumerate scopes) mutually consistent — reviewers at Google and Intuit compare requested scopes against stated functionality, and drift between them is a rejection category \[6].
4. **Admin-consent paths.** Microsoft tenants using org-wide consent get a documented minimal-permission manifest; Fridays does not request directory-wide scopes (`Mail.Read` on delegated basis, not application-wide `Mail.Read` for all mailboxes) — application-permission requests would both over-privilege and trigger heavier review.
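
The consistency check in rule 3 reduces to two set differences. The sketch below is illustrative only: `PlaybookSpec`, the tool names, and the `TOOL_SCOPES` mapping are hypothetical, and the real playbook spec format (§1.4) is not shown in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybookSpec:
    name: str
    declared_scopes: frozenset
    tools: tuple


# Hypothetical mapping from a declared tool to the vendor scopes it needs.
TOOL_SCOPES = {
    "gmail.read_thread": frozenset({"gmail.readonly"}),
    "gmail.send_draft": frozenset({"gmail.send"}),
}


def check_scopes(spec, tool_scopes):
    """Return (missing, excess) scope sets for one playbook release."""
    needed = set().union(*(tool_scopes[tool] for tool in spec.tools))
    missing = needed - spec.declared_scopes        # tool exceeds scopes: fail build
    excess = set(spec.declared_scopes) - needed    # scopes exceed tools: fail review
    return missing, excess
```

For the `invoice_followup` example in rule 1, declaring `gmail.readonly` and `gmail.send` against these two tools yields two empty sets; adding `gmail.modify` to the declaration would surface it in `excess`.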

Net effect: the credential a compromised component could theoretically misuse (already bounded by the vault access shape, §4.3) is itself bounded to the tool surface the tenant's active playbooks declared — defense-in-depth stacking with §3.3 allow-lists and §6 approval gating.

***

#### References (Section 4)

\[1] IETF RFC 6749, *The OAuth 2.0 Authorization Framework* — §10.12 (CSRF, `state`). <https://datatracker.ietf.org/doc/html/rfc6749>

\[2] IETF RFC 9700 / BCP 240, *Best Current Practice for OAuth 2.0 Security* (Jan 2025) — PKCE universally, exact redirect-URI matching, mix-up defenses, code lifetime. <https://datatracker.ietf.org/doc/html/rfc9700>

\[3] IETF RFC 9207, *OAuth 2.0 Authorization Server Issuer Identification* (`iss` response parameter). <https://datatracker.ietf.org/doc/html/rfc9207>

\[4] Vendor token-lifecycle documentation: Intuit OAuth 2.0 (access \~1 h; refresh \~100-day rolling, value rotation on refresh); Google OAuth 2.0 (refresh-token invalidation causes: token cap per client/account, password change on sensitive scopes, inactivity); Microsoft identity platform (refresh-token sliding lifetime, Continuous Access Evaluation); Zoho OAuth (DC-partitioned endpoints, refresh-token caps). Exact current values in Appendix A — lifetimes are vendor-mutable; this section fixes the design responses.

\[5] NIST SP 800-57 Part 1 Rev. 5, *Recommendation for Key Management*; cloud-KMS envelope-encryption pattern (AWS KMS / Google Cloud KMS developer documentation); FIPS 140-3 module validation.

\[6] Google, *OAuth API verification FAQ* — restricted scopes, CASA assessment, scope-vs-functionality review criteria. <https://support.google.com/cloud/answer/9110914>

***

*Next: Section 5 (Agent Orchestration) — playbook model, planner/executor separation, plan caching, idempotency and retries, cross-vendor sagas, model routing, context construction.*

***

### As-Built Reconciliation — V1

*Legend/sources as in §1's addendum.*

| §                                                                                                                       | Status                       | As-built (source)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| ----------------------------------------------------------------------------------------------------------------------- | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 4.1 OAuth flows / PKCE / incremental scopes                                                                             | **PLANNED**                  | OAuth broker is Fridays-layer work; backend stores issued tokens as company secrets. Vendor token-semantics table is accurate external fact; the handling infra is planned. IS §4.2.                                                                                                                                                                                                                                                                                                                                            |
| 4.2 OAuth callback service                                                                                              | **PLANNED**                  | Fridays BFF. IS §2.2, §4.2.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| 4.3 Token vault (per-tenant DEK envelope, KMS/HSM, crypto-shred)                                                        | **PARTIAL / ASPIRATIONAL**   | Secrets EXIST: AES-256-GCM + **per-instance master key** (0600) + AWS Secrets Manager provider; strict mode; JIT injection; refs to plugins; `secret_access_events`. SC §9, TS §5. But the specified construction — **per-tenant DEKs under a KMS KEK with crypto-shredding** — is *not* what backend does (it is one master key per instance). Under instance-per-customer the master key becomes per-customer, achieving isolation differently than the DEK-envelope design. HSM roots / GCP / Vault — **stubbed (PLANNED)**. |
| 4.4 Renewal cron farm (proactive refresh, rotation atomicity, revocation detection, re-auth UX, subscription re-arming) | **PLANNED (under-designed)** | The single "connector silently dies" failure. SC §12 names rotation/revocation as a Phase-1 item; persist-before-use atomicity, refresh-before-expiry, and subscription re-arming are unspecified (Tier-1 D). Detection half → `observability-and-slos-v1.md` §4.1.                                                                                                                                                                                                                                                             |
| 4.5 Least-privilege scope per playbook                                                                                  | **EXISTS (substrate)**       | Low-trust presets + `principalPermissionGrants` ship. SC §4. Per-connector scope policy + scope-to-tool CI check — **PLANNED**.                                                                                                                                                                                                                                                                                                                                                                                                 |

**Flag:** two deltas matter. (1) The specified token-vault construction differs from what was built (per-instance master key ≠ per-tenant DEK envelope); this affects the §9.4 crypto-shred-deletion story, which assumes per-tenant DEKs. (2) The renewal farm is the highest-risk undesigned piece and should be specified before the first connector ships.

