Security

OAuth in production: audience attacks, observability, and real failures

Crux: Audience confusion attacks, token storage by client type, the observability metrics that detect compromise, and the four real-world OAuth failures that cost millions.

A JWT validator logs a WARN when the aud claim does not contain the expected value, then proceeds to accept the token anyway because “aud contains the caller’s client_id.” The token was issued for a different application. The validator just accepted a token intended for someone else — audience bypass, a real CVE pattern seen at Microsoft Azure AD in November 2024.

Audience attacks

When one authorization server issues tokens for multiple resource servers (API-A and API-B), a validator that accepts any valid aud can be exploited: an attacker who obtains a token for API-A presents it to API-B.

The correct aud check: does the token’s aud claim contain this resource server’s hard-coded identifier? Not “does aud contain any known client_id?” — a logical inversion that bypasses the protection.

RFC 8707 — Resource Indicators: The client specifies a resource parameter in the /token request, naming the target resource server. The IdP mints a token with that audience only. If the client needs tokens for both API-A and API-B, it calls /token twice with different resource values. Inconvenient but correct.
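A minimal sketch of the one-audience-per-request pattern, assuming the client uses the refresh_token grant for each call (the text above does not say which grant is used). The helper names `formEncode` and `buildRefreshBody`, the client ID, and the example.com URLs are hypothetical; only the `resource` parameter name comes from RFC 8707.

```typescript
// Sketch only. `buildRefreshBody`, the client ID, and the example.com URLs are
// illustrative assumptions; `resource` is the RFC 8707 request parameter.
function formEncode(pairs: Array<[string, string]>): string {
  return pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

function buildRefreshBody(refreshToken: string, clientId: string, resource: string): string {
  return formEncode([
    ["grant_type", "refresh_token"],
    ["refresh_token", refreshToken],
    ["client_id", clientId],
    ["resource", resource], // exactly one target resource server per request
  ]);
}

// One /token call per API, each asking for a token with a single audience.
const bodyForApiA = buildRefreshBody("rt-1", "demo-client", "https://api-a.example.com");
const bodyForApiB = buildRefreshBody("rt-2", "demo-client", "https://api-b.example.com");
```

Each body would be sent as application/x-www-form-urlencoded to the token endpoint. The two refresh token values differ here on the assumption that the IdP rotates refresh tokens between calls.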

Token Exchange (RFC 8693): A resource server that needs to call a downstream service can present its incoming token and request a new token valid at the downstream, with a narrowed audience. This is “least privilege” for service-to-service hops.
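The exchange request can be sketched as a set of form parameters. The grant type and token type URNs below are the ones defined in RFC 8693; the function name `buildTokenExchangeParams` and the example values are hypothetical, and client authentication is left out.

```typescript
// Sketch only: parameters a resource server might send to the token endpoint
// to swap its incoming token for one scoped to a downstream service.
interface TokenExchangeParams {
  grant_type: string;
  subject_token: string;
  subject_token_type: string;
  audience: string;
  scope?: string;
}

function buildTokenExchangeParams(
  incomingToken: string,
  downstreamAudience: string,
  narrowedScope?: string
): TokenExchangeParams {
  const params: TokenExchangeParams = {
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: incomingToken,
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    audience: downstreamAudience, // the new token is valid only at the downstream
  };
  if (narrowedScope !== undefined) {
    params.scope = narrowedScope; // optionally request fewer scopes than the original
  }
  return params;
}

// Hypothetical usage: API-A needs to call a billing service on the user's behalf.
const exchange = buildTokenExchangeParams("incoming-access-token", "billing-service", "invoices:read");
```

Requesting both a single audience and a reduced scope is what keeps each hop at least privilege: a token leaked from the downstream call cannot be replayed upstream.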

The cnf (confirmation) claim adds sender binding on top of audience restriction: with DPoP, cnf.jkt holds the SHA-256 thumbprint of the client’s public key. A token carrying cnf is only usable together with a proof signed by the corresponding private key, so a stolen token is useless without that key.
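A minimal sketch of the binding comparison only. It assumes the thumbprint of the key in the DPoP proof has already been computed elsewhere (RFC 7638), and it does not cover the rest of proof validation (signature, htm, htu, iat, jti, ath). The name `isBoundToProofKey` is hypothetical.

```typescript
// Sketch only: checks that the access token is bound to the key that signed
// the DPoP proof. Thumbprint computation and proof verification are assumed
// to happen before this function is called.
interface BoundTokenClaims {
  cnf?: { jkt?: string };
}

function isBoundToProofKey(claims: BoundTokenClaims, proofKeyThumbprint: string): boolean {
  const jkt = claims.cnf?.jkt;
  if (jkt === undefined) {
    return false; // unbound token: reject on an endpoint that requires DPoP
  }
  return jkt === proofKeyThumbprint;
}
```

A resource server would run this in addition to the aud check, never instead of it.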

aud check: correct vs wrong
WRONG (audience bypass)
if (token.aud.includes(request.client_id)) { accept(); }
Accepts any token whose aud contains whatever client_id the request supplies, including a token issued for a different app.
CORRECT
const EXPECTED = "acme-api-public";
// aud may be a single string or an array; normalize so includes() compares whole values
const audiences = Array.isArray(token.aud) ? token.aud : [token.aud];
if (audiences.includes(EXPECTED)) { accept(); }
Hard-coded expected audience. No request-derived logic. Regression test: token with aud=["other-api"] must be rejected.
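The same check as a small typed function, written as a sketch rather than a complete validator. It assumes the token signature, issuer, and expiry were verified beforehand. The name `audienceMatches` is hypothetical, and "acme-api-public" is the example identifier from above.

```typescript
// Sketch only: audience check against one hard-coded identifier.
const EXPECTED_AUDIENCE = "acme-api-public";

function audienceMatches(aud: unknown, expected: string = EXPECTED_AUDIENCE): boolean {
  // RFC 7519 allows aud to be a single string or an array of strings.
  const values: unknown[] = typeof aud === "string" ? [aud] : Array.isArray(aud) ? aud : [];
  // Whole-element equality: a string aud is wrapped first so that
  // "acme-api-public-v2" can never match by substring.
  return values.indexOf(expected) !== -1;
}
```

The regression cases worth pinning down are a token for another API, a token with a missing aud, and a lookalike audience that merely starts with the expected value. All three must be rejected.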

Token storage by client type

Where tokens live determines their exposure surface:

Browser SPA:

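  • Access token: JavaScript memory only (closure or React state). Never localStorage, which any XSS payload can read.
  • Refresh token: httpOnly; Secure; SameSite=Strict cookie set by the backend, inaccessible from JS.
  • On page reload: the in-memory access token is gone. Get a new one from a backend refresh endpoint that reads the refresh-token cookie, or run a silent /authorize round with prompt=none, which relies on the IdP session cookie rather than the refresh token.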
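One way to keep the SPA access token in memory only is a closure, sketched below. The names `createTokenStore` and `TokenStore` are hypothetical, and the injectable clock exists only to make the expiry logic testable.

```typescript
// Sketch only: the token lives in a closure variable, not in localStorage,
// sessionStorage, or any global, so it disappears on page reload.
interface TokenStore {
  set(token: string, expiresInSeconds: number): void;
  get(): string | null;
  clear(): void;
}

function createTokenStore(now: () => number = () => Date.now()): TokenStore {
  let accessToken: string | null = null;
  let expiresAt = 0;
  return {
    set(token: string, expiresInSeconds: number): void {
      accessToken = token;
      expiresAt = now() + expiresInSeconds * 1000;
    },
    get(): string | null {
      // Drop the token once it has expired so callers fetch a fresh one.
      if (accessToken !== null && now() >= expiresAt) {
        accessToken = null;
      }
      return accessToken;
    },
    clear(): void {
      accessToken = null;
      expiresAt = 0;
    },
  };
}
```

A null result from get() is the signal to obtain a new access token. This narrows exposure rather than eliminating it: script injected into the same page can still call whatever code holds the store.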

Native mobile app:

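  • Access token: OS secure storage (iOS Keychain; on Android, storage encrypted with an Android Keystore-backed key). Do not leave it in long-lived process memory.
  • Refresh token: the same secure storage, never plain SharedPreferences or other unprotected storage.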

Server-side app:

  • Access token: server session (in-memory or Redis, not the client).
  • Refresh token: server session. Client never sees either token.

Machine-to-machine (CI/CD, service):

  • Use client_credentials grant, not user-delegated tokens.
  • Never embed user access tokens in CI pipelines.

Production observability

A minimum OAuth observability dashboard:

Metric | Alert condition
refresh_replay_detected_total | Any non-zero value → probable compromise
id_token_validation_failure_total by reason | Spike on sig/kid → JWKS rotation issue
token_introspection_latency_p99 | Above 200ms → IdP overloaded
jwks_cache_hit_ratio | Drop below 90% → cache TTL too short
dpop_proof_failure_total by reason | Spike on iat_skew → clock drift in mobile clients
token_request_total by outcome | Spike on invalid_grant → attack or misconfigured client

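refresh_replay_detected_total > 0 is the most actionable alert: it means a refresh token that had already been rotated out was presented again, so at least two parties hold tokens from the same chain. Unless a client retry bug explains it, someone stole a token.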
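How that counter might be fed can be sketched with refresh token rotation and family tracking. This is an in-memory illustration under stated assumptions, not any particular IdP's implementation: the class name, method names, and outcome labels are hypothetical, and a real server would persist this state and store token hashes rather than raw tokens.

```typescript
// Sketch only: each refresh token belongs to a family (one login). Redeeming
// a token that was already used counts as a replay and revokes the family.
type RefreshOutcome = "rotated" | "replay_detected" | "rejected";

class RefreshTokenFamilies {
  private tokens: { [token: string]: { familyId: string; used: boolean } } = {};
  private revokedFamilies: { [familyId: string]: boolean } = {};
  refreshReplayDetectedTotal = 0; // would be exported as refresh_replay_detected_total

  issue(token: string, familyId: string): void {
    this.tokens[token] = { familyId: familyId, used: false };
  }

  redeem(token: string, nextToken: string): RefreshOutcome {
    const record = this.tokens[token];
    if (record === undefined || this.revokedFamilies[record.familyId] === true) {
      return "rejected";
    }
    if (record.used) {
      // Second presentation of a rotated token: two holders exist.
      this.refreshReplayDetectedTotal++;
      this.revokedFamilies[record.familyId] = true;
      return "replay_detected";
    }
    record.used = true;
    this.issue(nextToken, record.familyId);
    return "rotated";
  }
}
```

Revoking the whole family means the legitimate client is logged out too, which is the intended trade-off: neither holder of the stolen chain keeps access.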

Four real-world OAuth failures

Facebook, September 2018. A bug in the “View As” feature exposed 50 million access tokens. Attackers could impersonate any of those users at any Facebook OAuth client (Spotify, Tinder, Instagram). Fix: Facebook reset the access tokens of roughly 90 million accounts, forcing those users to log in again.

Slack, February 2017. A misconfigured OAuth redirect URI in a third-party Slack app let an attacker phish a workspace owner’s authorization code. The attacker escalated to full admin access. Fix: exact-match redirect URI validation, enforced at the IdP level.

GitHub Enterprise, 2021. A bug in PKCE verification allowed downgrade attacks against older clients. Fixed in a patch release.

Microsoft Azure AD, November 2024. A token-validation cache poisoning vulnerability let a crafted token bypass aud validation for ~30 minutes on cached negative responses. Fixed within hours. Root cause: the negative-response cache did not distinguish “aud not found” from “aud explicitly rejected.”

The pattern: every incident involved one skipped or buggy mandatory check. The industry response to all four was the same — more mandatory checks, shorter TTLs, wider observability.

Quiz

A team wants to share one access token between two resource servers (API-A and API-B) for convenience. Why is this dangerous?

Quiz

Why should access tokens never be stored in localStorage in a browser SPA?

Order the steps

Order the steps to diagnose a spike in id_token validation failures:

  1. Check the failure_reason dimension on id_token_validation_failure_total
  2. If reason is 'sig' or 'kid-not-found', suspect JWKS key rotation
  3. Check jwks_cache_hit_ratio: a drop indicates stale cache from rotation
  4. Confirm by checking if the IdP published a new signing key recently
  5. Trigger immediate JWKS refresh across all instances
  6. Reduce JWKS cache TTL to 5–10 minutes and add on-cache-miss refresh as backstop
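
The last step can be sketched as a cache with a TTL plus a refresh when a kid is missing. The class name `JwksCache`, the `KeySet` type, and the injected `fetchKeys` function are hypothetical. The fetch is synchronous to keep the sketch short, whereas a real implementation would fetch the IdP's JWKS endpoint asynchronously and rate-limit refreshes triggered by unknown kids.

```typescript
// Sketch only: JWKS cache with TTL expiry and refresh on cache miss.
type KeySet = { [kid: string]: string }; // kid -> key material (placeholder strings)

class JwksCache {
  private keys: KeySet = {};
  private fetchedAt = -Infinity;
  hits = 0;
  misses = 0;

  constructor(
    private fetchKeys: () => KeySet,
    private ttlMs: number,
    private now: () => number = () => Date.now()
  ) {}

  getKey(kid: string): string | undefined {
    const expired = this.now() - this.fetchedAt >= this.ttlMs;
    if (!expired && this.keys[kid] !== undefined) {
      this.hits++;
      return this.keys[kid];
    }
    // TTL elapsed or unknown kid: refetch once, then answer from fresh data.
    this.misses++;
    this.refresh();
    return this.keys[kid];
  }

  refresh(): void {
    this.keys = this.fetchKeys();
    this.fetchedAt = this.now();
  }

  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 1 : this.hits / total; // feeds jwks_cache_hit_ratio
  }
}
```

After a rotation, the first token signed with the new kid triggers one refetch and validates, so instances recover without waiting for the TTL to lapse.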
Recall before you leave
  1. Explain the audience confusion attack and the correct aud check that prevents it.
  2. Why is refresh_replay_detected_total > 0 a compromise indicator rather than a harmless error?
  3. What observability gap does short access token TTL (5–15 min) address?
Recap

Audience validation checks for one hard-coded identifier: the token’s aud must contain this resource server’s own identifier, and the expected value is never derived from the request. Token storage follows the client’s threat model: JS memory for SPAs, OS keychain for native, server session for server-side. Production observability must expose refresh_replay_detected_total (compromise signal), JWKS cache hit ratio (rotation readiness), and introspection latency (IdP health). The four major OAuth incidents (Facebook 2018, Slack 2017, GitHub Enterprise 2021, Azure AD 2024) each followed from one skipped or buggy mandatory check. OAuth security is a complete-set problem: every check must pass, every time.

