Auth & Security
Olly uses Keycloak as its OpenID Connect (OIDC) and SAML 2.0 identity provider, deployed self-hosted on EKS in the private subnet. All user-facing authentication flows use the Authorization Code flow with PKCE (Proof Key for Code Exchange); the OAuth 2.0 implicit flow is not used anywhere. Internal machine-to-machine calls use Client Credentials, with secrets rotated every 24 hours by OpenBao.
Keycloak is never exposed directly to the internet. APISIX proxies the /auth/ path for external clients. All internal services validate JWTs using the Keycloak JWKS endpoint, which is cached in service memory with a 5-minute TTL.
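The 5-minute JWKS cache can be sketched as follows. This is a minimal illustration, not the services' actual implementation: the class name `JWKSCache`, the injected `fetch` callable, and the choice to serve a stale copy when a refresh fails are assumptions made for the example (the gateway is described as falling back to its cached copy; whether each service does the same is not stated).

```python
import time
from typing import Callable, Optional


class JWKSCache:
    """In-memory JWKS cache with a TTL and stale-on-error fallback (illustrative)."""

    def __init__(self, fetch: Callable[[], dict], ttl_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # hypothetical callable that downloads the JWKS document
        self._ttl = ttl_s            # 300 s = the 5-minute TTL described above
        self._clock = clock
        self._keys: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._keys is not None and now - self._fetched_at < self._ttl:
            return self._keys        # still fresh, no network call
        try:
            self._keys = self._fetch()
            self._fetched_at = now
        except Exception:
            # Refresh failed: keep serving the previous copy if there is one.
            # _fetched_at is left unchanged, so the next call retries the fetch.
            if self._keys is None:
                raise
        return self._keys
```

Injecting the clock and the fetch function keeps the cache testable without a running identity provider.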
Realm Structure
The platform uses four Keycloak realms to enforce hard isolation between user populations. A token issued in one realm is invalid in any other realm's context.
| Realm | Users | Clients | Auth Flow | MFA |
|---|---|---|---|---|
| olly-members | Individual members, employer HR admins | olly-mobile, olly-member-portal, olly-employer-portal | PKCE Authorization Code | Optional for members; required for EMPLOYER_ADMIN |
| olly-providers | Licensed providers (NPI holders), billing admins | olly-provider-portal | PKCE Authorization Code | Required for all provider users (TOTP) |
| olly-internal | Olly operations staff, claims adjusters, underwriters, system admins | olly-admin-console | PKCE Authorization Code | Required for all users (TOTP; FIDO2/passkeys in P3) |
| olly-services | None (M2M only) | One client per Go service | Client Credentials | N/A (client secrets + mTLS) |
Why four realms? A compromised member token cannot be used in the internal admin context. Each realm enforces independent session policies, MFA requirements, and IP restrictions. The operational overhead is justified by the security boundary enforcement.
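One way to picture the realm boundary is a check on the token's `iss` claim, sketched below. It assumes Keycloak-style issuer URLs ending in `/realms/<realm>`; the host name used in the usage note and both function names are hypothetical. The check is only meaningful after the token's signature has been verified against that realm's keys.

```python
from urllib.parse import urlparse


def realm_from_issuer(issuer: str) -> str:
    """Return the realm name from an issuer URL of the form .../realms/<realm>."""
    parts = [p for p in urlparse(issuer).path.split("/") if p]
    if len(parts) < 2 or parts[-2] != "realms":
        raise ValueError(f"not a realm issuer: {issuer!r}")
    return parts[-1]


def token_allowed_in_realm(claims: dict, expected_realm: str) -> bool:
    """True only if the (already signature-verified) claims come from expected_realm."""
    try:
        return realm_from_issuer(claims.get("iss", "")) == expected_realm
    except ValueError:
        return False
```

For example, claims with `iss` set to `https://idp.example.internal/realms/olly-members` pass for `olly-members` and are rejected for `olly-internal`.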
Auth Flow per Client Type
| Client Type | Realm | PKCE | Token Storage | Session Enforcer |
|---|---|---|---|---|
| Mobile app (Expo RN) | olly-members | Yes (expo-auth-session, RFC 7636) | Expo SecureStore (iOS Keychain / Android Keystore) | Biometric gate on SecureStore |
| Web portals (Next.js) | olly-members / olly-providers | Yes (next-auth v5) | httpOnly Secure SameSite=Strict cookie; token never in JS | next-auth server-side session |
| Admin console (React SPA) | olly-internal | Yes (via thin Next.js BFF) | httpOnly Secure cookie set by BFF | BFF session check + OPA IP restriction |
| Go services (M2M) | olly-services | No (Client Credentials) | In-memory with TTL buffer; sourced from OpenBao | mTLS between all service pods |
Mobile PKCE Flow (Expo React Native)
The app generates a code_verifier (43–128 random chars) and code_challenge (SHA-256, base64url), redirects to Keycloak, and exchanges the authorization code for tokens. Access tokens are stored exclusively in Expo SecureStore — never in AsyncStorage, memory logs, or JavaScript variables. Biometric verification (Face ID / Touch ID via Expo LocalAuthentication) gates access to SecureStore on every app open and before sensitive actions (claim submission, EOB viewing). Biometric checks are local-only and do not involve a server round-trip.
Web Portal BFF Pattern (Next.js)
next-auth v5 handles the PKCE exchange server-side and stores the session in an httpOnly, Secure, SameSite=Strict cookie. Client-side JavaScript never sees a raw JWT. Next.js Server Components and Route Handlers use auth() to read the session; client components call Next.js API routes that attach the Bearer token to upstream service calls. Silent token refresh happens transparently via next-auth before access token expiry.
Service-to-Service (Client Credentials)
Go services obtain access tokens using Client Credentials at startup. The Keycloak client_secret is never in environment variables — it is injected into a local memory-mapped file by the OpenBao agent sidecar. OpenBao rotates secrets every 24 hours. Each service caches its access token with a TTL of (expires_in - 30s) and re-fetches transparently on expiry.
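The caching rule (expires_in - 30s) can be illustrated as below. The production services are written in Go; this Python sketch only shows the logic, and `TokenCache` and the injected `fetch` callable (standing in for the Client Credentials request to Keycloak) are hypothetical names.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token until 30 seconds before its stated expiry."""

    def __init__(self, fetch: Callable[[], Tuple[str, int]],
                 clock: Callable[[], float] = time.monotonic,
                 safety_margin_s: int = 30):
        self._fetch = fetch              # returns (access_token, expires_in_seconds)
        self._clock = clock
        self._margin = safety_margin_s
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            # Treat the token as expired 30 s early so in-flight calls never
            # carry a token that lapses before it reaches the upstream service.
            self._expires_at = now + max(expires_in - self._margin, 0)
        return self._token
```

With a 15-minute access token (expires_in = 900), the cache serves the same token for 870 seconds and fetches a new one on the next call after that.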
JWT Validation Flow
Inbound request
│
▼
APISIX Gateway
1. Extract Bearer token from Authorization header
2. Fetch Keycloak JWKS (cached 5 min, fallback to cached copy on failure)
3. Verify signature (RS256), issuer, audience, expiry
4. OPA plugin: coarse-grained role + path check
5. Forward request with X-User-ID, X-Member-ID, X-Roles headers
│
▼
Go service pod
6. auth.JWTMiddleware: re-validate token (defense in depth; JWKS cached in-process)
7. OPA sidecar: fine-grained ABAC check (e.g., does member_id match the resource?)
8. Handler executes
9. PHI audit log written to OpenSearch
Internal routes (/internal/*) skip steps 1–7 at the service layer; they are accessible only within the Istio service mesh (enforced via NetworkPolicy and AuthorizationPolicy).
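The claim checks in step 3 and the ownership check in step 7 can be sketched as plain functions over an already signature-verified claims dictionary. This is an illustration only: in the platform the fine-grained check is an OPA policy, and the `member_id` claim name, the audience value in the usage note, and both function names are assumptions.

```python
from typing import List, Optional


def validate_claims(claims: dict, issuer: str, audience: str, now: float) -> List[str]:
    """Return a list of problems; an empty list means the claims are acceptable."""
    problems = []
    if claims.get("iss") != issuer:
        problems.append("issuer mismatch")
    aud = claims.get("aud")
    # The aud claim may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        problems.append("audience mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        problems.append("token expired or exp missing")
    return problems


def member_may_access(claims: dict, resource_member_id: Optional[str]) -> bool:
    """Ownership check: the caller's member_id must equal the resource's member_id."""
    member_id = claims.get("member_id")
    return member_id is not None and member_id == resource_member_id
```

A request would pass only if `validate_claims(...)` returns an empty list and `member_may_access(...)` is true for the resource being read.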
OPA Roles
Roles are defined in Keycloak realm roles and referenced in OPA Rego policies. A user may hold multiple roles. OPA decisions are logged to the opa-decisions OpenSearch index; denied decisions include the policy package, rule, and input.
| Role | Realm | Permissions Summary |
|---|---|---|
| MEMBER | olly-members | View own claims, submit claims, view EOBs, manage own enrollment, view coverage details |
| EMPLOYER_ADMIN | olly-members | All MEMBER permissions for their group, manage group enrollment, view group census, billing overview |
| PROVIDER | olly-providers | View claims for their patients (filtered by NPI), submit 837 EDI, view prior auth and credentialing status |
| PROVIDER_BILLING_ADMIN | olly-providers | All PROVIDER permissions plus payment/835 remittance history and billing contact management |
| CLAIMS_ADJUSTER | olly-internal | Read all claims, adjudicate, approve/deny prior auths, view member PHI (audit-logged), generate EOBs |
| UNDERWRITER | olly-internal | View actuarial data, rate tables, enrollment analytics, read-only claims aggregates |
| NETWORK_MANAGER | olly-internal | Manage provider credentialing workflows, approve/deny network contracts, update provider directory |
| SYSTEM_ADMIN | olly-internal | All permissions, Keycloak realm administration, OPA policy deployment, OpenBao secret management |
Principle of least privilege: CLAIMS_ADJUSTER cannot modify enrollment records. EMPLOYER_ADMIN cannot view claim detail beyond their own group. PROVIDER can only see claims where their NPI appears as the rendering or billing provider.
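The PROVIDER rule can be pictured as a simple predicate. The real decision is made by OPA Rego policies; the Python below is only a sketch of the same condition, and the field names `rendering_provider_npi` and `billing_provider_npi` are hypothetical.

```python
def provider_can_see_claim(provider_npi: str, claim: dict) -> bool:
    """True if the provider's NPI is the rendering or billing NPI on the claim."""
    if not provider_npi:
        return False  # a missing NPI must never match a claim with missing fields
    return provider_npi in (
        claim.get("rendering_provider_npi"),
        claim.get("billing_provider_npi"),
    )
```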
Session TTLs
| Parameter | Mobile | Web (Member / Provider) | Web (Admin / Internal) |
|---|---|---|---|
| Access token TTL | 15 min | 15 min | 15 min |
| Refresh token TTL | 24 h | 8 h | 4 h |
| Max session duration | 30 days (with activity) | 8 h (hard) | 4 h (hard) |
| Idle timeout | 30 min | 30 min | 15 min |
| Token storage | Expo SecureStore | httpOnly cookie | httpOnly cookie |
| Refresh token rotation | Yes | Yes | Yes |
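Read literally, the idle-timeout and maximum-duration rows amount to the check sketched below. Keycloak enforces these limits through realm and client session settings; the Python is only an illustration of the rule, and the client keys (`mobile`, `web_portal`, `admin`) and function name are made up for the example.

```python
from datetime import datetime, timedelta

# (idle timeout, max session duration) per client type, taken from the table above.
SESSION_LIMITS = {
    "mobile": (timedelta(minutes=30), timedelta(days=30)),
    "web_portal": (timedelta(minutes=30), timedelta(hours=8)),
    "admin": (timedelta(minutes=15), timedelta(hours=4)),
}


def session_expired(client: str, started_at: datetime,
                    last_activity_at: datetime, now: datetime) -> bool:
    """True once either the idle timeout or the maximum duration has been reached."""
    idle_timeout, max_duration = SESSION_LIMITS[client]
    idle_for = now - last_activity_at
    age = now - started_at
    return idle_for >= idle_timeout or age >= max_duration
```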
Refresh token rotation: Every use of a refresh token issues a new refresh token and invalidates the old one. If a previously used refresh token is presented again (which suggests it was stolen), Keycloak invalidates the entire session family. Forced re-authentication is triggered on: password change, account compromise action, reaching the 30-day maximum session duration on mobile, and high-sensitivity endpoints (e.g., SSN viewing) that require max_age=0.
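The rotation and reuse-detection behavior can be modeled with a small in-memory store. This is a conceptual model of the mechanism, not Keycloak's implementation; the class and method names are hypothetical.

```python
import secrets
from typing import Dict, Optional


class RotationStore:
    """Toy model of refresh token rotation with reuse detection."""

    def __init__(self) -> None:
        self._family_of: Dict[str, str] = {}   # every token ever issued -> family id
        self._current: Dict[str, str] = {}     # family id -> the one valid token

    def start_session(self) -> str:
        return self._issue(secrets.token_hex(8))

    def _issue(self, family: str) -> str:
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family
        self._current[family] = token          # the previous token stops being current
        return token

    def refresh(self, token: str) -> Optional[str]:
        """Return a new refresh token, or None if the request must be rejected."""
        family = self._family_of.get(token)
        if family is None or family not in self._current:
            return None                        # unknown token or revoked family
        if self._current[family] != token:
            # An already-rotated token was replayed: revoke the whole family.
            del self._current[family]
            return None
        return self._issue(family)
```

After one successful refresh, replaying the old token returns None and also revokes the family, so the legitimate holder of the newest token is forced to re-authenticate as well.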
PHI Audit Logging
Every API response that includes PHI fields is audit-logged to satisfy HIPAA §164.312(b) (Audit Controls). Each log entry records:
| Field | Source |
|---|---|
| timestamp | Request timestamp (UTC) |
| request_id | Unique ID generated by APISIX, propagated via X-Request-ID |
| actor_id | JWT sub claim (Keycloak user UUID) |
| actor_type | MEMBER, PROVIDER, INTERNAL, or SERVICE |
| actor_roles | JWT realm_access.roles |
| client_ip | Original client IP from X-Forwarded-For (set by APISIX) |
| endpoint | URL path (no query params to avoid leaking filter values) |
| resource_type | CLAIM, MEMBER, EOB, PRIOR_AUTH, etc. |
| resource_id | UUID of the accessed resource |
| member_id | Member whose PHI was accessed (may differ from actor) |
| phi_fields_returned | Array of PHI field names in the response |
| opa_decision | ALLOW or DENY and the matched policy rule |
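Assembling one entry from the fields above might look like the sketch below. The function name and parameter names are hypothetical, and the shape of `opa_decision` is an assumption; the point is mainly to show the endpoint being reduced to its path so query parameters never reach the log.

```python
from datetime import datetime, timezone
from typing import List, Optional
from urllib.parse import urlsplit


def build_audit_entry(*, request_id: str, claims: dict, actor_type: str,
                      client_ip: str, url: str, resource_type: str,
                      resource_id: str, member_id: str, phi_fields: List[str],
                      opa_decision: dict, now: Optional[datetime] = None) -> dict:
    """Build one PHI audit record; `now` must be timezone-aware if provided."""
    ts = now or datetime.now(timezone.utc)
    return {
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "request_id": request_id,
        "actor_id": claims["sub"],
        "actor_type": actor_type,
        "actor_roles": claims.get("realm_access", {}).get("roles", []),
        "client_ip": client_ip,
        "endpoint": urlsplit(url).path,   # path only: query string is dropped
        "resource_type": resource_type,
        "resource_id": resource_id,
        "member_id": member_id,
        "phi_fields_returned": list(phi_fields),
        "opa_decision": opa_decision,
    }
```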
Log destinations:
- AWS CloudTrail: API call-level logging for AWS SDK calls (S3 object access, KMS decrypt). Stored in S3 with S3 Object Lock (WORM mode), 6-year retention.
- OpenSearch audit-logs index: Application-level PHI access logs. Indexed by member_id, actor_id, timestamp, resource_type. 6-year retention via ILM policy. Encrypted at rest with KMS CMK.
- Grafana Loki: Operational log stream. 90-day retention (operational use only, not compliance).
Access to the audit-logs index is restricted to SYSTEM_ADMIN role and automated log-writing service accounts.