System Overview
Olly is a HIPAA-native, mobile-first, self-hosted AI health insurance platform built for employer groups and individual markets. It processes medical claims, manages member enrollment and eligibility, handles provider credentialing and contracting, and provides an AI-assisted adjudication layer, all within private AWS infrastructure designed to satisfy HIPAA, ACA, and state regulatory requirements. The backend is a Go microservices monorepo; every service owns its own PostgreSQL database and communicates asynchronously through Amazon MSK (Kafka). All LLM inference runs on self-hosted GPU nodes inside a private VPC, so no PHI reaches a third-party inference endpoint.
System Topology
Design Principles
HIPAA-First
PHI is encrypted at every layer: Aurora storage encryption with a per-tenant KMS customer managed key (CMK), TLS 1.3 for all traffic in transit, and application-level field encryption for the most sensitive attributes. Every AWS service in use is covered by a signed Business Associate Agreement (BAA). The network uses a zero-trust model: Istio enforces mTLS between all service pods using SPIFFE/SVID workload identities, and OPA policies are evaluated both at the APISIX gateway and inside each service. AI inference is isolated in a dedicated VPC with no external egress; NeMo Guardrails and Presidio redact PHI before any data reaches an LLM.
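As an illustration of the application-level field encryption mentioned above, here is a minimal Go sketch using AES-256-GCM from the standard library. The source does not say which cipher, key hierarchy, or storage encoding Olly uses, so `FieldCipher`, the base64 encoding, and the `tenant/column` associated-data convention are all assumptions made for the example. The data key is assumed to be unwrapped elsewhere from the tenant's KMS key.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// FieldCipher encrypts individual column values with AES-256-GCM.
// Hypothetical type: the source does not specify cipher or key handling.
type FieldCipher struct {
	aead cipher.AEAD
}

// NewFieldCipher expects a 32-byte data key (assumed to come from a
// per-tenant KMS key; key retrieval is out of scope for this sketch).
func NewFieldCipher(key []byte) (*FieldCipher, error) {
	if len(key) != 32 {
		return nil, errors.New("field cipher: key must be 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &FieldCipher{aead: aead}, nil
}

// Encrypt returns base64(nonce || ciphertext). The aad string is
// authenticated but not encrypted; here it binds a value to its
// tenant and column so ciphertexts cannot be swapped between them.
func (c *FieldCipher) Encrypt(plaintext, aad string) (string, error) {
	nonce := make([]byte, c.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := c.aead.Seal(nonce, nonce, []byte(plaintext), []byte(aad))
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// Decrypt reverses Encrypt and fails if the value or aad was altered.
func (c *FieldCipher) Decrypt(encoded, aad string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	n := c.aead.NonceSize()
	if len(raw) < n {
		return "", errors.New("field cipher: ciphertext too short")
	}
	plain, err := c.aead.Open(nil, raw[:n], raw[n:], []byte(aad))
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := make([]byte, 32) // demo only: a random throwaway key
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	fc, err := NewFieldCipher(key)
	if err != nil {
		panic(err)
	}
	enc, err := fc.Encrypt("123-45-6789", "tenant-a/member.ssn")
	if err != nil {
		panic(err)
	}
	dec, err := fc.Decrypt(enc, "tenant-a/member.ssn")
	fmt.Println(dec == "123-45-6789", err)

	// Decrypting under another tenant's context is rejected.
	_, err = fc.Decrypt(enc, "tenant-b/member.ssn")
	fmt.Println(err != nil)
}
```

Binding the tenant and column name as associated data is one possible design choice: it costs nothing in storage and makes a ciphertext copied into the wrong row or tenant fail authentication instead of decrypting silently.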
OSS-First
Open source by default. AWS managed services are used only for the data plane (Aurora, MSK, ElastiCache, OpenSearch) where a HIPAA BAA and operational simplicity justify the cost. The entire identity, authorization, secret management, service mesh, observability, and AI stack is open source. Enterprise licenses are limited to Vanta (compliance automation), Snowflake (actuarial analytics, P5), and Twilio/SendGrid (communications APIs).
Event-Driven Core
Amazon MSK (Kafka) backs the claims pipeline, enrollment saga, and billing workflows. The outbox pattern keeps database writes and Kafka publishes consistent: a service inserts each event into an outbox table inside the same transaction as its state change, and a background worker publishes the row to Kafka and deletes it once the publish succeeds. Delivery is therefore at-least-once, so consumers must handle duplicates idempotently. The saga pattern coordinates multi-step workflows (enrollment → activation → first invoice; claim → adjudication → payment) without distributed transactions. All topics use Avro encoding with backward-compatible schema evolution via Confluent Schema Registry.
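The outbox flow described above can be sketched in Go as follows. This is a simplified model, not Olly's implementation: `Store` is an in-memory stand-in for a service's PostgreSQL database (a mutex plays the role of the transaction), `Publisher` abstracts the Kafka producer, the `claims.events` topic name is hypothetical, and payloads are plain strings instead of Avro.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// OutboxRow is one event waiting to be published.
type OutboxRow struct {
	ID      int64
	Topic   string
	Payload string
}

// Store is an in-memory stand-in for a service database (assumption).
type Store struct {
	mu     sync.Mutex
	claims map[string]string // claim ID -> status
	outbox []OutboxRow
	nextID int64
}

func NewStore() *Store { return &Store{claims: map[string]string{}} }

// SaveClaim writes the state change and its event under one lock,
// mimicking a single database transaction: both happen or neither.
func (s *Store) SaveClaim(claimID, status string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.claims[claimID] = status
	s.nextID++
	s.outbox = append(s.outbox, OutboxRow{
		ID: s.nextID, Topic: "claims.events", Payload: claimID + ":" + status,
	})
}

// Publisher abstracts the Kafka producer.
type Publisher interface {
	Publish(topic, payload string) error
}

// Relay publishes pending rows in insertion order and deletes a row
// only after its publish succeeds. On failure the remaining rows stay
// in the outbox for the next run, which gives at-least-once delivery.
func (s *Store) Relay(p Publisher) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	sent := 0
	for len(s.outbox) > 0 {
		row := s.outbox[0]
		if err := p.Publish(row.Topic, row.Payload); err != nil {
			return sent, err
		}
		s.outbox = s.outbox[1:]
		sent++
	}
	return sent, nil
}

// Pending reports how many events are still unpublished.
func (s *Store) Pending() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.outbox)
}

// flakyPublisher fails its first failN calls to simulate a broker outage.
type flakyPublisher struct {
	failN int
	sent  []string
}

func (f *flakyPublisher) Publish(topic, payload string) error {
	if f.failN > 0 {
		f.failN--
		return errors.New("broker unavailable")
	}
	f.sent = append(f.sent, topic+" "+payload)
	return nil
}

func main() {
	s := NewStore()
	s.SaveClaim("C-1", "RECEIVED")
	s.SaveClaim("C-1", "ADJUDICATED")

	p := &flakyPublisher{failN: 1}
	n, err := s.Relay(p)
	fmt.Println(n, err, s.Pending()) // 0 broker unavailable 2

	n, err = s.Relay(p)
	fmt.Println(n, err, s.Pending()) // 2 <nil> 0
}
```

In a real service the relay would read and delete rows through SQL, and a crash between a successful publish and the delete would cause the same event to be sent again on restart. That is the reason consumers of such a pipeline need to deduplicate, for example on the outbox row ID.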
Mobile-First UX
Expo + React Native delivers iOS and Android from a single TypeScript codebase with OTA updates, biometric authentication via Expo LocalAuthentication, and offline-capable claim submission backed by Expo SQLite / WatermelonDB. Next.js 15 powers the member, employer, and provider web portals with server-side rendering and httpOnly session cookies. Push notifications through Firebase Cloud Messaging (FCM) deliver adjudication results, payment confirmations, and enrollment notices to members' devices.
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Mobile | Expo + React Native + Expo Router | Single iOS/Android codebase, OTA updates, biometric auth, offline SQLite |
| Web portals | Next.js 15 + React 19 | SSR for SEO and performance, next-auth v5 PKCE session |
| Admin console | React 19 + TanStack Router | SPA with role-based route guards, no SSR needed for internal ops |
| API gateway | Apache APISIX | Plugin ecosystem, native OPA integration, FHIR routing, rate limiting |
| Backend services | Go (Golang) | High concurrency, low memory footprint, type-safe SQL with sqlc |
| Long-running workflows | Temporal | Durable execution for COBRA timelines, credentialing, enrollment sagas |
| EDI translation | Mirth Connect | Industry-standard 837/835/834/270/271 EDI parsing |
| FHIR | HAPI FHIR R4 | CMS interoperability rule compliance |
| Event bus | Amazon MSK (Kafka) | Managed Kafka with HIPAA BAA, durable event replay, per-service consumer groups |
| Primary DB | Aurora PostgreSQL | Managed multi-AZ, HIPAA BAA, compatible with sqlc + pgx |
| Cache | ElastiCache Valkey | Sub-millisecond session and eligibility cache |
| Search | OpenSearch | Provider directory full-text search, ICD-10/CPT lookup, audit log analytics |
| Vector search | Qdrant | Self-hosted on EKS; PHI never leaves the private VPC; RAG for policy Q&A |
| Service mesh | Istio + Envoy | mTLS everywhere, SPIFFE workload identity, traffic management |
| Secrets | OpenBao | Vault-compatible OSS fork; dynamic DB credentials, 24h secret rotation |
| Identity | Keycloak | OSS OIDC/SAML IdP, multi-realm isolation, FIDO2/passkeys, TOTP MFA |
| Authorization | Open Policy Agent | Rego RBAC/ABAC at APISIX gateway and in-service sidecars |
| AI inference | vLLM + LiteLLM | Self-hosted PagedAttention, LiteLLM as unified API with cost-based routing |
| PHI filtering | NeMo Guardrails + Presidio | PHI/PII redaction before LLM; hallucination detection for clinical outputs |
| Observability | Prometheus + Grafana + Loki + Tempo | Full OSS stack; correlated metrics, logs, and traces via OpenTelemetry SDK |
| IaC | OpenTofu + Crossplane | OSS Terraform alternative + K8s-native AWS resource management |
| GitOps | GitHub Actions + Argo CD | CI pipeline with SBOM generation and image signing; declarative cluster sync |
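
To make the Authorization row more concrete, the sketch below shows how a Go service might ask an OPA sidecar for a decision through OPA's REST Data API (`POST /v1/data/<path>` with an `input` document, answered with a `result` field). The policy path `olly/authz/allow`, the `AuthzInput` fields, and the role name are hypothetical; the source does not describe Olly's policy layout. A local `httptest` server stands in for the sidecar so the example runs on its own.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// AuthzInput is a hypothetical input document for the policy.
type AuthzInput struct {
	Subject  string `json:"subject"`
	Role     string `json:"role"`
	Action   string `json:"action"`
	Resource string `json:"resource"`
}

// parseDecision reads OPA's {"result": <bool>} response. OPA omits
// "result" when the rule is undefined, which is treated as deny.
func parseDecision(body []byte) (bool, error) {
	var resp struct {
		Result *bool `json:"result"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	if resp.Result == nil {
		return false, nil
	}
	return *resp.Result, nil
}

// Allowed queries the policy and fails closed on any error.
func Allowed(ctx context.Context, client *http.Client, opaURL string, in AuthzInput) (bool, error) {
	payload, err := json.Marshal(map[string]AuthzInput{"input": in})
	if err != nil {
		return false, err
	}
	// Policy path below is an assumption made for this sketch.
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		opaURL+"/v1/data/olly/authz/allow", bytes.NewReader(payload))
	if err != nil {
		return false, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("opa: unexpected status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return parseDecision(body)
}

func main() {
	// Fake sidecar: allows claims adjusters to read, denies the rest.
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			Input AuthzInput `json:"input"`
		}
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		allow := req.Input.Role == "claims_adjuster" && req.Input.Action == "read"
		json.NewEncoder(w).Encode(map[string]bool{"result": allow})
	}))
	defer fake.Close()

	ctx := context.Background()
	ok, err := Allowed(ctx, fake.Client(), fake.URL,
		AuthzInput{Subject: "u1", Role: "claims_adjuster", Action: "read", Resource: "claim/C-1"})
	fmt.Println(ok, err)

	ok, err = Allowed(ctx, fake.Client(), fake.URL,
		AuthzInput{Subject: "u2", Role: "member", Action: "read", Resource: "claim/C-1"})
	fmt.Println(ok, err)
}
```

Treating an undefined result, a non-200 status, or a transport error as a denial keeps the check fail-closed, which is the usual posture for services that handle PHI.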