Event Flow
Olly's asynchronous backbone is Amazon MSK (Kafka). Events flow between services using the choreography-based saga pattern — no central orchestrator, just producers and consumers reacting to domain events. All messages are Avro-encoded with schema compatibility enforced by Confluent Schema Registry deployed on MSK.
Kafka Topology
Topic Configuration Defaults
Unless noted otherwise, all topics use these settings:
| Parameter | Default | Notes |
|---|---|---|
replication.factor | 3 | All MSK clusters run 3+ brokers |
min.insync.replicas | 2 | Producer acks=all; ensures durability on broker failure |
retention.ms | 604800000 (7 days) | Sufficient for consumer lag recovery |
compression.type | lz4 | Good ratio/speed balance for Avro payloads |
max.message.bytes | 1 MB | Increased for EDI topics with large raw payloads |
partitions | 6 | Adjusted up for high-volume topics |
High-volume topics (claims.claim.submitted, billing.invoice.generated) use 12 partitions and 14-day retention. EDI archive topics use 30-day retention.
Consumer Group Naming
Consumer groups follow the pattern {service}-{domain}-consumer. Examples:
billing-claims-consumer— Billing Service consuming from claims topicsnotifications-enrollment-consumer— Notifications Service consuming from enrollment topicseligibility-enrollment-consumer— Eligibility Service consuming from enrollment topics
Schema Registry
All Kafka messages use Avro encoding registered with Confluent Schema Registry. Compatibility mode is BACKWARD — new schema versions must be readable by consumers running the previous schema version. This allows producers to upgrade first without coordinating consumer deployments. Go services use the github.com/riferrei/srclient library; schema IDs are embedded in the 5-byte Confluent wire format prefix of each message.
Event Catalog
Claims Domain
| Topic | Producer | Consumers | Partitions | Retention |
|---|---|---|---|---|
claims.claim.submitted | Claims | Billing, Notifications | 12 | 14 days |
claims.claim.adjudicated | Claims | Billing, Notifications, Mirth Connect | 12 | 14 days |
claims.claim.paid | Billing (primary), Claims (mirror) | Notifications, OpenSearch | 6 | 7 days |
claims.prior_auth.decision | Claims | Notifications, Eligibility | 6 | 7 days |
claims.appeal.resolved | Claims | Notifications, Billing | 6 | 7 days |
claims.claim.adjudicated event types: CLAIM_APPROVED, CLAIM_DENIED
Key fields on approval: claimId, allowedAmount, paidAmount, memberResponsibility, copayApplied, eobS3Key
Key fields on denial: claimId, denialReasonCode (X12 code, e.g. CO-4), denialReasonText, appealDeadline
Eligibility Domain
| Topic | Producer | Consumers | Retention |
|---|---|---|---|
eligibility.coverage.verified | Eligibility | Claims | 7 days |
eligibility.coverage.terminated | Eligibility | Claims, Notifications | 7 days |
eligibility.coverage.verified carries accumulator balances (deductibleRemaining, oopMaxRemaining), network tier, and coverage dates — the Claims Service uses this to compute member cost-sharing without a synchronous lookup.
Enrollment Domain
| Topic | Producer | Consumers | Retention |
|---|---|---|---|
enrollment.enrollment.submitted | Enrollment | Eligibility, Billing, Notifications | 7 days |
enrollment.enrollment.activated | Enrollment | Eligibility, Billing, Notifications | 7 days |
enrollment.enrollment.terminated | Enrollment | Eligibility, Billing, Notifications | 7 days |
enrollment.enrollment.plan_changed | Enrollment | Eligibility, Billing, Notifications | 7 days |
enrollment.cobra.notice_sent | Enrollment | Notifications | 7 days |
enrollment.cobra.elected | Enrollment | Eligibility, Billing, Notifications | 7 days |
enrollment.enrollment.activated is the key saga trigger: it activates the pending coverage record in Eligibility, starts the recurring invoice schedule in Billing, and sends the enrollment confirmation in Notifications.
Billing Domain
| Topic | Producer | Consumers | Partitions | Retention |
|---|---|---|---|---|
billing.invoice.generated | Billing | Notifications | 12 | 14 days |
billing.payment.received | Billing | Enrollment, Notifications | 6 | 7 days |
billing.payment.missed | Billing | Enrollment, Notifications | 6 | 7 days |
billing.payment.completed | Billing | Claims, Notifications | 6 | 7 days |
billing.payment.missed carries three event types: PAYMENT_MISSED, GRACE_PERIOD_STARTED, and GRACE_PERIOD_EXPIRED. The GRACE_PERIOD_EXPIRED event triggers Enrollment Service to begin the COBRA timeline.
Provider Domain
| Topic | Producer | Consumers | Retention |
|---|---|---|---|
provider.credentialing.status_changed | Provider | Claims, Eligibility, Notifications | 7 days |
provider.network.updated | Provider | Claims, Eligibility | 7 days |
provider.credentialing.status_changed event types: CREDENTIALING_APPROVED, CREDENTIALING_DENIED, CREDENTIALING_EXPIRED, CREDENTIALING_SUSPENDED. Claims and Eligibility update their local projection tables on receipt.
EDI Domain
| Topic | Producer | Consumers | Retention |
|---|---|---|---|
edi.inbound.received | Mirth Connect | Claims, Enrollment, Eligibility | 30 days |
edi.outbound.generated | Mirth Connect | OpenSearch (audit) | 30 days |
EDI topics use 30-day retention because raw X12 transaction archives may be needed for clearinghouse dispute resolution. The payloadS3Key field in each event points to the raw EDI file stored in S3 with SSE-KMS encryption.
Outbox Pattern
Every Kafka publish is made safe against partial failures by the transactional outbox pattern:
The outbox worker runs as a background goroutine in each service. If the service restarts between the Kafka produce and the outbox delete, the message will be re-published on restart. Consumers handle duplicate delivery using an idempotency key stored in ElastiCache (TTL = 24 hours): if a claimId + eventType pair is already in the set, the consumer skips processing and commits the offset.
Dead Letter Queue (DLQ) Pattern
Every Kafka consumer implements exponential-backoff retry followed by DLQ routing on persistent failure:
Normal Flow:
Topic → Consumer → Process → Commit offset
Failure Flow (after 3 retries with backoff: 1s, 2s, 4s):
Topic → Consumer → Process FAILS
│
▼
DLQ Topic: {original-topic}.dlq (30-day retention)
│
▼
DLQ Consumer: alert to Grafana OnCall + write to OpenSearch for manual reviewDLQ messages wrap the original payload with failure metadata: originalTopic, originalPartition, originalOffset, failureReason, retryCount, and consumerGroup. There is no DLQ of the DLQ — messages that fail in the DLQ consumer are logged to OpenSearch and alert to Grafana OnCall for manual intervention.
Enrollment Saga
The enrollment saga coordinates new member activation from submission through to active coverage and first invoice:
Compensation path: If the carrier sends a 999 rejection for the 834 transaction, Enrollment Service publishes enrollment.enrollment.cancelled. Eligibility deletes the pending coverage record, Billing cancels the staged invoice, and Notifications sends an error message to the member.