Skip to content

Event Flow

Olly's asynchronous backbone is Amazon MSK (Kafka). Events flow between services using the choreography-based saga pattern — no central orchestrator, just producers and consumers reacting to domain events. All messages are Avro-encoded with schema compatibility enforced by Confluent Schema Registry deployed on MSK.

Kafka Topology

Topic Configuration Defaults

Unless noted otherwise, all topics use these settings:

ParameterDefaultNotes
replication.factor3All MSK clusters run 3+ brokers
min.insync.replicas2Producer acks=all; ensures durability on broker failure
retention.ms604800000 (7 days)Sufficient for consumer lag recovery
compression.typelz4Good ratio/speed balance for Avro payloads
max.message.bytes1 MBIncreased for EDI topics with large raw payloads
partitions6Adjusted up for high-volume topics

High-volume topics (claims.claim.submitted, billing.invoice.generated) use 12 partitions and 14-day retention. EDI archive topics use 30-day retention.

Consumer Group Naming

Consumer groups follow the pattern {service}-{domain}-consumer. Examples:

  • billing-claims-consumer — Billing Service consuming from claims topics
  • notifications-enrollment-consumer — Notifications Service consuming from enrollment topics
  • eligibility-enrollment-consumer — Eligibility Service consuming from enrollment topics

Schema Registry

All Kafka messages use Avro encoding registered with Confluent Schema Registry. Compatibility mode is BACKWARD — new schema versions must be readable by consumers running the previous schema version. This allows producers to upgrade first without coordinating consumer deployments. Go services use the github.com/riferrei/srclient library; schema IDs are embedded in the 5-byte Confluent wire format prefix of each message.

Event Catalog

Claims Domain

TopicProducerConsumersPartitionsRetention
claims.claim.submittedClaimsBilling, Notifications1214 days
claims.claim.adjudicatedClaimsBilling, Notifications, Mirth Connect1214 days
claims.claim.paidBilling (primary), Claims (mirror)Notifications, OpenSearch67 days
claims.prior_auth.decisionClaimsNotifications, Eligibility67 days
claims.appeal.resolvedClaimsNotifications, Billing67 days

claims.claim.adjudicated event types: CLAIM_APPROVED, CLAIM_DENIED

Key fields on approval: claimId, allowedAmount, paidAmount, memberResponsibility, copayApplied, eobS3Key

Key fields on denial: claimId, denialReasonCode (X12 code, e.g. CO-4), denialReasonText, appealDeadline

Eligibility Domain

TopicProducerConsumersRetention
eligibility.coverage.verifiedEligibilityClaims7 days
eligibility.coverage.terminatedEligibilityClaims, Notifications7 days

eligibility.coverage.verified carries accumulator balances (deductibleRemaining, oopMaxRemaining), network tier, and coverage dates — the Claims Service uses this to compute member cost-sharing without a synchronous lookup.

Enrollment Domain

TopicProducerConsumersRetention
enrollment.enrollment.submittedEnrollmentEligibility, Billing, Notifications7 days
enrollment.enrollment.activatedEnrollmentEligibility, Billing, Notifications7 days
enrollment.enrollment.terminatedEnrollmentEligibility, Billing, Notifications7 days
enrollment.enrollment.plan_changedEnrollmentEligibility, Billing, Notifications7 days
enrollment.cobra.notice_sentEnrollmentNotifications7 days
enrollment.cobra.electedEnrollmentEligibility, Billing, Notifications7 days

enrollment.enrollment.activated is the key saga trigger: it activates the pending coverage record in Eligibility, starts the recurring invoice schedule in Billing, and sends the enrollment confirmation in Notifications.

Billing Domain

TopicProducerConsumersPartitionsRetention
billing.invoice.generatedBillingNotifications1214 days
billing.payment.receivedBillingEnrollment, Notifications67 days
billing.payment.missedBillingEnrollment, Notifications67 days
billing.payment.completedBillingClaims, Notifications67 days

billing.payment.missed carries three event types: PAYMENT_MISSED, GRACE_PERIOD_STARTED, and GRACE_PERIOD_EXPIRED. The GRACE_PERIOD_EXPIRED event triggers Enrollment Service to begin the COBRA timeline.

Provider Domain

TopicProducerConsumersRetention
provider.credentialing.status_changedProviderClaims, Eligibility, Notifications7 days
provider.network.updatedProviderClaims, Eligibility7 days

provider.credentialing.status_changed event types: CREDENTIALING_APPROVED, CREDENTIALING_DENIED, CREDENTIALING_EXPIRED, CREDENTIALING_SUSPENDED. Claims and Eligibility update their local projection tables on receipt.

EDI Domain

TopicProducerConsumersRetention
edi.inbound.receivedMirth ConnectClaims, Enrollment, Eligibility30 days
edi.outbound.generatedMirth ConnectOpenSearch (audit)30 days

EDI topics use 30-day retention because raw X12 transaction archives may be needed for clearinghouse dispute resolution. The payloadS3Key field in each event points to the raw EDI file stored in S3 with SSE-KMS encryption.

Outbox Pattern

Every Kafka publish is made safe against partial failures by the transactional outbox pattern:

The outbox worker runs as a background goroutine in each service. If the service restarts between the Kafka produce and the outbox delete, the message will be re-published on restart. Consumers handle duplicate delivery using an idempotency key stored in ElastiCache (TTL = 24 hours): if a claimId + eventType pair is already in the set, the consumer skips processing and commits the offset.

Dead Letter Queue (DLQ) Pattern

Every Kafka consumer implements exponential-backoff retry followed by DLQ routing on persistent failure:

Normal Flow:
  Topic → Consumer → Process → Commit offset

Failure Flow (after 3 retries with backoff: 1s, 2s, 4s):
  Topic → Consumer → Process FAILS


  DLQ Topic: {original-topic}.dlq  (30-day retention)


  DLQ Consumer: alert to Grafana OnCall + write to OpenSearch for manual review

DLQ messages wrap the original payload with failure metadata: originalTopic, originalPartition, originalOffset, failureReason, retryCount, and consumerGroup. There is no DLQ of the DLQ — messages that fail in the DLQ consumer are logged to OpenSearch and alert to Grafana OnCall for manual intervention.

Enrollment Saga

The enrollment saga coordinates new member activation from submission through to active coverage and first invoice:

Compensation path: If the carrier sends a 999 rejection for the 834 transaction, Enrollment Service publishes enrollment.enrollment.cancelled. Eligibility deletes the pending coverage record, Billing cancels the staged invoice, and Notifications sends an error message to the member.

Olly Health Insurance Platform