Event-Driven Systems
Event-driven systems decouple producers from consumers by passing facts (events) through a broker. They’re the backbone of microservices, real-time pipelines, and any system where “things happen and other things react.”
Event vs Command vs Message
| Term | Definition | Example |
|---|---|---|
| Event | An immutable fact — something that happened | OrderPlaced, PaymentFailed |
| Command | A request for a specific action to occur | PlaceOrder, ProcessPayment |
| Message | Generic envelope — events and commands are both messages | Any payload sent over a broker |
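The distinction in the table can be sketched with plain data classes. This is a minimal illustration, not a prescribed design: the `Message` base class, the `handle` function, and the field names (`message_id`, `created_at`, `order_id`, `amount_cents`) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass(frozen=True)
class Message:
    """Generic envelope: every event and command is a message."""
    message_id: str = field(default_factory=lambda: str(uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass(frozen=True)
class PlaceOrder(Message):
    """Command: a request in the imperative, which may still be rejected."""
    order_id: str = ""
    amount_cents: int = 0


@dataclass(frozen=True)
class OrderPlaced(Message):
    """Event: an immutable fact, named in the past tense."""
    order_id: str = ""
    amount_cents: int = 0


def handle(command: PlaceOrder) -> OrderPlaced:
    # A handler can refuse a command; an event has already happened.
    if command.amount_cents <= 0:
        raise ValueError("amount must be positive")
    return OrderPlaced(order_id=command.order_id, amount_cents=command.amount_cents)


event = handle(PlaceOrder(order_id="o-1", amount_cents=500))
```

Freezing the data classes mirrors the "immutable fact" property: assigning to a field of `event` raises an error.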
Core Patterns
- Event Notification: Services emit lightweight events; consumers decide what to do. Loose coupling, but consumers must fetch state if they need details. Use when you want decoupling and don’t need full event history.
- Event-Carried State Transfer: Events carry enough data so consumers never need to call back. Reduces coupling further but bloats event payloads. Use when consumers need state immediately and round-trips are expensive.
- Event Sourcing: State is derived by replaying a log of events; the event store is the source of truth. Enables full audit trail and time-travel debugging. Use when auditability, compliance, or temporal queries matter.
- CQRS (Command Query Responsibility Segregation): Write model (commands) and read model (queries) are separate. Often paired with Event Sourcing but not required. Use when read/write access patterns diverge significantly.
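Event Sourcing is the easiest of these patterns to show in a few lines. The sketch below assumes a hypothetical bank-account domain with two event types (`Deposited`, `Withdrawn`) and an in-memory list standing in for the event store; a real store would be durable and append-only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Pure transition function: current state + one event -> next state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    return balance  # Unknown event types are ignored in this sketch.


def replay(events, upto=None) -> int:
    """Rebuild state from the log; `upto` replays only a prefix (time travel)."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


log = [Deposited(100), Withdrawn(30), Deposited(5)]
current = replay(log)          # state after all three events
earlier = replay(log, upto=2)  # state as of the second event
```

The balance is never stored directly; it is always a fold over the log, which is what makes the audit trail and temporal queries possible.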
Saga: Choreography vs Orchestration
Sagas manage distributed transactions across services without two-phase commit.
| | Choreography | Orchestration |
|---|---|---|
| How it works | Each service reacts to events and emits new ones | A central coordinator tells each service what to do |
| Coupling | Low — services only know their own events | Higher — coordinator knows the full flow |
| Observability | Hard — flow is implicit across services | Easy — flow is explicit in one place |
| Failure handling | Compensating events per service | Coordinator handles rollback logic |
| Best for | Simple, stable workflows | Complex, long-running, or frequently-changing flows |
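The orchestration column can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. This is an in-process toy under stated assumptions: `run_saga`, the step names, and the `trace` list are hypothetical, and a real coordinator would persist its progress and call remote services.

```python
from typing import Callable, List, Tuple

# (name, action, compensation)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        _, action, _ = step
        try:
            action()
            completed.append(step)
        except Exception:
            for _, _, compensate in reversed(completed):
                compensate()
            return False
    return True


trace: List[str] = []


def fail_shipping() -> None:
    raise RuntimeError("carrier unavailable")


steps: List[Step] = [
    ("reserve_stock", lambda: trace.append("stock reserved"), lambda: trace.append("stock released")),
    ("charge_card", lambda: trace.append("card charged"), lambda: trace.append("card refunded")),
    ("ship_order", fail_shipping, lambda: trace.append("shipment cancelled")),
]
succeeded = run_saga(steps)
```

Because the failed step never completed, its own compensation is not run; only the two earlier steps are undone, newest first.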
Delivery Semantics
| Semantic | Guarantee | Cost | Use Case |
|---|---|---|---|
| At-most-once | May lose messages | Cheapest — fire and forget | Metrics, telemetry where loss is acceptable |
| At-least-once | No message loss; duplicates possible | Moderate — requires retry logic | Most production systems; pair with idempotent consumers |
| Exactly-once | No loss, no duplicates | Expensive — distributed coordination | Financial transactions, billing (verify broker support) |
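To see why at-least-once delivery is usually paired with idempotent consumers, here is a small simulation. It assumes a hypothetical broker that redelivers each message once because the first acknowledgement is lost, and messages that carry a unique `event_id`; no real broker client is involved.

```python
from typing import Any, Dict, List, Set

Message = Dict[str, Any]


class IdempotentConsumer:
    """Applies each event ID at most once, so redelivery is harmless."""

    def __init__(self) -> None:
        self.seen: Set[str] = set()
        self.total = 0

    def handle(self, message: Message) -> None:
        event_id = str(message["event_id"])
        if event_id in self.seen:
            return  # Duplicate delivery: already applied, skip.
        self.total += int(message["amount"])
        self.seen.add(event_id)


def deliver_at_least_once(messages: List[Message], consumer: IdempotentConsumer) -> int:
    """Simulate a broker that redelivers every message whose ack was lost."""
    deliveries = 0
    for message in messages:
        ack_lost = True  # Assumption: the first ack never reaches the broker.
        while True:
            consumer.handle(message)
            deliveries += 1
            if not ack_lost:
                break
            ack_lost = False  # The retry's ack gets through.
    return deliveries


consumer = IdempotentConsumer()
deliveries = deliver_at_least_once(
    [{"event_id": "e1", "amount": 10}, {"event_id": "e2", "amount": 5}], consumer
)
```

Each message is delivered twice, yet the total reflects each amount once. In a durable system the set of seen IDs would need to be stored alongside the state it protects, ideally in the same transaction.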
Kafka vs Azure Service Bus
| | Apache Kafka | Azure Service Bus |
|---|---|---|
| Model | Distributed log (pull) | Message broker (push/pull) |
| Retention | Configurable — messages persist after consumption | Deleted after consumption (or DLQ) |
| Replay | ✅ Built-in — rewind consumer offset | ❌ Not supported |
| Ordering | Per-partition only | Per session (with sessions enabled) |
| Throughput | Millions of msgs/sec | Thousands of msgs/sec |
| Delivery | At-least-once; exactly-once within the cluster (idempotent producer plus transactions) | At-least-once; duplicate-detection window available |
| Dead-letter | Manual (separate topic) | ✅ Built-in DLQ per queue/topic |
| Best for | Event streaming, log aggregation, replay scenarios | Workflow integration, enterprise messaging, Azure-native apps |
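The Model, Retention, and Replay rows come down to one structural difference, which the toy classes below try to make concrete. These are in-memory stand-ins written for illustration, not the Kafka or Azure Service Bus client APIs; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, List


class Log:
    """Log model: records stay put; each consumer tracks its own offset."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def append(self, record: str) -> None:
        self.records.append(record)

    def read_from(self, offset: int) -> List[str]:
        return self.records[offset:]


class Queue:
    """Queue model: a record is gone once it has been received."""

    def __init__(self) -> None:
        self.records: Deque[str] = deque()

    def send(self, record: str) -> None:
        self.records.append(record)

    def receive_all(self) -> List[str]:
        drained = list(self.records)
        self.records.clear()
        return drained


log, queue = Log(), Queue()
for record in ["a", "b", "c"]:
    log.append(record)
    queue.send(record)

first_read = log.read_from(0)
replayed = log.read_from(0)           # rewinding the offset returns the same records
received = queue.receive_all()
received_again = queue.receive_all()  # nothing left to replay
```

Reading from the log does not change it, so a new or rewound consumer sees the full history; the queue hands each record out once and then forgets it.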
Eventual Consistency
In distributed systems, writes propagate asynchronously, so there’s a window where different nodes see different data. That’s the trade-off you accept for staying available during network partitions (the CAP theorem).
How to handle it:
- Idempotent consumers: Process the same message twice safely; use unique event IDs to detect duplicates.
- Outbox pattern: Write the state change and the outgoing event to the same database in one local transaction, then let a separate relay publish the event to the broker; avoids dual-write inconsistency.
- Saga compensations: If a downstream step fails, emit compensating events to undo prior steps.
- Read-your-writes consistency: Route a client’s reads to the primary (the node that accepted the write) briefly after a mutation if stale reads are unacceptable.
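Of the techniques in this list, the outbox pattern is the one that benefits most from a concrete sketch. The version below uses Python's built-in `sqlite3` with an in-memory database; the `orders` and `outbox` tables, the `place_order` and `relay` functions, and the `publish` callback are hypothetical stand-ins for a real schema and broker client.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER)")
conn.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
)


def place_order(order_id: str, total_cents: int) -> None:
    """Write the order and its event in one local transaction."""
    with conn:  # Commits both inserts together, or rolls both back.
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total_cents))
        event = {"type": "OrderPlaced", "order_id": order_id}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay(publish) -> int:
    """Publish pending outbox rows in order, marking each one afterwards."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # A crash after this line causes a re-publish.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


published = []
place_order("o-1", 1200)
first_pass = relay(published.append)
second_pass = relay(published.append)
```

Marking a row as published only after the publish call means the relay is itself at-least-once, which is why the outbox is normally combined with idempotent consumers.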
Dead-Letter Queues & Poison Messages
A dead-letter queue (DLQ) is where messages go when they can’t be processed: after max retries, on deserialization failure, or when they expire.
A poison message is one that causes the consumer to crash or loop repeatedly. Without a DLQ or a retry limit, it is redelivered indefinitely and can block every message queued behind it.
Key practices:
- Always configure a DLQ — never let a broker silently drop failed messages.
- Alert on DLQ depth; a growing DLQ is a canary for schema or logic bugs.
- Include original metadata (error reason, retry count) in the DLQ message for debugging.
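The practices above can be sketched as a consumer loop with a retry budget. This is an in-memory illustration only: `MAX_ATTEMPTS`, `consume`, `handle_order`, and the metadata keys are hypothetical, and managed brokers typically implement the retry counting and dead-lettering for you.

```python
from typing import Callable, Dict, List

MAX_ATTEMPTS = 3  # Hypothetical retry budget for this sketch.


def consume(
    messages: List[Dict],
    handler: Callable[[Dict], None],
    dead_letters: List[Dict],
) -> int:
    """Try each message up to MAX_ATTEMPTS times, then dead-letter it."""
    processed = 0
    for message in messages:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handler(message)
                processed += 1
                break
            except Exception as error:
                if attempt == MAX_ATTEMPTS:
                    # Keep the original payload plus debugging metadata.
                    dead_letters.append({
                        "original": message,
                        "error_reason": repr(error),
                        "retry_count": attempt,
                    })
    return processed


def handle_order(message: Dict) -> None:
    if "order_id" not in message:
        raise KeyError("order_id")  # Poison message: fails on every attempt.


dead_letters: List[Dict] = []
processed = consume(
    [{"order_id": "o-1"}, {"bad": True}, {"order_id": "o-2"}],
    handle_order,
    dead_letters,
)
```

The poison message ends up in `dead_letters` with its error and attempt count, and the message behind it is still processed instead of being blocked.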
Common Gotchas
- Confusing Event Sourcing with event-driven architecture: They often coexist but are independent concepts. Interviewers use this to separate practitioners from readers.
- Choreography spaghetti: With enough services, choreography produces invisible workflows. No single place shows the full business process.
- Schema evolution breaks consumers: Adding a required field or renaming a property can silently break downstream consumers. Use schema registries (Confluent Schema Registry, Azure Schema Registry) and favour additive changes.
- Kafka ordering is per-partition: Globally ordered processing requires a single partition, which kills parallelism. Design partition keys carefully.
- Exactly-once is harder than it sounds: Broker-level exactly-once doesn’t cover your database writes. You still need idempotent consumers for true end-to-end guarantees.
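The per-partition ordering gotcha is easier to reason about with a small routing sketch. It assumes keyed messages are assigned to a partition by hashing the key; CRC32 is used here only as a stable stand-in, since real clients use their own hash functions. The `partition_for` and `route` helpers and the order keys are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str]  # (partition key, event name)


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32 in this sketch)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events: List[Event], num_partitions: int) -> Dict[int, List[Event]]:
    """Append each (key, event) pair to the partition chosen by its key."""
    partitions: Dict[int, List[Event]] = defaultdict(list)
    for key, event in events:
        partitions[partition_for(key, num_partitions)].append((key, event))
    return partitions


events: List[Event] = [
    ("order-1", "OrderPlaced"),
    ("order-2", "OrderPlaced"),
    ("order-1", "OrderPaid"),
    ("order-2", "OrderShipped"),
    ("order-1", "OrderShipped"),
]
partitions = route(events, num_partitions=4)
```

All events for one order land in the same partition in their original order, so per-order ordering is preserved while different orders can still be processed in parallel. Events for different keys have no ordering guarantee relative to each other.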