Microservices Trade-offs
Microservices split a system into independently deployable services. Done right, they enable scale and team autonomy. Done wrong, they’re a monolith with extra network hops.
The Progression
Don’t jump straight to microservices. Most systems follow this arc:
| Stage | What It Is | When to Use |
|---|---|---|
| Monolith | Single deployable unit, all code together | MVPs, small teams, fast iteration |
| Modular Monolith | Monolith with enforced internal boundaries | Growing teams, prepping for a future split |
| Microservices | Independently deployed services per domain | Scale, multiple teams, proven domain boundaries |
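The “enforced internal boundaries” of a modular monolith can be sketched in a few lines. This is a hypothetical illustration (module and function names are invented): each module exposes a narrow public API, and other modules call only that, so a later split to a real service changes the call site, not the design.

```python
# billing "module": internals stay private, only the facade is public
def _calculate_tax(amount: float) -> float:   # internal detail, not for other modules
    return amount * 0.2

def create_invoice(amount: float) -> dict:    # the module's public API
    return {"amount": amount, "tax": _calculate_tax(amount)}

# orders "module": depends only on billing's public facade
def place_order(amount: float) -> dict:
    invoice = create_invoice(amount)  # in a future split: an HTTP/gRPC call instead
    return {"status": "placed", "invoice": invoice}

print(place_order(100.0))
```

In a real codebase the boundary is enforced with separate projects/packages and import rules, not convention alone — but the shape is the same.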
Trade-offs
| What You Gain | What You Pay |
|---|---|
| Independent deployability per service | Distributed system complexity |
| Scale each service in isolation | Network latency on every call |
| Team autonomy around domains | Operational overhead — CI/CD × N services |
| Fault isolation (one service fails, others survive) | Data consistency is now your problem |
| Tech stack flexibility per service | Observability requires serious tooling investment |
| Smaller codebases, easier to reason about | Integration testing is significantly harder |
Communication Patterns
Services need to talk. Two fundamental approaches:
| Type | Examples | When to Use |
|---|---|---|
| Sync (request/response) | REST, gRPC | Real-time queries, reads, simple commands |
| Async (event-driven) | Kafka, RabbitMQ, Azure Service Bus | Workflows, writes, cross-service side effects |
Sync chains are fragile — if A calls B calls C, one failure cascades. Async decouples services; the publisher doesn’t wait for consumers to catch up.
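A toy sketch of the contrast — in-memory only, no real broker or HTTP, and all service names are invented. The sync path blocks through the whole chain; the async publisher just enqueues and moves on.

```python
from collections import deque

# --- Sync: A calls B calls C; any failure or slowness cascades up ---
def service_c() -> str:
    return "c-result"

def service_b() -> str:
    return f"b({service_c()})"   # B blocks on C

def service_a() -> str:
    return f"a({service_b()})"   # A blocks on B, which blocks on C

# --- Async: publisher enqueues an event and returns immediately ---
events: deque = deque()

def publish(event: dict) -> None:
    events.append(event)          # publisher does not wait for consumers

def consume_all() -> list:
    handled = []
    while events:
        handled.append(events.popleft())  # consumers catch up on their own schedule
    return handled

publish({"type": "OrderPlaced", "order_id": 42})
print(service_a())       # sync path: one nested result
print(consume_all())     # async path: processed after the fact
```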
Data Patterns
Each service should own its data. A shared database creates hidden coupling — a schema change in one service can silently break others.
| Pattern | Pros | Cons |
|---|---|---|
| Database-per-service | True isolation, independent schema evolution | Cross-service joins need API calls or events |
| Shared database | Easy joins, simple to start | Tight coupling, hard to deploy independently |
Accept that data will be eventually consistent across services. Use events to synchronise state, and design your UX to tolerate slight delays.
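One way to picture event-driven synchronisation is the minimal in-memory sketch below (service names, stores, and the event schema are all made up). Each service owns its own store; the order service publishes an event, and the shipping service’s view only converges once the event is delivered — that gap is the eventual consistency your UX has to tolerate.

```python
order_db: dict = {}       # owned by the order service
shipping_db: dict = {}    # owned by the shipping service
outbox: list = []         # events awaiting delivery to consumers

def place_order(order_id: str) -> None:
    order_db[order_id] = "placed"
    outbox.append({"type": "OrderPlaced", "order_id": order_id})

def deliver_events() -> None:
    # stands in for a broker delivering to subscribers, some time later
    while outbox:
        event = outbox.pop(0)
        if event["type"] == "OrderPlaced":
            shipping_db[event["order_id"]] = "awaiting_pickup"

place_order("o-1")
print("o-1" in shipping_db)   # False: shipping hasn't seen the event yet
deliver_events()
print("o-1" in shipping_db)   # True: state has converged
```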
Resilience Patterns
Distributed systems fail in partial ways — handle it explicitly:
- Circuit Breaker — After N failures, stop calling the failing service and fail fast. In .NET, Polly implements this in a few lines.
- Retry with exponential backoff — Retry transient failures, but wait longer between each attempt. Without backoff, retries amplify load on an already struggling service.
- Timeouts — Always set timeouts on outbound calls. Without them, one slow dependency can exhaust your thread pool and take down the caller.
- Bulkhead — Isolate calls to different services in separate thread pools so one slow dependency can’t starve all others.
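The first two patterns can be hand-rolled in a few lines to show the mechanics. In .NET you’d reach for Polly rather than write this yourself; the Python sketch below is illustrative only, with simplified state (a real breaker also has a half-open state and a reset timer).

```python
import time

class CircuitBreaker:
    """After `threshold` consecutive failures, fail fast instead of calling."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    def call(self, fn):
        if self.failures >= self.threshold:
            raise RuntimeError("circuit open: failing fast")  # skip the call entirely
        try:
            result = fn()
            self.failures = 0          # success closes the circuit again
            return result
        except Exception:
            self.failures += 1         # count consecutive failures
            raise

def retry_with_backoff(fn, attempts: int = 3, base_delay: float = 0.01):
    """Retry transient failures, doubling the wait each time (10ms, 20ms, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                   # out of retries: surface the failure
            time.sleep(base_delay * (2 ** attempt))
```

Note how the two compose: retries belong on the caller’s side of the breaker, so that an open circuit short-circuits the retries too instead of hammering a dead service.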
Observability
A monolith has one log file. Microservices have N — you can’t debug what you can’t see:
- Distributed tracing — Propagate a correlation/trace ID across all service calls. Tools like Jaeger, Zipkin, or Azure Application Insights let one trace show the full request path.
- Centralised logging — Aggregate logs from all services into one place (ELK, Seq, Azure Monitor). Search by trace ID to reconstruct any request’s journey.
- Health checks — Every service exposes `/health` and `/ready` endpoints. Kubernetes and Azure Container Apps use these to route traffic and restart unhealthy instances.
Common Gotchas
- Distributed monolith — Services that are technically separate but share a database or require coordinated deploys. You’ve got all the complexity with none of the benefits.
- Eventual consistency underestimated — “The order exists in Service A but not B yet” confuses users and breaks downstream logic. Design for it deliberately, not as an afterthought.
- Operational overhead — Each service needs its own CI/CD, logging, monitoring, and on-call runbook. 10 services = 10× the operational surface area.
- Too many services too soon — “Death by a thousand cuts.” Splitting before you understand the domain gives you wrong boundaries that are painful to fix later.
- No contract testing — Consumer-driven contract tests (e.g., Pact) catch breaking API changes before production. Without them, a schema change in Service A silently breaks Service B.
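A drastically simplified stand-in for what Pact automates — the contract shape and field names below are invented. The consumer states which fields it depends on; the provider’s build verifies its responses against that contract, so a “harmless” rename fails in CI instead of in production.

```python
CONSUMER_CONTRACT = {     # fields Service B says it depends on
    "order_id": str,
    "status": str,
    "total": float,
}

def verify_contract(provider_response: dict, contract: dict) -> list:
    """Return a list of violations (empty means the contract holds)."""
    violations = []
    for field, expected_type in contract.items():
        if field not in provider_response:
            violations.append(f"missing field: {field}")
        elif not isinstance(provider_response[field], expected_type):
            violations.append(f"wrong type for {field}")
    return violations

# Provider's response after a "harmless" rename of status -> state:
response = {"order_id": "o-1", "state": "paid", "total": 9.99}
print(verify_contract(response, CONSUMER_CONTRACT))  # ['missing field: status']
```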