
Monolith to Microservices: What Nobody Tells You

Sachin Babu
Oct 19, 2024 · 10 min read

The case for microservices is usually made in the abstract: independent deployability, team autonomy, technology flexibility, fault isolation. The case against is almost always empirical: the teams who tried it and quietly regrouped. This is a report from the field.

Why We Did It

Our monolith was a five-year-old NestJS application. Deployments were slow — 18 minutes of build time to ship a CSS change. The authentication service was coupled to the reporting service in ways nobody could fully explain anymore. Three teams stepped on each other's toes constantly because there was one main branch.

These were reasonable grievances, and microservices are a reasonable answer to them when you have the operational maturity to support them.

What We Got Right: The Strangler Fig Pattern

We didn't rewrite. We extracted. New traffic was routed to new services while the monolith handled the rest. Each extracted service got proper domain ownership, its own database, its own deployment pipeline.
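The routing half of the pattern can be sketched in a few lines. This is a hypothetical edge-router rule, not our actual config: extracted domains get routed to their own services by path prefix, and everything else falls through to the monolith. Service names and internal URLs here are illustrative.

```typescript
type Upstream = { name: string; baseUrl: string };

// Illustrative routing table: prefixes owned by extracted services.
const extracted: Record<string, Upstream> = {
  "/orders": { name: "orders", baseUrl: "http://orders.internal" },
  "/payments": { name: "payments", baseUrl: "http://payments.internal" },
};

// Anything not yet extracted still belongs to the monolith.
const monolith: Upstream = { name: "monolith", baseUrl: "http://monolith.internal" };

function route(path: string): Upstream {
  const prefix = Object.keys(extracted).find((p) => path.startsWith(p));
  return prefix ? extracted[prefix] : monolith;
}
```

As each new domain is extracted, it gains a routing entry; the monolith's surface area shrinks without a big-bang cutover.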

The rule we enforced strictly: services communicate over async messages (SQS + SNS) except for synchronous reads that the client is waiting on. This boundary prevented the most common microservices failure mode — distributed monolith — where services are split but still call each other synchronously in chains that propagate failures.
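The shape of that boundary can be sketched with an in-memory bus standing in for SNS/SQS. The write path publishes an event and returns without waiting on any downstream service; topic and event names are illustrative.

```typescript
type DomainEvent = { topic: string; payload: unknown };

// In-memory stand-in for SNS topic + SQS subscriptions.
class InMemoryBus {
  private handlers = new Map<string, Array<(e: DomainEvent) => void>>();

  subscribe(topic: string, fn: (e: DomainEvent) => void): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(fn);
    this.handlers.set(topic, list);
  }

  publish(e: DomainEvent): void {
    for (const fn of this.handlers.get(e.topic) ?? []) fn(e);
  }
}

const bus = new InMemoryBus();
const received: DomainEvent[] = [];

// Downstream service consumes asynchronously.
bus.subscribe("order.created", (e) => received.push(e));

// Write path: fire-and-forget — the caller never blocks on the consumer.
bus.publish({ topic: "order.created", payload: { orderId: "o-1" } });
```

The point of the rule is what is absent: no service-to-service HTTP call on the write path, so a slow or deploying consumer cannot stall the producer.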

What Bit Us: Distributed Transactions

The monolith had database transactions. Service A could write to table X and table Y in a single atomic operation, and if anything failed, both rolled back. Microservices break this.

We didn't appreciate the full cost until we had order-creation events that were partially processed: order created in the Orders service, payment charged in the Payments service, notification never sent because the Notifications service was in a deployment at that moment. The saga pattern is the standard answer — compensating transactions that undo work when a downstream step fails. Building and testing sagas is significant engineering work that doesn't appear in the "benefits of microservices" slide deck.
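A minimal orchestrated-saga sketch makes the shape of that work concrete. Each step pairs an action with a compensation; when a step fails, the completed steps are compensated in reverse order. Step names are illustrative, and a real saga also needs durable state, retries, and idempotent compensations, none of which appear here.

```typescript
type Step = { name: string; action: () => void; compensate: () => void };

function runSaga(steps: Step[]): { completed: string[]; compensated: string[] } {
  const done: Step[] = [];
  const compensated: string[] = [];
  for (const step of steps) {
    try {
      step.action();
      done.push(step);
    } catch {
      const completed = done.map((s) => s.name);
      // Undo completed work in reverse order.
      for (const prev of [...done].reverse()) {
        prev.compensate();
        compensated.push(prev.name);
      }
      return { completed, compensated };
    }
  }
  return { completed: done.map((s) => s.name), compensated };
}
```

If `createOrder` and `chargePayment` succeed but `sendNotification` throws, the orchestrator refunds the payment and cancels the order rather than leaving the system half-committed. Testing every partial-failure path of even this toy saga is more cases than the monolith's single transaction ever had.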

Observability Is Not Optional

A stack trace in a monolith tells you exactly what happened. In a distributed system, a single user request produces logs across six services with no correlation by default. We invested heavily in distributed tracing (OpenTelemetry + Jaeger) before the migration was complete, and that investment paid back immediately when debugging.

Structured logging with consistent trace IDs, service names, and correlation identifiers isn't a nice-to-have — it's the minimum viable operational requirement.
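A sketch of what "minimum viable" looks like, assuming JSON log lines: every line carries the service name and the request's trace ID so lines from six services collapse into one correlated timeline. Field names here are illustrative, not a production schema, and in practice the trace ID comes from propagated OpenTelemetry context rather than a constructor argument.

```typescript
type LogFields = Record<string, unknown>;

// Returns a logger bound to one service and one request's trace ID.
function makeLogger(service: string, traceId: string) {
  return (level: string, message: string, fields: LogFields = {}): string =>
    JSON.stringify({
      ts: new Date().toISOString(),
      service,
      traceId,
      level,
      message,
      ...fields,
    });
}

const log = makeLogger("orders", "trace-abc123");
const line = log("info", "order created", { orderId: "o-1" });
```

Grepping (or querying) on `traceId` then reconstructs the cross-service story that a monolith's stack trace used to give you for free.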

The Honest Assessment

The monolith's deployment problem was solved. Team autonomy genuinely improved. But we underestimated operational complexity by roughly 3x. Kubernetes cluster management, service mesh configuration, certificate rotation, inter-service authentication — these are non-trivial. We needed a platform engineering function we didn't have.

Would I do it again? Yes, but later. The monolith was worth splitting at ~80 engineers and serious scale. At 15 engineers, a well-structured modular monolith with independent deployment via feature flags would have served us better for another two years.

