Observability Patterns in Microservices

2023-01-20

In a distributed world, logging "everything" is a recipe for high storage bills and low utility. Real observability is about creating a system that can answer questions you didn't know you needed to ask.

Beyond the Three Pillars

The industry often talks about the "Three Pillars" (Metrics, Logs, Traces), but they aren't silos. They should be deeply interconnected:

Metrics tell you when something is wrong.
Traces tell you where the error is happening.
Logs tell you why it's happening.

The magic happens when you can jump from a latency spike in a metric graph directly to the trace of the request that caused it, and then to the granular logs for that specific trace ID.
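That jump from metric to trace to logs can be sketched with in-memory data. This is only an illustration, assuming each latency sample carries the trace ID of the request that produced it; the LatencySample, Span, and LogEntry types and the drill_down helper are hypothetical names, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LatencySample:
    duration_ms: float
    trace_id: str  # assumed: the metric sample records which request produced it


@dataclass
class Span:
    trace_id: str
    service: str
    self_ms: float  # time spent in this service, excluding downstream calls


@dataclass
class LogEntry:
    trace_id: str
    service: str
    message: str


def drill_down(
    samples: List[LatencySample], spans: List[Span], logs: List[LogEntry]
) -> Tuple[str, str, List[str]]:
    """Follow the slowest sample to its trace, then to the logs of the slowest service."""
    # Metric: *when* something is wrong (the worst latency sample).
    worst = max(samples, key=lambda s: s.duration_ms)
    # Trace: *where* the time went within that request.
    trace_spans = [s for s in spans if s.trace_id == worst.trace_id]
    hot = max(trace_spans, key=lambda s: s.self_ms)
    # Logs: *why*, restricted to that trace ID and that service.
    messages = [
        entry.message
        for entry in logs
        if entry.trace_id == worst.trace_id and entry.service == hot.service
    ]
    return worst.trace_id, hot.service, messages


samples = [LatencySample(40.0, "t1"), LatencySample(950.0, "t2"), LatencySample(55.0, "t3")]
spans = [Span("t1", "gateway", 40.0), Span("t2", "gateway", 20.0), Span("t2", "inventory", 910.0)]
logs = [
    LogEntry("t1", "gateway", "request completed"),
    LogEntry("t2", "inventory", "lock wait timeout on sku table"),
]

trace_id, service, messages = drill_down(samples, spans, logs)
print(trace_id, service, messages)
```

The only thing that makes this work is the shared trace ID on all three kinds of data; without it, each lookup becomes a guess based on timestamps.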

Structured Data Is Mandatory

If you are still using grep on plain-text logs, you're debugging with one hand tied behind your back. Structured logging (typically via JSON) allows you to treat logs as data. Common keys like request_id, user_id, and duration_ms should be present in every log entry to allow for high-cardinality analysis.
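A minimal sketch of JSON logging with Python's standard logging module is shown here. The JsonFormatter class, the build_logger helper, and the "checkout" logger name are illustrative assumptions; the three context keys are the ones named above.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Context keys taken from the examples in the text.
    CONTEXT_KEYS = ("request_id", "user_id", "duration_ms")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in self.CONTEXT_KEYS:
            # Values passed via `extra=` become attributes on the record.
            entry[key] = getattr(record, key, None)
        return json.dumps(entry)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes one JSON object per line to `stream`."""
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info(
    "payment authorized",
    extra={"request_id": "req-123", "user_id": "u-42", "duration_ms": 87},
)
print(buffer.getvalue().strip())
```

Because every entry is a JSON object with the same keys, a log backend can index request_id or user_id directly instead of relying on pattern matching over free text.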

Context Injection (W3C Trace Context)

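Propagating context across service boundaries is often the hardest part of observability, because a single hop that drops the context splits one trace into two. The W3C Trace Context standard defines two HTTP headers, traceparent and tracestate, that carry the trace ID and parent span ID in a common format. Adopting it keeps your traces unbroken as they traverse different libraries, languages, and cloud providers.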
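As a rough illustration, the version 00 traceparent header has four dash-separated lowercase hex fields: version, trace ID (32 characters), parent span ID (16 characters), and flags (2 characters). The sketch below parses an incoming header and builds the header for an outgoing call. The function names are hypothetical, headers are assumed to be a dict with lowercase keys, and in practice an instrumentation library such as OpenTelemetry would normally handle this.

```python
import re
import secrets
from typing import Dict, NamedTuple, Optional

# Simplified: matches only the version 00 layout of the traceparent header.
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Return the trace context, or None if the header is missing or invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, flags)


def outgoing_headers(incoming: Dict[str, str]) -> Dict[str, str]:
    """Build headers for a downstream call, continuing the caller's trace."""
    ctx = parse_traceparent(incoming.get("traceparent", ""))
    if ctx is None:
        # No usable context: start a new trace (assumption: mark it sampled).
        trace_id, flags = secrets.token_hex(16), "01"
    else:
        trace_id, flags = ctx.trace_id, ctx.flags
    span_id = secrets.token_hex(8)  # this service's span becomes the new parent
    headers = {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}
    if ctx is not None and "tracestate" in incoming:
        headers["tracestate"] = incoming["tracestate"]  # forward vendor state
    return headers


incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
print(outgoing_headers(incoming)["traceparent"])
```

The trace ID is copied through unchanged while the span ID is replaced at every hop, which is what lets a tracing backend stitch the hops back into a single tree.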

The Goal: High Cardinality

High-cardinality fields are those with many distinct values, such as user IDs or session IDs. A good observability platform should let you filter and group by them, because this is the data that actually helps you find the "needle in the haystack" when a single customer is experiencing a weird edge case.
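To make the idea concrete, here is a small sketch that filters structured events by a high-cardinality field. The sample events and the filter_events and cardinality helpers are invented for illustration; a real platform would run the equivalent query against its own storage.

```python
import json
from typing import Any, Dict, List

# Invented sample data: one JSON object per line, as a structured logger would emit.
raw_lines = [
    '{"user_id": "u-1", "route": "/cart", "status": 200, "duration_ms": 35}',
    '{"user_id": "u-2", "route": "/cart", "status": 200, "duration_ms": 41}',
    '{"user_id": "u-3", "route": "/checkout", "status": 500, "duration_ms": 1200}',
    '{"user_id": "u-1", "route": "/checkout", "status": 200, "duration_ms": 90}',
    '{"user_id": "u-3", "route": "/checkout", "status": 500, "duration_ms": 1150}',
]
events: List[Dict[str, Any]] = [json.loads(line) for line in raw_lines]


def filter_events(events: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep only the events whose fields equal every given criterion."""
    return [e for e in events if all(e.get(k) == v for k, v in criteria.items())]


def cardinality(events: List[Dict[str, Any]], key: str) -> int:
    """Count the distinct values a field takes across the events."""
    return len({e[key] for e in events if key in e})


# Aggregated by route, /checkout merely looks unreliable; filtering by user
# shows that every failure belongs to a single customer.
failures = filter_events(events, status=500)
affected_users = {e["user_id"] for e in failures}
one_customer = filter_events(events, user_id="u-3")

print(cardinality(events, "user_id"), cardinality(events, "route"))
print(affected_users, len(one_customer))
```

In production user_id may take millions of values, which is why this kind of query is expensive in metric systems that create one time series per label combination and is better served by event or log storage.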