Application Performance Monitoring (also called Application Performance Management, or APM) tools let you hook a plugin into your application to expose useful metrics, such as end-to-end request response time, that shed light on the health of your applications. Observability builds on the legacy of APM tools with a framework for collecting end-to-end telemetry data (logs, traces, metrics, and events) at the infrastructure and application levels, so you can analyze and visualize it and find the root cause of issues in complex distributed systems.
This guide covers the basics of application performance monitoring and observability and helps you find the right vendor for your use case.
With APM and observability tools, you can hook a plugin into your application and expose useful metrics with minimal modifications to the application, like:
In recent years, the field has evolved beyond application performance monitoring, adding infrastructure monitoring, container and Kubernetes monitoring, alerting, and more to a “single pane of glass.” Ultimately, this makes it faster to correlate application data with infrastructure bottlenecks.
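To make the idea of hooking into an application concrete, here is a minimal sketch of how a request handler can be wrapped to record its response time without changing the handler's own logic. It uses only the Python standard library. The names `monitored`, `response_times_ms`, `handle_checkout`, and `summarize` are hypothetical and chosen for illustration; a commercial APM agent does considerably more (automatic instrumentation, sampling, shipping data to a backend) and does not necessarily work this way internally.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store; a real agent would ship samples to a backend.
response_times_ms = defaultdict(list)


def monitored(endpoint):
    """Record the wall-clock duration of each call under an endpoint name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the handler raises, so failures are timed too.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                response_times_ms[endpoint].append(elapsed_ms)
        return wrapper
    return decorator


@monitored("GET /checkout")
def handle_checkout(order_id):
    time.sleep(0.01)  # stand-in for real work
    return {"order_id": order_id, "status": "ok"}


def summarize(endpoint):
    """Return simple aggregate statistics for one endpoint."""
    samples = response_times_ms[endpoint]
    if not samples:
        return {"count": 0, "avg_ms": 0.0, "max_ms": 0.0}
    return {
        "count": len(samples),
        "avg_ms": sum(samples) / len(samples),
        "max_ms": max(samples),
    }


for order_id in range(3):
    handle_checkout(order_id)
print(summarize("GET /checkout"))
```

The decorator leaves the handler body untouched, which is the "minimal modifications" property that APM plugins aim for.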
Choosing the right APM and observability vendor can save your DevOps team considerable time and keep your system performance in check.
Newer entrants like Honeycomb and Datadog (which started in cloud monitoring) are challenging incumbents like AppDynamics and New Relic, leaving many DevOps leaders wondering which vendor to choose and why.
Many of these vendors will do an excellent job, but at times there can be hidden “gotchas” that only show themselves at scale. In our comparisons listed below, we cover the pros, the cons and the gotchas that could impact your decision on the right APM and observability provider for you.
Our most popular comparison pages are:
Depending on how you choose to monitor and troubleshoot issues in your systems, different vendors offer different features that can be of use to you.
For example, if you're looking for real-user monitoring (RUM), you'll find traditional APMs like Dynatrace and AppDynamics well-suited for the task. If you're looking for real-time (to the second) metrics visualizations in your dashboard, Instana might be an even better fit.
In most cases, you're going to want a vendor that provides some level of:
Because we're dealing with more complex systems (serverless, microservices, many different languages), Ops teams need to collect real-time data from all components to understand performance and to prevent and solve issues faster. This is what Observability helps with. Its three pillars are logs, metrics, and traces.
Depending on the kind of system you're maintaining, you're going to want to consider whether a traditional APM or observability platform is best for you.
The most important differentiator is the degree of complexity in your system. If you run a large number of microservices, modern observability tools are a must. However, because traditional APM vendors have also capitalized on the shift from monolithic to microservice-based architectures and relaunched as observability vendors, the lines are blurrier than ever.
The right APM vendor for you is determined by your usage requirements and the features you need.
There are many ways to give your Ops and developer teams visibility into your systems. The features you need, your architecture and infrastructure, your integrations, and your usage patterns will greatly narrow the set of vendors you can work with.
Get recommendations on the vendor that will work best for you in our resource, where we cover:
Thriving developer communities, companies (SoundCloud, Google, etc.), and foundations (CNCF) have incubated a variety of useful APM tools that have been open-sourced.
Without the cost of an expensive license, these tools offer advanced features like real-user monitoring (RUM), fully customizable dashboards, and codeless installation.
Vendors like Honeycomb and Lightstep focus squarely on the Observability toolset, whereas traditional APM vendors have made drastic improvements in recent years to catch up and now offer Observability in addition to APM.
The primary benefits of using Observability tools are:
Observability has been defined largely by open-source projects, most notably OpenTelemetry and its predecessors, OpenTracing and OpenCensus.
These tools have thriving communities and are well-suited for most kinds of enterprise environments.
OpenTelemetry (OTEL) is an open-source Observability framework, hosted by the Cloud Native Computing Foundation (CNCF), that is composed of several tools, APIs, and SDKs.
Observability is an approach to instrumenting services and systems so that you can gather actionable data about them and identify issues faster; OpenTelemetry provides a vendor-neutral way to do that instrumentation.
Because the Kubernetes dashboard has limited monitoring features, you might want to consider adding a few more monitoring tools to the mix for tracing, log management, metric collection, etc.