The "DORA metrics," developed by Google's DevOps Research and Assessment team, emerged from a five-year study of 31,000 developers. This research, published in the 2019 State of DevOps report, introduced four key engineering metrics for software delivery performance. A fifth was added in 2021.
But DORA's work extends beyond metrics, identifying core DevOps capabilities that support optimal performance and organizational outcomes. These capabilities and outcomes tend to fall by the wayside (which, as we'll explain, is important).
The DORA metrics have become one of software development's most prominent frameworks. In fact, many organizations that use DORA consider the framework synonymous with these five metrics.
The problem? The DORA metrics only tell half the story.
Today, forward-thinking companies, DevOps experts, and even DORA team members are pulling back on the reins. It’s not that the five metrics are wrong — it’s that they measure efficiency more than performance. And efficiency is only one part of the DORA Core Model, which looks at engineering effectiveness in context.
At Uplevel, we look at DORA metrics — but we look at them as part of a larger picture.
DORA Engineering Metrics: What Are They and What Do They Measure?
The DORA team outlines five DevOps metrics that fall into three main categories:
- Throughput (velocity) metrics indicate delivery speed, presenting a way to measure velocity outside of arbitrary measurements like story points.
- Stability (quality) metrics speak to the quality of deployment processes.
- Reliability is often overlooked (you'll still see many references to the "four" DORA metrics), but it was added later as its own category and metric.
Lead time for changes indicates the amount of time between code commit and deployment to production. Are automated tests and a healthy CI/CD pipeline reducing the time a pull request sits in queue? Is there a healthy code review process for what must be done manually?
Deployment frequency quantifies how often teams successfully release code to production. Best practice says this number should be high: deploying to production more frequently is a byproduct of streamlined processes and less complex PRs, which indicate healthy code reviews and more opportunities to catch errors.
Change failure rate captures the percentage of deployments that required remediation. How much of the time did your code fail in production and require a hotfix, patch, or rollback?
MTTR (mean time to repair) is the average time it takes to restore service when an incident occurs. The objective is to restore service as quickly as possible, ideally before users are affected. When software can be repaired quickly, that’s a sign of healthy architecture, incident response systems, and a well-coordinated software development team.
Reliability measures operational performance: your team's ability to keep promises about the product they're building. This might be captured in artifacts like SLAs and error budgets.
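To make these definitions concrete, here is a minimal sketch of how the four classic metrics could be computed from deployment records. The record fields (commit_time, deploy_time, failed, restored_time) and the choice of a median lead time are illustrative assumptions for this example, not a standard schema or DORA's prescribed method.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median

@dataclass
class Deployment:
    commit_time: datetime                  # when the change was committed
    deploy_time: datetime                  # when it reached production
    failed: bool                           # needed a hotfix, patch, or rollback?
    restored_time: datetime | None = None  # when service was restored, if it failed

def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four classic DORA metrics over a reporting window."""
    lead_times = [d.deploy_time - d.commit_time for d in deployments]
    failures = [d for d in deployments if d.failed]
    repair_times = [d.restored_time - d.deploy_time
                    for d in failures if d.restored_time is not None]
    return {
        # Throughput: how often code reaches production
        "deployments_per_day": len(deployments) / window_days,
        # Throughput: typical commit-to-production time (median resists outliers)
        "median_lead_time_hours": median(lead_times).total_seconds() / 3600,
        # Stability: share of deployments that required remediation
        "change_failure_rate": len(failures) / len(deployments),
        # Stability: mean time to restore service after a failed deployment
        "mttr_hours": sum(repair_times, timedelta()).total_seconds() / 3600 / len(repair_times)
                      if repair_times else None,
    }
```

Reliability doesn't reduce to a single formula the same way; it shows up in artifacts like SLAs and error budgets rather than in deployment records.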
DORA metrics measure efficiency in terms of throughput, stability, and reliability. This means maximizing speed and output when it comes to deploying functional code — all indicators of success. But what if engineering teams are focusing on the wrong priorities or burning out to meet production demands? What if they're jeopardizing their ability to ship functional code in the future? Are they still high-performing teams?
What Software Efficiency Metrics Can’t Tell You
While DORA metrics focus on team efficiency, they don't reflect effectiveness.
At Uplevel, we define effectiveness as working on the right things in the right ways. This means aligning teams around the right priorities and giving them time to work on them.
The DORA DevOps team acknowledges these variables. The addition of reliability even attempts to address alignment to business value. The 2023 State of DevOps report emphasized the importance of developer well-being. This holistic view is also captured in the SPACE framework, introduced by researchers from the original DORA team, though it lacks distinct metrics.
Let’s come back to the DORA Core Model, which places the metrics within a larger system of capabilities and outcomes.
There are a few things to notice here (beyond the fact that the metrics are actually just a part of the whole picture):
- DORA metrics are lagging indicators. They will give you a picture of delivery performance, but not the context behind it; you'll need more information to know which capabilities are strong or lacking in your organization.
- They don’t capture well-being, which is both an outcome and a predictor of organizational performance.
Why is this? Why don’t classic software engineering metrics attempt to capture problems early? Given well-being's role in predicting organizational performance, why aren't people health metrics included?
DORA's focus on efficiency as a proxy for effectiveness reflects how hard people data is to collect and interpret.
Outside of surveys, how can teams measure “well-being,” “burnout,” or “work recovery”? Available data is evolving, as is our ability to surface insights leading to high-value outcomes.
A New Comprehensive Measure of Engineering Effectiveness
While the State of DevOps reports rely on self-reported survey data, engineering intelligence solutions can also use telemetry from developer tools to quantify team performance and track trends over time.
No framework for understanding engineering performance is perfect. Capturing the nuance and complexity of effectiveness across people and processes is difficult. People data is messy, after all. But advanced ML models and composite metrics now offer greater visibility into the larger picture outlined by the DORA team in the Core Model.
At Uplevel, we've found that engineering leaders juggle three often competing mandates: deliver efficiently, deliver the right things, and deliver sustainably. DORA metrics measure the first (delivery efficiency and quality), but not whether teams are pursuing the right initiatives or working sustainably.
Measuring efficiency is critical for our customers. Uplevel measures the four classic DORA metrics to assess delivery throughput and stability.
But if DORA is not the whole story, what other insights do we surface — and why?
Allocation Insights
A team churning out quick, low-priority deployments may look more effective than one working slowly on a complicated, high-value feature. That’s why aligning engineering efforts with business priorities is crucial.
Uplevel's allocation insights reveal how much time your organization spends on new value demand work vs. KTLO (keep-the-lights-on) work. By understanding how teams spend their time, you can distribute effort to the most impactful activities and better estimate delivery timelines.
This visibility helps align engineering with other business functions, surfacing metrics that support DORA’s goals of creating a transparent value stream.
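The arithmetic behind an allocation view is simple once work is categorized. Here's a minimal sketch, assuming each tracked ticket carries a type and an effort estimate; the category names and fields are hypothetical, not Uplevel's actual model, which infers allocation from much messier signals.

```python
from collections import defaultdict

# Illustrative KTLO categories; real taxonomies vary by organization.
KTLO_TYPES = {"bug", "incident", "maintenance", "support"}

def allocation(tickets: list[dict]) -> dict[str, float]:
    """Share of tracked effort going to new value work vs. KTLO.

    Each ticket is assumed to carry a 'type' and an effort estimate in hours.
    """
    hours: dict[str, float] = defaultdict(float)
    for ticket in tickets:
        bucket = "ktlo" if ticket["type"] in KTLO_TYPES else "new_value"
        hours[bucket] += ticket["effort_hours"]
    total = sum(hours.values()) or 1.0  # avoid dividing by zero on empty input
    return {bucket: spent / total for bucket, spent in hours.items()}

print(allocation([
    {"type": "feature", "effort_hours": 30},
    {"type": "bug", "effort_hours": 10},
    {"type": "incident", "effort_hours": 10},
]))  # {'new_value': 0.6, 'ktlo': 0.4}
```

The hard part in practice is the categorization itself, not the division: tickets are mislabeled, untracked work is invisible, and effort estimates drift.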
With DORA metrics, you might see that your teams are deploying quickly and frequently and infer that they have spare capacity. The problem? You have no way to quantify that capacity.
For example, if you’re measuring deployment frequency in weeks, how much of that time did your teams actually spend on delivery? If their days were packed with meetings and other interruptions, they may not have had the capacity to deploy tons of new code. In that case, low deployment frequency may not accurately depict team efficiency. They may have performed at a high level given their lower capacity.
Instead, we measure how much time teams have for deep work, which we define as uninterrupted blocks of at least two hours. Our deep work metrics account for planned and unplanned interruptions, like meetings and chat, throughout the day.
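As an illustration of that calculation, here's a minimal sketch that merges interruption intervals and counts only the gaps clearing the two-hour bar. The two-hour threshold matches the definition above; everything else (single-day scope, interruptions already extracted from calendars and chat) is a simplifying assumption.

```python
from datetime import datetime, timedelta

DEEP_WORK_MIN = timedelta(hours=2)  # threshold for a deep work block

def deep_work_hours(day_start: datetime, day_end: datetime,
                    interruptions: list[tuple[datetime, datetime]]) -> float:
    """Total hours falling in uninterrupted gaps of at least two hours.

    'interruptions' are (start, end) intervals: meetings, chat bursts, etc.
    Assumes a single working day; intervals may overlap or arrive unsorted.
    """
    # Merge overlapping interruptions so gaps are measured correctly.
    merged: list[list[datetime]] = []
    for start, end in sorted(interruptions):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Walk the gaps between interruptions, keeping only qualifying blocks.
    total, cursor = timedelta(), day_start
    for start, end in merged:
        if start - cursor >= DEEP_WORK_MIN:
            total += start - cursor
        cursor = max(cursor, end)
    if day_end - cursor >= DEEP_WORK_MIN:
        total += day_end - cursor
    return total.total_seconds() / 3600
```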
Meeting and Chat Interruption Insights
Time spent in deep work is not easy to measure. Unscheduled time on a calendar doesn’t mean that a dev is focusing on one thing at a time. Engineers’ days often involve a lot of context switching — from meetings to Slack conversations to a few minutes of coding time and back again.
Uplevel analyzes meeting and chat metadata to show whether meetings and chat are harming deep work time and productivity. Meeting titles and chat timestamps, for example, can provide powerful insight into the disruptions that developers often experience. This visibility helps leaders maximize deep work time and prioritize well-being and productivity.
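One simple proxy for that disruption, sketched below, counts how often a developer's activity stream flips between modes. The event stream and mode labels are illustrative assumptions for this example; real tool metadata is far noisier.

```python
from datetime import datetime

def context_switches(events: list[tuple[datetime, str]]) -> int:
    """Count transitions between activity modes (e.g., 'code' -> 'chat' -> 'meeting').

    Events are (timestamp, mode) pairs derived from tool metadata such as
    meeting times and chat message timestamps.
    """
    modes = [mode for _, mode in sorted(events)]
    return sum(1 for prev, cur in zip(modes, modes[1:]) if prev != cur)
```

A day with thirty such flips tells a very different story than a day with five, even if both produce the same number of commits.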
Allocation and capacity insights are a measure of organizational focus. What are your teams working on and how much time do they have to work on it? Viewing your engineering efforts through this lens can give you a more accurate idea of effectiveness and performance.
Work Sustainability and Well-Being
Developer burnout is a significant issue, highlighted in every DORA report since 2019. Rightly so. It doesn't take a team of researchers to figure out that overworked, unmotivated employees threaten organizational performance.
But the DORA metrics don't include burnout risk. It's difficult to quantify well-being and link it to objective performance. But waiting for performance measures to decline before recognizing burnout risks isn't helpful. True measures of engineering performance should account for burnout early, spurring leaders to action before DORA metrics drop.
To quantify burnout risk at the team level, Uplevel uses composite metrics. Sustained Always On combines low deep work time and “Always On” behaviors (Jira, messaging, and calendar activity beyond normal hours). These metrics provide context for understanding efficiency and sustainability over time.
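A hedged sketch of how such a composite flag might work follows; the thresholds and week-level inputs are invented for illustration, and Uplevel's actual model is tuned differently.

```python
from dataclasses import dataclass

# Illustrative thresholds, invented for this sketch.
LOW_DEEP_WORK_HOURS = 10  # weekly deep work below this counts as "low"
AFTER_HOURS_EVENTS = 15   # weekly off-hours Jira/chat/calendar events above this counts as "Always On"
SUSTAINED_WEEKS = 3       # how many consecutive weeks the pattern must persist

@dataclass
class Week:
    deep_work_hours: float
    after_hours_events: int

def sustained_always_on(weeks: list[Week]) -> bool:
    """Flag burnout risk: low deep work plus off-hours activity over consecutive weeks."""
    streak = 0
    for week in weeks:
        risky = (week.deep_work_hours < LOW_DEEP_WORK_HOURS
                 and week.after_hours_events > AFTER_HOURS_EVENTS)
        streak = streak + 1 if risky else 0
        if streak >= SUSTAINED_WEEKS:
            return True
    return False
```

Requiring the pattern to persist is the point: one bad week is noise, but several in a row is a signal worth acting on before delivery metrics slip.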
AI-Powered Context for DORA Metrics
DevOps efficiency and effectiveness metrics enable enterprise companies to deliver value in a tough market. Allocation and burnout risk are important to the larger context, but they're often missing because they are difficult to capture.
Thankfully, you no longer have to go without.
Modern engineering leadership goes beyond homegrown BI platforms or quarterly dev surveys. Leveraging data science best practices and people health capabilities, Uplevel provides you with the comprehensive team telemetry you need to turn engineering data — including DORA metrics — into actionable change.