Quality-Driven Engineering: How We Use Metrics at Duo to Drive Continuous Improvement

Introduction:

In a fast-paced agile environment where new products and features are developed and released to customers on a sprint-by-sprint basis, it can be challenging to create a quality-focused culture that drives engineering improvements.

At Duo, quality is a shared responsibility, so we do not have a separate manual test team to handle testing of our products. Quality is owned by the entire engineering organization, and everyone collaborates to ensure test coverage via unit, integration, and end-to-end UI tests, plus a few manual tests. We rely heavily on automated tests: we currently have only around 50 manual tests for our main code base, complemented by over 27,000 automated tests.

Furthermore, as our engineering organization has matured, we have developed quality-centric metrics to drive continuous improvement.

Our goal for this blog post is to take a deeper look into the metrics that we collect and report on, the story they tell, and how they drive quality improvements. We hope that readers can walk away with actionable ideas on metrics that can help keep quality front and center of the software development process.

Metrics We Collect and Their Purpose:

Unit Test to Integration Test Ratio:

The test pyramid is a visual guide to the relative number of different types of test cases in a test suite. It suggests having a small number of end-to-end or manual tests, more integration tests, and the most unit tests. Unit tests tend to be the fastest and most reliable, so relying on more of them can also shorten CI (Continuous Integration) time.

Our test counts do not conform nicely to the ratios prescribed by the test pyramid, as we have more integration tests than unit tests. We would prefer to have more unit tests, which is why we track the ratio of unit tests to integration tests, with the goal of increasing that ratio. We have not reached a 1:1 ratio yet, but we have continued to slowly approach that mark.

With more active effort, such as converting legacy integration tests to new unit tests, we could perhaps reach that level more quickly. But our primary approach thus far has been to encourage a higher ratio of unit tests alongside new features. In addition, occasionally sharing a metric like this can increase awareness across the team and help promote the desired behavior.
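The ratio described in this section is simple to compute once each test carries a label for its type. The following is a minimal sketch, assuming a flat list of labels such as "unit" and "integration"; the function name and the sample inventory are hypothetical and are not Duo's tooling or real counts.

```python
from collections import Counter


def unit_to_integration_ratio(test_kinds):
    """Return (#unit tests) / (#integration tests) for an iterable of labels."""
    counts = Counter(test_kinds)
    if counts["integration"] == 0:
        raise ValueError("no integration tests found; ratio is undefined")
    return counts["unit"] / counts["integration"]


# Hypothetical inventory for illustration only.
inventory = ["unit"] * 12 + ["integration"] * 15 + ["e2e"] * 3
print(f"unit:integration = {unit_to_integration_ratio(inventory):.2f}:1")
```

A value below 1.0 means integration tests outnumber unit tests, which is the situation described above; the goal is to push the value toward and past 1.0 over time.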

Unit Test Code Coverage:

This metric helps us measure test coverage as new features are added. We like to see this number go up over time (or at least not decline), as an indicator that our codebase has reasonable test coverage by unit tests. Note that, as mentioned in the previous section, we have more integration tests than unit tests. However, we do not yet collect code coverage for integration test cases.

We have not enacted strict code coverage requirements for new code (e.g. all new code must have >90% unit test coverage). Instead, we have tended to use the measurement as an indicator that test automation continues to be added alongside new feature code. If we do see a dip, or particularly low coverage numbers in a specific area, then we can dive in and add missing test coverage, as needed.
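Watching for a dip, rather than enforcing a fixed threshold, can be sketched as a comparison between consecutive measurements. The example below assumes coverage is recorded once per sprint as a percentage; the function name, the 0.5-point tolerance, and the sample history are all hypothetical.

```python
def coverage_dips(history, tolerance=0.5):
    """Return (label, drop) pairs where coverage fell by more than `tolerance`
    percentage points compared with the previous measurement.

    `history` is a chronological list of (label, coverage_percent) tuples.
    """
    dips = []
    for (_, prev_pct), (label, pct) in zip(history, history[1:]):
        drop = prev_pct - pct
        if drop > tolerance:
            dips.append((label, round(drop, 2)))
    return dips


# Hypothetical per-sprint unit test coverage, in percent.
history = [("sprint-1", 71.0), ("sprint-2", 71.4), ("sprint-3", 69.9), ("sprint-4", 70.2)]
print(coverage_dips(history))
```

Small fluctuations stay below the tolerance and are ignored, so only a meaningful decline prompts someone to look at which area lost coverage.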

Master Branch and CI Developer Delivery Metrics:

The master branch metric tracks the number of times when our master branch goes red, as well as the mean time to restore the CI pipeline to green again. The master branch should always be green, but occasionally a build, test, or infrastructure issue occurs which causes consistent failures for a brief period of time, and it turns red.

We want this number to be consistently low as an indicator of a healthy CI process and minimal or no impact on developers who are merging code into master on a continuous basis.

If an issue does slip through and turn master red, we treat it with importance and urgency. At the same time, we keep in mind one of Duo’s core values of being kinder than necessary. Each issue is examined and its root cause determined in a blame-free manner, so that we can improve tools or processes going forward.
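Both numbers in this metric, how often master goes red and the mean time to restore it, can be derived from a chronological list of build results. This is a minimal sketch under the assumption that each master build is recorded as a (finish time, passed) pair; an incident starts at the first failing build after a green one and ends at the next passing build. The function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def master_red_metrics(builds):
    """Return (incident_count, mean_time_to_restore) for master builds.

    `builds` is a chronological list of (finished_at, passed) tuples.
    mean_time_to_restore is None if no incident has been restored yet.
    """
    incidents = 0
    restored = 0
    total_red_time = timedelta()
    red_since = None  # finish time of the build that turned master red

    for finished_at, passed in builds:
        if not passed and red_since is None:
            red_since = finished_at
            incidents += 1
        elif passed and red_since is not None:
            total_red_time += finished_at - red_since
            restored += 1
            red_since = None

    mttr = total_red_time / restored if restored else None
    return incidents, mttr


# Hypothetical build log: two red incidents lasting 45 and 15 minutes.
day = datetime(2020, 11, 2)
builds = [
    (day.replace(hour=9, minute=0), True),
    (day.replace(hour=10, minute=0), False),
    (day.replace(hour=10, minute=20), False),
    (day.replace(hour=10, minute=45), True),
    (day.replace(hour=13, minute=0), False),
    (day.replace(hour=13, minute=15), True),
]
print(master_red_metrics(builds))
```

Consecutive failing builds count as one incident, so a long outage is reflected in the restore time rather than in an inflated incident count.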

Continuous Integration (CI) Automated Test Pass Rates:

We collect build, static analysis, unit test, and integration test pass rates to inform on the overall health of our automated tests as they are executed during the CI process. A decrease in pass rate is a cue to investigate and determine the root cause. It may be that our master branch turned “red” multiple times, as also tracked by the prior metric. Or it could be due to test flakiness or infrastructure issues, which also can occur as our test automation suite and CI infrastructure grow and scale.

Ideally, the pass rate would be consistently at or near 100%. We first started tracking this metric when it became apparent that numerous automated tests were “flaky” and failing with false positives, which impacted developer productivity. We made a concerted effort to fix or remove flaky tests, and have seen our overall pass rate improve (as seen in the graph above).
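As an illustration of the two ideas in this section, the sketch below computes a pass rate per CI stage and flags candidate flaky tests. It assumes one common working definition of flakiness, namely a test that both passed and failed on the same commit; the post does not state how flaky tests were identified, and the function names and record shapes here are hypothetical.

```python
from collections import defaultdict


def pass_rates(stage_runs):
    """Return {stage: fraction of runs that passed}.

    `stage_runs` is an iterable of (stage_name, passed) tuples.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for stage, passed in stage_runs:
        totals[stage] += 1
        passes[stage] += 1 if passed else 0
    return {stage: passes[stage] / totals[stage] for stage in totals}


def flaky_candidates(test_results):
    """Return sorted names of tests that both passed and failed on one commit.

    `test_results` is an iterable of (test_name, commit, passed) tuples.
    """
    outcomes = defaultdict(set)
    for name, commit, passed in test_results:
        outcomes[(name, commit)].add(bool(passed))
    return sorted({name for (name, _), seen in outcomes.items() if len(seen) == 2})


# Hypothetical CI records.
stage_runs = [("build", True), ("unit", True), ("unit", True), ("unit", False), ("unit", True)]
test_results = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),  # same commit, different outcome
    ("test_push", "abc123", False),
    ("test_push", "abc123", False),   # consistently failing, not flaky
]
print(pass_rates(stage_runs))
print(flaky_candidates(test_results))
```

Separating consistent failures from mixed outcomes helps distinguish a genuinely broken change from noise in the test suite or infrastructure.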

Manual Test Counts:

This metric represents the number of manual tests — those not covered by automated end-to-end UI tests — that we run each sprint.

Over time, we want the number of manual tests to go down, aspirationally approaching zero manual tests. Reducing manual regression tests helps scale our development process as more features are added.

This can be a “catch-22” of sorts, since as new features are added, new manual tests that verify end-to-end functionality are also added. To that end, we’ve been working on automating portions of these tests each quarter with the ultimate goal of fully automating all of them over time. To avoid growing the set of manual tests, we also strive to have Quality and Development teams pair so that end-to-end UI test automation is added alongside new features, rather than adding a manual test alongside the feature and automating it later.

At the time of writing (Q4 2020), we are at 50 manual tests, down from a high of 60 manual tests in Q2 of 2020.

Release Metrics:

At the end of our bi-weekly sprint release cycle, and before the software is deployed to pre-production environments, the Quality team coordinates a one-day acceptance testing process in which the Engineering and Quality teams collaborate to ensure that no defects or regressions were introduced. During acceptance testing, we track the number of issues found, as well as test completion times. Over time, we want to see the issues found during acceptance testing go down to zero as we strive to move testing “left” in the development cycle, since the cost of addressing issues closer to release is much higher.

We use metrics such as test completion times to improve the acceptance testing process, thereby reducing the time it takes to deploy new features to our customers.

Using such metrics, we’ve been able to reduce the time it takes to complete acceptance testing from 2-3 business days to 1 day, which has been a huge win for us!
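The two release metrics mentioned here, issues found and completion time, lend themselves to a rolling summary over recent sprints. The following is a rough sketch only; the record fields, the three-sprint window, and the sample numbers are assumptions made for illustration and do not reflect Duo's data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AcceptanceRun:
    sprint: str
    issues_found: int
    hours_to_complete: float


def summarize(runs, window=3):
    """Summarize the most recent `window` acceptance testing runs."""
    recent = runs[-window:]
    return {
        "avg_issues": mean(r.issues_found for r in recent),
        "avg_hours": mean(r.hours_to_complete for r in recent),
        "issue_free_sprints": sum(1 for r in recent if r.issues_found == 0),
    }


# Hypothetical history: issues and completion time both trending down.
runs = [
    AcceptanceRun("sprint-1", 4, 20.0),
    AcceptanceRun("sprint-2", 3, 12.0),
    AcceptanceRun("sprint-3", 0, 8.0),
    AcceptanceRun("sprint-4", 0, 7.0),
]
print(summarize(runs))
```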

Conclusion:

Quality-centric metrics can be a key ally in driving continuous improvement. At Duo, we’ve developed a broad range of them over time, and we review them regularly to see where we are doing well and where we can improve.

Looking ahead, collecting and reporting on metrics is a journey we continue to iterate on.

We continue to make improvements in order to increase our CI pass rates, and to reduce the occurrences of our master branch turning “red.” And we continue to strive to reduce our manual test count even further, with zero as an aspirational goal.

We hope the metrics and information in this blog post are helpful and give you ideas to apply on your own journey toward quality-driven continuous improvement.

