High-performing engineering teams and the Holy Grail

A presentation at Civo Navigate in February 2023 in Tampa, FL, USA by Jeremy Meiss

Jeremy Meiss Director, DevRel & Community

Forrester 2021 Total Economic Impact study Using best-in-class CI/CD platforms can provide: $7.8 million saved from shorter software development cycles. $4.3 million recuperated in lost developer productivity. 50% decrease in annual infrastructure spend. $1.7 million estimated value of improved code quality.

CI/CD Benchmarks for high-performing teams Duration Mean time to recovery Success rate Throughput

Duration the foundation of software engineering velocity, measures the average time in minutes required to move a unit of work through your pipeline
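
As a rough illustration of the Duration definition above, the sketch below averages the wall-clock time of a few pipeline runs. The sample timestamps and the function name `average_duration_minutes` are made up for this example and do not come from the talk or from any CI provider's API.

```python
from datetime import datetime
from statistics import mean

# Hypothetical sample data: (start, end) timestamps of completed pipeline runs.
runs = [
    (datetime(2023, 2, 8, 10, 0), datetime(2023, 2, 8, 10, 4)),
    (datetime(2023, 2, 8, 11, 0), datetime(2023, 2, 8, 11, 9)),
    (datetime(2023, 2, 8, 12, 0), datetime(2023, 2, 8, 12, 5)),
]


def average_duration_minutes(runs):
    """Mean wall-clock time, in minutes, from start to end of each run."""
    return mean((end - start).total_seconds() / 60 for start, end in runs)


print(average_duration_minutes(runs))
```

With these made-up runs of 4, 9, and 5 minutes, the average lands inside the 5-10 minute benchmark range mentioned in the slides.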

<=10 minute builds “a good rule of thumb is to keep your builds to no more than ten minutes. Many developers who use CI follow the practice of not moving on to the next task until their most recent checkin integrates successfully. Therefore, builds taking longer than ten minutes can interrupt their flow.” — Paul M. Duvall (2007). Continuous Integration: Improving Software Quality and Reducing Risk

Duration: What the data shows Benchmark: 5-10mins

“Why so much lower than the Duration benchmark?”

Improving test coverage Add unit, integration, UI, and end-to-end testing across all app layers Incorporate code coverage tools into pipelines to identify inadequate testing Include static and dynamic security scans to catch vulnerabilities Incorporate TDD practices by writing tests during design phase

Optimizing your pipelines Use test splitting and parallelism to execute multiple tests simultaneously Cache dependencies and other data to avoid rebuilding unchanged portions Use Docker images custom made for CI environments Choose the right machine size for your needs
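
One way to picture test splitting by timing is a greedy "longest test first" assignment to the least-loaded worker. The sketch below assumes you already have per-test timings from a previous run; the test names, timings, and the `split_by_timing` helper are hypothetical, and real CI platforms may use a different splitting strategy.

```python
import heapq


def split_by_timing(test_times, workers):
    """Greedily assign tests (longest first) to the currently least-loaded worker."""
    # Heap entries are (total_seconds_assigned, worker_index).
    heap = [(0.0, i) for i in range(workers)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(workers)]
    longest_first = sorted(test_times.items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in longest_first:
        load, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))
    return buckets


# Hypothetical timings (seconds) recorded from an earlier run.
timings = {
    "test_api": 120.0,
    "test_ui": 90.0,
    "test_db": 60.0,
    "test_auth": 30.0,
    "test_utils": 30.0,
}
buckets = split_by_timing(timings, workers=2)
for i, bucket in enumerate(buckets):
    print(i, bucket, sum(timings[t] for t in bucket))
```

Run serially, these tests would take 330 seconds; split across two workers, the slower worker finishes in 180 seconds, which is where the Duration savings come from.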

Duration and the Platform Team Identify and eliminate impediments to developer velocity Set guardrails and enforce quality standards across projects Standardize test suites and CI pipeline configs, i.e. shareable config templates and policies Welcome failed pipelines, i.e. fast failure Actively monitor, streamline, and parallelize pipelines across the org

Mean time to Recovery the average time required to go from a failed build signal to a successful pipeline run
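
A minimal sketch of how this metric could be computed from a run history, assuming each run is recorded as a finish time plus a pass/fail flag. The data and the `mean_time_to_recovery` name are illustrative only; the clock starts at the first failure of a streak and stops at the next passing run.

```python
from datetime import datetime


def mean_time_to_recovery(runs):
    """Average minutes from the first failure in a streak to the next passing run."""
    recoveries = []
    failed_at = None
    for finished_at, passed in sorted(runs):
        if not passed and failed_at is None:
            failed_at = finished_at  # start of a failure streak
        elif passed and failed_at is not None:
            recoveries.append((finished_at - failed_at).total_seconds() / 60)
            failed_at = None  # recovered; wait for the next failure
    return sum(recoveries) / len(recoveries) if recoveries else None


# Hypothetical run history on the default branch: (finished_at, passed).
history = [
    (datetime(2023, 2, 8, 10, 0), True),
    (datetime(2023, 2, 8, 10, 30), False),
    (datetime(2023, 2, 8, 10, 45), False),
    (datetime(2023, 2, 8, 11, 10), True),
    (datetime(2023, 2, 8, 13, 0), False),
    (datetime(2023, 2, 8, 13, 20), True),
]
print(mean_time_to_recovery(history))
```

Here the two recoveries take 40 and 20 minutes, so the mean is 30 minutes. A failure streak that has not yet recovered is left out of the average in this sketch, which is a simplifying assumption.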

Mean time to recovery is indicative of resilience

“A key part of doing a continuous build is that if the mainline build fails, it needs to be fixed right away. The whole point of working with CI is that you’re always developing on a known stable base.” — Fowler, Martin. “Continuous Integration.” Web blog post. MartinFowler.com. 1 May 2006. Web.

MTTR: What the data shows Benchmark: 60mins

“10 minutes is a striking improvement - what happened?”

Two factors impacting reduced MTTR Economic pressures in the macro environment + rising competition in the micro environment, forcing teams to prioritize product stability and reliability over growth High performers increasingly rely on platform teams to achieve steadier and more resilient development pipelines with built-in recovery mechanisms.

Treat your default branch as the lifeblood of your project

Getting to faster recovery times Set up instant alerts for failed builds using services like Slack, Twilio, or Pagerduty. Write clear, informative error messages for your tests that allow you to quickly diagnose the problem and focus your efforts in the right place. SSH into the failed build machine to debug in the remote test environment. Doing so gives you access to valuable troubleshooting resources, including log files, running processes, and directory paths.

MTTR and the Platform Team Emphasize the value of deploy-ready, default branches, with clear processes & expectations for failure recovery across all projects Set up effective monitoring and alerting systems, and track recovery time Limit frequency and severity of broken builds with role-based access control and config policies Config- and Infrastructure-as-Code tools limit potential for misconfig errors Actively monitor, streamline, and parallelize pipelines across the org

Success Rate number of passing runs divided by the total number of runs over a period of time
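
The definition above is a simple ratio. The sketch below assumes run outcomes are available as status strings; the `"success"` and `"failed"` labels and the `success_rate` helper are placeholders for whatever your CI system reports.

```python
def success_rate(statuses):
    """Passing runs divided by total runs; None when there are no runs."""
    if not statuses:
        return None
    passing = sum(1 for status in statuses if status == "success")
    return passing / len(statuses)


# Hypothetical outcomes for ten runs on the default branch over one week.
week = ["success"] * 9 + ["failed"]
print(success_rate(week))
```

Nine passing runs out of ten gives 0.9, right at the 90% benchmark for the default branch.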

Success rate: What the data shows Benchmark: 90%+ on default

Success rate and the Platform Team With low success rates, look at your MTTR and shorten recovery time first Set a baseline success rate, then aim for continuous improvement, looking for flaky tests or gaps in test coverage Be mindful of patterns and influence of external factors, i.e. decline on Fridays, holidays, etc.

Throughput average number of workflow runs that an organization completes on a given project per day
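
A small sketch of the throughput calculation, assuming you have the completion date of each workflow run for one project. The dates and the `throughput_per_day` helper are invented for illustration. Counting idle calendar days in the denominator is also an assumption here, since the slides do not say how empty days are treated.

```python
from datetime import date


def throughput_per_day(run_dates, period_start, period_end):
    """Average completed workflow runs per calendar day in an inclusive period."""
    days = (period_end - period_start).days + 1
    in_period = [d for d in run_dates if period_start <= d <= period_end]
    return len(in_period) / days


# Hypothetical completion dates of workflow runs for one project.
completed = [
    date(2023, 2, 6), date(2023, 2, 6), date(2023, 2, 6),
    date(2023, 2, 7),
    date(2023, 2, 9), date(2023, 2, 9),
]
print(throughput_per_day(completed, date(2023, 2, 6), date(2023, 2, 9)))
```

Six runs over four days gives 1.5 runs per day. In line with the benchmark above, the useful comparison is against your own baseline rather than a universal number.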

Throughput: What the data shows Benchmark: at the speed of your business

Throughput and the Platform Team Map goals to reality of internal and external business situations, i.e. customer expectations, competitive landscape, codebase complexity, etc. Capture a baseline, monitor for deviations Alleviate as much developer cognitive load as possible from day-to-day work

“Surely <insert programming language> helps me achieve the “Holy Grail”!?”

Thank You. timeline.jerdog.me IAmJerdog jerdog /in/jeremymeiss For feedback and swag: circle.ci/jeremy @jerdog@hachyderm.io

Jeremy Meiss
@jeremymeiss


“High-performing engineering teams” are the Holy Grail for every CTO. But what are they, are they attainable, and if so, how? In this talk we’ll look at anonymous data collected each year since 2019, and explore this rare specimen in its native habitat - right there in your organization, and how to activate them. We’ll also gain some interesting insights into better DevOps practices along the way.

Buzz and feedback

Here’s what was said about this presentation on social media.

Conference season 2023 kicks off for me today with a trip to Tampa for @CivoCloud's #CivoNavigate. Looking forward to presenting some metrics on the Holy Grail - high-performing engineering teams. pic.twitter.com/8WNP5VD54G
— Jeremy, The Patronizing Saint of DevOps 🇺🇲🇺🇦 (@IAmJerdog) February 6, 2023
Come to my talk this morning, at 10a ET in The Theatre, and hear about high-performing engineering teams. Or just come for GIFs like these... Either way, see you soon at #CivoNavigate! https://t.co/JaJ50z3pi3 pic.twitter.com/quef4isdP2
— Jeremy, The Patronizing Saint of DevOps 🇺🇲🇺🇦 (@IAmJerdog) February 8, 2023
This was a fabulous and very entertaining talk, but not just hilarious slides. Also great content! @CivoCloud #CivoNavigate pic.twitter.com/FtD9ekCGlv
— Lisa-Marie Namphy 😇 (@SWDevAngel) February 8, 2023
The meme potential of this @CivoCloud talk is high. Looking forward to what @IAmJerdog has to say about high performance teams and medieval tomfoolery. pic.twitter.com/7xnSwEE92O
— Ramiro Berrelleza (@rberrelleza) February 8, 2023
Listening @IAmJerdog about high performing engineering teams at @CivoCloud Navigate pic.twitter.com/8R0CBpDZ37
— ☸️ Civo Navigate 💜 #wasm (@k33g_org) February 8, 2023
Sup @IAmJerdog 👋🏽🔥

Always great hearing your humorous talks with great perspectives on cloud native and dev technology #CivoNavigate pic.twitter.com/OvRRPkg6XN
— Marino Wijay (he/him) 🇨🇦🍁 (@virtualized6ix) February 8, 2023
“It’s okay to fail” is a recurring theme in this morning’s talks at @CivoCloud #CivoNavigate from @oicheryl & @IAmJerdog. If you’re not “failing” your not trying! 😀🤩🚀 pic.twitter.com/lDh8fppASy
— Lisa-Marie Namphy 😇 (@SWDevAngel) February 8, 2023
🚀 Change is coming to the technology industry, and our 2023 SOSDR is here to reveal how your #DevOps teams can deliver high-impact results for engineering productivity based on our analysis of nearly 15 million CircleCI workflows.

Read the report: https://t.co/lDXxc9ml7k pic.twitter.com/lX5PmEg52f
— CircleCI (@CircleCI) April 6, 2023
