What a global pandemic can tell you about better DevOps practices

A presentation at ConFoo 2022 in February 2022 in Montreal, QC, Canada by Jeremy Meiss

Slide 1

DevOps & Software Delivery in a Global Pandemic

Slide 2

Slide 3

Slide 4

performance described vs performance derived

Slide 5

Jeremy Meiss, Director, DevRel & Community

Slide 6

  • 2+ million jobs/day
  • 43,000+ orgs (40k in 2019)
  • 290,000+ projects (150k in 2019)
  • 1,000x larger than surveys

Slide 7

Four classic metrics

  • Deployment frequency
  • Lead time to change
  • Change failure rate
  • Recovery from failure time

Slide 8

CI/CD Benchmarks for high performance

  • Throughput: at will
  • Duration: <10 minutes
  • Success Rate: 90%
  • Mean Time to Recovery: <1 hour
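As a rough illustration of how a team might check itself against these benchmarks, here is a minimal sketch. The function name, the dictionary keys, and the sample values are all hypothetical. Treating the 90% success rate as "90% or better" is an assumption, and Throughput is left out because "at will" has no numeric target.

```python
from datetime import timedelta

# Thresholds copied from the slide. "At will" Throughput is not numeric,
# so it is deliberately not checked here.
MAX_DURATION = timedelta(minutes=10)
MIN_SUCCESS_RATE = 0.90  # assumption: read as "90% or better"
MAX_RECOVERY = timedelta(hours=1)


def check_benchmarks(duration, success_rate, recovery):
    """Return a pass/fail flag per benchmark (hypothetical helper)."""
    return {
        "duration": duration < MAX_DURATION,
        "success_rate": success_rate >= MIN_SUCCESS_RATE,
        "recovery": recovery < MAX_RECOVERY,
    }


# Illustrative values only, not data from the report.
result = check_benchmarks(
    duration=timedelta(minutes=4),
    success_rate=0.80,
    recovery=timedelta(minutes=56),
)
print(result)
```

With these sample values the duration and recovery checks pass, while the success rate check fails.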

Slide 9

The Data

Slide 10

Photo by: Matthew Henry

Slide 11

Throughput

Slide 12

Most teams are not deploying dozens of times per day

Slide 13

Image by Pawan Kolhe from Pixabay

Slide 14

Duration

Slide 15

Photo by Lukas from Pexels

Slide 16

Success Rate

Slide 17

Photo by Brett Sayles from Pexels

Slide 18

Recovery Time

Slide 19

Recovery Time

Slide 20

Recovery Time

Slide 21

The Insight

Slide 22

2020/21 was quite a timeline.

Slide 23

Throughput

Slide 24

Throughput in a global pandemic

Slide 25

Peak Throughput was in April 2020

Slide 26

Duration

Slide 27

Duration in a global pandemic

Slide 28

Hypothesis: more tests written in March, driving up Duration. In April, a concerted effort on optimization

Slide 29

Success rate

Slide 30

Success rate in a global pandemic

Slide 31

Success rate in a global pandemic

Slide 32

Success rate in a global pandemic

Slide 33

Hypothesis: people working hard on core business stability

Slide 34

Recovery Time

Slide 35

Recovery time in a global pandemic

Slide 36

Hypothesis: few distractions* working at home

Slide 37

Important to set targets

Throughput: the average number of workflow runs per day
  • Median CircleCI developer: 0.7 times/day
  • Suggested benchmark: merge on any pull request

Duration: the average length of time for a workflow to run
  • Median CircleCI developer: < 4 minutes
  • Suggested benchmark: 5-10 minutes

Mean time to recovery: the average time between failures & their next success
  • Median CircleCI developer: < 56 minutes
  • Suggested benchmark: under 1 hour

Success rate: the number of successful runs / the total number of runs over a period of time
  • Median CircleCI developer: 80% for default branch
  • Suggested benchmark: 90% or better on default branch
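The four definitions above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how CircleCI computes its numbers. The `Run` record, its field names, and the sample data are hypothetical, and measuring recovery from the end of the first failed run to the end of the next successful run is one possible reading of "time between failures & their next success".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    # Hypothetical record of one workflow run; field names are illustrative.
    started: datetime
    finished: datetime
    succeeded: bool


def throughput(runs, days):
    """Average number of workflow runs per day over a window of `days` days."""
    return len(runs) / days


def mean_duration(runs):
    """Average length of time for a workflow to run."""
    total = sum((r.finished - r.started for r in runs), timedelta())
    return total / len(runs)


def success_rate(runs):
    """Number of successful runs divided by the total number of runs."""
    return sum(r.succeeded for r in runs) / len(runs)


def mean_time_to_recovery(runs):
    """Average time from a failure to the next success (None if no recovery)."""
    recoveries = []
    failed_at = None  # end time of the first failure in the current streak
    for r in sorted(runs, key=lambda r: r.finished):
        if not r.succeeded and failed_at is None:
            failed_at = r.finished
        elif r.succeeded and failed_at is not None:
            recoveries.append(r.finished - failed_at)
            failed_at = None
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)


# Made-up sample data: four runs over a two-day window, one of them failing.
t0 = datetime(2021, 1, 4, 9, 0)
runs = [
    Run(t0, t0 + timedelta(minutes=4), True),
    Run(t0 + timedelta(hours=1), t0 + timedelta(hours=1, minutes=6), False),
    Run(t0 + timedelta(hours=1, minutes=30), t0 + timedelta(hours=1, minutes=36), True),
    Run(t0 + timedelta(hours=3), t0 + timedelta(hours=3, minutes=4), True),
]

print(throughput(runs, days=2))     # runs per day
print(mean_duration(runs))          # average run length
print(success_rate(runs))           # fraction of successful runs
print(mean_time_to_recovery(runs))  # average failure-to-success gap
```

For this sample the sketch gives 2 runs/day, a 5-minute mean duration, a 75% success rate, and a 30-minute recovery time. A real pipeline would likely compute these per branch and per project before comparing them to the targets.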

Slide 38

Things that make you go 🤔

Slide 39

Success Rate on default branch higher than on non-default

Slide 40

Duration on default branches faster at every percentile

Slide 41

Recovery Time lower on default branches at every percentile

Slide 42

What development practices definitively work?

Slide 43

Success Rate does not correlate with company size

Slide 44

Duration is longest for teams of one

Slide 45

Recovery Time decreases with increased team size (up to 200)

Slide 46

Performance is better with >1 contributor

Slide 47

Software is collaborative

Slide 48

Language by Throughput

Slide 49

Language by Success Rate

Slide 50

Language by fastest MTTR

Slide 51

Language by shortest duration

Slide 52

“Don’t deploy on Friday” is not a thing.

Slide 53

“Don’t Deploy on Friday” is not a thing

  • 70% less Throughput on weekends
  • 11% less Throughput on Friday (UTC)
  • 9% less Throughput on Monday (UTC)
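One way to arrive at "percent less Throughput" figures like these is to count runs per UTC weekday and compare each day with a baseline. The slide does not say what the baseline is, so the sketch below assumes the Tuesday to Thursday average. The helper names and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def weekday_counts(timestamps):
    """Count runs per UTC weekday (0 = Monday ... 6 = Sunday)."""
    return Counter(ts.astimezone(timezone.utc).weekday() for ts in timestamps)


def percent_less(count, baseline):
    """How much lower `count` is than `baseline`, as a percentage."""
    return round(100 * (baseline - count) / baseline, 1)


def stamps(day, n):
    """Hypothetical helper: n run timestamps on a given day of January 2021."""
    return [datetime(2021, 1, day, 12, 0, tzinfo=timezone.utc)] * n


# Made-up sample: Tue-Thu 10 runs each, Friday 9 runs, Saturday 3 runs.
timestamps = stamps(5, 10) + stamps(6, 10) + stamps(7, 10) + stamps(8, 9) + stamps(9, 3)

counts = weekday_counts(timestamps)
# Assumed baseline: mean of Tuesday (1), Wednesday (2), Thursday (3).
baseline = sum(counts[d] for d in (1, 2, 3)) / 3

friday_drop = percent_less(counts[4], baseline)
saturday_drop = percent_less(counts[5], baseline)
print(friday_drop, saturday_drop)
```

In this invented sample Friday comes out 10% below the midweek baseline and Saturday 70% below. A different baseline, such as the all-week average, would give different percentages.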

Slide 54

2021/22 Sneak Peek

1. Workflows with 0 tests increase YoY, but decrease as a share of all workflows
2. More deployments YoY
3. Change validation

Slide 55

2021/22 Sneak Peek

Slide 56

2021/22 Sneak Peek

Slide 57

2021/22 Sneak Peek

Slide 58

2021/22 Sneak Peek

Top Languages by # of workflows

1. TypeScript: 2,141,524
2. JavaScript: 1,989,404
3. Ruby: 1,712,578
4. Python: 1,610,022
5. Go: 684,239
6. Java: 568,671
7. PHP: 475,190
8. Kotlin: 293,032
9. HCL: 260,143
10. HTML: 256,976
11. Shell: 221,042
12. Swift: 206,635
13. Scala: 152,340
14. Elixir: 133,194
15. Jupyter Notebook: 130,424
16. Vue: 125,126
17. C#: 88,364
18. C++: 80,022
19. Gherkin: 53,844
20. CSS: 48,955
21. Clojure: 47,281
22. Apex: 32,073
23. Rust: 28,144
24. C: 26,607
25. Dart: 23,604

Slide 59

2021/22 Sneak Peek

Shortest Duration by Language

1. Batchfile
2. SaltStack
3. Makefile
4. Smarty
5. Jsonnet
6. Shell
7. Mustache
8. HCL
9. FreeMarker
10. Dockerfile
11. PLSQL
12. Jinja
13. Elm
14. Lua
15. Liquid
16. VCL
17. Open Policy Agent
18. Groovy
19. Go
20. Starlark
21. API Blueprint
22. Roff
23. HTML
24. R
25. Python

Slide 60

2021/22 Sneak Peek

Shortest MTTR by Language

1. Gherkin
2. HCL
3. JavaScript
4. Go
5. Clojure
6. C#
7. Vue
8. TypeScript
9. Ruby
10. Python
11. PHP
12. Perl
13. Shell
14. Kotlin
15. Elixir
16. HTML
17. Scala
18. Jupyter Notebook
19. Java
20. Swift
21. Apex
22. CSS
23. C++
24. Rust
25. C

Slide 61

2021/22 Sneak Peek

Throughput by Language

1. Hack
2. Slim
3. Elm
4. Mustache
5. Haskell
6. Jinja
7. Gherkin
8. Jsonnet
9. Jupyter Notebook
10. Apex
11. TypeScript
12. Swift
13. Ruby
14. Dart
15. Elixir
16. Go
17. C#
18. Kotlin
19. Blade
20. Scala
21. Python
22. LookML
23. Lua
24. CoffeeScript
25. Clojure

Slide 62

2021/22 Sneak Peek

Success Rate by Language

1. Dockerfile
2. Vue
3. Shell
4. Go
5. SCSS
6. HTML
7. TypeScript
8. PHP
9. Python
10. C#
11. HCL
12. JavaScript
13. Elixir
14. Clojure
15. Jupyter Notebook
16. Java
17. Scala
18. CSS
19. PLpgSQL
20. Kotlin
21. Ruby
22. Makefile
23. Groovy
24. TSQL
25. Gherkin

Slide 63

2021/22 Sneak Peek

The 50th percentile on CircleCI fits into the “Elite performer” category of the 2021 State of DevOps report

Slide 64

Full Report https://circle.ci/ssd2020

Slide 65

Thank you.

Timeline.jerdog.me

For feedback and swag: circle.ci/jeremy

IAmJerdog, jerdog, /in/jeremymeiss