What a global pandemic can tell you about better DevOps practices

A presentation at Code PaLOUsa 2021 in August 2021 by Jeremy Meiss

Slide 1

DevOps & Software Delivery in a Global Pandemic

Slide 2

Slide 3

Slide 4

performance described vs performance derived

Slide 5

Jeremy Meiss Director, DevRel & Community

Slide 6

  • 2 million jobs/day
  • 44,000+ orgs (40k in 2019)
  • 160,000+ projects (150k in 2019)
  • 1,000x larger than surveys

Slide 7

Four classic metrics

  • Deployment frequency
  • Lead time to change
  • Change failure rate
  • Recovery from failure time

Slide 8

CI/CD Benchmarks for high performance

  • Throughput: At will
  • Duration: <10 minutes
  • Success Rate: 90%
  • Mean Time to Recovery: <1 hour

Slide 9

The Data

Slide 10

Photo by: Matthew Henry

Slide 11

Throughput

Slide 12

Most teams are not deploying dozens of times per day

Slide 13

Image by Pawan Kolhe from Pixabay

Slide 14

Duration

Slide 15

Photo by Lukas from Pexels

Slide 16

Success Rate

Slide 17

Photo by Brett Sayles from Pexels

Slide 18

Recovery Time

Slide 19

Recovery Time

Slide 20

Recovery Time

Slide 21

The Insight

Slide 22

2020 has been a year.

Slide 23

Throughput

Slide 24

Throughput in a global pandemic

Slide 25

Peak Throughput was in April 2020

Slide 26

Duration

Slide 27

Duration in a global pandemic

Slide 28

Hypothesis: more tests written in March, driving up Duration. In April, a concerted effort on optimization

Slide 29

Success rate

Slide 30

Success rate in a global pandemic

Slide 31

Success rate in a global pandemic

Slide 32

Success rate in a global pandemic

Slide 33

Hypothesis: people working hard on core business stability

Slide 34

Recovery Time

Slide 35

Recovery time in a global pandemic

Slide 36

Hypothesis: few distractions* working at home

Slide 37

Important to set targets

  • Throughput: the average number of workflow runs per day. Median CircleCI developer: 0.7 times/day. Suggested benchmark: merge on any pull request.
  • Duration: the average length of time for a workflow to run. Median CircleCI developer: < 4 minutes. Suggested benchmark: 5-10 minutes.
  • Mean time to recovery: the average time between failures & their next success. Median CircleCI developer: < 56 minutes. Suggested benchmark: under 1 hour.
  • Success rate: the number of successful runs / the total number of runs over a period of time. Median CircleCI developer: 80% for default branch. Suggested benchmark: 90% or better on default branch.
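
The four definitions on this slide can be turned into a few lines of code. What follows is a minimal sketch, not CircleCI's actual calculation: the `Run` record, the function names, and `SAMPLE_RUNS` are all hypothetical, and the sample values are made up for illustration. Recovery time is assumed to be measured from the end of the first failed run in a streak to the end of the next successful run.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    """One workflow run (hypothetical record, not a CircleCI API type)."""
    started: datetime
    finished: datetime
    succeeded: bool


def throughput(runs, days):
    """Average number of workflow runs per day over a period of `days` days."""
    return len(runs) / days


def mean_duration(runs):
    """Average length of time for a workflow to run."""
    total = sum((r.finished - r.started for r in runs), timedelta())
    return total / len(runs)


def success_rate(runs):
    """Number of successful runs divided by the total number of runs."""
    return sum(r.succeeded for r in runs) / len(runs)


def mean_time_to_recovery(runs):
    """Average time between a failure and the next success.

    Assumption: the clock starts when the first failed run of a streak
    finishes and stops when the next successful run finishes.
    """
    gaps = []
    failed_at = None
    for r in sorted(runs, key=lambda r: r.finished):
        if not r.succeeded and failed_at is None:
            failed_at = r.finished
        elif r.succeeded and failed_at is not None:
            gaps.append(r.finished - failed_at)
            failed_at = None
    if not gaps:
        return None  # no failure was followed by a success
    return sum(gaps, timedelta()) / len(gaps)


# Made-up sample: four runs across two days, one failure fixed 30 minutes later.
SAMPLE_RUNS = [
    Run(datetime(2020, 4, 1, 9, 0), datetime(2020, 4, 1, 9, 4), True),
    Run(datetime(2020, 4, 1, 10, 0), datetime(2020, 4, 1, 10, 6), False),
    Run(datetime(2020, 4, 1, 10, 30), datetime(2020, 4, 1, 10, 36), True),
    Run(datetime(2020, 4, 2, 9, 0), datetime(2020, 4, 2, 9, 4), True),
]

if __name__ == "__main__":
    print("throughput/day:", throughput(SAMPLE_RUNS, days=2))
    print("mean duration:", mean_duration(SAMPLE_RUNS))
    print("success rate:", success_rate(SAMPLE_RUNS))
    print("mean time to recovery:", mean_time_to_recovery(SAMPLE_RUNS))
```

Comparing these numbers against the suggested benchmarks (for example, a success rate of 0.9 or better on the default branch) is then a matter of simple thresholds.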

Slide 38

Things that make you go 🤔

Slide 39

Branch information

Slide 40

No significant change in default branch from master… yet.

Slide 41

Success Rate on default branch higher than on non-default

Slide 42

Duration on default branches faster at every percentile

Slide 43

Recovery Time lower on default branches at every percentile

Slide 44

What development practices definitively work?

Slide 45

Success Rate does not correlate with company size

Slide 46

Duration is longest for teams of one

Slide 47

Recovery Time decreases with increased team size (up to 200)

Slide 48

Performance is better with >1 contributor

Slide 49

Software is collaborative

Slide 50

Language by Throughput

Slide 51

Language by Success Rate

Slide 52

Language by fastest TTR

Slide 53

Language by shortest duration

Slide 54

“Don’t deploy on Friday” is not a thing.

Slide 55

“Don’t Deploy on Friday” is not a thing

  • 70% less Throughput on weekends
  • 11% less Throughput on Friday (UTC)
  • 9% less Throughput on Monday (UTC)
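
A team can run the same kind of comparison on its own pipeline history. The sketch below is one possible way to do it, under assumptions the slide does not state: the baseline is taken to be the Tuesday to Thursday average, the function names are hypothetical, and `SAMPLE_DATES` is synthetic data chosen only to mirror the percentages above, not data from the report.

```python
from collections import Counter
from datetime import date, timedelta

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def runs_per_weekday(run_dates):
    """Count workflow runs for each weekday (index 0 = Monday)."""
    counts = Counter(d.weekday() for d in run_dates)
    return [counts.get(i, 0) for i in range(7)]


def percent_below_baseline(counts, baseline_days=(1, 2, 3)):
    """Percent less throughput per weekday, relative to a baseline.

    Assumption: the baseline is the mean of Tuesday to Thursday; the
    slide does not say what its percentages are measured against.
    """
    baseline = sum(counts[i] for i in baseline_days) / len(baseline_days)
    return {
        name: round(100 * (1 - counts[i] / baseline), 1)
        for i, name in enumerate(DAY_NAMES)
    }


# Synthetic week starting Monday 2020-04-06, in UTC calendar dates.
_MONDAY = date(2020, 4, 6)
_RUNS_BY_DAY = [91, 100, 100, 100, 89, 30, 30]
SAMPLE_DATES = [
    _MONDAY + timedelta(days=offset)
    for offset, n in enumerate(_RUNS_BY_DAY)
    for _ in range(n)
]

if __name__ == "__main__":
    counts = runs_per_weekday(SAMPLE_DATES)
    for day, pct in percent_below_baseline(counts).items():
        print(f"{day}: {pct}% less throughput than the Tue-Thu average")
```

Because the slide reports days in UTC, run timestamps should be converted to UTC before they are reduced to calendar dates.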

Slide 56

Full Report https://circle.ci/ssd2020

Slide 57

Thank you.