The Unicorn Project and the Five Ideals.pdf

1,009 views 95 slides Jun 21, 2022

About This Presentation

The Five Ideals: How Structure and Architecture Enable Developer Productivity - Gene Kim


Slide Content

@RealGeneKim
Gene Kim
The Unicorn Project
And The Five Ideals

My Definition of DevOps
The architecture, technical practices, and cultural norms that enable us to…
increase our ability to deliver applications and services...
quickly and safely, which enables rapid experimentation and innovation, and the fastest delivery of value to our customers…
while ensuring world-class security, reliability, and stability...
…so that we can win in the marketplace.
Source: DevOps Handbook (Kim, Humble, Willis, Debois)

Better Value,
Sooner, Safer, Happier
Source: Jon Smart, Partner, Founder, Sooner, Safer, Happier (@jonsmart)

The Problems That Still Remain
§Absence of all the invisible structures needed to
enable developer productivity
§The orthogonal problem of getting data from
where it resides to where it needs to be used
§Strong opposition to supporting new ways of
working
§Ambiguity about which behaviors need to be supported
during a transformation

The Five Ideals
1.Locality and Simplicity
2.Focus, Flow, and Joy
3.Improvement of Daily Work
4.Psychological Safety
5.Customer Focus

The Business Value Of DevOps
Is Even Higher Than We Thought

Elite vs. Low Performers
Metric | Elite | Low | Difference
Deployment Frequency | On-demand (multiple times per day) | Monthly or quarterly | 208x
Deployment Lead Time | < 1 hour | 1 day to 1 week | 2,555x
Deploy Success Rate | 0-15% | 46-60% | 7x
Mean Time to Restore | < 1 hour | 1 week to 1 month | 2,604x
Source: Google/DORA: 2019 State Of DevOps Report: https://cloud.google.com/devops/state-of-devops/

Elite vs. Low Performers
Metric | Elite | Low | Difference
Deployment Frequency | On-demand (multiple times per day) | Monthly or quarterly | 208x
Deployment Lead Time | < 1 hour | 1 week to 1 month | 106x
Deploy Success Rate | 0-15% | 46-60% | 7x
Mean Time to Restore | < 1 hour | 1 week to 1 month | 2,604x
Source: Google/DORA: 2019 State Of DevOps Report: https://cloud.google.com/devops/state-of-devops/

Elite vs. Low Performers
Metric | Elite | Low | Difference
Deployment Frequency | On-demand (multiple times per day) | Monthly or quarterly | 208x
Deployment Lead Time | < 1 hour | 1 week to 1 month | 106x
Deploy Failure Rate | 0-15% | 46-60% | 7x
Mean Time to Restore | < 1 hour | 1 week to 1 month | 2,604x
Source: Google/DORA: 2019 State Of DevOps Report: https://cloud.google.com/devops/state-of-devops/

Elite vs. Low Performers
Metric | Elite | Low | Difference
Deployment Frequency | On-demand (multiple times per day) | Monthly or quarterly | 208x
Deployment Lead Time | < 1 hour | 1 week to 1 month | 106x
Deploy Failure Rate | 0-15% | 46-60% | 7x
Mean Time to Restore | < 1 hour | Less than one day | 2,604x
Source: Google/DORA: 2019 State Of DevOps Report: https://cloud.google.com/devops/state-of-devops/

High Performers Are More Secure And Controlled
2x less time spent remediating security issues
29% more time spent on new work
Source: Google/DORA: 2018 State Of DevOps Report: https://cloudplatformonline.com/2018-state-of-devops.html

High Performers Win In The Marketplace
2x more likely to exceed profitability, market share & productivity goals
2x more likely to achieve organizational and mission goals, customer satisfaction, quantity & quality goals
Source: Google/DORA: 2018 State Of DevOps Report: https://cloudplatformonline.com/2018-state-of-devops.html

High Performers Win In The Marketplace
2.2x higher employee Net Promoter Score
50% higher market capitalization growth over 3 years*
Source: Google/DORA: 2018 State Of DevOps Report: https://cloudplatformonline.com/2018-state-of-devops.html

The Opposite Of
Technical Debt Is…

When we can safely, quickly,
reliably, securely achieve
all the goals, dreams and
aspirations of our business…

When we can safely, quickly,
reliably, securely achieve
all the goals, dreams and
aspirations of the organizations
we serve…

The Five Ideals
1.Locality and Simplicity
2.Focus, Flow, and Joy
3.Improvement of Daily Work
4.Psychological Safety
5.Customer Focus

Ideal #1:
Locality and Simplicity

The First Ideal: A Measure
§Bus factor
§Lunch factor

How Many People Do You Need To Feed?
§Two pizza team
§Feeding everyone in the building
§Schedule lunch with 43 different people

Architecture Enables Teams To…
§…make large scale changes to the design of its system without the permission of someone outside the team, or depending on other teams
§...complete its work without fine-grained communication and coordination with people outside the team
§...deploy and release its product or service on demand, independently of other services the product or service depends upon
§...do most of its testing on demand, without requiring an integrated test environment
§...perform deployments during normal business hours with negligible downtime
Source: Puppet/DORA: 2017 State Of DevOps Report: https://puppet.com/resources/whitepaper/state-of-devops-report

The First Ideal: Code
§Ideal: anyone can implement what they need by
looking at one file or module, and make the
needed change
§Kubernetes sidecars
§Spring (http-retry, Dependency Injection)
§Aspect Oriented Programming
§Not Ideal: to make your needed change, you
have to understand and change all the files and
modules

The First Ideal: Code
§Ideal: changes can be independently
implemented and tested, isolated from other
components (composability)
§Not Ideal: in order for changes to be
implemented and tested, the entire system must
be present (e.g., integrated test environment)
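One way to picture the "Ideal" case is a component whose only collaborator is passed in, so it can be tested with an in-memory stand-in instead of an integrated test environment. The sketch below is illustrative only: `PriceSource`, `OrderTotaler`, and `FakePrices` are hypothetical names that do not come from the talk.

```python
from dataclasses import dataclass
from typing import Protocol


class PriceSource(Protocol):
    """Hypothetical interface: anything that can look up a unit price in cents."""

    def unit_price_cents(self, sku: str) -> int: ...


@dataclass
class OrderTotaler:
    """Component under test; its only collaborator is injected."""

    prices: PriceSource

    def total_cents(self, items: dict[str, int]) -> int:
        # items maps SKU -> quantity
        return sum(
            self.prices.unit_price_cents(sku) * qty for sku, qty in items.items()
        )


class FakePrices:
    """In-memory stand-in, so no pricing service or shared environment is needed."""

    def __init__(self, table: dict[str, int]) -> None:
        self._table = table

    def unit_price_cents(self, sku: str) -> int:
        return self._table[sku]


def demo_total() -> int:
    # The whole "system" for this test is two small objects in one process.
    totaler = OrderTotaler(prices=FakePrices({"widget": 250, "gadget": 1000}))
    return totaler.total_cents({"widget": 2, "gadget": 1})
```

Because `OrderTotaler` depends only on the interface, a real pricing client could be swapped in later without changing the component or its test.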

The First Ideal: Organization
§Ideal: every team has the expertise, capability
and authority to satisfy customer needs
§Not Ideal: in order to satisfy customer needs,
every team must escalate up two levels (and over
two, and down two)

Source: Manu Cornet: Bonkersworld

Team of Teams
§Story of Joint Special Forces
Task Force battling a smaller,
nimbler adversary in Iraq in
2004
§Pushing decision making to
the edges

Unity of Command
Enable Decentralized Execution

Ideal #2:
Focus, Flow, and Joy

Source: https://itrevolution.com/love-letter-to-clojure-part-1/

Rediscovering The Joy Of Programming
§For decades, I self-identified as an Ops person…
§2 years ago, I started to self-identify as Dev
§Clojure/ClojureScript
§LISP, functional programming, immutability
§3000 lines of Objective C -> 1500 lines of TypeScript/React -> 500 lines of ClojureScript
§Development is so fun, and these days, you can do miraculous things with so little effort

Why Functional Programming
§The famous French philosopher Claude Lévi-Strauss would say of certain tools, ‘is it good to think with?’
§Core FP concepts
§Immutability
§Pure functions
§Composability
§Pioneered by LISP and ML. Popularized by OCaml, Haskell, Clojure, Erlang, Elm, Elixir, ReasonML, PureScript…
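The three core concepts above can be sketched in a few lines. This is a minimal illustration in Python rather than in one of the languages listed on the slide, and `Account`, `deposit`, and `compose` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Account:
    """Immutability: fields cannot be reassigned after construction."""

    owner: str
    balance_cents: int


def deposit(amount_cents: int) -> Callable[[Account], Account]:
    """Pure function factory: the result depends only on the inputs, no side effects."""

    def apply(account: Account) -> Account:
        # Return a new value instead of mutating the old one.
        return replace(account, balance_cents=account.balance_cents + amount_cents)

    return apply


def compose(*steps: Callable[[Account], Account]) -> Callable[[Account], Account]:
    """Composability: chain small functions left to right into one larger function."""
    return lambda account: reduce(lambda acc, step: step(acc), steps, account)


original = Account(owner="alice", balance_cents=1000)
updated = compose(deposit(500), deposit(250))(original)
```

After this runs, `original` is unchanged and `updated` is a separate value, which is what makes the steps easy to reason about and reorder.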

Interestingly, It Portends Future Of Ops
§Core concepts
§Immutability
§Pure functions
§Composability
§Look at…
§Docker, Docker Compose
§Kubernetes
§Kubernetes sidecars
§Event streams: Apache Kafka
§Immutable databases: Datomic, CouchDB, Delphix
§Git

The Second Ideal: Focus and Flow
§Ideal: your energy and time are focused on solving
the business problem, and you’re having fun
§Not Ideal: all your time is spent trying to solve
problems you don’t even want to solve (e.g.,
YAML files, Makefile and spaces in filenames,
bash)

Never Have I Valued Infrastructure More
§Things I detest now
§Everything outside of my application
§Connecting anything to anything
§Updating dependencies
§Secrets management
§Authentication and authorization
§Data masking
§Bash
§YAML
§Patching
§Building Kubernetes deployment files (mostly by Googling)
§Why my cloud costs are so high

The Value Of Platforms
§Enable developer productivity
§Self-service
§On-demand
§Immediacy and fast feedback
§Focus and flow
§Joy
§Monitoring, deployment, feature flagging,
environment creation, security scans, orchestration,
database provisioning, test data management…

Flow: Dr. Mihaly Csikszentmihalyi
●State of Flow
●Two types of learning
●Procedural Learning
●One-shot Learning

“What is your lead time
for changes?”
“How long does it take to go from
code committed to code successfully
running in production?”
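The question above can be turned into a number if each change has two timestamps: when it was committed and when it was running in production. The sketch below assumes those timestamps are already available; the sample data and the function names `lead_times` and `median_lead_time` are illustrative, not taken from the talk or from any DORA tooling.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_times(changes: list[tuple[datetime, datetime]]) -> list[timedelta]:
    """Each change is a pair: (committed_at, running_in_production_at)."""
    return [deployed - committed for committed, deployed in changes]


def median_lead_time(changes: list[tuple[datetime, datetime]]) -> timedelta:
    """Median is used so that one unusually slow change does not dominate."""
    return median(lead_times(changes))


# Made-up sample: three changes with lead times of 30 minutes, 2 hours, 26 hours.
sample_changes = [
    (datetime(2022, 6, 1, 9, 0), datetime(2022, 6, 1, 9, 30)),
    (datetime(2022, 6, 1, 10, 0), datetime(2022, 6, 1, 12, 0)),
    (datetime(2022, 6, 2, 8, 0), datetime(2022, 6, 3, 10, 0)),
]
```

For this sample, `median_lead_time(sample_changes)` is two hours.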

Source: The DevOps Handbook
Change Committed Into Version Control
Product Design and Development | Product Delivery (Build, Test, Deploy)
Create new products and services that solve customer problems using hypothesis-driven delivery, modern UX, design thinking | Enable fast flow from development to production and reliable releases by standardizing work, reducing variability and batch sizes
Feature design and implementation may require work that has never been done before | Integration, test and deployment must be performed continuously, as quickly as possible
Estimates are highly uncertain | Cycle times should be well-known and predictable
Outcomes are highly variable | Outcomes should have low variability

Making Changes When It Matters Most
“By installing a rampant innovation culture,
we performed 165 experiments in the peak three
months of tax season.”
–Scott Cook, Intuit Founder
“Our business result? Conversion rate of the
website is up 50 percent. Employee result?
Everyone loves it, because now their ideas can
make it to market.”

What Is The One Question That
Predicts Performance With
Startling Accuracy?

“To what degree do we fear
doing deployments?”
Source: Puppet Labs 2015 State Of DevOps: https://puppetlabs.com/2015-devops-report

Ideal #3:
Improvement Of Daily Work

Third Ideal: Improvement of Daily Work
§Not Ideal: TWWADI
§“The Way We’ve Always Done It”
§Ideal: MTBTT
§“Make Tomorrow Better Than Today”
(Google SRE Principle #2)

Greatness Isn’t Free…
The Need To Pay Down Technical
Debt

Fast Push To Market
Debts & Risks
Features
Quality
Defects

Fast Push To Market (Continued)
Features
Defects
Defect fixing dominates work
Site reliability tanks
Slower and slower velocity
Customers leave
Morale plunges
Devs leave because everything is hard
Quality
Debts & Risks

Who hasn’t felt this?
You hire a bunch of developers, but you
still can’t ship the features you
promised…
…and maybe you even have the feeling
that things are slowing down…
Source: https://twitter.com/johncutlefish/status/1046169469268111361

Risto Siilasmaa, NOKIA
Source: The Unicorn Project (2019) / Transforming NOKIA (2019)

Near Death Experiences
●eBay (1999)
●Microsoft (2002): Bill Gates memo
●Google (2005): Automated testing culture
●Amazon (2004): Jeff Bezos memo
●Twitter (2008)
●LinkedIn (2009)
●Etsy (2009)

2002 Microsoft Security
Standdown
§Famously, Microsoft after
SQL Slammer required
every product group to
freeze feature work
Source: https://www.wired.com/2002/01/bill-gates-trustworthy-computing/

The Feature Freeze / Standdown
Debt
Features
Quality
Defects
Features

The Third Ideal: Enabling Greatness
§Ideal: 3-5% of developers dedicated to improving
developer productivity
§Google: likely 1,500+ devs ($1B+)
§Microsoft: likely over 3,000 devs
§Not ideal: assigned to summer interns and
“people not good enough to be developers”

Source: Satya Nadella, CEO, Microsoft (@satyanadella)

50% Of R&D On Platforms!
§Jean-Michel Lemieux
(formerly VP Engr,
Atlassian; former CTO,
Shopify)
§“50% of R&D cycles
should be spent on
platforms”
§40% on features
§10% on experiments
Source: https://twitter.com/jmwind/status/1470894712538103813?s=20

The Third Ideal: Improvement
§Not Ideal: No one cares if someone breaks the
build, or checks in code that breaks our tests
§Ideal: When someone breaks our build or our
tests, fixing it becomes the most important work
of the moment

Ideal #4:
Psychological Safety

Google: Project Aristotle, Oxygen, re:Work
Source: https://rework.withgoogle.com/blog/five-keys-to-a-successful-google-team/

DevOps Enterprise: Lessons Learned
§In 2022, we held the fifteenth DevOps Enterprise Summit, a conference for horses, by horses
§Over the years, we’ve had over 600 leaders from:
§Capital One, KeyBank, Barclays, GE Capital, ING Bank, Fidelity, PNC, ADP, BofA, Western Union, BBVA, US Bank
§Nationwide Insurance, Zurich Insurance, Allstate, Hiscox, Aviva, LV=
§Walmart, Nordstrom, Target, Macy’s, Marks and Spencer, H&M
§Nike, Adidas
§J&J, Syngenta, Siemens Healthineers, Schlumberger*, Ascendis Pharma*
§American Airlines, Delta Airlines, TUI Group
§Sherwin Williams, Unilever, P&G
§Verizon, Telstra, T-Mobile, Orange, CSG
§Raytheon, Lockheed Martin, Northrop Grumman, CSRA, Jaguar Land Rover, Fiat/Chrysler, Cisco
§Disney, Ticketmaster, NBC/Universal, Comcast
§Kaiser Permanente, Stanford Medicine, Columbia Memorial, BUPA*
§US Citizenship & Immigration Services, UK HM Revenue Collection, DISA Forge.mil, NZ Ministry of Social Development, UK Welfare and Pensions, US Joint Warfare Analysis Center, USAF Kessel Run
§Amazon PrimeNow, CA, Compuware, Google Search, IBM, MicroFocus, Microsoft, SAP

DevOps Enterprise: Big Four Auditors
§Matt Bonser, Director, Digital Risk Solutions, PricewaterhouseCoopers LLP
§Yosef Levine, Managing Director, Global Technology Controls, Confidentiality & Privacy, Deloitte
§Jeff Roberts, Senior Manager, Advisory Services, Ernst & Young
§Michael Wolf, Managing Director Modern Delivery Lead, KPMG
Source: https://videolibrary.doesvirtual.com/?video=485153001

One Of The Highest Predictors Of
Performance
Source: Typology Of Organizational Culture (Westrum, 2004)

Great Practices Enabled
§Blameless post-mortems
§Chaos Monkeys

Ideal #5:
Customer Focus

Courtesy: Compuware (Chris O’Malley, @chris_t_omalley; Jim Bryan, @jimbryan82)

The Fifth Ideal: Focus On The Customer
§Core vs. Context
§Enabled reallocation of $8MM back
into R&D

The Fifth Ideal: Focus On The Customer
§Not ideal: Functional silo managers prioritize silo
goals over business goals
§Ideal: Functional silo managers make decisions
based on what the customer values, and help
ensure their teams have the skills to thrive in the
long term

Why Do I Think This Is
Important?

“The world is changing very fast...
Big will not beat small anymore. It
will be the fast beating the slow.”
Source: Rupert Murdoch

The Five Ideals
1.Locality and Simplicity
2.Focus, Flow, and Joy
3.Improvement of Daily Work
4.Psychological Safety
5.Customer Focus

2014: Dr. Steve Spear at MIT Sloan

DevOps, Lean As Part of a Great Whole
§Team of Teams, TQM, psychological safety, 2nd
order learning, resilience engineering, safety
culture, disruptive innovation, MIT Beer Game,
optionality and architecture, teal organizations,
radical delegation, inner-sourcing, away teams

Domain | Great | Horrible
U.S. JSOC (2000s) | Team of Teams (After) | Team of Teams (Before)
Naval doctrine and execution (1890s-1940s) | US Navy WWII | Japanese Navy WWII
Nuclear reactor design and operations (1950s-2020s) | US Navy | Soviet Navy
Space programs (1960s vs 1970s) | US Apollo Space Program | US Space Shuttle Program
Missile development (1960s) | Sidewinder missile project | Falcon and Sidewinder missiles
Automobile design and manufacturing (1970s-2010s) | Toyota (1970-2010) | General Motors (1970-2010)
Software development (1970s vs. 2000s) | DevOps and Agile | Waterfall software development
Mobile phone (2000s) | Apple iPhone | Nokia
Search engines (2000-2010s) | Google | Yahoo
COVID vaccine creation (2020) | 5x COVID vaccines approved for experimental use | The 20 companies that took Warp Speed $$ but failed
COVID vaccination sites (2021) | Vaccination sites (8K/day, 100%) | Vaccination sites (1K/day, 30%)

Communication Paths As Predictor
Source: Manu Cornet: Bonkersworld

Slower Integrated Problem Solving
[Diagram labels: "Not allowed"; "Integrated problem solving"]
§Leaders get incomplete information, too late
§Teams don’t have access to expertise they need, deprived of full creative potential

Typical Healthcare Scenario
§Someone in Nursing has gloves that tear
§Why does it need to escalate 8 levels to COO to
connect them with someone in supply chain?
§Happens whenever integrated problem solving
must cross functional silos
§Nursing, Pharmacy, Transport, Supply Chain, Labs
§And issues involving clinicians have to go to COO, over
to CMO, and then down

Faster Integrated Problem Solving
§Defined value streams, where relationships that enable integrated problem solving are more linear and explicit
§Easier for the organization to dynamically change, because the structure is simpler

“If you have a dope at the top, you
will have, or soon will have, dopes
all the way down.”
Jack Rabinow
Rule #23 of Leadership

The Sociotechnical Maestro
§High energy
§High standards
§Great in the large
§Great in the small (so they can ask good
questions)
§Loves walking the floor

Structure As A Predictor
Source: Manu Cornet: Bonkersworld

Amazon 2004
§“Amazon.com started 10 years ago as a monolithic application, running on a Web server, talking to a database on the back end. This application, dubbed Obidos, evolved to hold all the business logic, all the display logic, and all the functionality that Amazon eventually became famous for: similarities, recommendations, Listmania, reviews, etc.”
§“The many things that you would like to see happening in a good software environment couldn’t be done anymore; there were many complex pieces of software combined into a single system. It couldn’t evolve anymore. The parts that needed to scale independently were tied into sharing resources with other unknown code paths. There was no isolation and, as a result, no clear ownership.”
Source: https://queue.acm.org/detail.cfm?id=1142065

The $1 Billion Amazon API Rearchitecture
1.All teams will henceforth expose their data and functionality through service interfaces.
2.Teams must communicate with each other through these interfaces.
3.There will be no other form of interprocess communication allowed.
4.It doesn't matter what technology you use, HTTP, Corba, Pubsub, Bezos doesn't care.
5.Service interfaces without exception must be designed from the ground up to be externalizable
6.Anybody who doesn't do this will be fired.
7.Thank you, have a nice day.
Source: https://queue.acm.org/detail.cfm?id=1142065
(“#7 is obviously a joke, because obviously Bezos doesn’t care whether you have a good day or not”)
Who enforced this? Amazon CIO: Rick Dalzell, a former U.S. Army Ranger
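Rules 1 and 2 above can be pictured in miniature: one team publishes an interface, and other teams call only that interface, never the datastore behind it. The sketch below is a single-process illustration under that assumption. `InventoryService`, `InMemoryInventory`, and `can_fulfil` are hypothetical names and do not describe Amazon's actual services.

```python
from typing import Protocol


class InventoryService(Protocol):
    """Hypothetical service interface owned by an inventory team."""

    def units_in_stock(self, sku: str) -> int: ...


class InMemoryInventory:
    """One possible implementation; the private dict stands in for the team's datastore."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)

    def units_in_stock(self, sku: str) -> int:
        # Unknown SKUs are reported as out of stock rather than raising.
        return self._stock.get(sku, 0)


def can_fulfil(inventory: InventoryService, sku: str, quantity: int) -> bool:
    """Consumer code owned by another team: it depends only on the interface."""
    return inventory.units_in_stock(sku) >= quantity
```

Because the consumer never touches `_stock`, the owning team can change how stock is stored without coordinating with every caller.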

Source: Pingdom

Amazon Results
§1999: thousands of deployments/year
§2001: tens of deployments/year
§2011: 15K deployments/day
§2015: 136K deployments/day

The Magic of 2000s Google Infrastructure
§The miracle of the Google platform in the 2000s: it presented developers with the illusion that they were running on persistent, deterministic hardware…
…when in actuality, they were running on unreliable, ephemeral compute resources.
§It vastly simplified the developer mental model, enabling them to be massively more productive

50% Of R&D Cycles On Platforms!
§Jean-Michel Lemieux (formerly VP Engr, Atlassian; former CTO, Shopify)
§Ideal
§50% on platforms
§40% on features
§10% on experiments
§Not Ideal: 80% on features!
Source: https://twitter.com/jmwind/status/1470894712538103813?s=20

Want To Learn More?
To receive this presentation and the following:
§PDF and audio excerpts from The Unicorn Project
§Eight excerpts from Beyond The Phoenix Project audio series w/John Willis
§The 140-page excerpt of The DevOps Handbook
§The 140-page excerpt of The Phoenix Project
§Videos and slides from DevOps Enterprise 2014-2019
§One hour excerpt of The Phoenix Project audiobook
Just pick up your phone, and send an email:
To: [email protected]
Subject: devops