EuroPython 2024 - Streamlining Testing in a Large Python Codebase
jimmy_lai
Jul 12, 2024
About This Presentation
Maintaining code quality through effective testing becomes increasingly challenging as codebases expand and developer teams grow. In our rapidly expanding codebase, we encountered common obstacles such as increasing test suite execution time, slow test coverage reporting and delayed test startup. By leveraging innovative strategies using open-source tools, we achieved remarkable enhancements in testing efficiency and code quality.
As a result, in the past year, our test case volume increased by 8000, test coverage was elevated to 85%, and Continuous Integration (CI) test duration was maintained under 15 minutes.
Slide Content
Streamlining Testing in a Large
Python Codebase
Jimmy Lai, Staff Software Engineer, Zip
July 12, 2024
Outline
01 Python Testing: pytest, coverage, and continuous integration
02 The Slow Test Challenges
03 Optimization Strategies
04 Results
05 Recap
Zip is the world’s leading
Intake & Procurement
Orchestration Platform
450+ global
customers
$4.4 billion
total customer savings
Top talent from
tech disruptors
$181 million
raised at $1.5 billion valuation
A Large Python Codebase
1. 100 developers: we're hiring fast
2. 2.5 million lines of Python code: doubling every year
Scaling Challenges
3. Number of tests and tech debt increase fast
Test Coverage
------------- coverage -------------
Name             Stmts   Miss  Cover
------------------------------------
helper.py            5      1    80%
test_helper.py       6      0   100%
------------------------------------
TOTAL               11      1    91%
======= 2 passed in 0.03s =======
To increase the test coverage: add a new test case for odd numbers
https://pypi.org/project/pytest-cov/
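The slide does not show helper.py itself, so the following is only a hypothetical sketch of the kind of code the coverage report could describe. The function name `is_even` and both test names are assumptions made for illustration. With only the even-number test, the final `return False` line would never run and would be reported as a miss; the odd-number test is the "new test case" that covers it.

```python
# helper.py (hypothetical; the real file is not shown on the slide)
def is_even(n: int) -> bool:
    """Return True for even integers and False for odd ones."""
    if n % 2 == 0:
        return True
    return False  # only reached for odd input


# test_helper.py (hypothetical)
def test_is_even_with_even_number():
    assert is_even(2) is True


def test_is_even_with_odd_number():
    # The added case: exercises the line that an even-only test never runs.
    assert is_even(3) is False
```

Assuming pytest-cov is installed, running `pytest --cov` in the directory holding both files would print a report in the format shown above.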
Continuous Integration
Practice: continuously merge changes into the shared codebase while ensuring quality
●Developers submit a pull request (PR) for code review
●Run tests to verify the code changes
●Merge a PR after all tests have passed and it has been approved
Ensure that test reliability and test coverage meet the required thresholds
Continuous Integration using GitHub Workflows
# File: .github/workflows/ci.yml
name: CI
on:
  pull_request: # on updating a pull request
    branches:
      - main
  push: # on merging to the main branch
    branches:
      - main
jobs:
  build:
    runs-on: ubuntu-latest
https://docs.github.com/en/actions/using-workflows
Challenge: Test Execution Time Increases Over Time
1. Number of tests increases
   Pain Point: Long Test Execution Time
2. Codebase size increases
   Pain Point: Test Coverage Overhead
3. Number of dependencies increases (requirements.txt)
   Pain Point: Slow Test Startup
Strategy #1: Parallel Execution
Run Tests in Parallel on multiple CPUs
https://pypi.org/project/pytest-xdist/
pytest -n 8 # use 8 worker processes
pytest -n auto # use all available CPU cores
N: number of CPUs (e.g. 8 cores)
Test Execution Time ÷ N
10,000 tests ÷ N is still slow
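To make the "÷ N is still slow" point concrete, here is a rough arithmetic sketch. The 0.5-second average per test is an assumed figure, not one from the talk, and the calculation ignores worker startup and scheduling overhead, so real runs would be slower than this best case.

```python
def ideal_wall_time(total_seconds: float, workers: int) -> float:
    """Best-case wall time if the work divides evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return total_seconds / workers


# Assumed numbers for illustration only: 10,000 tests at 0.5 s each.
serial_seconds = 10_000 * 0.5                       # 5000 s on one CPU
on_8_cores = ideal_wall_time(serial_seconds, 8)     # 625 s on N = 8 CPUs

print(f"{serial_seconds / 60:.1f} min serial, {on_8_cores / 60:.1f} min on 8 cores")
# prints "83.3 min serial, 10.4 min on 8 cores"
```

Even under these generous assumptions a single 8-core machine needs over ten minutes, which is why the next step spreads the work over several runners as well.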
Run Tests in Parallel on multiple Runners
https://pypi.org/project/pytest-split/
# Split tests into 10 parts and run the 1st part
pytest --splits 10 --group 1
# Assumption: all tests have the same test execution time.
# Unbalanced test execution time can lead to unbalanced Runner durations
# To collect test execution time
pytest --store-durations
# To use the collected time
pytest --splits 10 --group 1 --durations-path .test_durations
N: number of CPUs
Test Execution Time ÷ N
M: number of runners
10,000 tests ÷ N ÷ M
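The sketch below illustrates why recorded durations matter when splitting tests across runners. It uses a simple greedy "longest test first" heuristic, which is an assumption chosen for illustration and not necessarily the algorithm pytest-split implements. The test names and timings are made up.

```python
import heapq


def split_by_duration(durations: dict[str, float], groups: int) -> list[list[str]]:
    """Assign each test to the currently lightest group, longest tests first."""
    heap = [(0.0, index) for index in range(groups)]  # (total seconds, group index)
    heapq.heapify(heap)
    buckets: list[list[str]] = [[] for _ in range(groups)]
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return buckets


# Hypothetical timings: one slow test dominates the suite.
durations = {"test_a": 8.0, "test_b": 1.0, "test_c": 1.0,
             "test_d": 1.0, "test_e": 1.0, "test_f": 4.0}

balanced = split_by_duration(durations, groups=2)
balanced_totals = [sum(durations[name] for name in group) for group in balanced]

# Splitting by test count alone (three tests per runner) is uneven here.
names = sorted(durations)
naive_totals = [sum(durations[name] for name in names[:3]),
                sum(durations[name] for name in names[3:])]
```

With these made-up numbers, the duration-aware split gives both runners 8 seconds of work, while the count-based split leaves one runner with 10 seconds and the other with 6.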
Use Multi-Runners and Multi-CPUs in a GitHub Workflow
python-test-matrix:
  runs-on: ubuntu-latest-8-cores # needs larger runner configuration
  strategy:
    fail-fast: false # to collect all failed tests
    matrix:
      group: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  steps:
    - run: pytest -n auto --splits 10 --group ${{ matrix.group }} ...
Strategy #2: Cache
Cache Non-Python Dependency Installation
Common non-Python dependencies:
●Python and Node interpreters
●Database: Postgres
●System packages: protobuf-compiler, graphviz, etc.
●Browsers for end-to-end tests: Playwright
# Dockerfile
FROM … # a base image
RUN sudo apt-get install -y postgresql-16 protobuf-compiler
# After publishing the image to a registry, run the CI jobs in that container
https://docs.github.com/en/actions/using-jobs/running-jobs-in-a-container
Save 10 minutes or more on each CI run in a large codebase
Strategy #3: Skip Unnecessary Computing
Skip Unnecessary Tests and Linters
Run specific tests only when the relevant code has changed
# GitHub workflow
jobs:
  changed-files:
    outputs:
      has-py-changes: ${{ steps.find-py-changes.outputs.any_changed }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: tj-actions/changed-files@v44
        id: find-py-changes
        with:
          files: "**/*.py" # quoted: a bare leading * is not valid YAML
  run-pytest:
    needs: changed-files
    if: needs.changed-files.outputs.has-py-changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: pytest
Linters can also run on only the updated files
✨ Modularize code and use build systems to run even fewer tests
https://github.com/marketplace/actions/changed-files
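The decision the workflow makes can be sketched in a few lines of Python. This is not how the changed-files action is implemented; it only mirrors the idea of "did any changed path match `**/*.py`?". The function name and the example paths are hypothetical, and the list of changed files is assumed to come from elsewhere (for example from the output of `git diff --name-only`).

```python
from pathlib import PurePosixPath


def has_py_changes(changed_files: list[str]) -> bool:
    """Return True if at least one changed path is a Python source file."""
    return any(PurePosixPath(path).suffix == ".py" for path in changed_files)


# Hypothetical pull requests:
docs_only_pr = ["README.md", "docs/setup.md"]
backend_pr = ["app/billing/invoice.py", "package.json"]

run_pytest_for_docs = has_py_changes(docs_only_pr)      # False: skip pytest
run_pytest_for_backend = has_py_changes(backend_pr)     # True: run pytest
```

A docs-only pull request then skips the Python test job entirely, which is where the time saving comes from.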
Skip Coverage Analysis for Unchanged Files
# pytest --cov by default measures coverage for all files, which is slow in a large codebase
# Add --cov=UPDATED_PATH1 --cov=UPDATED_PATH2 … to measure only the updated files
Save 1 minute or more on each CI run in a large codebase
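One possible way to build those `--cov=` arguments is sketched below. The talk does not say how UPDATED_PATH values are derived, so using the parent directory of each changed Python file is an assumption of this sketch, as are the function name and the example paths.

```python
from pathlib import PurePosixPath


def cov_args_for(changed_files: list[str]) -> list[str]:
    """Build one --cov=<dir> argument per directory containing a changed .py file."""
    directories = {
        str(PurePosixPath(path).parent)
        for path in changed_files
        if PurePosixPath(path).suffix == ".py"
    }
    return [f"--cov={directory}" for directory in sorted(directories)]


# Hypothetical changed files in a pull request:
changed = [
    "app/billing/invoice.py",
    "app/billing/tax.py",
    "app/users/models.py",
    "README.md",
]

pytest_command = ["pytest", *cov_args_for(changed)]
# pytest_command is ["pytest", "--cov=app/billing", "--cov=app/users"]
```

Note that a changed file at the repository root would produce `--cov=.`, which measures everything again; a real setup would need to decide how to handle that case.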
Strategy #4: Modernize Runners
Use Faster and Cheaper Runners
Use new-generation CPUs and memory to run faster and cheaper
Third-party hosted runner providers:
●Namespace
●BuildJet
●Actuated
●…
Use self-hosted runners with auto-scaling
https://github.com/actions/actions-runner-controller/
Use Actions Runner Controller to deploy auto-scaling runners using
Kubernetes with custom hardware specifications (e.g. AWS EC2)
5X+ cost saving and 2X+ faster test speed compared to GitHub runners
Results
Continuously optimizing CI test execution time to improve developer experiences
Increasing test coverage with better quality assurance
Recap: Strategies for Scaling Slow Tests in a Large Codebase
01 Parallel Execution
02 Cache
03 Skip Unnecessary Computing
04 Modernize Runners
Rujul Zaparde
Co-Founder and CEO
Lu Cheng
Co-Founder and CTO
Engineering Blog
https://engineering.ziphq.com