Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is Significantly More Executed (FSE 2024)

Andre Hora
DCC/UFMG
[email protected]
FSE 2024
Ideas, Visions and Reflections

Motivation & Problem
Having a good test suite is fundamental to ensuring software quality and
sustainable software evolution
Developers should focus on testing both the expected and unexpected behaviors
of the program to catch more bugs and protect against regressions
● Expected behavior: the normal execution, simpler to test
● Unexpected behavior: the abnormal execution, harder to test
In practice, it is well known that developers are more
likely to test expected behaviors than unexpected ones

Motivation & Problem
However, existing research is mostly restricted to controlled experiments, like case
studies with students and developers
- Students are likely to (naively) test the “happy cases” [7]
- Expert developers may test the “sad cases” [25]

We still lack empirical evidence extracted from
real-world software systems and their test suites

Email Python Standard Library

Three possible behaviors at runtime:
1. Entering both the for and if blocks
2. Entering the for block but not the if block
3. Not entering the for block
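
The three behaviors can be illustrated with a small stand-in method. This is only a sketch: `collect_matches` is a hypothetical function written for illustration, not the method from the email library shown on the slide. It simply has the same shape, an if block nested inside a for block.

```python
def collect_matches(items, wanted):
    """Hypothetical method: an if block nested inside a for block."""
    found = []
    for item in items:          # the for block
        if item == wanted:      # the if block
            found.append(item)
    return found


# Path 1: enters both the for and the if blocks
path_1 = collect_matches(["a", "b"], "a")

# Path 2: enters the for block but never the if block
path_2 = collect_matches(["a", "b"], "z")

# Path 3: never enters the for block (empty input)
path_3 = collect_matches([], "a")
```

Each call executes a different set of lines, so a test suite that mostly passes non-empty inputs containing the wanted item would concentrate its calls on the first path.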

At this point, it is unclear what
behaviors are the most and least
frequently tested by developers

Can you guess?

Interesting: the large
discrepancy between the
execution frequency of
different paths
Path 1 concentrates most
of the calls (70.9%)

Path 3 receives only 4.4%

Open Question
Are tested paths of real software likely to concentrate calls or do
calls tend to be more distributed among the tested paths?

Provide insights for developers to improve existing test suites
Support the creation of novel testing tools to better understand test suites
Reveal novel empirical data for researchers to quantify the difference between the
execution frequency of distinct paths in real-world software

Proposed Work
We propose an empirical study to assess the tested paths quantitatively
We monitor the execution of 14K tests from 25 real-world Python systems,
assessing 11K tested paths from 2,357 methods

Study Design
1. Detecting the tested paths
2. Selecting software systems
3. Research questions

Study Design: Detecting the Tested Paths
1. Collecting executed lines of code
We execute an instrumented version of the
test suite that monitors the tests and collects
data from the execution trace
2. Detecting the tested paths
A tested path represents a set of input
values that make the method execute the
same lines of code
3. Ranking the tested paths
For each method with one or more tested
paths, we sort their paths in descending
order of path frequency

Study Design: Selecting Software Systems
25 Python systems
2,357 methods
14,177 tests
11,425 tested paths

Study Design: Research Questions
RQ1: Frequency of the most tested paths (top 1 vs. top 2)
RQ2: Frequency of the least tested paths (top 1 vs. top 3+)
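
One plausible way to quantify both questions for a single method is to compare the share of calls received by each rank. The sketch below is an assumption about how such a comparison could be computed, not the study's own analysis code. The function `path_shares` is hypothetical, the call counts are made up for illustration, and all paths ranked third or lower are summed into one "top 3+" group.

```python
def path_shares(call_counts):
    """Return the percentage of calls for top 1, top 2, and top 3+ combined."""
    ranked = sorted(call_counts, reverse=True)
    total = sum(ranked)
    top2 = ranked[1] if len(ranked) > 1 else 0
    rest = sum(ranked[2:])  # every path ranked third or lower
    return {
        "top1": 100 * ranked[0] / total,
        "top2": 100 * top2 / total,
        "top3+": 100 * rest / total,
    }


# Made-up call counts for a method with four tested paths.
shares = path_shares([50, 30, 15, 5])
print(shares)  # {'top1': 50.0, 'top2': 30.0, 'top3+': 20.0}
```

With shares expressed this way, RQ1 compares `top1` against `top2` and RQ2 compares `top1` against `top3+`.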

Results

RQ1: Frequency of the Most Tested Paths
Top 1 vs. Top 2

Finding 1: Overall, one tested path tends
to receive most of the calls. Top 1 receives
4x more calls than the Top 2.

Finding 2: In methods with two tested
paths, one path tends to receive close to 5x
more calls than the second one.

Finding 3: Even methods with four or more
tested paths have one path that receives
the majority of the calls.

RQ2: Frequency of the Least Tested Paths
Top 1 vs. Top 3+

Finding 4: The top 3+ tested paths receive a
minority of the calls, ranging from 4% to 24%.

Overall, the most tested path of a method has
6.5x more calls than the top 3+.

Summary
We presented an empirical study to assess the tested paths quantitatively
We monitored the execution of over 14K tests and 11K tested paths
Overall, we found that one tested path is prevalent and receives most of the calls,
while others are significantly less executed
Possible applications:
● Provide insights for developers to improve existing test suites
● Support the creation of novel testing tools
● Reveal novel empirical data for researchers
