Test Polarity: Detecting Positive and Negative Tests (FSE 2024)
andrehoraa
Jul 18, 2024
About This Presentation
Positive tests (aka happy path tests) cover the expected behavior of the program, while negative tests (aka unhappy path tests) check the unexpected behavior. Ideally, test suites should have both positive and negative tests to better protect against regressions. In practice, unfortunately, we cannot easily identify whether a test is positive or negative. A better understanding of whether a test suite is more positive or negative is fundamental to assessing the overall test suite capability in testing expected and unexpected behaviors. In this paper, we propose test polarity, an automated approach to detect positive and negative tests. Our approach runs and monitors the test suite and collects runtime data about the application execution to classify the test methods as positive or negative. In a first evaluation, test polarity correctly classified 117 tests as positive or negative. Finally, we provide a preliminary empirical study to analyze the test polarity of 2,054 test methods from 12 real-world test suites of the Python Standard Library. We find that most of the analyzed test methods are negative (88%) and a minority are positive (12%). However, there is a large variation per project: while some libraries have an equivalent number of positive and negative tests, others have mostly negative ones.
Slide Content
Test Polarity:
Detecting Positive and Negative Tests
Andre Hora
DCC/UFMG
FSE 2024
Ideas, Visions and Reflections
Motivation and Problem
Positive tests (also known as happy path and sunny day tests) cover
expected behaviors of the program → the normal execution
Negative tests (also known as sad path and bad weather tests) check
unexpected behaviors → the abnormal execution
Ideally, test suites should have both positive and negative
tests to catch more bugs and protect against regressions
calendar.monthrange(year, month)
testing valid case (month 1)
testing valid case (month 12)
testing valid case (leap year)
testing valid case (non-leap year)
testing invalid case (month 0)
testing invalid case (month 13)
testing invalid case (month 65)
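The valid and invalid cases listed for calendar.monthrange can be written as unit tests roughly as follows. This is a minimal sketch: the class name, the test method names, and the years 2003 and 2004 are illustrative assumptions, not taken from the slides.

```python
import calendar
import unittest


class MonthRangeTest(unittest.TestCase):
    """Illustrative tests; names and input years are assumptions."""

    # Positive tests: valid months, normal execution.
    def test_january(self):
        self.assertEqual(calendar.monthrange(2004, 1), (3, 31))

    def test_december(self):
        self.assertEqual(calendar.monthrange(2004, 12), (2, 31))

    def test_february_leap(self):
        self.assertEqual(calendar.monthrange(2004, 2), (6, 29))

    def test_february_nonleap(self):
        self.assertEqual(calendar.monthrange(2003, 2), (5, 28))

    # Negative tests: invalid months, abnormal execution (an exception).
    def test_month_zero(self):
        with self.assertRaises(calendar.IllegalMonthError):
            calendar.monthrange(2004, 0)

    def test_month_thirteen(self):
        with self.assertRaises(calendar.IllegalMonthError):
            calendar.monthrange(2004, 13)

    def test_month_sixty_five(self):
        with self.assertRaises(calendar.IllegalMonthError):
            calendar.monthrange(2004, 65)
```

The first four tests exercise the normal return path of monthrange, while the last three exercise the path that raises an exception.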
Positive and negative tests:
●Not easy to identify
●No detection approach
Benefits?
●Assess test suite capability in testing
expected and unexpected behaviors
●Have data to improve the tests
Proposed Work: Test Polarity
An automated approach to detect positive and negative tests
Runs and monitors the test suite and collects runtime data about the application
execution to classify the test methods as positive or negative
Test Polarity
1. Identify and Rank Tested Paths
2. Detect Tested Paths of Test Methods
3. Classify Tests as Positive or Negative
1. Identify and Rank Tested Paths
Run and monitor the test suite of the target application
Collect application methods executed by the test suite & tested paths
Rank their tested paths according to their call frequency
Path 1: 218 calls
Path 2: 3 calls
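One way to picture step 1 is the sketch below. It assumes a "tested path" can be approximated by the sequence of source lines executed inside an application method, recorded with sys.settrace. The slides do not say how paths are collected, so the helper record_path and the list of inputs are hypothetical.

```python
import calendar
import sys
from collections import Counter


def record_path(func, *args):
    """Call func(*args); return the line offsets executed inside func."""
    code = func.__code__
    lines = []

    def local_tracer(frame, event, arg):
        if event == "line":
            lines.append(frame.f_lineno - code.co_firstlineno)
        return local_tracer

    def global_tracer(frame, event, arg):
        # Only trace frames that belong to the target function.
        return local_tracer if frame.f_code is code else None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        func(*args)
    except Exception:
        pass  # an execution that raises still follows a path
    finally:
        sys.settrace(previous)
    return tuple(lines)


# Hypothetical inputs standing in for the calls a test suite would make.
calls = [(2004, 1), (2004, 12), (2004, 2), (2003, 2),
         (2004, 0), (2004, 13), (2004, 65)]
path_counts = Counter(record_path(calendar.monthrange, y, m) for y, m in calls)

# Rank the tested paths by call frequency, most frequent first.
ranking = path_counts.most_common()
for rank, (path, count) in enumerate(ranking, start=1):
    print(f"Path {rank}: {count} calls, line offsets {path}")
```

With these inputs, the valid months share one path and the invalid months share another, so the ranking has two entries.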
2. Detect Tested Paths of Test Methods
For each test method, identify which application methods & paths are executed
monthrange
weekday
3. Classify Tests as Positive or Negative
Test methods that always execute the top-ranked tested paths (i.e., the most frequently
executed) are classified as positive; otherwise, they are classified as negative
Rationale: the top-ranked tested path is likely to represent the “happy path”
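The classification rule can be sketched over plain data as below. The sketch assumes that "always" means every application method a test executes follows only its top-ranked path. The data structures, path labels, and test names are hypothetical; the counts 218 and 3 echo the Path 1 and Path 2 figures from step 1, and attributing them to monthrange is an assumption.

```python
from collections import Counter


def top_ranked(path_counts):
    """Return, per application method, its most frequently executed path."""
    return {method: counts.most_common(1)[0][0]
            for method, counts in path_counts.items()}


def classify(test_paths, path_counts):
    """Label each test as positive or negative.

    A test is positive when every application method it executes follows
    only the top-ranked path (an assumed reading of "always").
    """
    top = top_ranked(path_counts)
    polarity = {}
    for test, executed in test_paths.items():
        always_top = all(paths == {top[method]}
                         for method, paths in executed.items())
        polarity[test] = "positive" if always_top else "negative"
    return polarity


# Illustrative data: call frequency per path, per application method.
path_counts = {"monthrange": Counter({"path 1": 218, "path 2": 3})}

# Illustrative data: paths executed by each test method.
test_paths = {
    "test_january": {"monthrange": {"path 1"}},
    "test_thirteenth_month": {"monthrange": {"path 2"}},
}

print(classify(test_paths, path_counts))
```

Under this reading, a test that mixes the top-ranked path with any other path of the same method is labeled negative.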
Preliminary Evaluation
Preliminary Evaluation: Method
We assess its precision in correctly classifying the tests as positive or negative
We analyze 12 test suites provided by the Python Standard Library
Test polarity classified 2,054 test methods and we randomly selected 324 (i.e.,
95% confidence level and 5% confidence interval) to perform a manual analysis
We manually inspected their source code and classified each test as:
●Positive test: valid and expected cases
●Negative test: invalid, unexpected, and exceptional cases
●Unclear polarity: we did not classify the test method
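The sample of 324 out of 2,054 test methods is consistent with a standard sample-size calculation. The slides do not state which formula was used; the sketch below assumes Cochran's formula with a finite population correction, a z-score of 1.96 for 95% confidence, a 5% margin of error, and the conservative proportion p = 0.5.

```python
import math


def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for a finite population (assumed Cochran formula)."""
    # Sample size for an infinite population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction, rounded up to whole test methods.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


print(sample_size(2054))  # 324
```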
Preliminary Evaluation: Results
We could manually classify 117 out of 324:
●Positive tests: 5
●Negative tests: 112
●Unclear polarity: 207
Our automated approach correctly classified all 117 test methods
Preliminary Empirical Study
RQ: What is the test polarity of real-world test suites?
2,054 test methods from 12 test suites of the Python Standard Library
●Negative tests: 88%
●Positive tests: 12%
●Large variation per project
RQ: What is the test polarity of real-world test suites?
Words in negative test names: bad, illegal, error, empty, null, issue, and limits
Words in positive test names: valid, safe, and basic
Summary
Test polarity is an automated approach to detect positive and negative tests
based on runtime analysis
Preliminary empirical study with 2,054 test methods from 12 test suites
●Negative tests: 88%
●Positive tests: 12%
●But, large variation per project
Our initial research opens room for novel empirical studies in software testing to
better understand the polarity of real-world systems