Assessing the Threat of Untracked Changes in Software Evolution (ICSE 2018)

andrehoraa 18 views 46 slides Jul 25, 2024

About This Presentation

While refactoring is extensively performed by practitioners, many Mining Software Repositories (MSR) approaches do not detect nor keep track of refactorings when performing source code evolution analysis. In the best case, keeping track of refactorings could be unnecessary work; in the worst case, t...


Slide Content

Assessing the Threat of Untracked
Changes in Software Evolution
André Hora, Danilo Silva,
Marco Tulio Valente, Romain Robbes
ICSE 2018

Outline
1. Context
2. Problem
3. Background
4. Study Design
5. Results
6. Final Remarks

Mining Software
Repositories (MSR)

MSR Examples
• Library migration
• Change prediction
• Bug fixing
• Warnings prioritization
• Code expert computation
• …

Level of analysis
Changes in classes and
methods over time

Example 1
java.util.Vector -> java.util.List

Example 2
FileInputStream() -> Okio.source()
Rule would not be detected due to
the method renaming
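
To illustrate why a rename can hide the rule, here is a minimal Java sketch of a deliberately naive miner that pairs method bodies by method name. The class NaiveRuleMiner, the method names readFile and readSource, and the one-call-per-method model are all hypothetical simplifications; none of them come from the slides or from a real mining tool.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NaiveRuleMiner {
    /**
     * before/after map a method name to the single API call of interest in its body
     * (a hypothetical, heavily simplified model of a method).
     * renames maps old method names to new ones; an empty map mimics a miner
     * that does not resolve untracked changes.
     */
    static List<String> mine(Map<String, String> before,
                             Map<String, String> after,
                             Map<String, String> renames) {
        List<String> rules = new ArrayList<>();
        for (Map.Entry<String, String> entry : before.entrySet()) {
            String oldName = entry.getKey();
            String oldCall = entry.getValue();
            // Without rename information, the method is looked up under its old name.
            String newName = renames.getOrDefault(oldName, oldName);
            String newCall = after.get(newName);
            if (newCall != null && !newCall.equals(oldCall)) {
                rules.add(oldCall + " -> " + newCall);
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("readFile", "FileInputStream()");   // hypothetical method name
        Map<String, String> after = new HashMap<>();
        after.put("readSource", "Okio.source()");      // same method after a rename

        Map<String, String> noRenames = Collections.emptyMap();
        Map<String, String> renames = new HashMap<>();
        renames.put("readFile", "readSource");

        System.out.println(mine(before, after, noRenames)); // []
        System.out.println(mine(before, after, renames));   // [FileInputStream() -> Okio.source()]
    }
}
```

Under these assumptions, the miner only sees the rule once the rename has been resolved; otherwise the old method looks deleted and the new one looks freshly added.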

Several other examples…

Refactoring is common
practice in software
development

MSR studies may be
affected by refactoring

MSR researchers are aware of this
“threat”, but they often do not assess it
“Our tool is unable to verify if an entity in revision n has been renamed in revision n+1” [48]
“The development history of a file can be lost in case of renaming operations, copy or file split” [3]
“It is possible to miss bug-introducing changes when a file changes its name since the approach does not track such name changes” [38]
“We detect renamed or moved units as units that are removed first and added later” [50]
[2, 5, 6, 7, 12, 22, 26, 27, 28, 29, 34, 36, 42, 45, 53, 54, 59, 61, 62, 66, 67, 68…]

What is the impact of
refactoring on MSR
studies?

Tracked and Untracked
Changes
version 1:
public void foo() {
    obj.print();
}
version 2:
public void foo() {
    obj.println();
}
version 3:
public void bar() {
    obj.println();
}
tracked change (version 1 to version 2): preserves the entity name and
modifies its source code
untracked change (version 2 to version 3): modifies the entity name,
and may also modify its source code
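
The two definitions can be sketched as a small classifier. This is an illustrative Java sketch, not the authors' tooling: EntitySnapshot, ChangeKind and ChangeClassifier are hypothetical names, the UNCHANGED case is an addition for completeness, and the sketch assumes the two snapshots are already known to be the same entity (establishing that link across a rename is precisely the hard part).

```java
/** Hypothetical snapshot of a method or class in one version. */
final class EntitySnapshot {
    final String name;
    final String body;

    EntitySnapshot(String name, String body) {
        this.name = name;
        this.body = body;
    }
}

/** UNCHANGED is not on the slide; it is added here so every case has a label. */
enum ChangeKind { UNCHANGED, TRACKED, UNTRACKED }

final class ChangeClassifier {
    /**
     * Classifies the change between two snapshots of the same entity,
     * following the slide's definitions of tracked and untracked changes.
     */
    static ChangeKind classify(EntitySnapshot before, EntitySnapshot after) {
        if (!before.name.equals(after.name)) {
            // Name modified; the body may or may not have changed as well.
            return ChangeKind.UNTRACKED;
        }
        // Name preserved: a tracked change if the source code was modified.
        return before.body.equals(after.body) ? ChangeKind.UNCHANGED : ChangeKind.TRACKED;
    }

    public static void main(String[] args) {
        EntitySnapshot v1 = new EntitySnapshot("foo", "obj.print();");
        EntitySnapshot v2 = new EntitySnapshot("foo", "obj.println();");
        EntitySnapshot v3 = new EntitySnapshot("bar", "obj.println();");
        System.out.println(classify(v1, v2)); // TRACKED
        System.out.println(classify(v2, v3)); // UNTRACKED
    }
}
```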

Change Graph
version 1:
class Foo { mA() {…} }
class Bar { mB() {…} mC() {…} }
version 2:
class Foo { mA() {…} }
class Bar { mX() {…} mC() {…} }
version 3:
class Foo { mA() {…} }
class Bar { mX() {…} }
class Qux { mC() {…} }
version 4:
class Foo { mA() {…} }
class Baz { mY() {…} }
class Qux { mC() {…} mE() {…} }
Legend: tracked change, untracked change

Research Questions
• RQ1. What is the frequency of untracked changes?
• RQ2. What is the extension of untracked changes?
• RQ3. What is the impact of untracked changes in existing MSR-based approaches?

Case Studies

Tracked and Untracked
Changes Computation
Refactoring resolution
• RefDiff [Silva et al., MSR 2017]
• Precision: 85.6% to 100%
• Recall: 89.8% to 93.9%
1. Rename Class
2. Move Class
3. Extract Superclass
4. Move and Rename Class
5. Extract Interface
6. Rename Method
7. Move Method
8. Extract Method
9. Inline Method
10. Pull Up Method
11. Push Down Method

RQ1
What is the frequency of untracked
changes?

RQ1. What is the frequency of
untracked changes? (example)
Change graph of versions 1 to 4 (see the Change Graph slide):
17 changes
12 tracked changes
5 untracked changes
Not desirable: relevant
data may be missed!
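
The counts in this example can be reproduced with a small Java sketch that encodes the change graph as a list of edges. The edge list is a reading of the flattened diagram, not data from the authors: in particular, the extract-method edge from Qux.mC to Qux.mE is an assumption chosen because it makes the totals match the slide, and the names Change and ChangeGraphExample are hypothetical. Method bodies are elided on the slide, so every name-preserving edge is counted as tracked.

```java
import java.util.ArrayList;
import java.util.List;

/** One edge of the change graph: an entity in version n and its successor in n+1. */
final class Change {
    final String from;      // e.g. "v1:Bar.mB"
    final String to;        // e.g. "v2:Bar.mX"
    final boolean tracked;  // true when the entity name is preserved

    Change(String from, String to, boolean tracked) {
        this.from = from;
        this.to = to;
        this.tracked = tracked;
    }
}

final class ChangeGraphExample {
    /** Edges as read from the diagram; the mC -> mE edge is an assumption. */
    static List<Change> edges() {
        List<Change> e = new ArrayList<>();
        // version 1 -> version 2
        e.add(new Change("v1:Foo", "v2:Foo", true));
        e.add(new Change("v1:Foo.mA", "v2:Foo.mA", true));
        e.add(new Change("v1:Bar", "v2:Bar", true));
        e.add(new Change("v1:Bar.mB", "v2:Bar.mX", false));  // rename method
        e.add(new Change("v1:Bar.mC", "v2:Bar.mC", true));
        // version 2 -> version 3
        e.add(new Change("v2:Foo", "v3:Foo", true));
        e.add(new Change("v2:Foo.mA", "v3:Foo.mA", true));
        e.add(new Change("v2:Bar", "v3:Bar", true));
        e.add(new Change("v2:Bar.mX", "v3:Bar.mX", true));
        e.add(new Change("v2:Bar.mC", "v3:Qux.mC", false));  // move method
        // version 3 -> version 4
        e.add(new Change("v3:Foo", "v4:Foo", true));
        e.add(new Change("v3:Foo.mA", "v4:Foo.mA", true));
        e.add(new Change("v3:Bar", "v4:Baz", false));        // rename class
        e.add(new Change("v3:Bar.mX", "v4:Baz.mY", false));  // rename method
        e.add(new Change("v3:Qux", "v4:Qux", true));
        e.add(new Change("v3:Qux.mC", "v4:Qux.mC", true));
        e.add(new Change("v3:Qux.mC", "v4:Qux.mE", false));  // ASSUMED extract method
        return e;
    }

    /** Number of edges that a name-based analysis would not follow. */
    static long untracked(List<Change> edges) {
        return edges.stream().filter(c -> !c.tracked).count();
    }

    public static void main(String[] args) {
        List<Change> all = edges();
        long untracked = untracked(all);
        System.out.println(all.size() + " changes, "
                + (all.size() - untracked) + " tracked, "
                + untracked + " untracked");
    }
}
```

Under this reading, a name-based analysis would miss 5 of the 17 changes in the example.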

RQ1. What is the frequency of
untracked changes?
Untracked changes
Classes: 2% to 15%
Methods: 10% to 21%
Untracked changes are frequent

RQ1. What is the frequency of
untracked changes?
Untracked changes, by refactoring type
Rename method: 26%
Extract method: 23%
Move method: 22%
Move class: 12%
Keeping track of renamings is not enough

RQ2
What is the extension of untracked
changes?

RQ2. What is the extension of
untracked changes? (example)
Change graph of versions 1 to 4 (see the Change Graph slide):
7 paths
3 paths: only tracked changes
4 paths: at least one untracked change
Not desirable: their
histories may be split!
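
The path counts can be checked with a short Java sketch that walks the change graph from each entity's first appearance to its last. As before, the edge list is a reading of the flattened diagram: the extract-method edge from Qux.mC to Qux.mE is an assumption (it is what makes the totals come out as on the slide), a "path" is taken here to mean a root-to-leaf walk, and HistoryPaths is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;

final class HistoryPaths {
    // {from, to, kind}; kind "T" = tracked, "U" = untracked.
    static final String[][] EDGES = {
        {"v1:Foo", "v2:Foo", "T"}, {"v2:Foo", "v3:Foo", "T"}, {"v3:Foo", "v4:Foo", "T"},
        {"v1:Foo.mA", "v2:Foo.mA", "T"}, {"v2:Foo.mA", "v3:Foo.mA", "T"},
        {"v3:Foo.mA", "v4:Foo.mA", "T"},
        {"v1:Bar", "v2:Bar", "T"}, {"v2:Bar", "v3:Bar", "T"}, {"v3:Bar", "v4:Baz", "U"},
        {"v1:Bar.mB", "v2:Bar.mX", "U"}, {"v2:Bar.mX", "v3:Bar.mX", "T"},
        {"v3:Bar.mX", "v4:Baz.mY", "U"},
        {"v1:Bar.mC", "v2:Bar.mC", "T"}, {"v2:Bar.mC", "v3:Qux.mC", "U"},
        {"v3:Qux.mC", "v4:Qux.mC", "T"},
        {"v3:Qux.mC", "v4:Qux.mE", "U"},  // ASSUMED extract method
        {"v3:Qux", "v4:Qux", "T"},
    };

    /** Returns {total paths, paths containing at least one untracked change}. */
    static int[] countPaths() {
        Set<String> targets = new HashSet<>();
        for (String[] e : EDGES) {
            targets.add(e[1]);
        }
        // Roots are entities with no predecessor, i.e. where a history starts.
        Set<String> roots = new HashSet<>();
        for (String[] e : EDGES) {
            if (!targets.contains(e[0])) {
                roots.add(e[0]);
            }
        }
        int[] counts = new int[2];
        for (String root : roots) {
            walk(root, false, counts);
        }
        return counts;
    }

    /** Depth-first walk; a path is counted when it reaches an entity with no successor. */
    private static void walk(String node, boolean sawUntracked, int[] counts) {
        boolean leaf = true;
        for (String[] e : EDGES) {
            if (e[0].equals(node)) {
                leaf = false;
                walk(e[1], sawUntracked || e[2].equals("U"), counts);
            }
        }
        if (leaf) {
            counts[0]++;
            if (sawUntracked) {
                counts[1]++;
            }
        }
    }

    public static void main(String[] args) {
        int[] c = countPaths();
        System.out.println(c[0] + " paths, " + (c[0] - c[1])
                + " only tracked, " + c[1] + " with at least one untracked change");
    }
}
```

A name-based analysis would see each of the four affected histories as two or more unrelated, shorter histories.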

RQ2. What is the extension of
untracked changes?
18% to 41%: entities with at least one untracked change in their histories
22% to 58%: entities with at least one untracked change in their histories, only considering the most changed entities
Untracked changes cause splits in entity histories

RQ3. What is the impact of untracked changes
in existing MSR-based approaches?
• Approaches
  • API evolution mining rule (e.g., Vector -> List)
  • API co-usage mining rule (e.g., Map -> HashMap)
• Results
  • Amount of mined rules: usually improves when taking into account untracked changes (median: 0% to +7%)
  • Quality of mined rules: slightly improves when including untracked changes (median: -2% to +2%)
The impact of untracked changes is difficult to predict,
and needs to be evaluated on a case-by-case basis

Untracked changes are frequent
(10-21% at method level)
MSR studies should resolve untracked changes to access potentially
relevant new mining data
Keeping track of renamings is not enough
(≈26%)
MSR studies should address “extraction” and “moving” for a more
complete resolution of untracked changes
Untracked changes cause splits in entity histories
(18-41%)
MSR studies should resolve untracked changes when performing
traceability analysis, for more precise entity lifespans
