Introduction to Garbage Collection

5,358 views 23 slides Jan 09, 2017

About This Presentation

Introduction to Garbage Collection Algorithms. Use cases in JVM.


Slide Content

Garbage Collection

In computer science, garbage collection (GC) is a
form of Automatic Memory Management.



The garbage collector attempts to reclaim the
memory occupied by objects that are no longer in
use by the program.

Garbage collection was invented by John McCarthy
around 1959 to abstract away manual memory
management in Lisp.

Static vs Dynamic memory allocation



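Creating an array of ten integers in C (static: size fixed at compile time)
int array[10];
Creating an array of N integers in C (dynamic: allocated on the heap at run time)
int *array = malloc(N * sizeof(int));  /* needs <stdlib.h>; returns NULL on failure */

free(array);  /* must be released manually */
In C++
Foo* fooPtr = new Foo();
delete fooPtr;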

Languages

Languages without GC

C, C++*, D*, Objective-C*, Rust*

Languages with GC

Java, Go, PHP, Python, Scala*, Haskell, ...

Garbage Collection Algorithms

Reference Counting
Mark and Sweep
Copy Collection
Generational Collection

Reference Counting


Attach an extra integer (“reference count”) to every
heap-allocated data structure.

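●On a new reference: ++refCount.
●When a reference disappears: --refCount.
●If refCount == 0, reclaim the storage.
+Easy to implement
+Real-time cleanup (storage is reclaimed as soon as the count reaches zero)
-Additional storage per object
-Speed: every reference change costs an increment or decrement
-Cycles are not cleaned up
Used in: C++, Objective-C, Rust, PHP, Python, ...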
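The counting rules above, and the cycle problem, can be sketched with a toy model. This is only an illustration, not how any real runtime implements it; RcObject, retain, release and pointTo are hypothetical names, and "reclaiming" is simulated with a flag.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; all names here are hypothetical.
final class RcObject {
    final String name;
    int refCount = 0;          // the extra integer kept with every object
    boolean reclaimed = false; // set when the storage would be freed
    final List<RcObject> fields = new ArrayList<>(); // outgoing references

    RcObject(String name) { this.name = name; }

    // A new reference to this object appears: ++refCount.
    void retain() { refCount++; }

    // A reference disappears: --refCount, reclaim when it reaches zero.
    void release() {
        if (--refCount == 0) {
            reclaimed = true;
            // Reclaiming an object also drops the references it holds.
            for (RcObject child : fields) child.release();
            fields.clear();
        }
    }

    // Store a reference to target inside this object.
    void pointTo(RcObject target) {
        target.retain();
        fields.add(target);
    }
}

class RefCountDemo {
    // Acyclic case: dropping the root reference frees the whole chain.
    static boolean chainIsReclaimed() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.retain();   // a "root" reference to a
        a.pointTo(b); // a -> b
        a.release();  // the root reference goes away
        return a.reclaimed && b.reclaimed;
    }

    // Cyclic case: a <-> b keep each other's count above zero.
    static boolean cycleIsReclaimed() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.retain();
        b.retain();
        a.pointTo(b);
        b.pointTo(a);
        a.release();
        b.release();
        return a.reclaimed || b.reclaimed;
    }

    public static void main(String[] args) {
        System.out.println("chain reclaimed: " + chainIsReclaimed()); // true
        System.out.println("cycle reclaimed: " + cycleIsReclaimed()); // false
    }
}
```

In the cyclic case both counts stay at 1 after the roots are released, which is the leak that tracing collectors avoid.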

Tracing Collectors vs Reference Counting

Mark and Sweep

Mark Phase: traverses all reachable objects,
starting from the roots, and marks every
object found as alive.
Sweep Phase: traverses all objects in the
heap and reclaims the storage of
unmarked objects.
+Handles cycles
+Easy to implement
-Stops the world**
-Scans the entire heap
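A minimal sketch of the two phases on a toy heap, assuming objects are plain nodes with a mark bit and a list of outgoing references. Node, ToyHeap and MarkSweepDemo are hypothetical names; a real collector works on raw memory and object headers, not Java lists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy heap for illustration only; all names are hypothetical.
final class Node {
    final String name;
    boolean marked = false;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class ToyHeap {
    final List<Node> objects = new ArrayList<>(); // every allocated object
    final List<Node> roots = new ArrayList<>();   // e.g. stack and static references

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    // Mark phase: start from the roots and mark everything reachable.
    void mark() {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue; // already visited, so cycles terminate
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Sweep phase: walk the whole heap, drop unmarked objects, reset marks.
    int sweep() {
        int before = objects.size();
        objects.removeIf(n -> !n.marked);
        for (Node n : objects) n.marked = false;
        return before - objects.size();
    }

    int collect() {
        mark();
        return sweep();
    }
}

class MarkSweepDemo {
    // Returns the number of objects reclaimed.
    static int run() {
        ToyHeap heap = new ToyHeap();
        Node a = heap.allocate("a");
        Node b = heap.allocate("b");
        Node c = heap.allocate("c");
        Node d = heap.allocate("d");
        heap.roots.add(a);
        a.refs.add(b); // a -> b, reachable from the root
        c.refs.add(d); // c <-> d, an unreachable cycle
        d.refs.add(c);
        return heap.collect();
    }

    public static void main(String[] args) {
        System.out.println("reclaimed: " + run()); // 2 (c and d)
    }
}
```

The unreachable cycle is reclaimed because liveness is decided by reachability from the roots, not by counts.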

Copy Collection

+Handles cycles
+Automatic Compaction
-Stops the world**
-Changes addresses
-Only half of the heap is usable at any time
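The slide lists only the trade-offs, so here is a rough sketch of the usual idea: the heap is split into two halves, live objects are copied from the active half into the idle one, and the halves are then swapped. The code assumes a Cheney-style scan and a forwarding table; Cell, SemiSpaceHeap and CopyDemo are hypothetical names, and the two "halves" are modelled as lists rather than memory ranges.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy semi-space heap; all names are hypothetical.
final class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();
    Cell(String name) { this.name = name; }
}

final class SemiSpaceHeap {
    List<Cell> fromSpace = new ArrayList<>(); // the half currently in use
    List<Cell> toSpace = new ArrayList<>();   // the idle half
    final List<Cell> roots = new ArrayList<>();

    Cell allocate(String name) {
        Cell c = new Cell(name);
        fromSpace.add(c);
        return c;
    }

    // Copy an object into to-space at most once and return its new location.
    private Cell forward(Cell old, Map<Cell, Cell> forwarding) {
        Cell copy = forwarding.get(old);
        if (copy == null) {
            copy = new Cell(old.name);
            copy.refs.addAll(old.refs); // still from-space pointers, fixed by the scan
            forwarding.put(old, copy);
            toSpace.add(copy);
        }
        return copy;
    }

    void collect() {
        Map<Cell, Cell> forwarding = new IdentityHashMap<>();
        roots.replaceAll(r -> forward(r, forwarding));
        // Scan to-space; it grows while we walk it, so use an index.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Cell c = toSpace.get(scan);
            c.refs.replaceAll(r -> forward(r, forwarding));
        }
        // Flip: everything left behind in the old half is garbage.
        fromSpace = toSpace;
        toSpace = new ArrayList<>();
    }
}

class CopyDemo {
    // Returns true if the collection behaved as expected.
    static boolean run() {
        SemiSpaceHeap heap = new SemiSpaceHeap();
        Cell a = heap.allocate("a");
        Cell b = heap.allocate("b");
        heap.allocate("garbage1");
        heap.allocate("garbage2");
        a.refs.add(b);
        b.refs.add(a); // reachable cycle
        heap.roots.add(a);
        heap.collect();

        Cell newA = heap.roots.get(0);
        boolean onlyLiveCopied = heap.fromSpace.size() == 2;
        boolean addressChanged = newA != a;
        boolean cyclePreserved = newA.refs.get(0).refs.get(0) == newA;
        return onlyLiveCopied && addressChanged && cyclePreserved;
    }

    public static void main(String[] args) {
        System.out.println("copy collection ok: " + run()); // true
    }
}
```

Survivors end up packed together in the new half, which is where the automatic compaction comes from; the changed object identity corresponds to the "changes addresses" drawback.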

Generational Collection

●Most objects die young
●Newer objects usually point to older objects
+Same as Copy (unidirectional)
+Fewer objects to copy
+Frequency of collection
+Faster
-Stops the world**
-Complex to implement
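A deliberately simplified sketch of the generational idea: new objects start in a young generation, a minor collection looks only at that generation, and objects that survive enough minor collections are promoted to the old generation. To stay short it assumes an object is live exactly when it is in the root set, so it has no object graph and no tracking of old-to-young pointers, which a real generational collector needs. GenHeap, Obj and TENURING_THRESHOLD are hypothetical names and values.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy generational heap; all names and the threshold are hypothetical.
final class Obj {
    final String name;
    int age = 0; // number of minor collections survived
    Obj(String name) { this.name = name; }
}

final class GenHeap {
    static final int TENURING_THRESHOLD = 2;
    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final Set<Obj> roots = new HashSet<>(); // simplification: live == in this set

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o); // new objects always start young
        return o;
    }

    // Minor collection: only the young generation is examined.
    void minorCollect() {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!roots.contains(o)) continue; // died young: simply dropped
            if (++o.age >= TENURING_THRESHOLD) old.add(o); // promoted
            else survivors.add(o);
        }
        young.clear();
        young.addAll(survivors);
    }

    // Major collection: examines the old generation.
    void majorCollect() {
        old.removeIf(o -> !roots.contains(o));
    }
}

class GenerationalDemo {
    // Returns {young size, old size} after two minor collections.
    static int[] run() {
        GenHeap heap = new GenHeap();
        Obj cache = heap.allocate("cache");
        heap.roots.add(cache); // the only long-lived object
        for (int i = 0; i < 3; i++) heap.allocate("temp" + i);
        heap.minorCollect(); // temps die, cache survives with age 1
        for (int i = 3; i < 6; i++) heap.allocate("temp" + i);
        heap.minorCollect(); // cache reaches the threshold and is promoted
        return new int[] { heap.young.size(), heap.old.size() };
    }

    public static void main(String[] args) {
        int[] sizes = run();
        System.out.println("young=" + sizes[0] + " old=" + sizes[1]); // young=0 old=1
    }
}
```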

Java Stack vs Heap

Pre and Post Java 8 Heap

Java Garbage Collectors

●Serial Collector - -XX:+UseSerialGC
●Parallel Collector - -XX:+UseParallelGC
●Concurrent Mark & Sweep Collector - -XX:+UseConcMarkSweepGC
●Garbage First (G1) Collector - -XX:+UseG1GC
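To check which collector a running JVM actually selected after passing one of the flags above, the standard java.lang.management API can be queried. This is a small sketch; the class name ActiveCollectors is made up, and the names printed depend on the JVM version and the chosen collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors registered in the current JVM.
class ActiveCollectors {
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with e.g. -XX:+UseSerialGC or -XX:+UseG1GC and compare the output.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```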

Serial GC

●Stops the world
●Uses single thread
●Designed for single CPU small Heap apps
●Do not use it!
●Young Gen - Copy Collector
●Old Gen - Mark and Sweep

Parallel GC (the default collector up to Java 8)

●Stops the world
●Designed to work with multiple CPUs
●Uses multiple threads
●Expect high latencies when GC runs (a pause-time goal can be set with -XX:MaxGCPauseMillis=<N>)
●Young Gen - Copy Collector
●Old Gen - Mark and Sweep

Concurrent Mark and Sweep (CMS) GC

●Stops the world (relatively short pauses)
●Designed to work with multiple CPUs
●Uses multiple threads
●Good for low latency apps
●Young Gen - Copy Collector
●Old Gen - Concurrent Mark-Sweep

Garbage First (G1) Collector

●Heap is split into (typically 2048) smaller regions
●Avoids collecting the entire heap at once, instead collects incrementally
○The regions that contain the most garbage are collected first
●Soft real-time garbage collector (Predictable | Configurable STW)
●Compaction is relatively easy
●Uses multiple threads and is good for heaps larger than 6 GB. Default GC since Java 9.
●Young Gen - Copy Collector
●Old Gen - Concurrent marking, then copying (evacuation) of the regions with the most garbage

Minor GC vs Major GC vs Full GC

Minor GC cleans the Young Generation

Major GC cleans the Old Generation

Full GC cleans the Young and Old Generation
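One way to watch young- and old-generation collections separately is to read the per-collector counters the JVM exposes. The sketch below allocates short-lived garbage and prints how often each collector ran; GcCounts is a hypothetical class name, and the counts vary between runs, heap sizes and collectors (the JIT may even optimize some of the allocations away).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Reports how many collections each registered collector has performed.
class GcCounts {
    static Map<String, Long> snapshot() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() returns -1 if the count is undefined.
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Short-lived garbage: this mostly exercises the young generation.
        long allocated = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] tmp = new byte[256];
            allocated += tmp.length;
        }
        System.out.println("allocated bytes: " + allocated);
        snapshot().forEach((name, count) ->
                System.out.println(name + ": " + count + " collections"));
    }
}
```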

AdProxy Latency Issues

JVM Options to know
java -XX:+PrintFlagsFinal -version | grep HeapSize

uintx InitialHeapSize := 268435456 {product}
uintx MaxHeapSize := 4294967296 {product}


java -Xms256m -Xmx2048m

-XX:MaxGCPauseMillis=200

-XX:+PrintGCDetails -XX:+PrintGCDateStamps
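The heap limits can also be read from inside the application, which is a quick way to confirm that -Xms and -Xmx took effect. A small sketch using the standard Runtime API; HeapInfo and toMiB are hypothetical names, and maxMemory() may report slightly less than the -Xmx value depending on the collector.

```java
// Prints the heap limits as seen by the running JVM.
class HeapInfo {
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):          " + toMiB(rt.maxMemory()));
        System.out.println("committed heap (MiB):    " + toMiB(rt.totalMemory()));
        System.out.println("free in committed (MiB): " + toMiB(rt.freeMemory()));

        // The PrintFlagsFinal values shown earlier, converted from bytes:
        System.out.println(toMiB(268435456L));  // 256  (InitialHeapSize)
        System.out.println(toMiB(4294967296L)); // 4096 (MaxHeapSize)
    }
}
```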

Java Memory Leak

HashMap keys without proper equals() and hashCode()

Maps or lists which are growing forever
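The first leak pattern can be reproduced in a few lines. In this sketch BadKey relies on the identity-based equals() and hashCode() inherited from Object, so every put() adds a new entry even though the keys look the same; GoodKey overrides both. The class names and the "user-42" value are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Key that does NOT override equals()/hashCode(): compared by identity.
final class BadKey {
    final String id;
    BadKey(String id) { this.id = id; }
}

// Key with value-based equals()/hashCode().
final class GoodKey {
    final String id;
    GoodKey(String id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class MapLeakDemo {
    // Every put creates a new entry: the map grows with each call.
    static int sizeWithBadKeys(int n) {
        Map<BadKey, String> cache = new HashMap<>();
        for (int i = 0; i < n; i++) {
            cache.put(new BadKey("user-42"), "session");
        }
        return cache.size();
    }

    // Equal keys overwrite each other: the map stays at one entry.
    static int sizeWithGoodKeys(int n) {
        Map<GoodKey, String> cache = new HashMap<>();
        for (int i = 0; i < n; i++) {
            cache.put(new GoodKey("user-42"), "session");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("bad keys:  " + sizeWithBadKeys(1000));  // 1000
        System.out.println("good keys: " + sizeWithGoodKeys(1000)); // 1
    }
}
```

Because the map itself stays reachable, the garbage collector cannot reclaim any of the duplicate entries.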

Troubleshooting

UI Options: jconsole or jvisualvm.

Command Line:
jstat -gc -t processID 1s
jmap -heap processID
jmap -dump:live,format=b,file=heap.bin processID
jhat heap.bin

JVM Options: -XX:+PrintGCDetails -XX:+PrintGCDateStamps

Thank you!