Python GC

delimitry 3,389 views 22 slides May 23, 2014

About This Presentation

Slides for presentation about Python GC (Garbage Collector) and memory management in Python (CPython version 2.7)


Slide Content

Python GC
Dmitry Alimov, Software Developer, Zodiac Interactive, 2014

Garbage collection

The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. It was invented by John McCarthy around 1959 to solve problems in Lisp.

Used in Lisp, Smalltalk, Python, Java, Ruby, Perl, C#, D, Haskell, Scheme, Objective-C, etc.

Basic algorithms:
- Reference counting
- Mark-and-sweep
- Mark-and-compact
- Copying collector
- Generational collector

Memory in Python

Memory Management

Memory management involves a private heap containing all objects and data structures.

C API memory functions:
- PyMem_Malloc(), PyMem_Realloc(), PyMem_Free()
- PyMem_New(), PyMem_Resize(), PyMem_Del()

Other languages have "variables"; Python has "names" or "identifiers". Everything is an object.

>>> a = 1
>>> a = 2
>>> b = a
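The "names, not variables" point can be sketched as follows. This is a minimal illustration, not part of the original slides; the names `items` and `alias` are hypothetical and chosen only for the example.

```python
# Names are labels bound to objects; assignment rebinds a name, it does not copy.
a = 1          # the name "a" is bound to the int object 1
a = 2          # "a" is rebound to a different object, 2
b = a          # "b" is bound to the same object that "a" refers to

same_object = a is b        # both names refer to one object

items = [1, 2]
alias = items               # a second name for the same list, not a copy
alias.append(3)             # the change is visible through both names

print(same_object, items, alias is items)
```

Because `alias` and `items` are two names for one list, mutating through either name changes what both see.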

sys.getsizeof and __sizeof__

sys.getsizeof(object[, default])
Return the size of an object in bytes. The object can be any type of object. getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.

>>> import sys
>>> a = 123
>>> sys.getsizeof(a)
24  # 64-bit version

__sizeof__()
Return the size of an object in bytes (without GC overhead).

>>> a.__sizeof__()
24  # 64-bit version

>>> sys.getsizeof(tuple((1, 2, 3)))
72
>>> tuple((1, 2, 3)).__sizeof__()
48
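The difference between the two calls can be measured directly. The sketch below assumes a current CPython 3 interpreter (the slides use 2.7, so absolute sizes will differ); the variable names are illustrative only.

```python
import sys

number = 123          # int: not managed by the cyclic garbage collector
triple = (1, 2, 3)    # tuple: a container type managed by the collector

# getsizeof() = __sizeof__() + GC header, when the object is GC-managed
int_overhead = sys.getsizeof(number) - number.__sizeof__()
tuple_overhead = sys.getsizeof(triple) - triple.__sizeof__()

print("int GC overhead:", int_overhead)
print("tuple GC overhead:", tuple_overhead)
```

The overhead is zero for the int and positive for the tuple; the exact number of bytes depends on the interpreter version and platform.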

id and ctypes.string_at

id(object)
Return the "identity" of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime.
CPython implementation detail: This is the address of the object in memory.

>>> a = 123
>>> id(a)
30522672L

ctypes.string_at(address[, size])
This function returns the string starting at memory address address.

>>> import ctypes, struct
>>> ctypes.string_at(id(a), 24)
'\x06\x00\x00\x00\x00\x00\x00\x00\xc0G)\x1e\x00\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00'
>>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
(6, 506021824, 123)
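Since id() is the object's address in CPython, the address can be turned back into the object. The sketch below relies on that CPython implementation detail and assumes CPython 3; it is an illustration, not something to use in production code.

```python
import ctypes

value = ["some", "list"]
alias = value

address = id(value)    # CPython detail: the memory address of the object

# Reinterpret the address as a Python object pointer and fetch the object.
recovered = ctypes.cast(address, ctypes.py_object).value

print(hex(address), recovered is value, id(alias) == address)
```

Casting an address that no longer belongs to a live object would crash the interpreter, which is why this only makes sense while a strong reference such as `value` still exists.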

sys.getrefcount(object)
Return the reference count of the object. The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount().

struct.unpack(fmt, string)
Unpack the string (presumably packed by pack(fmt, ...)) according to the given format.

Format | C type             | Python type  | Standard size
Q      | unsigned long long | integer type | 8 bytes

>>> sys.getrefcount(a)
8
>>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
(6, 506021824, 123)
>>> type(a)
<type 'int'>
>>> id(type(a))
506021824L
>>> a
123
>>> ctypes.c_long.from_address(id(a))
c_long(6)

The three unpacked fields are the reference count (6), the address of the type object (506021824, equal to id(type(a))), and the integer value itself (123).
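Absolute reference counts vary between interpreter versions, so a more portable way to observe the counter is to look at how it changes. The following sketch assumes CPython; `target` and `holders` are hypothetical names.

```python
import sys

target = object()

before = sys.getrefcount(target)
holders = [target for _ in range(5)]   # five additional strong references
during = sys.getrefcount(target)
del holders                            # the list dies, its references go away
after = sys.getrefcount(target)

print(before, during, after)
```

The count rises by exactly the number of new references and returns to its earlier value once the list holding them is deleted.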

>>> struct.unpack('QQQ', ctypes.string_at(id(a), 24))
(6, 506021824, 123)

C code:

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

#define PyObject_HEAD \
    _PyObject_HEAD_EXTRA \
    Py_ssize_t ob_refcnt; \
    struct _typeobject *ob_type;

#define _PyObject_HEAD_EXTRA \
    struct _object *_ob_next; \
    struct _object *_ob_prev;
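The same header can be read on a modern interpreter. The sketch below assumes a regular CPython 3 build (not a trace-refs debug build and not a free-threaded build), where an object starts with a Py_ssize_t-sized reference count field followed by the type pointer; on other builds the layout differs and the result would not be meaningful.

```python
import ctypes
import struct

obj = [1, 2, 3]

# "n" = native Py_ssize_t, "P" = native pointer (both need native mode)
header_format = "nP"
header_size = struct.calcsize(header_format)

# Assumption: the object header starts with the refcount field, then ob_type.
raw_header = ctypes.string_at(id(obj), header_size)
refcount_field, type_address = struct.unpack(header_format, raw_header)

print(refcount_field, type_address, id(type(obj)))
```

The second field matches id(type(obj)), mirroring the 506021824 value on the slide. On recent interpreter versions the first field may carry more than the plain count, so it is printed here but not interpreted.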

Garbage Collector in Python

Reference Counting

The first garbage collection algorithm is known as reference counting. It was invented by George Collins in 1960.

Py_INCREF / Py_DECREF

When an object's reference count is decremented to 0, the object is deallocated immediately.
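The immediacy of reference counting can be observed with a weak reference callback. This is a minimal sketch assuming CPython; the `Resource` class and the `events` list are hypothetical, and the cyclic collector is switched off to show that it plays no part.

```python
import gc
import weakref


class Resource(object):
    pass


events = []
was_enabled = gc.isenabled()
gc.disable()   # rule out the cyclic collector; only refcounting is at work
try:
    res = Resource()
    # The callback runs when the referent is about to be finalized.
    watcher = weakref.ref(res, lambda ref: events.append("freed"))
    events.append("before del")
    del res    # last strong reference gone: refcount reaches 0
    events.append("after del")
finally:
    if was_enabled:
        gc.enable()

print(events)
```

The "freed" entry lands between the other two, which shows that the object is released at the moment the last reference disappears rather than at some later collection.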

GC methods

gc.get_referrers(*objs)
Return the list of objects that directly refer to any of objs.

gc.get_referents(*objs)
Return a list of objects directly referred to by any of the arguments.
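A short sketch of both functions, looking at one relationship from each side. The names `target` and `container` are hypothetical.

```python
import gc

target = object()
container = [target, "other"]

# What does the container point to?
referents = gc.get_referents(container)
# Which GC-tracked objects point to the target?
referrers = gc.get_referrers(target)

target_is_referent = any(r is target for r in referents)
container_is_referrer = any(r is container for r in referrers)

print(target_is_referent, container_is_referrer)
```

get_referrers() usually returns more than the obvious container (for example, the namespace dictionary holding the name `target`), so the check looks for the container instead of comparing whole lists.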

Cyclic references
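Reference counting alone cannot free objects that refer to each other, which is why the cyclic collector exists. The sketch below assumes CPython; the `Node` class is hypothetical, and automatic collection is disabled so the effect of the explicit gc.collect() call is visible.

```python
import gc
import weakref


class Node(object):
    def __init__(self):
        self.other = None


was_enabled = gc.isenabled()
gc.disable()
gc.collect()                   # start from a clean state

first, second = Node(), Node()
first.other, second.other = second, first   # build a reference cycle
probe = weakref.ref(first)     # lets us check liveness without keeping it alive

del first, second              # no names left, but refcounts stay above 0
alive_after_del = probe() is not None

found = gc.collect()           # the cyclic collector finds the unreachable cycle
alive_after_collect = probe() is not None

if was_enabled:
    gc.enable()

print(alive_after_del, found, alive_after_collect)
```

The objects survive `del` because each still holds a reference to the other, and they disappear only after the collector runs.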

Generational algorithm of GC

To limit the cost of garbage collection, there are two strategies:
- make each collection faster, e.g. by scanning fewer objects
- do fewer collections

3 generations with thresholds:
- generation 0 (youngest): 700
- generation 1 (middle): 10
- generation 2 (oldest): 10

>>> import gc
>>> gc.get_threshold()
(700, 10, 10)

Exception: in Python 2.7, objects with a __del__ method that are part of a reference cycle are not freed; they are placed in gc.garbage.

A full collection is performed only if the ratio long_lived_pending / long_lived_total exceeds 25% (Python 2.7+).
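The thresholds and current counters can be inspected and tuned at run time. This sketch assumes CPython 3, where default values and the meaning of the second and third threshold may differ from the 2.7 numbers on the slide; the value 1000 is an arbitrary example.

```python
import gc

thresholds = gc.get_threshold()   # one threshold per generation
counts = gc.get_count()           # current counters per generation

original = thresholds
gc.set_threshold(1000, original[1], original[2])   # collect gen 0 less often
changed = gc.get_threshold()

gc.set_threshold(*original)       # restore the previous settings
restored = gc.get_threshold()

print(thresholds, counts, changed, restored)
```

Raising the first threshold is one way to follow the "do fewer collections" strategy, at the cost of letting more garbage accumulate between runs.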

Py_TPFLAGS_HAVE_GC flag

Object types which are "containers" for other objects must support the garbage collector:
- The Py_TPFLAGS_HAVE_GC flag is set.
- The type needs to provide an implementation of the tp_traverse handler.

>>> Py_TPFLAGS_HAVE_GC = 1 << 14
>>> bool(type(1).__flags__ & Py_TPFLAGS_HAVE_GC)
False
>>> bool(type([]).__flags__ & Py_TPFLAGS_HAVE_GC)
True

C API:

TYPE* PyObject_GC_New(TYPE, PyTypeObject *type)
TYPE* PyObject_GC_NewVar(TYPE, PyTypeObject *type, Py_ssize_t size)

/* Adds op to the set of container objects tracked by GC */
void PyObject_GC_Track(PyObject *op)
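The flag check from the slide can be wrapped in a small helper and applied to several built-in types. The helper name `supports_gc` is hypothetical; the flag value 1 << 14 is the one given on the slide, and the sketch assumes CPython.

```python
import gc

PY_TPFLAGS_HAVE_GC = 1 << 14   # value from the slide (CPython type flag)


def supports_gc(obj):
    """Return True if the object's type participates in cyclic GC."""
    return bool(type(obj).__flags__ & PY_TPFLAGS_HAVE_GC)


samples = (1, "text", 3.5, [], {}, (), set())
report = {type(x).__name__: supports_gc(x) for x in samples}

# gc.is_tracked() answers a related question for one concrete object.
list_tracked = gc.is_tracked([])
int_tracked = gc.is_tracked(1)

print(report, list_tracked, int_tracked)
```

Atomic types such as int, str and float cannot take part in cycles, so they do not carry the flag, while the container types do.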

[Diagram: Generation 0 as a linked list]

[Diagram: Generation 0 and Generation 1]

Weak References

A weak reference is a reference that does not protect the referenced object from collection by a garbage collector, unlike a strong reference.

>>> import weakref
>>> class A(object): pass
>>> a = A()
>>> b = weakref.ref(a)
>>> weakref.getweakrefcount(a)
1
>>> p = weakref.proxy(a)
>>> b()
<__main__.A object at 0x0000000001EE64A8>
>>> del a
>>> print b()
None
>>> b
<weakref at 0000000001E8C408; dead>
>>> p
<weakproxy at 0000000001EAC458 to NoneType at 00000001E297348>
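The same session can be written as a script for CPython 3, which also shows what happens when a dead proxy is used. This is a sketch under that assumption; `proxy_error` and the attribute name `anything` are hypothetical.

```python
import weakref


class A(object):
    pass


a = A()
ref = weakref.ref(a)
proxy = weakref.proxy(a)

count_while_alive = weakref.getweakrefcount(a)   # one ref plus one proxy
alive_target = ref() is a                        # calling the ref yields the object

del a                                            # drop the only strong reference
dead_target = ref()                              # a dead weakref returns None

try:
    proxy.anything                               # any use of a dead proxy fails
    proxy_error = None
except ReferenceError as exc:
    proxy_error = exc

print(count_while_alive, alive_target, dead_target, proxy_error)
```

A plain weakref reports a dead referent by returning None, while a proxy raises ReferenceError, so code that uses proxies has to be prepared to catch that exception.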

Debug

- gc.DEBUG_* flags, e.g. gc.set_debug(gc.DEBUG_LEAK)
- Heapy (http://guppy-pe.sourceforge.net/)
- Memory profiler (https://pypi.python.org/pypi/memory_profiler)
- Python Object Graphs (http://mg.pov.lt/objgraph/)
- gdb-heap (https://fedorahosted.org/gdb-heap/)
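One of the debug flags can be tried without extra tools. The sketch below uses gc.DEBUG_SAVEALL, which is part of gc.DEBUG_LEAK but does not print to stderr, to keep collected objects in gc.garbage for inspection. It assumes CPython 3, and the `Leaky` class is hypothetical.

```python
import gc


class Leaky(object):
    def __init__(self):
        self.me = self   # self-reference: a one-object cycle


old_flags = gc.get_debug()
gc.collect()                       # clear out unrelated garbage first
gc.set_debug(gc.DEBUG_SAVEALL)     # save unreachable objects instead of freeing

Leaky()                            # created and immediately unreachable
gc.collect()

saved = [obj for obj in gc.garbage if isinstance(obj, Leaky)]

gc.set_debug(old_flags)            # restore previous debug settings
del gc.garbage[:]                  # drop the saved objects again

print(len(saved), saved)
```

Inspecting gc.garbage this way shows which objects were only reachable through cycles, which is often the first hint when looking for a leak.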

Thank you

References

http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
http://docs.python.org/2/library/gc.html
http://svn.python.org/view/python/trunk/Modules/gcmodule.c?revision=81029
http://patshaughnessy.net/2013/10/30/generational-gc-in-python-and-ruby
http://asvetlov.blogspot.ru/2008/11/blog-post.html
http://habrahabr.ru/post/193890/
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
http://foobarnbaz.com/2012/07/08/understanding-python-variables/
http://habrahabr.ru/company/wargaming/blog/198140/
http://en.wikipedia.org/wiki/Weak_reference

Q & A
@delimitry