Garbage collection The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. W as invented by John McCarthy around 1959 to solve problems in Lisp . Used in Lisp, Smalltalk, Python, Java, Ruby, Perl, C#, D, Haskell, Schema, Objective-C, etc. Basic algorithms: Reference counting Mark-and-sweep Mark-and-compact Copying collector Generational collector
Memory in Python
PyMem_Malloc (), PyMem_Realloc (), PyMem_Free () PyMem_New (), PyMem_Resize (), PyMem_Del () Memory Management Other languages have " variables“, Python has "names" or "identifiers ". Everything is an object >>> b = a >>> a = 2 >>> a = 1 Memory management involves a private heap containing all objects and data structures.
sys.getsizeof ( object [, default ]) >>> import sys >>> a = 123 >>> sys.getsizeof (a) 24 # 64-bit version Return the size of an object in bytes (without GC overhead). __ sizeof __() >>> a.__ sizeof __() 24 # 64-bit version sys.getsizeof and __ sizeof __ Return the size of an object in bytes. The object can be any type of object . getsizeof () calls the object’s __ sizeof __ method and adds an additional garbage collector overhead if the object is managed by the garbage collector. >>> sys.getsizeof (tuple((1, 2, 3))) 72 >>> tuple((1, 2, 3)).__ sizeof __() 48
id(object) >>> a = 123 >>> id(a) 30522672L This function returns the string starting at memory address address . ctypes.string_at ( address [, size ]) >>> ctypes.string_at(id(a), 24) '\x06\x00\x00\x00\x00\x00\x00\x00\xc0G)\x1e\x00\x00\x00\x00{\x00\x00\x00\x00\x00\x00\x00' >>> struct.unpack ('QQQ', ctypes.string_at(id(a), 24)) (6, 506021824, 123) id and ctypes.string_at Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime . CPython implementation detail: This is the address of the object in memory.
>>> sys.getrefcount (a) 8 >>> struct.unpack ('QQQ', ctypes.string_at(id(a), 24)) ( 6 , 506021824 , 123 ) >>> type(a) <type ' int '> >>> id(type(a)) 506021824L >>> a 123 >>> ctypes.c_long.from_address (id(a)) c_long (6) Return the reference count of the object. The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount (). sys.getrefcount ( object ) Unpack the string (presumably packed by pack( fmt , ...)) according to the given format. struct.unpack ( fmt , string ) Q | unsigned long long | integer type | 8 bytes
First garbage collection algorithm is known as reference counting . It was invented by George Collins in 1960. Reference Counting Py_INCREF / Py_DECREF If something decref'ed to 0, it should have been deallocated immediately at that time.
GC methods gc.get_referrers (* objs ) Return the list of objects that directly refer to any of objs . gc.get_referents (* objs ) Return a list of objects directly referred to by any of the arguments.
Cyclic references
G enerational algorithm of GC 3 Generations with thresholds: - generation 0 (youngest): 700 - generation 1 (middle ): 10 - generation 2 (oldest): 10 >>> import gc >>> gc.get_threshold() (700, 10, 10) To limit the cost of garbage collection, there are two strategies: - make each collection faster, e.g. by scanning fewer objects - do less collections Except objects with a __del__ method ! -> gc.garbage Full collection if the ratio: long_lived_pending / long_lived_total > 25% ( Python 2.7+ )
Py_TPFLAGS_HAVE_GC flag >>> Py_TPFLAGS_HAVE_GC = 1 << 14 >>> bool (type(1).__flags__ & Py_TPFLAGS_HAVE_GC ) False >>> bool (type([]).__flags__ & Py_TPFLAGS_HAVE_GC ) True TYPE* PyObject_GC_New (TYPE, PyTypeObject *type) TYPE* PyObject_GC_NewVar (TYPE, PyTypeObject *type, Py_ssize_t size) The Py_TPFLAGS_HAVE_GC flag is set. Need provide an implementation of the tp_traverse handler. /* Adds op to the set of container objects tracked by GC */ void PyObject_GC_Track ( PyObject *op) Object types which are “containers” for other objects C API:
Generation Generation 0 Linked list Generation 0
Generation 0 Generation 1
Weak References >>> import weakref >>> class A(object): pass >>> a = A() >>> b = weakref.ref (a ) >>> weakref.getweakrefcount (a) 1 >>> p = weakref.proxy (a ) >>> b() <__ main__.A object at 0x0000000001EE64A8> >>> del a >>> b() None >>> b < weakref at 0000000001E8C408; dead > >>> p < weakproxy at 0000000001EAC458 to NoneType at 00000001E297348 > Weak reference is a reference that does not protect the referenced object from collection by a garbage collector, unlike a strong reference.