Advanced Python

Python is a simple language, but its dynamic nature makes it easy to implement some magical behaviors. Although the runtime cost is high, the magic is very convenient. Combined with high-speed C++ and C code, the advanced Python features are powerful when used correctly.

Iterator

When processing a large amount of data, repetition and conditional operations are used everywhere. Data are kept in containers and we write code to iterate through the elements. Python provides the iterator protocol for these iterating idioms.

Let us start with a simple example: a list of 10 elements:

>>> data = list(range(10))
>>> print(data, type(data))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] <class 'list'>

Custom Iterator

Python provides iterators out of the box. To demonstrate how the protocol works, here we make a custom class that implements the Python iterator protocol:

class ListIterator:

    def __init__(self, data, offset=0):
        self.data = data
        self.it = None
        self.offset = offset

    def __iter__(self):
        return self

    def __next__(self):
        if self.it is None:  # first call: start from the first element
            self.it = 0
        elif self.it >= len(self.data)-1:  # no more elements to return
            raise StopIteration
        else:
            self.it += 1
        return self.data[self.it] + self.offset

Create the custom iterator from the list:

>>> list_iterator = ListIterator(data)

Check the type of the custom iterator object:

>>> print(list_iterator)
<__main__.ListIterator object at 0x10cfaebd0>

Take a look at its members:

>>> print(dir(list_iterator))
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__',
'__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__',
'data', 'it', 'offset']

Python uses the iterator object in the for ... in ... looping construct. Every time the construct needs the next element, ListIterator.__next__() is called. Let us see how it executes:

>>> for i in list_iterator:
...     print(i)
0
1
2
3
4
5
6
7
8
9
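
The for ... in ... construct keeps calling __next__() until StopIteration is raised. We can drive the protocol by hand with the built-in next(); here is a quick demonstration on a fresh ListIterator (the one above is already exhausted):

>>> list_iterator_fresh = ListIterator(data)
>>> print(next(list_iterator_fresh))
0
>>> print(next(list_iterator_fresh))
1
>>> rest = [next(list_iterator_fresh) for _ in range(8)]  # consume the remaining elements
>>> next(list_iterator_fresh)
Traceback (most recent call last):
  ...
StopIteration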

List comprehensions are another construct that uses the iterator protocol:

>>> print([value+100 for value in data])
[100, 101, 102, 103, 104, 105, 106, 107, 108, 109]

Built-In Iterator

Of course, it is not necessary to write a custom class like ListIterator just to iterate over a list in daily work. Python already provides the built-in iter() to create an iterator:

>>> list_iterator2 = iter(data)
>>> for i in list_iterator2:
...     print(i)
0
1
2
3
4
5
6
7
8
9

Check the type of the built-in iterator object:

>>> print(list_iterator2)
<list_iterator object at 0x10cfb2990>

Comparison

Compare with the type of our custom iterator:

>>> print(list_iterator)
<__main__.ListIterator object at 0x10cfaebd0>

Take a look at the members of the built-in iterator:

>>> print(dir(list_iterator2))
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__length_hint__', '__lt__',
'__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__']

Comparison

Compare with the members of our custom iterator:

>>> print(dir(list_iterator))
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__',
'__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__',
'data', 'it', 'offset']

Implicitly Created Iterator

The built-in iterator may also be created by calling the container.__iter__() method on the container object (iter() simply does it for you):

>>> list_iterator3 = data.__iter__()
>>> print(list_iterator3)
<list_iterator object at 0x10cfbab90>

Aided by container.__iter__(), most of the time we can use a container directly in the for ... in ... loop construct, because the construct knows about the iterator protocol:

>>> for i in data:
...     print(i)
0
1
2
3
4
5
6
7
8
9

List Comprehension

A list comprehension is the construct [... for ... in ...]. Python borrowed the syntax from other languages, e.g., Haskell. List comprehensions follow the iterator protocol.

The construct is very convenient. When used wisely, it makes code look cleaner. For example, the above for loop can be replaced by a one-liner:

>>> print("\n".join([str(i) for i in data]))
0
1
2
3
4
5
6
7
8
9

Note

While a list comprehension is mostly shorthand for a for loop, it may run faster or slower than the equivalent loop. It depends on the complexity of the statement and the container.
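
If in doubt, measure both forms with the standard-library timeit module. The following is a minimal sketch (the numbers depend on the machine and the Python version):

import timeit

setup = "data = list(range(1000))"

# Time the list comprehension.
t_comp = timeit.timeit("[v + 100 for v in data]", setup=setup, number=10000)

# Time the equivalent for loop that appends to a list.
t_loop = timeit.timeit(
    "out = []\nfor v in data:\n    out.append(v + 100)",
    setup=setup, number=10000)

print(t_comp, t_loop)  # elapsed seconds of each form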

Generator

An advanced use of the Python iterator protocol is the generator. A generator is a function returning an iterator (the iterator is also known as a generator iterator). An example of such a generator function:

def list_generator(input_data):
    for i in input_data:
        yield i

A generator function uses the yield statement instead of the return statement.

When “calling” the generator function, we get the generator object in return:

>>> generator = list_generator(data)
>>> print(generator)
<generator object list_generator at 0x10cf756d0>

The generator object is an iterator with the methods iterator.__iter__() and iterator.__next__():

>>> print(dir(generator))
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__lt__', '__name__', '__ne__',
'__new__', '__next__', '__qualname__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom', 'send',
'throw']

It works in the same way as the iterators we used earlier:

>>> for i in list_generator(data):
...     print(i)
0
1
2
3
4
5
6
7
8
9
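
Note that a generator iterator, like any other iterator, can be consumed only once. The generator object we created earlier still holds all the values, but draining it a second time yields nothing:

>>> print(list(generator))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(list(generator))
[]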

Generator Expression

A more convenient way of creating a generator is to use a generator expression (... for ... in ...). Note that this looks like the list comprehension [... for ... in ...], but uses parentheses instead of brackets.

Use the generator expression to return a generator object (and check its type):

>>> generator2 = (i for i in data)
>>> print(generator2)
<generator object <genexpr> at 0x10cfce1d0>

Take a look at what is on the object:

>>> print(dir(generator2))
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__lt__', '__name__', '__ne__',
'__new__', '__next__', '__qualname__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom', 'send',
'throw']

The generator iterator returned from the generator expression works just like the iterator shown before:

>>> for i in generator2:
...     print(i)
0
1
2
3
4
5
6
7
8
9

Since a generator expression is an expression, it can replace a list comprehension inside another expression. The one-liner that printed the data can have the brackets removed (turning the list comprehension into a generator expression):

>>> print("\n".join(str(i) for i in data))
0
1
2
3
4
5
6
7
8
9

Comparison

Compare with the list comprehension:

>>> print("\n".join( [ str(i) for i in data ] ))
0
1
2
3
4
5
6
7
8
9

Stack Frame


Frame Object

We can get the frame object of the current stack frame using inspect.currentframe():

>>> import inspect
>>> f = inspect.currentframe()

A frame object has the following attributes:

  • Namespace:
    • f_builtins: builtin namespace seen by this frame
    • f_globals: global namespace seen by this frame
    • f_locals: local namespace seen by this frame
  • Other:
    • f_back: next outer frame object (this frame’s caller)
    • f_code: code object being executed in this frame
    • f_lasti: index of last attempted instruction in bytecode
    • f_lineno: current line number in Python source code

Let us see it ourselves:

>>> print([k for k in dir(f) if not k.startswith('__')])
['clear', 'f_back', 'f_builtins', 'f_code', 'f_globals', 'f_lasti',
'f_lineno', 'f_locals', 'f_trace', 'f_trace_lines', 'f_trace_opcodes']

We can learn many things about the frame from the object. For example, take a look at the builtin namespace (f_builtins):

>>> print(f.f_builtins.keys())
dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__',
'__build_class__', '__import__', 'abs', 'all', 'any', 'ascii', 'bin',
'breakpoint', 'callable', 'chr', 'compile', 'delattr', 'dir', 'divmod',
'eval', 'exec', 'format', 'getattr', 'globals', 'hasattr', 'hash', 'hex',
'id', 'input', 'isinstance', 'issubclass', 'iter', 'len', 'locals', 'max',
'min', 'next', 'oct', 'ord', 'pow', 'print', 'repr', 'round', 'setattr',
'sorted', 'sum', 'vars', 'None', 'Ellipsis', 'NotImplemented', 'False',
'True', 'bool', 'memoryview', 'bytearray', 'bytes', 'classmethod', 'complex',
'dict', 'enumerate', 'filter', 'float', 'frozenset', 'property', 'int',
'list', 'map', 'object', 'range', 'reversed', 'set', 'slice', 'staticmethod',
'str', 'super', 'tuple', 'type', 'zip', '__debug__', 'BaseException',
'Exception', 'TypeError', 'StopAsyncIteration', 'StopIteration',
'GeneratorExit', 'SystemExit', 'KeyboardInterrupt', 'ImportError',
'ModuleNotFoundError', 'OSError', 'EnvironmentError', 'IOError', 'EOFError',
'RuntimeError', 'RecursionError', 'NotImplementedError', 'NameError',
'UnboundLocalError', 'AttributeError', 'SyntaxError', 'IndentationError',
'TabError', 'LookupError', 'IndexError', 'KeyError', 'ValueError',
'UnicodeError', 'UnicodeEncodeError', 'UnicodeDecodeError',
'UnicodeTranslateError', 'AssertionError', 'ArithmeticError',
'FloatingPointError', 'OverflowError', 'ZeroDivisionError', 'SystemError',
'ReferenceError', 'MemoryError', 'BufferError', 'Warning', 'UserWarning',
'DeprecationWarning', 'PendingDeprecationWarning', 'SyntaxWarning',
'RuntimeWarning', 'FutureWarning', 'ImportWarning', 'UnicodeWarning',
'BytesWarning', 'ResourceWarning', 'ConnectionError', 'BlockingIOError',
'BrokenPipeError', 'ChildProcessError', 'ConnectionAbortedError',
'ConnectionRefusedError', 'ConnectionResetError', 'FileExistsError',
'FileNotFoundError', 'IsADirectoryError', 'NotADirectoryError',
'InterruptedError', 'PermissionError', 'ProcessLookupError', 'TimeoutError',
'open', 'copyright', 'credits', 'license', 'help', '__IPYTHON__', 'display',
'get_ipython'])

The field f_code is a mysterious code object:

>>> print(f.f_code)
<code object <module> at 0x10d0d1810, file "<ipython-input-26-dac680851f0c>",
line 3>
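
Among other things, the code object carries the name of the code block, the source file, and the starting line number; they match what the repr above shows:

>>> print(f.f_code.co_name)
<module>
>>> print(f.f_code.co_filename)
<ipython-input-26-dac680851f0c>
>>> print(f.f_code.co_firstlineno)
3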

Danger

Because a frame object holds references to everything the running code uses, make sure to break the reference to the frame object after finishing with it:

>>> f.clear()
>>> del f

If we do not, it may take a long time for the interpreter to break the reference for us.
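
A common idiom is to wrap the use of the frame in try ... finally so that the reference is always broken, even when an exception occurs. The helper below is a minimal sketch and its name is made up for illustration:

import inspect

def show_caller_line():
    frame = inspect.currentframe().f_back  # the caller's frame
    try:
        print('called from line', frame.f_lineno)
    finally:
        del frame  # break the reference even if an exception occurred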

An example of using the frame object is to print the stack frame in a custom way:

Custom code for showing stack frame (showframe.py).
#!/usr/bin/env python3

import sys
import inspect

def main():
    for it, fi in enumerate(inspect.stack()):
        sys.stdout.write('frame #{}:\n  {}\n\n'.format(it, fi))

if __name__ == '__main__':
    main()

Run the script to see the frames:

$ ./showframe.py
frame #0:
  FrameInfo(frame=<frame at 0x7f8d4c31fdc0, file './showframe.py', line 8, code main>,
  filename='./showframe.py', lineno=7, function='main',
  code_context=['    for it, fi in enumerate(inspect.stack()):\n'], index=0)

frame #1:
  FrameInfo(frame=<frame at 0x104762450, file './showframe.py', line 11, code <module>>,
  filename='./showframe.py', lineno=11, function='<module>',
  code_context=['    main()\n'], index=0)

Module Magic with meta_path

Python importlib allows a high degree of freedom in customizing how modules are imported. Not a lot of people know these capabilities, and perhaps one of the most useful hidden gems is sys.meta_path.

Here I will use an example to show how to use sys.meta_path to customize module loading. I will use a module, onemod, located in an alternate directory, altdir/, and ask Python to load it from the non-standard location.

Note

Before running the example, make a shallow copy of the list to back it up:

>>> import sys
>>> # Bookkeeping code: keep the original meta_path.
>>> old_meta_path = sys.meta_path[:]

sys.meta_path defines a list of importlib.abc.MetaPathFinder objects for customizing the import process. Take a look at the contents in sys.meta_path:

>>> sys.meta_path = old_meta_path  # Reset the list.
>>> print(sys.meta_path)
[<class '_frozen_importlib.BuiltinImporter'>, <class
'_frozen_importlib.FrozenImporter'>, <class
'_frozen_importlib_external.PathFinder'>]

At this point, onemod cannot be imported, because the path altdir/ is not in sys.path:

>>> import onemod
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'onemod'

In normal Python code, you would modify sys.path to include the path altdir/ so that onemod can be imported correctly (a sketch of that conventional approach is shown below). Here we will use a MetaPathFinder instead. Derive from the abstract base class (ABC) and override the find_spec() method to tell it to load the module onemod from the place we specify.
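
For comparison only, the conventional approach would look like the following sketch (assuming altdir/ sits under the current working directory); we deliberately do not use it here:

import sys
sys.path.append('altdir')  # let the default PathFinder search altdir/
import onemod              # the normal import machinery can now find it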

For our path finder to work, we need to properly set up an importlib.machinery.ModuleSpec and create an importlib.machinery.SourceFileLoader object:

import os
import importlib.abc
import importlib.machinery

class MyMetaPathFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == 'onemod':
            print('DEBUG: fullname: {} , path: {} , target: {}'.format(
                fullname, path, target))
            fpath = os.path.abspath('altdir/onemod.py')
            loader = importlib.machinery.SourceFileLoader('onemod', fpath)
            return importlib.machinery.ModuleSpec(fullname, loader, origin=fpath)
        else:
            return None

Add an instance of MyMetaPathFinder to sys.meta_path:

>>> sys.meta_path = old_meta_path + [MyMetaPathFinder()]
>>> print(sys.meta_path)
[<class '_frozen_importlib.BuiltinImporter'>, <class
'_frozen_importlib.FrozenImporter'>, <class
'_frozen_importlib_external.PathFinder'>, <__main__.MyMetaPathFinder object
at 0x10117b850>]

With the meta path finder inserted, onemod can be imported:

>>> import onemod
DEBUG: fullname: onemod , path: None , target: None
>>> print("show content in onemod module:", onemod.content)
show content in onemod module: string in onemod

The finder limits the special loading scheme to the specific module onemod. To test it, ask to import a module that does not exist:

>>> import one_non_existing_module
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'one_non_existing_module'

Take a look at the module we loaded, and compare it with a “normal” module:

>>> import re
>>> print('onemod:', onemod)
onemod: <module 'onemod' (/Users/yungyuc/work/web/ynote/nsd/12advpy/code/altdir/onemod.py)>
>>> print('re:', re)
re: <module 're' from '/Users/yungyuc/hack/usr/opt39_210210/lib/python3.9/re.py'>

The module objects have an important field __spec__, which is the ModuleSpec we created:

>>> print('onemod.__spec__:', onemod.__spec__)
onemod.__spec__: ModuleSpec(name='onemod',
loader=<_frozen_importlib_external.SourceFileLoader object at 0x10117bd30>,
origin='/Users/yungyuc/work/web/ynote/nsd/12advpy/code/altdir/onemod.py')
>>> print('re.__spec__:', re.__spec__)
re.__spec__: ModuleSpec(name='re',
loader=<_frozen_importlib_external.SourceFileLoader object at 0x1010b4fa0>,
origin='/Users/yungyuc/hack/usr/opt39_210210/lib/python3.9/re.py')

Descriptor


Python is very flexible in accessing attributes in an object. There are multiple ways to customize the access, and the descriptor protocol provides the most versatile API and allows us to route attribute access anywhere [1].

Naive Accessor

To show how descriptors work, make a naive accessor class (by following the descriptor protocol):

class ClsAccessor:
    """Routing access to all instance attributes to the descriptor object."""
    def __init__(self, name):
        self._name = name
        self._val = None
    def __get__(self, obj, objtype):
        print('On object {} , retrieve: {}'.format(obj, self._name))
        return self._val
    def __set__(self, obj, val):
        print('On object {} , update: {}'.format(obj, self._name))
        self._val = val

Use the descriptor in a class:

class MyClass:
    x = ClsAccessor('x')

See the message printed while getting the attribute x:

>>> o = MyClass()
>>> print(o.x)
On object <__main__.MyClass object at 0x1011c02b0> , retrieve: x
None

Setting the attribute also shows a message:

>>> o.x = 10
On object <__main__.MyClass object at 0x1011c02b0> , update: x
>>> print(o.x)
On object <__main__.MyClass object at 0x1011c02b0> , retrieve: x
10

This naive descriptor has a problem. Because the attribute value is kept in the descriptor object, and the descriptor is kept in the class object, attributes of all instances of MyClass share the same value:

>>> o2 = MyClass()
>>> print(o2.x) # Already set, not None!
On object <__main__.MyClass object at 0x1011c02e0> , retrieve: x
10
>>> o2.x = 100 # Set the value on o2.
On object <__main__.MyClass object at 0x1011c02e0> , update: x
>>> print(o.x) # The value of o changes too.
On object <__main__.MyClass object at 0x1011c02b0> , retrieve: x
100

Keep Data on the Instance

Having all instances share the attribute value is usually undesirable, but of course the descriptor protocol allows us to bind the value to the instance. Let us change the accessor class a little bit:

class InsAccessor:
    """Routing access to all instance attributes to alternate names on the instance."""
    def __init__(self, name):
        self._name = name
    def __get__(self, obj, objtype):
        print('On object {} , instance retrieve: {}'.format(obj, self._name))
        varname = '_acs' + self._name
        if not hasattr(obj, varname):
            setattr(obj, varname, None)
        return getattr(obj, varname)
    def __set__(self, obj, val):
        print('On object {} , instance update: {}'.format(obj, self._name))
        varname = '_acs' + self._name
        return setattr(obj, varname, val)

The key to preserving the value on the instance is the mangled variable name (varname = '_acs' + self._name) computed in both __get__() and __set__(). We use it to add a reference on the instance. Now add the descriptor to a class:

class MyClass2:
    x = InsAccessor('x')

Create the first instance. The attribute can be correctly set and retrieved through the descriptor:

>>> mo = MyClass2()
>>> print(mo.x) # The value is uninitialized
On object <__main__.MyClass2 object at 0x101190250> , instance retrieve: x
None
>>> mo.x = 10
On object <__main__.MyClass2 object at 0x101190250> , instance update: x
>>> print(mo.x)
On object <__main__.MyClass2 object at 0x101190250> , instance retrieve: x
10
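
The value now lives on the instance under the mangled name. Accessing it directly does not go through the descriptor:

>>> print(mo._acsx)
10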

Create another instance. According to our implementation, what we did in the first instance is not seen in the second one:

>>> mo2 = MyClass2()
>>> print(mo2.x) # The value remains uninitialized
On object <__main__.MyClass2 object at 0x101190a90> , instance retrieve: x
None

Metaclass

Metaclasses are a mechanism to perform meta-programming in Python. That is, metaclasses change how Python code works by writing Python code, without using a code generator.

Class is an Object

In Python, a class is also an object, which is of the type “type”. Let us observe this interesting fact. Make a class:

class ClassIsObject:
    pass

The class can be manipulated like a normal object:

>>> print(ClassIsObject) # Operate the class itself, not the instance of the class
<class '__main__.ClassIsObject'>

The class has its own namespace (__dict__):

>>> print(ClassIsObject.__dict__) # The namespace of the class, not of the instance
{'__module__': '__main__',
 '__dict__': <attribute '__dict__' of 'ClassIsObject' objects>,
 '__weakref__': <attribute '__weakref__' of 'ClassIsObject' objects>,
 '__doc__': None}

The class is an object as well as a type. A type is also an object:

>>> isinstance(ClassIsObject, object)
True
>>> isinstance(ClassIsObject, type)
True
>>> isinstance(type, object)
True
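
We can also ask for the type of the class directly; it is type itself:

>>> print(type(ClassIsObject))
<class 'type'>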

Customize Class Creation

Knowing that classes are just Python objects, we can now discuss how to customize class creation using metaclasses. We will continue with the accessor example in Descriptor. In the previous example, the descriptor object needed to take an argument for its name:

x = InsAccessor('x')

I would like to lift that inconvenience. First, I create a new descriptor:

class AutoAccessor:
    """Routing access to all instance attributes to alternate names on the instance."""
    def __init__(self):
        self.name = None
    def __get__(self, obj, objtype):
        print('On object {} , auto retrieve: {}'.format(obj, self.name))
        varname = '_acs' + self.name
        if not hasattr(obj, varname):
            setattr(obj, varname, None)
        return getattr(obj, varname)
    def __set__(self, obj, val):
        print('On object {} , auto update: {}'.format(obj, self.name))
        varname = '_acs' + self.name
        return setattr(obj, varname, val)

The new descriptor class AutoAccessor does not take the attribute name in the constructor. Then I create a corresponding metaclass:

class AutoAccessorMeta(type):
    """Metaclass to use the new automatic descriptor."""
    def __new__(cls, name, bases, namespace):
        print('DEBUG before names:', name)
        print('DEBUG before bases:', bases)
        print('DEBUG before namespace:', namespace)
        for k, v in namespace.items():
            if isinstance(v, AutoAccessor):
                v.name = k
        # Create the class object for MyAutoClass.
        newcls = super(AutoAccessorMeta, cls).__new__(cls, name, bases, namespace)
        print('DEBUG after names:', name)
        print('DEBUG after bases:', bases)
        print('DEBUG after namespace:', namespace)
        return newcls

The metaclass AutoAccessorMeta assigns the correct attribute name to each AutoAccessor descriptor. We will compare the effects of the metaclass by creating two classes. The first uses AutoAccessor without the metaclass (i.e., with the default metaclass type):

>>> class MyAutoClassDefault(metaclass=type):
...     x = AutoAccessor()
...
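
Without the metaclass, nothing fills in the descriptor's name, so self.name stays None and attribute access breaks. A sketch of what happens (the exact error message may vary across Python versions):

>>> ao_default = MyAutoClassDefault()
>>> ao_default.x
On object <__main__.MyAutoClassDefault object at 0x...> , auto retrieve: None
Traceback (most recent call last):
  ...
TypeError: can only concatenate str (not "NoneType") to str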

The second is to use the metaclass. The metaclass scans the class namespace and assigns the attribute name to the corresponding descriptor:

>>> class MyAutoClass(metaclass=AutoAccessorMeta):
...     x = AutoAccessor()  # Note: no name is given.
...
DEBUG before names: MyAutoClass
DEBUG before bases: ()
DEBUG before namespace: {'__module__': '__main__',
'__qualname__': 'MyAutoClass',
'x': <__main__.AutoAccessor object at 0x10117bcd0>}
DEBUG after names: MyAutoClass
DEBUG after bases: ()
DEBUG after namespace: {'__module__': '__main__',
'__qualname__': 'MyAutoClass',
'x': <__main__.AutoAccessor object at 0x10117bcd0>}

Now we have successfully upgraded the descriptor to avoid the explicit argument for the attribute name:

>>> ao = MyAutoClass()
>>> print(ao.x) # The value is uninitialized
On object <__main__.MyAutoClass object at 0x101190460> , auto retrieve: x
None
>>> ao.x = 10
On object <__main__.MyAutoClass object at 0x101190460> , auto update: x
>>> print(ao.x)
On object <__main__.MyAutoClass object at 0x101190460> , auto retrieve: x
10
>>> print(ao._acsx)
10

Abstract Base Class (ABC)

Python is object-oriented and supports inheritance. Most of the time we use a simple inheritance relation, and it works as expected.

Create two classes with a simple inheritance relationship:
>>> class MyBaseClass:
...     pass
...
>>> class MyDerivedClass(MyBaseClass):
...     pass
...
>>> base = MyBaseClass()
>>> derived = MyDerivedClass()

The instance “base” instantiated from MyBaseClass is an instance of MyBaseClass but not MyDerivedClass:

>>> print('base {} MyBaseClass'.format(
...     'is' if isinstance(base, MyBaseClass) else 'is not'))
base is MyBaseClass
>>> print('base {} MyDerivedClass'.format(
...     'is' if isinstance(base, MyDerivedClass) else 'is not'))
base is not MyDerivedClass

The instance “derived” instantiated from MyDerivedClass is an instance of both MyBaseClass and MyDerivedClass:

>>> print('derived {} MyBaseClass'.format(
...     'is' if isinstance(derived, MyBaseClass) else 'is not'))
derived is MyBaseClass
>>> print('derived {} MyDerivedClass'.format(
...     'is' if isinstance(derived, MyDerivedClass) else 'is not'))
derived is MyDerivedClass

No surprises so far. But Python also supports abstract base classes (abc), which can turn this upside down.

Method Resolution Order (MRO)

When we need to use an ABC, the inheritance is usually much more complex than what we just described and involves multiple inheritance. There needs to be a definite way to resolve multiple inheritance, and Python uses the “C3” algorithm [2]. The description can also be found in the Python method resolution order (MRO) document [3].

Let us see how the MRO works with a single diamond inheritance:

Example of single diamond inheritance.
class A:
    def process(self):
        print('A process()')

class B(A):
    def process(self):
        print('B process()')
        super(B, self).process()

class C(A):
    def process(self):
        print('C process()')
        super(C, self).process()

class D(B, C):
    pass

In the above code, the inheritance relationship among the four classes forms a “diamond”. The MRO is:

>>> print(D.__mro__)
(<class '__main__.D'>,
<class '__main__.B'>, <class '__main__.C'>,
<class '__main__.A'>, <class 'object'>)
>>> obj = D()
>>> obj.process()
B process()
C process()
A process()

If we change the order in the inheritance declaration:

class D(C, B):
    pass

the MRO changes accordingly:

>>> print(D.__mro__)
(<class '__main__.D'>,
<class '__main__.C'>, <class '__main__.B'>,
<class '__main__.A'>, <class 'object'>)
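
Calling the method on an instance of the re-declared D follows the new order:

>>> obj = D()
>>> obj.process()
C process()
B process()
A process()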

In a more complex inheritance relationship, there may not be a single diamond. The following example has 3 diamonds crossing multiple levels:

Example of multiple diamond inheritance.
O = object
class F(O): pass
class E(O): pass
class D(O): pass
class C(D, F): pass
class B(D, E): pass
class A(B, C): pass

The MRO of the complex inheritance is:

>>> print(A.__mro__)
(<class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.D'>,
<class '__main__.E'>, <class '__main__.F'>, <class 'object'>)
>>> print(B.__mro__)
(<class '__main__.B'>, <class '__main__.D'>, <class '__main__.E'>, <class 'object'>)
>>> print(C.__mro__)
(<class '__main__.C'>, <class '__main__.D'>, <class '__main__.F'>, <class 'object'>)
>>> print(D.__mro__)
(<class '__main__.D'>, <class 'object'>)
>>> print(E.__mro__)
(<class '__main__.E'>, <class 'object'>)
>>> print(F.__mro__)
(<class '__main__.F'>, <class 'object'>)

And an instance of A is an instance of all the classes based on the inheritance rule:

>>> a = A()
>>> print('a {} A'.format('is' if isinstance(a, A) else 'is not'))
a is A
>>> print('a {} B'.format('is' if isinstance(a, B) else 'is not'))
a is B
>>> print('a {} C'.format('is' if isinstance(a, C) else 'is not'))
a is C
>>> print('a {} D'.format('is' if isinstance(a, D) else 'is not'))
a is D
>>> print('a {} E'.format('is' if isinstance(a, E) else 'is not'))
a is E
>>> print('a {} F'.format('is' if isinstance(a, F) else 'is not'))
a is F

Note

In production code, we usually do not want to deal with inheritance that is so complex. If possible, try to avoid it through the system design.

Virtual Base Class

The Python abstract base class (abc) provides the capability to overload isinstance() and issubclass(), and to define abstract methods.

We can use the abc.ABCMeta.register() method to make a class MyABC, which is not in the inheritance chain of another class A, a “virtual” base class of the latter.

import abc

class MyABC(metaclass=abc.ABCMeta):
    pass

Warning

Python “virtual” base classes have nothing to do with C++ virtual base classes.

As we know, A is not a subclass of MyABC:

>>> print('A {} a subclass of MyABC'.format('is' if issubclass(A, MyABC) else 'is not'))
A is not a subclass of MyABC

But once we “register” MyABC to be a virtual base class of A, we will see that A becomes a subclass of MyABC:

>>> MyABC.register(A)
<class '__main__.A'>
>>> print('A {} a subclass of MyABC'.format('is' if issubclass(A, MyABC) else 'is not'))
A is a subclass of MyABC
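
isinstance() honors the registration as well; the instance a created earlier now also counts as an instance of MyABC:

>>> print('a {} an instance of MyABC'.format('is' if isinstance(a, MyABC) else 'is not'))
a is an instance of MyABC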

Abstract Methods

Using abc, we can add abstract methods to a class (making it abstract).

class AbstractClass(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def process(self):
        pass

An abstract class cannot be instantiated:

>>> a = AbstractClass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class AbstractClass with abstract method process

In a derived class, the abstract method needs to be overridden:

class GoodConcreteClass(AbstractClass):
    def process(self):
        print('GoodConcreteClass process')

Then the good concrete class can be instantiated and run:

>>> g = GoodConcreteClass()
>>> g.process()
GoodConcreteClass process

If the abstract method is not overridden

class BadConcreteClass(AbstractClass):
    pass

the derived class cannot be instantiated:

>>> b = BadConcreteClass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class BadConcreteClass with abstract method process

References

[1] Descriptor HowTo Guide, https://docs.python.org/3/howto/descriptor.html.
[2] K. Barrett, B. Cassels, P. Haahr, D. A. Moon, K. Playford, and P. T. Withington, “A monotonic superclass linearization for Dylan,” SIGPLAN Not., vol. 31, no. 10, pp. 69–82, Oct. 1996, doi: 10.1145/236338.236343. https://dl.acm.org/doi/10.1145/236338.236343.
[3] The Python 2.3 Method Resolution Order, https://www.python.org/download/releases/2.3/mro/.