from IPython.core.display import HTML

HTML(open("custom.html", "r").read())
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright (C) 2014-2023 Scientific IT Services of ETH Zurich,
Contributing Authors: Uwe Schmitt, Mikolaj Rybniski

Introduction to object oriented programming¶

We have seen "objects" already: actually any type of data, like basic types int or bool or collection types like lists are objects. There are also objects not representing a data type, for example file handles.

1. "objects" have methods "attached". The functionality of the method depends on the objects type.¶

Examples:

"abcde".upper()

[1, 2, 3].append(4)

with open("say_hi.txt") as fh:
    print(fh.readlines())
['hi\n', 'ho\n']

2. Some objects also have attached values, which we call attributes.¶

x = 0.5 + 1j
print(x.real, x.imag)
0.5 1.0
print(fh.closed)
True

3. Classes vs. objects / instances.¶

  • When we create objects (e.g. a list, a string), we create an instance of a given class.

  • You can also interpret a class as a recipe how instances (objects) of that class work.

This is a very minimalistic example class with one method:

class HiSayer:
    def greet(self, name):
        print("hi", name)

So we use class followed by its name to start the declaration of a class.

Every other declaration related to this class follows indented.

Methods are declared like functions, but within a class declaration.

Just ignore the self for a moment:

# this is how we create the instance
x = HiSayer()

# now we call the greet method:
x.greet("retro")
hi retro

4. The magic self.¶

What is this self ? Lets print it:

class HiSayer:
    def greet(self, name):
        print("self is", self)
xxxx = HiSayer()
print("xxxx is", xxxx)
xxxx.greet("retro")
xxxx is <__main__.HiSayer object at 0x10351d840>
self is <__main__.HiSayer object at 0x10351d840>

The 0x.... you see is the storage location of the object in hexadecimal notation.

We see here: self is the same object as the xxxx in xxxx.greet("retro").

So Python "magically" inserts the current object as self.

We need this, because a method is defined at class level which is not aware of the current instance. By injecting the x as self the method knows on what object is operating on.

The name self is a convention, an one could use other names instead.

The "magic" insertion works as follows: Python translates object.method(...) to class_of_object.method(object, ...), e.g. in the example above xxxx.greet("retro") is translated to HiSayer.greet(xxxx, "rethro").

# this works also, but it is usually not used this way:
HiSayer.greet(xxxx, "retro")
self is <__main__.HiSayer object at 0x10351d840>

5. How can we pass arguments when creating an object ?¶

Many classes accept arguments when we create an instance, e.g. list

li = list((1, 2, 3))
print(type(li), li)
<class 'list'> [1, 2, 3]

To handle this we must implement special "initializer" method, named __init__:

import math


class Point2D:
    def __init__(self, x0, y0):
        self.x = x0  # set attribute x
        self.y = y0  # set attribute y

    def distance_to_origin(self):
        """method which computes length of Vector2D"""
        return math.hypot(self.x, self.y)

    def as_string(self):
        return "<Point2D x={}, y={}>".format(self.x, self.y)


# create an instance, this creates the object and
# calls __init__ with args 1 an
p = Point2D(1, -1)

# now we can access attributes
print(p.x, p.y)
1 -1
print(p.as_string())
print(p.distance_to_origin())
<Point2D x=1, y=-1>
1.4142135623730951

Comment:

  • names starting and ending with double underscores relate to Python internals.
  • methods following this notation are also called dunder (double underscore) or special methods.

Exercise section 1¶

  • reproduce the examples from above
  • implement a class JazzRecord which can be initialzied with artist name, record name and release year. e.t. record = JazzRecord("John Coltrane", "Giant Steps", 1957).
  • implement a method as_string which returns a string represetation like Giant Steps (by John Coltrane, 1957).

Why / when implement objects ?¶

  • create custom datatypes

  • implement data entities

  • data encapsulation

  • hide implementation details from user of a class

  • model state: In case you have a group a functions with long argument lists or you think they need global variables to "communicate", classes might be a solution

Using inheritance (next chapters) classes also help to:

  • avoid code duplication
  • write clearer and more maintainable code

6. Inheritance¶

Inheritance is a mechanism which in principle reuses code from an existing class.

This mechanism is very versatile and the core of many object oriented programming techniques.

To declare a class X which inherits methods from a class Y, we use the declaration

class X(Y):
     ...

We also say Y is a base class of X, X derives from Y or X inherits from Y.

In the following example we implement a class Vector2D

  • which inherits all methods from Point2D, including __init__.
  • adds two new methods add and length
  • but overwrites as_string.

This is now the code:

class Vector2D(Point2D):
    
    def add(self, other):
        """is called when you execute self + other"""
        return Vector2D(self.x + other.x, self.y + other.y)

    def length(self):
        return self.distance_to_origin()

    def as_string(self):
        return "<Vector2D x={}, y={}>".format(self.x, self.y)


# create instances
# calls __init__ from base class, since we did not implement it in Vector2D:

v1 = Vector2D(1.0, 2.0)    
v2 = Vector2D(-1.0, -2.0)
# access attributes
print(v1.x, v1.y)

print(v1.as_string())

# calls internally method in base class !
print(v1.length())
1.0 2.0
<Vector2D x=1.0, y=2.0>
2.23606797749979

Lets check what methods and attributes are now available on an instance of Vector2D, you see lots of internal methods, but also __init__, etc:

(more details about some of the special methods follow soon)

dir(v1)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'add',
 'as_string',
 'distance_to_origin',
 'length',
 'x',
 'y']

We make more use of inheritance later when we discuss use cases of object oriented programming.

A few words about multiple base classes:¶

Python supports multiple base classes, but this comes at some risk, especially if base classes have overlapping method names. The order you list base classes also can make a difference.

My recommenation is to avoid multiple base classes unless you exactly know what you do.

Subclassing builtin classes¶

Using builtin classes or classes from external libraries for subclassing comes with the risk to break internal contracts between methods which are only known to the original developper.

Better use e.g. UserDict from the std library as base class!

7. More dunder (__magic__) methods¶

import math


class FancyVector2D(Vector2D):
    def __init__(self, x, y, name):

        # super() allows access to the function of same name in 
        # the base class. Here we call the already existing __init__
        # instead of setting self.x and self.y manually:
        super().__init__(x, y)
        
        self.name = name

    def __len__(self):
        """implements len(..) for a Point2D Object"""
        return 2

    def __str__(self):
        """to string conversion, when str(.) is called
        or implicitly within print:"""
        return "<FancyVector2D x={}, y={} name='{}'>".format(self.x, self.y, self.name)

    def __add__(self, other):
        """implements "self + other"""
        assert isinstance(other, FancyVector2D)
        new_name = "{} + {}".format(self.name, other.name)
        return FancyVector2D(self.x + other.x, self.y + other.y, new_name)

    def __getitem__(self, index):
        """implementes index access"""
        if index == 0 or index == 1:
            return (self.x, self.y)[index]
        # invalid index:
        raise IndexError()

    def __eq__(self, other):
        # implements "self == other"
        if not isinstance(other, FancyVector2D):
            # comparing makes only snse if other is also of type
            # FancyVector2D:
            return False
        return self.x == other.x and self.y == other.y


p = FancyVector2D(1, -1, "p")
q = FancyVector2D(2, 5, "q")
r = p + q    # same as p.__add__(q)
print("r.x=", r.x, "r.y=", r.y)
r.x= 3 r.y= 4
# same as r.__len__():
print("len(r)=", len(r))
len(r)= 2
# same as r.__str__():
u = str(r)
print("str(u)=", u)

# this also calls the __str__ method implicitely:
print("print(r):", r)
str(u)= <FancyVector2D x=3, y=4 name='p + q'>
print(r): <FancyVector2D x=3, y=4 name='p + q'>
# same as r.__getitem__(0):
print("r[0]=", r[0])
r[0]= 3
# same as p.__eq__(q):
print("p == q:", p == q)

# etc
print("r == r:", r == r)
print("p == 1:", p == 1)
p == q: False
r == r: True
p == 1: False

There are many other special functions for customizing your objects, see the reference documentation at https://docs.python.org/2/reference/datamodel.html#special-method-names and the blog post https://rszalski.github.io/magicmethods/

Exercise section 2¶

  1. Repeat the examples above.

  2. Extend FancyVector2D to implement a method scale_inplace which takes a float x and scales the attributes x and y internally, e.g to be used like

    
    v1 = FancVector2D(1.0, 2.0)
    v1.scale_inplace(2.0)
    assert v1.x == 1.0 and v1.y == 4.0
    
    
    
  3. Implement a method __mul__ which takes another vector and returns the dot product (scalar product) of both. __mul__ is called if you use v1 * v2. The scalar product of two v1 and v2 is defined as v1.x * v2.x + v1.y * v2. and is a floating point value (and not another Vector).

Optional exercise *¶

  1. Create a class ComplexNumber which inherits Vector2D and reimplements __mul__ for complex arithmethic. Reimplement __str__ to achieve output in the style of 1 + 2i.

Some object oriented programming techniques¶

Implementing classes can help in many situations. Whereas in the early 90s OO was considered as a silver bullet for most programming problems, opionns changed meanwhile among many programmers.

So when to implement classes ?

  • Data entities: address records, own datatypes
  • Offer a uniform interface to resources (like filehandles do, ...)
  • In case you implement functions and you can not get around sharing data between them, never use global variables, classes might be the solution.
  • classes can support code reuse and reusability.

1. Open/Closed Principle¶

In object-oriented programming, the open/closed principle states

software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification; that is, such an entity can allow its behaviour to be extended without modifying its source code.

— https://en.wikipedia.org/wiki/Open/closed_principle

As this sounds a bit abstract we will discuss this with a programming example:

class Adder:
    def accumulated_values(self, data):
        assert len(data) > 0
        result = data[0]
        for value in data[1:]:
            result = self.combine(result, value)

        return result

    def combine(self, value_1, value_2):
        return value_1 + value_2


class Multiplier(Adder):
    def combine(self, value_1, value_2):
        return value_1 * value_2


a = Adder()
m = Multiplier()

print(a.accumulated_values([2, 3, 4]))
print(m.accumulated_values([2, 3, 4]))
9
24

Here implementing the overall algorithm and a implementation detail in two different methods allows us to implement modified behaviour without the need to change anything in the base class.

2. Abstract base classes and the template method pattern¶

This approach looks similar to the previous example, the exception is that we make a clear difference between the base algorithm and specific implementations.

In this specific case we also call this the "template method pattern". A base class provides the template for an algorithm, and we fill out the missing pieces in a subclass.

Beyond that we use the abc module from the Python standard library to make sure that subclasses implement all missing methods:

from abc import ABC, abstractmethod
class Accumulator(ABC):
    @abstractmethod
    def start_value(self):
        pass

    @abstractmethod
    def combine(self, value_1, value_2):
        pass

    def accumulated_value(self, values):
        result = self.start_value()
        for value in values:
            result = self.combine(result, value)
        return result

What is new ?

  • if we specify the given base class ABC and if we decorate placeholder methods with abstractmethod, every derived subclass must implement the given methods.

This is then a clear specification what methods subclasses must implement, and includes a check if a subclass follows this specification.

Let's implement a subclass which forgot to implement start_value:

class Adder(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 + value_2

And this happens when we want to instantiate this class:

a = Adder()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[26], line 1
----> 1 a = Adder()

TypeError: Can't instantiate abstract class Adder with abstract method start_value
class Adder(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 + value_2

    def start_value(self):
        return 0


class Multiplier(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 * value_2

    def start_value(self):
        return 1


a = Adder()
print(a.accumulated_value([1, 2, 3]))

m = Multiplier()
print(m.accumulated_value([]))
6
1

What is the difference between the previous two approaches ?

  1. Both do the same.

  2. The first approach was implemented by a developer who thought "I need a adder, but maybe I will need variants in the futurer:.

  3. The second approach was implemented by a developer who had possible variants in his mind from the early beginning.

  4. The second approach is more common in frameworks: frameworks are libraries who provide "frames" which must be filled out by the user. Here the abstract base class is the frame.

We mixed here two concepts:

  • the template method pattern also works without abstract base classes, the idea is that you sub classes must provide the implementation details.

  • the abstract base class is a way to enforce a policy to ensure valid sub classes.

3. Strategy pattern¶

Instead of using inheritance, we can make code reusable by providing functionalties by using attributes.

class AccumulationStrategy(ABC):
    @abstractmethod
    def start_value(self):
        pass

    @abstractmethod
    def combine(self, value_1, value_2):
        pass


class Accumulator:
    def __init__(self, accumulation_strategy):
        assert isinstance(accumulation_strategy, AccumulationStrategy)
        self.strategy_impl = accumulation_strategy

    def accumulate(self, values):
        result = self.strategy_impl.start_value()
        for value in values:
            result = self.strategy_impl.combine(result, value)
        return result


class Adder(AccumulationStrategy):
    def start_value(self):
        return 0

    def combine(self, value_1, value_2):
        return value_1 + value_2


a = Accumulator(Adder())

print(a.accumulate([1, 2, 3]))
6

Why strategy pattern instead of subclassing ?

  • strategy pattern is more explicit in case you need multiple base classes.
  • in case you understand how multiple sub classes work, it is a matter of personal preference.
  • there is a stronger distinction in compiled languages.

Frameworks often offer different strategies, which can be composes when the base class is instantiated. If a framwork uses sub classing techniques all possible combinations must be provided within the framework.

Strategy pattern also works better in case we use a base class from a different library, thus subclassing can be risky if we do not know the details:

import string
from abc import ABC, abstractmethod


class Histogram:
    def __init__(self, processor, filter_):
        self.processor = processor
        self.filter_ = filter_
        self.histogram = {}

    def get(self):
        return self.histogram

    def calculate(self, values):
        histogram = {}
        for value in values:
            if not self.filter_.check(value):
                continue
            value = self.processor.process(value)
            if value not in histogram:
                histogram[value] = 0
            histogram[value] += 1

        self.histogram = histogram
        return histogram


class BaseProcessor(ABC):
    @abstractmethod
    def process(self, value):
        pass


class NoProcessor(BaseProcessor):
    def process(self, value):
        return value


class IgnoreCaseProcessor(BaseProcessor):
    def process(self, value):
        return value.upper()


class BaseFilter(ABC):
    @abstractmethod
    def check(self, value):
        pass


class NoFilter(BaseFilter):
    def check(self, value):
        return True


class WordFilter(BaseFilter):
    def check(self, value):
        return all(v in string.ascii_letters for v in value)

Here it is up to the user how to combine the different strategies.

Using the base class approach would require that the framework implements all eight variants as different classes incl. assigning names. This can lead to combinatorical overhead and also finding a good naming for all variants can be challenging.

h = Histogram(NoProcessor(), NoFilter())

h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'this': 2,
 'is': 2,
 'TRICKY': 1,
 '!!!': 1,
 'or': 1,
 'not': 1,
 'tricky': 1,
 '???': 1}
h = Histogram(IgnoreCaseProcessor(), WordFilter())

h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'THIS': 2, 'IS': 2, 'TRICKY': 2, 'OR': 1, 'NOT': 1}

4. Mixin inheritance pattern*¶

Another alternative to using inheritance is extending the base class objects by inclusion of extra behaviour, which may be re-usable accross different classes, or must be done if the base class code is not to be changed.

class DictPrettyPrintMixIn(ABC):
    @abstractmethod
    def _get_dict(self):
        pass

    def __str__(self):
        dict_ = self._get_dict()
        assert isinstance(dict_, dict)
        keys_width = max(len(str(key)) for key in dict_.keys())
        return "\n".join(
            "{}: {}".format(str(key).rjust(keys_width), value)
            for key, value in sorted(dict_.items())
        )

Sidenote: Access protection¶

Why _get_dict method starts with _?

Python does not support "private" and "protected" methods or attributes (e.g. like C++ or Java). Instead developer use names starting with a single _ to indicate methods / attributes which are not inteded for general use. Calling such a method comes with the risk that things do not work as documented.

Let's use the mixin to define our own pretty dictionary. In Python we use multiple inheritance to that end:

from collections import UserDict


class PrettyDict(UserDict, DictPrettyPrintMixIn):
    def _get_dict(self):
        return self.data


d = PrettyDict(
    [
        ((1, 2, 3), "a"),
        ((2, 3), "b"),
    ]
)
print(d)
(1, 2, 3): a
   (2, 3): b

Let's re-use the mixin in the histogram example:

class PrettyPrintHistogram(Histogram, DictPrettyPrintMixIn):
    def _get_dict(self):
        return self.get()

    def __str__(self):
        return "\n".join(
            [
                "PrettyPrintHistogram",
                "",
                super().__str__(),  # we'll get back to that line in a moment
            ]
        )

and compare how Histogram and PrettyPrintHistogram print-out:

h2 = PrettyPrintHistogram(IgnoreCaseProcessor(), WordFilter())
h2.calculate("this is TRICKY !!! or is this not tricky ???".split())

print(h)
print("\nvs.\n")
print(h2)
<__main__.Histogram object at 0x10421c310>

vs.

PrettyPrintHistogram

    IS: 2
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 2

Method resolution order (MRO)¶

When calling super().__str__() in the example above, why was __str__() called from DictPrettyPrintMixIn, and not for instance from Histogram, which inherits default __str__ implementation from the object class?

Python follows MRO, which in our case is:

PrettyPrintHistogram.mro()
[__main__.PrettyPrintHistogram,
 __main__.Histogram,
 __main__.DictPrettyPrintMixIn,
 abc.ABC,
 object]

In case of multiple inheritance Python resolves the methods by looking from left to right according to the order given in the inheritance declaration, but prioritising implementations found on "lower" levels of inheritance over the "higher" ones is not a rule (if your curious, see: https://www.python.org/download/releases/2.3/mro/ ).

Advice: be cautious when using multiple-inheritance - it may lead to surprising and hard bugs.

Possibly better, in our last example, is to use a safer explicit DictPrettyPrintMixIn.__str__(self) call instead of the implicit super().__str__() call, or use a different design pattern (strategy or a dynamic decorator).

5. Dynamic decorator pattern*¶

In contrast to the mixin inheritance pattern (sometimes called a static implementation of a decorator pattern), the OO dynamic decorator pattern solves problem of dynamical extension of class' functionality, i.e. it enables different behaviour for different instances of the class.

The main difference with respect to the strategy pattern is that code of the decorated objects is not changed.

The decorator pattern is implemented by wrapping instance of the original class (passed during instatiation), inheriting all the original behaviour and implementing it by forwarding (delegating) calls to the wrapped instance, whilst overriding or adding custom behaviour.

Note: The programming language-agnostic decorator pattern is not but has an analogy to Python decorators (a language feature), such as @lru_cache or @abstractmethod, which alow to dynamically modify functions (methods) or classes.

Here is another way of implementing a prety printing for histograms, using the dynamic decorator pattern:

# inherit from Histogram to perserve a decorated instance's original type
class DecoratedHistogramBase(Histogram):
    def __init__(self, histogram_obj):
        self.histogram_obj = histogram_obj

    def get(self):
        return self.histogram_obj.get()

    def calculate(self, values):
        return self.histogram_obj.calculate(values)


class PrettyPrintDecoratedHistogram(DecoratedHistogramBase):
    def __str__(self):
        return "\n".join(
            [
                "PrettyPrintDecoratedHistogram",
                "",
                str(PrettyDict(self.histogram_obj.get())),
            ]
        )


h = Histogram(IgnoreCaseProcessor(), WordFilter())
# decorate instance
h2 = PrettyPrintDecoratedHistogram(h)
assert isinstance(h2, Histogram)
# `calculate` either before or after decoration
h2.calculate("this is TRICKY !!! or is this not tricky ??? Tricky it is ...".split())

print(h2)
PrettyPrintDecoratedHistogram

    IS: 3
    IT: 1
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 3

A different decorator, that adds (text) plotting behaviour:

class TextPlotDecoratedHistogram(DecoratedHistogramBase):
    def text_plot(self):
        dict_ = self.get()
        keys_width = max(len(str(key)) for key in dict_.keys())
        for key, value in sorted(dict_.items()):
            print("{}: {}".format(str(key).rjust(keys_width), "*" * value))


# decorators should be able to be chained
h3 = TextPlotDecoratedHistogram(h2)
assert isinstance(h3, Histogram)
h3.text_plot()
# but decorating is not inheriting, e.g. __str__ call was not delegated..
print()
print(h3)
    IS: ***
    IT: *
   NOT: *
    OR: *
  THIS: **
TRICKY: ***

<__main__.TextPlotDecoratedHistogram object at 0x1043a7a60>

6. Observer pattern*¶

An example of a pattern that is not focused on altering behaviour.

It is used when objects, callled observers, need to notified about some status changes (events) without direct coupling to the status-changing environment.

Observers register for notifications to a notifier (also called subject). Notifications from notifier can be implemented either by directly calling pre-defined methods on observers, or by calling provided callback functions.

from abc import ABC, abstractmethod


class Notifier:
    def __init__(self):
        self._observers = {}

    def register_observer(self, observer, event_type):
        print("Notifier:", observer, "registered to observe", event_type)
        if event_type not in self._observers:
            self._observers[event_type] = []
        self._observers[event_type].append(observer)

    def event(self, event_type, message):
        print("Notifier:", event_type)
        for observer in self._observers.get(event_type, []):
            observer.notify(event_type, message)


class Observer(ABC):
    @abstractmethod
    def notify(self, event_type, message):
        pass


class Bartender(Observer):
    def __init__(self):
        print("Bartender: ready to serve.")

    def notify(self, event_type, message):
        # dispatch, based on an event type
        if event_type in (
            "someone_is_thirsty",
            "beer_is_empty",
        ):
            self.someone_is_thirsty(message)

    def someone_is_thirsty(self, who):
        print("Bartender: {}, how about some beer mate?".format(who))


notifier = Notifier()

# dynamically couple bartender to an event listener
bartender = Bartender()
notifier.register_observer(bartender, "someone_is_thirsty")

# whenever notifier sees an event (here, triggered by calling "event")
# it passes the message to all interested parties
notifier.event("someone_is_thirsty", "Uwe")
Bartender: ready to serve.
Notifier: <__main__.Bartender object at 0x1034927d0> registered to observe someone_is_thirsty
Notifier: someone_is_thirsty
Bartender: Uwe, how about some beer mate?

More patterns¶

They were originally collected and published in the 1994 book "Design Patterns" by Gang of Four; for a listing see:

  • https://www.oodesign.com

The design patterns from this book play a more important role in statically typed languages such as Java, C++ or C# than in dynamic languages.

Optional exercise*¶

Design and implement a class hierarchy for employees in a department. There are managers and subordinates. Among managers there are bosses and lower management members, and among subordinates there are staff members and students. Everyone except for bosses has exactly one supervisor - that must be one of the managers. Bosses have no supervisors. It should be possible to add employees to the department, however supervisor, if applicable, must be already in the department beforehand. Furthermore, each employee should be able to print nicely giving own name and surname, as well as department, position (boss, student etc), and a supervisor, if applicable.