Introduction to object oriented programming

We have already seen "objects": any piece of data is an object, whether it has a basic type like int or bool or a collection type like list. There are also objects which do not represent data, for example file handles.

1. "objects" have methods "attached". What a method does depends on the object's type.

Examples:

"abcde".upper()

[1, 2, 3].append(4)

with open("say_hi.txt") as fh:
    print(fh.readlines())
['hi\n', 'ho\n']

2. Some objects also have attached values, which we call attributes.

x = 0.5 + 1j
print(x.real, x.imag)
0.5 1.0
print(fh.closed)
True

3. Classes vs. objects / instances.

  • When we create objects (e.g. a list, a string), we create an instance of a given class.

  • You can also think of a class as a recipe for how instances (objects) of that class work.
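The relation between classes and instances can be inspected with the built-in functions type and isinstance. The following is a minimal sketch; the variable names are chosen only for illustration:

```python
# every value is an instance of some class
numbers = [1, 2, 3]
text = "abc"

print(type(numbers))              # <class 'list'>
print(type(text))                 # <class 'str'>

# isinstance checks whether an object is an instance of a class
print(isinstance(numbers, list))  # True
print(isinstance(text, list))     # False

# two instances of the same class are distinct objects
other_numbers = list(numbers)
print(other_numbers == numbers)   # True: same content
print(other_numbers is numbers)   # False: different objects
```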

This is a very minimalistic example class with one method:

class HiSayer:
    
    def greet(self, name):
        print("hi", name)           

So we use class followed by its name to start the declaration of a class.

Every other declaration related to this class follows indented.

Methods are declared like functions, but within a class declaration.

Just ignore the self for a moment:

# this is how we create the instance
x = HiSayer()

# now we call the greet method:
x.greet("retro")
hi retro

4. The magic self.

What is this self ? Let's print it:

class HiSayer:
    
    def greet(self, name):
        print("self is", self)
xxxx = HiSayer()
print("xxxx is", xxxx)
xxxx.greet("retro")
xxxx is <__main__.HiSayer object at 0x109feac88>
self is <__main__.HiSayer object at 0x109feac88>

The 0x.... you see is the memory address of the object in hexadecimal notation.

We see here: self is the same object as the xxxx in xxxx.greet("retro").

So Python "magically" inserts the current object as self.

We need this because a method is defined at class level and is not aware of the current instance. By injecting xxxx as self, the method knows which object it is operating on.
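The "magic" can also be written out by hand: calling a method on an instance is equivalent to calling the function stored on the class and passing the instance as first argument. A small sketch, in which greet returns the text instead of printing it so that both calls can be compared:

```python
class HiSayer:

    def greet(self, name):
        return "hi {}".format(name)


x = HiSayer()

# the usual call: Python passes x as self
print(x.greet("retro"))           # hi retro

# the same call written out explicitly via the class
print(HiSayer.greet(x, "retro"))  # hi retro
```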

5. How can we pass arguments when creating an object ?

Many classes accept arguments when we create an instance, e.g. list

li = list((1, 2, 3))
print(type(li), li)
<class 'list'> [1, 2, 3]

To handle this we must implement a special "initializer" method named __init__:

import math


class Point2D:
    
    def __init__(self, x0, y0):
        self.x = x0     # set attribute x
        self.y = y0     # set attribute y
        
    def distance_to_origin(self):
        """method which computes the distance of the point to the origin"""
        return math.hypot(self.x, self.y)
    
    def as_string(self):
        return "<Point2D x={}, y={}>".format(self.x, self.y)
    
    
# create an instance, this creates the object and
# calls __init__ with args 1 and -1:
p = Point2D(1, -1)
print(p.as_string())

# now we can access attributes
print(p.x, p.y)

# and call method
print(p.distance_to_origin())
<Point2D x=1, y=-1>
1 -1
1.4142135623730951

Comment:

  • names starting and ending with double underscores relate to Python internals.
  • methods following this notation are also called dunder (double underscore) or special methods.

Exercise section 1

  • reproduce the examples from above
  • implement a class JazzRecord which can be initialized with artist name, record name and release year, e.g. record = JazzRecord("John Coltrane", "Giant Steps", 1960).
  • implement a method as_string which returns a string representation like Giant Steps (by John Coltrane, 1960).

Why / when implement objects ?

  • create custom datatypes
  • implement data entities
  • data encapsulation
  • hide implementation details from user of a class

  • model state: if you have a group of functions with long argument lists, or you think they need global variables to "communicate", classes might be a solution
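As an illustration of the last point, here is a minimal sketch of a class which keeps state that would otherwise end up in global variables. The class RunningMean and its method names are hypothetical and only serve as an example:

```python
class RunningMean:
    """Keeps the state which would otherwise live in global variables."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def mean(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


# two independent instances do not interfere with each other
temperatures = RunningMean()
pressures = RunningMean()

for value in (20.0, 22.0, 24.0):
    temperatures.add(value)
pressures.add(1000.0)

print(temperatures.mean())   # 22.0
print(pressures.mean())      # 1000.0
```

With global variables we would need one set of variables per quantity; here each instance carries its own count and total.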

Using inheritance (next chapters) classes also help to:

  • avoid code duplication
  • write clearer and more maintainable code

6. Inheritance

Inheritance is a mechanism which in principle reuses code from an existing class.

This mechanism is very versatile and the core of many object oriented programming techniques.

To declare a class X which inherits methods from a class Y, we use the declaration

class X(Y):
     ...

We also say Y is a base class of X, X derives from Y or X inherits from Y.

In the following example we implement a class Vector2D

  • which inherits all methods from Point2D, including __init__.
  • adds two new methods add and length
  • but overrides as_string.

This is now the code:

class Vector2D(Point2D):
        
    def add(self, other):
        """ returns a new Vector2D which is the sum of self and other """
        return Vector2D(self.x + other.x, self.y + other.y)
    
    def length(self):
        return self.distance_to_origin()
    
    def as_string(self):
        return "<Vector2D x={}, y={}>".format(self.x, self.y)


# create instance
v1 = Vector2D(1.0, 2.0)

# access attributes
print(v1.x, v1.y)

print(v1.as_string())

# length is defined in Vector2D and calls distance_to_origin from the base class:
print(v1.length())

# add is defined in Vector2D and returns a new Vector2D, see result:
print(v1.add(v1).as_string())
1.0 2.0
<Vector2D x=1.0, y=2.0>
2.23606797749979
<Vector2D x=2.0, y=4.0>

Let's check which methods and attributes are now available on an instance of Vector2D. You see lots of internal methods, but also __init__, add, as_string, etc.:

(more details about some of the special methods follow soon)

dir(v1)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'add',
 'as_string',
 'distance_to_origin',
 'length',
 'x',
 'y']

We make more use of inheritance later when we discuss use cases of object oriented programming.
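Inheritance also establishes an "is a" relation between classes, which can be checked with isinstance and issubclass. The sketch below uses stripped-down stand-ins for Point2D and Vector2D so that it runs on its own:

```python
class Point2D:

    def __init__(self, x0, y0):
        self.x = x0
        self.y = y0


class Vector2D(Point2D):
    pass


v = Vector2D(1.0, 2.0)
p = Point2D(1.0, 2.0)

# a Vector2D is also a Point2D, but not the other way round
print(isinstance(v, Vector2D), isinstance(v, Point2D))   # True True
print(isinstance(p, Vector2D))                           # False
print(issubclass(Vector2D, Point2D))                     # True

# the inherited __init__ was used to set the attributes
print(v.x, v.y)                                          # 1.0 2.0
```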

A few words about multiple base classes:

Python supports multiple base classes, but this comes at some risk, especially if base classes have overlapping method names. The order in which you list the base classes can also make a difference.

My recommendation is to avoid multiple base classes unless you know exactly what you are doing.
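A minimal sketch of how the order of base classes matters when method names overlap. All class names here are made up for illustration:

```python
class Walker:
    def move(self):
        return "walking"


class Swimmer:
    def move(self):
        return "swimming"


# both base classes define move(); the one listed first wins
class Duck(Walker, Swimmer):
    pass


class Frog(Swimmer, Walker):
    pass


print(Duck().move())   # walking
print(Frog().move())   # swimming
```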

Subclassing builtin classes

Subclassing builtin classes or classes from external libraries comes with the risk of breaking internal contracts between methods which are only known to the original developer.
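A well-known instance of this risk is subclassing dict. The sketch below shows behaviour as observed in CPython: the builtin update method does not go through an overridden __setitem__. The class names are hypothetical. collections.UserDict from the standard library is designed for subclassing and routes update through __setitem__:

```python
from collections import UserDict


class UpperKeyDict(dict):
    """Hypothetical dict subclass that wants upper-case keys."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


d = UpperKeyDict()
d["a"] = 1            # goes through our __setitem__
d.update({"b": 2})    # dict.update does not call our __setitem__
print(d)              # {'A': 1, 'b': 2}


class UpperKeyUserDict(UserDict):
    """Same idea based on UserDict, which is designed for subclassing."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


u = UpperKeyUserDict()
u["a"] = 1
u.update({"b": 2})    # here update is routed through __setitem__
print(u)              # {'A': 1, 'B': 2}
```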

7. More dunder (__magic__) methods

import math


class FancyVector2D(Vector2D):
    
    def __init__(self, x, y, name):
        
        # call method of same name in base class
        super().__init__(x, y)
        self.name = name

    def __len__(self):
        # implements len(..) for a FancyVector2D object
        return 2
    
    def __str__(self):
        # to string conversion, when str(.) is called
        # or implicitly within print:
        return "<FancyVector2D x={}, y={} name='{}'>".format(self.x, self.y, self.name)
    
    def __add__(self, other):
        # implements "self + other"
        assert isinstance(other, FancyVector2D)
        new_name = "{} + {}".format(self.name, other.name)
        return FancyVector2D(self.x + other.x, self.y + other.y, new_name)
    
    def __getitem__(self, index):
        # square bracket access
        if 0 <= index < 2:
            return (self.x, self.y)[index]
        raise IndexError()
        
    def __eq__(self, other):
        # implements "self == other"
        if type(self) != type(other):
            return False
        return self.x == other.x and self.y == other.y


p = FancyVector2D(1, -1, "p")
q = FancyVector2D(2, 5, "q")

# call __add__
r = p + q
print("r.x=", r.x, "r.y=", r.y)


# calls __len__
print("len(r)=", len(r))

# calls __str__ method:
u = str(r)
print("str(u)=", u)

# this also calls the __str__ method implicitly:
print("print(r):", r)



# calls __getitem__
print("r[0]=", r[0])

# calls __eq__
print("p == q:", p == q)
print("r == r:", r == r)
print("p == 1:", p == 1)
r.x= 3 r.y= 4
len(r)= 2
str(u)= <FancyVector2D x=3, y=4 name='p + q'>
print(r): <FancyVector2D x=3, y=4 name='p + q'>
r[0]= 3
p == q: False
r == r: True
p == 1: False

There are many other special methods for customizing your objects, see the reference documentation at https://docs.python.org/3/reference/datamodel.html#special-method-names and the blog post https://rszalski.github.io/magicmethods/
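Two further special methods which often go together with __eq__ are __repr__ and __hash__. The following sketch uses a hypothetical minimal class Point. Note that a class which defines __eq__ without defining __hash__ is not hashable, so its instances cannot be put into sets or used as dictionary keys:

```python
class Point:
    """Hypothetical minimal class to show __repr__, __eq__ and __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # used by repr(..), in the interactive prompt and inside containers
        return "Point({!r}, {!r})".format(self.x, self.y)

    def __eq__(self, other):
        if type(self) != type(other):
            return False
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # a class which defines __eq__ without __hash__ is not hashable,
        # so we provide a hash consistent with __eq__
        return hash((self.x, self.y))


points = [Point(1, 2), Point(1, 2), Point(3, 4)]
print(points)            # [Point(1, 2), Point(1, 2), Point(3, 4)]
print(len(set(points)))  # 2, because equal points collapse in a set
```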

Exercise section 2

  1. Repeat the examples above.
  2. Extend FancyVector2D to implement a method scale which takes a float factor and scales the attributes x and y in place.
  3. Implement a method __mul__ which takes another vector and returns the dot product (scalar product) of both. __mul__ is called if you use v1 * v2.
  4. Create a class ComplexNumber which inherits from Vector2D and reimplements __mul__ for complex arithmetic. Reimplement __str__ to achieve output in the style of 1 + 2i.

Some object oriented programming techniques

Implementing classes can help in many situations. Whereas in the early 90s OO was considered a silver bullet for most programming problems, many programmers have changed their opinion since then.

So when to implement classes ?

  • Data entities: address records, own datatypes
  • Offer a uniform interface to resources (like filehandles do, ...)
  • If you implement functions and can not get around sharing data between them, do not use global variables; classes might be the solution.
  • Classes can support code reuse.

1. Open/Closed Principle

In object-oriented programming, the open/closed principle states

software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification; that is, such an entity can allow its behaviour to be extended without modifying its source code.

https://en.wikipedia.org/wiki/Open/closed_principle

As this sounds a bit abstract we will discuss this with a programming example:

class Adder:
    
    def accumulated_values(self, data):
        assert len(data) > 0
        result = data[0]
        for value in data[1:]:
            result = self.combine(result, value)
            
        return result
    
    def combine(self, value_1, value_2):
        return value_1 + value_2
    
    
class Multiplier(Adder):
    
    def combine(self, value_1, value_2):
        return value_1 * value_2
    
    
a = Adder()
m = Multiplier()

print(a.accumulated_values([2, 3, 4]))
print(m.accumulated_values([2, 3, 4]))
9
24

Here, implementing the overall algorithm and an implementation detail in two different methods allows us to modify the behaviour without changing anything in the base class.
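To underline the "open for extension" part, here is a sketch of a further variant. Maximizer is a hypothetical addition and not part of the example above; Adder is repeated unchanged so that the snippet runs on its own:

```python
class Adder:

    def accumulated_values(self, data):
        assert len(data) > 0
        result = data[0]
        for value in data[1:]:
            result = self.combine(result, value)
        return result

    def combine(self, value_1, value_2):
        return value_1 + value_2


# a new variant: only combine is replaced, Adder itself stays untouched
class Maximizer(Adder):

    def combine(self, value_1, value_2):
        return max(value_1, value_2)


print(Maximizer().accumulated_values([2, 7, 4]))   # 7
print(Adder().accumulated_values([2, 7, 4]))       # 13
```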

2. Abstract base classes and the template method pattern

This approach looks similar to the previous example; the difference is that we clearly separate the base algorithm from the specific implementations.

In this specific case we also call this the "template method pattern". A base class provides the template for an algorithm, and we fill out the missing pieces in a subclass.

Beyond that we use the abc module from the Python standard library to make sure that subclasses implement all missing methods:

from abc import ABC, abstractmethod
class Accumulator(ABC):
    
    @abstractmethod
    def start_value(self):
        pass
    
    @abstractmethod
    def combine(self, value_1, value_2):
        pass
    
    def accumulated_value(self, values):
        result = self.start_value()
        for value in values:
            result = self.combine(result, value)
        return result

What is new ?

  • if we derive from the base class ABC and decorate placeholder methods with abstractmethod, every subclass must implement these methods before it can be instantiated.

This is then a clear specification of which methods subclasses must implement, and includes a check whether a subclass follows this specification.

Let's implement a subclass which forgot to implement start_value:

class Adder(Accumulator):
    
    def combine(self, value_1, value_2):
        return value_1 + value_2

And this happens when we want to instantiate this class:

a = Adder()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-116-c86d63cc3734> in <module>
----> 1 a = Adder()

TypeError: Can't instantiate abstract class Adder with abstract methods start_value

If both subclasses implement all abstract methods, instantiation works:

class Adder(Accumulator):
    
    def combine(self, value_1, value_2):
        return value_1 + value_2
    
    def start_value(self):
        return 0
    
class Multiplier(Accumulator):
    
    def combine(self, value_1, value_2):
        return value_1 * value_2
    
    def start_value(self):
        return 1

a = Adder()
print(a.accumulated_value([1, 2, 3]))

m = Multiplier()
print(m.accumulated_value([]))
6
1

What is the difference between the previous two approaches ?

  1. Both compute the same results for non-empty input. The first approach requires at least one value, the second one also handles empty input by returning the start value.

  2. The first approach was implemented by a developer who thought "I need an adder, but maybe I will need variants in the future".

  3. The second approach was implemented by a developer who had possible variants in mind from the very beginning.

  4. The second approach is more common in frameworks: frameworks are libraries which provide "frames" which must be filled out by the user. Here the abstract base class is the frame.

We mixed here two concepts:

  • the template method pattern also works without abstract base classes, the idea is that subclasses must provide the implementation details.

  • the abstract base class is a way to enforce a policy to ensure valid sub classes.
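To separate the two concepts, the following sketch implements the template method pattern without the abc module. Raising NotImplementedError in the placeholder methods is a common convention and an assumption of this sketch, not something the abc-based version needs. IncompleteAdder is a hypothetical subclass which forgot start_value:

```python
class Accumulator:

    def start_value(self):
        raise NotImplementedError("subclasses must implement start_value")

    def combine(self, value_1, value_2):
        raise NotImplementedError("subclasses must implement combine")

    def accumulated_value(self, values):
        # the template: the overall algorithm is fixed here
        result = self.start_value()
        for value in values:
            result = self.combine(result, value)
        return result


class IncompleteAdder(Accumulator):

    def combine(self, value_1, value_2):
        return value_1 + value_2


# without ABC the instantiation succeeds ...
a = IncompleteAdder()

# ... and the missing method is only detected when it is called
try:
    a.accumulated_value([1, 2, 3])
except NotImplementedError as e:
    print("error:", e)   # error: subclasses must implement start_value
```

The pattern still works, but the check happens late: the error shows up when the missing method is called, not when the object is created.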

3. Strategy pattern

Instead of using inheritance, we can make code reusable by providing functionality through attributes: the object delegates parts of its work to another object which it holds as an attribute.

class AccumulationStrategy(ABC):
           
    @abstractmethod
    def start_value(self):
        pass
    
    @abstractmethod
    def combine(self, value_1, value_2):
        pass


class Accumulator:
    
    def __init__(self, accumulation_strategy):
        assert isinstance(accumulation_strategy, AccumulationStrategy)
        self.strategy_impl = accumulation_strategy
        
    def accumulate(self, values):
        result = self.strategy_impl.start_value()
        for value in values:
            result = self.strategy_impl.combine(result, value)
        return result
    

class Adder(AccumulationStrategy):
    
    def start_value(self):
        return 0
    
    def combine(self, value_1, value_2):
        return value_1 + value_2
    
    
a = Accumulator(Adder())

print(a.accumulate([1, 2, 3]))
6
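Since functions are objects too, a strategy does not have to be a class. The following sketch is a lightweight variation of the example above, not a replacement for it: the Accumulator takes a callable and a start value directly. The parameter names combine and start_value are chosen to mirror the methods above:

```python
import operator


class Accumulator:

    def __init__(self, combine, start_value):
        # combine is any callable taking two values
        self.combine = combine
        self.start_value = start_value

    def accumulate(self, values):
        result = self.start_value
        for value in values:
            result = self.combine(result, value)
        return result


adder = Accumulator(operator.add, 0)
multiplier = Accumulator(operator.mul, 1)
maximizer = Accumulator(max, float("-inf"))

print(adder.accumulate([1, 2, 3]))        # 6
print(multiplier.accumulate([2, 3, 4]))   # 24
print(maximizer.accumulate([2, 7, 4]))    # 7
```

The class-based version has the advantage that the abstract base class documents and checks the required interface.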

Why strategy pattern instead of subclassing ?

  • strategy pattern is more explicit in case you would otherwise need multiple base classes.
  • in case you understand how multiple base classes work, it is a matter of personal preference.
  • there is a stronger distinction in compiled languages.

Frameworks often offer different strategies, which can be composed when the base class is instantiated. If a framework uses subclassing techniques, all possible combinations must be provided within the framework.

The strategy pattern also works better in case we use a base class from a different library, since subclassing can be risky if we do not know the details:

import string
from abc import ABC, abstractmethod


class Histogram:
    
    def __init__(self, processor, filter_):
        self.processor = processor
        self.filter_ = filter_
        self.histogram = {}

    def get(self):
        return self.histogram

    def calculate(self, values):
        histogram = {}
        for value in values:
            if not self.filter_.check(value):
                continue
            value = self.processor.process(value)
            if value not in histogram:
                histogram[value] = 0
            histogram[value] += 1
        
        self.histogram = histogram
        return histogram


class BaseProcessor(ABC):

    @abstractmethod
    def process(self, value):
        pass

class NoProcessor(BaseProcessor):
    
    def process(self, value):
        return value
    
class IgnoreCaseProcessor(BaseProcessor):
    
    def process(self, value):
        return value.upper()


class BaseFilter(ABC):

    @abstractmethod
    def check(self, value):
        pass

class NoFilter(BaseFilter):
    
    def check(self, value):
        return True
    
class WordFilter(BaseFilter):
    
    def check(self, value):
        return all(v in string.ascii_letters for v in value)

Here it is up to the user how to combine the different strategies.

Using the base class approach would require that the framework implements all four variants (two processors times two filters) as different classes, including assigning names. This can lead to combinatorial overhead, and finding good names for all variants can be challenging.

h = Histogram(NoProcessor(), NoFilter())

h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'this': 2,
 'is': 2,
 'TRICKY': 1,
 '!!!': 1,
 'or': 1,
 'not': 1,
 'tricky': 1,
 '???': 1}
h = Histogram(IgnoreCaseProcessor(), WordFilter())

h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'THIS': 2, 'IS': 2, 'TRICKY': 2, 'OR': 1, 'NOT': 1}

4. Mixin inheritance pattern

Another alternative is to extend classes by including extra behaviour from a so-called mixin class. This is useful if the behaviour is reusable across different classes, or if the base class code must not be changed.

class DictPrettyPrintMixIn(ABC):
    
    @abstractmethod
    def _get_dict(self):
        pass
    
    def __str__(self):
        dict_ = self._get_dict()
        assert isinstance(dict_, dict)
        # default=0 avoids a ValueError for an empty dictionary
        keys_width = max((len(str(key)) for key in dict_), default=0)
        return '\n'.join(
            "{}: {}".format(str(key).rjust(keys_width), value)
            for key, value in sorted(dict_.items())
        )

Sidenote: Access protection

Why does the method name _get_dict start with _?

Python does not support "private" and "protected" methods or attributes (as e.g. C++ or Java do). Instead developers use names starting with a single _ to indicate methods / attributes which are not intended for general use. Calling such a method comes with the risk that things do not work as documented.

Let's use the mixin to define our own pretty dictionary. In Python we use multiple inheritance for this:
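A short sketch of this convention. Account is a hypothetical example class:

```python
class Account:
    """Hypothetical example class."""

    def __init__(self, balance):
        # leading underscore: internal, may change without notice
        self._balance = balance

    def balance(self):
        # public method: the documented way to read the value
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount


account = Account(100)
account.deposit(50)
print(account.balance())   # 150

# Python does not prevent access, the underscore is only a convention
print(account._balance)    # 150
```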

from collections import UserDict

class PrettyDict(UserDict, DictPrettyPrintMixIn):

    def _get_dict(self):
        return self.data

d = PrettyDict([
    ((1,2,3), 'a'),
    ((2,3), 'b'),
])
print(d)
(1, 2, 3): a
   (2, 3): b

Let's re-use the mixin in the histogram example:

class PrettyPrintHistogram(Histogram, DictPrettyPrintMixIn):
    
    def _get_dict(self):
        return self.get()
    
    def __str__(self):
        return '\n'.join([
            'PrettyPrintHistogram',
            '',
            super().__str__(), # we'll get back to that line in a moment
        ])

and compare how Histogram and PrettyPrintHistogram print-out:

h2 = PrettyPrintHistogram(IgnoreCaseProcessor(), WordFilter())
h2.calculate("this is TRICKY !!! or is this not tricky ???".split())

print(h)
print("\nvs.\n")
print(h2)
<__main__.Histogram object at 0x109b003c8>

vs.

PrettyPrintHistogram

    IS: 2
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 2

Method resolution order (MRO)

When calling super().__str__() in the example above, why was __str__() called from DictPrettyPrintMixIn, and not for instance from Histogram, which inherits the default __str__ implementation from the object class?

Python follows MRO, which in our case is:

PrettyPrintHistogram.mro()
[__main__.PrettyPrintHistogram,
 __main__.Histogram,
 __main__.DictPrettyPrintMixIn,
 abc.ABC,
 object]

In case of multiple inheritance Python resolves a method by walking along the MRO and using the first class which defines it. The MRO keeps the left-to-right order of the base classes given in the inheritance declaration, and a class always comes before its own base classes. Histogram comes before DictPrettyPrintMixIn, but it does not define __str__ itself: the default implementation lives in object, which is last in the MRO. So the first __str__ found after PrettyPrintHistogram is the one from DictPrettyPrintMixIn (if you're curious about the details, see: https://www.python.org/download/releases/2.3/mro/ ).

Advice: be cautious when using multiple inheritance, it may lead to surprising and hard-to-find bugs.

Possibly better in our last example is to use a safer, explicit DictPrettyPrintMixIn.__str__(self) call instead of the implicit super().__str__() call, or to use a different design pattern (strategy or a dynamic decorator).
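One source of surprise is that super() does not mean "my base class" but "the next class in the MRO of the current instance". The following sketch with hypothetical classes Base, Left, Right and Child shows that the same super() call in Left ends up in different classes depending on the instance:

```python
class Base:
    def describe(self):
        return ["Base"]


class Left(Base):
    def describe(self):
        return ["Left"] + super().describe()


class Right(Base):
    def describe(self):
        return ["Right"] + super().describe()


class Child(Left, Right):
    def describe(self):
        return ["Child"] + super().describe()


print([cls.__name__ for cls in Child.mro()])
# ['Child', 'Left', 'Right', 'Base', 'object']

# super() in Left continues with Right, although Left does not inherit from Right
print(Child().describe())   # ['Child', 'Left', 'Right', 'Base']

# for a plain Left instance the same super() call goes directly to Base
print(Left().describe())    # ['Left', 'Base']
```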

5. Dynamic decorator pattern

In contrast to the mixin inheritance pattern (sometimes called a static implementation of a decorator pattern), the OO dynamic decorator pattern solves the problem of extending the functionality of a class dynamically, i.e. it enables different behaviour for different instances of the class.

The main difference with respect to the strategy pattern is that the code of the decorated class does not need to be changed.

The decorator pattern is implemented by wrapping an instance of the original class (passed during instantiation), inheriting all the original behaviour and implementing it by forwarding (delegating) calls to the wrapped instance, whilst overriding or adding custom behaviour.

Note: The language-agnostic decorator pattern is not the same as Python decorators (a language feature) such as @lru_cache or @abstractmethod, but there is an analogy: Python decorators allow us to dynamically modify functions (methods) or classes.
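For comparison, this is a minimal sketch of a Python function decorator. The decorator shout is a made-up example; functools.wraps from the standard library preserves the name and docstring of the wrapped function:

```python
import functools


def shout(function):
    """Hypothetical decorator: upper-cases the result of the wrapped function."""

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        # forward the call, then modify the result
        return function(*args, **kwargs).upper()

    return wrapper


@shout
def greet(name):
    return "hi {}".format(name)


print(greet("retro"))    # HI RETRO
print(greet.__name__)    # greet (preserved by functools.wraps)
```

The analogy to the decorator pattern is the wrapping: the wrapper forwards the call to the original function and adds behaviour around it.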

Here is another way of implementing pretty printing for histograms, using the dynamic decorator pattern:

# inherit from Histogram to preserve a decorated instance's original type
class DecoratedHistogramBase(Histogram):

    def __init__(self, histogram_obj):
        self.histogram_obj = histogram_obj

    def get(self):
        return self.histogram_obj.get()

    def calculate(self, values):
        return self.histogram_obj.calculate(values)

class PrettyPrintDecoratedHistogram(DecoratedHistogramBase):

    def __str__(self):
        return '\n'.join([
            'PrettyPrintDecoratedHistogram',
            '',
            str(PrettyDict(self.histogram_obj.get())),
        ])

h = Histogram(IgnoreCaseProcessor(), WordFilter())
# decorate instance
h2 = PrettyPrintDecoratedHistogram(h)
assert isinstance(h2, Histogram)
# `calculate` either before or after decoration
h2.calculate("this is TRICKY !!! or is this not tricky ??? Tricky it is ...".split())

print(h2)
PrettyPrintDecoratedHistogram

    IS: 3
    IT: 1
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 3

A different decorator, that adds (text) plotting behaviour:

class TextPlotDecoratedHistogram(DecoratedHistogramBase):

    def text_plot(self):
        dict_ = self.get()
        # default=0 avoids a ValueError for an empty histogram
        keys_width = max((len(str(key)) for key in dict_), default=0)
        for key, value in sorted(dict_.items()):
            print("{}: {}".format(
                str(key).rjust(keys_width), "*" * value
            ))


# decorators should be able to be chained
h3 = TextPlotDecoratedHistogram(h2)
assert isinstance(h3, Histogram)
h3.text_plot()
# but decorating is not inheriting, e.g. __str__ call was not delegated..
print()
print(h3)
    IS: ***
    IT: *
   NOT: *
    OR: *
  THIS: **
TRICKY: ***

<__main__.TextPlotDecoratedHistogram object at 0x109ea80f0>
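Writing one forwarding method per method of the wrapped class can become tedious. A possible alternative, sketched here under the assumption that we do not need the isinstance relation to Histogram, is to forward unknown attributes generically with __getattr__. The classes DelegatingDecorator and TotalCountDecorator are hypothetical, and Histogram is a stripped-down stand-in so that the snippet runs on its own. Special methods like __str__ are looked up on the class and bypass __getattr__, so they still must be forwarded explicitly:

```python
class Histogram:
    """Stripped-down stand-in for the Histogram class above."""

    def __init__(self):
        self.histogram = {}

    def get(self):
        return self.histogram

    def calculate(self, values):
        self.histogram = {}
        for value in values:
            self.histogram[value] = self.histogram.get(value, 0) + 1
        return self.histogram

    def __str__(self):
        return "Histogram with {} keys".format(len(self.histogram))


class DelegatingDecorator:
    """Hypothetical generic decorator base using __getattr__."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # only called if normal lookup fails: forward to the wrapped object
        return getattr(self._wrapped, name)

    def __str__(self):
        # special methods are looked up on the class, not via __getattr__,
        # so they must be forwarded explicitly
        return str(self._wrapped)


class TotalCountDecorator(DelegatingDecorator):

    def total(self):
        return sum(self.get().values())


# decorators can be chained
h = TotalCountDecorator(DelegatingDecorator(Histogram()))
h.calculate("a b a c".split())
print(h.total())   # 4
print(h)           # Histogram with 3 keys
```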

6. Observer pattern

An example of a pattern that is not focused on altering behaviour.

It is used when objects, called observers, need to be notified about some status changes (events) without direct coupling to the status-changing environment.

Observers register for notifications with a notifier (also called subject). Notifications from the notifier can be implemented either by directly calling pre-defined methods on observers, or by calling provided callback functions.

from abc import ABC, abstractmethod

class Notifier:

    def __init__(self):
        self._observers = {}
    
    def register_observer(self, event_type, observer):
        if event_type not in self._observers:
            self._observers[event_type] = []
        self._observers[event_type].append(observer)
    
    def event(self, event_type, message):
        for observer in self._observers.get(event_type, []):
            observer.notify(event_type, message)

class Observer(ABC):

    @abstractmethod
    def notify(self, event_type, message):
        pass
            
class Bartender(Observer):

    def __init__(self):
        print("Bartender ready to serve.")
        
    def notify(self, event_type, message):
        # dispatch, based on an event type
        if event_type in ('someone_is_thirsty', 'beer_is_empty',):
            self.someone_is_thirsty(message)

    def someone_is_thirsty(self, who):
        print("{}, how about some beer mate?".format(who))


notifier = Notifier()

# dynamically register the bartender as an observer of one event type
bartender = Bartender()
notifier.register_observer('someone_is_thirsty',  bartender)

# later only the notifier is told about the event,
# it passes the message on to all registered observers
notifier.event('someone_is_thirsty', 'Uwe')
Bartender ready to serve.
Uwe, how about some beer mate?
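The example above notifies observers by calling a pre-defined method. The second option mentioned before, callback functions, can be sketched as follows. CallbackNotifier, register_callback and offer_beer are hypothetical names chosen for this sketch:

```python
class CallbackNotifier:
    """Hypothetical variant of Notifier which stores callables."""

    def __init__(self):
        self._callbacks = {}

    def register_callback(self, event_type, callback):
        self._callbacks.setdefault(event_type, []).append(callback)

    def event(self, event_type, message):
        for callback in self._callbacks.get(event_type, []):
            callback(message)


def offer_beer(who):
    print("{}, how about some beer mate?".format(who))


log = []

notifier = CallbackNotifier()
notifier.register_callback("someone_is_thirsty", offer_beer)
# any callable works, for example a bound method of a list
notifier.register_callback("someone_is_thirsty", log.append)

notifier.event("someone_is_thirsty", "Uwe")
notifier.event("unknown_event", "nobody listens")   # no callbacks registered

print(log)   # ['Uwe']
```

With callbacks the observers do not need a common base class; the price is that the expected signature of the callback is only documented, not checked.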

Optional exercise

Design and implement a class hierarchy for employees in a department. There are managers and subordinates. Among managers there are bosses and lower management members, and among subordinates there are staff members and students. Everyone except for bosses has exactly one supervisor, who must be one of the managers. Bosses have no supervisors. It should be possible to add employees to the department; however, the supervisor, if applicable, must already be in the department. Furthermore, each employee should be able to print a nice description giving their own name and surname, as well as department, position (boss, student etc.), and supervisor, if applicable.