We have seen "objects" already: any piece of data, whether of a basic type like int or bool or of a collection type like list, is an object. There are also objects that do not represent a data type, for example file handles.
Examples:
"abcde".upper()
[1, 2, 3].append(4)
with open("say_hi.txt") as fh:
    print(fh.readlines())
['hi\n', 'ho\n']
x = 0.5 + 1j
print(x.real, x.imag)
0.5 1.0
print(fh.closed)
True
When we create objects (e.g. a list, a string), we create an instance of a given class.
You can also interpret a class as a recipe that describes how instances (objects) of that class work.
This is a very minimalistic example class with one method:
class HiSayer:
    def greet(self, name):
        print("hi", name)
So we use class followed by its name to start the declaration of a class.
Every other declaration related to this class follows indented.
Methods are declared like functions, but within a class declaration.
Just ignore the self for a moment:
# this is how we create the instance
x = HiSayer()
# now we call the greet method:
x.greet("retro")
hi retro
What is this self? Let's print it:
class HiSayer:
    def greet(self, name):
        print("self is", self)
xxxx = HiSayer()
print("xxxx is", xxxx)
xxxx.greet("retro")
xxxx is <__main__.HiSayer object at 0x10351d840>
self is <__main__.HiSayer object at 0x10351d840>
The 0x... you see is the storage location of the object in hexadecimal notation.
We see here: self is the same object as the xxxx in xxxx.greet("retro").
So Python "magically" inserts the current object as self.
We need this because a method is defined at class level, which is not aware of the current instance. By injecting xxxx as self, the method knows which object it is operating on.
The name self is a convention, and one could use other names instead.
The "magic" insertion works as follows: Python translates object.method(...) to class_of_object.method(object, ...), e.g. in the example above xxxx.greet("retro") is translated to HiSayer.greet(xxxx, "retro").
# this works also, but it is usually not used this way:
HiSayer.greet(xxxx, "retro")
self is <__main__.HiSayer object at 0x10351d840>
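The translation described above can be checked with a small sketch. The class Counter and its method increment are hypothetical names chosen for illustration; all three calls below should end up in the same method with the same instance as self.

```python
class Counter:
    """Hypothetical example class, not part of the text above."""

    def __init__(self):
        self.count = 0

    def increment(self, step):
        # self is whatever instance the method was called on
        self.count += step
        return self.count


c = Counter()
c.increment(2)           # usual form: Python passes c as self
Counter.increment(c, 3)  # explicit form: we pass the instance ourselves
type(c).increment(c, 5)  # same, but the class is looked up from the instance
print(c.count)
```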
Many classes accept arguments when we create an instance, e.g. list
li = list((1, 2, 3))
print(type(li), li)
<class 'list'> [1, 2, 3]
To support this in our own classes, we must implement a special "initializer" method named __init__:
import math


class Point2D:
    def __init__(self, x0, y0):
        self.x = x0  # set attribute x
        self.y = y0  # set attribute y

    def distance_to_origin(self):
        """method which computes the distance of the point to the origin"""
        return math.hypot(self.x, self.y)

    def as_string(self):
        return "<Point2D x={}, y={}>".format(self.x, self.y)
# create an instance, this creates the object and
# calls __init__ with args 1 and -1
p = Point2D(1, -1)
# now we can access attributes
print(p.x, p.y)
1 -1
print(p.as_string())
print(p.distance_to_origin())
<Point2D x=1, y=-1>
1.4142135623730951
Comment: methods like __init__, whose names start and end with two underscores, are called dunder (double underscore) or special methods.
Implement a class JazzRecord which can be initialized with artist name, record name and release year, e.g. record = JazzRecord("John Coltrane", "Giant Steps", 1957).
Implement a method as_string which returns a string representation like Giant Steps (by John Coltrane, 1957).
Classes help to:
create custom datatypes
implement data entities
data encapsulation
hide implementation details from the user of a class
model state: In case you have a group of functions with long argument lists, or you think they need global variables to "communicate", classes might be a solution
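As a minimal sketch of the "model state" point: instead of several functions sharing a running total and a counter through global variables, a class can keep both as attributes. The class name RunningMean and its methods are hypothetical and only serve as illustration.

```python
class RunningMean:
    """Hypothetical example: keeps state that would otherwise live in
    global variables or be passed around in long argument lists."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        # update the internal state
        self.total += value
        self.count += 1

    def mean(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


rm = RunningMean()
for value in [1.0, 2.0, 6.0]:
    rm.add(value)
print(rm.mean())
```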
Using inheritance (next chapters) classes also help to:
Inheritance is a mechanism which in principle reuses code from an existing class.
This mechanism is very versatile and the core of many object oriented programming techniques.
To declare a class X which inherits methods from a class Y, we use the declaration
class X(Y):
    ...
We also say Y is a base class of X, X derives from Y or X inherits from Y.
In the following example we implement a class Vector2D which inherits all methods of Point2D, including __init__. It adds the methods add and length and overrides as_string.
This is now the code:
class Vector2D(Point2D):
    def add(self, other):
        """returns the sum of self and other as a new Vector2D"""
        return Vector2D(self.x + other.x, self.y + other.y)

    def length(self):
        return self.distance_to_origin()

    def as_string(self):
        return "<Vector2D x={}, y={}>".format(self.x, self.y)
# create instances
# calls __init__ from base class, since we did not implement it in Vector2D:
v1 = Vector2D(1.0, 2.0)
v2 = Vector2D(-1.0, -2.0)
# access attributes
print(v1.x, v1.y)
print(v1.as_string())
# calls internally method in base class !
print(v1.length())
1.0 2.0
<Vector2D x=1.0, y=2.0>
2.23606797749979
Let's check which methods and attributes are now available on an instance of Vector2D. You see lots of internal methods, but also __init__, etc. (more details about some of the special methods follow soon):
dir(v1)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'add', 'as_string', 'distance_to_origin', 'length', 'x', 'y']
We make more use of inheritance later when we discuss use cases of object oriented programming.
Python supports multiple base classes, but this comes at some risk, especially if base classes have overlapping method names. The order in which you list the base classes can also make a difference.
My recommendation is to avoid multiple base classes unless you know exactly what you are doing.
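A small sketch of why the order of base classes matters when method names overlap. All class names here are hypothetical; both base classes define a method move, and the one listed first wins.

```python
class Walker:
    def move(self):
        return "walking"


class Swimmer:
    def move(self):
        return "swimming"


class Duck(Walker, Swimmer):  # Walker is listed first
    pass


class Frog(Swimmer, Walker):  # Swimmer is listed first
    pass


print(Duck().move())
print(Frog().move())
```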
Using builtin classes or classes from external libraries for subclassing comes with the risk of breaking internal contracts between methods which are only known to the original developer.
To customize a dictionary, for example, better use UserDict from the collections module of the standard library as base class instead of dict!
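The following sketch illustrates one such hidden contract, assuming CPython: the builtin dict does not route update through an overridden __setitem__, whereas UserDict does. The two class names are hypothetical.

```python
from collections import UserDict


class UpperKeyDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


class UpperKeyUserDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


d1 = UpperKeyDict()
d1["a"] = 1          # goes through our __setitem__
d1.update({"b": 2})  # dict.update bypasses our __setitem__ (CPython)

d2 = UpperKeyUserDict()
d2["a"] = 1
d2.update({"b": 2})  # UserDict.update calls our __setitem__

print(sorted(d1))
print(sorted(d2))
```

The subclass of dict ends up with inconsistent keys, which is the kind of surprise the advice above tries to avoid.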
Special (__magic__) methods
import math
class FancyVector2D(Vector2D):
    def __init__(self, x, y, name):
        # super() allows access to the function of same name in
        # the base class. Here we call the already existing __init__
        # instead of setting self.x and self.y manually:
        super().__init__(x, y)
        self.name = name

    def __len__(self):
        """implements len(..) for a FancyVector2D object"""
        return 2

    def __str__(self):
        """to string conversion, when str(.) is called
        or implicitly within print:"""
        return "<FancyVector2D x={}, y={} name='{}'>".format(self.x, self.y, self.name)

    def __add__(self, other):
        """implements self + other"""
        assert isinstance(other, FancyVector2D)
        new_name = "{} + {}".format(self.name, other.name)
        return FancyVector2D(self.x + other.x, self.y + other.y, new_name)

    def __getitem__(self, index):
        """implements index access"""
        if index == 0 or index == 1:
            return (self.x, self.y)[index]
        # invalid index:
        raise IndexError()

    def __eq__(self, other):
        # implements "self == other"
        if not isinstance(other, FancyVector2D):
            # comparing makes only sense if other is also of type
            # FancyVector2D:
            return False
        return self.x == other.x and self.y == other.y
p = FancyVector2D(1, -1, "p")
q = FancyVector2D(2, 5, "q")
r = p + q # same as p.__add__(q)
print("r.x=", r.x, "r.y=", r.y)
r.x= 3 r.y= 4
# same as r.__len__():
print("len(r)=", len(r))
len(r)= 2
# same as r.__str__():
u = str(r)
print("str(r)=", u)
# this also calls the __str__ method implicitly:
print("print(r):", r)
str(r)= <FancyVector2D x=3, y=4 name='p + q'>
print(r): <FancyVector2D x=3, y=4 name='p + q'>
# same as r.__getitem__(0):
print("r[0]=", r[0])
r[0]= 3
# same as p.__eq__(q):
print("p == q:", p == q)
# etc
print("r == r:", r == r)
print("p == 1:", p == 1)
p == q: False
r == r: True
p == 1: False
There are many other special methods for customizing your objects, see the reference documentation at https://docs.python.org/3/reference/datamodel.html#special-method-names and the blog post https://rszalski.github.io/magicmethods/
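As a brief sketch of a few further special methods, here is a hypothetical class Playlist (not part of the examples above) which supports repr(...), the in operator, iteration and truth testing:

```python
class Playlist:
    """Hypothetical example class."""

    def __init__(self, titles):
        self.titles = list(titles)

    def __repr__(self):
        # used by repr(...) and when an object is shown in the interpreter
        return "Playlist({!r})".format(self.titles)

    def __contains__(self, title):
        # implements "title in playlist"
        return title in self.titles

    def __iter__(self):
        # implements "for title in playlist"
        return iter(self.titles)

    def __bool__(self):
        # implements bool(playlist) and "if playlist:"
        return len(self.titles) > 0


pl = Playlist(["Naima", "Countdown"])
print(repr(pl))
print("Naima" in pl)
print([title.upper() for title in pl])
print(bool(Playlist([])))
```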
Repeat the examples above.
Extend FancyVector2D to implement a method scale_inplace which takes a float x and scales the attributes x and y internally, e.g. to be used like
v1 = FancyVector2D(1.0, 2.0, "v1")
v1.scale_inplace(2.0)
assert v1.x == 2.0 and v1.y == 4.0
Implement a method __mul__ which takes another vector and returns the dot product (scalar product) of both. __mul__ is called if you use v1 * v2. The scalar product of two vectors v1 and v2 is defined as v1.x * v2.x + v1.y * v2.y and is a floating point value (and not another vector).
Implement a class ComplexNumber which inherits from Vector2D and reimplements __mul__ for complex arithmetic. Reimplement __str__ to achieve output in the style of 1 + 2i.
Implementing classes can help in many situations. Whereas in the early 90s OO was considered a silver bullet for most programming problems, opinions among many programmers have changed since then.
So when to implement classes ?
In object-oriented programming, the open/closed principle states
software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification; that is, such an entity can allow its behaviour to be extended without modifying its source code.
As this sounds a bit abstract we will discuss this with a programming example:
class Adder:
    def accumulated_values(self, data):
        assert len(data) > 0
        result = data[0]
        for value in data[1:]:
            result = self.combine(result, value)
        return result

    def combine(self, value_1, value_2):
        return value_1 + value_2


class Multiplier(Adder):
    def combine(self, value_1, value_2):
        return value_1 * value_2
a = Adder()
m = Multiplier()
print(a.accumulated_values([2, 3, 4]))
print(m.accumulated_values([2, 3, 4]))
9
24
Here, implementing the overall algorithm and an implementation detail in two different methods allows us to implement modified behaviour without the need to change anything in the base class.
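To illustrate "open for extension, closed for modification" once more, here is a sketch with a further variant. The base class is repeated so that the snippet runs on its own; the subclass Maximizer is a hypothetical addition and requires no change to Adder.

```python
class Adder:
    def accumulated_values(self, data):
        assert len(data) > 0
        result = data[0]
        for value in data[1:]:
            result = self.combine(result, value)
        return result

    def combine(self, value_1, value_2):
        return value_1 + value_2


class Maximizer(Adder):
    # only the implementation detail is replaced,
    # the loop in accumulated_values is reused as it is
    def combine(self, value_1, value_2):
        return max(value_1, value_2)


print(Adder().accumulated_values([2, 7, 4]))
print(Maximizer().accumulated_values([2, 7, 4]))
```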
The next approach looks similar to the previous example, except that we make a clear distinction between the base algorithm and the specific implementations.
This is also called the "template method pattern": a base class provides the template for an algorithm, and we fill in the missing pieces in a subclass.
Beyond that, we use the abc module from the Python standard library to make sure that subclasses implement all missing methods:
from abc import ABC, abstractmethod


class Accumulator(ABC):
    @abstractmethod
    def start_value(self):
        pass

    @abstractmethod
    def combine(self, value_1, value_2):
        pass

    def accumulated_value(self, values):
        result = self.start_value()
        for value in values:
            result = self.combine(result, value)
        return result
What is new?
The base class Accumulator inherits from ABC, and if we decorate placeholder methods with abstractmethod, every derived subclass must implement the given methods before it can be instantiated.
This is then a clear specification of which methods subclasses must implement, and it includes a check whether a subclass follows this specification.
Let's implement a subclass which forgot to implement start_value:
class Adder(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 + value_2
And this happens when we want to instantiate this class:
a = Adder()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[26], line 1
----> 1 a = Adder()

TypeError: Can't instantiate abstract class Adder with abstract method start_value
class Adder(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 + value_2

    def start_value(self):
        return 0


class Multiplier(Accumulator):
    def combine(self, value_1, value_2):
        return value_1 * value_2

    def start_value(self):
        return 1
a = Adder()
print(a.accumulated_value([1, 2, 3]))
m = Multiplier()
print(m.accumulated_value([]))
6
1
What is the difference between the previous two approaches?
Both do the same.
The first approach was implemented by a developer who thought "I need an adder, but maybe I will need variants in the future".
The second approach was implemented by a developer who had possible variants in mind from the very beginning.
The second approach is more common in frameworks: frameworks are libraries which provide "frames" that must be filled out by the user. Here the abstract base class is the frame.
We mixed two concepts here:
the template method pattern also works without abstract base classes; the idea is that subclasses must provide the implementation details.
the abstract base class is a way to enforce a policy which ensures valid subclasses.
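To make the two concepts easier to tell apart, here is a sketch of the template method pattern without the abc module. Raising NotImplementedError in the placeholder methods is one common convention, used here as an assumption. Note that the incomplete subclass can be instantiated, and the problem only shows up when the missing method is called.

```python
class Accumulator:
    def start_value(self):
        raise NotImplementedError("subclasses must implement start_value")

    def combine(self, value_1, value_2):
        raise NotImplementedError("subclasses must implement combine")

    def accumulated_value(self, values):
        result = self.start_value()
        for value in values:
            result = self.combine(result, value)
        return result


class Adder(Accumulator):
    # start_value was forgotten on purpose
    def combine(self, value_1, value_2):
        return value_1 + value_2


a = Adder()  # no check happens here, in contrast to the ABC variant
try:
    a.accumulated_value([1, 2, 3])
except NotImplementedError as error:
    message = str(error)
print(message)
```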
Instead of using inheritance, we can make code reusable by providing functionality via attributes. This is called the strategy pattern:
class AccumulationStrategy(ABC):
    @abstractmethod
    def start_value(self):
        pass

    @abstractmethod
    def combine(self, value_1, value_2):
        pass


class Accumulator:
    def __init__(self, accumulation_strategy):
        assert isinstance(accumulation_strategy, AccumulationStrategy)
        self.strategy_impl = accumulation_strategy

    def accumulate(self, values):
        result = self.strategy_impl.start_value()
        for value in values:
            result = self.strategy_impl.combine(result, value)
        return result


class Adder(AccumulationStrategy):
    def start_value(self):
        return 0

    def combine(self, value_1, value_2):
        return value_1 + value_2
a = Accumulator(Adder())
print(a.accumulate([1, 2, 3]))
6
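Since the strategy is just an attribute, it can be chosen at runtime. The sketch below repeats the classes in compact form so that it runs on its own; the Multiplier strategy and the strategies lookup table are hypothetical additions for illustration.

```python
from abc import ABC, abstractmethod


class AccumulationStrategy(ABC):
    @abstractmethod
    def start_value(self):
        pass

    @abstractmethod
    def combine(self, value_1, value_2):
        pass


class Accumulator:
    def __init__(self, accumulation_strategy):
        self.strategy_impl = accumulation_strategy

    def accumulate(self, values):
        result = self.strategy_impl.start_value()
        for value in values:
            result = self.strategy_impl.combine(result, value)
        return result


class Adder(AccumulationStrategy):
    def start_value(self):
        return 0

    def combine(self, value_1, value_2):
        return value_1 + value_2


class Multiplier(AccumulationStrategy):
    def start_value(self):
        return 1

    def combine(self, value_1, value_2):
        return value_1 * value_2


# hypothetical lookup table: the strategy is picked by name at runtime
strategies = {"sum": Adder(), "product": Multiplier()}
results = {
    name: Accumulator(strategy).accumulate([2, 3, 4])
    for name, strategy in strategies.items()
}
print(results)
```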
Why strategy pattern instead of subclassing?
Frameworks often offer different strategies which can be composed when the base class is instantiated. If a framework uses subclassing techniques instead, all possible combinations must be provided within the framework.
The strategy pattern also works better in case we use a base class from a different library, since subclassing can be risky if we do not know the details:
import string
from abc import ABC, abstractmethod


class Histogram:
    def __init__(self, processor, filter_):
        self.processor = processor
        self.filter_ = filter_
        self.histogram = {}

    def get(self):
        return self.histogram

    def calculate(self, values):
        histogram = {}
        for value in values:
            if not self.filter_.check(value):
                continue
            value = self.processor.process(value)
            if value not in histogram:
                histogram[value] = 0
            histogram[value] += 1
        self.histogram = histogram
        return histogram


class BaseProcessor(ABC):
    @abstractmethod
    def process(self, value):
        pass


class NoProcessor(BaseProcessor):
    def process(self, value):
        return value


class IgnoreCaseProcessor(BaseProcessor):
    def process(self, value):
        return value.upper()


class BaseFilter(ABC):
    @abstractmethod
    def check(self, value):
        pass


class NoFilter(BaseFilter):
    def check(self, value):
        return True


class WordFilter(BaseFilter):
    def check(self, value):
        return all(v in string.ascii_letters for v in value)
Here it is up to the user how to combine the different strategies.
Using the base class approach would require that the framework implements all four variants (two processors times two filters) as different classes, including assigning names. This can lead to combinatorial overhead, and finding good names for all variants can be challenging.
h = Histogram(NoProcessor(), NoFilter())
h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'this': 2, 'is': 2, 'TRICKY': 1, '!!!': 1, 'or': 1, 'not': 1, 'tricky': 1, '???': 1}
h = Histogram(IgnoreCaseProcessor(), WordFilter())
h.calculate("this is TRICKY !!! or is this not tricky ???".split())
{'THIS': 2, 'IS': 2, 'TRICKY': 2, 'OR': 1, 'NOT': 1}
Another alternative to plain inheritance is to extend classes by mixing in extra behaviour (a so-called mixin), which may be reusable across different classes, or which is needed if the base class code must not be changed.
class DictPrettyPrintMixIn(ABC):
    @abstractmethod
    def _get_dict(self):
        pass

    def __str__(self):
        dict_ = self._get_dict()
        assert isinstance(dict_, dict)
        keys_width = max(len(str(key)) for key in dict_.keys())
        return "\n".join(
            "{}: {}".format(str(key).rjust(keys_width), value)
            for key, value in sorted(dict_.items())
        )
Why does the name of the _get_dict method start with _?
Python does not support "private" and "protected" methods or attributes (as e.g. C++ or Java do). Instead, developers use names starting with a single _ to indicate methods / attributes which are not intended for general use. Calling such a method comes with the risk that things do not work as documented.
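A short sketch of the underscore convention. The class Thermometer is a hypothetical example; the point is that Python does not prevent access to _celsius or _to_fahrenheit, the leading underscore is only a signal to the reader.

```python
class Thermometer:
    """Hypothetical example class."""

    def __init__(self, celsius):
        self._celsius = celsius  # internal attribute by convention

    def _to_fahrenheit(self):  # internal helper by convention
        return self._celsius * 9 / 5 + 32

    def report(self):
        # the public method, intended for users of the class
        return "{:.1f} F".format(self._to_fahrenheit())


t = Thermometer(20)
print(t.report())
# this works, but relies on a detail the author may change at any time:
print(t._celsius)
```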
Let's use the mixin to define our own pretty dictionary. In Python we use multiple inheritance to that end:
from collections import UserDict
class PrettyDict(UserDict, DictPrettyPrintMixIn):
    def _get_dict(self):
        return self.data


d = PrettyDict(
    [
        ((1, 2, 3), "a"),
        ((2, 3), "b"),
    ]
)
print(d)
(1, 2, 3): a
   (2, 3): b
Let's re-use the mixin in the histogram example:
class PrettyPrintHistogram(Histogram, DictPrettyPrintMixIn):
    def _get_dict(self):
        return self.get()

    def __str__(self):
        return "\n".join(
            [
                "PrettyPrintHistogram",
                "",
                super().__str__(),  # we'll get back to that line in a moment
            ]
        )
and compare how Histogram and PrettyPrintHistogram print out:
h2 = PrettyPrintHistogram(IgnoreCaseProcessor(), WordFilter())
h2.calculate("this is TRICKY !!! or is this not tricky ???".split())
print(h)
print("\nvs.\n")
print(h2)
<__main__.Histogram object at 0x10421c310>

vs.

PrettyPrintHistogram

    IS: 2
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 2
When calling super().__str__() in the example above, why was __str__() called from DictPrettyPrintMixIn, and not for instance from Histogram, which inherits the default __str__ implementation from the object class?
Python follows the method resolution order (MRO), which in our case is:
PrettyPrintHistogram.mro()
[__main__.PrettyPrintHistogram, __main__.Histogram, __main__.DictPrettyPrintMixIn, abc.ABC, object]
In case of multiple inheritance Python resolves methods by looking from left to right according to the order given in the inheritance declaration, and a class always comes before its own base classes. The lookup is not simply depth-first, though: Histogram does not define __str__ itself, so the search continues with DictPrettyPrintMixIn, and object, the shared base class, is only consulted last (if you're curious, see: https://www.python.org/download/releases/2.3/mro/ ).
Advice: be cautious when using multiple inheritance, as it may lead to surprising and hard-to-find bugs.
Possibly better in our last example is to use a safer explicit DictPrettyPrintMixIn.__str__(self) call instead of the implicit super().__str__() call, or to use a different design pattern (strategy or a dynamic decorator).
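The difference between the implicit and the explicit call can be reproduced with a reduced sketch. All class names below are hypothetical stand-ins: Plain plays the role of a base class without its own __str__, LabelMixin the role of the mixin.

```python
class Plain:
    pass  # does not define __str__, would fall back to object.__str__


class LabelMixin:
    def __str__(self):
        return "label from mixin"


class Child(Plain, LabelMixin):
    def __str__(self):
        # implicit: the next class in the MRO which defines __str__ is used
        return "Child: " + super().__str__()


class ExplicitChild(Plain, LabelMixin):
    def __str__(self):
        # explicit: independent of the order of the base classes
        return "ExplicitChild: " + LabelMixin.__str__(self)


print([cls.__name__ for cls in Child.mro()])
print(Child())
print(ExplicitChild())
```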
In contrast to the mixin inheritance pattern (sometimes called a static implementation of a decorator pattern), the OO dynamic decorator pattern solves the problem of dynamically extending a class' functionality, i.e. it enables different behaviour for different instances of the class.
The main difference with respect to the strategy pattern is that the code of the decorated objects is not changed.
The decorator pattern is implemented by wrapping an instance of the original class (passed during instantiation), inheriting all the original behaviour and implementing it by forwarding (delegating) calls to the wrapped instance, whilst overriding or adding custom behaviour.
Note: The programming language-agnostic decorator pattern is not the same as, but is analogous to, Python decorators (a language feature) such as @lru_cache or @abstractmethod, which allow us to dynamically modify functions (methods) or classes.
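For comparison, here is a minimal sketch of the Python language feature mentioned in the note. lru_cache is the real decorator from functools; count_calls, square and double are hypothetical names made up for this illustration.

```python
from functools import lru_cache, wraps


def count_calls(func):
    """Hypothetical decorator: counts how often func is called."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


@lru_cache(maxsize=None)
def double(x):
    return 2 * x


square(3)
square(4)
double(5)
double(5)  # second call with the same argument is answered from the cache
print(square.calls)
print(double.cache_info().hits)
```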
Here is another way of implementing pretty printing for histograms, using the dynamic decorator pattern:
# inherit from Histogram to preserve a decorated instance's original type
class DecoratedHistogramBase(Histogram):
    def __init__(self, histogram_obj):
        self.histogram_obj = histogram_obj

    def get(self):
        return self.histogram_obj.get()

    def calculate(self, values):
        return self.histogram_obj.calculate(values)


class PrettyPrintDecoratedHistogram(DecoratedHistogramBase):
    def __str__(self):
        return "\n".join(
            [
                "PrettyPrintDecoratedHistogram",
                "",
                str(PrettyDict(self.histogram_obj.get())),
            ]
        )
h = Histogram(IgnoreCaseProcessor(), WordFilter())
# decorate instance
h2 = PrettyPrintDecoratedHistogram(h)
assert isinstance(h2, Histogram)
# `calculate` either before or after decoration
h2.calculate("this is TRICKY !!! or is this not tricky ??? Tricky it is ...".split())
print(h2)
PrettyPrintDecoratedHistogram

    IS: 3
    IT: 1
   NOT: 1
    OR: 1
  THIS: 2
TRICKY: 3
A different decorator, that adds (text) plotting behaviour:
class TextPlotDecoratedHistogram(DecoratedHistogramBase):
    def text_plot(self):
        dict_ = self.get()
        keys_width = max(len(str(key)) for key in dict_.keys())
        for key, value in sorted(dict_.items()):
            print("{}: {}".format(str(key).rjust(keys_width), "*" * value))
# decorators should be able to be chained
h3 = TextPlotDecoratedHistogram(h2)
assert isinstance(h3, Histogram)
h3.text_plot()
# but decorating is not inheriting, e.g. __str__ call was not delegated..
print()
print(h3)
    IS: ***
    IT: *
   NOT: *
    OR: *
  THIS: **
TRICKY: ***

<__main__.TextPlotDecoratedHistogram object at 0x1043a7a60>
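One possible way to let chained decorators delegate __str__ as well is sketched below, using generic hypothetical classes (Report, ForwardingDecorator, ShoutingDecorator) instead of the histogram classes. It assumes that forwarding unknown attributes with __getattr__ is acceptable; since special methods like __str__ are looked up on the class, they must be forwarded explicitly.

```python
class Report:
    def __init__(self, title):
        self.title = title

    def __str__(self):
        return "Report: " + self.title


class ForwardingDecorator:
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # only called if the normal lookup fails:
        # forward the attribute access to the wrapped object
        return getattr(self._wrapped, name)

    def __str__(self):
        # str(...) does not go through __getattr__, so forward explicitly
        return str(self._wrapped)


class ShoutingDecorator(ForwardingDecorator):
    def shout(self):
        # self.title is found on the innermost wrapped object
        return self.title.upper() + "!"


r = ShoutingDecorator(ForwardingDecorator(Report("sales")))
print(r.shout())
print(r)
```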
The observer pattern is an example of a pattern that is not focused on altering behaviour.
It is used when objects, called observers, need to be notified about some status changes (events) without direct coupling to the status-changing environment.
Observers register for notifications with a notifier (also called subject). Notifications from the notifier can be implemented either by directly calling pre-defined methods on observers, or by calling provided callback functions.
from abc import ABC, abstractmethod


class Notifier:
    def __init__(self):
        self._observers = {}

    def register_observer(self, observer, event_type):
        print("Notifier:", observer, "registered to observe", event_type)
        if event_type not in self._observers:
            self._observers[event_type] = []
        self._observers[event_type].append(observer)

    def event(self, event_type, message):
        print("Notifier:", event_type)
        for observer in self._observers.get(event_type, []):
            observer.notify(event_type, message)


class Observer(ABC):
    @abstractmethod
    def notify(self, event_type, message):
        pass


class Bartender(Observer):
    def __init__(self):
        print("Bartender: ready to serve.")

    def notify(self, event_type, message):
        # dispatch, based on an event type
        if event_type in (
            "someone_is_thirsty",
            "beer_is_empty",
        ):
            self.someone_is_thirsty(message)

    def someone_is_thirsty(self, who):
        print("Bartender: {}, how about some beer mate?".format(who))
notifier = Notifier()
# dynamically couple bartender to an event listener
bartender = Bartender()
notifier.register_observer(bartender, "someone_is_thirsty")
# whenever notifier sees an event (here, triggered by calling "event")
# it passes the message to all interested parties
notifier.event("someone_is_thirsty", "Uwe")
Bartender: ready to serve.
Notifier: <__main__.Bartender object at 0x1034927d0> registered to observe someone_is_thirsty
Notifier: someone_is_thirsty
Bartender: Uwe, how about some beer mate?
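The example above notifies observers by calling a pre-defined method. The callback variant mentioned earlier could look like the following sketch; the class CallbackNotifier and its method register_callback are hypothetical names. Any callable which accepts the message can be registered.

```python
class CallbackNotifier:
    """Hypothetical notifier which stores plain callables per event type."""

    def __init__(self):
        self._callbacks = {}

    def register_callback(self, event_type, callback):
        self._callbacks.setdefault(event_type, []).append(callback)

    def event(self, event_type, message):
        # call every callback registered for this event type
        for callback in self._callbacks.get(event_type, []):
            callback(message)


received = []
cb_notifier = CallbackNotifier()
# a bound method of a list serves as callback here
cb_notifier.register_callback("someone_is_thirsty", received.append)
cb_notifier.register_callback(
    "someone_is_thirsty", lambda who: print("Serving", who)
)

cb_notifier.event("someone_is_thirsty", "Uwe")
cb_notifier.event("bar_is_closing", "nobody registered for this")
print(received)
```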
The patterns shown above are so-called design patterns. They were originally collected and published in the 1994 book "Design Patterns" by the "Gang of Four".
The design patterns from this book play a more important role in statically typed languages such as Java, C++ or C# than in dynamic languages.
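One reason for this, sketched under the assumption that plain functions are an acceptable substitute for strategy objects: in Python functions are objects themselves and can be passed around directly. The function accumulate below is a hypothetical counterpart to the accumulator classes used earlier; reduce and the operator module come from the standard library.

```python
import operator
from functools import reduce


def accumulate(values, combine, start_value):
    """Hypothetical function: the 'strategy' is just the callable combine."""
    return reduce(combine, values, start_value)


print(accumulate([1, 2, 3], operator.add, 0))
print(accumulate([2, 3, 4], operator.mul, 1))
print(accumulate(["a", "b", "c"], lambda x, y: x + y, ""))
```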
Design and implement a class hierarchy for employees in a department. There are managers and subordinates. Among managers there are bosses and lower management members, and among subordinates there are staff members and students. Everyone except for bosses has exactly one supervisor - that must be one of the managers. Bosses have no supervisors. It should be possible to add employees to the department, however supervisor, if applicable, must be already in the department beforehand. Furthermore, each employee should be able to print nicely giving own name and surname, as well as department, position (boss, student etc), and a supervisor, if applicable.