Unit Testing

Emanuel Schmid

July 5, 2023

Introduction

Programmer’s Credo

We do these things
not because they are easy,
but because we thought
they were going to be easy

Motivation

Let’s assume we have written a linear equation solver

def solve_linear_equation(k, d):
    """Solves k * x + d = 0 for x."""
    return -d / k

As part of some software that we want to use.

So far so good

Add a main function:

if __name__ == '__main__':
    print(solve_linear_equation(k=2, d=5))

Run the code:

$ python solvers.py
-2.5

Makes sense: 2 * -2.5 + 5 = 0 ✅

Same program, different machine

$ python solvers.py
-3

What happened?

Shoulda put a test on it!

Run test after installation
Avoid nasty surprises

Integration Testing

Test the entire bicycle.

All parts work together
Avoid regressions
No need to understand details

Integration Testing

Difficult to debug
Depends on domain
May run a while

Unit Testing

Test a single gear.

Easier to debug
Works across domains
Runs fast (hopefully)
Exonerates integration tests

Automate

Unit Testing

Summary

Testing by developers for developers
Code that tests code
Automatic checks of expected behavior

Example Code

solvers.py

def solve_linear_equation(k, d):
    """Solves k * x + d = 0 for x."""
    return -d / k

Example Test

test_solvers.py

from solvers import solve_linear_equation


def test_solves_general_linear_equation():
    """Checks solution for k = 2 and d = 5."""
    assert solve_linear_equation(k=2, d=5) == -2.5

Run the Test

No output, everything OK (Python 3).

>>> from test import test_solve_linear_equation
>>> test_solve_linear_equation()
>>>

Run the Test Again

Not OK (Python 2.7).

>>> from test import test_solve_linear_equation
>>> test_solve_linear_equation()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test.py", line 6, in test_solve_linear_equation
    assert solve_linear_equation(k=2, d=5) == -2.5
AssertionError
>>>

Lesson Learned

Software development is complex.

Requirements change
- Code used differently
- Used code changes
- Dependencies get updated
Rechecking manually not feasible

What to Test?

test everything that you don’t want to break.

Refactoring

Tests lead to refactoring of units
Refactoring of units leads to tests
Together they converge to stable code

Improve

Tested/able code is, i.g, better code

Small units (simpler, reuseable)
Special cases covered
Easy to improve or extend later

How to Test

xUnit Frameworks

Nicer output
More checks
- Float comparison
- Expected exceptions
- …

xUnit Arichitecture

Test Case: base class, implement this
Test Runner: find and run
Test Fixture: preconditions
Test Execution: setup, test, teardown
Assertion: test utilities
Result Formatter: many flavors

xUnit Frameworks

Unit Testing is not language-specific.

xUnit Frameworks (e.g.)

Python: unittest, Pytest
R: test_that, tinytest
C++: googletest, Cput
Java: JUnit, TestNG

Remember Solver

solvers.py

def solve_linear_equation(k, d):
    """Solves k * x + d = 0 for x.
    
    Parameters
    ----------
    k : float or int
        gradient
    d : float or int
        offset
    """
    return -d / k

Test Case

test_solvers.py

from unittest import TestCase
from solvers import solve_linear_equation

class SolversTest(TestCase):

    def test_solve_linear_equation(self):
        """Checks solution for k = 2 and d = 5"""
        result = solve_linear_equation(k=2, d=5)
        self.assertEqual(result, -2.5)

Test Runner

Start runner: python -m unittest discover

=================================================
FAIL: test_solves_general_linear_equation (...)
Checks solution for k = 2 and d = 5.
-------------------------------------------------
Traceback (most recent call last):
  File "(...)", line 8, in test_(...)_equation
    self.assertEqual(result, -2.5)
AssertionError: -3 != -2.5
-------------------------------------------------
Ran 1 test in 0.000s

Enable Python 2.7

Adapt for integers.

def solve_linear_equation(k, d):
    """Solves k * x + d = 0 for x.
    
    Parameters
    ----------
    k : float or int
        gradient
    d : float or int
        offset
    """
    return -float(d) / k

Check Result

Restart runner: python -m unittest discover

.
-----------------------------------------------
Ran 1 test in 0.000s

OK

Special Cases

solvers.py

def solve_linear_equation(k, d):
    """Solves k * x + d = 0 for x.
    
    Parameters
    ----------
    (...)
    
    Raises
    ------
    ZeroDivisionError
        when the gradient k == 0.
    """
    return -float(d) / k

Special Cases

Add test to TestCase.

    def test_throws_for_gradient_zero(self):
        """Solve throws exception for gradient zero."""
        with self.assertRaises(ZeroDivisionError):
            solve_linear_equation(k=0., d=5.)

Check Result

Restart runner: python -m unittest discover

..
-----------------------------------------------
Ran 2 tests in 0.000s

OK

Next Steps

What happens when k=0 and d=0?
Refactoring:
- fix behavior for integers
- throw meaningful exceptions
…

Result

Verified Documentation.

class SolversTest(TestCase):

    def test_solve_linear_equation(self):
        (...)
        result = solve_linear_equation(k = 2., d = 5.)
        (...)

    def test_throws_for_gradient_zero(self):
        (...)

    def test_throws_for_gradient_and_offset_zero(self):
        (...)

Know the Framework

Use suitable assertions.

assertAlmostEquals for floats
Avoid assertTrue(a == b)
Many more… (see documentation)

Best Practices

Good Tests

Write your tests first. (Clean Code)

F ast
I solated
R epeatable
S elf-validating
T imely

Fast

Prepare for 1000s of tests.

Maximum: few milliseconds
You should run all the tests often
You will write many of them

Isolated

Minimal maintenance.

Independent of each other
Independent of behaviour tested elsewhere
Independent of machine
Independent of external data/network

Repeatable

Rerun to locate bugs.

Same result every time you run it
No random input
Independent of environment (network, etc.)
Don’t change the environment

Self-Validating

Do the work once.

Test determines if result was expected
No manual work involved
Remember, you want to run them often

Timely

Write tests as soon as possible.

You still know why you wrote the code that way
You know the special cases
You have (manual) test cases ready

(Actually - nothing speaks against writing
tests ahead of the unit.)

Pattern

Given-When-Then Pattern. (Clean Code)

Given: Setup of environment
When: Execute what you want to test
Then: Check result

At last

(if required) Clean Up

Recall Test

test_solve_linear_equation.py

class SolversTest(TestCase):

    def test_solves_general_linear_equation(self):
        """Checks solution for k = 2 and d = 5."""
        result = solve_linear_equation(k=2, d=5)    # WHEN
        self.assertAlmostEqual(result, -2.5)        # THEN

Help the Reader

That’s why I chose 5 and 2.

def test_solves_general_linear_equation(self):
    """Checks solution for nonzero gradient, offset."""
    nonzero_gradient = 2                       # GIVEN
    nonzero_offset = 5                         # GIVEN

    result = solve_linear_equation(nonzero_gradient, 
                                   nonzero_offset)

    self.assertAlmostEqual(result, -2.5)

Repeat

We could do the same thing here.

def test_throws_for_gradient_zero(self):
    """Solve throws exception for gradient zero."""
    with self.assertRaises(ZeroDivisionError):
        solve_linear_equation(k=0, d=5)

Test Fixture & Execution (xUnit)

We use a nontrivial test fixture.

class SolversTest(TestCase):

    def setUp(self):
        self.nonzero_gradient = 2
        self.nonzero_offset = 5

    def test_solves_general_linear_equation(self):
        (...)

Rollout

No need to copy and paste.

def test_throws_for_gradient_zero(self):
    """Solve throws exception for gradient zero."""
    with self.assertRaises(ZeroDivisionError):
        solve_linear_equation(k=0, d=self.nonzero_offset)

Set the Stage

Use Fixtures.

Provide data for the tests
Initialize classes with nontrivial constructors
Provide convenience methods

Real-World Testing

Challenges

You will be facing common challenges.

I don’t know how to get started.
I can’t test hidden/private code.
I can’t test every combination of inputs.
I don’t understand what a function does.
The function has 20 parameters, so 2^20 tests?
The function needs to read a file.
The function needs to write a file.

Getting Started

Break the vicious circle by adding integration tests.

Testing requires refactoring.
Refactoring requires tests.

Private Code

I can’t test hidden/private code.

No problem, everybody else can’t, either.

Test what users can do with your code
Extract code if this makes testing difficult

Lean Testing

I can’t test every combination of inputs.

Test until you are confident that your code works.

Test the uses cases the unit was designed for
Test ‘boundaries’ and special cases
Add tests when you discover bugs
‘Code coverage’ tools can help

No Documentation

I don’t understand what a function does.

No problem, use unit testing.

Write a test that checks any assumption
Look at the actual output
Fix the test
Repeat

Complex Interface

The function has 20 parameters.

Unit testing tests atomic units.

Add regression tests first
Try to extract atomic units and test them
- 20 one-boolean-parameter-functions: 40 tests
- One 20-boolean-parameter-function: ~1M tests
Add integration tests for the (important) use cases

Interoperation

The function needs to read a file.

def bad_function():
    with open("data.txt", "r") as file:
        x = file.read()
        return x + x

No problem, there are tools for that: ‘Mocking’

Mocking

Choose a return value yourself (Python 3).

from unittest import TestCase, mock
from bad_function import bad_function


class MockTest(TestCase):

    def test_returns_read_string_twice(self):
        """bad_function returns read string twice."""
        with mock.patch('builtins.open', 
                        mock.mock_open(read_data='abc')):
            self.assertEqual(bad_function(), 'abcabc')

Mocking in R

mockery package, with_mock

library(testthat)

m <- mock(1)
f <- function (x) summary(x)
with_mock(f = m, {
  expect_equal(f(iris), 1)
})

Side Effects

The function writes to a file.

def write_function(path, content):
    with open(path, "w") as file:
        file.write(content)

No problem. Just make sure

The file is removed after the test, no matter what!
Writing permissions are granted
In Python: use standard library tempfile

Images

The function creates an image

import pandas as pd

def plotting_function(image, data):
    pd.DataFrame(data).plot().get_figure().savefig(image)

Yes, problem. 😱

No easy platform independent way
to properly check the content of an image.
Minimalistic approach: check for no failure

Continuous Integration

Fully Automated

Run your tests with every commit

And get notified if any of them fails.

GitHub

GitHub Actions

Free and unlimited for public projects
Yet
Easy setup: pick a template, done (or almost)

GitLab

Build Pipelines

Bring your own runners
Or order them at GitLab.com

GitLab - Example

.gitlab-ci.yml

image: python:latest

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - venv/

before_script:
  - python -m venv venv/
  - source venv/bin/activate
  - pip install pytest

test:
  script:
    - pytest (...)

CI Servers

e.g., Jenkins
dedicated server for CI:
lint, test, deploy, report, …
hooked to Repository: commit notification
needs (advanced) maintenance

Conclusion

What for?

Correctness: it does what it is supposed to do
Constancy: it keeps on doing so
Clarity: it’s obvious

Additional Benefits

Confidence: put your mind at ease
Trust:
- invite others to use your code
- working in teams
Free Documentation
Takes the weight of further development

Questions?

Acknowledgements

Slides:
- Reveal.js, MIT
- Pandoc, GPLv2+, BSD 3-clause
Images:
- Bicycle, CC0 1.0
- GitLab, MIT
- Gear, PD
This talk is based on a talk by Manuel Weberndorfer
which was based on a talk by Uwe Schmitt.