Script 8

Recap from script 7:

We learned about:

  • .append method for appending an item to a list
  • for loop over a list
  • index method for looking up the position of an element of a list
  • using in and not in for testing if a value is contained in a list
  • dictionaries represent look-up tables
  • dictionaries are declared like {1: 3, 4: 5}
  • we use [...] to read from a dictionary
  • and we can use [...] on the left side of an assignment = to write to a dictionary
  • split method for strings splits a given string to a list of strings
  • using dictionaries for creating word or symbol histograms

Before you proceed

What we learned up to now:

  • functions "compute/do something" (e.g. len, math.sin, print)
  • a function has zero or more arguments: len has one argument, random.randint has two and print accepts zero or more arguments.
  • functions have return values

We will learn to write our own functions in this script. As preparation, reread the sections "About Python functions" and "How function calls work" from the first script before you proceed. This is crucial for understanding the explanations and concepts we introduce in this script!

Step 1 to understand functions

A function can be imagined as "recorded code" or as a macro (as in Excel etc).

To mark which code to "record" Python uses the def statement:

def my_first_function():
    print("you called me")
    
print("hi")
hi

The Python interpreter:

  • first recognizes def and knows that the following code block (only one line here) is the body of a function called my_first_function. (The def line + the following code block is called the function definition)
  • then skips this code block
  • finally continues executing code after the function definition; this is why you see the hi in the output.

To run this "recorded" code you use the name of the function plus ():

def my_first_function():
    print("you called me")
    
my_first_function()
you called me

To execute the function the parentheses are mandatory!

How a function call works: If you execute my_first_function() the Python interpreter remembers that we defined a function named my_first_function in the first two lines of the script. Then code execution "jumps" into this function definition and executes the body of the function. When this is finished, program execution continues after the function call:

def my_first_function():
    print("you called me")
    
my_first_function()
print("nope")
my_first_function()
you called me
nope
you called me
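
The behaviour described above (the definition alone does not run the body, and every call runs it once) can be checked with a small sketch. It assumes a hypothetical list calls, which is not part of the course material, and uses the .append method from script 7 to record each execution of the body:

```python
calls = []  # hypothetical list, only used to record when the body runs

def my_recorded_function():
    calls.append("body executed")

# The definition above was only "recorded": nothing has been appended yet.
entries_after_definition = len(calls)

my_recorded_function()  # the body runs once
my_recorded_function()  # and once more

entries_after_calls = len(calls)
print(entries_after_definition, entries_after_calls)
```

This should display 0 2: no entry after the definition, one entry per call.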

Exercise block 1

  1. Type and execute the examples above.
  2. Start with a fresh script and, without peeking at this script, implement a function my_second_function which prints your first and then your second name on two lines. Call this function!

Step 2 to understand functions

Our function was not very useful up to now. The empty pair of parentheses () indicated that the function does not expect any arguments.

Now we will see how to declare function arguments. We start with a function taking only one value and displaying a message including the doubled value:

def print_double(a):
    print(a, "times 2 is", 2 * a)

print_double(3)
3 times 2 is 6

What happens if you call print_double(3) ?

  • Python matches the argument 3 with the argument a from the function's definition.
  • Before the body of the function definition is executed the Python interpreter assigns a = 3 internally
  • Then the function body is executed as usual
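
That this internal assignment happens anew for every call can be made visible with a small sketch. The function remember_double and the list received are hypothetical names chosen for illustration; the list only records which values a had:

```python
received = []  # hypothetical list, only used to record the values of a

def remember_double(a):
    # a holds whatever value was passed in the current call
    received.append(a)
    received.append(2 * a)

remember_double(3)    # internally: a = 3
remember_double(10)   # internally: a = 10

print(received)
```

The list should contain 3, 6, 10 and 20 in this order, one pair per call.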

A function may have multiple arguments:

def print_nice(message, decoration):
    print(decoration)
    print(decoration, message)
    print(decoration)

print_nice("Python rocks", "!!!!")
!!!!
!!!! Python rocks
!!!!

If you call print_nice("Python rocks", "!!!!") matching arguments results in assignments message = "Python rocks" and decoration = "!!!!" which displays the message you see above.

Exercise block 2

  1. Type and run the examples above.
  2. Start with a fresh script and, without peeking at this script, implement a function with two arguments: a first and a second name. The function then displays a message greeting the named person.

Step 3 to understand functions

Python's built-in functions we have seen up to now compute something. For example, len takes a string as an argument and computes the return value, which is the length of the given string.

As another example, the return value of math.cos(math.pi) is -1.0.

You can imagine that the return value replaces the function call in your code:

For example, if you have a statement x = f(7) + g(3) and f(7) has return value 41 and g(3) has return value 1, the statement is transformed to x = 41 + 1, which results in x = 42.
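
The same replacement can be traced with the built-in function len. This is only a sketch with two arbitrarily chosen example strings:

```python
# len("Python") has return value 6 and len("rocks") has return value 5,
# so the right-hand side is evaluated like 6 + 5.
x = len("Python") + len("rocks")
print(x)
```

This should display 11.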

To test what our function print_double from above computes (returns) we call print_double with argument 3:

def print_double(a):
    print(a, "times 2 is", 2 * a)

value = print_double(3)
print(value)
3 times 2 is 6
None

You see that the return value of our function is None. This is a special Python value indicating "nothing" or "not defined".

In order to define the return value of our print_double function we need the return statement: our function will not only display something but compute the given value times 2:

def print_double(a):
    print(a, "times 2 is", 2 * a)
    return 2 * a

value = print_double(3)
print(value)
3 times 2 is 6
6

To understand the output:

If you call the function, the lines of the code block from the function definition are executed until the return statement, this displays 3 times 2 is 6 and defines the return value 2 * 3. So our function call in value = print_double(3) is transformed to value = 6.

return values vs printing results:

Read carefully: The displayed message 3 times 2 is 6 which you see above is not the return value; it is just a "side effect" of executing the function's body.

So the built in print function has some side effects but computes nothing (None):

x = print("hi you !")
print(x)
hi you !
None

Let's have a look at the following two functions; both compute the area of a circle:

import math

def area_1(r):
    return math.pi * r ** 2

def area_2(r):
    print(math.pi * r ** 2)

The first version returns the result, the second one only prints the result. Thus the first version, which uses return, computes a result that can be used further by the caller:

h = 2
r = 3

volume_cylinder = h * area_1(r)
print(volume_cylinder)
56.548667764616276

This code snippet does not work if we use the other function: area_2 has no return statement to pass back the result, thus the function returns None:

h = 2
r = 3

volume_cylinder = h * area_2(r)
print(volume_cylinder)
28.274333882308138
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-b104b2eab105> in <module>()
      2 r = 3
      3 
----> 4 volume_cylinder = h * area_2(r)
      5 print(volume_cylinder)

TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'

You can see the output produced when calling area_2 but this value is not passed back to the caller. The caller just receives a None, thus evaluating h * None fails.

Whenever you want to reuse a value appearing in the body of a function in the "outside" code you have to return this value. "Reusing" might happen in a variable assignment or in another function call such as print.
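
Both kinds of reuse can be sketched with area_1 from above. The built-in function round, which rounds a number to the given number of decimal places, is used here only as an example of "another function call":

```python
import math

def area_1(r):
    return math.pi * r ** 2

# Reuse in a variable assignment
area = area_1(1)

# Reuse as an argument of other function calls (round and print)
rounded = round(area, 2)
print(rounded)
```

This should display 3.14.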

Exercise block 3

  1. Type and execute the examples above
  2. Replace the four a which appear in the function definition of print_double by x. Do you expect the function to behave differently now?
  3. Start with a fresh script and implement a function which computes the average of three given numbers.

More complex functions

The body of a function may be as complex and nested as code we implemented in the course before. The examples we've seen above had simple code blocks to focus on the important concepts.

An example for a more complex body of a function is:

def sign(x):
    if x > 0:
        result = 1
    elif x < 0:
        result = -1
    else: 
        result = 0
    return result

print(sign(3))
1

Another example is a function to compute the average of all numbers given as a list:

def average(numbers):
    if len(numbers) == 0:
        return None   # can't compute average of zero numbers
    result = 0
    for number in numbers:
        result += number
    return result / len(numbers)

print(average([1, 2, 3]))
2.0

Here we pass one single argument which is a list of three numbers (and not three arguments). So when the function body is executed, numbers is [1, 2, 3].
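
Because average may return None, the calling code has to be prepared for this case. The following is a minimal sketch of such a check; the variable names measurements and message are made up for illustration, and comparing with is None is the usual way to test for None:

```python
def average(numbers):
    if len(numbers) == 0:
        return None   # can't compute average of zero numbers
    result = 0
    for number in numbers:
        result += number
    return result / len(numbers)

measurements = []  # hypothetical example data, empty on purpose
avg = average(measurements)

if avg is None:
    message = "no data"
else:
    message = "average is " + str(avg)

print(message)
```

With the empty list this should display no data.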

Exercise block 4

  1. Write a function collatz(n) which prints the values from the Collatz iteration as we introduced in script 4. The function should return the number of iterations needed to reach 1.
  2. Rewrite the plotting exercise (we plotted the start value on the x-axis, the number of iterations on the y-axis) to use this function.
  3. Write a script (without functions) which starts with a list and computes the maximum number of the numbers in the list. Don't use the built in max function from the last script. Assume that the list is not empty. So you can assume that the first element is the maximum, then iterate over the remaining elements and update the max value if needed.
  4. Transform the previous solution into a function which returns the maximal value. Hint: start with def maximum(li):, finally print(maximum([1, 2, 1])) should display 2 in the console.

More about return

A function body may have more than one return statement. As soon as a return is executed the return value of the function is defined and the execution of the function's body stops (the Python interpreter "jumps back" to the function call).

So we could rewrite sign as:

def sign(x):
    if x > 0:
        return 1
    elif x < 0:
        return -1
    else: 
        return 0

print(sign(3))
1

Another example is:

def is_prime(number):
    if number < 2:
        return False   # 0, 1 and negative numbers are not prime
    for test_value in range(2, number):
        if number % test_value == 0:
            return False
    return True

print(is_prime(13))
print(is_prime(14))
True
False

  • Here the given number number is tested to see whether it can be divided by one of the values in the range 2 to number - 1.
  • If such a test succeeds, the return value is False (the given number cannot be a prime number) and execution of the function's body ends.
  • If the for loop finishes without finding a divisor we know that we detected a prime number and we return True.
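
The early exit can be made visible with a small sketch. The function is_prime_logged and the list tested are hypothetical names; the list only records which divisors were actually tried before return ended the loop:

```python
tested = []  # hypothetical list recording every divisor that was tried

def is_prime_logged(number):
    for test_value in range(2, number):
        tested.append(test_value)
        if number % test_value == 0:
            return False   # leaves the loop and the function immediately
    return True

result = is_prime_logged(15)
print(result)
print(tested)
```

For 15 the loop stops at the divisor 3, so only 2 and 3 should appear in tested although the range goes up to 14.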

Some functions return "nothing", which is the meaning of None in Python. The built-in function print, for example, has no real return value:

return_value_of_print = print("hi")
print("return value of print is", return_value_of_print)
hi
return value of print is None

You see that print is executed (we see the hi), but it returns None.

If we write a plain return this is the same as return None:

def nope():
    return

x = nope()
print(x)
None

And if the body of a function ends without a return this also is the same as return None:

def nope_2():
    print("nope")
    
x = nope_2()
print(x)
nope
None

Exercise block 5

  1. Type and execute the examples above

  2. Modify the body of the function computing the maximum of a list of numbers to return None if you provide an empty list.

  3. (optional) Write a function which computes the standard deviation of a given list of numbers. Take care: the standard deviation of an empty list should be None, and also test with a list containing only one element! $$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2}$$

Variable scope

If you use a variable within the body of a function or as a function argument, it will not interfere with variables outside of the definition.

The arguments in the function declaration and variables in the function body are "temporary" during function execution (imagine the function is executed "as a fresh script").

For example argument names are not defined outside the function:

def add(a, b):
    return a + b

print(a)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-20-13da4c413e4d> in <module>()
      2     return a + b
      3 
----> 4 print(a)

NameError: name 'a' is not defined

And they are not defined after the function call:

print(add(1, 2))
print(a)
3
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-21-f99a03746c4b> in <module>()
      1 print(add(1, 2))
----> 2 print(a)

NameError: name 'a' is not defined

And last but not least they will not overwrite anything:

a = 3
def inc(a):
    a = a + 1
    return a

b = inc(4)
print(a)
3

Here a still has the old value 3 and not 4 as it has during execution of the function's body.

The concept behind this is called variable scope, which means that the same name can refer to different "memory cells" depending on the execution context.

This scoping also applies for variables declared inside the body of a function:

added = -1

def avg(a, b, c):
    added = a + b + c
    return added / 3

print(added)
print(avg(1, 2, 3))
print(added)
-1
2.0
-1
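
Since an assignment inside the function body does not reach the outer variable, the way to get a value out is to return it and assign it outside. A minimal sketch, using a hypothetical function named total:

```python
added = -1

def total(a, b, c):
    added = a + b + c   # local variable, separate from the outer added
    return added

total(1, 2, 3)           # return value is discarded, outer added is untouched
after_plain_call = added

added = total(1, 2, 3)   # explicit assignment outside the function
print(after_plain_call, added)
```

This should display -1 6: only the assignment in the outside code changes the outer variable.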

Exercise block 6

  1. Type and execute the code examples above.

Functions as basic building blocks

Functions may call each other which makes them helpful to structure your code to become more readable and reusable.

def inc(a):
    return a + 1

def multiply(a, b):
    return a * b

def compute_something(a):
    return multiply(inc(a), a)

print(compute_something(6))
42
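
To follow how the nested call is evaluated, the same computation can be written with intermediate variables. This is only a sketch; the name compute_something_stepwise is made up for illustration:

```python
def inc(a):
    return a + 1

def multiply(a, b):
    return a * b

def compute_something_stepwise(a):
    incremented = inc(a)                 # for a = 6 this is 7
    product = multiply(incremented, a)   # for a = 6 this is 7 * 6
    return product

print(compute_something_stepwise(6))
```

This should again display 42.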

Exercise block 7

  1. Write a function min2 which returns the minimum of two given values using if. Don't use Python's built-in min function for this.
  2. Write a function min3 which computes the minimum of three given numbers without branching with if by calling the function from the previous exercise.
  3. Write a function min4 which computes the minimum of four given numbers without branching with if by calling the functions from the previous exercises.
  4. (optional) write a function which detects if all numbers in a given list are prime numbers, reuse the function which decides if a given number is a prime number.
  5. (optional) write a function which detects if a list contains at least one prime number, reuse the function which decides if a given number is a prime number.

Decomposing a given problem into small functions allows testing the distinct parts individually before you compose them into the final solution.
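
A sketch of what such individual testing could look like, using two hypothetical functions square and sum_of_squares. Each print should display True if the function returns the expected value:

```python
def square(x):
    return x * x

def sum_of_squares(a, b):
    return square(a) + square(b)

# Check the small part first ...
print(square(3) == 9)
print(square(-2) == 4)

# ... then the function composed from it.
print(sum_of_squares(3, 4) == 25)
```

If one of the lines displays False you know which part to look at before continuing.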

We will exercise this by rewriting some solutions of previous exercises:

Exercise block 8: rewrite Rock-Paper-Scissors

  1. Write a function ask_user having no argument which asks the user for their move until the input is valid. The return value is the input. Test this function!
  2. Write a function computer_move() which returns a random computer move "R", "S" or "P". Test this function !
  3. Write a function detect_winner which expects two moves (two strings of length 1) and returns one of the integer values -1, 0 or 1, indicating that the first player won, that it's a tie, or that the second player won. Test this function!
  4. Now rewrite your previous solution for the RSP game using these functions.

Exercise block 9: computing mass of a peptide

  1. To compute the molecular mass of an amino acid sequence complete the following code by implementing the missing functions:

    aa_to_mass = read_data("amino_acids.csv")
    print("mass of", valid_sequence, "is", compute_mass(valid_sequence, aa_to_mass))
    
    

    So

    • read_data returns a dictionary which maps one letter symbols of amino acids to their mass,
    • compute_mass finally computes the mass.
  2. (optional) Extend your code so that the following template works:

    aa_to_mass = read_data("amino_acids.csv")
    valid_sequence = ask_user_for_valid_sequence(aa_to_mass.keys())
    print("mass of", valid_sequence, "is", compute_mass(valid_sequence, aa_to_mass))
    
    

    So additionally

    • ask_user_for_valid_sequence(allowed) asks the user to input a sequence until it only contains characters from allowed.
    • implement a function is_valid(sequence, allowed) first. This function takes a string and a list and checks if all characters in sequence appear in allowed.

    Hint: to implement is_valid iterate over all characters in sequence. You can stop looping and return False as soon as you detect the first invalid symbol.

  3. Write a function compute_masses(fasta_file_name, aa_to_mass) which computes the mass of every sequence in a given FASTA file. Reuse the functions from the previous example.

  4. Extend this so that compute_masses(fasta_file_name, csv_file_name, aa_to_mass) creates a two column csv file having status lines in one column and the corresponding masses in the second column.