This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright (C) 2014-2023 Scientific IT Services of ETH Zurich,
Contributing Authors: Uwe Schmitt, Mikolaj Rybniski

2. String basics

Strings are defined using delimiters " or ' or """ or ''':

If you choose " as delimiter you may use ' in the string and the other way round.

print("hi, it's time to go")
hi, it's time to go
print('this is "a quote"')
this is "a quote"
long = """multi line string ...
it works"""

print(long)
multi line string ...
it works

The repr function shows the string the way it would be written in Python source code, including escape sequences such as \n. This is useful when debugging:

print(repr(long))
'multi line string ...\nit works'
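
As a further illustration, repr also makes invisible characters such as tabs and trailing spaces visible. The following is a minimal sketch; the variable names padded and shown are chosen only for this example:

```python
# Hypothetical example string with a tab, a trailing space and a newline.
padded = "value\t \n"

print(padded)        # tab, space and newline are hard to spot here
shown = repr(padded)
print(shown)         # the escape sequences \t and \n become visible
```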

Multi-line comments in Python

# this is a single line comment

print(3)

"""
this is a multi line comment
the comment ends here
"""

print(4)
3
4
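
Strictly speaking, the triple-quoted block above is not a comment but a string literal which Python evaluates and then discards. A minimal sketch to illustrate this, assuming we keep the block in a variable (text and kind are names made up for this example):

```python
# The same kind of block, but assigned to a variable instead of being discarded.
text = """
this looks like a comment
but it is a string
"""

kind = type(text).__name__
print(kind)        # name of the type of the object
print(repr(text))  # the line breaks are part of the string
```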

String "algebra"

print("3.1" + '41')
3.141
print(3 * "\\o/ ")  # the backslash is escaped, a bare "\o" triggers a warning in recent Python versions
\o/ \o/ \o/ 
print(len("12345"))
5

Creating strings using string interpolation (old-fashioned)

String interpolation replaces placeholders such as %s by given values. The expression

template % args

creates a new string by replacing the placeholders in template with the value(s) given in args:

name = "uwe"
greeting = "hi '%s' how do you do" % name
print(greeting)
hi 'uwe' how do you do

You can have multiple placeholders and arguments, but the number of placeholders and the number of arguments must be the same. For multiple arguments you have to enclose them in parentheses as shown below:

a = 1
b = 2
output = '%s plus %s is %s' % (a, b, a + b)
print(output)
1 plus 2 is 3
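
What happens if the numbers do not match can be sketched as follows. The try / except construct used to catch the error is not covered in this chapter and only serves to keep the example runnable:

```python
a = 1
b = 2

# Two placeholders but three values: Python raises a TypeError.
try:
    output = "%s plus %s" % (a, b, a + b)
except TypeError as error:
    output = "formatting failed: %s" % error

print(output)
```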

There are many different placeholders in Python, e.g. for formatting floats with given precision:

import math

print("pi up to 3 digits is %.3f" % math.pi)
pi up to 3 digits is 3.142
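
A few other commonly used placeholders, as a small sketch (the variable names and values are invented for illustration):

```python
count = 42
ratio = 0.5

as_integer = "%d items" % count        # %d formats an integer
padded = "%5d|" % count                # right-aligned in a field of width 5
as_hex = "%x" % 255                    # hexadecimal notation
percent = "%.1f%%" % (ratio * 100)     # %% produces a literal percent sign

print(as_integer)
print(padded)
print(as_hex)
print(percent)
```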

More details at https://docs.python.org/3/library/stdtypes.html#string-formatting-operations

String formatting (more modern)

print("{} = {} + {}".format(a + b, a, b))
print("{2} = {0} + {1}".format(a, b, a+b))
print("{c} = {a} + {b}".format(a=a, b=b, c=a+b))
3 = 1 + 2
3 = 1 + 2
3 = 1 + 2
print("{pi:.3f}".format(pi=math.pi))
3.142
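
The part after the colon is a format specification. Besides the precision it can, for example, control width and alignment. A minimal sketch with made-up values:

```python
import math

# < aligns left, > aligns right, ^ centers; the number is the field width.
row = "{:<8}|{:>6}|{:^7}|".format("left", "right", "mid")
print(row)

# Total width 8, padded with leading zeros, 3 digits after the decimal point.
padded_pi = "{:08.3f}".format(math.pi)
print(padded_pi)

# A comma inserts thousands separators.
grouped = "{:,}".format(1234567)
print(grouped)
```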

Many more options, see https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3 and the cheat sheet at https://pyformat.info/

String formatting (very modern!)

Python 3.6 introduced so-called f-strings which can directly access variables and also evaluate expressions:

a = 3
b = 4
print(f"{a} + {b} = {a + b}")
3 + 4 = 7

You can also specify formats:

import math
print(f"pi with 3 digits is {math.pi:.3f}")
pi with 3 digits is 3.142
print(f"{a=}")
a=3
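
The trailing = in the last example is a debugging aid available since Python 3.8: it prints the expression text followed by its value. A short sketch showing that it also works for expressions and can be combined with a format:

```python
a = 3
b = 4

name_and_value = f"{a=}"        # expression text, then its value
sum_text = f"{a + b=}"          # whitespace inside the braces is kept
ratio_text = f"{a / b=:.2f}"    # a format can follow after a colon

print(name_and_value)
print(sum_text)
print(ratio_text)
```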

Recommendation: To keep code readable, I prefer format when the expressions used inside an f-string become complicated.

String methods

The word "method" is a term from the field of object oriented programming.

Many string operations are "attached" to string objects as methods.

Python strings are immutable ("const"), so string methods never change the string object in place. For example, the following upper method creates and returns a new string:

# transforms string "hello" to a new string:
greeting = "hello"
print(greeting.upper())
print(greeting)        # unchanged !
HELLO
hello

Method calls can be chained. For example this startswith method ...

print("hi you".startswith("hi"))
True

... can be called on the result of upper():

print("hi you".upper().startswith("HI"))
True
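
A chain is evaluated from left to right, and every call works on the new string returned by the previous call. The following sketch (with an invented input string) compares a chain to the equivalent step-by-step version:

```python
raw = "  Hi You \n"

# Chained: strip whitespace, then lower case, then replace spaces.
cleaned = raw.strip().lower().replace(" ", "_")

# The same computation written step by step.
step1 = raw.strip()
step2 = step1.lower()
step3 = step2.replace(" ", "_")

print(cleaned)
print(cleaned == step3)
print(repr(raw))  # the original string is unchanged
```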

Overview of available string methods

Python offers many different string methods. Jupyter supports autocompletion, which helps to discover the available methods.

Type str. and then press the TAB key to see the available string methods.

Some useful string methods:

  • count(substring) counts non-overlapping occurrences of substring,
  • replace(a_string, b_string) replaces all occurrences of a_string by b_string,
  • lower() and upper() convert all characters to lower case and upper case, respectively,
  • strip() removes all whitespace (space, tab and new line characters) from both ends of the string,
  • strip(characters) removes all single characters occurring in characters from both ends of the string,
  • lstrip() as strip() but only from the beginning of the string,
  • rstrip() as strip() but only from the end of the string,
  • startswith(txt) checks if the given string starts with txt,
  • endswith(txt) checks if the given string ends with txt,
  • split(txt) (explained later in the chapter about lists).
  • join(..) (also explained later in the chapter about lists).
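
Some of the methods listed above in action, as a minimal sketch with an invented example string:

```python
text = "  Hello World, hello Python!  \n"

stripped = text.strip()                     # whitespace removed at both ends
print(repr(stripped))
print(repr(text.lstrip()))                  # only the beginning is stripped
print(repr(text.rstrip()))                  # only the end is stripped

print(stripped.count("hello"))              # counting is case sensitive
print(stripped.lower().count("hello"))      # after lower() both words match
print("aaaa".count("aa"))                   # non-overlapping occurrences only

replaced = stripped.replace("hello", "bye")
print(replaced)

print("xyhiyx".strip("xy"))                 # strips the characters x and y
print(stripped.startswith("Hello"), stripped.endswith("!"))
```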

You can find a more complete list at https://www.shortcutfoo.com/app/dojos/python-strings/cheatsheet

String "slicing"

Use [..] to access parts of a string; counting starts at 0.

print("Python"[1])
y

Negative indices count from the end: -1 is the last character, -2 the character before the last character, and so on:

print("Python"[-2])
o

To access substrings we use the so-called slicing notation [m:n]: the first value is the start index, the second one is the end index, and the end index is exclusive:

print("Python"[2:4])
th

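Why exclusive upper limits?

Exclusive upper limits keep the arithmetic simple. The following relations hold for slicing (for non-negative indices within the string, with n <= m and i <= j <= k):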

  1. len(a[n:m]) == m - n
  2. a[i:j] + a[j:k] == a[i:k].
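
Both relations can be checked with a short sketch. The string and the index values are chosen arbitrarily for illustration:

```python
a = "Python"
i = 1
j = 3
k = 5

# Relation 1: the length of a slice is the difference of its limits.
length_ok = len(a[i:k]) == k - i

# Relation 2: adjacent slices join without gap or overlap.
join_ok = a[i:j] + a[j:k] == a[i:k]

print(a[i:j], a[j:k], a[i:k])
print(length_ok, join_ok)
```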

Some other examples for slicing:

print("Python"[1:-1])
ytho

Short forms (an omitted start index means the beginning of the string, an omitted end index means its end):

print("Python"[:2])
Py
print("Python"[2:])
thon

Slice limits may exceed the length of the string:

"abc"[1:5]
'bc'
"abc"[5:7]
''
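
This tolerance only applies to slices. Accessing a single character with an index beyond the end raises an IndexError, as the following sketch shows (try / except is used here only to keep the example runnable):

```python
word = "abc"

tail = word[1:5]      # the slice is silently cut off at the end
print(repr(tail))

try:
    word[5]           # a single index must be valid
except IndexError as error:
    message = str(error)

print(message)
```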

Slicing also supports a third value for specifying a step size:

letters = "abcdefghijkl"
print(letters[1:10:3])
beh

A negative step size also works:

print(letters[4:2:-1])
ed
print(letters[::-1])
lkjihgfedcba

Strings are immutable

You cannot modify a string in place; instead you have to create a new one!

So letters[3] = "F" will not work. Instead you have to write:

letters = letters[:3] + "F" + letters[4:]
print(letters)
abcFefghijkl
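
What "will not work" means in practice can be sketched as follows: the assignment raises a TypeError and the string stays unchanged (try / except is used here only to keep the example runnable):

```python
letters = "abcdefghijkl"

try:
    letters[3] = "F"              # item assignment is not allowed for strings
except TypeError as error:
    message = str(error)

print(message)
print(letters)                    # still the original string

# Building a new string from slices works.
modified = letters[:3] + "F" + letters[4:]
print(modified)
```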

Exercise session 2

Check questions

Try to predict the values of the variables in the following snippet using pen and paper. Use help(str.rstrip) or the internet to look up the methods used. You may have to reread previous explanations.

Finally use Python to validate your results.

values = "012" * 3 + """'a'bc"""
a = values[:2] + values[0] + values[2:3:1]  + values[-1]
a = a.replace("c", "")
b = a + values[len(values) - 2].upper()
c = a.strip('0')
d = a.find("2")
e = "{2} / {1} / {0}".format(a[:3], a[3:5], a[5:])
f = f"b{a}"

Homework

  1. Say the following sentence aloud 10 times:

    Indexing in Python starts with zero, upper limits are exclusive, negative indexes start at the end.

  2. Implement a simple encryption method: For a given string (e.g. "Python"):

    1. select all characters at even positions 0, 2, ... (here: "Pto")
    2. select all characters at odd positions 1, 3, ... (here: "yhn")
    3. concatenate both results (here: "Ptoyhn")
    4. reverse the result (here: "nhyotP")
    5. insert an "x" after the first character (here: "nxhyotP")
    6. finally append an "x" (here: "nxhyotPx")

    The task is to ask the user for some input and print the encoded string. Check for some inputs if your result is correct.

Optional homework*

  1. Extend this method such that "h" and "t" are swapped. The result for the previous example would be "nxtyohPx". To facilitate this you can assume that the input does not contain "." characters.