Example solutions for script 05_sequences

Exercise 2.2

txt = input("gimme some text: ")
if txt == txt.upper():
    print("all upper case !")
gimme some text: ABC
all upper case !

Exercise 2.3

seq = input("please enter sequence: ")
seq = seq.upper().replace(" ", "")

gc_count = seq.count("G") + seq.count("C")
print("GC content is", round(100 * gc_count / len(seq), 2), "%")
please enter sequence: tgc a A g CcC
GC content is 66.67 %

Exercise 2.4

seq = input("please enter sequence: ")

if seq.count("A") + seq.count("T") + seq.count("G") + seq.count("C") == len(seq):
    print("the sequence is valid")
else:
    print("the sequence contains invalid symbols")
please enter sequence: AGGZ
the sequence contains invalid symbols

Exercise 2.5

while True:
    
    seq = input("please enter sequence: ")

    if seq.count("A") + seq.count("T") + seq.count("G") + seq.count("C") == len(seq):
        break
    else:
        print("the sequence contains invalid symbols, try again !")
        
gc_count = seq.count("G") + seq.count("C")
print("GC content is", round(100 * gc_count / len(seq), 2), "%")
please enter sequence: AGGZ
the sequence contains invalid symbols, try again !
please enter sequence: AGGC
GC content is 75.0 %

Exercise 3.3

The program produces the same result: once found_invalid_pair has been set to False during the iterations, it can only be overwritten by False again, because no line in our code resets the value to True.

Using break we save some iterations.
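
The effect of break on the number of iterations can be made visible with a counter. The loop from the exercise is not reproduced here, so the following is only a sketch under assumptions: the validity check, the test input and the names is_valid, iterations_without_break and iterations_with_break are hypothetical and chosen for illustration.

```python
seq = "AZGCTTGA"  # hypothetical test input, invalid symbol at index 1

# variant 1: no break, the loop always runs to the end of the sequence
is_valid = True
iterations_without_break = 0
for i in range(len(seq)):
    iterations_without_break += 1
    if seq[i] not in "ATGC":
        # once False, the flag is never set back to True
        is_valid = False
valid_without_break = is_valid

# variant 2: break leaves the loop at the first invalid symbol
is_valid = True
iterations_with_break = 0
for i in range(len(seq)):
    iterations_with_break += 1
    if seq[i] not in "ATGC":
        is_valid = False
        break
valid_with_break = is_valid

print("without break:", valid_without_break, "after", iterations_without_break, "iterations")
print("with break:   ", valid_with_break, "after", iterations_with_break, "iterations")
```

Both variants report False for this input, but the second one stops after 2 instead of 8 iterations.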

Exercise 3.4

text = "racecar"
text_reversed = ""
for i in range(len(text)):
    text_reversed = text[i] + text_reversed
    
if text == text_reversed:
    print(text, "is a palindrome")
else:
    print(text, "is not a palindrome")
racecar is a palindrome

Exercise 3.5

text = "this is an example text"
space_count = 0
for i in range(len(text)):
    if text[i] == " ":
        space_count += 1
        
print(space_count, "spaces in given text")
4 spaces in given text

Exercise 3.6

sequence = "AGCCCGCAGC"
for i in range(len(sequence) - 1):
    if sequence[i:i+2] == "GC":
        print("found GC at position", i)
found GC at position 1
found GC at position 5
found GC at position 8

A solution without slices is:

sequence = "AGCCCGCAGC"
for i in range(len(sequence) - 1):
    if sequence[i] == "G" and sequence[i + 1] == "C":
        print("found GC at position", i)
found GC at position 1
found GC at position 5
found GC at position 8

Take care not to use range(len(sequence)) here, because sequence[i + 1] could then refer to a position past the end of the string and raise an IndexError.
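
The difference between indexing and slicing past the end of a string can be checked in isolation. This is a minimal sketch, not part of the original solution: the try / except construct and the names last and index_error_raised are assumptions introduced only to show the behaviour without stopping the program.

```python
sequence = "AGCCCGCAGC"
last = len(sequence) - 1  # index of the last symbol

# indexing one position past the end raises an IndexError
try:
    sequence[last + 1]
    index_error_raised = False
except IndexError:
    index_error_raised = True
print("IndexError raised:", index_error_raised)

# a slice that runs past the end is silently shortened instead
print(repr(sequence[last:last + 2]))
```

This is why the slice-based solution tolerates a range that is one step too long, while the index-based solution may not.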

Question:

  • Introduce this mistake and test with the inputs GC and CG. What do you observe? Why does the error occur for only one of the two inputs?

Exercise 3.7

seq = input("give me a sequence: ")

reverse_complement = ""

for i in range(len(seq)):
    
    symbol = seq[i]
    
    if symbol == "G":
        complement = "C"
    elif symbol == "C":
        complement = "G"
    elif symbol == "T":
        complement = "A"
    elif symbol == "A":
        complement = "T"
    else:
        print("found invalid symbol", symbol, "at position", i)
        # mark the invalid position; otherwise complement is undefined or stale here
        complement = "?"
    
    reverse_complement = complement + reverse_complement
    
print("the reverse complement of", seq, "is", reverse_complement)
    

Exercise 4.5

Similar to x += y, which is short for x = x + y, the statement x -= y is the same as x = x - y. We use both in the following solution.

We propose two solutions for the inverse transform, because rotating by -13 gives the same result as rotating by +13. Why is this the case?

plaintext = "ENCRYPTION WITH PYTHON"

shift = 13

encrypted = ""
for i in range(len(plaintext)):
    c = plaintext[i]
    code = ord(c)
       
    if 'A' <= c <= 'Z':
        code += shift
        if code > ord('Z'):
            code = code - 26
            
    encrypted = encrypted + chr(code)
    
print("rot", shift, "encryption of", repr(plaintext), "is", repr(encrypted))

# either we shift by +13 to compute the inverse transform 

decrypted = ""
for i in range(len(encrypted)):
    c = encrypted[i]
    code = ord(c)
    if 'A' <= c <= 'Z':
        code += 13
        if code > ord('Z'):
            code = code - 26
            
    decrypted = decrypted + chr(code)
    
print("rot", shift, "decryption of", repr(encrypted), "is", repr(decrypted))

# or we shift by -13:

decrypted = ""
for i in range(len(encrypted)):
    c = encrypted[i]
    code = ord(c)
    if 'A' <= c <= 'Z':
        code -= shift
        if code < ord('A'):
            code = code + 26
            
    decrypted = decrypted + chr(code)
    
print("rot", shift, "decryption of", repr(encrypted), "is", repr(decrypted))
rot 13 encryption of 'ENCRYPTION WITH PYTHON' is 'RAPELCGVBA JVGU CLGUBA'
rot 13 decryption of 'RAPELCGVBA JVGU CLGUBA' is 'ENCRYPTION WITH PYTHON'
rot 13 decryption of 'RAPELCGVBA JVGU CLGUBA' is 'ENCRYPTION WITH PYTHON'
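
One way to see why both inverse transforms agree: 13 + 13 = 26, the number of letters in the alphabet, so shifting by +13 and shifting by -13 end on the same letter. The sketch below expresses the rotation with the % (modulo) operator instead of the if checks used above. The helper function rotate is hypothetical and not part of the original solution, and it assumes that functions and % are already familiar.

```python
def rotate(text, shift):
    """Rotate the letters A-Z in text by shift positions, keep all other symbols."""
    result = ""
    for c in text:
        if "A" <= c <= "Z":
            # position in the alphabet (0..25) after the shift, wrapped with % 26
            position = (ord(c) - ord("A") + shift) % 26
            c = chr(ord("A") + position)
        result = result + c
    return result


encrypted = rotate("ENCRYPTION WITH PYTHON", 13)
print(encrypted)               # RAPELCGVBA JVGU CLGUBA
print(rotate(encrypted, 13))   # ENCRYPTION WITH PYTHON
print(rotate(encrypted, -13))  # ENCRYPTION WITH PYTHON
```

For any other shift the two directions differ: the inverse of a shift by 3, for example, is a shift by -3 or equivalently by 23.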