class: center, middle, inverse # Clean and maintainable code
source: gettyimages.com
--- class: center, middle, inverse # Goals of Programming and Priorities --- # The three laws of informatics (source: https://medium.com/@schemouil/rust-and-the-three-laws-of-informatics-4324062b322b) --- # The three laws of informatics (source: https://medium.com/@schemouil/rust-and-the-three-laws-of-informatics-4324062b322b) ### 1. Programs must be correct. --- # The three laws of informatics (source: https://medium.com/@schemouil/rust-and-the-three-laws-of-informatics-4324062b322b) ### 1. Programs must be correct. ### 2. Programs must be maintainable, except where it would conflict with the First Law. --- # The three laws of informatics (source: https://medium.com/@schemouil/rust-and-the-three-laws-of-informatics-4324062b322b) ### 1. Programs must be correct. ### 2. Programs must be maintainable, except where it would conflict with the First Law. ### 3. Programs must be efficient (**enough**), except where it would conflict with the First or Second Law. --- # 1. Correctness 1. A program produces correct results for **regular input**. --- # 1. Correctness 1. A program produces correct results for **regular input**. 2. A program produces consistent results for **corner case input**. --- # 1. Correctness 1. A program produces correct results for **regular input**. 2. A program produces consistent results for **corner case input**. 3. A program detects **invalid input** and reports it. --- # 1. Correctness 1. A program produces correct results for **regular input**. 2. A program produces consistent results for **corner case input**. 3. A program detects **invalid input** and reports it. 4. A program **does not crash**. --- # 2. Maintainability 1. Making changes is not a struggle. --- # 2. Maintainability 1. Making changes is not a struggle. 2. It is easy to find and fix bugs. --- # 2. Maintainability 1. Making changes is not a struggle. 2. It is easy to find and fix bugs. 3. Others can understand your program. --- # 3. Efficiency 1. Not part of this presentation. 
--- # 3. Efficiency 1. Not part of this presentation. 2. What is fast enough? --- # 3. Efficiency 1. Not part of this presentation. 2. What is fast enough? 3. Don't guess, measure first (use a **profiler**). --- # 3. Efficiency 1. Not part of this presentation. 2. What is fast enough? 3. Don't guess, measure first (use a **profiler**). 4. Optimization often needs good knowledge of the programming language. --- # 3. Efficiency 1. Not part of this presentation. 2. What is fast enough? 3. Don't guess, measure first (use a **profiler**). 4. Optimization often needs good knowledge of the programming language. 5. Optimization often needs good knowledge of algorithms and data structures. --- class: middle, center, inverse # Maintainability: # 1. General recommendations --- # General recommendations ## Understand what you do. --- # General recommendations ## Understand what you do. - Plan your program. --- # General recommendations ## Understand what you do. - Plan your program. - Don't program by coincidence. --- # General recommendations ## Understand what you do. - Plan your program. - Don't program by coincidence. - Understand bugs before you try to fix them. --- # General recommendations ## Understand what you do. - Plan your program. - Don't program by coincidence. - Understand bugs before you try to fix them. - Rethink what you do. --- # General recommendations ## Understand what you do. - Plan your program. - Don't program by coincidence. - Understand bugs before you try to fix them. - Rethink what you do. - Be brave enough to trash your program and start from scratch. --- # General recommendations, continued ## Some principles -
D.R.Y.
: Don't repeat yourself (reduce code duplication). --- # General recommendations, continued ## Some principles -
D.R.Y.
: Don't repeat yourself (reduce code duplication). -
K.I.S.S.
: Keep it simple, stupid = avoid unnecessary complexity. --- # General recommendations, continued ## Some principles -
D.R.Y.
: Don't repeat yourself (reduce code duplication). -
K.I.S.S.
: Keep it simple, stupid = avoid unnecessary complexity. - Separate configuration and code, e.g. no hard-coded file names in the script. --- class: middle, center, inverse # Maintainability: # 2. The power of readable code --- class: middle, center
source: https://i.imgflip.com/3dkn21.jpg
--- # Readability: Reduce mental overhead - Little scrolling needed. --- # Readability: Reduce mental overhead - Little scrolling needed. - Keep related code together. --- # Readability: Reduce mental overhead - Little scrolling needed. - Keep related code together. - Use empty lines to structure code. --- # Readability: Reduce mental overhead - Little scrolling needed. - Keep related code together. - Use empty lines to structure code. - Avoid very long lines. --- # Readability: Reduce mental overhead - Little scrolling needed. - Keep related code together. - Use empty lines to structure code. - Avoid very long lines. - Reduce nesting. --- # Readability: Reduce mental overhead - Little scrolling needed. - Keep related code together. - Use empty lines to structure code. - Avoid very long lines. - Reduce nesting. - Choose a code style and use it. --- # Code Style Example: Python's PEP 8 style guide --- # Code Style Example: Python's PEP 8 style guide Most important points from
PEP 8
: - Space after `,`, spaces around `+`, `*`, ... --- # Code Style Example: Python's PEP 8 style guide Most important points from
PEP 8
: - Space after `,`, spaces around `+`, `*`, ... - Maximum line length: 79 characters. --- # Code Style Example: Python's PEP 8 style guide Most important points from
PEP 8
: - Space after `,`, spaces around `+`, `*`, ... - Maximum line length: 79 characters. - Indent by multiples of four spaces. --- # Code Style Example: Python's PEP 8 style guide Most important points from
PEP 8
: - Space after `,`, spaces around `+`, `*`, ... - Maximum line length: 79 characters. - Indent by multiples of four spaces. - `snake_case` for functions and variable names. --- # Code Style Example: Python's PEP 8 style guide Most important points from
PEP 8
: - Space after `,`, spaces around `+`, `*`, ... - Maximum line length: 79 characters. - Indent by multiples of four spaces. - `snake_case` for functions and variable names. - `CamelCase` for class names. --- class: middle # Other style guides - R:
http://adv-r.had.co.nz/Style.html
- Java: https://google.github.io/styleguide/javaguide.html, https://www.oracle.com/technetwork/java/codeconventions-150003.pdf - Matlab:
https://www.mathworks.com/matlabcentral/fileexchange/2529-matlab-programming-style-guidelines
--- class: middle # Code style: tools Tools to reformat your code: - Python: `Black`. *PyCharm* has a built-in code formatter; Visual Studio Code can use `Black`. - R: `formatR`, `styler`. *RStudio* has a built-in code formatter. - Java: `google-java-format`, *Eclipse*, *IntelliJ*. - Matlab: https://github.com/davidvarga/MBeautifier --- class: center, middle # Code Styles ## Don't discuss details, do your work! --- class: middle, center, inverse # Maintainability: # 3. The power of good names --- class: center, middle
source: https://i.imgflip.com/3dlj31.jpg
--- ## Use names to reveal intention Is this program correct? ```python def compute(a, b): return a + b ``` --- ## Use names to reveal intention Is this program correct? ```python def compute(a, b): return a + b ``` With appropriate names: ```python def area_rectangle(height, width): return height + width ``` --- ## Good names reduce comments ```python time_tolerance = 60 # in seconds ``` --- ## Good names reduce comments ```python time_tolerance = 60 # in seconds ``` ```python time_tolerance_in_seconds = 60 ``` --- ## Good names reduce comments ```python time_tolerance = 60 # in seconds ``` ```python time_tolerance_in_seconds = 60 ``` https://en.wikipedia.org/wiki/Mars_Climate_Orbiter#/Cause_of_failure ($327.6 million lost)
source: Wikipedia
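---

## Good names can carry units

A minimal sketch of how unit suffixes in names make a unit mix-up visible where it happens. The function and variable names are invented for this slide; the constant is the standard pound-force to newton conversion factor.

```python
NEWTONS_PER_POUND_FORCE = 4.4482216152605


def pound_force_seconds_to_newton_seconds(impulse_lbf_s):
    # the name states both the expected input unit and the output unit
    return impulse_lbf_s * NEWTONS_PER_POUND_FORCE


impulse_lbf_s = 10.0
impulse_newton_s = pound_force_seconds_to_newton_seconds(impulse_lbf_s)
```

Passing `impulse_newton_s` to a function whose argument is called `impulse_lbf_s` now looks wrong when you read the call.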
--- ## Good names reduce comments ```python mem_used = mem_used / 1073741824 # bytes to gb ``` --- ## Good names reduce comments ```python mem_used = mem_used / 1073741824 # bytes to gb ``` Introduce a constant for such "magic numbers": ```python BYTES_PER_GB = 1073741824 ... mem_used = mem_used / BYTES_PER_GB ``` --- ## Good names reduce comments ```python mem_used = mem_used / 1073741824 # bytes to gb ``` Introduce a constant for such "magic numbers": ```python BYTES_PER_GB = 1073741824 ... mem_used = mem_used / BYTES_PER_GB ``` - Use capital letters for constants. - Constants also reduce *undetected* typing errors. --- ## Names should not encode types ```python names_list = read_names_as_list() for name in names_list: ... ``` --- ## Names should not encode types ```python names_list = read_names_as_list() for name in names_list: ... ``` ```python names = read_names() for name in names: ... ``` --- ## Names should not encode types ```python names_list = read_names_as_list() for name in names_list: ... ``` ```python names = read_names() for name in names: ... ``` If types change you will have to change your code, else your code will lie! --- ## Explanatory variables to express intent ```python if taste_score > 30: print("I like this") ``` --- ## Explanatory variables to express intent ```python if taste_score > 30: print("I like this") ``` ```python is_sweet = (taste_score > 30) if is_sweet: print("I like this") ``` --- ## Explanatory variables to reduce duplication ```python if width * height > 30: print("area", width * height, "is too large.") ``` --- ## Explanatory variables to reduce duplication ```python if width * height > 30: print("area", width * height, "is too large.") ``` ```python area = width * height if area > 30: print("area", area, "is too large.") ``` Now imagine more complicated expressions! --- class: middle ## If good names are not enough: write comments! - Don't comment the obvious. - Only comment the unexpected. 
- If you use a solution from the internet: add a comment with the link. - Check if your comment is understandable to others. --- class: middle ## **NOT LIKE THIS** ```python # open file: fh = open("sequence.txt") # read all lines: lines = fh.readlines() # compute number of sequences: number_of_sequences = len(lines) ``` --- class: middle, center, inverse # Maintainability: # 4. The power of writing functions --- class: middle, center
source: https://i.imgflip.com/3dljx0.jpg
--- ## Write many functions - D.R.Y: Functions avoid code duplication --- ## Write many functions - D.R.Y: Functions avoid code duplication - Readability: Functions can reduce nesting level --- ## Write many functions - D.R.Y: Functions avoid code duplication - Readability: Functions can reduce nesting level - Good function names reduce comments --- ## Write many functions - D.R.Y: Functions avoid code duplication - Readability: Functions can reduce nesting level - Good function names reduce comments - Well written functions simplify code structure --- ## Write many functions - D.R.Y: Functions avoid code duplication - Readability: Functions can reduce nesting level - Good function names reduce comments - Well written functions simplify code structure - Well written functions support re-usability --- ## Write many functions - D.R.Y: Functions avoid code duplication - Readability: Functions can reduce nesting level - Good function names reduce comments - Well written functions simplify code structure - Well written functions support re-usability - Well written functions support testability --- ## A function should only do "one thing"! ```python def ask_and_check_if_prime(): number = int(input("give me a number: ")) divisor = 2 while divisor * divisor <= number: if number % divisor == 0: print(number, "is not prime") return divisor += 1 print(number, "is prime") ``` --- ## A function should only do "one thing"! 
```python def ask_and_check_if_prime(): number = int(input("give me a number: ")) divisor = 2 while divisor * divisor <= number: if number % divisor == 0: print(number, "is not prime") return divisor += 1 print(number, "is prime") ``` ```python def is_prime(number): divisor = 2 while divisor * divisor <= number: if number % divisor == 0: return False divisor += 1 return True def ask_and_check_if_prime(): number = int(input("give me a number: ")) if is_prime(number): print(number, "is prime") else: print(number, "is not prime") ``` --- ## A function should have few arguments ```python def my_algorithm(input_data, n_iter, epsilon, tau, variant): ... ``` --- ## A function should have few arguments ```python def my_algorithm(input_data, n_iter, epsilon, tau, variant): ... ``` ```python def my_algorithm(input_data, configuration): n_iter = configuration["n_iter"] epsilon = configuration["epsilon"] tau = configuration["tau"] variant = configuration["variant"] ... ``` --- ## A function should have few arguments ```python distance = distance_3d(1, 2, 3, 3, 2, 1) ``` --- ## A function should have few arguments ```python def distance_3d(x0, y0, z0, x1, y1, z1): dx = x1 - x0 dy = y1 - y0 dz = z1 - z0 return (dx ** 2 + dy ** 2 + dz ** 2) ** .5 distance = distance_3d(1, 2, 3, 3, 2, 1) ``` --- ## A function should have few arguments ```python def distance_3d(x0, y0, z0, x1, y1, z1): dx = x1 - x0 dy = y1 - y0 dz = z1 - z0 return (dx ** 2 + dy ** 2 + dz ** 2) ** .5 distance = distance_3d(1, 2, 3, 3, 2, 1) ``` ```python from collections import namedtuple Point3D = namedtuple("Point3D", "x,y,z") def distance_3d(p1, p2): dx = p2.x - p1.x dy = p2.y - p1.y dz = p2.z - p1.z return (dx ** 2 + dy ** 2 + dz ** 2) ** .5 distance = distance_3d(Point3D(1, 2, 3), Point3D(3, 2, 1)) ``` --- ## A function should have few arguments ```python def distance_3d(x0, y0, z0, x1, y1, z1): dx = x1 - x0 dy = y1 - y0 dz = z1 - z0 return (dx ** 2 + dy ** 2 + dz ** 2) ** .5 distance = distance_3d(1, 2, 3, 3, 
2, 1) ``` ```python from collections import namedtuple Point3D = namedtuple("Point3D", "x,y,z") def distance_3d(p1, p2): dx = p2.x - p1.x dy = p2.y - p1.y dz = p2.z - p1.z return (dx ** 2 + dy ** 2 + dz ** 2) ** .5 distance = distance_3d(Point3D(1, 2, 3), Point3D(3, 2, 1)) ``` - R has named vectors and lists - Python 3.7+ also has `dataclass`es - Else: use objects --- ## The code of a function should operate on the same level of abstraction Avoid long functions with sections! Instead: ```python def workflow(): configuration = read_configuration() data = read_data(configuration) results = process(data, configuration) write_result(results, configuration) ``` --- ## The code of a function should operate on the same level of abstraction Avoid long functions with sections! Instead: ```python def workflow(): configuration = read_configuration() data = read_data(configuration) results = process(data, configuration) write_result(results, configuration) ``` Later we refine: ```python def read_data(configuration): input_path = configuration['input_file'] if input_path.endswith(".csv"): return read_csv(input_path) elif input_path.endswith(".xlsx"): return read_xlsx(input_path) else: raise NotImplementedError('do not know how to read {}'.format(input_path)) ``` --- ## The code of a function should operate on the same level of abstraction Avoid long functions with sections! Instead: ```python def workflow(): configuration = read_configuration() data = read_data(configuration) results = process(data, configuration) write_result(results, configuration) ``` Later we refine: ```python def read_data(configuration): input_path = configuration['input_file'] if input_path.endswith(".csv"): return read_csv(input_path) elif input_path.endswith(".xlsx"): return read_xlsx(input_path) else: raise NotImplementedError('do not know how to read {}'.format(input_path)) ``` Benefits: - Clear expression of the underlying approach. - Easier navigation within code (dive deep where you want to). 
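---

## One level of abstraction: a runnable sketch

A small self-contained variant of the refinement idea, assuming the input is already available as text. All function names here are hypothetical: `parse_table` only decides which parser to use, and each parser only knows its own format.

```python
import csv
import io


def parse_csv(text):
    # low level: knows the details of the CSV format
    return list(csv.reader(io.StringIO(text)))


def parse_tsv(text):
    # low level: knows the details of tab-separated lines
    return [line.split("\t") for line in text.splitlines()]


def parse_table(text, file_name):
    # high level: only chooses a parser based on the file name
    if file_name.endswith(".csv"):
        return parse_csv(text)
    if file_name.endswith(".tsv"):
        return parse_tsv(text)
    raise NotImplementedError("do not know how to read {}".format(file_name))


rows = parse_table("name,score\nada,3\n", "scores.csv")
```

Supporting another format means adding one parser and one branch; the parsers themselves stay untouched.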
--- ## Return early ```python def is_prime(number): if number >= 1: ... ... ... else: return False ``` --- ## Return early ```python def is_prime(number): if number >= 1: ... ... ... else: return False ``` Better: ```python def is_prime(number): if number < 1: return False ... ... ... ``` --- ## Return early ```python def is_prime(number): if number >= 1: ... ... ... else: return False ``` Better: ```python def is_prime(number): if number < 1: return False ... ... ... ``` Benefits: - reduces indentation - reduces scrolling --- ## Use break / continue early ```python for value in values: if value > 0: # a long code section follows ... ... ... ``` --- ## Use break / continue early ```python for value in values: if value > 0: # a long code section follows ... ... ... ``` Better: ```python for value in values: if value <= 0: continue # a long code section follows ... ... ... ``` --- class: middle, center, inverse # Maintainability: # 5. Be consistent --- ## Be consistent - Consistent naming (e.g. either `index` or `idx`, not both) --- ## Be consistent - Consistent naming (e.g. either `index` or `idx`, not both) - Consistent physical units --- ## Be consistent - Consistent naming (e.g. either `index` or `idx`, not both) - Consistent physical units - Consistent style --- ## Be consistent - Consistent naming (e.g. either `index` or `idx`, not both) - Consistent physical units - Consistent style - Consistent error handling (`return None` vs `raise ...`) **NOT LIKE THIS:** ```python def lookup_pubchem(keyword): ... if not_found: return None ... def lookup_kegg(keyword): ... if not_found: raise ValueError("not found") ... ``` --- class: center, middle Robert C. Martin: Clean Code
source: https://www.oreilly.com
--- class: inverse # **Take home messages**: --- class: inverse # **Take home messages**: ## - Understand what you do! --- class: inverse # **Take home messages**: ## - Understand what you do! ## - Write with style! --- class: inverse # **Take home messages**: ## - Understand what you do! ## - Write with style! ## - Think about good names --- class: inverse # **Take home messages**: ## - Understand what you do! ## - Write with style! ## - Think about good names ## - Write many good functions --- class: inverse # **Take home messages**: ## - Understand what you do! ## - Write with style! ## - Think about good names ## - Write many good functions ## - Be consistent --- class: inverse # **Take home messages**: ## - Understand what you do! ## - Write with style! ## - Think about good names ## - Write many good functions ## - Be consistent ## - Exercise and automate best practices --- class: center, middle, inverse # Questions?
source: https://i.imgflip.com/3dlo92.jpg
--- class: center, middle, inverse # Thanks for your attention!