The following example creates a text file containing the two lines hi\n and ho\n:
fh = open("say_hi.txt", "w")
print("hi", file=fh)
print("ho", file=fh)
# always close a file because data may still be in a buffer
# instead of actually being written to the file:
fh.close()
open(path, mode) returns a file object (file handle) which we use to manipulate the given file.
mode may be "r", "w" or "a" for "reading", "writing" and "appending" text files (there are more modes if needed).
The named parameter file=fh when calling print redirects the output to the file.
fh.close() closes the file.
# this is not Python, but a Jupyter feature to show the content
# of a file:
!cat say_hi.txt
hi
ho
This "traditional way" is dangerous if you forget to close the file, or if an error interrupts program execution before the close method is called!
Background: Python and the operating system do not write to disk immediately when you call write, but collect data until an internal memory region (buffer) is full. So you never know exactly what is still in the buffer and what is already on disk. Only after closing the file or calling fh.flush() can you be sure that your data has been written out.
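The buffering described above can be observed with a small experiment. This is only a sketch: the file name buffer_demo.txt and the temporary directory are made up for the illustration, and the size measured before the flush depends on the buffer size of your Python installation (for such a small write it is typically 0).

```python
import os
import tempfile

# hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

fh = open(path, "w")
fh.write("hi\n")
# the data usually still sits in the buffer here, so the file looks empty:
size_before = os.path.getsize(path)
fh.flush()
# after flush the data has been handed over to the operating system:
size_after = os.path.getsize(path)
fh.close()

print("size before flush:", size_before)
print("size after flush:", size_after)
```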
with:
Since Python 2.5 the with statement is supported. This statement executes its body in a safe way, so that the file is always closed, even if an error occurs inside the body.
with open("say_hi.txt", "w") as fh:
    print("hi", file=fh)
    print("ho", file=fh)
!cat say_hi.txt
hi
ho
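To check that with really closes the file when something goes wrong, we can raise an error on purpose inside the body. A minimal sketch, assuming a made-up file name with_demo.txt in a temporary directory and a simulated ValueError:

```python
import os
import tempfile

# hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, "w") as fh:
        print("hi", file=fh)
        # simulate an error before the body is finished:
        raise ValueError("simulated error")
except ValueError:
    pass

# the file handle was closed although the body failed:
print(fh.closed)

# and the data written before the error arrived in the file:
with open(path, "r") as fh_check:
    content = fh_check.read()
print(repr(content))
```

The attribute closed of a file handle tells you whether the file has been closed.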
For text files there are two ways to read. The first one, readlines, returns the file line by line as a list of strings:
with open("say_hi.txt", "r") as fh:
    print(fh.readlines())
['hi\n', 'ho\n']
Comment: there is also a method called readline (no s at the end!) which reads only one line. So take care to use the right method name.
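The difference between the two methods is easy to see side by side. The following sketch writes its own small file first (the name readline_demo.txt is made up for this example) and then reads one line with readline and the remaining lines with readlines:

```python
import os
import tempfile

# hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "readline_demo.txt")

with open(path, "w") as fh:
    print("hi", file=fh)
    print("ho", file=fh)

with open(path, "r") as fh:
    first = fh.readline()   # one single line as a string
    rest = fh.readlines()   # all remaining lines as a list of strings

print(repr(first))
print(rest)
```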
A very convenient and readable feature of Python is that you can loop over the lines of a file using for:
with open("say_hi.txt", "r") as fh:
    for line in fh:
        print(line)
hi

ho

Why those empty lines? We can still access the last value of line:
print(repr(line))
'ho\n'
So line also contains the line break \n from the file. And as print automatically starts a new line when done, we get the extra empty lines.
To get rid of the \n we can use the .rstrip method of strings, which removes trailing whitespace (whitespace is any character which "you don't see", like regular spaces, tabs and line breaks):
with open("say_hi.txt", "r") as fh:
    for line in fh:
        line = line.rstrip()
        print(line)
hi
ho
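Note that rstrip without arguments removes all trailing whitespace, not only the line break. If trailing spaces or tabs matter, two possible alternatives are to strip only "\n" or to tell print not to add its own line break. A short sketch with a made-up example line:

```python
# example line with trailing space, tab and line break
line = "ho \t\n"

stripped_all = line.rstrip()          # removes space, tab and line break
stripped_newline = line.rstrip("\n")  # removes only the line break

print(repr(stripped_all))
print(repr(stripped_newline))

# alternative: keep the line as it is and suppress the line break of print
print(line, end="")
```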
Performance tip: use the for loop for iterating over a file. For huge files this only reads as many bytes as needed in every iteration and thus works for files which are larger than your computer's memory!
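As an illustration of this tip, the following sketch processes a file line by line and only keeps a running count and a running sum in memory, never the whole file. The file many_lines.txt with 1000 numbers is created just for this example; a real use case would of course be a much larger file.

```python
import os
import tempfile

# hypothetical demo file: one number per line
path = os.path.join(tempfile.mkdtemp(), "many_lines.txt")
with open(path, "w") as fh:
    for i in range(1000):
        print(i, file=fh)

line_count = 0
total = 0
with open(path, "r") as fh:
    # only one line at a time is held in memory:
    for line in fh:
        line_count += 1
        total += int(line)

print(line_count, total)
```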
Again: if you are not sure what an iterator produces you may use list (unless the iterator is infinite):
with open("say_hi.txt", "r") as fh:
    print(list(fh))
['hi\n', 'ho\n']
Working with multiple files at the same time:
with open("say_hi.txt", "r") as fh_in:
    with open("say_hi_upper.txt", "w") as fh_out:
        for line in fh_in:
            print(line.rstrip().upper(), file=fh_out)
# the same with a single with statement:
with open("say_hi.txt", "r") as fh_in, open("say_hi_upper.txt", "w") as fh_out:
    for line in fh_in:
        print(line.rstrip().upper(), file=fh_out)
!cat say_hi_upper.txt
HI
HO
If you work with csv files, do not implement your own reader and writer but use the csv module. There are some tricky corner cases (for example a cell which contains a "," or a line break "\n"), and there are some variations of the format (dialects).
This is not covered in this script, you can find instructions at https://pymotw.com/3/csv/index.html.
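To give an impression of the corner cases mentioned above, here is a small sketch which writes and reads back cells containing a comma and a line break. The file name tricky.csv and the example rows are made up; csv.writer, csv.reader and the newline="" argument of open are the standard way to use the csv module.

```python
import csv
import os
import tempfile

# hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "tricky.csv")

rows = [
    ["name", "comment"],
    ["alice", "likes a, b and c"],  # cell contains a comma
    ["bob", "two\nlines"],          # cell contains a line break
]

# newline="" lets the csv module handle line breaks itself
with open(path, "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerows(rows)

with open(path, "r", newline="") as fh:
    rows_read = list(csv.reader(fh))

print(rows_read == rows)
```

The csv module quotes the tricky cells when writing and removes the quotes again when reading, so the data survives the round trip unchanged.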
Another option is to install pandas, a library for handling so-called "data frames". pandas can read from and write to multiple sources, like csv and xlsx files, but also tables from relational databases.
!cat data/test.csv
a,b
1,3
2,4
3,5
We can load this file as a data frame and print it (more about pandas and data frames in later chapters):
import pandas as pd
data = pd.read_csv("data/test.csv", delimiter=",")
print(data)
   a  b
0  1  3
1  2  4
2  3  5
And we can write it in a different format:
data.to_csv("test2.csv", sep=";", index=False)
!cat test2.csv
a;b
1;3
2;4
3;5
For binary files, like images, the modes for opening are "rb", "wb" and "ab" (reading, writing and appending).
In addition to the methods we introduced above, the file handle has the methods read and write for interaction. These are mostly used for binary files and not for text files.
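A minimal sketch of read and write with a binary file follows. The file name demo.bin and the four example bytes are made up; in binary mode the methods work with bytes objects instead of strings.

```python
import os
import tempfile

# hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

payload = bytes([0, 1, 2, 255])

with open(path, "wb") as fh:
    # write returns the number of bytes written
    n_written = fh.write(payload)

with open(path, "rb") as fh:
    # read without argument returns the full content as bytes
    data = fh.read()

print(n_written)
print(data == payload)
```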
Read about the csv module and use it to write a 10 x 10 multiplication table to a csv file.