💾File I/O (Input/Output)
In this tutorial, you'll learn about Python file operations. More specifically, opening a file, reading from it, writing into it, closing it, and various file methods that you should be aware of.
Files are named locations on disk to store related information. They are used to permanently store data in a non-volatile memory (e.g. hard disk).
Random Access Memory (RAM) is volatile: it loses its data when the computer is turned off. Files let us store data permanently so that it can be used later.
When we want to read from or write to a file, we need to open it first. When we are done, it needs to be closed so that the resources that are tied with the file are freed.
Hence, in Python, a file operation takes place in the following order:
1. Open a file
2. Read or write (perform the operation)
3. Close the file
Python has a built-in open() function to open a file. This function returns a file object, also called a handle, as it is used to read or modify the file accordingly.
We can specify the mode while opening a file. The mode states whether we want to read ('r'), write ('w') or append ('a') to the file. We can also specify whether to open the file in text mode ('t') or binary mode ('b').
The default is reading in text mode. In this mode, we get strings when reading from the file.
On the other hand, binary mode returns bytes and this is the mode to be used when dealing with non-text files like images or executable files.
Unlike in some other languages, the character 'a' does not imply the number 97 until it is encoded using ASCII (or an equivalent encoding).
Moreover, the default encoding is platform dependent. On Windows it is typically cp1252, while on Linux it is typically utf-8.
So, we must not rely on the default encoding either, or our code will behave differently on different platforms.
Hence, when working with files in text mode, it is highly recommended to specify the encoding type.
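A minimal sketch of these modes, assuming a hypothetical file named example.txt in a temporary directory. It writes one word with an explicit encoding and then reads it back in text mode and in binary mode:

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write text with an explicit encoding instead of the platform default.
f = open(path, mode="w", encoding="utf-8")
f.write("café")
f.close()

# Text mode ('r' is the same as 'rt'): reading returns a str.
f = open(path, mode="r", encoding="utf-8")
text = f.read()
f.close()

# Binary mode ('rb'): reading returns the raw, still-encoded bytes.
f = open(path, mode="rb")
raw = f.read()
f.close()

print(repr(text), repr(raw))
```

The string has four characters, while the bytes object has five elements, because 'é' occupies two bytes in utf-8.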
When we are done with performing operations on the file, we need to properly close the file.
Closing a file frees the resources that were tied to it. This is done using the close() method of the file object.
Python has a garbage collector to clean up unreferenced objects, but we must not rely on it to close the file.
Calling close() directly after the file operations is not entirely safe. If an exception occurs while we are performing some operation on the file, the code exits without closing the file.
A safer way is to use a try...finally block.
This way, we are guaranteeing that the file is properly closed even if an exception is raised that causes program flow to stop.
The best way to close a file is by using the with statement. This ensures that the file is closed when the block inside the with statement is exited.
We don't need to explicitly call the close() method. It is done internally.
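Both approaches can be sketched as follows. The file name example.txt and the temporary directory are assumptions made for illustration:

```python
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Option 1: try...finally closes the file even if the body raises.
f = open(path, "w", encoding="utf-8")
try:
    f.write("first line\n")
finally:
    f.close()
closed_after_finally = f.closed

# Option 2: the with statement calls close() when the block is exited.
with open(path, "a", encoding="utf-8") as g:
    g.write("second line\n")
closed_after_with = g.closed

print(closed_after_finally, closed_after_with)
```

The closed attribute of a file object reports whether the file has been closed, which makes it easy to check that both variants released the file.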
In order to write into a file in Python, we need to open it in write ('w'), append ('a') or exclusive creation ('x') mode.
We need to be careful with the 'w' mode, as it truncates the file if it already exists. As a result, all the previous data is erased.
Writing a string or sequence of bytes (for binary files) is done using the write() method. This method returns the number of characters written to the file.
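A minimal sketch of writing with write(). The file content is made up for illustration, and the script first switches to a temporary directory (an assumption, so that no real test.txt is overwritten):

```python
import os
import tempfile

# Assumption: work in a scratch directory so no existing test.txt is lost.
os.chdir(tempfile.mkdtemp())

with open("test.txt", "w", encoding="utf-8") as f:
    written = f.write("my first file\n")  # write() returns the character count
    f.write("This file\n")
    f.write("contains three lines\n")

# 'x' (exclusive creation) refuses to open a file that already exists.
try:
    open("test.txt", "x", encoding="utf-8")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

print(written, exclusive_failed)
```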
This program will create a new file named test.txt in the current directory if it does not exist. If it does exist, it is overwritten.
We must include the newline characters ourselves to distinguish the different lines.
To read a file in Python, we must open the file in reading r mode.
There are various methods available for this purpose. We can use the read(size) method to read at most size characters (or bytes, in binary mode). If the size parameter is not specified, it reads and returns everything up to the end of the file.
We can read the test.txt file we wrote in the above section in the following way:
We can see that the read() method returns a newline as '\n' . Once the end of the file is reached, we get an empty string on further reading.
We can change our current file cursor (position) using the seek() method. Similarly, the tell() method returns our current position (in number of bytes).
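The reading calls described above can be sketched like this. To stay self-contained, the sketch recreates test.txt with illustrative content in a temporary directory (both are assumptions):

```python
import os
import tempfile

# Assumption: recreate test.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
with open("test.txt", "w", encoding="utf-8") as f:
    f.write("my first file\nThis file\ncontains three lines\n")

with open("test.txt", "r", encoding="utf-8") as f:
    first = f.read(4)    # the first 4 characters
    second = f.read(4)   # the next 4 characters
    rest = f.read()      # everything up to the end of the file
    at_eof = f.read()    # empty string once the end is reached

    f.seek(0)            # move the cursor back to the beginning
    start = f.tell()     # position right after seeking to the start
    again = f.read(4)    # the first 4 characters once more

print(repr(first), repr(second), repr(at_eof), start)
```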
We can read a file line-by-line using a for loop. This is both efficient and fast.
In this program, the lines in the file itself include a newline character \n. So, we use the end parameter of the print() function to avoid two newlines when printing.
Alternatively, we can use the readline() method to read individual lines of a file. This method reads a file till the newline, including the newline character.
Lastly, the readlines() method returns a list of the remaining lines of the file. All these reading methods return empty values when the end of file (EOF) is reached.
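The three line-based approaches can be sketched together. The file content and the temporary directory are assumptions made for illustration:

```python
import os
import tempfile

# Assumption: recreate test.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
with open("test.txt", "w", encoding="utf-8") as f:
    f.write("my first file\nThis file\ncontains three lines\n")

# Iterate line by line; end="" avoids printing a second newline.
with open("test.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line, end="")

with open("test.txt", "r", encoding="utf-8") as f:
    first_line = f.readline()   # one line, including its newline
    remaining = f.readlines()   # list of the lines that are left
    eof_line = f.readline()     # empty string at EOF
    eof_lines = f.readlines()   # empty list at EOF
```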
There are various methods available with the file object. Some of them have been used in the above examples.
Here is the complete list of methods in text mode with a brief description:
If there are a large number of files to handle in our Python program, we can arrange our code within different directories to make things more manageable.
A directory or folder is a collection of files and subdirectories. Python has the os module that provides us with many useful methods to work with directories (and files as well).
We can get the present working directory using the getcwd() method of the os module.
This method returns the current working directory in the form of a string. We can also use the getcwdb() method to get it as a bytes object.
The extra backslash implies an escape sequence. The print() function will render this properly.
We can change the current working directory by using the chdir() method.
The new path that we want to change into must be supplied as a string to this method. We can use either the forward slash / or the backslash \ to separate the path elements.
When using backslashes, it is safer to escape them (\\) or to use a raw string.
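A minimal sketch of getcwd(), getcwdb() and chdir(). The target directory is a temporary one created only for illustration, and the Windows path in the comment is hypothetical:

```python
import os
import tempfile

original = os.getcwd()          # current working directory as a str
original_bytes = os.getcwdb()   # the same location as a bytes object

target = tempfile.mkdtemp()     # assumption: a scratch directory to move into
os.chdir(target)
moved = os.path.samefile(os.getcwd(), target)

os.chdir(original)              # go back to where we started
restored = os.path.samefile(os.getcwd(), original)

# On Windows, a hypothetical path could be written as
# "C:\\Python33" (escaped) or r"C:\Python33" (raw string).
print(original, moved, restored)
```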
All files and sub-directories inside a directory can be retrieved using the listdir() method.
This method takes in a path and returns a list of subdirectories and files in that path. If no path is specified, it returns the list of subdirectories and files from the current working directory.
We can make a new directory using the mkdir() method.
This method takes in the path of the new directory. If the full path is not specified, the new directory is created in the current working directory.
The rename() method can rename a directory or a file.
For renaming any directory or file, the rename() method takes in two basic arguments: the old name as the first argument and the new name as the second argument.
A file can be removed (deleted) using the remove() method.
Similarly, the rmdir() method removes an empty directory.
Note: The rmdir() method can only remove empty directories.
In order to remove a non-empty directory, we can use the rmtree() method inside the shutil module.
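The directory operations above can be sketched in one short script. All names (test, new_one, old.txt) are hypothetical, and everything happens inside a temporary directory:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch directory (assumption)
old_dir = os.path.join(base, "test")       # hypothetical names
new_dir = os.path.join(base, "new_one")

os.mkdir(old_dir)                          # create a directory
os.rename(old_dir, new_dir)                # rename it
listing = os.listdir(base)                 # entries inside base

file_path = os.path.join(new_dir, "old.txt")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("temporary\n")

try:
    os.rmdir(new_dir)                      # raises: directory is not empty
    rmdir_failed = False
except OSError:
    rmdir_failed = True

os.remove(file_path)                       # delete the file
os.rmdir(new_dir)                          # now the directory is empty
after_rmdir = os.listdir(base)

shutil.rmtree(base)                        # removes a directory and its contents
base_exists = os.path.exists(base)
```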
In this tutorial, you will learn to parse, read and write JSON in Python with the help of examples. Also, you will learn to convert JSON to dict and pretty print it.
JSON (JavaScript Object Notation) is a popular data format used for representing structured data. It's common to transmit and receive data between a server and web application in JSON format.
In Python, JSON exists as a string. For example:
It's also common to store a JSON object in a file. By convention, a JSON file is saved with the .json file extension.
To work with JSON (a string, or a file containing a JSON object), you can use Python's json module. You need to import the module before you can use it.
The json module makes it easy to parse JSON strings and files containing JSON objects.
Example 1: Python JSON to dict
You can parse a JSON string using the json.loads() method. When the JSON text is an object, the method returns a dictionary.
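A minimal sketch of this example. The JSON content is made up for illustration:

```python
import json

# A JSON document held in an ordinary Python string (illustrative data).
person = '{"name": "Bob", "languages": ["English", "French"]}'

person_dict = json.loads(person)

print(person_dict)
print(person_dict["languages"])
```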
Here, person is a JSON string, and person_dict is a dictionary.
Example 2: Python read JSON file
You can use the json.load() method to read a file containing a JSON object.
Suppose you have a file named person.json which contains a JSON object.
Here's how you can parse this file:
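A minimal sketch, assuming illustrative file content. To be self-contained, it first creates person.json in a temporary directory:

```python
import json
import os
import tempfile

# Assumption: create person.json in a scratch directory with sample content.
os.chdir(tempfile.mkdtemp())
with open("person.json", "w", encoding="utf-8") as f:
    f.write('{"name": "Bob", "languages": ["English", "French"]}')

# Parse the file into a Python dictionary.
with open("person.json", "r", encoding="utf-8") as f:
    data = json.load(f)

print(data)
```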
Here, we have used the open() function to read the JSON file. Then, the file is parsed using the json.load() method, which gives us a dictionary named data.
You can convert a dictionary to a JSON string using the json.dumps() method.
Example 3: Convert dict to JSON
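A minimal sketch with made-up data. The second call illustrates how a tuple, a bool and None are converted:

```python
import json

person_dict = {"name": "Bob", "age": 12, "children": None}
person_json = json.dumps(person_dict)
print(person_json)

# Tuples become arrays, True becomes true, None becomes null.
mixed_json = json.dumps({"ok": True, "pair": (1, 2), "missing": None})
print(mixed_json)
```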
Here's a table showing Python objects and their equivalent conversion to JSON.

| Python | JSON Equivalent |
| --- | --- |
| dict | object |
| list, tuple | array |
| str | string |
| int, float | number |
| True | true |
| False | false |
| None | null |
To write JSON to a file in Python, we can use the json.dump() method.
Example 4: Writing JSON to a file
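A minimal sketch of such a program. The dictionary content is illustrative, and the script runs in a temporary directory (an assumption) so that no existing person.txt is overwritten:

```python
import json
import os
import tempfile

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

person_dict = {
    "name": "Bob",
    "languages": ["English", "French"],
    "married": True,
    "age": 32,
}

# Serialize the dictionary straight into the file.
with open("person.txt", "w", encoding="utf-8") as json_file:
    json.dump(person_dict, json_file)

# Read the raw text back to see what was stored.
with open("person.txt", "r", encoding="utf-8") as json_file:
    stored_text = json_file.read()

print(stored_text)
```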
In the above program, we have opened a file named person.txt in writing mode using 'w'. If the file doesn't already exist, it will be created. Then, json.dump() transforms person_dict to a JSON string which will be saved in the person.txt file.
When you run the program, the person.txt file will be created. The file has the following text inside it.
To analyze and debug JSON data, we may need to print it in a more readable format. This can be done by passing the additional parameters indent and sort_keys to the json.dumps() and json.dump() methods.
Example 5: Python pretty print JSON
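A minimal sketch, using an illustrative JSON string:

```python
import json

person_string = '{"name": "Bob", "languages": "English", "numbers": [2, 1.6, null]}'

# Parse the string into a dictionary first.
person_dict = json.loads(person_string)

# Serialize it again with 4-space indentation and sorted keys.
pretty = json.dumps(person_dict, indent=4, sort_keys=True)
print(pretty)
```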
When you run the program, the output will be:
In the above program, we have used 4 spaces for indentation. And, the keys are sorted in ascending order. By the way, the default value of indent is None. And, the default value of sort_keys is False.
You can also define the separators. When indent is None, the default value is (", ", ": "), which means a comma and a space separate the items, and a colon and a space separate keys from values.
When you run the program, the output will be:
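A minimal sketch comparing the default separators with a compact pair. The data is made up, and the choice of (",", ":") is just one possible setting:

```python
import json

person = {"name": "Bob", "age": 32}

# Default separators: ", " between items and ": " between key and value.
default_text = json.dumps(person)

# Compact separators: no spaces at all.
compact_text = json.dumps(person, separators=(",", ":"))

print(default_text)
print(compact_text)
```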
In this tutorial, we will learn how to read and write into CSV files in Python with the help of examples.
The CSV (Comma Separated Values) format is one of the simplest and most common ways to store tabular data. By convention, a CSV file is saved with the .csv file extension.
Let's take an example:
If you open the above CSV file using a text editor such as Sublime Text, you will see:
As you can see, the elements of a CSV file are separated by commas. Here, the comma (,) is the delimiter. You can use any single character as your delimiter as per your needs.
Note: The csv module can also be used for other file extensions (like: .txt) as long as their contents are in proper structure.
While we could use the built-in open() function to work with CSV files in Python, there is a dedicated csv module that makes working with CSV files much easier.
Before we can use the functions of the csv module, we need to import it with import csv.
To read a CSV file in Python, we can use the csv.reader() function. Suppose we have a CSV file named people.csv in the current directory with the following entries.

| Name | Age | Profession |
| --- | --- | --- |
| Jack | 23 | Doctor |
| Miller | 22 | Engineer |

Let's read this file using csv.reader():
Example 1: Read CSV Having Comma Delimiter
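A minimal sketch of this example. To be self-contained, it first creates people.csv with the entries from the table in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: create people.csv in a scratch directory.
os.chdir(tempfile.mkdtemp())
with open("people.csv", "w", newline="", encoding="utf-8") as file:
    file.write("Name,Age,Profession\nJack,23,Doctor\nMiller,22,Engineer\n")

rows = []
# newline="" lets the csv module handle line endings itself.
with open("people.csv", "r", newline="", encoding="utf-8") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)        # each row is a list of strings
        rows.append(row)
```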
Here, we have opened the people.csv file in reading mode using the built-in open() function with mode 'r'.
Then, csv.reader() is used to read the file, which returns an iterable reader object.
The reader object is then iterated using a for loop to print the contents of each row.
In the above example, we are using the csv.reader() function in default mode for CSV files having a comma delimiter.
However, the function is much more customizable.
Suppose our CSV file was using tab as a delimiter. To read such files, we can pass optional parameters to the csv.reader() function. Let's take an example.
Example 2: Read CSV file Having Tab Delimiter
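A minimal sketch, assuming a hypothetical tab-separated file named people_tab.csv created in a temporary directory:

```python
import csv
import os
import tempfile

# Assumption: create a tab-separated file in a scratch directory.
os.chdir(tempfile.mkdtemp())
with open("people_tab.csv", "w", newline="", encoding="utf-8") as file:
    file.write("Name\tAge\tProfession\nJack\t23\tDoctor\n")

with open("people_tab.csv", "r", newline="", encoding="utf-8") as file:
    reader = csv.reader(file, delimiter="\t")
    tab_rows = list(reader)

print(tab_rows)
```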
Notice the optional parameter delimiter = '\t' in the above example.
The complete syntax of the csv.reader() function is csv.reader(csvfile, dialect='excel', **fmtparams).
As you can see from the syntax, we can also pass the dialect parameter to the csv.reader() function. A dialect groups several formatting options under one name, which makes the function more flexible.
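A minimal sketch of a custom dialect. The dialect name "pipes" and the sample data are hypothetical, and an in-memory io.StringIO object stands in for a real file:

```python
import csv
import io

# Register a named dialect: '|' as delimiter, spaces after it are skipped.
csv.register_dialect("pipes", delimiter="|", skipinitialspace=True)

# An in-memory text stream stands in for an opened file.
data = io.StringIO("Name| Age| Profession\nJack| 23| Doctor\n")

dialect_rows = list(csv.reader(data, dialect="pipes"))
print(dialect_rows)
```

Registering a dialect once is convenient when several readers and writers in a program share the same formatting options.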
To write to a CSV file in Python, we can use the csv.writer() function.
The csv.writer() function returns a writer object that converts the user's data into delimited strings. Each row is written to the CSV file using the writer's writerow() method. Let's take an example.
Example 3: Write to a CSV file
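A minimal sketch of such a program. The rows are illustrative, and the script runs in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

with open("protagonist.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["SN", "Movie", "Protagonist"])
    writer.writerow([1, "Lord of the Rings", "Frodo Baggins"])
    writer.writerow([2, "Harry Potter", "Harry Potter"])

# Read the file back as plain text to inspect the result.
with open("protagonist.csv", "r", newline="", encoding="utf-8") as file:
    written_lines = file.read().splitlines()

print(written_lines)
```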
When we run the above program, a protagonist.csv file is created with the following content:
In the above program, we have opened the file in writing mode.
Then, we have passed each row as a list. These lists are converted to a delimited string and written into the CSV file.
Example 4: Writing multiple rows with writerows()
If we need to write the contents of the 2-dimensional list to a CSV file, here's how we can do it.
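A minimal sketch with illustrative rows, run in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

csv_rows = [
    ["SN", "Movie", "Protagonist"],
    [1, "Lord of the Rings", "Frodo Baggins"],
    [2, "Harry Potter", "Harry Potter"],
]

with open("protagonist.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(csv_rows)   # writes every inner list as one row

with open("protagonist.csv", "r", newline="", encoding="utf-8") as file:
    read_back = list(csv.reader(file))

print(read_back)
```

Note that the numbers come back as strings, because a CSV file stores only text.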
The output of the program is the same as in Example 3.
Here, our 2-dimensional list is passed to the writer.writerows() method to write the content of the list to the CSV file.
Example 5: Writing to a CSV File with Tab Delimiter
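A minimal sketch with illustrative rows, run in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

with open("protagonist.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file, delimiter="\t")
    writer.writerow(["SN", "Movie", "Protagonist"])
    writer.writerow([1, "Lord of the Rings", "Frodo Baggins"])

with open("protagonist.csv", "r", newline="", encoding="utf-8") as file:
    tab_lines = file.read().splitlines()

print(tab_lines)
```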
Notice the optional parameter delimiter = '\t' in the csv.writer() function.
The complete syntax of the csv.writer() function is csv.writer(csvfile, dialect='excel', **fmtparams).
Similar to csv.reader(), you can also pass the dialect parameter to the csv.writer() function to make it much more customizable.
Objects of the csv.DictReader() class can be used to read a CSV file as a dictionary.
Example 6: Python csv.DictReader()
Suppose we have the same file people.csv as in Example 1.

| Name | Age | Profession |
| --- | --- | --- |
| Jack | 23 | Doctor |
| Miller | 22 | Engineer |

Let's see how csv.DictReader() can be used.
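A minimal sketch of this example. To be self-contained, it first creates people.csv with the entries from the table in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: create people.csv in a scratch directory.
os.chdir(tempfile.mkdtemp())
with open("people.csv", "w", newline="", encoding="utf-8") as file:
    file.write("Name,Age,Profession\nJack,23,Doctor\nMiller,22,Engineer\n")

dict_rows = []
with open("people.csv", "r", newline="", encoding="utf-8") as file:
    csv_file = csv.DictReader(file)
    for row in csv_file:
        print(dict(row))          # header entries are the keys
        dict_rows.append(dict(row))
```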
As we can see, the entries of the first row are the dictionary keys. And, the entries in the other rows are the dictionary values.
Here, csv_file is a csv.DictReader() object, which can be iterated over using a for loop. Before Python 3.8, csv.DictReader() returned an OrderedDict for each row, which is why we explicitly used dict() inside the for loop to convert each row to a dictionary.
Note: Starting from Python 3.8, csv.DictReader() returns a dictionary for each row, and we do not need to use dict() explicitly.
The full syntax of the csv.DictReader() class is csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds).
To learn more about it in detail, visit: Python csv.DictReader() class
Objects of the csv.DictWriter() class can be used to write to a CSV file from a Python dictionary.
The minimal syntax of the csv.DictWriter() class is csv.DictWriter(file, fieldnames).
Here,
file - the CSV file that we want to write to
fieldnames - a list object which should contain the column headers, specifying the order in which data should be written in the CSV file
Example 7: Python csv.DictWriter()
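A minimal sketch of such a program. The player names and ratings are hypothetical placeholders, and the script runs in a temporary directory (an assumption):

```python
import csv
import os
import tempfile

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

with open("players.csv", "w", newline="", encoding="utf-8") as file:
    fieldnames = ["player_name", "fide_rating"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    writer.writeheader()   # writes the column headers
    writer.writerow({"player_name": "Player One", "fide_rating": 2800})
    writer.writerow({"player_name": "Player Two", "fide_rating": 2750})

with open("players.csv", "r", newline="", encoding="utf-8") as file:
    player_lines = file.read().splitlines()

print(player_lines)
```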
The program creates a players.csv file with the following entries:
The full syntax of the csv.DictWriter() class is csv.DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds).
To learn more about it in detail, visit: Python csv.DictWriter() class
Pandas is a popular data science library in Python for data manipulation and analysis. If we are working with huge chunks of data, it's better to use pandas to handle CSV files for ease and efficiency.
Before we can use pandas, we need to install it. To learn more, visit: How to install Pandas?
Once it is installed, we can import pandas, conventionally under the alias pd.
To read the CSV file using pandas, we can use the read_csv() function.
Here, the program reads people.csv from the current directory.
To write to a CSV file, we need to call the to_csv() method of a DataFrame.
Here, we have created a DataFrame using pd.DataFrame(). Then, the to_csv() method of this object is called to write into person.csv.
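Both steps can be sketched together, assuming pandas is installed. The data is made up, the script runs in a temporary directory, and for brevity it reads back the person.csv file it has just written rather than a separate people.csv:

```python
import os
import tempfile

import pandas as pd

# Assumption: work in a scratch directory.
os.chdir(tempfile.mkdtemp())

# Build a DataFrame from illustrative rows and write it to person.csv.
df = pd.DataFrame([["Jack", 24], ["Rose", 22]], columns=["Name", "Age"])
df.to_csv("person.csv", index=False)   # index=False omits the row index

# Read the file back into a new DataFrame.
loaded = pd.read_csv("person.csv")
print(loaded)
```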