💾File I/O (Input/Output)

In this tutorial, you'll learn about Python file operations. More specifically, opening a file, reading from it, writing into it, closing it, and various file methods that you should be aware of.

Files

Files are named locations on disk to store related information. They are used to permanently store data in a non-volatile memory (e.g. hard disk).

Since Random Access Memory (RAM) is volatile and loses its data when the computer is turned off, we use files to store data permanently for future use.

When we want to read from or write to a file, we need to open it first. When we are done, it needs to be closed so that the resources that are tied with the file are freed.

Hence, in Python, a file operation takes place in the following order:

Open a file
Read or write (perform operation)
Close the file
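
The three steps above can be sketched as follows. This is only a minimal illustration: the file name demo.txt and the temporary directory are assumptions made for the example, chosen so that no existing file is touched.

```python
import os
import tempfile

# Work inside a throwaway directory so no real file is affected.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    # 1. Open the file (write mode creates it).
    f = open(path, "w", encoding="utf-8")
    # 2. Perform the operation.
    f.write("hello\n")
    # 3. Close the file to release its resources.
    f.close()

    is_closed = f.closed  # True once close() has been called
```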

Opening Files in Python

Python has a built-in open() function to open a file. This function returns a file object, also called a handle, as it is used to read or modify the file accordingly.

>>> f = open("test.txt")	# open file in current directory
>>> f = open("C:/Python38/README.txt") # specifying full path

We can specify the mode while opening a file. In mode, we specify whether we want to read r , write w or append a to the file. We can also specify if we want to open the file in text mode or binary mode.

The default is reading in text mode. In this mode, we get strings when reading from the file.

On the other hand, binary mode returns bytes and this is the mode to be used when dealing with non-text files like images or executable files.

f = open("test.txt")	# equivalent to 'r' or 'rt'
f = open("test.txt",'w') # write in text mode
f = open("img.bmp",'r+b') # read and write in binary mode

Unlike other languages, the character a does not imply the number 97 until it is encoded using ASCII (or other equivalent encodings).

Moreover, the default encoding is platform dependent. On Windows it is often cp1252, while on Linux it is usually utf-8.

So, we must also not rely on the default encoding, or else our code will behave differently on different platforms.

Hence, when working with files in text mode, it is highly recommended to specify the encoding type.

f = open("test.txt", mode='r', encoding='utf-8')
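
As a rough illustration of the difference between text mode and binary mode, the sketch below writes a string that contains a non-ASCII character and reads it back both ways. The file name encoding_demo.txt and the sample text are assumptions made for this example.

```python
import os
import tempfile

text = "café\n"  # sample text with one non-ASCII character

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoding_demo.txt")  # hypothetical name

    # newline="" disables newline translation, so the bytes on disk
    # are the same on every platform.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    # Text mode decodes the bytes and returns a str.
    with open(path, "r", encoding="utf-8", newline="") as f:
        as_text = f.read()

    # Binary mode returns the raw bytes without decoding.
    with open(path, "rb") as f:
        as_bytes = f.read()
```

Here as_text is the original string, while as_bytes holds the UTF-8 encoded form, in which the character é takes two bytes.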

Closing Files in Python

When we are done with performing operations on the file, we need to properly close the file.

Closing a file will free up the resources that were tied with the file. It is done using the close() method available in Python.

Python has a garbage collector to clean up unreferenced objects but we must not rely on it to close the file.

f = open("test.txt", encoding = 'utf-8')
# perform file operations
f.close()

This method is not entirely safe. If an exception occurs when we are performing some operation with the file, the code exits without closing the file.

A safer way is to use a try...finally block.

f = open("test.txt", encoding = 'utf-8')
try:
    # perform file operations
finally:
    f.close()

This way, we are guaranteeing that the file is properly closed even if an exception is raised that causes program flow to stop.

The best way to close a file is by using the with statement. This ensures that the file is closed when the block inside the with statement is exited.

We don't need to explicitly call the close() method. It is done internally.

with open("test.txt", encoding = 'utf-8') as f:
    # perform file operations
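
To see that the with statement closes the file even when something goes wrong, consider the following sketch. The file name with_demo.txt and the deliberately raised ValueError are assumptions made purely for demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_demo.txt")  # hypothetical name

    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("partial data\n")
            raise ValueError("simulated failure")  # deliberate error
    except ValueError:
        pass  # swallowed only for the sake of this demo

    # The with block has already closed the file, despite the exception.
    closed_after_error = f.closed
```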

Writing to Files in Python

In order to write into a file in Python, we need to open it in write w , append a or exclusive creation x mode.

We need to be careful with the w mode, as it will overwrite into the file if it already exists. Due to this, all the previous data are erased.

Writing a string or sequence of bytes (for binary files) is done using the write() method. This method returns the number of characters written to the file.

with open("test.txt",'w',encoding = 'utf-8') as f:
    f.write("This is my first file\n")
    f.write("This file\n")
    f.write("contains three lines\n")

This program will create a new file named test.txt in the current directory if it does not exist. If it does exist, it is overwritten.

We must include the newline characters ourselves to distinguish the different lines.
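
The difference between the w, a and x modes can be sketched as below. The file name modes_demo.txt is an assumption for this example, and the code runs in a temporary directory so that nothing is overwritten by accident.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical name

    # 'w' creates the file (or truncates it if it already exists).
    with open(path, "w", encoding="utf-8") as f:
        chars_written = f.write("first line\n")  # write() returns a count

    # 'a' keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second line\n")

    # 'x' refuses to open a file that already exists.
    try:
        with open(path, "x", encoding="utf-8") as f:
            f.write("never written\n")
        x_mode_failed = False
    except FileExistsError:
        x_mode_failed = True

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
```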

Reading Files in Python

To read a file in Python, we must open the file in reading r mode.

There are various methods available for this purpose. We can use the read(size) method to read up to size characters (or bytes in binary mode). If the size parameter is not specified, it reads and returns everything up to the end of the file.

We can read the test.txt file we wrote in the above section in the following way:

>>> f = open("test.txt",'r',encoding = 'utf-8')
>>> f.read(4)	# read the first 4 characters
'This'

>>> f.read(4)	# read the next 4 characters
' is '

>>> f.read()	# read the rest, up to the end of the file
'my first file\nThis file\ncontains three lines\n'

>>> f.read()	# further reading returns an empty string
''

We can see that the read() method returns a newline as '\n' . Once the end of the file is reached, we get an empty string on further reading.

We can change our current file cursor (position) using the seek() method. Similarly, the tell() method returns our current position (in number of bytes).

>>> f.tell()	# get the current file position
56

>>> f.seek(0)	# bring file cursor to initial position
0

>>> print(f.read()) # read the entire file
This is my first file
This file
contains three lines

We can read a file line-by-line using a for loop. This is both efficient and fast.

>>> for line in f:
...	print(line, end = '')
...
This is my first file
This file
contains three lines

In this program, the lines in the file itself include a newline character \n . So, we use the end parameter of the print() function to avoid two newlines when printing.

Alternatively, we can use the readline() method to read individual lines of a file. This method reads a file till the newline, including the newline character.

>>> f.readline()
 'This is my first file\n'

>>> f.readline()
 'This file\n'

>>> f.readline()
 'contains three lines\n'

>>> f.readline()
 ''

Lastly, the readlines() method returns a list of the remaining lines in the file. All these reading methods return empty values when the end of file (EOF) is reached.

>>> f.readlines()
['This is my first file\n', 'This file\n', 'contains three lines\n']

Python File Methods

There are various methods available with the file object. Some of them have been used in the above examples.

Here is the complete list of methods in text mode with a brief description:

Python Directory and Files Management

Python Directory

If there are a large number of files to handle in our Python program, we can arrange our code within different directories to make things more manageable.

A directory or folder is a collection of files and subdirectories. Python has the os module that provides us with many useful methods to work with directories (and files as well).

Get Current Directory

We can get the present working directory using the getcwd() method of the os module.

This method returns the current working directory in the form of a string. We can also use the getcwdb() method to get it as a bytes object.

>>> import os

>>> os.getcwd()
 'C:\\Program Files\\PyScripter'

>>> os.getcwdb()
 b'C:\\Program Files\\PyScripter'

The doubled backslash is an escape sequence that stands for a single backslash. The print() function will render this properly.

>>> print(os.getcwd())
C:\Program Files\PyScripter

Changing Directory

We can change the current working directory by using the chdir() method.

The new path that we want to change into must be supplied as a string to this method. We can use either the forward slash / or the backslash \ to separate the path elements.

It is safer to escape the backslash (write it as \\) when using it in a string.

>>> os.chdir('C:\\Python33')

>>> print(os.getcwd())
 C:\Python33

List Directories and Files

All files and sub-directories inside a directory can be retrieved using the listdir() method.

This method takes in a path and returns a list of subdirectories and files in that path. If no path is specified, it returns the list of subdirectories and files from the current working directory.

>>> print(os.getcwd())
 C:\Python33

>>> os.listdir()
 ['DLLs',
 'Doc',
 'include',
 'Lib',
 'libs',
 'LICENSE.txt',
 'NEWS.txt',
 'python.exe',
 'pythonw.exe',
 'README.txt',
 'Scripts',
 'tcl',
 'Tools']

>>> os.listdir('G:\\')
 ['$RECYCLE.BIN',
 'Movies',
 'Music',
 'Photos',
 'Series',
 'System Volume Information']

Making a New Directory

We can make a new directory using the mkdir() method.

This method takes in the path of the new directory. If the full path is not specified, the new directory is created in the current working directory.

>>> os.mkdir('test')

>>> os.listdir()
 ['test']

Renaming a Directory or a File

The rename() method can rename a directory or a file.

For renaming any directory or file, the rename() method takes in two basic arguments: the old name as the first argument and the new name as the second argument.

>>> os.listdir()
 ['test']

>>> os.rename('test','new_one')

>>> os.listdir()
 ['new_one']

Removing Directory or File

A file can be removed (deleted) using the remove() method.

Similarly, the rmdir() method removes an empty directory.

>>> os.listdir()
 ['new_one', 'old.txt']

>>> os.remove('old.txt')
>>> os.listdir()
 ['new_one']

>>> os.rmdir('new_one')
>>> os.listdir()
 []

Note : The rmdir() method can only remove empty directories.

In order to remove a non-empty directory, we can use the rmtree() method inside the shutil module.

>>> os.listdir()
 ['test']

>>> os.rmdir('test')
Traceback (most recent call last):
...
OSError: [WinError 145] The directory is not empty: 'test'

>>> import shutil

>>> shutil.rmtree('test')
>>> os.listdir()
[]
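
The directory operations from this section can be combined into one self-contained script. The sketch below is only an illustration: it works inside a scratch directory created with tempfile.mkdtemp(), and it reuses the names test, new_one and old.txt from the examples above.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()  # scratch directory for the demo

# Make a directory, then rename it.
os.mkdir(os.path.join(base, "test"))
os.rename(os.path.join(base, "test"), os.path.join(base, "new_one"))

# Put a file inside so that the directory is no longer empty.
with open(os.path.join(base, "new_one", "old.txt"), "w", encoding="utf-8") as f:
    f.write("some data\n")

after_rename = os.listdir(base)

# rmdir() only removes empty directories, so this attempt fails.
try:
    os.rmdir(os.path.join(base, "new_one"))
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# rmtree() removes the directory together with its contents.
shutil.rmtree(os.path.join(base, "new_one"))
after_rmtree = os.listdir(base)

os.rmdir(base)  # the scratch directory is empty again, so rmdir() works
```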

Click for further reading on Working With Files in Python.

Python JSON

In this tutorial, you will learn to parse, read and write JSON in Python with the help of examples. Also, you will learn to convert JSON to dict and pretty print it.

JSON (JavaScript Object Notation) is a popular data format used for representing structured data. It's common to transmit and receive data between a server and web application in JSON format.

In Python, JSON exists as a string. For example:

p = '{"name": "Bob", "languages": ["Python", "Java"]}'

It's also common to store a JSON object in a file. By convention, such a file is saved with the .json file extension.

Import JSON Module

To work with JSON (string, or file containing JSON object), you can use Python's json module. You need to import the module before you can use it.

import json

Parse JSON in Python

The json module makes it easy to parse JSON strings and files containing a JSON object.

Example 1: Python JSON to dict

You can parse a JSON string using the json.loads() method. For a JSON object, the method returns a dictionary.

import json

person = '{"name": "Bob", "languages": ["English", "French"]}'
person_dict = json.loads(person)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(person_dict)

# Output: ['English', 'French']
print(person_dict['languages'])

Here, person is a JSON string, and person_dict is a dictionary.

Example 2: Python read JSON file

You can use the json.load() method to read a file containing a JSON object.

Suppose, you have a file named person.json which contains a JSON object.

{
    "name": "Bob",
    "languages": ["English", "French"]
}

Here's how you can parse this file:

import json

with open('path_to_file/person.json') as f:
  data = json.load(f)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(data)

Here, we have used the open() function to read the json file. Then, the file is parsed using json.load() method which gives us a dictionary named data.

Python Convert to JSON string

You can convert a dictionary to JSON string using json.dumps() method.

Example 3: Convert dict to JSON

import json

person_dict = {
    'name': 'Bob',
    'age': 12,
    'children': None
}
person_json = json.dumps(person_dict)

# Output: {"name": "Bob", "age": 12, "children": null}
print(person_json)

Here's a table showing Python objects and their equivalent conversion to JSON.

Python          JSON Equivalent
dict            object
list, tuple     array
str             string
int, float      number
True            true
False           false
None            null
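
A quick way to see these conversions is a round trip through json.dumps() and json.loads(). The dictionary below is a made-up example; note in particular that a tuple comes back as a list, because JSON only has arrays.

```python
import json

# Hypothetical sample data covering several rows of the table.
python_value = {
    "name": "Bob",          # str   -> string
    "scores": (90, 85.5),   # tuple -> array, int/float -> number
    "active": True,         # True  -> true
    "nickname": None,       # None  -> null
}

encoded = json.dumps(python_value)   # Python object -> JSON string
decoded = json.loads(encoded)        # JSON string -> Python object
```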

Writing JSON to a file

To write JSON to a file in Python, we can use json.dump() method.

Example 4: Writing JSON to a file

import json

person_dict = {
    "name": "Bob",
    "languages": ["English", "French"],
    "married": True,
    "age": 32
}

with open('person.txt', 'w') as json_file:
  json.dump(person_dict, json_file)

In the above program, we have opened a file named person.txt in writing mode using 'w'. If the file doesn't already exist, it will be created. Then, json.dump() transforms person_dict to a JSON string which will be saved in the person.txt file.

When you run the program, the person.txt file will be created. The file has the following text inside it.

{"name": "Bob", "languages": ["English", "French"], "married": true, "age": 32}
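
To check that the written file can be read back, json.dump() and json.load() can be paired as in the sketch below. The temporary directory and the file name person.json are assumptions for this example.

```python
import json
import os
import tempfile

person_dict = {
    "name": "Bob",
    "languages": ["English", "French"],
    "married": True,
    "age": 32,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "person.json")  # hypothetical name

    # Serialize the dictionary into the file.
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(person_dict, json_file)

    # Parse the file back into a Python object.
    with open(path, "r", encoding="utf-8") as json_file:
        loaded = json.load(json_file)
```

For this data, loaded compares equal to person_dict, since every value has a direct JSON equivalent.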

Python pretty print JSON

To analyze and debug JSON data, we may need to print it in a more readable format. This can be done by passing the additional parameters indent and sort_keys to the json.dumps() and json.dump() methods.

Example 5: Python pretty print JSON

import json

person_string = '{"name": "Bob", "languages": "English", "numbers": [2, 1.6, null]}'

# Getting dictionary
person_dict = json.loads(person_string)

# Pretty Printing JSON string back
print(json.dumps(person_dict, indent = 4, sort_keys=True))

When you run the program, the output will be:

{
    "languages": "English",
    "name": "Bob",
    "numbers": [
        2,
        1.6,
        null
    ]
}

In the above program, we have used 4 spaces for indentation. And, the keys are sorted in ascending order. By the way, the default value of indent is None. And, the default value of sort_keys is False.

You can also define the separators. The default is (", ", ": ") when indent is None, which means a comma and a space separate the items, and a colon and a space separate keys from values. When indent is set, the default item separator is a comma without the trailing space.

import json

x = {
  "name": "John",
  "age": 30,
  "married": True,
  "divorced": False,
  "children": ("Ann","Billy"),
  "pets": None,
  "cars": [
    {"model": "BMW 230", "mpg": 27.5},
    {"model": "Ford Edge", "mpg": 24.1}
  ]
}

# use . and a space to separate objects, and a space, a = and a space to separate keys from their values:
print(json.dumps(x, indent=4, separators=(". ", " = ")))

When you run the program, the output will be:

{
    "name" = "John".
    "age" = 30.
    "married" = true.
    "divorced" = false.
    "children" = [
        "Ann".
        "Billy"
    ].
    "pets" = null.
    "cars" = [
        {
            "model" = "BMW 230".
            "mpg" = 27.5
        }.
        {
            "model" = "Ford Edge".
            "mpg" = 24.1
        }
    ]
}
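
A more common use of separators is to produce compact JSON by dropping the spaces. The sketch below uses a small made-up dictionary to compare the default output with the compact form.

```python
import json

data = {"name": "John", "age": 30, "cars": ["BMW 230", "Ford Edge"]}

# Default separators: ", " between items and ": " after keys.
default_form = json.dumps(data)

# Compact separators: no spaces at all.
compact_form = json.dumps(data, separators=(",", ":"))
```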

Python CSV

In this tutorial, we will learn how to read and write into CSV files in Python with the help of examples.

A CSV (Comma Separated Values) format is one of the simplest and most common ways to store tabular data. By convention, a CSV file is saved with the .csv file extension.

Let's take an example. If you open a CSV file using a text editor such as Sublime Text, you will see something like:

SN, Name, City
1, Michael, New Jersey
2, Jack, California

As you can see, the elements of a CSV file are separated by commas. Here, , is a delimiter. You can have any single character as your delimiter as per your needs.

Note: The csv module can also be used for other file extensions (like: .txt) as long as their contents are in proper structure.

Working with CSV files in Python

While we could use the built-in open() function to work with CSV files in Python, there is a dedicated csv module that makes working with CSV files much easier.

Before we can use the functions of the csv module, we need to import the module first using:

import csv

Reading CSV files Using csv.reader()

To read a CSV file in Python, we can use the csv.reader() function. Suppose we have a csv file named people.csv in the current directory with the following entries.

Name      Age   Profession
Jack      23    Doctor
Miller    22    Engineer

Let's read this file using csv.reader():

Example 1: Read CSV Having Comma Delimiter

import csv
with open('people.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

Output

['Name', 'Age', 'Profession']
['Jack', '23', 'Doctor']
['Miller', '22', 'Engineer']

Here, we have opened the people.csv file in reading mode using:

with open('people.csv', 'r') as file:
    .. .. ...

Then, the csv.reader() is used to read the file, which returns an iterable reader object.

The reader object is then iterated using a for loop to print the contents of each row.

In the above example, we are using the csv.reader() function in default mode for CSV files having comma delimiter.

However, the function is much more customizable.

Suppose our CSV file was using tab as a delimiter. To read such files, we can pass optional parameters to the csv.reader() function. Let's take an example.

Example 2: Read CSV file Having Tab Delimiter

import csv
with open('people.csv', 'r') as file:
    reader = csv.reader(file, delimiter = '\t')
    for row in reader:
        print(row)

Notice the optional parameter delimiter = '\t' in the above example.

The complete syntax of the csv.reader() function is:

csv.reader(csvfile, dialect='excel', **optional_parameters)

As you can see from the syntax, we can also pass the dialect parameter to the csv.reader() function. The dialect parameter allows us to make the function more flexible.
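
As one possible illustration of the dialect parameter, the sketch below registers a custom dialect and uses it to read pipe-separated data. The dialect name pipes and the sample text are assumptions made for this example; io.StringIO stands in for an open file.

```python
import csv
import io

# Register a hypothetical dialect: '|' as delimiter, and spaces that
# directly follow a delimiter are skipped.
csv.register_dialect("pipes", delimiter="|", skipinitialspace=True)

# In-memory text that behaves like an open file.
sample = io.StringIO("Name| Age| Profession\nJack| 23| Doctor\n")

rows = list(csv.reader(sample, dialect="pipes"))

# Remove the dialect again so the demo leaves no global state behind.
csv.unregister_dialect("pipes")
```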

Writing CSV files Using csv.writer()

To write to a CSV file in Python, we can use the csv.writer() function.

The csv.writer() function returns a writer object that converts the user's data into delimited strings. These strings are written into the CSV file using the writer's writerow() method. Let's take an example.

Example 3: Write to a CSV file

import csv
with open('protagonist.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["SN", "Movie", "Protagonist"])
    writer.writerow([1, "Lord of the Rings", "Frodo Baggins"])
    writer.writerow([2, "Harry Potter", "Harry Potter"])

When we run the above program, a protagonist.csv file is created with the following content:

SN,Movie,Protagonist
1,Lord of the Rings,Frodo Baggins
2,Harry Potter,Harry Potter

In the above program, we have opened the file in writing mode.

Then, we have passed each row as a list. These lists are converted to a delimited string and written into the CSV file.
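
One detail of this conversion is quoting: with the default settings, a field that itself contains the delimiter is wrapped in double quotes. The sketch below writes to an in-memory io.StringIO buffer instead of a file so the result can be inspected; the movie title with a comma is a made-up value.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for a file
writer = csv.writer(buffer)

writer.writerow(["SN", "Movie", "Protagonist"])
# The second field contains a comma, so the writer quotes it.
writer.writerow([1, "Lord of the Rings, The", "Frodo Baggins"])

written = buffer.getvalue()
```

The rows end with \r\n because that is the default line terminator of the writer. This is also why files are opened with newline='' when writing CSV data.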

Example 4: Writing multiple rows with writerows()

If we need to write the contents of the 2-dimensional list to a CSV file, here's how we can do it.

import csv
csv_rowlist = [["SN", "Movie", "Protagonist"], [1, "Lord of the Rings", "Frodo Baggins"],
               [2, "Harry Potter", "Harry Potter"]]
with open('protagonist.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(csv_rowlist)

The output of the program is the same as in Example 3.

Here, our 2-dimensional list is passed to the writer.writerows() method to write the content of the list to the CSV file.

Example 5: Writing to a CSV File with Tab Delimiter

import csv
with open('protagonist.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter = '\t')
    writer.writerow(["SN", "Movie", "Protagonist"])
    writer.writerow([1, "Lord of the Rings", "Frodo Baggins"])
    writer.writerow([2, "Harry Potter", "Harry Potter"])

Notice the optional parameter delimiter = '\t' in the csv.writer() function.

The complete syntax of the csv.writer() function is:

csv.writer(csvfile, dialect='excel', **optional_parameters)

Similar to csv.reader(), you can also pass the dialect parameter to the csv.writer() function to make it much more customizable.

Python csv.DictReader() Class

The objects of a csv.DictReader() class can be used to read a CSV file as a dictionary.

Example 6: Python csv.DictReader()

Suppose we have the same file people.csv as in Example 1.

Name      Age   Profession
Jack      23    Doctor
Miller    22    Engineer

Let's see how csv.DictReader() can be used.

import csv
with open("people.csv", 'r') as file:
    csv_file = csv.DictReader(file)
    for row in csv_file:
        print(dict(row))

{'Name': 'Jack', 'Age': '23', 'Profession': 'Doctor'}
{'Name': 'Miller', 'Age': '22', 'Profession': 'Engineer'}

As we can see, the entries of the first row are the dictionary keys. And, the entries in the other rows are the dictionary values.

Here, csv_file is a csv.DictReader() object. The object can be iterated over using a for loop. In Python 3.6 and 3.7, csv.DictReader() returned an OrderedDict for each row, which is why we used the dict() function to convert each row to a plain dictionary inside the for loop:

print(dict(row))

Note: Starting from Python 3.8, csv.DictReader() returns a dictionary for each row, and we do not need to use dict() explicitly.

The full syntax of the csv.DictReader() class is:

csv.DictReader(file, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
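
The restkey and restval parameters decide what happens when a row is longer or shorter than the header. The following sketch uses made-up, deliberately irregular data and an in-memory io.StringIO buffer in place of a file; the key name extras and the filler unknown are assumptions for this example.

```python
import csv
import io

# First data row has one field too many, second has one too few.
sample = io.StringIO(
    "Name,Age,Profession\n"
    "Jack,23,Doctor,extra\n"
    "Miller,22\n"
)

reader = csv.DictReader(sample, restkey="extras", restval="unknown")
rows = [dict(row) for row in reader]
```

Surplus fields are collected in a list under the restkey name, and missing fields are filled with the restval value.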

To learn more about it in detail, visit: Python csv.DictReader() class

Python csv.DictWriter() Class

The objects of csv.DictWriter() class can be used to write to a CSV file from a Python dictionary.

The minimal syntax of the csv.DictWriter() class is:

csv.DictWriter(file, fieldnames)

Here,

file - CSV file where we want to write to
fieldnames - a list object which should contain the column headers specifying the order in which data should be written in the CSV file

Example 7: Python csv.DictWriter()

import csv

with open('players.csv', 'w', newline='') as file:
    fieldnames = ['player_name', 'fide_rating']
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'player_name': 'Magnus Carlsen', 'fide_rating': 2870})
    writer.writerow({'player_name': 'Fabiano Caruana', 'fide_rating': 2822})
    writer.writerow({'player_name': 'Ding Liren', 'fide_rating': 2801})

The program creates a players.csv file with the following entries:

player_name,fide_rating
Magnus Carlsen,2870
Fabiano Caruana,2822
Ding Liren,2801

The full syntax of the csv.DictWriter() class is:

csv.DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
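
The restval and extrasaction parameters control how dictionaries that do not match fieldnames exactly are handled. The sketch below writes to an in-memory io.StringIO buffer; the extra country key and the n/a filler are assumptions made for this example.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for a file
writer = csv.DictWriter(
    buffer,
    fieldnames=["player_name", "fide_rating"],
    restval="n/a",            # used when a key is missing
    extrasaction="ignore",    # silently drop keys not in fieldnames
)

writer.writeheader()
# 'country' is not a field name, so it is ignored.
writer.writerow({"player_name": "Magnus Carlsen", "fide_rating": 2870, "country": "Norway"})
# 'fide_rating' is missing, so restval is written instead.
writer.writerow({"player_name": "Ding Liren"})

written = buffer.getvalue()
```

With the default extrasaction='raise', the first writerow() call would raise a ValueError because of the unknown key.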

To learn more about it in detail, visit: Python csv.DictWriter() class

Using the Pandas library to Handle CSV files

Pandas is a popular data science library in Python for data manipulation and analysis. If we are working with huge chunks of data, it's better to use pandas to handle CSV files for ease and efficiency.

Before we can use pandas, we need to install it. To learn more, visit: How to install Pandas?

Once we install it, we can import Pandas as:

import pandas as pd

To read the CSV file using pandas, we can use the read_csv() function.

import pandas as pd
pd.read_csv("people.csv")

Here, the program reads people.csv from the current directory.

To write to a CSV file, we need to call the to_csv() method of a DataFrame.

import pandas as pd

# creating a data frame
df = pd.DataFrame([['Jack', 24], ['Rose', 22]], columns = ['Name', 'Age'])

# writing data frame to a CSV file
df.to_csv('person.csv')

Here, we have created a DataFrame using pd.DataFrame(). Then, the to_csv() method of this object is called to write into person.csv. By default, to_csv() also writes the row index as the first column; passing index=False omits it.

