Reading binary files in Python
Reading a binary file means reading its contents as raw bytes rather than as decoded text. Text files store characters in an encoding such as UTF-8, whereas the bytes of a binary file have no meaning until a program interprets them according to the file's format.
Binary files store data as a sequence of bytes, each holding a value from 0 to 255. Depending on the format, those bytes may encode text, images, videos, executable programs or any other data structure.
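As a small illustration of bytes being plain integer values, the sketch below builds a bytes object by hand instead of reading a file. The four values are arbitrary and chosen only for the example.

```python
# Four raw byte values, chosen arbitrarily for illustration
data = bytes([72, 105, 0, 255])

print(data)        # b'Hi\x00\xff' (printable ASCII is shown as characters)
print(data[0])     # 72 (indexing a bytes object yields an int)
print(list(data))  # [72, 105, 0, 255]

# Only the first two bytes happen to be valid ASCII text
print(data[:2].decode('ascii'))  # Hi
```

The same bytes object is what f.read() returns when a file is opened in binary mode.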
Different Modes for Binary Files in Python
When working with binary files in Python, there are specific modes we can use to open them:
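- 'rb': Read binary – Opens an existing file for reading in binary mode.
- 'wb': Write binary – Opens the file for writing in binary mode, creating it if it does not exist and truncating it if it does.
- 'ab': Append binary – Opens the file for appending in binary mode, so new bytes are written at the end of the file.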
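The three modes can be tried together in a short sketch. It assumes a scratch file in a temporary directory; the name modes_demo.bin is hypothetical and used only for this example.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would work
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.bin')

# 'wb' creates the file, or truncates it if it already exists
with open(path, 'wb') as f:
    f.write(b'\x00\x01\x02')

# 'ab' keeps the existing bytes and writes at the end
with open(path, 'ab') as f:
    f.write(b'\xff')

# 'rb' reads the raw bytes back without any decoding
with open(path, 'rb') as f:
    contents = f.read()

print(contents)  # b'\x00\x01\x02\xff'
```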
Opening and Closing Binary Files
To work with files in Python, we use the open() function to open a file and the close() method to close it. Using a with statement ensures that the file is closed when the block finishes, even if an exception is raised inside it.
Opening Files
We can open a file using the open() function, which takes the file path and mode as arguments.
# Using 'with' to open and read a binary file
with open('example.bin', 'rb') as f:
    data = f.read()
# No need to manually close the file; it is closed when the block exits
Closing Files
If we do not use a with statement, we need to close the file manually with the close() method so that all resources are released.
# Opening a file and closing it manually
f = open('example.bin', 'rb')
try:
    data = f.read()  # 'data' avoids shadowing the built-in bin()
    print(data)
finally:
    f.close()
In the above example, the finally block ensures that the file is closed even if an error occurs during file operations.
Output:
b'\x00\nNotoSanSha\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00LWFNGWp1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Reading Binary Files in Python
Using the open() Function with Binary Mode
The open() function is used to open files in Python. When dealing with binary files, we need to specify the mode as 'rb' (read binary).
Example:
# Opening a binary file in 'rb' mode
f = open('example.bin', 'rb')
# Perform operations
data = f.read()  # 'data' avoids shadowing the built-in bin()
print(data)
# Closing the file
f.close()
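Reading a Binary File Line by Line
The readlines() method reads all lines in a file. In binary mode it returns a list of bytes objects, split after each newline byte (b'\n'). Every item ends with b'\n' except possibly the last one, and no newline translation is performed, so a b'\r' before the newline is kept. For data that is not line-oriented, these "lines" are simply the runs of bytes between newline bytes.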
Example:
# Reading all lines in a binary file using readlines()
with open('example.bin', 'rb') as f:
    lines = f.readlines()
    for line in lines:
        print(line)
Output:
b"b'\\x00\\nNotoSanSha\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\\r\n"
b'x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00LWFNGWp1\\'
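Since readlines() loads every line into memory at once, it can help to iterate over the file object instead, which yields one line at a time. The sketch below assumes a small scratch file (the name lines_demo.bin is hypothetical) written just for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file with three "lines"; the last has no newline
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.bin')
with open(path, 'wb') as f:
    f.write(b'first\nsecond\r\nlast')

lines = []
with open(path, 'rb') as f:
    # Iterating a binary file yields bytes objects split after each b'\n'
    for line in f:
        lines.append(line)

print(lines)  # [b'first\n', b'second\r\n', b'last']
```

Note that the b'\r' is preserved, because binary mode performs no newline translation.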
Reading Binary File in Chunks
Reading a binary file in chunks is useful when dealing with large files that cannot be read into memory all at once. The read(size) method reads up to size bytes from the file and returns an empty bytes object (b'') once the end of the file is reached. If size is not specified, it reads until the end of the file.
Example:
# Reading a binary file in chunks
size = 1024  # Define the chunk size
with open('example.bin', 'rb') as f:
    while True:
        chunk = f.read(size)
        if not chunk:
            break
        # Process the chunk (for demonstration, we'll print it)
        print(chunk)
Output:
b"b'\\x00\\nNotoSanSha\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\\r\nx00
\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00LWFNGWp1\\"
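A common practical use of chunked reading is hashing a file without loading it whole. The following is a minimal sketch, not the only way to do it: the helper name sha256_of_file, the scratch file and the 1024-byte chunk size are all assumptions made for this example.

```python
import hashlib
import os
import tempfile

# Hypothetical scratch file filled with 5000 random bytes
path = os.path.join(tempfile.mkdtemp(), 'chunks_demo.bin')
payload = os.urandom(5000)
with open(path, 'wb') as f:
    f.write(payload)

def sha256_of_file(file_path, chunk_size=1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b'', which signals the end of the file
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

result = sha256_of_file(path)
print(result)
```

Only one chunk is held in memory at a time, so the same function works for files much larger than the available RAM.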
Reading binary files in Python – FAQs
How to Read a Python Binary File
To read a binary file in Python, you use the open() function with the 'rb' mode, which stands for “read binary.” This approach ensures that the file is read as is, without any transformations that might occur if the file were opened in text mode.
with open('file.bin', 'rb') as file:
    binary_data = file.read()
print(binary_data)  # This will print the binary content of the file
How to Convert Binary Data to Readable Format in Python
Converting binary data to a readable format often depends on the type of data stored in the binary. For simple data like strings, you can directly decode the binary using an appropriate character encoding:
text = binary_data.decode('utf-8')  # Decoding binary data to a string using UTF-8 encoding
print(text)
For complex data structures, you might need to use modules like struct to unpack the binary data properly.
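To show what unpacking with struct can look like, here is a small sketch. The record layout (a 16-bit id, a 32-bit count and a 32-bit float, all little-endian) is hypothetical; a real file format defines its own layout.

```python
import struct

# Hypothetical record layout, little-endian with no padding:
#   H = unsigned 16-bit id, I = unsigned 32-bit count, f = 32-bit float
record_format = '<HIf'

# Build sample bytes in place of data read from a file
packed = struct.pack(record_format, 7, 1000, 1.5)
print(len(packed))  # 10

# unpack() returns a tuple with one value per format character
record_id, count, value = struct.unpack(record_format, packed)
print(record_id, count, value)  # 7 1000 1.5
```

When reading from a file, struct.calcsize(record_format) gives the number of bytes to pass to read() for one record.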
What is the Difference Between seek() and tell() Methods?
seek() Method: This method is used to change the current file position in a file stream. The seek() method takes a parameter that specifies the position (in bytes) in the file to move to. This is useful for “jumping” to a certain part of the file to start reading or writing from there.
tell() Method: This method is used to find out the current position in a file. When you call tell(), it returns an integer that represents the current position of the file pointer in bytes.
with open('file.bin', 'rb') as file:
    print(file.tell())  # Initially, it will print '0' as the file pointer is at the beginning
    file.seek(10)       # Move the pointer to 10 bytes from the beginning
    print(file.tell())  # Now it will print '10'
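In binary mode, seek() also accepts a second argument that says where the offset is measured from: os.SEEK_SET (the start, the default), os.SEEK_CUR (the current position) or os.SEEK_END (the end of the file). The sketch below assumes a hypothetical 20-byte scratch file created only for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file containing the byte values 0 through 19
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.bin')
with open(path, 'wb') as f:
    f.write(bytes(range(20)))

with open(path, 'rb') as f:
    f.seek(-4, os.SEEK_END)   # 4 bytes before the end of the file
    pos_from_end = f.tell()   # 16
    tail = f.read()           # the last four bytes

    f.seek(5)                 # absolute offset from the start
    f.seek(3, os.SEEK_CUR)    # 3 bytes forward from the current position
    pos_relative = f.tell()   # 8
    next_byte = f.read(1)     # the byte stored at offset 8

print(pos_from_end, tail)        # 16 b'\x10\x11\x12\x13'
print(pos_relative, next_byte)   # 8 b'\x08'
```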
How to Convert Binary File to Text
Converting a binary file to text involves reading the binary file and then decoding it to a string using the appropriate encoding, as binary files can contain anything from image data to encoded text.
with open('file.bin', 'rb') as file:
    binary_data = file.read()
text_data = binary_data.decode('utf-8')  # Ensure the encoding matches the file content
print(text_data)
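If the bytes are not valid in the chosen encoding, decode() raises UnicodeDecodeError. One possible way to handle this, sketched below with made-up sample bytes, is to fall back to errors='replace', which substitutes the Unicode replacement character for each undecodable byte.

```python
# Sample bytes: valid UTF-8 for "café " followed by one invalid byte (0xff)
raw = b'caf\xc3\xa9 \xff'

try:
    text = raw.decode('utf-8')  # strict decoding is the default
except UnicodeDecodeError as exc:
    print('strict decoding failed at byte offset', exc.start)
    # Fall back: undecodable bytes become U+FFFD instead of raising
    text = raw.decode('utf-8', errors='replace')

print(text)
```

Whether replacing bad bytes is acceptable depends on the data; for a file that is not text at all, decoding is the wrong tool.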
How to Make a Binary File Readable
Making a binary file readable can mean two things: converting it to a text format (as shown above) or interpreting its contents properly based on its structure. For non-text data (like images or compiled data), you would typically use a specific library or tool that understands the format:
- For images, you might use a library like PIL/Pillow to read and display the image.
- For serialized objects, you might use pickle to deserialize them.
Here’s an example using pickle to deserialize and make readable a binary file containing Python objects. Only unpickle files from sources you trust, because pickle.load() can execute arbitrary code.
import pickle
with open('data.pkl', 'rb') as file:
    data = pickle.load(file)
print(data)  # Assuming the data is a dictionary, list, etc.
These methods cover various scenarios involving binary files, from direct reading and displaying as text to more complex deserialization or decoding depending on the specific type of binary data.