How to Convert Bytes to Int in Python?
Converting bytes to integers in Python means interpreting a sequence of bytes as a numerical value. For example, the byte sequence b'\x00\x01' converts to the integer 1 when read in big-endian order.
Using int.from_bytes()
The int.from_bytes() method converts a bytes object into an integer. It lets you specify the byte order ('big' or 'little') and whether the value is signed or unsigned.
Syntax:
int.from_bytes(bytes, byteorder, *, signed=False)
Parameters:
- bytes: A bytes object.
- byteorder: 'big' (most significant byte first) or 'little' (least significant byte first).
- signed: False (default) for unsigned integers, True for signed (two's complement) integers.
Returns: An int equivalent to the given bytes.
Example:
# Using Big-endian byte order
byte_val_1 = b'\x00\x01'
res_1 = int.from_bytes(byte_val_1, "big")
print(res_1)
# Using Little-endian byte order
byte_val_2 = b'\x00\x10'
res_2 = int.from_bytes(byte_val_2, "little")
print(res_2)
# Using Signed Integer Representation
byte_val_3 = b'\xfc\x00'
res_3 = int.from_bytes(byte_val_3, "big", signed=True)
print(res_3)
Output
1
4096
-1024
Explanation:
- Big-endian (b'\x00\x01'): The most significant byte (0x00) comes first, followed by 0x01. The result is 1 because 0x00 * 256 + 0x01 = 1.
- Little-endian (b'\x00\x10'): The least significant byte (0x00) comes first, followed by the most significant byte (0x10). The result is 4096 because 0x00 + 0x10 * 256 = 4096.
- Signed integer (b'\xfc\x00'): Read as an unsigned big-endian value, the bytes give 0xfc00 = 64512. Because the top bit is set and signed=True, the value is interpreted in two's complement: 64512 - 65536 = -1024.
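The arithmetic in the explanation above can be checked by hand. The following is a minimal sketch, not how CPython implements the method: the helper name bytes_to_int_manual is hypothetical and exists only for illustration. It rebuilds the integer byte by byte and compares the result with int.from_bytes().

```python
def bytes_to_int_manual(data, byteorder="big", signed=False):
    """Hypothetical helper: rebuild an integer from bytes by hand."""
    if byteorder == "little":
        # Reverse so the most significant byte comes first
        data = data[::-1]
    elif byteorder != "big":
        raise ValueError("byteorder must be 'big' or 'little'")

    value = 0
    for b in data:
        # Shift the running total one byte to the left, then add the next byte
        value = value * 256 + b

    # Two's complement: if the top bit is set, subtract 2**(number of bits)
    if signed and data and data[0] & 0x80:
        value -= 1 << (8 * len(data))
    return value


samples = [
    (b"\x00\x01", "big", False),
    (b"\x00\x10", "little", False),
    (b"\xfc\x00", "big", True),
]
for data, order, signed in samples:
    manual = bytes_to_int_manual(data, order, signed)
    builtin = int.from_bytes(data, order, signed=signed)
    print(manual, manual == builtin)
```

For the three sample inputs, the manual result and the built-in result agree (1, 4096 and -1024). In practice int.from_bytes() is the better choice; the loop only makes the byte weighting visible.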
Using struct.unpack()
The struct.unpack() function is part of Python's struct module. It unpacks a bytes object into a tuple of values according to the specified format.
Syntax:
struct.unpack(format, bytes)
Parameters:
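- format: A format string, such as '>H' (big-endian unsigned short), '<H' (little-endian unsigned short) or '>I' (big-endian 4-byte unsigned int).
- bytes: A bytes object whose length matches the size required by the format.
Returns: A tuple containing the unpacked values, even when the format describes a single value.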
Example:
import struct
# Unpacking a 2-byte unsigned short (big-endian)
byte_data = b'\x01\x02'
res_1 = struct.unpack('>H', byte_data)[0]
print(res_1)
# Unpacking a 4-byte unsigned int (little-endian)
byte_data = b'\x01\x00\x00\x00'
res_2 = struct.unpack('<I', byte_data)[0]
print(res_2)
# Unpacking a signed 4-byte integer (big-endian)
byte_data = b'\xff\xff\xff\xfc'
res_3 = struct.unpack('>i', byte_data)[0]
print(res_3)
Output
258
1
-4
Explanation:
- Big-endian (b'\x01\x02'): The format string '>H' unpacks the two bytes as a big-endian unsigned short. 0x01 * 256 + 0x02 gives 258.
- Little-endian (b'\x01\x00\x00\x00'): The format string '<I' unpacks the four bytes as a little-endian unsigned int. The first byte (0x01) is the least significant, so the value is 1.
- Signed integer (b'\xff\xff\xff\xfc'): The format string '>i' unpacks the four bytes as a big-endian signed int. 0xfffffffc is 4294967292 unsigned, and 4294967292 - 4294967296 = -4 in two's complement.
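Since struct.unpack() always returns a tuple, one call can decode several integers. The sketch below assumes a made-up 8-byte record layout (an unsigned short, a signed short and an unsigned int, all big-endian). The field names version, offset and count are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical 8-byte record: unsigned short, signed short, unsigned int
record = b"\x01\x02\xff\xfc\x00\x00\x00\x2a"
fmt = ">HhI"

# calcsize() reports how many bytes the format string expects
print(struct.calcsize(fmt))

# One call unpacks all three fields into a tuple
version, offset, count = struct.unpack(fmt, record)
print(version, offset, count)

# A buffer whose length does not match the format raises struct.error
try:
    struct.unpack(">H", b"\x01\x02\x03")
except struct.error:
    print("struct.error raised")
```

This prints 8, then 258 -4 42, then the struct.error message. Unlike int.from_bytes(), which accepts input of any length, struct.unpack() requires the buffer to be exactly the size the format describes.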
Using numpy.frombuffer
If you're working with large amounts of binary data, NumPy provides a convenient function called frombuffer() that interprets byte data as an array. The dtype string specifies both the byte order and the data type.
Syntax:
numpy.frombuffer(bytes, dtype)
Parameters:
- bytes: The bytes object to be interpreted.
- dtype: The data type for the conversion (e.g., '>u2' for big-endian unsigned short).
Returns: A numpy array containing the interpreted data.
Example:
import numpy as np
# Big-endian unsigned short (2 bytes)
byte_data = b'\x01\x02'
res_1 = np.frombuffer(byte_data, dtype='>u2')[0]
print(res_1)
# Big-endian signed int (4 bytes)
byte_data = b'\x01\x00\x00\x00'
res_2 = np.frombuffer(byte_data, dtype='>i4')[0]
print(res_2)
# Little-endian unsigned int (4 bytes)
byte_data = b'\x01\x00\x00\x00'
res_3 = np.frombuffer(byte_data, dtype='<u4')[0]
print(res_3)
Output
258
16777216
1
Explanation:
- Big-endian unsigned short (b'\x01\x02'): The byte sequence is interpreted as a big-endian unsigned short ('>u2'), resulting in 258 because 0x01 * 256 + 0x02 = 258.
- Big-endian signed integer (b'\x01\x00\x00\x00'): The byte sequence is interpreted as a big-endian signed 4-byte integer ('>i4'). The first byte (0x01) is the most significant, so the value is 0x01 * 256**3 = 16777216.
- Little-endian unsigned integer (b'\x01\x00\x00\x00'): The same bytes are interpreted as a little-endian unsigned 4-byte integer ('<u4'). The first byte (0x01) is now the least significant, so the result is 1.
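frombuffer() is most useful when the buffer holds many values of the same type. The following sketch assumes a 6-byte buffer holding three big-endian unsigned shorts (the sample bytes are made up for illustration). It also shows converting a NumPy scalar to a plain Python int, and reading part of the buffer with the count and offset parameters.

```python
import numpy as np

# Three big-endian unsigned shorts packed back to back (6 bytes)
byte_data = b"\x00\x01\x01\x02\xff\xff"
values = np.frombuffer(byte_data, dtype=">u2")
print(values.tolist())

# Elements are NumPy scalars; int() converts one to a plain Python int
first = int(values[0])
print(type(first).__name__, first)

# offset skips bytes from the start; count limits how many items are read
middle = np.frombuffer(byte_data, dtype=">u2", count=1, offset=2)
print(middle.tolist())
```

The three print calls show [1, 258, 65535], then int 1, then [258]. The buffer length must be a multiple of the dtype's item size, otherwise frombuffer() raises a ValueError.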