I am running "pipreqsnb ." which requires the incremental decoder class IncrementalDecoder from this lib, and it returns this error:
  File "C:\Users\[USERNAME]\anaconda3\envs\pdfparser\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 124872: character maps to <undefined>
To resolve this, you need to specify an encoding that can handle a broader range of characters, such as utf-8, and also specify how to handle decoding errors. Here's how you can modify your IncrementalDecoder class to handle this:
    import codecs

    class IncrementalDecoder(codecs.IncrementalDecoder):
        def __init__(self, errors='ignore'):
            super().__init__(errors=errors)
            self.encoding = 'utf-8'

        def decode(self, input, final=False):
            try:
                # Attempt to decode using utf-8 (errors must be passed positionally)
                return codecs.getdecoder(self.encoding)(input, self.errors)[0]
            except UnicodeDecodeError:
                # If decoding fails, fall back to charmap_decode, which acts as
                # latin-1 when no mapping is given (only reached if errors='strict')
                return codecs.charmap_decode(input, self.errors)[0]
But not really sure. Hopefully this solves my issue.
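For comparison, here is a minimal sketch of the same idea using only the incremental decoder that the standard library already ships for utf-8, without subclassing. The sample bytes and the chunk split are made up for illustration, and this assumes the input really is UTF-8, which nothing in the traceback confirms.

```python
import codecs

# Hypothetical sample: "Á" (U+00C1) encodes to b'\xc3\x81' in UTF-8,
# so the byte 0x81 from the traceback appears in the data.
data = "Á requirements".encode("utf-8")

# Built-in incremental utf-8 decoder; errors="replace" substitutes U+FFFD
# for malformed input instead of raising.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

# Feed the data in two chunks, split in the middle of the two-byte sequence.
chunks = [data[:1], data[1:]]
text = "".join(decoder.decode(chunk) for chunk in chunks)
text += decoder.decode(b"", final=True)  # flush any buffered bytes

print(text)
```

The incremental decoder buffers the incomplete first byte and emits the character once the second chunk arrives, which a plain per-chunk bytes.decode() call would not do.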
CPython versions tested on:
3.13
Operating systems tested on:
Windows
I don't have time right now to check whether this is a CPython issue. The path C:\Users\[USERNAME]\anaconda3\envs\pdfparser\Lib\encodings\cp1252.py hints that it's not from us, though I don't know whether the pdfparser environment is just bundling our Lib/encodings/cp1252.py. It might instead be an issue with how pipreqsnb actually calls the incremental decoder.
I can have a look at this on Sunday.
picnixz changed the title from "Charmap Error" to "Issue with IncrementalDecoder and pipreqsnb" on Mar 19, 2025
I don't think this is a cpython issue. 0x81 is not a legal byte in a cp1252-encoded file. Either the data file has an error or pipreqsnb is in error in specifying that encoding. codecs.IncrementalDecoder just specifies the methods needed for each codec-specific incremental decoder. The incremental decoder for cp1252 should raise on 0x81 (and a few other bytes).
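A quick way to check this locally is to probe every byte value with the stock cp1252 incremental decoder in strict mode and collect the ones that raise. This is only an illustrative sketch using standard codecs calls; the variable names are arbitrary.

```python
import codecs

# Factory for the stdlib cp1252 incremental decoder.
decoder_factory = codecs.getincrementaldecoder("cp1252")

undefined = []
for value in range(256):
    decoder = decoder_factory(errors="strict")
    try:
        decoder.decode(bytes([value]), final=True)
    except UnicodeDecodeError:
        # This byte has no mapping in cp1252, so strict decoding raises.
        undefined.append(hex(value))

print(undefined)
```

If 0x81 shows up in the printed list, the decoder is behaving as designed, and the fix belongs in the caller's choice of encoding or in the data file.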