http.client._MAXHEADERS = 100 limit no longer sufficient #131724

Open
jmacdone opened this issue Mar 25, 2025 · 3 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@jmacdone

jmacdone commented Mar 25, 2025

Bug report

Bug description:

The hard-coded sanity limit on the number of HTTP response headers is now too low to fetch a Microsoft 365 page.

_MAXHEADERS = 100

I do have a case open with Microsoft Support (TrackingID#2503130010002871), but it's not getting much traction as it's not causing problems with full web browsers.

Steps to reproduce:

import http.client
con = http.client.HTTPSConnection('outlook.office365.com')
con.request("GET", "/owa/example.edu")  # any domain seems to trigger
r = con.getresponse()

That raises an HTTPException:

>>> con = http.client.HTTPSConnection('outlook.office365.com')
>>> con.request("GET", "/owa/foo.bar")
>>> r = con.getresponse()
Traceback (most recent call last):
  File "<python-input-8>", line 1, in <module>
    r = con.getresponse()
  File "C:\Users\jmacdone\AppData\Local\Programs\Python\Python313-arm64\Lib\http\client.py", line 1428, in getresponse
    response.begin()
    ~~~~~~~~~~~~~~^^
  File "C:\Users\jmacdone\AppData\Local\Programs\Python\Python313-arm64\Lib\http\client.py", line 350, in begin
    self.headers = self.msg = parse_headers(self.fp)
                              ~~~~~~~~~~~~~^^^^^^^^^
  File "C:\Users\jmacdone\AppData\Local\Programs\Python\Python313-arm64\Lib\http\client.py", line 248, in parse_headers
    headers = _read_headers(fp)
  File "C:\Users\jmacdone\AppData\Local\Programs\Python\Python313-arm64\Lib\http\client.py", line 226, in _read_headers
    raise HTTPException("got more than %d headers" % _MAXHEADERS)
http.client.HTTPException: got more than 100 headers
>>>

The response seems to just spill over the limit with 101 headers, though not consistently. Presumably it depends on which load balancer node is responding.

$ curl --silent -D - 'https://outlook.office365.com/owa/example.edu' | grep -E "^[a-zA-Z-]+: " | wc -l

returns 96, 99, 101, etc. headers, depending on unknown factors.
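For anyone who wants the same count without curl, here is a minimal Python sketch of the grep filter above. It works on a header block that has already been captured as bytes; the helper name `count_header_lines` and the sample response are made up for illustration.

```python
import re

# Same pattern as the grep above: a header name, a colon, and a space.
HEADER_LINE = re.compile(rb"^[a-zA-Z-]+: ")


def count_header_lines(raw_head: bytes) -> int:
    """Count the lines in a raw response head that look like 'Name: value'."""
    return sum(1 for line in raw_head.splitlines() if HEADER_LINE.match(line))


# Illustrative input only, not a real Microsoft 365 response.
sample = (
    b"HTTP/1.1 302 Found\r\n"
    b"Location: https://example.edu/\r\n"
    b"Set-Cookie: a=1\r\n"
    b"Set-Cookie: b=2\r\n"
    b"\r\n"
)
print(count_header_lines(sample))
```

The status line is not counted because it contains characters outside the header-name pattern.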

For background, it's common to use https://outlook.com/example.edu as a domain hint ("smart link") to go directly to a tenant's identity provider and avoid the "Please provide your email address" step. We have a Nagios check for that, which broke recently as the number of Set-Cookie: OpenIdConnect.token.[...] variants continues to grow.

CPython versions tested on:

3.13, 3.9, 3.11

Operating systems tested on:

Linux, Windows

@jmacdone jmacdone added the type-bug An unexpected behavior, bug, or error label Mar 25, 2025
@StanFromIreland
Contributor

This seems to affect only HTTPSConnection; HTTPConnection is not affected.

@picnixz picnixz added the stdlib Python modules in the Lib dir label Mar 28, 2025
@jmacdone
Author

Just to help clarify, picking a new arbitrary limit allows getresponse() to complete without an exception.

>>> import http.client
>>> http.client._MAXHEADERS = 120
>>> con = http.client.HTTPSConnection('outlook.office365.com')
>>> con.request("GET", "/owa/example.edu")  # any domain seems to trigger
>>> r = con.getresponse()
>>> r.read()
b'<html><head><title>Object moved</title> [...]'

@StanFromIreland I don't have an explicit test case, but I think this would affect both classes. HTTPSConnection.getresponse() is inherited from HTTPConnection.getresponse():

class HTTPSConnection(HTTPConnection):
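A possible self-contained test case for the plain-HTTP path, as a sketch: a local `http.server` stands in for the Microsoft endpoint and sends well over 100 response headers. The handler name, the `X-Test-*` headers, and the count of 150 are all made up for illustration; only `_MAXHEADERS` and the exception come from the report above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

N_HEADERS = 150  # hypothetical count, chosen to be well above the default limit


class ManyHeadersHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with many headers."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        for i in range(N_HEADERS):
            self.send_header(f"X-Test-{i}", "1")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def fetch(port):
    # Plain HTTPConnection on purpose: no TLS involved.
    con = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        con.request("GET", "/")
        return con.getresponse().read()
    finally:
        con.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), ManyHeadersHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    original = http.client._MAXHEADERS
    try:
        # First attempt with the default limit.
        try:
            fetch(port)
            default_error = None
        except http.client.HTTPException as exc:
            default_error = str(exc)
        # Second attempt with the private limit raised (workaround from above).
        http.client._MAXHEADERS = N_HEADERS + 50
        raised_body = fetch(port)
    finally:
        http.client._MAXHEADERS = original
        server.shutdown()
        server.server_close()
    return default_error, raised_body


if __name__ == "__main__":
    print(demo())
```

If the first element of the returned tuple is an error message rather than `None`, the limit applies to `HTTPConnection` as well, not just `HTTPSConnection`.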

@vstinner
Member

Maybe we should add a parameter to customize this limit?
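Until such a parameter exists, one way to scope the workaround is a small context manager around the private module attribute. This is only a sketch: `override_max_headers` is a hypothetical helper, not an `http.client` API, and it relies on `_MAXHEADERS` staying a module-level global that is looked up at parse time. It is also not thread-safe, since the override is process-wide while it is active.

```python
import contextlib
import http.client


@contextlib.contextmanager
def override_max_headers(limit):
    """Temporarily replace the private http.client._MAXHEADERS value."""
    original = http.client._MAXHEADERS
    http.client._MAXHEADERS = limit
    try:
        yield
    finally:
        # Always restore the previous limit, even if the request fails.
        http.client._MAXHEADERS = original


# Usage sketch (commented out because it needs network access):
# with override_max_headers(120):
#     con = http.client.HTTPSConnection('outlook.office365.com')
#     con.request("GET", "/owa/example.edu")
#     r = con.getresponse()
```

A real parameter on the connection class would avoid the global side effect, which is the main drawback of this approach.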
