locale.getlocale() returns a non RFC1766 language code #82986
Something in Windows 10, Python 3.8, or both seems to have changed what locale.getlocale() returns. According to the documentation (https://docs.python.org/3/library/locale.html?highlight=locale%20getlocale#locale.getlocale), the language code should be in RFC 1766 format:
Language-Tag = Primary-tag *( "-" Subtag )
but in Python 3.8 I am getting a language code that doesn't meet the RFC 1766 spec:
PS C:\Users\auror> py -3
Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform; platform.platform()
'Windows-10-10.0.18362-SP0'
>>> import locale; locale.getlocale(); locale.getdefaultlocale()
('English_United States', '1252')
('en_US', 'cp1252')
>>>
On the same machine, with Python 3.7.4:
PS C:\Python37> .\python.exe
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform; platform.platform()
'Windows-10-10.0.18362-SP0'
>>> import locale; locale.getlocale(); locale.getdefaultlocale()
(None, None)
('en_US', 'cp1252')
>>>
It is also interesting that in Python 3.8 the encoding differs between locale.getlocale() ('1252') and locale.getdefaultlocale() ('cp1252').
Possibly related, found when searching for 'locale' bugs: https://bugs.python.org/issue26024 |
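The mismatch reported above can be checked mechanically. This is a minimal sketch, assuming the RFC 1766 grammar quoted in the report (tags of 1 to 8 ASCII letters joined by hyphens). The helper name `looks_like_rfc1766` and its `allow_underscore` flag are made up for this illustration and are not part of the `locale` module.

```python
import re

# RFC 1766 grammar: Language-Tag = Primary-tag *( "-" Subtag ),
# where each tag is 1 to 8 ASCII letters.
_RFC1766_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def looks_like_rfc1766(code, allow_underscore=False):
    """Return True if `code` matches the RFC 1766 Language-Tag grammar.

    `allow_underscore` is a convenience invented for this sketch: it
    treats the POSIX-style "en_US" spelling as equivalent to "en-US".
    """
    if code is None:
        return False
    if allow_underscore:
        code = code.replace("_", "-")
    return _RFC1766_TAG.fullmatch(code) is not None


# Compare the two spellings seen in the report with a well-formed tag.
for candidate in ("en-US", "en_US", "English_United States"):
    strict = looks_like_rfc1766(candidate)
    lenient = looks_like_rfc1766(candidate, allow_underscore=True)
    print(f"{candidate!r}: strict={strict}, lenient={lenient}")
```

Note that even 'en_US' fails the strict check because of the underscore, so the lenient mode is probably closer to what the documentation intends. 'English_United States' fails both ways because of the space.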
Not really "strange results" - the fact is that getlocale() now returns the locale name *as if* it were already set from the beginning (because it is, at least in part). Before:
>>> import locale # Python 3.7, new shell
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, '') # say Hi from Italy
'Italian_Italy.1252'
>>> locale.getlocale()
('Italian_Italy', '1252')
Now:
>>> import locale # Python 3.8, new shell
>>> locale.getlocale()
('Italian_Italy', '1252')
As for why returned locale names are "a little different" on Windows, I found no better explanation than Eryk Sun's essays in https://bugs.python.org/issue37945. Long story short, it's not even a bug anymore... it's a hot mess and it won't be solved anytime soon.
What *has* changed, though, is that Python on Windows now appears to set the locale, implicitly, right from the start.
>>> import locale # Python 3.8, new shell
>>> locale.getlocale()
('Italian_Italy', '1252')
>>> locale.localeconv()
{'int_curr_symbol': '', 'currency_symbol': '', 'mon_decimal_point': '', 'mon_thousands_sep': '', 'mon_grouping': [], 'positive_sign': '', 'negative_sign': '', 'int_frac_digits': 127, 'frac_digits': 127, 'p_cs_precedes': 127, 'p_sep_by_space': 127, 'n_cs_precedes': 127, 'n_sep_by_space': 127, 'p_sign_posn': 127, 'n_sign_posn': 127, 'decimal_point': '.', 'thousands_sep': '', 'grouping': []}
As you can see, we have an Italian locale in name only: the conventions are still those of the default C locale.
>>> locale.setlocale(locale.LC_ALL, '')
'Italian_Italy.1252'
>>> locale.localeconv()
{'int_curr_symbol': 'EUR', 'currency_symbol': '€', ... ... }
... now we enjoy a real Italian locale - pizza, pasta, gelato and all. What happened?
>>> locale.setlocale(locale.LC_CTYPE, '') # Python 3.7
'Italian_Italy.1252'
>>> locale.getlocale()
('Italian_Italy', '1252')
...and that's because locale.getlocale() with no arguments defaults, wait for it, to getlocale(category=LC_CTYPE), as documented! So why does Python 3.8 now pre-set LC_CTYPE on Windows? Apparently, bpo-34485 is part of the ongoing Shakespearean feud between Victor Stinner and the Python locale code. If you squint hard enough, you will see the answer here: https://vstinner.github.io/locale-bugfixes-python3.html but at this point, I don't know if anyone is still keeping score. To sum up:
|
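The point made in the comment above is that getlocale() with no arguments only reports LC_CTYPE, so it says nothing about the other categories. A minimal sketch for looking at each category separately follows. The names `CATEGORIES`, `current_settings` and `main` are invented for this illustration. It relies only on `locale.setlocale(category)` acting as a pure query when no locale argument is given.

```python
import locale

# Categories available on every platform (LC_MESSAGES is missing on Windows).
CATEGORIES = ("LC_CTYPE", "LC_NUMERIC", "LC_MONETARY", "LC_TIME", "LC_COLLATE")


def current_settings():
    """Map each category name to the C runtime's current setting string."""
    # setlocale(category) without a locale argument queries, it does not set.
    return {name: locale.setlocale(getattr(locale, name)) for name in CATEGORIES}


def main():
    # At startup, the categories need not all agree with LC_CTYPE.
    print("at startup:", current_settings())
    try:
        # Apply the user's default locale to all categories.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error as exc:
        print("could not apply the user default locale:", exc)
    print("after setlocale(LC_ALL, ''):", current_settings())


if __name__ == "__main__":
    main()
```

On the Python 3.8 Windows setup described above, one would expect only LC_CTYPE to carry the 'Italian_Italy.1252' name at startup, with the other categories still at 'C'. The output depends on the platform and Python version, so treat that as an expectation to verify and not a guarantee.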
I stumbled upon this issue. Currently the doc still says What is the working way to get the default locale code from Windows as |
There are lots of entries missing in the locale_alias table, which you can inspect here for Python 3.11: https://github.com/python/cpython/blob/3.11/Lib/locale.py#L779 italian_italy is just one of those, and I know many, many more, like french_belgium, dutch_aruba, dutch_belgium, dutch_netherlands. I'm not sure why the 'English_United States' lookup fails. The comment above the table says that underscores and dashes are removed before lookup, but the handling of spaces is a bit obscure to me. |
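Whether a given name is covered by the table can be checked on any installed Python. This is a small sketch, assuming only the module-level `locale.locale_alias` dict linked above and the documented `locale.normalize()` function. The helper name `alias_status` is hypothetical, and the lowercase-key lookup is a simplification that mirrors how the table keys are written, not a copy of what `normalize()` does internally.

```python
import locale


def alias_status(name):
    """Return (is_key_in_alias_table, normalized_name) for a locale name."""
    # Keys in locale_alias are lowercase, so lowercase before the lookup.
    in_table = name.lower() in locale.locale_alias
    # normalize() returns its argument unchanged when it finds no alias.
    return in_table, locale.normalize(name)


for name in ("en_us", "Italian_Italy", "English_United States"):
    print(name, alias_status(name))
```

A name that comes back unchanged from `locale.normalize()` had no matching alias in the running Python version. The results for the Windows-style names may differ between releases.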
Is this just a docs issue now? |
I think it's a doc issue and how to fix this has been addressed in #82986 (comment) (namely indicating the behaviour on Windows). However I'm not a locale expert so I can't say for sure it'd be sufficient. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.