datetime.datetime.strptime get day error #84417

Open
zhanying mannequin opened this issue Apr 9, 2020 · 13 comments
Labels
stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@zhanying
Mannequin

zhanying mannequin commented Apr 9, 2020

BPO 40236
Nosy @abalkin, @ericvsmith, @karlcow, @pganssle, @akulakov
PRs
  • bpo-40236: add strptime 0th week day test #30318

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-04-09.09:55:31.623>
    labels = ['3.7', '3.8', 'type-bug', 'library', '3.9']
    title = 'datetime.datetime.strptime get day error'
    updated_at = <Date 2022-01-01.15:45:08.343>
    user = 'https://bugs.python.org/zhanying'

    bugs.python.org fields:

    activity = <Date 2022-01-01.15:45:08.343>
    actor = 'andrei.avk'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2020-04-09.09:55:31.623>
    creator = 'zhanying'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40236
    keywords = ['patch']
    message_count = 11.0
    messages = ['366039', '366046', '366055', '366062', '366070', '366100', '366101', '374480', '374486', '409448', '409464']
    nosy_count = 6.0
    nosy_names = ['belopolsky', 'eric.smith', 'karlcow', 'p-ganssle', 'zhanying', 'andrei.avk']
    pr_nums = ['30318']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40236'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']

    @zhanying
    Mannequin Author

    zhanying mannequin commented Apr 9, 2020

    In [7]: datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    Out[7]: datetime.datetime(2024, 1, 3, 0, 0)

    In [8]: datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    Out[8]: datetime.datetime(2024, 1, 3, 0, 0)

    @zhanying zhanying mannequin added type-bug An unexpected behavior, bug, or error labels Apr 9, 2020
    @ericvsmith
    Member

    Can you tell us what platform you're on? Also, please include the header that's printed out when you run python from the command line. For example, mine shows:

    $ python3
    Python 3.7.6 (default, Jan 30 2020, 10:29:04) 
    [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux

    @ericvsmith ericvsmith added stdlib Python modules in the Lib dir labels Apr 9, 2020
    @pganssle
    Member

    pganssle commented Apr 9, 2020

    I can reproduce this on Linux with Python 3.8.2.

    I think this may be a bug, but it may also just be platform-specific weirdness. Either way it's very curious behavior:

    >>> datetime.strptime("2023-0-0", "%Y-%W-%w")                         
    datetime.datetime(2023, 1, 1, 0, 0)
    >>> datetime.strptime("2023-0-1", "%Y-%W-%w")                         
    datetime.datetime(2022, 12, 26, 0, 0)

    The definition for %W (and %U, which is related) goes like this:

    Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.

    2024 starts on a Monday, so there should be no Week 0 in that year at all. Seems to me like it's undefined what happens when you put in a string that puts in an invalid value for "%Y-%W-%w".
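
To make the quoted definition concrete, here is a minimal sketch that asks strftime for the %W value of the first days of 2023 and 2024. The helper name week_w is only for illustration, and the zero-padded output assumes the usual C-library strftime behavior.

```python
from datetime import date, timedelta

def week_w(d):
    """Return the %W week label (Monday-first week number) for a date."""
    return d.strftime("%W")

print(week_w(date(2023, 1, 1)))  # Sunday before the first Monday -> 00
print(week_w(date(2023, 1, 2)))  # first Monday of 2023 -> 01
print(week_w(date(2024, 1, 1)))  # 2024 begins on a Monday -> 01

# Collect every %W label that strftime produces for a day in 2024.
weeks_2024 = {week_w(date(2024, 1, 1) + timedelta(days=n)) for n in range(366)}
print("00" in weeks_2024)  # False: no day of 2024 is formatted as week 0
```

So strftime never emits week 0 for 2024, which is why "2024-0-3" has no obvious meaning on the parsing side.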

    Seems to me that we are just passing through the behavior of time.strptime in this case (which just calls out to what the platform does):

    >>> time.strptime("2024-0-3", "%Y-%W-%w")                             
    time.struct_time(tm_year=2024, tm_mon=1, tm_mday=3, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=3, tm_isdst=-1)

    I am open to discussion about trying to rationalize this behavior - it would be a bit tricky but if we moved to our own implementation of the algorithm to calculate %W we could detect this situation and throw an exception. I'd rather see if this is intended behavior in the underlying C implementation first, though. If this is consistent across platforms and not just some random implementation detail, people may be relying on it.

    I propose that we:

    1. Determine what happens on different platforms (might be easy to just make a PR that asserts the current behavior and see if/how it breaks on any of the supported platforms).
    2. Determine why it works the way it does.

    After that, at the very least we should document the behavior with a warning or a footnote or something. If we make any changes to the behavior they would be 3.9+, but the documentation changes can be backported.

    Thanks for the bug report zhanying! Very interesting!

    @pganssle pganssle added 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes labels Apr 9, 2020
    @pganssle
    Member

    pganssle commented Apr 9, 2020

    Likely relevant is bpo-23136, where they dealt with similar issues in the past.

    I don't see any explicit test for this behavior, but it seems that the solution is to try to be consistent and to not raise a ValueError.

    Looking at this issue, I think it's a manifestation of a similar bug that hits when a year starts with a Monday.

    It seems like the behavior is that the following days (%W-%w) should be sequential in any year: 00-1, 00-2, 00-3, 00-4, 00-5, 00-6, 00-0, 01-1, 01-2, ...

    Since 2024 starts on a Monday, the first day of the year should be 2024-01-1, and the 2024-00-1 week should start 2023-12-25 rather than duplicating the following week.
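
The duplication described here can be seen by parsing every day label of week 0 and week 1 of 2024. This is only a sketch of the behavior reported in this thread; parse_week and WEEKDAY_ORDER are illustrative names, not part of any library.

```python
from datetime import datetime

# %w values in Monday-first order: 1..6 are Mon..Sat, 0 is Sunday.
WEEKDAY_ORDER = [1, 2, 3, 4, 5, 6, 0]

def parse_week(year, week):
    """Parse all seven day labels of one %W week and return the dates."""
    return [
        datetime.strptime(f"{year}-{week}-{w}", "%Y-%W-%w").date()
        for w in WEEKDAY_ORDER
    ]

week0 = parse_week(2024, 0)
week1 = parse_week(2024, 1)
print(week0[0], week0[-1])  # 2024-01-01 2024-01-07
print(week0 == week1)       # True: both labels map to the same seven days
```

If the labels were strictly sequential, week 0 would instead begin on 2023-12-25.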

    I think there's an equivalent issue with dates of the form "%Y-%U-%w", but happening on years that start with a Sunday.
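
A quick check of the %U case, as a sketch: 2023 starts on a Sunday, so week 0 and week 1 should collapse in the same way, while 2024 (which starts on a Monday) keeps them a week apart. parse_u is an illustrative helper.

```python
from datetime import datetime

def parse_u(text):
    """Parse a year / Sunday-first week / weekday string."""
    return datetime.strptime(text, "%Y-%U-%w")

# 2023 starts on a Sunday: %U week 0 and week 1 give the same Wednesday.
print(parse_u("2023-0-3"), parse_u("2023-1-3"))

# 2024 starts on a Monday: the two %U weeks are seven days apart.
print(parse_u("2024-0-3"), parse_u("2024-1-3"))
```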

    @ericvsmith
    Member

    I thought that strptime is platform specific (which is why I asked for the platform info). But, looking at the existing docs https://docs.python.org/3.5/library/time.html#time.strptime

    "But strptime() is independent of any platform and thus does not necessarily support all directives available that are not documented as supported."

    I'm not exactly sure what that means overall, with the double negative. But it says it's not platform specific.

    @zhanying
    Mannequin Author

    zhanying mannequin commented Apr 10, 2020

    My platform is this.

    #python
    Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
    [GCC 7.3.0] on linux

    Type "help", "copyright", "credits" or "license" for more information.


    @zhanying
    Mannequin Author

    zhanying mannequin commented Apr 10, 2020

    I read the source code. The relevant part is:

    def _calc_julian_from_U_or_W(year, week_of_year, day_of_week, week_starts_Mon):
        """Calculate the Julian day based on the year, week of the year, and day of
        the week, with week_start_day representing whether the week of the year
        assumes the week starts on Sunday or Monday (6 or 0)."""
        first_weekday = datetime_date(year, 1, 1).weekday()
        # If we are dealing with the %U directive (week starts on Sunday), it's
        # easier to just shift the view to Sunday being the first day of the
        # week.
        if not week_starts_Mon:
            first_weekday = (first_weekday + 1) % 7
            day_of_week = (day_of_week + 1) % 7
        # Need to watch out for a week 0 (when the first day of the year is not
        # the same as that specified by %U or %W).
        week_0_length = (7 - first_weekday) % 7
        if week_of_year == 0:
            return 1 + day_of_week - first_weekday
        else:
            days_to_week = week_0_length + (7 * (week_of_year - 1))
            return 1 + days_to_week + day_of_week

    When first_weekday is 0 (the year starts on a Monday), this function returns the same value whether week_of_year is 0 or 1.
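
The arithmetic can be checked in isolation. The sketch below restates only the %W branch of the quoted function under a hypothetical name, calc_julian_from_w, and assumes day_of_week uses the Monday=0 convention (so Wednesday is 2).

```python
from datetime import date

def calc_julian_from_w(year, week_of_year, day_of_week):
    """%W-only restatement of the quoted logic; day_of_week is 0 for Monday."""
    first_weekday = date(year, 1, 1).weekday()
    # Number of days in the partial week before the first Monday (0 if none).
    week_0_length = (7 - first_weekday) % 7
    if week_of_year == 0:
        return 1 + day_of_week - first_weekday
    days_to_week = week_0_length + 7 * (week_of_year - 1)
    return 1 + days_to_week + day_of_week

# 2024 starts on a Monday: first_weekday and week_0_length are both 0,
# so the two branches reduce to the same expression, 1 + day_of_week.
print(calc_julian_from_w(2024, 0, 2), calc_julian_from_w(2024, 1, 2))  # 3 3

# 2023 starts on a Sunday: week 0 reaches back into the previous year.
print(calc_julian_from_w(2023, 0, 2), calc_julian_from_w(2023, 1, 2))  # -3 4
```

A day number of -3 for 2023 corresponds to 2022-12-28, and 4 corresponds to 2023-01-04.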


    @karlcow
    Mannequin

    karlcow mannequin commented Jul 28, 2020

    Same on macOS 10.15.6 (19G73)

    Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34) 
    [Clang 6.0 (clang-600.0.57)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import datetime
    >>> datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2024, 1, 3, 0, 0)
    >>> datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2024, 1, 3, 0, 0)

    Also
    https://pubs.opengroup.org/onlinepubs/007908799/xsh/strptime.html

    note that iso8601 doesn't have this issue.
    %V - ISO 8601 week of the year as a decimal number [01, 53].
    https://en.wikipedia.org/wiki/ISO_week_date
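
For comparison, a small sketch of the ISO 8601 route using the %G, %V and %u directives together with date.isocalendar and date.fromisocalendar (the latter requires Python 3.8 or newer). The variable names are illustrative.

```python
from datetime import date, datetime

# ISO year, ISO week, ISO weekday (1 = Monday ... 7 = Sunday).
parsed = datetime.strptime("2024-01-3", "%G-%V-%u").date()
print(parsed)  # 2024-01-03

# Days before the first ISO week belong to the last week of the previous
# ISO year, so there is no week 0 to alias.
print(tuple(date(2023, 1, 1).isocalendar()))  # (2022, 52, 7)

# date.fromisocalendar rejects week 0 outright.
try:
    date.fromisocalendar(2024, 0, 3)
    week_zero_rejected = False
except ValueError:
    week_zero_rejected = True
print(week_zero_rejected)  # True
```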

    @karlcow
    Mannequin

    karlcow mannequin commented Jul 28, 2020

    Also this.

    >>> import datetime
    >>> d0 = datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    >>> d0.strftime("%Y-%W-%w %H:%M:%S")
    '2024-01-3 00:00:00'
    >>> d1 = datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    >>> d1.strftime("%Y-%W-%w %H:%M:%S")
    '2024-01-3 00:00:00'
    >>> d2301 = datetime.datetime.strptime("2023-0-1 00:00:00", "%Y-%W-%w %H:%M:%S")
    >>> d2311 = datetime.datetime.strptime("2023-1-1 00:00:00", "%Y-%W-%w %H:%M:%S")
    >>> d2301
    datetime.datetime(2022, 12, 26, 0, 0)
    >>> d2311
    datetime.datetime(2023, 1, 2, 0, 0)
    >>> d2311.strftime("%Y-%W-%w %H:%M:%S")
    '2023-01-1 00:00:00'
    >>> d2301.strftime("%Y-%W-%w %H:%M:%S")
    '2022-52-1 00:00:00'

    Week 0 2023 became Week 52 2022 (which is correct but might lead to surprises)

    @akulakov
    Contributor

    akulakov commented Jan 1, 2022

    I am open to discussion about trying to rationalize this behavior - it would be a bit tricky but if we moved to our own implementation of the algorithm to calculate %W we could detect this situation and throw an exception.

    Paul:

    I'm guessing here but I think the design makes sense from a certain angle: consider that 0-th week is the first partial week of the year. For some date calculations you might want to go forward, and for others you may go back. If you're going back, it's easier to keep the year and the week the same and reference other days within that week, rather than decrementing the year and changing to 12th month.

    This leaves the odd case of there being no partial week. Logically 0th week could refer to the last week of previous year, but that feels wrong because you're referring to it as a week of e.g. 2024 when all of its days are in 2023, so it's entirely a 2023 week.

    So a precise definition would be to say that the 0th week is always the first week of the year, whether partial or full, while the 1st week is always the first full week. It follows from this definition that sometimes the 0th and 1st weeks are the same week.
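
One way to probe this definition is to compare the Monday of week 0 with the Monday of week 1 across a range of years. This is a sketch against the behavior reported in this thread, with illustrative helper names; it expects a gap of 0 days when the year starts on a Monday and 7 days otherwise.

```python
from datetime import date, datetime

def week_start(year, week):
    """Date strptime returns for Monday (%w = 1) of the given %W week."""
    return datetime.strptime(f"{year}-{week}-1", "%Y-%W-%w").date()

def week_gap(year):
    """Days between the Monday of week 1 and the Monday of week 0."""
    return (week_start(year, 1) - week_start(year, 0)).days

for year in range(2000, 2031):
    starts_on_monday = date(year, 1, 1).weekday() == 0
    expected = 0 if starts_on_monday else 7
    assert week_gap(year) == expected, year

# Years in the range where week 0 and week 1 name the same week.
aliased_years = [y for y in range(2000, 2031) if week_gap(y) == 0]
print(aliased_years)  # [2001, 2007, 2018, 2024, 2029]
```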

    I'll make a test PR to check if that's how time.strptime behaves on all platforms.

    @akulakov
    Contributor

    akulakov commented Jan 1, 2022

    I didn't realize that time.strptime just uses the Python module _strptime.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @encukou encukou removed 3.9 only security fixes 3.8 (EOL) end of life 3.7 (EOL) end of life labels Mar 21, 2025
    @StanFromIreland
    Contributor

    StanFromIreland commented Mar 22, 2025

    This is not a bug. It happens because 2024 started on a Monday, so week 0 is also week 1. Week 0 is meant to cover the days before the first Monday of the year, but since the year starts on a Monday there are no preceding days, so week 0 is just an alias for week 1. The current behavior for handling this case seems fine to me.

    From the docs:

    All days in a new year preceding the first Monday are considered to be in week 0.

    >>> import datetime
    >>> datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2024, 1, 3, 0, 0)
    >>> datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2024, 1, 3, 0, 0)
    >>> datetime.datetime.strptime("2023-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2023, 1, 4, 0, 0)
    >>> datetime.datetime.strptime("2023-0-3 00:00:00", "%Y-%W-%w %H:%M:%S") # 2023 started on a Sunday
    datetime.datetime(2022, 12, 28, 0, 0)
    >>> datetime.datetime.strptime("2022-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
    datetime.datetime(2022, 1, 5, 0, 0)
    >>> datetime.datetime.strptime("2022-0-3 00:00:00", "%Y-%W-%w %H:%M:%S") # 2022 started on a Saturday
    datetime.datetime(2021, 12, 29, 0, 0)
    

    This should be closed, it's odd but it's correct and is in the docs.

    @pganssle
    Member

    @StanFromIreland I don't think you are adding anything new to the discussion from 5 years ago. Yes, that is the root of the problem, but I don't see anywhere in the docs that says that in the event that there are no days in the new year before the first Monday that week zero is an alias for week one. That is highly counter-intuitive behavior.

    It is still not at all clear if this is a random accidental implementation detail leaking through an edge case or if it is a deliberate choice. It is also not clear if people rely on this property deliberately or not, or if there are people out there relying on this property not being true, and it is intermittently causing bugs (these are not mutually exclusive options).
