Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-131421: fix ASDL grammar for Dict to have an expr?* keys field #131419

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Demonstrandum
Copy link

@Demonstrandum Demonstrandum commented Mar 18, 2025

In the ast documentation for Python:

When doing dictionary unpacking using dictionary literals the expression to be expanded goes in the values list, with a None at the corresponding position in keys.

Hence, keys is really a expr?* and not a expr*.

In the `ast` documentation for Python:

* https://docs.python.org/3/library/ast.html#ast.Dict
it is made clear that:

> When doing dictionary unpacking using dictionary literals the expression to be expanded goes in the values list, with a `None` at the corresponding position in `keys`.

Hence, `keys` is really a `expr?*` and *not* a `expr*`.
Copy link

cpython-cla-bot bot commented Mar 18, 2025

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Mar 18, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@Demonstrandum Demonstrandum changed the title Python.asdl: Fix Dict variant to have an expr?* keys field. gh-131421: Python.asdl: Fix Dict variant to have an expr?* keys field. Mar 18, 2025
@@ -63,7 +63,7 @@ module Python
| UnaryOp(unaryop op, expr operand)
| Lambda(arguments args, expr body)
| IfExp(expr test, expr body, expr orelse)
| Dict(expr* keys, expr* values)
| Dict(expr?* keys, expr* values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of change makes me wonder whether we should also add '+' to indicate 1 or more items. For instance nonlocal requires at least one name, same for import, global and others.

Similarly, there is no explanation of what * and ? actually mean. While they can be inferred, I would expect a caption for them. WDYT @JelleZijlstra?

@picnixz picnixz changed the title gh-131421: Python.asdl: Fix Dict variant to have an expr?* keys field. gh-131421: fix ASDL grammar for Dict to have an expr?* keys field Mar 18, 2025
@JelleZijlstra
Copy link
Member

Python.asdl is used to generate the AST code, and this change doesn't compile (to reproduce run make regen-ast locally; ideally we'd fix CI to do that).

I don't think it's really worth changing the grammar here.

@picnixz
Copy link
Member

picnixz commented Mar 18, 2025

I don't think it's really worth changing the grammar here.

Can we perhaps add a comment instead? because now that I think about it, for static analysis tools, we could be surprised (like, if we only focus on the ASDL and not on the documented entry itself). [I could be surprised in Sphinx though I don't think I've encountered an issue with this]

@Demonstrandum
Copy link
Author

It represents a point where it completely breaks expectation, where up until then, the ASDL (as far as I have observed) was being completely respected by the ast module. Since * and ? are modifiers on types, and expr? is still a type, morally a star should be able to follow any type, including expr? to get a expr?*, for example.

Would this be very difficult to parse / break a lot of handling?

@JelleZijlstra
Copy link
Member

It's been like this for a long time and I don't recall any other reports of confusion about it, so there's a good argument to stick with the working code.

That said, if you can provide a patch that parses this new syntax correctly and that compiles correctly, we can consider it.

@Demonstrandum
Copy link
Author

@JelleZijlstra I updated the Parser/asdl.py ASDL parser to support chained quantifiers. I made sure the opt and seq interface it exposes remains the same (based of the last quantifier supplied). This means the generated Python/Python-ast.c is unchanged.

Naturally, make regen-ast works now.

Comment on lines +67 to +71
class Quantifier:
class _Optional: pass
class _Sequence: pass
OPTIONAL = _Optional()
SEQUENCE = _Sequence()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this just be an enum?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I just prefer these to have unique/singleton values (as opposed to ints or whatnot).

I can change it if its outside the expected coding style.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted singletons you could also just do OPTIONAL = object() but to be honest I think an Enum/StrEnum is a better choice here since it's more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(one reason to do the class trick is to have a bit of a nicer repr(), but OTOH, I don't think we need to worry about importing enum)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth importing enum? If you want to add your suggested change I can commit it.

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is already the case or not, but can you quickly check that the tests for ast.parse('{}') handle this case?

elif self.cur_token.kind == TokenKind.Question:
is_opt = True
quantifiers = []
while self.cur_token.kind in (TokenKind.Asterisk, TokenKind.Question):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check that we don't have ** for instance or ???

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. But both ?? and ** are legitimate types, ** perhaps more legitimate than ?? is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, actually the current parser doesn't care. I thought that it would have warned the user. And should T** actually mean a list of list of T? is the obtained C code designed to handle this kind of grammar?

I have no strong opinion now on whether we should warn ??.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is essentially no change to the C code at all at this point. I don't know how this was handled before, since expr?* was always there implicitly, but this didn't seem to matter.

class Field(AST):
def __init__(self, type, name=None, seq=False, opt=False):
def __init__(self, type, name=None, quantifiers=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to support more than two qualifiers? and/or shouldn't we check the consistency of those qualifiers as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently there are no more than two quantifiers on any type, as per my change to Python.asdl.

I'm just implementing the EBNF grammar described at the top of the asdl.py file. And I don't think this files does any amount of extra validation so far anyway.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants