bpo-41746: Add type information to asdl_seq objects by pablogsal · Pull Request #22223 · python/cpython

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

pablogsal

@pablogsal pablogsal changed the title Add type information to asdl_seq objects bpo-41746: Add type information to asdl_seq objects

Sep 13, 2020

@pablogsal

This PR adds the following:

  • Add new capability to the PEG parser to type variable assignments. For instance:
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
  • Add new sequence types from the asdl definition (automatically generated)
  • Make asdl_seq type a generic aliasing pointer type.
  • Create a new asdl_generic_seq for the generic case using void*.
  • The old asdl_seq_GET/ast_seq_SET macros now are typed.
  • New asdl_seq_GET_UNTYPED/ast_seq_SET_UNTYPED macros for dealing with generic sequences.
  • Changes all possible asdl_seq types to use specific versions everywhere.

ℹ️ I recommend reviewing individual commits and skip the ones marked as [Automatic changes].

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @pablogsal for commit d9af84ad23c7afce7a51f76db96ab95f5d6c49a7 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

isidentical

isidentical

lysnikolaou

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, great work really! 🚀

I have left some comments on places, where typed sequences could be used in the parser, but these could probably be changed in a separate PR as well. This one is already big enough.

@@ -259,9 +259,10 @@ def collect_todo(self, gen: ParserGenerator) -> None:


class NamedItem:
def __init__(self, name: Optional[str], item: Item):
def __init__(self, name: Optional[str], item: Item, type: Optional[str] = None):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__str__ and __repr__ will have to be updated here as well, right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(for the future) is there are reason to avoid using dataclasses for these meta-grammar items itself?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the advantage? We use custom __repr__ and __str__ , we do have some transformations in the __init__ and we do not compare nodes.

isidentical

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great (the AST parts)!

@pablogsal

Something to fix after this PR is that the CHECK_CALL macro does type erasure (it always returns a void*) so we should transform it into a macro.

@pablogsal

@pablogsal

@lysnikolaou Could you review this when you have some spare cycles? I am a bit afraid of potential merge conflicts given the size of the PR and how many parts it touches... :(

@lysnikolaou

@pablogsal I'm sorry i didn't get to it yet. I'll be sure to make amother review tomorrow.

lysnikolaou

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Only twoone thing I missed in my last review, the arguments to the function _PyPegen_slash_with_default and _PyPegen_star_etc can be typed. If you don't have the time, I can update these and push to your branch.

lysnikolaou

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind. I did it myself, this goes faster. I hope that's okay. It's certainly a go now!

As a left-over task for another PR, we should update all the little grammars in test_peg_generator as well.

@pablogsal

As a left-over task for another PR, we should update all the little grammars in test_peg_generator as well.

@lysnikolaou If you have some time, could you give it a go to this:

Something to fix after this PR is that the CHECK_CALL macro does type erasure (it always returns a void*) so we should transform it into a macro.

Unfortunately, these days I will be a bit busy learning how to release-manager :)

@pablogsal

Thanks a lot for the reviews! ❤️

xzy3 pushed a commit to xzy3/cpython that referenced this pull request

Oct 18, 2020
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.