bpo-42123: Run the parser two times and only enable invalid rules on the second run by lysnikolaou · Pull Request #22111 · python/cpython

Here's the easiest way I could think of to do this. Are we okay with something
this simple, or would we like to investigate further whether we could
use the memo cache wherever possible?

The first parser run is only responsible for detecting whether
there is a SyntaxError. If there isn't, the AST is returned.
Otherwise, the parser runs a second time with all the invalid_*
rules enabled, so that the customized error messages are produced.
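
To make the control flow concrete, here is a minimal toy sketch of the two-run idea in Python. It is not the pegen implementation: the tokenizer, `parse_primary`, the `enable_invalid_rules` flag, and the tuple-based AST are all hypothetical names made up for illustration, and the grammar covers only `NAME ('.' NAME)*`.

```python
import re

# Hypothetical toy tokenizer: identifiers or single non-space characters.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\S)")


def tokenize(source):
    """Return a list of (text, column) pairs."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:  # only trailing whitespace is left
            break
        tokens.append((match.group(1), match.start(1)))
        pos = match.end()
    return tokens


def parse_primary(tokens, enable_invalid_rules):
    """Parse NAME ('.' NAME)*. Return a tuple AST, or None on failure.

    The "invalid rule" (primary followed by '{') is only tried when
    enable_invalid_rules is True, mirroring the second parser run.
    """
    if not tokens or not tokens[0][0].isidentifier():
        return None
    node, pos = ("Name", tokens[0][0]), 1
    while pos < len(tokens):
        text, col = tokens[pos]
        if enable_invalid_rules and text == "{":
            raise SyntaxError(f"invalid syntax at column {col}")
        has_name = pos + 1 < len(tokens) and tokens[pos + 1][0].isidentifier()
        if text == "." and has_name:
            node = ("Attribute", node, tokens[pos + 1][0])
            pos += 2
        else:
            return None
    return node


def parse(source):
    tokens = tokenize(source)
    # First run: invalid rules disabled, only detects success or failure.
    tree = parse_primary(tokens, enable_invalid_rules=False)
    if tree is not None:
        return tree
    # Second run: invalid rules enabled so a specific error can be raised.
    parse_primary(tokens, enable_invalid_rules=True)
    # No invalid rule matched: fall back to a generic error.
    raise SyntaxError("invalid syntax")


if __name__ == "__main__":
    print(parse("a.b.c"))
    for bad in ("a.b.c{", "a."):
        try:
            parse(bad)
        except SyntaxError as exc:
            print(repr(bad), "->", exc.msg)
```

In this sketch valid input only pays for one pass, and the cost of the error-reporting rules is incurred only for input that has already failed to parse.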

Regarding bpo-41659, we can now implement a very simple fix:

--- a/Grammar/python.gram
+++ b/Grammar/python.gram
@@ -460,6 +460,7 @@ await_primary[expr_ty] (memo):
     | AWAIT a=primary { CHECK_VERSION(5, "Await expressions are", _Py_Await(a, EXTRA)) }
     | primary
 primary[expr_ty]:
+    | primary b='{' { RAISE_SYNTAX_ERROR_KNOWN_LOCATION(b, "invalid syntax") }
     | a=primary '.' b=NAME { _Py_Attribute(a, b->v.Name.id, Load, EXTRA) }
     | a=primary b=genexp { _Py_Call(a, CHECK(_PyPegen_singleton_seq(p, b)), NULL, EXTRA) }
     | a=primary '(' b=[arguments] ')' {

With this patch, I'm now getting:

➜  cpygen git:(parser-second-run) ✗ ./python
Python 3.10.0a0 (heads/parser-second-run-dirty:72fe7254b9, Sep  5 2020, 16:40:28) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a{
  File "<stdin>", line 1
    a{
     ^
SyntaxError: invalid syntax
>>> a.b.c{
  File "<stdin>", line 1
    a.b.c{
         ^
SyntaxError: invalid syntax
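
For anyone who wants to check the reported location without the interactive prompt, a small sketch using the built-in `compile()` is below. The helper name `syntax_error_location` is made up for this example, and the exact message and offset depend on the interpreter build being tested, so no expected values are shown.

```python
def syntax_error_location(source):
    """Return (msg, lineno, offset) if source fails to compile, else None."""
    try:
        compile(source, "<stdin>", "exec")
    except SyntaxError as exc:
        return exc.msg, exc.lineno, exc.offset
    return None


if __name__ == "__main__":
    # The first two inputs are the ones from the session above.
    for snippet in ("a{", "a.b.c{", "a.b.c"):
        print(repr(snippet), "->", syntax_error_location(snippet))
```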

Thoughts?

https://bugs.python.org/issue42123