Issue2180
Created on 2008-02-25 01:55 by jaredgrubb, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 13401 | merged | Anthony Sottile, 2019-05-18 01:39 | |
| Messages (8) | |||
|---|---|---|---|
| msg62956 - (view) | Author: Jared Grubb (jaredgrubb) | Date: 2008-02-25 01:59 | |
tokenize does not handle line joining properly, as the following string
fails the CPython tokenizer but passes the tokenize module.
Example 1:
>>> s = "if 1:\n  \\\n  #hey\n  print 1"
>>> exec s
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 3
#hey
^
SyntaxError: invalid syntax
>>> tokenize.tokenize(StringIO(s).readline)
1,0-1,2: NAME 'if'
1,3-1,4: NUMBER '1'
1,4-1,5: OP ':'
1,5-1,6: NEWLINE '\n'
2,0-2,2: INDENT '  '
3,2-3,6: COMMENT '#hey'
3,6-3,7: NEWLINE '\n'
4,2-4,7: NAME 'print'
4,8-4,9: NUMBER '1'
5,0-5,0: DEDENT ''
5,0-5,0: ENDMARKER ''
|
|||
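The session above is Python 2. A minimal Python 3 sketch of the same comparison follows. It assumes `print(1)` in place of `print 1` and two-space indentation (inferred from the token columns in the report); the helper names `compiles` and `tokenizes` are made up for illustration. The outcome can differ between Python versions, so no expected output is shown.

```python
import io
import tokenize

# Python 3 port of Example 1 (assumption: two-space indentation).
SOURCE = "if 1:\n  \\\n  #hey\n  print(1)\n"


def compiles(source):
    """Return True if the built-in compiler accepts the source."""
    try:
        compile(source, "<string>", "exec")
    except SyntaxError:
        return False
    return True


def tokenizes(source):
    """Return True if the tokenize module consumes the source without error."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(source).readline):
            pass
    except (tokenize.TokenError, SyntaxError):
        # Newer versions may surface SyntaxError subclasses from tokenize.
        return False
    return True


if __name__ == "__main__":
    print("compile accepts: ", compiles(SOURCE))
    print("tokenize accepts:", tokenizes(SOURCE))
```

If the two lines printed disagree, the interpreter and the tokenize module still treat this input differently on the version being tested.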
| msg62960 - (view) | Author: Jared Grubb (jaredgrubb) | Date: 2008-02-25 02:22 | |
CPython allows \ at EOF, but tokenize does not.
>>> s = 'print 1\\\n'
>>> exec s
1
>>> tokenize.tokenize(StringIO(s).readline)
1,0-1,5: NAME 'print'
1,6-1,7: NUMBER '1'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py",
line 153, in tokenize
tokenize_loop(readline, tokeneater)
File
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py",
line 159, in tokenize_loop
for token_info in generate_tokens(readline):
File
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tokenize.py",
line 283, in generate_tokens
raise TokenError, ("EOF in multi-line statement", (lnum, 0))
tokenize.TokenError: ('EOF in multi-line statement', (2, 0))
|
|||
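To check the backslash-at-EOF case on a current interpreter, something like the following sketch could be used. It is a Python 3 adaptation of the report above, not the original reproducer; `eof_continuation_report` is a hypothetical helper name, and the error messages it returns vary by version.

```python
import io
import tokenize


def eof_continuation_report(source):
    """Return (compiler_error, tokenize_error); each is None on success."""
    compiler_error = None
    try:
        compile(source, "<string>", "exec")
    except SyntaxError as exc:
        compiler_error = exc.msg

    tokenize_error = None
    try:
        # Exhaust the generator so that any error at EOF is raised.
        list(tokenize.generate_tokens(io.StringIO(source).readline))
    except (tokenize.TokenError, SyntaxError) as exc:
        tokenize_error = str(exc)

    return compiler_error, tokenize_error


if __name__ == "__main__":
    # A statement that ends with a line continuation and nothing after it.
    print(eof_continuation_report("print(1)\\\n"))
```

Returning the messages rather than booleans makes it easier to see whether both sides reject the input for the same reason.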
| msg116977 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2010-09-20 21:26 | |
Nobody appears to be interested, so I'll close this in a couple of weeks unless someone objects or a patch is provided. |
|||
| msg116985 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2010-09-20 21:51 | |
Mark, please stop closing these based on age. There needs to be a determination of whether this is a valid bug. If so, then a patch is needed. If not, it can be closed. |
|||
| msg143716 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-09-08 01:39 | |
That syntax error is coming from the CPython parser and *not* the tokenizer. Both CPython and the 'tokenize' module produce the same tokenization:
[meadori@motherbrain cpython]$ cat repro.py
if 1:
  \

  pass
[meadori@motherbrain cpython]$ ./python tokenize.py repro.py
0,0-0,0: ENCODING 'utf-8'
1,0-1,2: NAME 'if'
1,3-1,4: NUMBER '1'
1,4-1,5: OP ':'
1,5-1,6: NEWLINE '\n'
2,0-2,2: INDENT '  '
3,0-3,1: NEWLINE '\n'
4,2-4,6: NAME 'pass'
4,6-4,7: NEWLINE '\n'
5,0-5,0: DEDENT ''
5,0-5,0: ENDMARKER ''
[44319 refs]
[meadori@motherbrain cpython]$ ./python -d repro.py | grep Token | tail -10
File "repro.py", line 3
^
SyntaxError: invalid syntax
[44305 refs]
Token NEWLINE/'' ... It's a token we know
Token DEDENT/'' ... It's a token we know
Token NEWLINE/'' ... It's a token we know
Token ENDMARKER/'' ... It's a token we know
Token NAME/'if' ... It's a keyword
Token NUMBER/'1' ... It's a token we know
Token COLON/':' ... It's a token we know
Token NEWLINE/'' ... It's a token we know
Token INDENT/'' ... It's a token we know
Token NEWLINE/'' ... It's a token we know
The NEWLINE INDENT NEWLINE tokenization causes the parser to choke because 'suite' nonterminals [1]:
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
are defined as NEWLINE INDENT followed by at least one statement, so a NEWLINE directly after INDENT cannot be matched. It seems appropriate that the NEWLINE after INDENT should be dropped by both tokenizers. In other words, I think:
"""
if 1:
  \

  pass
"""
should produce the same tokenization as:
"""
if 1:
  pass
"""
This seems consistent with how explicit line joining is defined [2].
[1] http://hg.python.org/cpython/file/92842e347d98/Grammar/Grammar
[2] http://docs.python.org/reference/lexical_analysis.html#explicit-line-joining |
|||
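The NEWLINE INDENT NEWLINE pattern described in the message above can be looked for programmatically. The following is a rough sketch under stated assumptions: `token_names` and `has_newline_after_indent` are hypothetical helpers, and the token stream for the problematic input may differ, or raise, depending on the Python version.

```python
import io
import token
import tokenize


def token_names(source):
    """Return the token type names produced by the tokenize module."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


def has_newline_after_indent(names):
    """Return True if a NEWLINE token immediately follows an INDENT token."""
    return any(a == "INDENT" and b == "NEWLINE" for a, b in zip(names, names[1:]))


if __name__ == "__main__":
    plain = "if 1:\n  pass\n"
    joined = "if 1:\n  \\\n\n  pass\n"  # assumed layout of repro.py
    print(token_names(plain))
    try:
        print(has_newline_after_indent(token_names(joined)))
    except (tokenize.TokenError, SyntaxError) as exc:
        print("tokenize rejected the input:", exc)
```

The plain form gives the sequence the grammar expects (NEWLINE, INDENT, then a statement), which is the baseline the message argues the line-joined form should match.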
| msg339576 - (view) | Author: Anthony Sottile (Anthony Sottile) * | Date: 2019-04-07 14:32 | |
Here's an example in the wild which still reproduces with python3.8a3: https://github.com/SecureAuthCorp/impacket/blob/194b22ed2fc85c4f241375fb7ebe4e0d89626c8c/impacket/examples/remcomsvc.py#L1669
This was reported as a bug on flake8: https://gitlab.com/pycqa/flake8/issues/532
Here's the reproduction with python3.8:
$ python3.8 --version --version
Python 3.8.0a3 (default, Mar 27 2019, 03:46:44)
[GCC 7.3.0]
$ python3.8 impacket/examples/remcomsvc.py
$ python3.8 -mtokenize impacket/examples/remcomsvc.py
impacket/examples/remcomsvc.py:1670:0: error: EOF in multi-line statement |
|||
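For tools that tokenize whole files (as flake8 does), a check similar to `python -m tokenize` can be sketched in a few lines. This is only an illustration: `tokenize_file_error` is a hypothetical helper, and the exact error text depends on the interpreter version.

```python
import sys
import tokenize


def tokenize_file_error(path):
    """Return the tokenize error message for a file, or None if it tokenizes cleanly."""
    try:
        with open(path, "rb") as f:
            # tokenize.tokenize expects a readline callable that returns bytes.
            for _ in tokenize.tokenize(f.readline):
                pass
    except (tokenize.TokenError, SyntaxError) as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    for filename in sys.argv[1:]:
        error = tokenize_file_error(filename)
        print(filename, "->", error if error is not None else "ok")
```

Running it over a file that ends in a dangling backslash shows whether the installed Python reports an error for it.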
| msg342807 - (view) | Author: miss-islington (miss-islington) | Date: 2019-05-18 18:27 | |
New changeset abea73bf4a320ff658c9a98fef3d948a142e61a9 by Miss Islington (bot) (Anthony Sottile) in branch 'master':
bpo-2180: Treat line continuation at EOF as a `SyntaxError` (GH-13401)
https://github.com/python/cpython/commit/abea73bf4a320ff658c9a98fef3d948a142e61a9 |
|||
| msg342817 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2019-05-18 21:02 | |
Thanks for figuring this one out Anthony! :) |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:31 | admin | set | github: 46433 |
| 2019-05-18 21:02:56 | gregory.p.smith | set | status: open -> closed resolution: fixed messages: + msg342817 stage: patch review -> commit review |
| 2019-05-18 18:27:30 | miss-islington | set | nosy: + miss-islington messages: + msg342807 |
| 2019-05-18 07:16:47 | gregory.p.smith | set | assignee: gregory.p.smith nosy: + gregory.p.smith |
| 2019-05-18 01:39:12 | Anthony Sottile | set | keywords: + patch stage: needs patch -> patch review pull_requests: + pull_request13312 |
| 2019-04-07 14:32:44 | Anthony Sottile | set | nosy: + Anthony Sottile messages: + msg339576 |
| 2014-02-03 19:15:35 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2011-09-08 01:39:11 | meador.inge | set | messages: + msg143716 stage: test needed -> needs patch |
| 2010-09-27 03:19:42 | meador.inge | set | nosy: + meador.inge |
| 2010-09-20 21:51:51 | rhettinger | set | status: pending -> open nosy: + rhettinger assignee: jhylton -> (no value) |
| 2010-09-20 21:26:23 | BreamoreBoy | set | status: open -> pending nosy: + BreamoreBoy messages: + msg116977 |
| 2010-08-21 17:06:34 | BreamoreBoy | unlink | issue1230484 dependencies |
| 2010-08-21 17:03:39 | BreamoreBoy | set | versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6 |
| 2009-02-16 02:26:11 | ajaksu2 | link | issue1230484 dependencies |
| 2009-02-16 02:20:41 | ajaksu2 | set | stage: test needed versions: + Python 2.6, - Python 2.5 |
| 2008-03-20 03:08:15 | jafo | set | assignee: jhylton nosy: + jhylton |
| 2008-02-25 02:22:29 | jaredgrubb | set | messages: + msg62960 |
| 2008-02-25 01:59:17 | jaredgrubb | set | messages: + msg62956 |
| 2008-02-25 01:55:51 | jaredgrubb | create | |
