bpo-42827: Fix crash on SyntaxError in multiline expressions (GH-24140)

When trying to extract the error line for the error message, there
are two distinct cases:

1. The input comes from a file, in which case we extract the error
   line with `PyErr_ProgramTextObject`, as we already do.
2. The input does not come from a file, at which point we need to get
   the source code from the tokenizer:
   * If the tokenizer's current line number is the same as the line
     of the error, we get the line from `tok->buf` and we're done.
   * Otherwise, we extract the error line from the source code in
     one of the following two ways:
     * If the input comes from a string we have all the input
       in `tok->str` and we can extract the error line from it.
     * If the input comes from stdin, i.e. the interactive prompt, we
       do not have access to the previous lines. That's why a new
       field `tok->stdin_content` is added, which holds the whole
       input for the current (multiline) statement or expression. We
       can then extract the error line from `tok->stdin_content` like
       we do in the string case above.
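
As a rough illustration of the string and stdin cases above, here is a
standalone sketch, not the code in this commit. `append_input` and
`copy_line` are hypothetical names, and the sketch assumes plain
`malloc`/`free` instead of the `PyMem_*` allocators and 1-based line
numbers:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append `line` to the heap string `*content`
   (which may be NULL on the first call). Returns 0 on success and -1
   on allocation failure, in which case `*content` is left untouched. */
int
append_input(char **content, const char *line)
{
    size_t old_len = (*content != NULL) ? strlen(*content) : 0;
    size_t add_len = strlen(line);
    char *grown = malloc(old_len + add_len + 1);
    if (grown == NULL) {
        return -1;
    }
    if (old_len > 0) {
        memcpy(grown, *content, old_len);
    }
    /* Copy the new line together with its terminating NUL. */
    memcpy(grown + old_len, line, add_len + 1);
    free(*content);  /* free(NULL) is a no-op */
    *content = grown;
    return 0;
}

/* Hypothetical helper: return a newly allocated copy of line `lineno`
   (1-based) of `src`, including its trailing newline if it has one.
   Returns NULL if the line does not exist or allocation fails. */
char *
copy_line(const char *src, int lineno)
{
    if (src == NULL || lineno < 1) {
        return NULL;
    }
    const char *start = src;
    /* Skip past the first lineno - 1 newlines. */
    for (int i = 1; i < lineno; i++) {
        const char *nl = strchr(start, '\n');
        if (nl == NULL) {
            return NULL;  /* fewer lines than requested */
        }
        start = nl + 1;
    }
    if (*start == '\0') {
        return NULL;  /* requested line is past the end of the input */
    }
    const char *end = strchr(start, '\n');
    size_t len = (end != NULL) ? (size_t)(end - start) + 1 : strlen(start);
    char *line = malloc(len + 1);
    if (line == NULL) {
        return NULL;
    }
    memcpy(line, start, len);
    line[len] = '\0';
    return line;
}
```

For example, after appending `"x = (1 +\n"` and `"2 +\n"` to an initially
NULL buffer, `copy_line(buffer, 2)` returns a copy of `"2 +\n"`, which the
caller must free.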

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
Lysandros Nikolaou 2021-01-14 23:36:30 +02:00 committed by GitHub
parent 9712358277
commit e5fe509054
5 changed files with 64 additions and 2 deletions

@@ -81,6 +81,7 @@ tok_new(void)
tok->decoding_readline = NULL;
tok->decoding_buffer = NULL;
tok->type_comments = 0;
tok->stdin_content = NULL;
tok->async_hacks = 0;
tok->async_def = 0;
@@ -816,6 +817,8 @@ PyTokenizer_Free(struct tok_state *tok)
PyMem_Free(tok->buf);
if (tok->input)
    PyMem_Free(tok->input);
if (tok->stdin_content)
    PyMem_Free(tok->stdin_content);
PyMem_Free(tok);
}
@@ -856,6 +859,24 @@ tok_nextc(struct tok_state *tok)
    if (translated == NULL)
        return EOF;
    newtok = translated;
    if (tok->stdin_content == NULL) {
        tok->stdin_content = PyMem_Malloc(strlen(translated) + 1);
        if (tok->stdin_content == NULL) {
            tok->done = E_NOMEM;
            return EOF;
        }
        sprintf(tok->stdin_content, "%s", translated);
    }
    else {
        char *new_str = PyMem_Malloc(strlen(tok->stdin_content) + strlen(translated) + 1);
        if (new_str == NULL) {
            tok->done = E_NOMEM;
            return EOF;
        }
        sprintf(new_str, "%s%s", tok->stdin_content, translated);
        PyMem_Free(tok->stdin_content);
        tok->stdin_content = new_str;
    }
}
if (tok->encoding && newtok && *newtok) {
    /* Recode to UTF-8 */