mirror of
https://github.com/python/cpython.git
synced 2025-08-03 00:23:06 +00:00
SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted). codecs.StreamReader now implements buffering, which enables proper readline support for the UTF-16 decoders. codecs.StreamReader.read() has a new argument chars which specifies the number of characters to return. codecs.StreamReader.readline() and codecs.StreamReader.readlines() have a new argument keepends. Trailing "\n"s will be stripped from the lines if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and PyUnicode_DecodeUTF16Stateful.
parent
a708d6e3b0
commit
69652035bc
12 changed files with 419 additions and 173 deletions
@@ -394,9 +394,14 @@ order to be compatible to the Python codec registry.
be extended with \function{register_error()}.
\end{classdesc}

\begin{methoddesc}{read}{\optional{size}}
\begin{methoddesc}{read}{\optional{size\optional{, chars}}}
Decodes data from the stream and returns the resulting object.

\var{chars} indicates the number of characters to read from the
stream. \function{read()} will never return more than \var{chars}
characters, but it might return fewer if there are not enough
characters available.

\var{size} indicates the approximate maximum number of bytes to read
from the stream for decoding purposes. The decoder can modify this
setting as appropriate. The default value -1 indicates to read and
@@ -407,29 +412,29 @@ order to be compatible to the Python codec registry.
read as much data as is allowed within the definition of the encoding
and the given size, e.g. if optional encoding endings or state
markers are available on the stream, these should be read too.

\versionchanged[\var{chars} argument added]{2.4}
\end{methoddesc}
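The interplay of \var{size} and \var{chars} described above can be sketched as follows. This is a minimal illustration, not part of the patch: it assumes a current Python 3 interpreter (where \method{read()} still accepts both arguments) and uses an in-memory \code{io.BytesIO} stream as a stand-in for a real file or socket.

```python
import codecs
import io

# "é" is two bytes in UTF-8 (0xC3 0xA9); "!" is a single byte.
raw = io.BytesIO("é!".encode("utf-8"))
reader = codecs.getreader("utf-8")(raw)

# size=1 asks the underlying stream for one byte at a time, while
# chars=1 caps the result at one character. The lone 0xC3 byte is
# incomplete input, so it is buffered until 0xA9 has been read.
first = reader.read(size=1, chars=1)

# Called without arguments, read() returns whatever is left.
rest = reader.read()

print(repr(first), repr(rest))
```

The first call needs two one-byte reads from the stream before a full character is available, which is the incomplete-input case the stateful UTF-8 decoder is meant to handle.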

\begin{methoddesc}{readline}{[size]}
\begin{methoddesc}{readline}{\optional{size\optional{, keepends}}}
Read one line from the input stream and return the
decoded data.

Unlike the \method{readlines()} method, this method inherits
the line breaking knowledge from the underlying stream's
\method{readline()} method -- there is currently no support for line
breaking using the codec decoder due to lack of line buffering.
Subclasses should however, if possible, try to implement this method
using their own knowledge of line breaking.

\var{size}, if given, is passed as the size argument to the stream's
\method{readline()} method.

If \var{keepends} is false, line endings will be stripped from the
lines returned.

\versionchanged[\var{keepends} argument added]{2.4}
\end{methoddesc}
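A short sketch of the \var{keepends} argument on a UTF-16 stream, the case the commit message singles out for \method{readline()} support. It assumes a current Python 3 interpreter, where \var{keepends} defaults to true; the sample text and variable names are illustrative only.

```python
import codecs
import io

text = "first\nsecond\n"
encoded = text.encode("utf-16")  # BOM followed by two bytes per character

reader = codecs.getreader("utf-16")(io.BytesIO(encoded))
with_end = reader.readline()                   # line ending kept
without_end = reader.readline(keepends=False)  # line ending stripped

# readlines() accepts the same flag; a fresh reader starts from the BOM.
reader = codecs.getreader("utf-16")(io.BytesIO(encoded))
all_lines = reader.readlines(keepends=False)

print(repr(with_end), repr(without_end), all_lines)
```

Because the reader buffers decoded characters itself, line splitting works on the decoded text rather than on the raw bytes, where a UTF-16 newline is a two-byte sequence.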

\begin{methoddesc}{readlines}{[sizehint]}
\begin{methoddesc}{readlines}{\optional{sizehint\optional{, keepends}}}
Read all lines available on the input stream and return them as a list
of lines.

Line breaks are implemented using the codec's decoder method and are
included in the list entries.
included in the list entries if \var{keepends} is true.

\var{sizehint}, if given, is passed as the \var{size} argument to the
stream's \method{read()} method.