gh-117151: IO performance improvement, increase io.DEFAULT_BUFFER_SIZE to 128k (GH-118144)

Co-authored-by: rmorotti <romain.morotti@man.com>
morotti 2025-03-07 19:36:12 +00:00 committed by GitHub
parent 4bf25a0dc8
commit b1b4f9625c
7 changed files with 38 additions and 22 deletions
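
The default buffering policy introduced by this change can be sketched as a small standalone function. This is an illustration only, not code from the patch: `default_buffering` and `MAX_BUFFER_SIZE` are hypothetical names, and the block size values passed in are made-up examples of what a raw file might report.

```python
# Sketch of the policy: max(min(blocksize, 8 MiB), DEFAULT_BUFFER_SIZE).
# `default_buffering` and `MAX_BUFFER_SIZE` are hypothetical names used
# for illustration; they are not part of the io module.

DEFAULT_BUFFER_SIZE = 128 * 1024   # new default from this commit, in bytes
MAX_BUFFER_SIZE = 8192 * 1024      # 8 MiB cap applied to the block size


def default_buffering(blksize):
    """Return the buffer size chosen for a raw file reporting `blksize`."""
    return max(min(blksize, MAX_BUFFER_SIZE), DEFAULT_BUFFER_SIZE)


if __name__ == "__main__":
    # Example block sizes (assumed values, in bytes).
    for blksize in (4096, 128 * 1024, 1024 * 1024, 64 * 1024 * 1024):
        print(blksize, "->", default_buffering(blksize))
```

Under this policy a small block size (for example 4096) is raised to the 128 KiB default, a block size between 128 KiB and 8 MiB is used as is, and anything larger is capped at 8 MiB.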

@@ -23,8 +23,9 @@ if hasattr(os, 'SEEK_HOLE') :
     valid_seek_flags.add(os.SEEK_HOLE)
     valid_seek_flags.add(os.SEEK_DATA)
 
-# open() uses st_blksize whenever we can
-DEFAULT_BUFFER_SIZE = 8 * 1024 # bytes
+# open() uses max(min(blocksize, 8 MiB), DEFAULT_BUFFER_SIZE)
+# when the device block size is available.
+DEFAULT_BUFFER_SIZE = 128 * 1024 # bytes
 
 # NOTE: Base classes defined here are registered with the "official" ABCs
 # defined in io.py. We don't use real inheritance though, because we don't want
@@ -123,10 +124,10 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
     the size of a fixed-size chunk buffer. When no buffering argument is
     given, the default buffering policy works as follows:
 
-    * Binary files are buffered in fixed-size chunks; the size of the buffer
-      is chosen using a heuristic trying to determine the underlying device's
-      "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`.
-      On many systems, the buffer will typically be 4096 or 8192 bytes long.
+    * Binary files are buffered in fixed-size chunks; the size of the buffer
+      is max(min(blocksize, 8 MiB), DEFAULT_BUFFER_SIZE)
+      when the device block size is available.
+      On most systems, the buffer will typically be 128 kilobytes long.
 
     * "Interactive" text files (files for which isatty() returns True)
       use line buffering. Other text files use the policy described above
@@ -242,7 +243,7 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
         buffering = -1
         line_buffering = True
     if buffering < 0:
-        buffering = raw._blksize
+        buffering = max(min(raw._blksize, 8192 * 1024), DEFAULT_BUFFER_SIZE)
     if buffering < 0:
         raise ValueError("invalid buffering size")
     if buffering == 0: