bpo-33671: efficient zero-copy for shutil.copy* functions (Linux, OSX and Win) (#7160)

* have shutil.copyfileobj use sendfile() if possible

* refactoring: use ctx manager

* add test with non-regular file obj

* emulate case where file size can't be determined

* reference _copyfileobj_sendfile directly

* add test for offset() at certain position

* add test for empty file

* add test for non regular file dst

* small refactoring

* leave copyfileobj() alone in order to not introduce any incompatibility

* minor refactoring

* remove old test

* update docstring

* update docstring; rename exception class

* detect platforms which only support file to socket zero copy

* don't run test on platforms where file-to-file zero copy is not supported

* use tempfiles

* reset verbosity

* add test for smaller chunks

* add big file size test

* add comment

* update doc

* update whatsnew doc

* update doc

* catch Exception

* remove unused import

* add test case for error on second sendfile() call

* turn docstring into comment

* add one more test

* update comment

* add Misc/NEWS entry

* get rid of COPY_BUFSIZE; it belongs to another PR

* update doc

* expose posix._fcopyfile() for OSX

* merge from linux branch

* merge from linux branch

* expose fcopyfile

* arg clinic for the win implementation

* convert path type to path_t

* expose CopyFileW

* fix windows tests

* release GIL

* minor refactoring

* update doc

* update comment

* update docstrings

* rename functions

* rename test classes

* update doc

* update doc

* update docstrings and comments

* avoid importing nt|posix modules if unnecessary

* set nt|posix modules to None if not available

* micro speedup

* update description

* add doc note

* use better wording in doc

* rename function using 'fastcopy' prefix instead of 'zerocopy'

* use :ref: in rst doc

* change wording in doc

* add test to make sure sendfile() doesn't get called anymore in case it doesn't support file to file copies

* move CopyFileW in _winapi and actually expose CopyFileExW instead

* fix line endings

* add tests for mode bits

* add docstring

* remove test file mode class; let's keep it for later when I start addressing OSX fcopyfile() specific copies

* update doc to reflect new changes

* update doc

* adjust tests on win

* fix argument clinic error

* update doc

* OSX: expose copyfile(3) instead of fcopyfile(3); also expose flags arg to python

* osx / copyfile: use path_t instead of char

* do not set dst name in the OSError exception in order to remain consistent with platforms which cannot do that (e.g. linux)

* add same file test

* add test for same file

* have osx copyfile() pre-emptively check if src and dst are the same, otherwise it will return immediately and src file content gets deleted

* turn PermissionError into appropriate SameFileError

* expose ERROR_SHARING_VIOLATION in order to raise more appropriate SameFileError

* honour follow_symlinks arg when using CopyFileEx

* update Misc/NEWS

* expose CreateDirectoryEx mock

* change C type

* CreateDirectoryExW actual implementation

* provide specific makedirs() implementation for win

* fix typo

* skeleton for SetNamedSecurityInfo

* get security info for src path

* finally set security attrs

* add unit tests

* mimic os.makedirs() behavior and raise if dst dir exists

* set 2 paths for OSError object

* set 2 paths for OSError object

* expand windows test

* in case of exception on os.sendfile() set filename and filename2 exception attributes

* set 2 filenames (src, dst) for OSError in case copyfile() fails on OSX

* update doc

* do not use CreateDirectoryEx() in copytree() if source dir is a symlink (breaks test_copytree_symlink_dir); instead just create a plain dir and remain consistent with POSIX implementation

* use bytearray() and readinto()

* use memoryview() with bytearray()

* refactoring + introduce a new _fastcopy_binfileobj() function

* remove CopyFileEx and other C wrappers

* remove code related to CopyFileEx

* Recognize binary files in copyfileobj()
...and use fastest _fastcopy_binfileobj() when possible

* set 1MB copy bufsize on win; also add a global _COPY_BUFSIZE variable

* use ctx manager for memoryview()

* update doc

* remove outdated doc

* remove last CopyFileEx remnants

* OSX - use fcopyfile(3) instead of copyfile(3)

...as an extra safety measure: in case src/dst are "exotic" files (non
regular or living on a network fs etc.) we better fail on open() instead
of copyfile(3) as we're not quite sure what's gonna happen in that
case.

* update doc
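
The copy strategy described in the list above (try sendfile() first, otherwise read into a reusable bytearray through a memoryview) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `copy_sendfile`, `copy_readinto`, `fast_copy` and the `COPY_BUFSIZE` value are hypothetical, and the sketch assumes `fsrc` is positioned at the start of the file and `fdst` has no unflushed buffered data.

```python
import os

COPY_BUFSIZE = 64 * 1024  # illustrative value, not taken from the commit


def copy_readinto(fsrc, fdst, bufsize=COPY_BUFSIZE):
    """Copy between binary file objects through one reusable buffer."""
    # Using memoryview as a context manager releases the buffer on exit.
    with memoryview(bytearray(bufsize)) as mv:
        while True:
            n = fsrc.readinto(mv)
            if not n:
                break
            if n < bufsize:
                # Short read: write only the part that was filled.
                with mv[:n] as smv:
                    fdst.write(smv)
            else:
                fdst.write(mv)


def copy_sendfile(fsrc, fdst, blocksize=COPY_BUFSIZE):
    """Try os.sendfile(); return False if it cannot be used at all."""
    if not hasattr(os, "sendfile"):
        return False
    try:
        infd, outfd = fsrc.fileno(), fdst.fileno()
    except (AttributeError, OSError):
        return False  # not backed by a real file descriptor
    offset = 0
    while True:
        try:
            sent = os.sendfile(outfd, infd, offset, blocksize)
        except OSError:
            if offset == 0:
                # First call failed, nothing was written: let the caller
                # fall back (e.g. platform only supports file-to-socket).
                return False
            raise
        if sent == 0:
            return True  # end of file reached
        offset += sent


def fast_copy(fsrc, fdst):
    """Use sendfile() when possible, else the readinto() loop."""
    if not copy_sendfile(fsrc, fdst):
        copy_readinto(fsrc, fdst)
```

Falling back only when the first sendfile() call fails keeps the fallback safe, since no data has reached the destination at that point; an error on a later call is re-raised instead.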
Giampaolo Rodola 2018-06-12 23:04:50 +02:00 committed by GitHub
parent 33cd058f21
commit 4a172ccc73
8 changed files with 595 additions and 19 deletions

@@ -97,6 +97,10 @@ corresponding Unix manual entries for more information on calls.");
#include <sys/sendfile.h>
#endif
#if defined(__APPLE__)
#include <copyfile.h>
#endif
#ifdef HAVE_SCHED_H
#include <sched.h>
#endif
@@ -8742,6 +8746,34 @@ done:
#endif /* HAVE_SENDFILE */
#if defined(__APPLE__)
/*[clinic input]
os._fcopyfile

    infd: int
    outfd: int
    flags: int
    /

Efficiently copy content or metadata of 2 regular file descriptors (OSX).
[clinic start generated code]*/

static PyObject *
os__fcopyfile_impl(PyObject *module, int infd, int outfd, int flags)
/*[clinic end generated code: output=8e8885c721ec38e3 input=aeb9456804eec879]*/
{
    int ret;

    Py_BEGIN_ALLOW_THREADS
    ret = fcopyfile(infd, outfd, NULL, flags);
    Py_END_ALLOW_THREADS
    if (ret < 0)
        return posix_error();
    Py_RETURN_NONE;
}
#endif
/*[clinic input]
os.fstat
@@ -12918,6 +12950,7 @@ static PyMethodDef posix_methods[] = {
    OS_UTIME_METHODDEF
    OS_TIMES_METHODDEF
    OS__EXIT_METHODDEF
    OS__FCOPYFILE_METHODDEF
    OS_EXECV_METHODDEF
    OS_EXECVE_METHODDEF
    OS_SPAWNV_METHODDEF
@@ -13537,6 +13570,10 @@ all_ins(PyObject *m)
    if (PyModule_AddIntMacro(m, GRND_NONBLOCK)) return -1;
#endif

#if defined(__APPLE__)
    if (PyModule_AddIntConstant(m, "_COPYFILE_DATA", COPYFILE_DATA)) return -1;
#endif

    return 0;
}
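
For orientation, here is a rough Python-side sketch of how the `_fcopyfile` wrapper and the `_COPYFILE_DATA` constant added in this diff might be called. It assumes both are reachable as attributes of the `posix` module on macOS builds that include this change; the function name `copy_with_fcopyfile` and the fallback to `shutil.copyfileobj()` are illustrative, not taken from the commit.

```python
import shutil

try:
    import posix  # not available on Windows
except ImportError:
    posix = None


def copy_with_fcopyfile(src_path, dst_path):
    """Copy file data, using fcopyfile(3) where the wrapper exists."""
    # Both names are private and only defined on macOS builds that
    # include this change; elsewhere getattr() yields None.
    fcopyfile = getattr(posix, "_fcopyfile", None)
    data_flag = getattr(posix, "_COPYFILE_DATA", None)
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        if fcopyfile is not None and data_flag is not None:
            # Arguments follow the C wrapper: infd, outfd, flags.
            fcopyfile(fsrc.fileno(), fdst.fileno(), data_flag)
            return "fcopyfile"
        # Hypothetical fallback for platforms without the wrapper.
        shutil.copyfileobj(fsrc, fdst)
        return "fallback"
```

Opening both files before calling the wrapper mirrors the rationale in the commit message: an "exotic" source or destination fails at open() rather than inside the copy call.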