Commit graph

48 commits

Author SHA1 Message Date
Jason R. Coombs
8d6eb0c262
gh-135276: Refresh zipfile.Path from zipp 3.23 (#135277)
Apply changes from zipp 3.23
2025-06-08 19:20:20 +00:00
Tim Hatch
1298511b41
gh-72680: Fix false positives when using zipfile.is_zipfile() (GH-134250)
bpo-28494: Improve zipfile.is_zipfile reliability

The zipfile.is_zipfile function would only search for the EndOfZipfile
section header. This failed to correctly identify non-zipfiles that
contained this header. Now the zipfile.is_zipfile function verifies
the first central directory entry.

Changes:
* Extended zipfile.is_zipfile to verify zipfile catalog
* Added tests to validate failure of binary non-zipfiles
* Reuse 'concat' handling for is_zipfile
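
As a rough illustration of the function this commit touches, the sketch below calls zipfile.is_zipfile on an in-memory archive and on plain text. The file names and payloads are made up for the example, and it does not try to reproduce the false-positive case, whose result depends on the interpreter version.

```python
import io
import zipfile

# Build a small, valid archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello")

real_archive = io.BytesIO(buf.getvalue())
plain_text = io.BytesIO(b"definitely not an archive, just some text")

# is_zipfile accepts a path or a file-like object.
is_real = zipfile.is_zipfile(real_archive)
is_text = zipfile.is_zipfile(plain_text)
print(is_real, is_text)
```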

Co-authored-by: John Jolly <john.jolly@gmail.com>
2025-05-20 18:32:41 -07:00
Carey Metcalfe
35f47d0589
gh-132983: Fix small issues with zstd support in zipfile (#133723)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Emma Smith <emma@emmatyping.dev>
2025-05-13 16:43:09 +01:00
Emma Smith
c273f59fb3
gh-132983: Add the compression.zstd package and tests (#133365)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
Co-authored-by: Rogdham <contact@rogdham.net>
2025-05-06 01:38:08 +01:00
Hugo van Kemenade
4ac916ae33
gh-130645: Add color to stdlib argparse CLIs (gh-133380)
2025-05-05 19:46:46 +02:00
Serhiy Storchaka
84a08f8629
gh-133306: Use \z instead of \Z in regular expressions in the stdlib (GH-133337)
2025-05-03 17:58:49 +03:00
Serhiy Storchaka
0f04f2456a
gh-117779: Fix reading duplicated entries in zipfile by name (GH-129254)
2025-04-08 13:56:42 +03:00
Emma Smith
6cd1d6c6b1
gh-84481: Make ZipFile.data_offset more robust (#132178)
2025-04-08 10:43:14 +03:00
Emma Smith
0788948dcb
gh-84481: Add ZipFile.data_offset attribute (#132165)
* Add ZipFile.data_offset attribute

This attribute provides the offset to zip data from the start of the file, when available.
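
A minimal sketch of how the attribute might be read, assuming an archive that has unrelated bytes prepended to it. PREFIX and the member name are made up for illustration, and getattr is used because interpreters without this change do not have the attribute.

```python
import io
import zipfile

# Hypothetical leading bytes, e.g. a self-extractor stub.
PREFIX = b"#!/bin/sh\nexit 0\n"

# Build a standalone archive, then prepend the unrelated bytes to it.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello.txt", "hello")
combined = io.BytesIO(PREFIX + archive.getvalue())

with zipfile.ZipFile(combined) as zf:
    # Fall back to None on interpreters that predate data_offset.
    offset = getattr(zf, "data_offset", None)
    content = zf.read("hello.txt")

print(offset, content)
```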

* Add blurb-it

* Try fixing class ref in NEWS
2025-04-06 13:51:42 -07:00
Bénédikt Tran
a95dca7b98
gh-118761: Improve import time for pstats and zipfile (#128981)
Importing `pstats` or `zipfile` is now roughly 20% faster.

This is achieved by removing type annotations depending on `typing`.
2025-01-23 14:49:36 +00:00
Wulian
5d57959d7d
gh-91279: ZipFile.writestr now respects SOURCE_DATE_EPOCH (#124435)
2025-01-20 13:12:29 -05:00
5ec1cff
dda02eb7be
GH-128131: Completely support random read access of uncompressed unencrypted files in ZipFile (#128143)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-01-20 13:04:43 -05:00
Bénédikt Tran
7e819ce0f3
gh-123424: add ZipInfo._for_archive to set suitable default properties (#123429)
---------

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
2024-12-29 18:30:53 +00:00
Dima Ryazanov
7ed6c5c696
gh-127847: Fix position in the special-cased zipfile seek (#127856)
---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
2024-12-24 15:56:42 +00:00
Bénédikt Tran
e0ef08f5b4
gh-122356: restore the position of a file-like object after zipfile.is_zipfile (#122397)
2024-11-24 11:36:15 -05:00
Jan Hicken
160758a574
gh-126565: Skip zipfile.Path.exists check in write mode (#126576)
When `zipfile.Path.open` is called, the implementation will check
whether the path already exists in the ZIP file. However, this check is
only required when the ZIP file is in read mode. By swapping arguments
of the `and` operator, the short-circuiting will prevent the check from
being run in write mode.

This change will improve the performance of `open()`, because checking
whether a file exists is slow in write mode, especially when the archive
has many members.
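
The effect of swapping the operands can be sketched in isolation. The functions below are hypothetical stand-ins, not the code in zipfile.Path; they only show that `and` skips its right-hand operand when the left-hand one is false.

```python
calls = []

def exists():
    # Stand-in for an expensive membership check; records each call.
    calls.append("exists")
    return False

def check_before(zip_mode):
    # Old operand order: exists() runs regardless of the mode.
    return not exists() and zip_mode == "r"

def check_after(zip_mode):
    # New operand order: exists() is skipped unless the mode is "r".
    return zip_mode == "r" and not exists()

check_before("w")
calls_before = len(calls)

calls.clear()
check_after("w")
calls_after = len(calls)

print(calls_before, calls_after)
```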
2024-11-10 09:57:24 -05:00
Cody Maloney
556dc9b8a7
gh-113977, gh-120754: Remove unbounded reads from zipfile (GH-122101)
GH-113977, GH-120754: Remove unbounded reads from zipfile

A read without a size may read an unbounded amount of data and allocate
buffers of unbounded size. Move to capped-size reads to prevent potential
issues.
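
The general idea of a capped read can be sketched as follows. read_capped is a hypothetical helper written for this illustration, not a function from the zipfile module, and the chunk size is an arbitrary choice.

```python
import io

def read_capped(fp, limit, chunk_size=64 * 1024):
    """Read at most `limit` bytes from fp, never asking for more than chunk_size at once."""
    parts = []
    remaining = limit
    while remaining > 0:
        chunk = fp.read(min(chunk_size, remaining))
        if not chunk:
            break  # end of stream reached before the limit
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

stream = io.BytesIO(b"x" * 1000)
head = read_capped(stream, limit=100, chunk_size=32)
print(len(head), stream.tell())
```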

Co-authored-by: Daniel Hillier <daniel.hillier@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2024-11-02 22:28:51 -07:00
Xie Yanbo
e9eedf19c9
Fix invisible character typo (#123933)
Remove accidental addition of zero-width character (U+FEFF) reported by @jaraco:
- c3f4a6b524 (commitcomment-146456562)
2024-09-11 07:44:46 -04:00
Jason R. Coombs
2231286d78
gh-123270: Replaced SanitizedNames with a more surgical fix. (#123354)
Applies changes from zipp 3.20.1 and jaraco/zipp#124
2024-08-27 17:10:30 -04:00
Jason R. Coombs
6aa35f3002
gh-122903: Honor directories in zipfile.Path.glob. (#122908)
2024-08-11 20:33:33 -04:00
Jason R. Coombs
9cd0326310
gh-122905: Sanitize names in zipfile.Path. (#122906)
Ported from zipp 3.19.1; ref jaraco/zipp#119.
2024-08-11 19:48:50 -04:00
Jason R. Coombs
42a34ddb0b
gh-119588: Implement zipfile.Path.is_symlink (zipp 3.19.0). (#119591)
2024-06-03 11:13:07 -04:00
Geoffrey Thomas
ef172521a9
Remove almost all unpaired backticks in docstrings (#119231)
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form

    The variable `foo' should do xyz

to

    The variable 'foo' should do xyz

and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).

No functional change is intended here other than in human-readable
docstrings.
2024-05-22 12:35:18 -04:00
Xie Yanbo
c3f4a6b524
Fix typo in Lib/zipfile/_path/__init__.py (#118622)
2024-05-06 13:58:27 +00:00
Serhiy Storchaka
51ef89cd9a
gh-115961: Add name and mode attributes for compressed file-like objects (GH-116036)
* Add name and mode attributes for compressed and archived file-like objects
  in modules bz2, lzma, tarfile and zipfile.
* Change the value of the mode attribute of GzipFile from integer (1 or 2)
  to string ('rb' or 'wb').
* Change the value of the mode attribute of ZipExtFile from 'r' to 'rb'.
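
For the zipfile part of this change, the attributes can be inspected on the object returned by ZipFile.open. This is only a sketch with a made-up member name; the mode value is 'rb' on interpreters that include the change and 'r' on older ones.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.bin", b"payload")

with zipfile.ZipFile(buf) as zf:
    with zf.open("data.bin") as member:
        member_name = member.name
        member_mode = member.mode  # 'rb' with this change, 'r' before it

print(member_name, member_mode)
```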
2024-04-21 11:46:39 +03:00
Deborah
a32d693948
gh-102190: Add additional zipfile pwd= arg docstrings (gh-102195)
This just documents the parameter that already exists.

---------

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2024-03-31 20:11:48 +00:00
Serhiy Storchaka
567ab3bd15
gh-117084: Fix ZIP file extraction for directory entry names with backslashes on Windows (GH-117129)
2024-03-22 20:08:00 +02:00
Jason R. Coombs
be59aaf3ab
gh-106531: Refresh zipfile._path with zipp 3.18. (#116835)
* gh-106531: Refresh zipfile._path with zipp 3.18.

* Add blurb
2024-03-14 21:53:50 +00:00
Serhiy Storchaka
5d2794a16b
gh-67837, gh-112998: Fix dirs creation in concurrent extraction (GH-115082)
Avoid race conditions in the creation of directories during concurrent
extraction in tarfile and zipfile.
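
One common way to tolerate this kind of race is to let directory creation succeed when another worker got there first. The sketch below illustrates that general pattern with os.makedirs(exist_ok=True); ensure_dir is a hypothetical helper and this is not necessarily the exact change made in tarfile and zipfile.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def ensure_dir(path):
    # exist_ok=True tolerates another worker creating the directory first.
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b", "c")
    # Many workers race to create the same nested directory.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(ensure_dir, [target] * 32))

print(all(results))
```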

Co-authored-by: Samantha Hughes <shughes-uk@users.noreply.github.com>
Co-authored-by: Peder Bergebakken Sundt <pbsds@hotmail.com>
2024-02-11 12:38:07 +02:00
Gregory P. Smith
b44b9d9900
gh-113971: Make zipfile.ZipInfo._compresslevel public as .compress_level (#113969)
Make zipfile.ZipInfo.compress_level public.

A property is used to retain the behavior of ._compresslevel.

People constructing zipfile.ZipInfo instances to pass into existing APIs to control per-file compression levels already treat this as public; there was never a reason for it not to be.

I used the more modern name compress_level instead of compresslevel (the name of the keyword argument on other ZipFile APIs) to be consistent with compress_type and with a general long-term preference for not runningwordstogether without a separator in names.
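
A sketch of the per-file usage described above, assuming zlib is available for ZIP_DEFLATED. The member name and payload are made up, and the hasattr check is only there so the example also runs on interpreters that predate the public name.

```python
import io
import zipfile

info = zipfile.ZipInfo("notes.txt")
info.compress_type = zipfile.ZIP_DEFLATED
# Prefer the public name where it exists; older interpreters only have
# the private _compresslevel attribute.
attr = "compress_level" if hasattr(info, "compress_level") else "_compresslevel"
setattr(info, attr, 9)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(info, "text " * 1000)

with zipfile.ZipFile(buf) as zf:
    restored = zf.read("notes.txt")
    stored_size = zf.getinfo("notes.txt").compress_size

print(len(restored), stored_size)
```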
2024-01-12 20:15:05 +00:00
Serhiy Storchaka
66363b9a7b
gh-109858: Protect zipfile from "quoted-overlap" zipbomb (GH-110016)
Raise BadZipFile when trying to read an entry that overlaps with another entry or
the central directory.
2024-01-10 15:55:36 +02:00
AN Long
541c5dbb81
gh-112795: Allow / folder in a zipfile (#112932)
Allow extraction (no-op) of a "/" folder in a zipfile; such entries are commonly added by some archive creation tools.

Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
2024-01-07 01:14:18 +00:00
Shantanu
29e6c7b68a
gh-112578: Fix RuntimeWarning when running zipfile (GH-112579)
2023-12-03 13:09:29 +02:00
Jokimax
c73b0f3560
gh-102956: Fix returning of empty byte strings after seek in zipfile … (#103565)
gh-102956: Fix returning of empty byte strings after seek in zipfile module. This was a regression in 3.12.0 due to a performance enhancement.
2023-10-24 21:15:42 +00:00
Kirill Podoprigora
4110cfec12
gh-110715: Add missing import in zipfile (gh-110822)
2023-10-14 16:17:47 +09:00
Jason R. Coombs
e9791ba351
gh-88233: zipfile: refactor _strip_extra (#102084)
* Refactor zipfile._strip_extra to use higher level abstractions for extras instead of a heavy-state loop.

* Add blurb

* Remove _strip_extra and use _Extra.strip directly.

* Use memoryview to avoid unnecessary copies while splitting Extras.
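
The approach in these bullets can be sketched independently of the module. ZIP extra fields are a sequence of records, each with a 2-byte id and a 2-byte payload size (little-endian) followed by the payload. split_extras and strip_extras below are hypothetical names for this illustration, not the private helpers in zipfile.

```python
import struct

def split_extras(extra):
    """Yield (header_id, record) pairs; each record is a view over header plus payload."""
    view = memoryview(extra)
    while len(view) >= 4:
        header_id, size = struct.unpack_from("<HH", view)
        # Slicing a memoryview does not copy the underlying bytes.
        yield header_id, view[:4 + size]
        view = view[4 + size:]

def strip_extras(extra, unwanted_ids):
    """Return the extra data with all records whose id is in unwanted_ids removed."""
    return b"".join(rec for hid, rec in split_extras(extra) if hid not in unwanted_ids)

# Two sample records: a zip64-style record (id 0x0001) and another record after it.
zip64 = struct.pack("<HHQ", 0x0001, 8, 123)
other = struct.pack("<HH", 0x7075, 3) + b"abc"
kept = strip_extras(zip64 + other, {0x0001})
print(kept == other)
```

Because the records are walked one by one, data that follows the removed record is preserved rather than dropped.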
2023-09-25 19:46:58 -04:00
Jason R. Coombs
22980dc7c9
gh-106752: Sync with zipp 3.16.2 (#106757)
* gh-106752: Sync with zipp 3.16.2

* Add blurb
2023-07-15 09:21:17 -04:00
Jason R. Coombs
03185f0c15
gh-106752: Move zipfile._path into its own package (#106753)
* gh-106752: Move zipfile._path into its own package so it may have supplementary behavior.

* Add blurb
2023-07-14 20:40:46 +00:00
Carey Metcalfe
798bcaa1eb
gh-103861: Fix Zip64 extensions not being properly applied in some cases (#103863)
Fix Zip64 extensions not being properly applied in some cases:

Fixes an issue where adding a small file to a `ZipFile`
object while forcing zip64 extensions causes an extra Zip64 record to be
added to the zip, but doesn't update the `min_version` or file sizes in
the primary central directory header.

Also fixed an edge case in checking if zip64 extensions are required:

This fixes an issue where if data requiring zip64 extensions was added
to an unseekable stream without specifying `force_zip64=True`, zip64
extensions would not be used and a RuntimeError would not be raised when
closing the file (even though the size would be known at that point).
This would result in successfully writing corrupt zip files.

Deciding if zip64 extensions are required outside of the `FileHeader`
function means that both `FileHeader` and `_ZipWriteFile` will always be
in sync. Previously, the `FileHeader` function could enable zip64
extensions without propagating that decision to the `_ZipWriteFile`
class, which would then not correctly write the data descriptor record
or check for errors on close.

If anyone is actually using `ZipInfo.FileHeader` as a public API without
explicitly passing True or False in for zip64, their own code may still be
susceptible to that kind of bug unless they make a similar change to
where the zip64 decision happens.
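
For context, the first scenario (a small file written while forcing zip64 extensions) can be set up through the public API roughly as follows. This is a sketch with a made-up member name; it only round-trips the data and does not inspect the headers that the fix corrects.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # force_zip64=True requests zip64 extensions even for a small member.
    with zf.open("small.txt", "w", force_zip64=True) as member:
        member.write(b"tiny")

with zipfile.ZipFile(buf) as zf:
    data = zf.read("small.txt")
    problem = zf.testzip()  # None means no bad member was found

print(data, problem)
```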

Fixes #103861

---------

Co-authored-by: Gregory P. Smith <greg@krypto.org>
2023-05-16 00:43:44 -07:00
Carey Metcalfe
4abfe6a14b
GH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (#92185)
This causes the zipfile module to also consider the character defined by
`os.altsep` (if there is one) to be a path separator and convert it to a
forward slash, as defined by the zip specification.

This is a logical no-op on all known platforms today, since os.altsep is currently only set to a meaningful value on Windows (where it is "/").
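
The conversion can be sketched as a standalone function that takes the separators as parameters, so it behaves the same on any platform. normalize_separators is a hypothetical helper for illustration, and the ":" alternate separator in the last call is invented to exercise the altsep branch.

```python
def normalize_separators(filename, sep, altsep):
    """Convert platform path separators in filename to '/'."""
    if sep != "/" and sep in filename:
        filename = filename.replace(sep, "/")
    if altsep and altsep != "/" and altsep in filename:
        filename = filename.replace(altsep, "/")
    return filename

# Windows-like settings: sep is a backslash, altsep is already "/".
windows_style = normalize_separators("docs\\readme.txt", sep="\\", altsep="/")
# POSIX-like settings: nothing to convert.
posix_style = normalize_separators("docs/readme.txt", sep="/", altsep=None)
# Invented platform whose altsep is ":".
made_up = normalize_separators("docs:readme.txt", sep="/", altsep=":")
print(windows_style, posix_style, made_up)
```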
2023-05-11 07:25:16 +00:00
Yeojin Kim
8f70b16e33
gh-86094: Add support for Unicode Path Extra Field in ZipFile (gh-102566)
2023-04-05 20:54:48 +09:00
Jason R. Coombs
a35fd38b57
gh-102209: Sync with zipp 3.15 moving complexity tests into dedicated module (#102232)
Sync with jaraco/zipp@757a4e1a.
2023-02-25 11:15:48 -05:00
Jason R. Coombs
36854bbb24
gh-101566: Sync with zipp 3.14. (GH-102018)
2023-02-20 13:01:58 -08:00
Tim Hatch
59e86caca8
gh-88233: zipfile: handle extras after a zip64 extra (GH-96161)
Previously, any data _after_ the zip64 extra would be removed.

With many new tests.

Fixes #88233

Automerge-Triggered-By: GH:jaraco
2023-02-20 09:07:03 -08:00
Gregory P. Smith
5927013e47
gh-101144: Allow open and read_text encoding to be positional. (#101145)
The encoding parameter of zipfile.Path open() and read_text() can again be supplied as a positional argument without causing a TypeError. 3.10.0b1 included a regression that made it keyword-only.

A documentation update is included, because users writing code to be compatible with a wide range of versions will need to consider this for some time.
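
A short sketch of the two calling styles, using a made-up member name. On interpreters affected by the regression, the positional calls raise TypeError, so code that must run across a wide range of versions may prefer the keyword form.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("greeting.txt", "héllo".encode("utf-8"))

with zipfile.ZipFile(buf) as zf:
    path = zipfile.Path(zf, "greeting.txt")
    # Positional encoding: accepted again with this fix.
    text_positional = path.read_text("utf-8")
    # Keyword encoding: works on versions with and without the fix.
    text_keyword = path.read_text(encoding="utf-8")
    with path.open("r", "utf-8") as stream:
        text_open = stream.read()

print(text_positional, text_keyword, text_open)
```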
2023-01-19 23:04:30 -08:00
dmjohnsson23
59665d0280
Improve zip64 limit error message (#95892)
2022-11-30 16:44:41 +05:30
Jason R. Coombs
93f22d30eb
gh-98108: Add limited pickleability to zipfile.Path (GH-98109)
* gh-98098: Move zipfile into a package.

* Moved test_zipfile to a package

* Extracted module for test_path.

* Add blurb

* Add jaraco as owner of zipfile.Path.

* Synchronize with minor changes found at jaraco/zipp@d9e7f4352d.

* gh-98108: Sync with zipp 3.9.1 adding pickleability.
2022-11-26 13:05:41 -05:00
Jason R. Coombs
7796d3179b
gh-98098: Create packages from zipfile and test_zipfile (gh-98103)
* gh-98098: Move zipfile into a package.

* Moved test_zipfile to a package

* Extracted module for test_path.

* Add blurb

* Add jaraco as owner of zipfile.Path.

* Synchronize with minor changes found at jaraco/zipp@d9e7f4352d.
2022-11-26 09:44:13 -05:00