bpo-45863: tarfile: don't zero out header fields unnecessarily (GH-29693)

Numeric fields of type float, notably mtime, can't be represented
exactly in the ustar header, so the pax header is used. But it is
helpful to set them to the nearest int (i.e. second rather than
nanosecond precision mtimes) in the ustar header as well, for the
benefit of unarchivers that don't understand the pax header.

Add test for tarfile.TarInfo.create_pax_header to confirm correct
behaviour.
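
A minimal sketch of the effect described above, using only the public `tarfile` API. The member name `example.txt` and the mtime value are made up for illustration. The offsets assume the standard ustar layout (mtime is 12 bytes of octal text at offset 136), and the sketch assumes the member's own ustar header is the final 512-byte block returned by `tobuf()`.

```python
import io
import tarfile

info = tarfile.TarInfo("example.txt")  # hypothetical member name
info.mtime = 1000.1                    # float mtime: not exactly representable in ustar

# tobuf() with PAX_FORMAT builds the pax extended header plus the ustar header.
buf = info.tobuf(format=tarfile.PAX_FORMAT)

# Assumption: the last 512-byte block is the member's ustar header,
# with mtime stored as octal text at bytes 136..147.
ustar_block = buf[-512:]
ustar_field = ustar_block[136:148].rstrip(b"\0 ").decode("ascii")
ustar_mtime = int(ustar_field or "0", 8)

# Read the header back through tarfile, which prefers the pax value.
# Two zero blocks are appended as an end-of-archive marker.
with tarfile.open(fileobj=io.BytesIO(buf + b"\0" * 1024)) as tar:
    pax_mtime = tar.getmembers()[0].mtime

print(ustar_mtime)  # 1000 with this change applied; 0 on interpreters without it
print(pax_mtime)    # 1000.1, taken from the pax header
```

An unarchiver that ignores pax headers sees only `ustar_mtime`, which is why a rounded value there is more useful than zero.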
Joshua Root 2022-02-10 04:06:19 +11:00 committed by GitHub
parent c0a5ebeb12
commit bf2d44ffb0
3 changed files with 71 additions and 6 deletions

@@ -888,15 +888,24 @@ class TarInfo(object):
         # Test number fields for values that exceed the field limit or values
         # that like to be stored as float.
         for name, digits in (("uid", 8), ("gid", 8), ("size", 12), ("mtime", 12)):
-            if name in pax_headers:
-                # The pax header has priority. Avoid overflow.
-                info[name] = 0
-                continue
-
+            needs_pax = False
+
             val = info[name]
-            if not 0 <= val < 8 ** (digits - 1) or isinstance(val, float):
-                pax_headers[name] = str(val)
+            val_is_float = isinstance(val, float)
+            val_int = round(val) if val_is_float else val
+            if not 0 <= val_int < 8 ** (digits - 1):
+                # Avoid overflow.
                 info[name] = 0
+                needs_pax = True
+            elif val_is_float:
+                # Put rounded value in ustar header, and full
+                # precision value in pax header.
+                info[name] = val_int
+                needs_pax = True
+
+            # The existing pax header has priority.
+            if needs_pax and name not in pax_headers:
+                pax_headers[name] = str(val)

         # Create a pax extended header if necessary.
         if pax_headers:
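
The per-field decision in the new loop body can be sketched as a standalone function. This is an illustration, not the code in `tarfile`: the name `split_numeric_field` and its return convention are hypothetical, and the logic is copied from the added lines of the hunk above.

```python
def split_numeric_field(name, val, digits, pax_headers):
    """Return the value for the ustar field; update pax_headers in place.

    Hypothetical helper mirroring the loop body in the hunk above.
    """
    val_is_float = isinstance(val, float)
    val_int = round(val) if val_is_float else val
    needs_pax = False

    if not 0 <= val_int < 8 ** (digits - 1):
        # Value does not fit the octal field: zero it to avoid overflow.
        ustar_val = 0
        needs_pax = True
    elif val_is_float:
        # Rounded value goes in the ustar field, full precision in pax.
        ustar_val = val_int
        needs_pax = True
    else:
        # Plain int that fits: the ustar field alone is enough.
        ustar_val = val

    # An entry already present in pax_headers has priority.
    if needs_pax and name not in pax_headers:
        pax_headers[name] = str(val)
    return ustar_val


pax = {}
print(split_numeric_field("mtime", 1000.1, 12, pax))  # 1000
print(split_numeric_field("uid", 8 ** 8, 8, pax))     # 0 (too large for 7 octal digits)
print(split_numeric_field("size", 5, 12, pax))        # 5
print(pax)  # {'mtime': '1000.1', 'uid': '16777216'}
```

Unlike the removed code, a field that already has a pax entry is no longer zeroed unconditionally. Its ustar value is still computed from `info[name]`, and only the pax entry is left untouched.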