cpython/Doc/lib/libtarfile.tex
Thomas Wouters 902d6ebddd Merged revisions 53005-53303 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

2007-01-09 23:18:33 +00:00


\section{\module{tarfile} --- Read and write tar archive files}
\declaremodule{standard}{tarfile}
\modulesynopsis{Read and write tar-format archive files.}
\versionadded{2.3}
\moduleauthor{Lars Gust\"abel}{lars@gustaebel.de}
\sectionauthor{Lars Gust\"abel}{lars@gustaebel.de}
The \module{tarfile} module makes it possible to read and create tar archives.
Some facts and figures:
\begin{itemize}
\item reads and writes \module{gzip} and \module{bz2} compressed archives.
\item creates \POSIX{} 1003.1-1990 compliant or GNU tar compatible archives.
\item reads GNU tar extensions \emph{longname}, \emph{longlink} and
\emph{sparse}.
\item stores pathnames of unlimited length using GNU tar extensions.
\item handles directories, regular files, hardlinks, symbolic links, fifos,
character devices and block devices and is able to acquire and
restore file information like timestamp, access permissions and owner.
\item can handle tape devices.
\end{itemize}
\begin{funcdesc}{open}{\optional{name\optional{, mode
\optional{, fileobj\optional{, bufsize}}}}}
Return a \class{TarFile} object for the pathname \var{name}.
For detailed information on \class{TarFile} objects,
see \citetitle{TarFile Objects} (section \ref{tarfile-objects}).
\var{mode} has to be a string of the form \code{'filemode[:compression]'};
it defaults to \code{'r'}. Here is a full list of mode combinations:
\begin{tableii}{c|l}{code}{mode}{action}
\lineii{'r' or 'r:*'}{Open for reading with transparent compression (recommended).}
\lineii{'r:'}{Open for reading exclusively without compression.}
\lineii{'r:gz'}{Open for reading with gzip compression.}
\lineii{'r:bz2'}{Open for reading with bzip2 compression.}
\lineii{'a' or 'a:'}{Open for appending with no compression.}
\lineii{'w' or 'w:'}{Open for uncompressed writing.}
\lineii{'w:gz'}{Open for gzip compressed writing.}
\lineii{'w:bz2'}{Open for bzip2 compressed writing.}
\end{tableii}
Note that \code{'a:gz'} or \code{'a:bz2'} is not possible.
If \var{mode} is not suitable to open a certain (compressed) file for
reading, \exception{ReadError} is raised. Use \var{mode} \code{'r'} to
avoid this. If a compression method is not supported,
\exception{CompressionError} is raised.
If \var{fileobj} is specified, it is used as an alternative to
a file object opened for \var{name}.
For special purposes, there is a second format for \var{mode}:
\code{'filemode|[compression]'}. \function{open()} will return a
\class{TarFile} object that processes its data as a stream of
blocks. No random seeking will be done on the file. If given,
\var{fileobj} may be any object that has a \method{read()} or
\method{write()} method (depending on the \var{mode}).
\var{bufsize} specifies the blocksize and defaults to \code{20 *
512} bytes. Use this variant in combination with
e.g. \code{sys.stdin}, a socket file object or a tape device.
However, such a \class{TarFile} object is limited in that it does
not allow random access; see ``Examples''
(section~\ref{tar-examples}). The currently possible modes:
\begin{tableii}{c|l}{code}{Mode}{Action}
\lineii{'r|*'}{Open a \emph{stream} of tar blocks for reading with transparent compression.}
\lineii{'r|'}{Open a \emph{stream} of uncompressed tar blocks for reading.}
\lineii{'r|gz'}{Open a gzip compressed \emph{stream} for reading.}
\lineii{'r|bz2'}{Open a bzip2 compressed \emph{stream} for reading.}
\lineii{'w|'}{Open an uncompressed \emph{stream} for writing.}
\lineii{'w|gz'}{Open a gzip compressed \emph{stream} for writing.}
\lineii{'w|bz2'}{Open a bzip2 compressed \emph{stream} for writing.}
\end{tableii}
\end{funcdesc}
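As a rough illustration of the stream modes, the following sketch writes a
gzip compressed stream into a memory buffer and reads it back with
transparent compression. The member name \code{hello.txt}, its contents and
the \class{io.BytesIO} buffer (standing in for a socket or pipe) are
assumptions made for the example.
\begin{verbatim}
import io
import tarfile

payload = b"hello world\n"

# Write a gzip compressed stream of tar blocks into a memory buffer.
buf = io.BytesIO()
tar = tarfile.open(mode="w|gz", fileobj=buf)
info = tarfile.TarInfo("hello.txt")      # hypothetical member name
info.size = len(payload)
tar.addfile(info, io.BytesIO(payload))
tar.close()

# Read it back as a stream; 'r|*' detects the compression itself.
buf.seek(0)
tar = tarfile.open(mode="r|*", fileobj=buf)
stream_names = [member.name for member in tar]
tar.close()
\end{verbatim}
Because the data is processed as a stream, the members are visited once, in
archive order.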
\begin{classdesc*}{TarFile}
Class for reading and writing tar archives. Do not use this
class directly; use \function{open()} instead.
See ``TarFile Objects'' (section~\ref{tarfile-objects}).
\end{classdesc*}
\begin{funcdesc}{is_tarfile}{name}
Return \constant{True} if \var{name} is a tar archive file that
the \module{tarfile} module can read.
\end{funcdesc}
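A minimal sketch of such a check is shown below. The file names and the
temporary directory are made up for the example.
\begin{verbatim}
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# A plain text file; on its own it is not a tar archive.
text_path = os.path.join(workdir, "notes.txt")
f = open(text_path, "w")
f.write("just some text\n")
f.close()

# An uncompressed archive that contains the text file.
archive_path = os.path.join(workdir, "notes.tar")
tar = tarfile.open(archive_path, "w")
tar.add(text_path, arcname="notes.txt")
tar.close()

text_is_archive = tarfile.is_tarfile(text_path)
archive_is_archive = tarfile.is_tarfile(archive_path)
\end{verbatim}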
\begin{classdesc}{TarFileCompat}{filename\optional{, mode\optional{,
compression}}}
Class for limited access to tar archives with a
\refmodule{zipfile}-like interface. Please consult the
documentation of the \refmodule{zipfile} module for more details.
\var{compression} must be one of the following constants:
\begin{datadesc}{TAR_PLAIN}
Constant for an uncompressed tar archive.
\end{datadesc}
\begin{datadesc}{TAR_GZIPPED}
Constant for a \refmodule{gzip} compressed tar archive.
\end{datadesc}
\end{classdesc}
\begin{excdesc}{TarError}
Base class for all \module{tarfile} exceptions.
\end{excdesc}
\begin{excdesc}{ReadError}
Is raised when a tar archive is opened that either cannot be handled by
the \module{tarfile} module or is somehow invalid.
\end{excdesc}
\begin{excdesc}{CompressionError}
Is raised when a compression method is not supported or when the data
cannot be decoded properly.
\end{excdesc}
\begin{excdesc}{StreamError}
Is raised for the limitations that are typical for stream-like
\class{TarFile} objects.
\end{excdesc}
\begin{excdesc}{ExtractError}
Is raised for \emph{non-fatal} errors when using \method{extract()}, but
only if \member{TarFile.errorlevel}\code{ == 2}.
\end{excdesc}
\begin{excdesc}{HeaderError}
Is raised by \method{frombuf()} if the buffer it gets is invalid.
\versionadded{2.6}
\end{excdesc}
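One possible way to handle these exceptions is sketched below: the more
specific \exception{ReadError} is caught first and the base class
\exception{TarError} serves as a fallback. The input data and the
\code{outcome} strings are invented for the example.
\begin{verbatim}
import io
import tarfile

# 1024 bytes of text cannot be parsed as a tar header block.
garbage = io.BytesIO(b"x" * 1024)

try:
    tarfile.open(mode="r:", fileobj=garbage)
    outcome = "opened"
except tarfile.ReadError:
    outcome = "not a readable tar archive"
except tarfile.TarError:
    # Base class: catches any other tarfile-specific failure.
    outcome = "other tarfile error"
\end{verbatim}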
\begin{seealso}
\seemodule{zipfile}{Documentation of the \refmodule{zipfile}
standard module.}
\seetitle[http://www.gnu.org/software/tar/manual/html_node/tar_134.html\#SEC134]
{GNU tar manual, Basic Tar Format}{Documentation for tar archive files,
including GNU tar extensions.}
\end{seealso}
%-----------------
% TarFile Objects
%-----------------
\subsection{TarFile Objects \label{tarfile-objects}}
The \class{TarFile} object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up
of a header block followed by data blocks. It is possible to store a file in a
tar archive several times. Each archive member is represented by a
\class{TarInfo} object, see \citetitle{TarInfo Objects} (section
\ref{tarinfo-objects}) for details.
\begin{classdesc}{TarFile}{\optional{name
\optional{, mode\optional{, fileobj}}}}
Open an \emph{(uncompressed)} tar archive \var{name}.
\var{mode} is either \code{'r'} to read from an existing archive,
\code{'a'} to append data to an existing file or \code{'w'} to create a new
file overwriting an existing one. \var{mode} defaults to \code{'r'}.
If \var{fileobj} is given, it is used for reading or writing data.
If it can be determined, \var{mode} is overridden by \var{fileobj}'s mode.
\begin{notice}
\var{fileobj} is not closed when \class{TarFile} is closed.
\end{notice}
\end{classdesc}
\begin{methoddesc}{open}{...}
Alternative constructor. The \function{open()} function on module level is
actually a shortcut to this classmethod. See section~\ref{module-tarfile}
for details.
\end{methoddesc}
\begin{methoddesc}{getmember}{name}
Return a \class{TarInfo} object for member \var{name}. If \var{name}
cannot be found in the archive, \exception{KeyError} is raised.
\begin{notice}
If a member occurs more than once in the archive, its last
occurrence is assumed to be the most up-to-date version.
\end{notice}
\end{methoddesc}
\begin{methoddesc}{getmembers}{}
Return the members of the archive as a list of \class{TarInfo} objects.
The list has the same order as the members in the archive.
\end{methoddesc}
\begin{methoddesc}{getnames}{}
Return the members as a list of their names. It has the same order as
the list returned by \method{getmembers()}.
\end{methoddesc}
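The lookup methods can be tried out on a small archive built in memory, as
in the following sketch. The helper \code{add_bytes()} and all member names
are hypothetical; \code{a.txt} is stored twice to show which occurrence
\method{getmember()} returns.
\begin{verbatim}
import io
import tarfile

def add_bytes(tar, name, data):
    """Store data under name using a hand-built TarInfo (example helper)."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

buf = io.BytesIO()
tar = tarfile.open(mode="w", fileobj=buf)
add_bytes(tar, "a.txt", b"old")
add_bytes(tar, "b.txt", b"other")
add_bytes(tar, "a.txt", b"newer")        # same name stored a second time
tar.close()

buf.seek(0)
tar = tarfile.open(mode="r", fileobj=buf)
names = tar.getnames()                   # archive order, duplicates included
latest = tar.getmember("a.txt")          # the last occurrence is returned
try:
    tar.getmember("missing.txt")
    missing_found = True
except KeyError:
    missing_found = False
tar.close()
\end{verbatim}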
\begin{methoddesc}{list}{verbose=True}
Print a table of contents to \code{sys.stdout}. If \var{verbose} is
\constant{False}, only the names of the members are printed. If it is
\constant{True}, output similar to that of \program{ls -l} is produced.
\end{methoddesc}
\begin{methoddesc}{next}{}
Return the next member of the archive as a \class{TarInfo} object, when
\class{TarFile} is opened for reading. Return \code{None} if no more
members are available.
\end{methoddesc}
\begin{methoddesc}{extractall}{\optional{path\optional{, members}}}
Extract all members from the archive to the current working directory
or directory \var{path}. If optional \var{members} is given, it must be
a subset of the list returned by \method{getmembers()}.
Directory information such as owner, modification time and permissions is
set after all members have been extracted. This is done to work around two
problems: a directory's modification time is reset each time a file is
created in it, and if a directory's permissions do not allow writing,
extracting files to it will fail.
\versionadded{2.5}
\end{methoddesc}
\begin{methoddesc}{extract}{member\optional{, path}}
Extract a member from the archive to the current working directory,
using its full name. Its file information is extracted as accurately as
possible.
\var{member} may be a filename or a \class{TarInfo} object.
You can specify a different directory using \var{path}.
\begin{notice}
Because the \method{extract()} method allows random access to a tar
archive there are some issues you must take care of yourself. See the
description for \method{extractall()} above.
\end{notice}
\end{methoddesc}
\begin{methoddesc}{extractfile}{member}
Extract a member from the archive as a file object.
\var{member} may be a filename or a \class{TarInfo} object.
If \var{member} is a regular file, a file-like object is returned.
If \var{member} is a link, a file-like object is constructed from the
link's target.
If \var{member} is none of the above, \code{None} is returned.
\begin{notice}
The file-like object is read-only and provides the following methods:
\method{read()}, \method{readline()}, \method{readlines()},
\method{seek()}, \method{tell()}.
\end{notice}
\end{methoddesc}
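The following sketch reads a member through \method{extractfile()} without
writing anything to disk. The archive is built in memory first; the member
names and contents are assumptions for the example.
\begin{verbatim}
import io
import tarfile

buf = io.BytesIO()
tar = tarfile.open(mode="w", fileobj=buf)

folder = tarfile.TarInfo("docs")         # hypothetical directory member
folder.type = tarfile.DIRTYPE
tar.addfile(folder)

text = b"first line\nsecond line\n"
info = tarfile.TarInfo("docs/readme.txt")
info.size = len(text)
tar.addfile(info, io.BytesIO(text))
tar.close()

buf.seek(0)
tar = tarfile.open(mode="r", fileobj=buf)
member_file = tar.extractfile("docs/readme.txt")
first_line = member_file.readline()
rest = member_file.read()
no_file = tar.extractfile("docs")        # a directory yields None
tar.close()
\end{verbatim}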
\begin{methoddesc}{add}{name\optional{, arcname\optional{, recursive}}}
Add the file \var{name} to the archive. \var{name} may be any type
of file (directory, fifo, symbolic link, etc.).
If given, \var{arcname} specifies an alternative name for the file in the
archive. Directories are added recursively by default.
This can be avoided by setting \var{recursive} to \constant{False};
the default is \constant{True}.
\end{methoddesc}
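The effect of \var{arcname} and \var{recursive} can be seen in a sketch like
the one below. The directory layout, the archive names and the helper
\code{archived_names()} are invented for the example.
\begin{verbatim}
import os
import tarfile
import tempfile

def archived_names(path):
    """Return the member names stored in the archive at path."""
    tar = tarfile.open(path)
    try:
        return tar.getnames()
    finally:
        tar.close()

# Hypothetical layout: <workdir>/project/main.py
workdir = tempfile.mkdtemp()
project = os.path.join(workdir, "project")
os.mkdir(project)
f = open(os.path.join(project, "main.py"), "w")
f.write("# placeholder\n")
f.close()

# Default: the directory and everything below it is stored.
full_path = os.path.join(workdir, "full.tar")
tar = tarfile.open(full_path, "w")
tar.add(project, arcname="project-1.0")
tar.close()

# recursive=False: only the directory entry itself is stored.
shallow_path = os.path.join(workdir, "shallow.tar")
tar = tarfile.open(shallow_path, "w")
tar.add(project, arcname="project-1.0", recursive=False)
tar.close()

full_names = archived_names(full_path)
shallow_names = archived_names(shallow_path)
\end{verbatim}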
\begin{methoddesc}{addfile}{tarinfo\optional{, fileobj}}
Add the \class{TarInfo} object \var{tarinfo} to the archive.
If \var{fileobj} is given, \code{\var{tarinfo}.size} bytes are read
from it and added to the archive. You can create \class{TarInfo} objects
using \method{gettarinfo()}.
\begin{notice}
On Windows platforms, \var{fileobj} should always be opened with mode
\code{'rb'}; in text mode, line-ending translation makes the number of bytes
read differ from the recorded file size.
\end{notice}
\end{methoddesc}
\begin{methoddesc}{gettarinfo}{\optional{name\optional{,
arcname\optional{, fileobj}}}}
Create a \class{TarInfo} object for either the file \var{name} or
the file object \var{fileobj} (using \function{os.fstat()} on its
file descriptor). You can modify some of the \class{TarInfo}'s
attributes before you add it using \method{addfile()}. If given,
\var{arcname} specifies an alternative name for the file in the
archive.
\end{methoddesc}
\begin{methoddesc}{close}{}
Close the \class{TarFile}. In write mode, two finishing zero
blocks are appended to the archive.
\end{methoddesc}
\begin{memberdesc}{posix}
If true, create a \POSIX{} 1003.1-1990 compliant archive. GNU
extensions are not used, because they are not part of the \POSIX{}
standard. This limits the length of filenames to at most 256 characters,
link names to 100 characters and the maximum file size to 8
gigabytes. A \exception{ValueError} is raised if a file exceeds
one of these limits. If false, create a GNU tar compatible archive. It
will not be \POSIX{} compliant, but can store files without any
of the above restrictions.
\versionchanged[\var{posix} defaults to \constant{False}]{2.4}
\end{memberdesc}
\begin{memberdesc}{dereference}
If false, add symbolic and hard links to archive. If true, add the
content of the target files to the archive. This has no effect on
systems that do not support symbolic links.
\end{memberdesc}
\begin{memberdesc}{ignore_zeros}
If false, treat an empty block as the end of the archive. If true,
skip empty (and invalid) blocks and try to get as many members as
possible. This is only useful for concatenated or damaged
archives.
\end{memberdesc}
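The difference is easiest to see with a concatenated archive. The sketch
below glues two single-member archives together and reads the result twice.
The subclass \class{LenientTarFile}, the helper
\code{single_member_archive()} and the member names are hypothetical;
setting the attribute in a subclass is only one possible way to enable it.
\begin{verbatim}
import io
import tarfile

def single_member_archive(name, data):
    """Return the bytes of an uncompressed archive holding one member."""
    buf = io.BytesIO()
    tar = tarfile.open(mode="w", fileobj=buf)
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
    tar.close()
    return buf.getvalue()

class LenientTarFile(tarfile.TarFile):
    # Example subclass that switches the attribute on.
    ignore_zeros = True

# Two complete archives glued together, as "cat one.tar two.tar" would do.
joined = (single_member_archive("one.txt", b"1") +
          single_member_archive("two.txt", b"2"))

tar = tarfile.open(mode="r:", fileobj=io.BytesIO(joined))
strict_names = tar.getnames()            # stops at the first empty block
tar.close()

tar = LenientTarFile.open(mode="r:", fileobj=io.BytesIO(joined))
lenient_names = tar.getnames()           # skips the empty blocks in between
tar.close()
\end{verbatim}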
\begin{memberdesc}{debug=0}
To be set from \code{0} (no debug messages; the default) up to
\code{3} (all debug messages). The messages are written to
\code{sys.stderr}.
\end{memberdesc}
\begin{memberdesc}{errorlevel}
If \code{0} (the default), all errors are ignored when using
\method{extract()}. Nevertheless, they appear as error messages
in the debug output, when debugging is enabled. If \code{1}, all
\emph{fatal} errors are raised as \exception{OSError} or
\exception{IOError} exceptions. If \code{2}, all \emph{non-fatal}
errors are raised as \exception{TarError} exceptions as well.
\end{memberdesc}
%-----------------
% TarInfo Objects
%-----------------
\subsection{TarInfo Objects \label{tarinfo-objects}}
A \class{TarInfo} object represents one member in a
\class{TarFile}. Aside from storing all required attributes of a file
(like file type, size, time, permissions, owner etc.), it provides
some useful methods to determine its type. It does \emph{not} contain
the file's data itself.
\class{TarInfo} objects are returned by \class{TarFile}'s methods
\method{getmember()}, \method{getmembers()} and \method{gettarinfo()}.
\begin{classdesc}{TarInfo}{\optional{name}}
Create a \class{TarInfo} object.
\end{classdesc}
\begin{methoddesc}{frombuf}{}
Create and return a \class{TarInfo} object from a string buffer.
\versionadded[Raises \exception{HeaderError} if the buffer is
invalid.]{2.6}
\end{methoddesc}
\begin{methoddesc}{tobuf}{posix}
Create a string buffer from a \class{TarInfo} object.
See \class{TarFile}'s \member{posix} attribute for information
on the \var{posix} argument. It defaults to \constant{False}.
\versionadded[The \var{posix} parameter]{2.5}
\end{methoddesc}
A \class{TarInfo} object has the following public data attributes:
\begin{memberdesc}{name}
Name of the archive member.
\end{memberdesc}
\begin{memberdesc}{size}
Size in bytes.
\end{memberdesc}
\begin{memberdesc}{mtime}
Time of last modification.
\end{memberdesc}
\begin{memberdesc}{mode}
Permission bits.
\end{memberdesc}
\begin{memberdesc}{type}
File type. \var{type} is usually one of these constants:
\constant{REGTYPE}, \constant{AREGTYPE}, \constant{LNKTYPE},
\constant{SYMTYPE}, \constant{DIRTYPE}, \constant{FIFOTYPE},
\constant{CONTTYPE}, \constant{CHRTYPE}, \constant{BLKTYPE},
\constant{GNUTYPE_SPARSE}. To determine the type of a
\class{TarInfo} object more conveniently, use the \code{is_*()}
methods below.
\end{memberdesc}
\begin{memberdesc}{linkname}
Name of the target file, which is only present in
\class{TarInfo} objects of type \constant{LNKTYPE} and
\constant{SYMTYPE}.
\end{memberdesc}
\begin{memberdesc}{uid}
User ID of the user who originally stored this member.
\end{memberdesc}
\begin{memberdesc}{gid}
Group ID of the user who originally stored this member.
\end{memberdesc}
\begin{memberdesc}{uname}
User name.
\end{memberdesc}
\begin{memberdesc}{gname}
Group name.
\end{memberdesc}
A \class{TarInfo} object also provides some convenient query methods:
\begin{methoddesc}{isfile}{}
Return \constant{True} if the \class{TarInfo} object is a regular
file.
\end{methoddesc}
\begin{methoddesc}{isreg}{}
Same as \method{isfile()}.
\end{methoddesc}
\begin{methoddesc}{isdir}{}
Return \constant{True} if it is a directory.
\end{methoddesc}
\begin{methoddesc}{issym}{}
Return \constant{True} if it is a symbolic link.
\end{methoddesc}
\begin{methoddesc}{islnk}{}
Return \constant{True} if it is a hard link.
\end{methoddesc}
\begin{methoddesc}{ischr}{}
Return \constant{True} if it is a character device.
\end{methoddesc}
\begin{methoddesc}{isblk}{}
Return \constant{True} if it is a block device.
\end{methoddesc}
\begin{methoddesc}{isfifo}{}
Return \constant{True} if it is a FIFO.
\end{methoddesc}
\begin{methoddesc}{isdev}{}
Return \constant{True} if it is one of character device, block
device or FIFO.
\end{methoddesc}
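As a small sketch of how the query methods might be combined, the function
below maps a \class{TarInfo} object to a short label. The function
\code{describe()}, the labels and the member names are made up for the
example; the \class{TarInfo} objects are built by hand instead of being read
from an archive.
\begin{verbatim}
import tarfile

def describe(tarinfo):
    """Return a short label for the kind of member tarinfo represents."""
    if tarinfo.isfile():
        return "regular file"
    if tarinfo.isdir():
        return "directory"
    if tarinfo.issym():
        return "symbolic link"
    if tarinfo.islnk():
        return "hard link"
    if tarinfo.isdev():
        return "device or FIFO"
    return "something else"

regular = tarfile.TarInfo("data.bin")    # REGTYPE is the default type
folder = tarfile.TarInfo("data")
folder.type = tarfile.DIRTYPE
link = tarfile.TarInfo("latest")
link.type = tarfile.SYMTYPE
link.linkname = "data.bin"
pipe = tarfile.TarInfo("queue")
pipe.type = tarfile.FIFOTYPE

labels = [describe(t) for t in (regular, folder, link, pipe)]
\end{verbatim}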
%------------------------
% Examples
%------------------------
\subsection{Examples \label{tar-examples}}
How to extract an entire tar archive to the current working directory:
\begin{verbatim}
import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()
\end{verbatim}
How to create an uncompressed tar archive from a list of filenames:
\begin{verbatim}
import tarfile
tar = tarfile.open("sample.tar", "w")
for name in ["foo", "bar", "quux"]:
    tar.add(name)
tar.close()
\end{verbatim}
How to read a gzip compressed tar archive and display some member information:
\begin{verbatim}
import tarfile
tar = tarfile.open("sample.tar.gz", "r:gz")
for tarinfo in tar:
    print tarinfo.name, "is", tarinfo.size, "bytes in size and is",
    if tarinfo.isreg():
        print "a regular file."
    elif tarinfo.isdir():
        print "a directory."
    else:
        print "something else."
tar.close()
\end{verbatim}
How to create a tar archive with faked information:
\begin{verbatim}
import tarfile
tar = tarfile.open("sample.tar.gz", "w:gz")
for name in namelist:
    tarinfo = tar.gettarinfo(name, "fakeproj-1.0/" + name)
    tarinfo.uid = 123
    tarinfo.gid = 456
    tarinfo.uname = "johndoe"
    tarinfo.gname = "fake"
    # Open in binary mode so the data matches tarinfo.size on all platforms.
    tar.addfile(tarinfo, file(name, "rb"))
tar.close()
\end{verbatim}
The \emph{only} way to extract an uncompressed tar stream from
\code{sys.stdin}:
\begin{verbatim}
import sys
import tarfile
tar = tarfile.open(mode="r|", fileobj=sys.stdin)
for tarinfo in tar:
    tar.extract(tarinfo)
tar.close()
\end{verbatim}