cpython/Doc/lib/liburlparse.tex
2007-01-09 23:18:33 +00:00


\section{\module{urlparse} ---
Parse URLs into components}
\declaremodule{standard}{urlparse}
\modulesynopsis{Parse URLs into components.}
\index{WWW}
\index{World Wide Web}
\index{URL}
\indexii{URL}{parsing}
\indexii{relative}{URL}
This module defines a standard interface to break Uniform Resource
Locator (URL) strings up into components (addressing scheme, network
location, path, etc.), to combine the components back into a URL
string, and to convert a ``relative URL'' to an absolute URL given a
``base URL.''
The module has been designed to match the Internet RFC on Relative
Uniform Resource Locators (and discovered a bug in an earlier
draft!). It supports the following URL schemes:
\code{file}, \code{ftp}, \code{gopher}, \code{hdl}, \code{http},
\code{https}, \code{imap}, \code{mailto}, \code{mms}, \code{news},
\code{nntp}, \code{prospero}, \code{rsync}, \code{rtsp}, \code{rtspu},
\code{sftp}, \code{shttp}, \code{sip}, \code{sips}, \code{snews}, \code{svn},
\code{svn+ssh}, \code{telnet}, \code{wais}.
\versionadded[Support for the \code{sftp} and \code{sips} schemes]{2.5}
The \module{urlparse} module defines the following functions:
\begin{funcdesc}{urlparse}{urlstring\optional{,
default_scheme\optional{, allow_fragments}}}
Parse a URL into six components, returning a 6-tuple. This
corresponds to the general structure of a URL:
\code{\var{scheme}://\var{netloc}/\var{path};\var{parameters}?\var{query}\#\var{fragment}}.
Each tuple item is a string, possibly empty.
The components are not broken up into smaller parts (for example, the network
location is a single string), and \% escapes are not expanded.
The delimiters as shown above are not part of the result,
except for a leading slash in the \var{path} component, which is
retained if present. For example:
\begin{verbatim}
>>> from urlparse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o
('http', 'www.cwi.nl:80', '/%7Eguido/Python.html', '', '', '')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
\end{verbatim}
If the \var{default_scheme} argument is specified, it gives the
default addressing scheme, to be used only if the URL does not
specify one. The default value for this argument is the empty string.
If the \var{allow_fragments} argument is false, fragment identifiers
are not allowed, even if the URL's addressing scheme normally does
support them. The default value for this argument is \constant{True}.
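As a minimal sketch of the two optional arguments (the URLs are
illustrative, both arguments are passed positionally, and the fallback
import is an assumption that lets the snippet also run on Python 3,
where these functions live in \module{urllib.parse}):
\begin{verbatim}
try:
    from urlparse import urlparse        # Python 2
except ImportError:
    from urllib.parse import urlparse    # assumption: Python 3 location

# The URL names no scheme, so the default scheme ('http') is filled in.
with_default = urlparse('//www.cwi.nl/%7Eguido/Python.html', 'http')
scheme_used = with_default[0]            # 'http'

# A scheme given in the URL itself takes precedence over the default.
explicit = urlparse('ftp://www.cwi.nl/pub/', 'http')
explicit_scheme = explicit[0]            # 'ftp'

# With allow_fragments false, '#intro' is not split off as a fragment;
# it stays attached to the preceding component (here, the path).
no_frag = urlparse('http://www.cwi.nl/doc#intro', '', False)
fragment = no_frag[5]                    # ''
path = no_frag[2]                        # '/doc#intro'
\end{verbatim}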
The return value is actually an instance of a subclass of
\pytype{tuple}. This class has the following additional read-only
convenience attributes:
\begin{tableiv}{l|c|l|c}{member}{Attribute}{Index}{Value}{Value if not present}
\lineiv{scheme} {0} {URL scheme specifier} {empty string}
\lineiv{netloc} {1} {Network location part} {empty string}
\lineiv{path} {2} {Hierarchical path} {empty string}
\lineiv{params} {3} {Parameters for last path element} {empty string}
\lineiv{query} {4} {Query component} {empty string}
\lineiv{fragment}{5} {Fragment identifier} {empty string}
\lineiv{username}{ } {User name} {\constant{None}}
\lineiv{password}{ } {Password} {\constant{None}}
\lineiv{hostname}{ } {Host name (lower case)} {\constant{None}}
\lineiv{port} { } {Port number as integer, if present} {\constant{None}}
\end{tableiv}
See section~\ref{urlparse-result-object}, ``Results of
\function{urlparse()} and \function{urlsplit()},'' for more
information on the result object.
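The attributes derived from the network location can be illustrated
with a short sketch. The URL and credentials are made up for the
example, and the fallback import is an assumption for running the
snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import urlparse        # Python 2
except ImportError:
    from urllib.parse import urlparse    # assumption: Python 3 location

parsed = urlparse('http://guido:secret@WWW.CWI.NL:8080/index.html')
netloc = parsed.netloc       # 'guido:secret@WWW.CWI.NL:8080', kept whole
login = (parsed.username, parsed.password)   # ('guido', 'secret')
host = parsed.hostname       # 'www.cwi.nl', converted to lower case
port = parsed.port           # 8080, as an integer

# Parts that are absent from the network location are reported as None.
bare = urlparse('http://www.cwi.nl/index.html')
missing = (bare.username, bare.password, bare.port)
\end{verbatim}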
\versionchanged[Added attributes to return value]{2.5}
\end{funcdesc}
\begin{funcdesc}{urlunparse}{parts}
Construct a URL from a tuple as returned by \function{urlparse()}.
The \var{parts} argument can be any six-item iterable.
This may result in a slightly different, but equivalent URL, if the
URL that was parsed originally had unnecessary delimiters (for example,
a ? with an empty query; the RFC states that these are equivalent).
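A brief sketch of both points, using an illustrative list of parts
and an illustrative URL with a trailing \code{?}; the fallback import
is an assumption for running the snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import urlparse, urlunparse        # Python 2
except ImportError:
    from urllib.parse import urlparse, urlunparse    # assumption: Python 3

# Any six-item iterable works; a list is used here.
parts = ['http', 'www.cwi.nl', '/%7Eguido/Python.html', '', 'q=1', 'top']
rebuilt = urlunparse(parts)
# rebuilt == 'http://www.cwi.nl/%7Eguido/Python.html?q=1#top'

# The empty query's '?' delimiter is not reproduced on the way back.
round_trip = urlunparse(urlparse('http://www.cwi.nl/doc?'))
# round_trip == 'http://www.cwi.nl/doc'
\end{verbatim}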
\end{funcdesc}
\begin{funcdesc}{urlsplit}{urlstring\optional{,
default_scheme\optional{, allow_fragments}}}
This is similar to \function{urlparse()}, but does not split the
params from the URL. This should generally be used instead of
\function{urlparse()} if the more recent URL syntax allowing
parameters to be applied to each segment of the \var{path} portion of
the URL (see \rfc{2396}) is wanted. A separate function is needed to
separate the path segments and parameters. This function returns a
5-tuple: (addressing scheme, network location, path, query, fragment
identifier).
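The difference from \function{urlparse()} can be seen in a small
sketch. The URL is illustrative and carries a \code{;type=a} parameter
on its last path segment; the fallback import is an assumption for
running the snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import urlparse, urlsplit        # Python 2
except ImportError:
    from urllib.parse import urlparse, urlsplit    # assumption: Python 3

url = 'http://www.cwi.nl/%7Eguido/Python.html;type=a?x=1#top'

# urlparse() separates the parameters of the last path segment.
six = urlparse(url)
parse_path, parse_params = six[2], six[3]
# ('/%7Eguido/Python.html', 'type=a')

# urlsplit() leaves them in the path and returns five items.
five = urlsplit(url)
split_path = five[2]         # '/%7Eguido/Python.html;type=a'
item_count = len(five)       # 5
\end{verbatim}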
The return value is actually an instance of a subclass of
\pytype{tuple}. This class has the following additional read-only
convenience attributes:
\begin{tableiv}{l|c|l|c}{member}{Attribute}{Index}{Value}{Value if not present}
\lineiv{scheme} {0} {URL scheme specifier} {empty string}
\lineiv{netloc} {1} {Network location part} {empty string}
\lineiv{path} {2} {Hierarchical path} {empty string}
\lineiv{query} {3} {Query component} {empty string}
\lineiv{fragment} {4} {Fragment identifier} {empty string}
\lineiv{username} { } {User name} {\constant{None}}
\lineiv{password} { } {Password} {\constant{None}}
\lineiv{hostname} { } {Host name (lower case)} {\constant{None}}
\lineiv{port} { } {Port number as integer, if present} {\constant{None}}
\end{tableiv}
See section~\ref{urlparse-result-object}, ``Results of
\function{urlparse()} and \function{urlsplit()},'' for more
information on the result object.
\versionadded{2.2}
\versionchanged[Added attributes to return value]{2.5}
\end{funcdesc}
\begin{funcdesc}{urlunsplit}{parts}
Combine the elements of a tuple as returned by \function{urlsplit()}
into a complete URL as a string.
The \var{parts} argument can be any five-item iterable.
This may result in a slightly different, but equivalent URL, if the
URL that was parsed originally had unnecessary delimiters (for example,
a ? with an empty query; the RFC states that these are equivalent).
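One common use is to split a URL, change a single component, and
reassemble it. The sketch below assumes an illustrative URL and
replaces only the query; the fallback import is an assumption for
running the snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import urlsplit, urlunsplit        # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit    # assumption: Python 3

# Copy the 5-tuple into a list so that one item can be replaced.
parts = list(urlsplit('http://www.python.org/doc/?page=1#top'))
parts[3] = 'page=2'          # index 3 is the query component

# A five-item list is accepted just like the original tuple.
updated = urlunsplit(parts)
# updated == 'http://www.python.org/doc/?page=2#top'
\end{verbatim}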
\versionadded{2.2}
\end{funcdesc}
\begin{funcdesc}{urljoin}{base, url\optional{, allow_fragments}}
Construct a full (``absolute'') URL by combining a ``base URL''
(\var{base}) with another URL (\var{url}). Informally, this
uses components of the base URL, in particular the addressing scheme,
the network location and (part of) the path, to provide missing
components in the relative URL. For example:
\begin{verbatim}
>>> from urlparse import urljoin
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
'http://www.cwi.nl/%7Eguido/FAQ.html'
\end{verbatim}
The \var{allow_fragments} argument has the same meaning and default as
for \function{urlparse()}.
\note{If \var{url} is an absolute URL (that is, starting with \code{//}
or \code{scheme://}), the \var{url}'s host name and/or scheme
will be present in the result. For example:}
\begin{verbatim}
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
... '//www.python.org/%7Eguido')
'http://www.python.org/%7Eguido'
\end{verbatim}
If you do not want that behavior, preprocess
the \var{url} with \function{urlsplit()} and \function{urlunsplit()},
removing possible \emph{scheme} and \emph{netloc} parts.
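The preprocessing described above might look like the following
sketch. The variable names are hypothetical, and the fallback import
is an assumption for running the snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import urljoin, urlsplit, urlunsplit      # Python 2
except ImportError:
    from urllib.parse import urljoin, urlsplit, urlunsplit  # assumption: Python 3

base = 'http://www.cwi.nl/%7Eguido/Python.html'
url = '//www.python.org/%7Eguido'

# Blank out the scheme and netloc; keep path, query and fragment.
parts = urlsplit(url)
relative = urlunsplit(('', '', parts[2], parts[3], parts[4]))
# relative == '/%7Eguido'

# The join now stays on the base URL's host.
joined = urljoin(base, relative)
# joined == 'http://www.cwi.nl/%7Eguido'
\end{verbatim}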
\end{funcdesc}
\begin{funcdesc}{urldefrag}{url}
If \var{url} contains a fragment identifier, returns a modified
version of \var{url} with no fragment identifier, and the fragment
identifier as a separate string. If there is no fragment identifier
in \var{url}, returns \var{url} unmodified and an empty string.
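Both cases are shown in the sketch below, which uses illustrative
URLs; the fallback import is an assumption for running the snippet on
Python 3:
\begin{verbatim}
try:
    from urlparse import urldefrag        # Python 2
except ImportError:
    from urllib.parse import urldefrag    # assumption: Python 3 location

# A fragment is present: it is split off and returned separately.
stripped, frag = urldefrag('http://www.python.org/doc/#intro')
# ('http://www.python.org/doc/', 'intro')

# No fragment: the URL comes back unchanged with an empty string.
same, empty = urldefrag('http://www.python.org/doc/')
# ('http://www.python.org/doc/', '')
\end{verbatim}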
\end{funcdesc}
\begin{seealso}
\seerfc{1738}{Uniform Resource Locators (URL)}{
This specifies the formal syntax and semantics of absolute
URLs.}
\seerfc{1808}{Relative Uniform Resource Locators}{
This Request For Comments includes the rules for joining an
absolute and a relative URL, including a fair number of
``Abnormal Examples'' which govern the treatment of border
cases.}
\seerfc{2396}{Uniform Resource Identifiers (URI): Generic Syntax}{
Document describing the generic syntactic requirements for
both Uniform Resource Names (URNs) and Uniform Resource
Locators (URLs).}
\end{seealso}
\subsection{Results of \function{urlparse()} and \function{urlsplit()}
\label{urlparse-result-object}}
The result objects from the \function{urlparse()} and
\function{urlsplit()} functions are subclasses of the \pytype{tuple}
type. These subclasses add the attributes described in those
functions, as well as provide an additional method:
\begin{methoddesc}[ParseResult]{geturl}{}
Return the re-combined version of the original URL as a string.
This may differ from the original URL in that the scheme will always
be normalized to lower case and empty components may be dropped.
Specifically, empty parameters, queries, and fragment identifiers
will be removed.
The result of this method is a fixpoint if passed back through the
original parsing function:
\begin{verbatim}
>>> import urlparse
>>> url = 'HTTP://www.Python.org/doc/#'
>>> r1 = urlparse.urlsplit(url)
>>> r1.geturl()
'http://www.Python.org/doc/'
>>> r2 = urlparse.urlsplit(r1.geturl())
>>> r2.geturl()
'http://www.Python.org/doc/'
\end{verbatim}
\versionadded{2.5}
\end{methoddesc}
The following classes provide the implementations of the parse results:
\begin{classdesc*}{BaseResult}
Base class for the concrete result classes. This provides most of
the attribute definitions. It does not provide a \method{geturl()}
method. It is derived from \class{tuple}, but does not override the
\method{__init__()} or \method{__new__()} methods.
\end{classdesc*}
\begin{classdesc}{ParseResult}{scheme, netloc, path, params, query, fragment}
Concrete class for \function{urlparse()} results. The
\method{__new__()} method is overridden to support checking that the
right number of arguments is passed.
\end{classdesc}
\begin{classdesc}{SplitResult}{scheme, netloc, path, query, fragment}
Concrete class for \function{urlsplit()} results. The
\method{__new__()} method is overridden to support checking that the
right number of arguments is passed.
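A result object can also be built by hand, as in the minimal sketch
below. The component values are illustrative, the variable names are
hypothetical, and the fallback import is an assumption for running
the snippet on Python 3:
\begin{verbatim}
try:
    from urlparse import SplitResult        # Python 2
except ImportError:
    from urllib.parse import SplitResult    # assumption: Python 3 location

# Five arguments, in the same order as the urlsplit() 5-tuple.
result = SplitResult('http', 'www.python.org', '/doc/', '', '')
rebuilt = result.geturl()    # 'http://www.python.org/doc/'
host = result.hostname       # 'www.python.org'

# Passing the wrong number of arguments is rejected.
try:
    SplitResult('http', 'www.python.org')
    wrong_count_rejected = False
except TypeError:
    wrong_count_rejected = True
\end{verbatim}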
\end{classdesc}