mirror of
https://github.com/python/cpython.git
synced 2025-11-13 07:26:31 +00:00
Move the urllib-inherited API to a distinguished section
parent
0358a17838
commit
b8eb9cbd71
1 changed files with 164 additions and 156 deletions
|
|
@ -104,52 +104,6 @@ The :mod:`urllib.request` module defines the following functions:
|
||||||
member variable to modify its position in the handlers list.
|
member variable to modify its position in the handlers list.
|
||||||
|
|
||||||
|
|
||||||
.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)
|
|
||||||
|
|
||||||
Copy a network object denoted by a URL to a local file, if necessary. If the URL
|
|
||||||
points to a local file, or a valid cached copy of the object exists, the object
|
|
||||||
is not copied. Return a tuple ``(filename, headers)`` where *filename* is the
|
|
||||||
local file name under which the object can be found, and *headers* is whatever
|
|
||||||
the :meth:`info` method of the object returned by :func:`urlopen` returned (for
|
|
||||||
a remote object, possibly cached). Exceptions are the same as for
|
|
||||||
:func:`urlopen`.
|
|
||||||
|
|
||||||
The second argument, if present, specifies the file location to copy to (if
|
|
||||||
absent, the location will be a tempfile with a generated name). The third
|
|
||||||
argument, if present, is a hook function that will be called once on
|
|
||||||
establishment of the network connection and once after each block read
|
|
||||||
thereafter. The hook will be passed three arguments: a count of blocks
|
|
||||||
transferred so far, a block size in bytes, and the total size of the file. The
|
|
||||||
third argument may be ``-1`` on older FTP servers which do not return a file
|
|
||||||
size in response to a retrieval request.
|
|
||||||
|
|
||||||
If the *url* uses the :file:`http:` scheme identifier, the optional *data*
|
|
||||||
argument may be given to specify a ``POST`` request (normally the request type
|
|
||||||
is ``GET``). The *data* argument must be in standard
|
|
||||||
:mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
|
|
||||||
function below.
|
|
||||||
|
|
||||||
:func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that
|
|
||||||
the amount of data available was less than the expected amount (which is the
|
|
||||||
size reported by a *Content-Length* header). This can occur, for example, when
|
|
||||||
the download is interrupted.
|
|
||||||
|
|
||||||
The *Content-Length* is treated as a lower bound: if there's more data to read,
|
|
||||||
urlretrieve reads more data, but if less data is available, it raises the
|
|
||||||
exception.
|
|
||||||
|
|
||||||
You can still retrieve the downloaded data in this case; it is stored in the
|
|
||||||
:attr:`content` attribute of the exception instance.
|
|
||||||
|
|
||||||
If no *Content-Length* header was supplied, urlretrieve cannot check the size
|
|
||||||
of the data it has downloaded, and just returns it. In this case you just have
|
|
||||||
to assume that the download was successful.
|
|
||||||
|
|
||||||
.. function:: urlcleanup()
|
|
||||||
|
|
||||||
Clear the cache that may have been built up by previous calls to
|
|
||||||
:func:`urlretrieve`.
|
|
||||||
|
|
||||||
.. function:: pathname2url(path)
|
.. function:: pathname2url(path)
|
||||||
|
|
||||||
Convert the pathname *path* from the local syntax for a path to the form used in
|
Convert the pathname *path* from the local syntax for a path to the form used in
|
||||||
|
|
@ -218,116 +172,6 @@ The following classes are provided:
|
||||||
fetching of the image, this should be true.
|
fetching of the image, this should be true.
|
||||||
|
|
||||||
|
|
||||||
.. class:: URLopener(proxies=None, **x509)
|
|
||||||
|
|
||||||
Base class for opening and reading URLs. Unless you need to support opening
|
|
||||||
objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
|
|
||||||
you probably want to use :class:`FancyURLopener`.
|
|
||||||
|
|
||||||
By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
|
|
||||||
of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
|
|
||||||
Applications can define their own :mailheader:`User-Agent` header by subclassing
|
|
||||||
:class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
|
|
||||||
:attr:`version` to an appropriate string value in the subclass definition.
|
|
||||||
|
|
||||||
The optional *proxies* parameter should be a dictionary mapping scheme names to
|
|
||||||
proxy URLs, where an empty dictionary turns proxies off completely. Its default
|
|
||||||
value is ``None``, in which case environmental proxy settings will be used if
|
|
||||||
present, as discussed in the definition of :func:`urlopen`, above.
|
|
||||||
|
|
||||||
Additional keyword parameters, collected in *x509*, may be used for
|
|
||||||
authentication of the client when using the :file:`https:` scheme. The keywords
|
|
||||||
*key_file* and *cert_file* are supported to provide an SSL key and certificate;
|
|
||||||
both are needed to support client authentication.
|
|
||||||
|
|
||||||
:class:`URLopener` objects will raise an :exc:`IOError` exception if the server
|
|
||||||
returns an error code.
|
|
||||||
|
|
||||||
.. method:: open(fullurl, data=None)
|
|
||||||
|
|
||||||
Open *fullurl* using the appropriate protocol. This method sets up cache and
|
|
||||||
proxy information, then calls the appropriate open method with its input
|
|
||||||
arguments. If the scheme is not recognized, :meth:`open_unknown` is called.
|
|
||||||
The *data* argument has the same meaning as the *data* argument of
|
|
||||||
:func:`urlopen`.
|
|
||||||
|
|
||||||
|
|
||||||
.. method:: open_unknown(fullurl, data=None)
|
|
||||||
|
|
||||||
Overridable interface to open unknown URL types.
|
|
||||||
|
|
||||||
|
|
||||||
.. method:: retrieve(url, filename=None, reporthook=None, data=None)
|
|
||||||
|
|
||||||
Retrieves the contents of *url* and places it in *filename*. The return value
|
|
||||||
is a tuple consisting of a local filename and either a
|
|
||||||
:class:`email.message.Message` object containing the response headers (for remote
|
|
||||||
URLs) or ``None`` (for local URLs). The caller must then open and read the
|
|
||||||
contents of *filename*. If *filename* is not given and the URL refers to a
|
|
||||||
local file, the input filename is returned. If the URL is non-local and
|
|
||||||
*filename* is not given, the filename is the output of :func:`tempfile.mktemp`
|
|
||||||
with a suffix that matches the suffix of the last path component of the input
|
|
||||||
URL. If *reporthook* is given, it must be a function accepting three numeric
|
|
||||||
parameters. It will be called after each chunk of data is read from the
|
|
||||||
network. *reporthook* is ignored for local URLs.
|
|
||||||
|
|
||||||
If the *url* uses the :file:`http:` scheme identifier, the optional *data*
|
|
||||||
argument may be given to specify a ``POST`` request (normally the request type
|
|
||||||
is ``GET``). The *data* argument must be in standard
|
|
||||||
:mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
|
|
||||||
function below.
|
|
||||||
|
|
||||||
|
|
||||||
.. attribute:: version
|
|
||||||
|
|
||||||
Variable that specifies the user agent of the opener object. To get
|
|
||||||
:mod:`urllib` to tell servers that it is a particular user agent, set this in a
|
|
||||||
subclass as a class variable or in the constructor before calling the base
|
|
||||||
constructor.
|
|
||||||
|
|
||||||
|
|
||||||
.. class:: FancyURLopener(...)
|
|
||||||
|
|
||||||
:class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
|
|
||||||
for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x
|
|
||||||
response codes listed above, the :mailheader:`Location` header is used to fetch
|
|
||||||
the actual URL. For 401 response codes (authentication required), basic HTTP
|
|
||||||
authentication is performed. For the 30x response codes, recursion is bounded
|
|
||||||
by the value of the *maxtries* attribute, which defaults to 10.
|
|
||||||
|
|
||||||
For all other response codes, the method :meth:`http_error_default` is called
|
|
||||||
which you can override in subclasses to handle the error appropriately.
|
|
||||||
|
|
||||||
.. note::
|
|
||||||
|
|
||||||
According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
|
|
||||||
must not be automatically redirected without confirmation by the user. In
|
|
||||||
reality, browsers do allow automatic redirection of these responses, changing
|
|
||||||
the POST to a GET, and :mod:`urllib` reproduces this behaviour.
|
|
||||||
|
|
||||||
The parameters to the constructor are the same as those for :class:`URLopener`.
|
|
||||||
|
|
||||||
.. note::
|
|
||||||
|
|
||||||
When performing basic authentication, a :class:`FancyURLopener` instance calls
|
|
||||||
its :meth:`prompt_user_passwd` method. The default implementation asks the
|
|
||||||
users for the required information on the controlling terminal. A subclass may
|
|
||||||
override this method to support more appropriate behavior if needed.
|
|
||||||
|
|
||||||
The :class:`FancyURLopener` class offers one additional method that should be
|
|
||||||
overridden to provide the appropriate behavior:
|
|
||||||
|
|
||||||
.. method:: prompt_user_passwd(host, realm)
|
|
||||||
|
|
||||||
Return information needed to authenticate the user at the given host in the
|
|
||||||
specified security realm. The return value should be a tuple, ``(user,
|
|
||||||
password)``, which can be used for basic authentication.
|
|
||||||
|
|
||||||
The implementation prompts for this information on the terminal; an application
|
|
||||||
should override this method to use an appropriate interaction model in the local
|
|
||||||
environment.
|
|
||||||
|
|
||||||
|
|
||||||
.. class:: OpenerDirector()
|
.. class:: OpenerDirector()
|
||||||
|
|
||||||
The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
|
The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
|
||||||
|
|
@ -1220,6 +1064,170 @@ The following example uses no proxies at all, overriding environment settings::
|
||||||
>>> f.read().decode('utf-8')
|
>>> f.read().decode('utf-8')
|
||||||
|
|
||||||
|
|
||||||
|
Legacy interface
|
||||||
|
----------------
|
||||||
|
|
||||||
|
The following functions and classes are ported from the Python 2 module
|
||||||
|
``urllib`` (as opposed to ``urllib2``). They might become deprecated at
|
||||||
|
some point in the future.
|
||||||
|
|
||||||
|
|
||||||
|
.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)
|
||||||
|
|
||||||
|
Copy a network object denoted by a URL to a local file, if necessary. If the URL
|
||||||
|
points to a local file, or a valid cached copy of the object exists, the object
|
||||||
|
is not copied. Return a tuple ``(filename, headers)`` where *filename* is the
|
||||||
|
local file name under which the object can be found, and *headers* is whatever
|
||||||
|
the :meth:`info` method of the object returned by :func:`urlopen` returned (for
|
||||||
|
a remote object, possibly cached). Exceptions are the same as for
|
||||||
|
:func:`urlopen`.
|
||||||
|
|
||||||
|
The second argument, if present, specifies the file location to copy to (if
|
||||||
|
absent, the location will be a tempfile with a generated name). The third
|
||||||
|
argument, if present, is a hook function that will be called once on
|
||||||
|
establishment of the network connection and once after each block read
|
||||||
|
thereafter. The hook will be passed three arguments: a count of blocks
|
||||||
|
transferred so far, a block size in bytes, and the total size of the file. The
|
||||||
|
third argument may be ``-1`` on older FTP servers which do not return a file
|
||||||
|
size in response to a retrieval request.
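
As a rough illustration of the hook protocol described above, the sketch below copies a local file through a ``file:`` URL and records every hook call. The helper name ``copy_with_progress`` and the temporary paths are invented for this example, and the block size reported to the hook is whatever the implementation happens to use.

```python
import pathlib
import tempfile
import urllib.request


def copy_with_progress(url, destination):
    """Retrieve *url* into *destination*, returning the list of hook calls."""
    calls = []

    def reporthook(block_count, block_size, total_size):
        # Called once before the first read, then once after each block.
        calls.append((block_count, block_size, total_size))

    urllib.request.urlretrieve(url, destination, reporthook)
    return calls


# Demo on a local file so that no network access is needed.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / "source.bin"
source.write_bytes(b"x" * 20000)
target = workdir / "copy.bin"

calls = copy_with_progress(source.as_uri(), str(target))
for block_count, block_size, total_size in calls:
    if total_size > 0:
        # The last block may be partial, so clamp to the total size.
        done = min(block_count * block_size, total_size)
        print(f"{done}/{total_size} bytes")
```

Because an explicit destination is passed, the local file is actually copied and the hook fires. A total size of ``-1`` would simply skip the progress line in this sketch.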
|
||||||
|
|
||||||
|
If the *url* uses the :file:`http:` scheme identifier, the optional *data*
|
||||||
|
argument may be given to specify a ``POST`` request (normally the request type
|
||||||
|
is ``GET``). The *data* argument must be in standard
|
||||||
|
:mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
|
||||||
|
function below.
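
A minimal sketch of preparing such a *data* value follows. The form field names are hypothetical, and the final ``encode`` step assumes Python 3, where the request body is expected to be bytes.

```python
import urllib.parse

# Hypothetical form fields, used only for illustration.
fields = {"name": "Guido van Rossum", "lang": "python"}

# urlencode() produces the application/x-www-form-urlencoded text form;
# Python 3 expects the request body as bytes, so encode it afterwards.
data = urllib.parse.urlencode(fields).encode("ascii")
print(data)
```

The resulting ``data`` value could then be passed as the *data* argument when retrieving an ``http:`` URL.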
|
||||||
|
|
||||||
|
:func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that
|
||||||
|
the amount of data available was less than the expected amount (which is the
|
||||||
|
size reported by a *Content-Length* header). This can occur, for example, when
|
||||||
|
the download is interrupted.
|
||||||
|
|
||||||
|
The *Content-Length* is treated as a lower bound: if there's more data to read,
|
||||||
|
urlretrieve reads more data, but if less data is available, it raises the
|
||||||
|
exception.
|
||||||
|
|
||||||
|
You can still retrieve the downloaded data in this case; it is stored in the
|
||||||
|
:attr:`content` attribute of the exception instance.
|
||||||
|
|
||||||
|
If no *Content-Length* header was supplied, urlretrieve cannot check the size
|
||||||
|
of the data it has downloaded, and just returns it. In this case you just have
|
||||||
|
to assume that the download was successful.
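
One possible way to handle a short download is sketched below. The helper ``retrieve_or_partial`` and the stand-in ``_always_short`` are hypothetical names introduced only for this example. The *retrieve* parameter exists solely so the sketch can be run without a network, and the exact shape of the ``content`` attribute is left to the implementation.

```python
import urllib.error
import urllib.request


def retrieve_or_partial(url, filename=None, retrieve=urllib.request.urlretrieve):
    """Return (complete, payload) for a download attempt.

    *retrieve* is injectable only so the sketch can be exercised offline.
    """
    try:
        return True, retrieve(url, filename)
    except urllib.error.ContentTooShortError as exc:
        # Fewer bytes arrived than Content-Length promised; whatever the
        # implementation kept is available on the exception instance.
        return False, exc.content


def _always_short(url, filename=None):
    # Stand-in that simulates an interrupted transfer.
    raise urllib.error.ContentTooShortError(
        "retrieval incomplete", ("partial.tmp", None))


complete, payload = retrieve_or_partial(
    "http://example.invalid/file", retrieve=_always_short)
print(complete, payload)
```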
|
||||||
|
|
||||||
|
.. function:: urlcleanup()
|
||||||
|
|
||||||
|
Clear the cache that may have been built up by previous calls to
|
||||||
|
:func:`urlretrieve`.
|
||||||
|
|
||||||
|
.. class:: URLopener(proxies=None, **x509)
|
||||||
|
|
||||||
|
Base class for opening and reading URLs. Unless you need to support opening
|
||||||
|
objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
|
||||||
|
you probably want to use :class:`FancyURLopener`.
|
||||||
|
|
||||||
|
By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
|
||||||
|
of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
|
||||||
|
Applications can define their own :mailheader:`User-Agent` header by subclassing
|
||||||
|
:class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
|
||||||
|
:attr:`version` to an appropriate string value in the subclass definition.
|
||||||
|
|
||||||
|
The optional *proxies* parameter should be a dictionary mapping scheme names to
|
||||||
|
proxy URLs, where an empty dictionary turns proxies off completely. Its default
|
||||||
|
value is ``None``, in which case environmental proxy settings will be used if
|
||||||
|
present, as discussed in the definition of :func:`urlopen`, above.
|
||||||
|
|
||||||
|
Additional keyword parameters, collected in *x509*, may be used for
|
||||||
|
authentication of the client when using the :file:`https:` scheme. The keywords
|
||||||
|
*key_file* and *cert_file* are supported to provide an SSL key and certificate;
|
||||||
|
both are needed to support client authentication.
|
||||||
|
|
||||||
|
:class:`URLopener` objects will raise an :exc:`IOError` exception if the server
|
||||||
|
returns an error code.
|
||||||
|
|
||||||
|
.. method:: open(fullurl, data=None)
|
||||||
|
|
||||||
|
Open *fullurl* using the appropriate protocol. This method sets up cache and
|
||||||
|
proxy information, then calls the appropriate open method with its input
|
||||||
|
arguments. If the scheme is not recognized, :meth:`open_unknown` is called.
|
||||||
|
The *data* argument has the same meaning as the *data* argument of
|
||||||
|
:func:`urlopen`.
|
||||||
|
|
||||||
|
|
||||||
|
.. method:: open_unknown(fullurl, data=None)
|
||||||
|
|
||||||
|
Overridable interface to open unknown URL types.
|
||||||
|
|
||||||
|
|
||||||
|
.. method:: retrieve(url, filename=None, reporthook=None, data=None)
|
||||||
|
|
||||||
|
Retrieves the contents of *url* and places it in *filename*. The return value
|
||||||
|
is a tuple consisting of a local filename and either a
|
||||||
|
:class:`email.message.Message` object containing the response headers (for remote
|
||||||
|
URLs) or ``None`` (for local URLs). The caller must then open and read the
|
||||||
|
contents of *filename*. If *filename* is not given and the URL refers to a
|
||||||
|
local file, the input filename is returned. If the URL is non-local and
|
||||||
|
*filename* is not given, the filename is the output of :func:`tempfile.mktemp`
|
||||||
|
with a suffix that matches the suffix of the last path component of the input
|
||||||
|
URL. If *reporthook* is given, it must be a function accepting three numeric
|
||||||
|
parameters. It will be called after each chunk of data is read from the
|
||||||
|
network. *reporthook* is ignored for local URLs.
|
||||||
|
|
||||||
|
If the *url* uses the :file:`http:` scheme identifier, the optional *data*
|
||||||
|
argument may be given to specify a ``POST`` request (normally the request type
|
||||||
|
is ``GET``). The *data* argument must be in standard
|
||||||
|
:mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
|
||||||
|
function below.
|
||||||
|
|
||||||
|
|
||||||
|
.. attribute:: version
|
||||||
|
|
||||||
|
Variable that specifies the user agent of the opener object. To get
|
||||||
|
:mod:`urllib` to tell servers that it is a particular user agent, set this in a
|
||||||
|
subclass as a class variable or in the constructor before calling the base
|
||||||
|
constructor.
|
||||||
|
|
||||||
|
|
||||||
|
.. class:: FancyURLopener(...)
|
||||||
|
|
||||||
|
:class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
|
||||||
|
for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x
|
||||||
|
response codes listed above, the :mailheader:`Location` header is used to fetch
|
||||||
|
the actual URL. For 401 response codes (authentication required), basic HTTP
|
||||||
|
authentication is performed. For the 30x response codes, recursion is bounded
|
||||||
|
by the value of the *maxtries* attribute, which defaults to 10.
|
||||||
|
|
||||||
|
For all other response codes, the method :meth:`http_error_default` is called
|
||||||
|
which you can override in subclasses to handle the error appropriately.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
|
||||||
|
must not be automatically redirected without confirmation by the user. In
|
||||||
|
reality, browsers do allow automatic redirection of these responses, changing
|
||||||
|
the POST to a GET, and :mod:`urllib` reproduces this behaviour.
|
||||||
|
|
||||||
|
The parameters to the constructor are the same as those for :class:`URLopener`.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
When performing basic authentication, a :class:`FancyURLopener` instance calls
|
||||||
|
its :meth:`prompt_user_passwd` method. The default implementation asks the
|
||||||
|
users for the required information on the controlling terminal. A subclass may
|
||||||
|
override this method to support more appropriate behavior if needed.
|
||||||
|
|
||||||
|
The :class:`FancyURLopener` class offers one additional method that should be
|
||||||
|
overridden to provide the appropriate behavior:
|
||||||
|
|
||||||
|
.. method:: prompt_user_passwd(host, realm)
|
||||||
|
|
||||||
|
Return information needed to authenticate the user at the given host in the
|
||||||
|
specified security realm. The return value should be a tuple, ``(user,
|
||||||
|
password)``, which can be used for basic authentication.
|
||||||
|
|
||||||
|
The implementation prompts for this information on the terminal; an application
|
||||||
|
should override this method to use an appropriate interaction model in the local
|
||||||
|
environment.
|
||||||
|
|
||||||
|
|
||||||
:mod:`urllib.request` Restrictions
|
:mod:`urllib.request` Restrictions
|
||||||
----------------------------------
|
----------------------------------
|
||||||
|
|
||||||
|
|
|
||||||