Issue #25576: Remove application/x-www-form-urlencoded charset advice

No charset parameter is standardized for this Content-Type value. Also
clarify that urlencode() outputs ASCII.
Martin Panter 2015-11-24 22:33:18 +00:00
parent ed92910852
commit cda85a0d1c
3 changed files with 16 additions and 30 deletions

@@ -36,13 +36,8 @@ The :mod:`urllib.request` module defines the following functions:
*data* should be a buffer in the standard
:mimetype:`application/x-www-form-urlencoded` format. The
:func:`urllib.parse.urlencode` function takes a mapping or sequence of
2-tuples and returns a string in this format. It should be encoded to bytes
before being used as the *data* parameter. The charset parameter in
``Content-Type`` header may be used to specify the encoding. If charset
parameter is not sent with the Content-Type header, the server following the
HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
encoding. It is advisable to use charset parameter with encoding used in
``Content-Type`` header with the :class:`Request`.
2-tuples and returns an ASCII text string in this format. It should
be encoded to bytes before being used as the *data* parameter.
urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
in its HTTP requests.
@@ -179,16 +174,9 @@ The following classes are provided:
the only ones that use *data*; the HTTP request will be a POST instead of a
GET when the *data* parameter is provided. *data* should be a buffer in the
standard :mimetype:`application/x-www-form-urlencoded` format.
The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
2-tuples and returns a string in this format. It should be encoded to bytes
before being used as the *data* parameter. The charset parameter in
``Content-Type`` header may be used to specify the encoding. If charset
parameter is not sent with the Content-Type header, the server following the
HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
encoding. It is advisable to use charset parameter with encoding used in
``Content-Type`` header with the :class:`Request`.
2-tuples and returns an ASCII string in this format. It should be
encoded to bytes before being used as the *data* parameter.
*headers* should be a dictionary, and will be treated as if
:meth:`add_header` was called with each key and value as arguments.
@@ -201,7 +189,7 @@ The following classes are provided:
``"Python-urllib/2.6"`` (on Python 2.6).
An example of using ``Content-Type`` header with *data* argument would be
sending a dictionary like ``{"Content-Type":" application/x-www-form-urlencoded;charset=utf-8"}``.
sending a dictionary like ``{"Content-Type": "application/x-www-form-urlencoded"}``.
The final two arguments are only of interest for correct handling
of third-party HTTP cookies:
@@ -1169,7 +1157,7 @@ every :class:`Request`. To change this::
opener.open('http://www.example.com/')
Also, remember that a few standard headers (:mailheader:`Content-Length`,
:mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
:mailheader:`Content-Type` and :mailheader:`Host`)
are added when the :class:`Request` is passed to :func:`urlopen` (or
:meth:`OpenerDirector.open`).
@@ -1192,11 +1180,8 @@ from urlencode is encoded to bytes before it is sent to urlopen as data::
>>> import urllib.request
>>> import urllib.parse
>>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> data = data.encode('utf-8')
>>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
>>> # adding charset parameter to the Content-Type header.
>>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
>>> with urllib.request.urlopen(request, data) as f:
>>> data = data.encode('ascii')
>>> with urllib.request.urlopen("http://requestb.in/xrbl82xr", data) as f:
... print(f.read().decode('utf-8'))
...
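
The commit message's claim that urlencode() outputs ASCII can be checked offline. The sketch below is illustrative and not part of the commit: the field values and the example.com URL are assumptions chosen for the demonstration, and no request is actually sent. It relies on urlencode() percent-encoding non-ASCII values (UTF-8 by default), which is why encoding the result with 'ascii' should not fail.

```python
import urllib.parse
import urllib.request

# Hypothetical form fields; the non-ASCII character is percent-encoded
# by urlencode() (as UTF-8 by default), so the result contains only ASCII.
fields = {"name": "Zoë", "eggs": 2}
query = urllib.parse.urlencode(fields)
print(query)

# Because the string is pure ASCII, encoding it as ASCII is safe.
data = query.encode("ascii")

# Hypothetical URL; constructing a Request does not touch the network.
request = urllib.request.Request("http://www.example.com/submit", data=data)

# Supplying data switches the request method from GET to POST.
print(request.get_method())
```

Since the body is ASCII regardless of the input text, the sketch sets no charset parameter on Content-Type, in line with the change made by this commit.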