mirror of
https://github.com/django/django.git
synced 2025-07-24 05:36:15 +00:00
Fixed #19508 -- Implemented uri_to_iri as per RFC.
Thanks Loic Bistuer for helping in shaping the patch and Claude Paroz for the review.
parent 3af5af1a61
commit 10b17a22be
9 changed files with 189 additions and 42 deletions
@@ -173,11 +173,11 @@ URL from an IRI_ -- very loosely speaking, a URI_ that can contain Unicode
 characters. Quoting and converting an IRI to URI can be a little tricky, so
 Django provides some assistance.
 
-* The function ``django.utils.encoding.iri_to_uri()`` implements the
-  conversion from IRI to URI as required by the specification (:rfc:`3987`).
+* The function :func:`django.utils.encoding.iri_to_uri()` implements the
+  conversion from IRI to URI as required by the specification (:rfc:`3987#section-3.1`).
 
-* The functions ``django.utils.http.urlquote()`` and
-  ``django.utils.http.urlquote_plus()`` are versions of Python's standard
+* The functions :func:`django.utils.http.urlquote()` and
+  :func:`django.utils.http.urlquote_plus()` are versions of Python's standard
   ``urllib.quote()`` and ``urllib.quote_plus()`` that work with non-ASCII
   characters. (The data is converted to UTF-8 prior to encoding.)
 
@@ -213,12 +213,29 @@ you can construct your IRI without worrying about whether it contains
 non-ASCII characters and then, right at the end, call ``iri_to_uri()`` on the
 result.
 
-The ``iri_to_uri()`` function is also idempotent, which means the following is
-always true::
+Similarly, Django provides :func:`django.utils.encoding.uri_to_iri()` which
+implements the conversion from URI to IRI as per :rfc:`3987#section-3.2`.
+It decodes all percent-encodings except those that don't represent a valid
+UTF-8 sequence.
+
+An example to demonstrate::
+
+    >>> uri_to_iri('/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93')
+    '/♥♥/?utf8=✓'
+    >>> uri_to_iri('%A9helloworld')
+    '%A9helloworld'
+
+In the first example, the UTF-8 characters and reserved characters are
+unquoted. In the second, the percent-encoding remains unchanged because it
+lies outside the valid UTF-8 range.
+
+Both ``iri_to_uri()`` and ``uri_to_iri()`` functions are idempotent, which means the
+following is always true::
 
     iri_to_uri(iri_to_uri(some_string)) = iri_to_uri(some_string)
+    uri_to_iri(uri_to_iri(some_string)) = uri_to_iri(some_string)
 
-So you can safely call it multiple times on the same IRI without risking
+So you can safely call it multiple times on the same URI/IRI without risking
 double-quoting problems.
 
 .. _URI: http://www.ietf.org/rfc/rfc2396.txt
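The decoding rule described in the hunk above (unquote the percent-encodings, but leave alone any that do not form a valid UTF-8 sequence) can be approximated with the standard library alone. What follows is a minimal sketch under stated assumptions, not the code added by this commit: ``sketch_uri_to_iri`` is a hypothetical name, and the sketch assumes Python 3 and a ``str`` input.

```python
from urllib.parse import unquote_to_bytes


def sketch_uri_to_iri(uri: str) -> str:
    """Hypothetical helper: unquote a URI, keeping invalid UTF-8 escaped."""
    # Turn every %XX escape into its raw byte value.
    raw = unquote_to_bytes(uri)
    parts = []
    while raw:
        try:
            # Everything left is valid UTF-8: decode it and stop.
            parts.append(raw.decode("utf-8"))
            break
        except UnicodeDecodeError as exc:
            # Keep the valid prefix, then re-escape only the offending bytes.
            parts.append(raw[:exc.start].decode("utf-8"))
            bad = raw[exc.start:exc.end]
            parts.append("".join("%%%02X" % byte for byte in bad))
            raw = raw[exc.end:]
    return "".join(parts)


print(sketch_uri_to_iri("/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93"))  # /♥♥/?utf8=✓
print(sketch_uri_to_iri("%A9helloworld"))  # %A9helloworld
```

The sketch reproduces both documented examples, and a second pass over either result leaves it unchanged. It is deliberately naive, though: it does not re-escape a literal percent sign, so an input such as ``%2525`` becomes ``%25`` on the first pass and ``%`` on the second.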
@@ -271,7 +271,20 @@ The functions defined in this module share the following properties:
     since we are assuming input is either UTF-8 or unicode already, we can
     simplify things a little from the full method.
 
-    Returns an ASCII string containing the encoded result.
+    Takes an IRI in UTF-8 bytes and returns ASCII bytes containing the encoded
+    result.
+
+.. function:: uri_to_iri(uri)
+
+    .. versionadded:: 1.8
+
+    Converts a Uniform Resource Identifier into an Internationalized Resource
+    Identifier.
+
+    This is an algorithm from section 3.2 of :rfc:`3987#section-3.2`.
+
+    Takes a URI in ASCII bytes and returns a unicode string containing the
+    encoded result.
 
 .. function:: filepath_to_uri(path)
 
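For the opposite direction, the IRI-to-URI step referenced in the hunk above can be sketched with ``urllib.parse.quote``. This is an illustration under assumptions rather than the patched function: ``sketch_iri_to_uri`` and ``ASSUMED_SAFE`` are hypothetical names, the set of characters left unescaped is an assumption, and the sketch takes and returns ``str`` where the documented function speaks of bytes.

```python
from urllib.parse import quote

# Assumption: characters that are passed through unescaped. "%" is included
# so that existing percent-encodings are not quoted a second time.
ASSUMED_SAFE = "/#%[]=:;$&()+,!?*@'~"


def sketch_iri_to_uri(iri: str) -> str:
    """Hypothetical helper: percent-encode the non-ASCII parts of an IRI."""
    # quote() encodes the string as UTF-8 before escaping each byte.
    return quote(iri, safe=ASSUMED_SAFE)


uri = sketch_iri_to_uri("/♥♥/?utf8=✓")
print(uri)  # /%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93
print(sketch_iri_to_uri(uri) == uri)  # True
```

Keeping ``%`` in the safe set is what makes the second call a no-op in this sketch, which mirrors the idempotence property stated in the documentation.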