mirror of
https://github.com/python/cpython.git
synced 2025-07-07 19:35:27 +00:00
GH-123599: url2pathname(): don't call gethostbyname() by default (#132610)
Follow-up to 66cdb2bd8a.
Add *resolve_host* keyword-only argument to `url2pathname()`, defaulting to
false. When set to true, we call `socket.gethostbyname()` to resolve the
URL hostname.
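The decision this change describes can be sketched roughly as follows. This is an illustrative restatement based on the diff below, not the CPython source itself: the function name `is_local_authority` and its `resolve` parameter are stand-ins chosen for the example, and the set of local addresses is taken from `FileHandler().get_names()`, as the diff does.

```python
import socket
from urllib.request import FileHandler


def is_local_authority(authority, resolve=False):
    """Hypothetical sketch of the local-authority check (not the real helper)."""
    # An empty authority or 'localhost' is always treated as local.
    if not authority or authority == 'localhost':
        return True
    # Comparing against the local hostname needs no DNS lookup.
    if authority == socket.gethostname():
        return True
    # Without an explicit opt-in, stop here: gethostbyname() is never called.
    if not resolve:
        return False
    # Opt-in path: resolve the name and compare with known local addresses.
    try:
        address = socket.gethostbyname(authority)
    except (socket.gaierror, AttributeError, UnicodeEncodeError):
        return False
    return address in FileHandler().get_names()
```

With `resolve` left false, a name such as `server` is reported as non-local without any name resolution; per the diff, `FileHandler.open_local_file()` is the caller that opts in by passing `resolve_host=True`.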
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
parent 082dbf7788
commit 8e08ac9f32
7 changed files with 48 additions and 30 deletions
|
@ -872,10 +872,10 @@ conforming to :rfc:`8089`.
|
||||||
.. versionadded:: 3.13
|
.. versionadded:: 3.13
|
||||||
|
|
||||||
.. versionchanged:: next
|
.. versionchanged:: next
|
||||||
If a URL authority (e.g. a hostname) is present and resolves to a local
|
The URL authority is discarded if it matches the local hostname.
|
||||||
address, it is discarded. If an authority is present and *doesn't*
|
Otherwise, if the authority isn't empty or ``localhost``, then on
|
||||||
resolve to a local address, then on Windows a UNC path is returned (as
|
Windows a UNC path is returned (as before), and on other platforms a
|
||||||
before), and on other platforms a :exc:`ValueError` is raised.
|
:exc:`ValueError` is raised.
|
||||||
|
|
||||||
|
|
||||||
.. method:: Path.as_uri()
|
.. method:: Path.as_uri()
|
||||||
|
|
|
@ -172,10 +172,10 @@ The :mod:`urllib.request` module defines the following functions:
|
||||||
the URL ``///etc/hosts``.
|
the URL ``///etc/hosts``.
|
||||||
|
|
||||||
.. versionchanged:: next
|
.. versionchanged:: next
|
||||||
The *add_scheme* argument was added.
|
The *add_scheme* parameter was added.
|
||||||
|
|
||||||
|
|
||||||
.. function:: url2pathname(url, *, require_scheme=False)
|
.. function:: url2pathname(url, *, require_scheme=False, resolve_host=False)
|
||||||
|
|
||||||
Convert the given ``file:`` URL to a local path. This function uses
|
Convert the given ``file:`` URL to a local path. This function uses
|
||||||
:func:`~urllib.parse.unquote` to decode the URL.
|
:func:`~urllib.parse.unquote` to decode the URL.
|
||||||
|
@ -185,6 +185,13 @@ The :mod:`urllib.request` module defines the following functions:
|
||||||
value should include the prefix; a :exc:`~urllib.error.URLError` is raised
|
value should include the prefix; a :exc:`~urllib.error.URLError` is raised
|
||||||
if it doesn't.
|
if it doesn't.
|
||||||
|
|
||||||
|
The URL authority is discarded if it is empty, ``localhost``, or the local
|
||||||
|
hostname. Otherwise, if *resolve_host* is set to true, the authority is
|
||||||
|
resolved using :func:`socket.gethostbyname` and discarded if it matches a
|
||||||
|
local IP address (as per :rfc:`RFC 8089 §3 <8089#section-3>`). If the
|
||||||
|
authority is still unhandled, then on Windows a UNC path is returned, and
|
||||||
|
on other platforms a :exc:`~urllib.error.URLError` is raised.
|
||||||
|
|
||||||
This example shows the function being used on Windows::
|
This example shows the function being used on Windows::
|
||||||
|
|
||||||
>>> from urllib.request import url2pathname
|
>>> from urllib.request import url2pathname
|
||||||
|
@ -198,14 +205,13 @@ The :mod:`urllib.request` module defines the following functions:
|
||||||
:exc:`OSError` exception to be raised on Windows.
|
:exc:`OSError` exception to be raised on Windows.
|
||||||
|
|
||||||
.. versionchanged:: next
|
.. versionchanged:: next
|
||||||
This function calls :func:`socket.gethostbyname` if the URL authority
|
The URL authority is discarded if it matches the local hostname.
|
||||||
isn't empty, ``localhost``, or the machine hostname. If the authority
|
Otherwise, if the authority isn't empty or ``localhost``, then on
|
||||||
resolves to a local IP address then it is discarded; otherwise, on
|
|
||||||
Windows a UNC path is returned (as before), and on other platforms a
|
Windows a UNC path is returned (as before), and on other platforms a
|
||||||
:exc:`~urllib.error.URLError` is raised.
|
:exc:`~urllib.error.URLError` is raised.
|
||||||
|
|
||||||
.. versionchanged:: next
|
.. versionchanged:: next
|
||||||
The *require_scheme* argument was added.
|
The *require_scheme* and *resolve_host* parameters were added.
|
||||||
|
|
||||||
|
|
||||||
.. function:: getproxies()
|
.. function:: getproxies()
|
||||||
|
|
|
@ -1703,9 +1703,11 @@ urllib
|
||||||
|
|
||||||
- Accept a complete URL when the new *require_scheme* argument is set to
|
- Accept a complete URL when the new *require_scheme* argument is set to
|
||||||
true.
|
true.
|
||||||
- Discard URL authorities that resolve to a local IP address.
|
- Discard URL authority if it matches the local hostname.
|
||||||
- Raise :exc:`~urllib.error.URLError` if a URL authority doesn't resolve
|
- Discard URL authority if it resolves to a local IP address when the new
|
||||||
to a local IP address, except on Windows where we return a UNC path.
|
*resolve_host* argument is set to true.
|
||||||
|
- Raise :exc:`~urllib.error.URLError` if a URL authority isn't local,
|
||||||
|
except on Windows where we return a UNC path as before.
|
||||||
|
|
||||||
In :func:`urllib.request.pathname2url`:
|
In :func:`urllib.request.pathname2url`:
|
||||||
|
|
||||||
|
|
|
@ -3290,7 +3290,6 @@ class PathTest(PurePathTest):
|
||||||
self.assertEqual(P.from_uri('file:////foo/bar'), P('//foo/bar'))
|
self.assertEqual(P.from_uri('file:////foo/bar'), P('//foo/bar'))
|
||||||
self.assertEqual(P.from_uri('file://localhost/foo/bar'), P('/foo/bar'))
|
self.assertEqual(P.from_uri('file://localhost/foo/bar'), P('/foo/bar'))
|
||||||
if not is_wasi:
|
if not is_wasi:
|
||||||
self.assertEqual(P.from_uri('file://127.0.0.1/foo/bar'), P('/foo/bar'))
|
|
||||||
self.assertEqual(P.from_uri(f'file://{socket.gethostname()}/foo/bar'),
|
self.assertEqual(P.from_uri(f'file://{socket.gethostname()}/foo/bar'),
|
||||||
P('/foo/bar'))
|
P('/foo/bar'))
|
||||||
self.assertRaises(ValueError, P.from_uri, 'foo/bar')
|
self.assertRaises(ValueError, P.from_uri, 'foo/bar')
|
||||||
|
|
|
@ -1551,7 +1551,8 @@ class Pathname_Tests(unittest.TestCase):
|
||||||
urllib.request.url2pathname(url, require_scheme=True),
|
urllib.request.url2pathname(url, require_scheme=True),
|
||||||
expected_path)
|
expected_path)
|
||||||
|
|
||||||
error_subtests = [
|
def test_url2pathname_require_scheme_errors(self):
|
||||||
|
subtests = [
|
||||||
'',
|
'',
|
||||||
':',
|
':',
|
||||||
'foo',
|
'foo',
|
||||||
|
@ -1561,13 +1562,20 @@ class Pathname_Tests(unittest.TestCase):
|
||||||
'data:file:foo',
|
'data:file:foo',
|
||||||
'data:file://foo',
|
'data:file://foo',
|
||||||
]
|
]
|
||||||
for url in error_subtests:
|
for url in subtests:
|
||||||
with self.subTest(url=url):
|
with self.subTest(url=url):
|
||||||
self.assertRaises(
|
self.assertRaises(
|
||||||
urllib.error.URLError,
|
urllib.error.URLError,
|
||||||
urllib.request.url2pathname,
|
urllib.request.url2pathname,
|
||||||
url, require_scheme=True)
|
url, require_scheme=True)
|
||||||
|
|
||||||
|
def test_url2pathname_resolve_host(self):
|
||||||
|
fn = urllib.request.url2pathname
|
||||||
|
sep = os.path.sep
|
||||||
|
self.assertEqual(fn('//127.0.0.1/foo/bar', resolve_host=True), f'{sep}foo{sep}bar')
|
||||||
|
self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar'), f'{sep}foo{sep}bar')
|
||||||
|
self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar', resolve_host=True), f'{sep}foo{sep}bar')
|
||||||
|
|
||||||
@unittest.skipUnless(sys.platform == 'win32',
|
@unittest.skipUnless(sys.platform == 'win32',
|
||||||
'test specific to Windows pathnames.')
|
'test specific to Windows pathnames.')
|
||||||
def test_url2pathname_win(self):
|
def test_url2pathname_win(self):
|
||||||
|
@ -1598,6 +1606,7 @@ class Pathname_Tests(unittest.TestCase):
|
||||||
self.assertEqual(fn('//server/path/to/file'), '\\\\server\\path\\to\\file')
|
self.assertEqual(fn('//server/path/to/file'), '\\\\server\\path\\to\\file')
|
||||||
self.assertEqual(fn('////server/path/to/file'), '\\\\server\\path\\to\\file')
|
self.assertEqual(fn('////server/path/to/file'), '\\\\server\\path\\to\\file')
|
||||||
self.assertEqual(fn('/////server/path/to/file'), '\\\\server\\path\\to\\file')
|
self.assertEqual(fn('/////server/path/to/file'), '\\\\server\\path\\to\\file')
|
||||||
|
self.assertEqual(fn('//127.0.0.1/path/to/file'), '\\\\127.0.0.1\\path\\to\\file')
|
||||||
# Localhost paths
|
# Localhost paths
|
||||||
self.assertEqual(fn('//localhost/C:/path/to/file'), 'C:\\path\\to\\file')
|
self.assertEqual(fn('//localhost/C:/path/to/file'), 'C:\\path\\to\\file')
|
||||||
self.assertEqual(fn('//localhost/C|/path/to/file'), 'C:\\path\\to\\file')
|
self.assertEqual(fn('//localhost/C|/path/to/file'), 'C:\\path\\to\\file')
|
||||||
|
@ -1622,8 +1631,7 @@ class Pathname_Tests(unittest.TestCase):
|
||||||
self.assertRaises(urllib.error.URLError, fn, '//:80/foo/bar')
|
self.assertRaises(urllib.error.URLError, fn, '//:80/foo/bar')
|
||||||
self.assertRaises(urllib.error.URLError, fn, '//:/foo/bar')
|
self.assertRaises(urllib.error.URLError, fn, '//:/foo/bar')
|
||||||
self.assertRaises(urllib.error.URLError, fn, '//c:80/foo/bar')
|
self.assertRaises(urllib.error.URLError, fn, '//c:80/foo/bar')
|
||||||
self.assertEqual(fn('//127.0.0.1/foo/bar'), '/foo/bar')
|
self.assertRaises(urllib.error.URLError, fn, '//127.0.0.1/foo/bar')
|
||||||
self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar'), '/foo/bar')
|
|
||||||
|
|
||||||
@unittest.skipUnless(os_helper.FS_NONASCII, 'need os_helper.FS_NONASCII')
|
@unittest.skipUnless(os_helper.FS_NONASCII, 'need os_helper.FS_NONASCII')
|
||||||
def test_url2pathname_nonascii(self):
|
def test_url2pathname_nonascii(self):
|
||||||
|
|
|
@ -1466,7 +1466,7 @@ class FileHandler(BaseHandler):
|
||||||
def open_local_file(self, req):
|
def open_local_file(self, req):
|
||||||
import email.utils
|
import email.utils
|
||||||
import mimetypes
|
import mimetypes
|
||||||
localfile = url2pathname(req.full_url, require_scheme=True)
|
localfile = url2pathname(req.full_url, require_scheme=True, resolve_host=True)
|
||||||
try:
|
try:
|
||||||
stats = os.stat(localfile)
|
stats = os.stat(localfile)
|
||||||
size = stats.st_size
|
size = stats.st_size
|
||||||
|
@ -1482,7 +1482,7 @@ class FileHandler(BaseHandler):
|
||||||
|
|
||||||
file_open = open_local_file
|
file_open = open_local_file
|
||||||
|
|
||||||
def _is_local_authority(authority):
|
def _is_local_authority(authority, resolve):
|
||||||
# Compare hostnames
|
# Compare hostnames
|
||||||
if not authority or authority == 'localhost':
|
if not authority or authority == 'localhost':
|
||||||
return True
|
return True
|
||||||
|
@ -1494,9 +1494,11 @@ def _is_local_authority(authority):
|
||||||
if authority == hostname:
|
if authority == hostname:
|
||||||
return True
|
return True
|
||||||
# Compare IP addresses
|
# Compare IP addresses
|
||||||
|
if not resolve:
|
||||||
|
return False
|
||||||
try:
|
try:
|
||||||
address = socket.gethostbyname(authority)
|
address = socket.gethostbyname(authority)
|
||||||
except (socket.gaierror, AttributeError):
|
except (socket.gaierror, AttributeError, UnicodeEncodeError):
|
||||||
return False
|
return False
|
||||||
return address in FileHandler().get_names()
|
return address in FileHandler().get_names()
|
||||||
|
|
||||||
|
@ -1641,13 +1643,16 @@ class DataHandler(BaseHandler):
|
||||||
return addinfourl(io.BytesIO(data), headers, url)
|
return addinfourl(io.BytesIO(data), headers, url)
|
||||||
|
|
||||||
|
|
||||||
# Code move from the old urllib module
|
# Code moved from the old urllib module
|
||||||
|
|
||||||
def url2pathname(url, *, require_scheme=False):
|
def url2pathname(url, *, require_scheme=False, resolve_host=False):
|
||||||
"""Convert the given file URL to a local file system path.
|
"""Convert the given file URL to a local file system path.
|
||||||
|
|
||||||
The 'file:' scheme prefix must be omitted unless *require_scheme*
|
The 'file:' scheme prefix must be omitted unless *require_scheme*
|
||||||
is set to true.
|
is set to true.
|
||||||
|
|
||||||
|
The URL authority may be resolved with gethostbyname() if
|
||||||
|
*resolve_host* is set to true.
|
||||||
"""
|
"""
|
||||||
if require_scheme:
|
if require_scheme:
|
||||||
scheme, url = _splittype(url)
|
scheme, url = _splittype(url)
|
||||||
|
@ -1655,7 +1660,7 @@ def url2pathname(url, *, require_scheme=False):
|
||||||
raise URLError("URL is missing a 'file:' scheme")
|
raise URLError("URL is missing a 'file:' scheme")
|
||||||
authority, url = _splithost(url)
|
authority, url = _splithost(url)
|
||||||
if os.name == 'nt':
|
if os.name == 'nt':
|
||||||
if not _is_local_authority(authority):
|
if not _is_local_authority(authority, resolve_host):
|
||||||
# e.g. file://server/share/file.txt
|
# e.g. file://server/share/file.txt
|
||||||
url = '//' + authority + url
|
url = '//' + authority + url
|
||||||
elif url[:3] == '///':
|
elif url[:3] == '///':
|
||||||
|
@ -1669,7 +1674,7 @@ def url2pathname(url, *, require_scheme=False):
|
||||||
# Older URLs use a pipe after a drive letter
|
# Older URLs use a pipe after a drive letter
|
||||||
url = url[:1] + ':' + url[2:]
|
url = url[:1] + ':' + url[2:]
|
||||||
url = url.replace('/', '\\')
|
url = url.replace('/', '\\')
|
||||||
elif not _is_local_authority(authority):
|
elif not _is_local_authority(authority, resolve_host):
|
||||||
raise URLError("file:// scheme is supported only on localhost")
|
raise URLError("file:// scheme is supported only on localhost")
|
||||||
encoding = sys.getfilesystemencoding()
|
encoding = sys.getfilesystemencoding()
|
||||||
errors = sys.getfilesystemencodeerrors()
|
errors = sys.getfilesystemencodeerrors()
|
||||||
|
|
|
@ -1,5 +1,3 @@
|
||||||
Fix issue where :func:`urllib.request.url2pathname` mishandled file URLs with
|
Add *resolve_host* keyword-only parameter to
|
||||||
authorities. If an authority is present and resolves to ``localhost``, it is
|
:func:`urllib.request.url2pathname`, and fix handling of file URLs with
|
||||||
now discarded. If an authority is present but *doesn't* resolve to
|
authorities.
|
||||||
``localhost``, then on Windows a UNC path is returned (as before), and on
|
|
||||||
other platforms a :exc:`urllib.error.URLError` is now raised.
|
|
||||||
|
|