gh-130283: update deprecated links and examples in urllib.request docs (#130284)

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2025-12-23 09:19:18 +00:00 · 2025-03-23 13:29:29 +00:00 · 2025-03-23 13:29:29 +00:00 · fd459b1153
commit fd459b1153
parent 557d2d20d4
1 changed file with 15 additions and 14 deletions
--- a/Doc/library/urllib.request.rst
+++ b/Doc/library/urllib.request.rst
@@ -1215,17 +1215,13 @@ In addition to the examples below, more examples are given in
 :ref:`urllib-howto`.

 This example gets the python.org main page and displays the first 300 bytes of
-it. ::
+it::

   >>> import urllib.request
   >>> with urllib.request.urlopen('http://www.python.org/') as f:
   ...     print(f.read(300))
   ...
-   b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
-   xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
-   <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
-   <title>Python Programming '
+   b'<!doctype html>\n<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->\n<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->\n<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">

 Note that urlopen returns a bytes object.  This is because there is no way
 for urlopen to automatically determine the encoding of the byte stream
@@ -1233,21 +1229,24 @@ it receives from the HTTP server. In general, a program will decode
 the returned bytes object to string once it determines or guesses
 the appropriate encoding.

-The following W3C document, https://www.w3.org/International/O-charset\ , lists
-the various ways in which an (X)HTML or an XML document could have specified its
+The following HTML spec document, https://html.spec.whatwg.org/#charset, lists
+the various ways in which an HTML or an XML document could have specified its
 encoding information.

+For additional information, see the W3C document: https://www.w3.org/International/questions/qa-html-encoding-declarations.
+
 As the python.org website uses *utf-8* encoding as specified in its meta tag, we
-will use the same for decoding the bytes object. ::
+will use the same for decoding the bytes object::

   >>> with urllib.request.urlopen('http://www.python.org/') as f:
   ...     print(f.read(100).decode('utf-8'))
   ...
-   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-   "http://www.w3.org/TR/xhtml1/DTD/xhtm
+   <!doctype html>
+   <!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
+   <!-

 It is also possible to achieve the same result without using the
-:term:`context manager` approach. ::
+:term:`context manager` approach::

   >>> import urllib.request
   >>> f = urllib.request.urlopen('http://www.python.org/')
@@ -1255,8 +1254,10 @@ It is also possible to achieve the same result without using the
   ...     print(f.read(100).decode('utf-8'))
   ... finally:
   ...     f.close()
-   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-   "http://www.w3.org/TR/xhtml1/DTD/xhtm
+   ...
+   <!doctype html>
+   <!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
+   <!--

 In the following example, we are sending a data-stream to the stdin of a CGI
 and reading the data it returns to us. Note that this example will only work