- Issue #2550: The approach used by client/server code for obtaining ports

to listen on in network-oriented tests has been refined in an effort to facilitate running multiple instances of the entire regression test suite in parallel without issue. test_support.bind_port() has been fixed such that it will always return a unique port -- which wasn't always the case with the previous implementation, especially if socket options had been set that affected address reuse (i.e. SO_REUSEADDR, SO_REUSEPORT). The new implementation of bind_port() will actually raise an exception if it is passed an AF_INET/SOCK_STREAM socket with either the SO_REUSEADDR or SO_REUSEPORT socket option set. Furthermore, if available, bind_port() will set the SO_EXCLUSIVEADDRUSE option on the socket it's been passed. This currently only applies to Windows. This option prevents any other sockets from binding to the host/port we've bound to, thus removing the possibility of the 'non-deterministic' behaviour, as Microsoft puts it, that occurs when a second SOCK_STREAM socket binds and accepts to a host/port that's already been bound by another socket. The optional preferred port parameter to bind_port() has been removed. Under no circumstances should tests be hard coding ports! test_support.find_unused_port() has also been introduced, which will pass a temporary socket object to bind_port() in order to obtain an unused port. The temporary socket object is then closed and deleted, and the port is returned. This method should only be used for obtaining an unused port in order to pass to an external program (i.e. the -accept [port] argument to openssl's s_server mode) or as a parameter to a server-oriented class that doesn't give you direct access to the underlying socket used. Finally, test_support.HOST has been introduced, which should be used for the host argument of any relevant socket calls (i.e. bind and connect). The following tests were updated to following the new conventions: test_socket, test_smtplib, test_asyncore, test_ssl, test_httplib, test_poplib, test_ftplib, test_telnetlib, test_socketserver, test_asynchat and test_socket_ssl. It is now possible for multiple instances of the regression test suite to run in parallel without issue.
2025-09-15 05:06:12 +00:00 · 2008-04-08 23:47:30 +00:00 · 2008-04-08 23:47:30 +00:00 · e41b0061dd
commit e41b0061dd
parent 02f33b43dc
13 changed files with 713 additions and 648 deletions
--- a/Lib/test/test_support.py
+++ b/Lib/test/test_support.py
@ -103,31 +103,97 @@ def requires(resource, msg=None):
            msg = "Use of the `%s' resource not enabled" % resource
        raise ResourceDenied(msg)

-def bind_port(sock, host='', preferred_port=54321):
-    """Try to bind the sock to a port.  If we are running multiple
-    tests and we don't try multiple ports, the test can fail.  This
-    makes the test more robust."""
+HOST = 'localhost'

-    # Find some random ports that hopefully no one is listening on.
-    # Ideally each test would clean up after itself and not continue listening
-    # on any ports.  However, this isn't the case.  The last port (0) is
-    # a stop-gap that asks the O/S to assign a port.  Whenever the warning
-    # message below is printed, the test that is listening on the port should
-    # be fixed to close the socket at the end of the test.
-    # Another reason why we can't use a port is another process (possibly
-    # another instance of the test suite) is using the same port.
-    for port in [preferred_port, 9907, 10243, 32999, 0]:
-        try:
-            sock.bind((host, port))
-            if port == 0:
-                port = sock.getsockname()[1]
-            return port
-        except socket.error, (err, msg):
-            if err != errno.EADDRINUSE:
-                raise
-            print >>sys.__stderr__, \
-                '  WARNING: failed to listen on port %d, trying another' % port
-    raise TestFailed('unable to find port to listen on')
+def find_unused_port(family=socket.AF_INET, socktype=socket.SOCK_STREAM):
+    """Returns an unused port that should be suitable for binding.  This is
+    achieved by creating a temporary socket with the same family and type as
+    the 'sock' parameter (default is AF_INET, SOCK_STREAM), and binding it to
+    the specified host address (defaults to 0.0.0.0) with the port set to 0,
+    eliciting an unused ephemeral port from the OS.  The temporary socket is
+    then closed and deleted, and the ephemeral port is returned.
+
+    Either this method or bind_port() should be used for any tests where a
+    server socket needs to be bound to a particular port for the duration of
+    the test.  Which one to use depends on whether the calling code is creating
+    a python socket, or if an unused port needs to be provided in a constructor
+    or passed to an external program (i.e. the -accept argument to openssl's
+    s_server mode).  Always prefer bind_port() over find_unused_port() where
+    possible.  Hard coded ports should *NEVER* be used.  As soon as a server
+    socket is bound to a hard coded port, the ability to run multiple instances
+    of the test simultaneously on the same host is compromised, which makes the
+    test a ticking time bomb in a buildbot environment. On Unix buildbots, this
+    may simply manifest as a failed test, which can be recovered from without
+    intervention in most cases, but on Windows, the entire python process can
+    completely and utterly wedge, requiring someone to log in to the buildbot
+    and manually kill the affected process.
+
+    (This is easy to reproduce on Windows, unfortunately, and can be traced to
+    the SO_REUSEADDR socket option having different semantics on Windows versus
+    Unix/Linux.  On Unix, you can't have two AF_INET SOCK_STREAM sockets bind,
+    listen and then accept connections on identical host/ports.  An EADDRINUSE
+    socket.error will be raised at some point (depending on the platform and
+    the order bind and listen were called on each socket).
+
+    However, on Windows, if SO_REUSEADDR is set on the sockets, no EADDRINUSE
+    will ever be raised when attempting to bind two identical host/ports. When
+    accept() is called on each socket, the second caller's process will steal
+    the port from the first caller, leaving them both in an awkwardly wedged
+    state where they'll no longer respond to any signals or graceful kills, and
+    must be forcibly killed via OpenProcess()/TerminateProcess().
+
+    The solution on Windows is to use the SO_EXCLUSIVEADDRUSE socket option
+    instead of SO_REUSEADDR, which effectively affords the same semantics as
+    SO_REUSEADDR on Unix.  Given the propensity of Unix developers in the Open
+    Source world compared to Windows ones, this is a common mistake.  A quick
+    look over OpenSSL's 0.9.8g source shows that they use SO_REUSEADDR when
+    openssl.exe is called with the 's_server' option, for example. See
+    http://bugs.python.org/issue2550 for more info.  The following site also
+    has a very thorough description about the implications of both REUSEADDR
+    and EXCLUSIVEADDRUSE on Windows:
+    http://msdn2.microsoft.com/en-us/library/ms740621(VS.85).aspx)
+
+    XXX: although this approach is a vast improvement on previous attempts to
+    elicit unused ports, it rests heavily on the assumption that the ephemeral
+    port returned to us by the OS won't immediately be dished back out to some
+    other process when we close and delete our temporary socket but before our
+    calling code has a chance to bind the returned port.  We can deal with this
+    issue if/when we come across it."""
+    tempsock = socket.socket(family, socktype)
+    port = bind_port(tempsock)
+    tempsock.close()
+    del tempsock
+    return port
+
+def bind_port(sock, host=HOST):
+    """Bind the socket to a free port and return the port number.  Relies on
+    ephemeral ports in order to ensure we are using an unbound port.  This is
+    important as many tests may be running simultaneously, especially in a
+    buildbot environment.  This method raises an exception if the sock.family
+    is AF_INET and sock.type is SOCK_STREAM, *and* the socket has SO_REUSEADDR
+    or SO_REUSEPORT set on it.  Tests should *never* set these socket options
+    for TCP/IP sockets.  The only case for setting these options is testing
+    multicasting via multiple UDP sockets.
+
+    Additionally, if the SO_EXCLUSIVEADDRUSE socket option is available (i.e.
+    on Windows), it will be set on the socket.  This will prevent anyone else
+    from bind()'ing to our host/port for the duration of the test.
+    """
+    if sock.family == socket.AF_INET and sock.type == socket.SOCK_STREAM:
+        if hasattr(socket, 'SO_REUSEADDR'):
+            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) == 1:
+                raise TestFailed("tests should never set the SO_REUSEADDR "   \
+                                 "socket option on TCP/IP sockets!")
+        if hasattr(socket, 'SO_REUSEPORT'):
+            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT) == 1:
+                raise TestFailed("tests should never set the SO_REUSEPORT "   \
+                                 "socket option on TCP/IP sockets!")
+        if hasattr(socket, 'SO_EXCLUSIVEADDRUSE'):
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
+
+    sock.bind((host, 0))
+    port = sock.getsockname()[1]
+    return port

 FUZZ = 1e-6