mirror of
				https://github.com/python/cpython.git
				synced 2025-10-25 15:58:57 +00:00 
			
		
		
		
	 c5605dffdb
			
		
	
	
		
	
	
	
	
		
			
		
	
			
		
			
				
	
	
		
			228 lines
		
	
	
	
		
			9.4 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
		
	
	
	
	
	
:mod:`heapq` --- Heap queue algorithm
=====================================

.. module:: heapq
   :synopsis: Heap queue algorithm (a.k.a. priority queue).
.. moduleauthor:: Kevin O'Connor
.. sectionauthor:: Guido van Rossum <guido@python.org>
.. sectionauthor:: François Pinard

This module provides an implementation of the heap queue algorithm, also known
as the priority queue algorithm.

Heaps are arrays for which ``heap[k] <= heap[2*k+1]`` and ``heap[k] <=
heap[2*k+2]`` for all *k*, counting elements from zero.  For the sake of
comparison, non-existing elements are considered to be infinite.  The
interesting property of a heap is that ``heap[0]`` is always its smallest
element.

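The invariant can be checked mechanically.  The sketch below is only an illustration: ``is_min_heap`` is a hypothetical helper, not part of this module.  It compares every index with its two children and skips children that fall past the end of the list, which has the same effect as treating them as infinite.

```python
import heapq

def is_min_heap(a):
    """Return True if a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all k."""
    n = len(a)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            # A child index past the end counts as "infinite": no violation.
            if child < n and a[k] > a[child]:
                return False
    return True

data = [5, 1, 4, 2, 3]
heap = list(data)
heapq.heapify(heap)          # rearrange the copy so the invariant holds
```

A sorted list always passes this check, which is why ``heap.sort()`` keeps a heap valid.
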
The API below differs from textbook heap algorithms in two aspects: (a) We use
zero-based indexing.  This makes the relationship between the index for a node
and the indexes for its children slightly less obvious, but is more suitable
since Python uses zero-based indexing. (b) Our pop method returns the smallest
item, not the largest (called a "min heap" in textbooks; a "max heap" is more
common in texts because of its suitability for in-place sorting).

These two make it possible to view the heap as a regular Python list without
surprises: ``heap[0]`` is the smallest item, and ``heap.sort()`` maintains the
heap invariant!

To create a heap, use a list initialized to ``[]``, or you can transform a
populated list into a heap via function :func:`heapify`.

The following functions are provided:


.. function:: heappush(heap, item)

   Push the value *item* onto the *heap*, maintaining the heap invariant.


.. function:: heappop(heap)

   Pop and return the smallest item from the *heap*, maintaining the heap
   invariant.  If the heap is empty, :exc:`IndexError` is raised.


.. function:: heappushpop(heap, item)

   Push *item* on the heap, then pop and return the smallest item from the
   *heap*.  The combined action runs more efficiently than :func:`heappush`
   followed by a separate call to :func:`heappop`.


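As a small illustration (the variable names are arbitrary), the combined call and the two separate calls return the same item and leave the same contents behind.  The last call shows that an item no larger than the current minimum comes straight back without changing the heap.

```python
from heapq import heapify, heappush, heappop, heappushpop

a = [2, 4, 6]
b = list(a)
heapify(a)
heapify(b)

# One combined call ...
x = heappushpop(a, 5)

# ... versus a push followed by a separate pop.
heappush(b, 5)
y = heappop(b)

# Pushing something no larger than heap[0] returns it immediately.
z = heappushpop(a, 1)
```
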
.. function:: heapify(x)

   Transform list *x* into a heap, in-place, in linear time.


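A minimal sketch of :func:`heapify` on an arbitrary list: the list is reordered in place (the function returns ``None``), after which repeated pops yield the items in ascending order.

```python
from heapq import heapify, heappop

x = [9, 4, 7, 1, 8, 2]
result = heapify(x)          # reorders x in place and returns None
smallest = x[0]

# Popping until the list is empty yields the items in sorted order.
drained = [heappop(x) for _ in range(len(x))]
```
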
.. function:: heapreplace(heap, item)

   Pop and return the smallest item from the *heap*, and also push the new *item*.
   The heap size doesn't change. If the heap is empty, :exc:`IndexError` is raised.
   This is more efficient than :func:`heappop` followed by :func:`heappush`, and
   can be more appropriate when using a fixed-size heap.  Note that the value
   returned may be larger than *item*!  That constrains reasonable uses of this
   routine unless written as part of a conditional replacement::

      if item > heap[0]:
          item = heapreplace(heap, item)

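One way to use the conditional replacement is a fixed-size heap that keeps the *k* largest items seen so far.  This is a sketch under assumptions: ``k_largest`` is a hypothetical helper, not a function of this module, and it returns its result in ascending order.

```python
from heapq import heapify, heapreplace
from itertools import islice

def k_largest(stream, k):
    """Return the k largest items of `stream` in ascending order."""
    if k <= 0:
        return []
    it = iter(stream)
    heap = list(islice(it, k))       # the first k items fill the heap
    heapify(heap)
    for item in it:
        # heap[0] is the smallest of the current top k; replace it only
        # when the newcomer is larger, so the heap size never changes.
        if item > heap[0]:
            heapreplace(heap, item)
    return sorted(heap)

top3 = k_largest([5, 1, 9, 3, 7, 8, 2], 3)
```
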
Example of use:

   >>> from heapq import heappush, heappop
   >>> heap = []
   >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
   >>> for item in data:
   ...     heappush(heap, item)
   ...
   >>> ordered = []
   >>> while heap:
   ...     ordered.append(heappop(heap))
   ...
   >>> ordered
   [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
   >>> data.sort()
   >>> data == ordered
   True

Using a heap to insert items at the correct place in a priority queue:

   >>> heap = []
   >>> data = [(1, 'J'), (4, 'N'), (3, 'H'), (2, 'O')]
   >>> for item in data:
   ...     heappush(heap, item)
   ...
   >>> while heap:
   ...     print(heappop(heap)[1])
   J
   O
   H
   N


The module also offers three general-purpose functions based on heaps.


.. function:: merge(*iterables)

   Merge multiple sorted inputs into a single sorted output (for example, merge
   timestamped entries from multiple log files).  Returns an :term:`iterator`
   over the sorted values.

   Similar to ``sorted(itertools.chain(*iterables))`` but returns an iterable, does
   not pull the data into memory all at once, and assumes that each of the input
   streams is already sorted (smallest to largest).


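The log-file use case might look like the sketch below.  The two lists are made-up stand-ins for log files, each already sorted by its timestamp.

```python
from heapq import merge

# Hypothetical (timestamp, message) entries from two already-sorted logs.
log_a = [(1, "a: start"), (4, "a: work"), (9, "a: stop")]
log_b = [(2, "b: start"), (3, "b: work"), (10, "b: stop")]

merged = merge(log_a, log_b)     # lazy: items are produced on demand
timeline = list(merged)
timestamps = [t for t, _ in timeline]
```
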
.. function:: nlargest(n, iterable, key=None)

   Return a list with the *n* largest elements from the dataset defined by
   *iterable*.  *key*, if provided, specifies a function of one argument that is
   used to extract a comparison key from each element in the iterable (for
   example, ``key=str.lower``).  Equivalent to:  ``sorted(iterable, key=key,
   reverse=True)[:n]``.


.. function:: nsmallest(n, iterable, key=None)

   Return a list with the *n* smallest elements from the dataset defined by
   *iterable*.  *key*, if provided, specifies a function of one argument that is
   used to extract a comparison key from each element in the iterable (for
   example, ``key=str.lower``).  Equivalent to:  ``sorted(iterable, key=key)[:n]``.


The latter two functions perform best for smaller values of *n*.  For larger
values, it is more efficient to use the :func:`sorted` function.  Also, when
``n==1``, it is more efficient to use the built-in :func:`min` and :func:`max`
functions.


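A short illustration of the *key* argument and of the stated equivalences, using an arbitrary list of words compared case-insensitively:

```python
from heapq import nlargest, nsmallest

words = ["pear", "Apple", "fig", "Banana", "cherry"]

top2 = nlargest(2, words, key=str.lower)
low2 = nsmallest(2, words, key=str.lower)

# The same results spelled out with sorted(), as described above.
top2_sorted = sorted(words, key=str.lower, reverse=True)[:2]
low2_sorted = sorted(words, key=str.lower)[:2]
```
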
Theory
------

(This explanation is due to François Pinard.  The Python code for this module
was contributed by Kevin O'Connor.)

Heaps are arrays for which ``a[k] <= a[2*k+1]`` and ``a[k] <= a[2*k+2]`` for all
*k*, counting elements from 0.  For the sake of comparison, non-existing
elements are considered to be infinite.  The interesting property of a heap is
that ``a[0]`` is always its smallest element.

The strange invariant above is meant to be an efficient memory representation
for a tournament.  The numbers below are *k*, not ``a[k]``::

                                  0

                 1                                 2

         3               4                5               6

     7       8       9       10      11      12      13      14

   15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30

In the tree above, each cell *k* is topping ``2*k+1`` and ``2*k+2``. In a usual
binary tournament we see in sports, each cell is the winner over the two cells
it tops, and we can trace the winner down the tree to see all opponents s/he
had.  However, in many computer applications of such tournaments, we do not need
to trace the history of a winner. To be more memory efficient, when a winner is
promoted, we try to replace it by something else at a lower level, and the rule
becomes that a cell and the two cells it tops contain three different items, but
the top cell "wins" over the two topped cells.

If this heap invariant is protected at all times, index 0 is clearly the overall
winner.  The simplest algorithmic way to remove it and find the "next" winner is
to move some loser (let's say cell 30 in the diagram above) into the 0 position,
and then percolate this new 0 down the tree, exchanging values, until the
invariant is re-established. This is clearly logarithmic in the total number of
items in the tree. By iterating over all items, you get an O(n log n) sort.

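The removal step described above can be sketched as follows.  This is an illustration of the textbook procedure, not necessarily how the module itself is implemented, and ``pop_smallest`` and ``heap_sort`` are hypothetical names.

```python
from heapq import heapify

def pop_smallest(a):
    """Remove and return a[0] from the heap-ordered list `a`."""
    last = a.pop()                   # take a "loser" from the end
    if not a:
        return last                  # the heap held a single item
    winner, a[0] = a[0], last        # move the loser into position 0
    k, n = 0, len(a)
    while True:
        child = 2 * k + 1
        if child >= n:
            break                    # no children: reached a leaf
        # Choose the smaller child when a right child exists.
        if child + 1 < n and a[child + 1] < a[child]:
            child += 1
        if a[k] <= a[child]:
            break                    # invariant re-established
        a[k], a[child] = a[child], a[k]
        k = child
    return winner

def heap_sort(items):
    """Sort by heapifying a copy and popping n times: O(n log n)."""
    a = list(items)
    heapify(a)
    return [pop_smallest(a) for _ in range(len(a))]
```

Each pop walks at most one root-to-leaf path, which is where the logarithmic cost per item comes from.
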
A nice feature of this sort is that you can efficiently insert new items while
the sort is going on, provided that the inserted items are not "better" than the
last 0'th element you extracted.  This is especially useful in simulation
contexts, where the tree holds all incoming events, and the "win" condition
means the smallest scheduled time.  When an event schedules other events for
execution, they are scheduled into the future, so they can easily go into the
heap.  So, a heap is a good structure for implementing schedulers (this is what
I used for my MIDI sequencer :-).

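A toy event loop along these lines might look like the sketch below.  The event names and the rule that a "ping" schedules a "pong" five ticks later are invented for the example.

```python
from heapq import heappush, heappop

def run(events):
    """Process (time, name) events in time order; return the execution log."""
    queue = []
    for event in events:
        heappush(queue, event)
    log = []
    while queue:
        time, name = heappop(queue)      # the earliest scheduled event "wins"
        log.append((time, name))
        # Invented rule: a "ping" schedules a "pong" 5 ticks in the future.
        # The new event is never earlier than the one just extracted.
        if name == "ping":
            heappush(queue, (time + 5, "pong"))
    return log

log = run([(10, "ping"), (3, "ping"), (12, "tick")])
```
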
Various structures for implementing schedulers have been extensively studied,
and heaps are good for this, as they are reasonably speedy, the speed is almost
constant, and the worst case is not much different than the average case.
However, there are other representations which are more efficient overall, yet
the worst cases might be terrible.

Heaps are also very useful in big disk sorts.  You most probably all know that a
big sort implies producing "runs" (which are pre-sorted sequences, whose size is
usually related to the amount of CPU memory), followed by merging passes for
these runs, which merging is often very cleverly organised [#]_. It is very
important that the initial sort produces the longest runs possible.  Tournaments
are a good way to achieve that.  If, using all the memory available to hold a
tournament, you replace and percolate items that happen to fit the current run,
you'll produce runs which are twice the size of the memory for random input, and
much better for input fuzzily ordered.

Moreover, if you output the 0'th item on disk and get an input which may not fit
in the current tournament (because the value "wins" over the last output value),
it cannot fit in the heap, so the size of the heap decreases.  The freed memory
could be cleverly reused immediately for progressively building a second heap,
which grows at exactly the same rate the first heap is melting.  When the first
heap completely vanishes, you switch heaps and start a new run.  Clever and
quite effective!

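The two-heap scheme can be sketched in memory as below.  This is an illustration under assumptions: ``make_runs`` is a hypothetical helper, "memory" is counted in items and assumed to be at least 1, and runs are collected in lists rather than written to disk.

```python
from heapq import heapify, heappop, heappush
from itertools import islice

def make_runs(items, memory):
    """Split `items` into sorted runs, buffering at most `memory` items."""
    it = iter(items)
    current = list(islice(it, memory))   # heap feeding the current run
    heapify(current)
    pending = []                         # second heap, for the next run
    runs, run = [], []
    while current:
        smallest = heappop(current)
        run.append(smallest)             # "output the 0'th item"
        for new in islice(it, 1):        # read at most one new input item
            if new >= smallest:
                heappush(current, new)   # still fits the current run
            else:
                heappush(pending, new)   # too small: waits for the next run
        if not current:
            # The first heap has vanished: switch heaps, start a new run.
            runs.append(run)
            run = []
            current, pending = pending, []
    return runs

runs = make_runs([5, 1, 4, 2, 3, 9, 0, 8, 7, 6], memory=3)
```

With room for only three items, the first run in this example already holds seven of the ten inputs.
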
In a word, heaps are useful memory structures to know.  I use them in a few
applications, and I think it is good to keep a 'heap' module around. :-)

.. rubric:: Footnotes

.. [#] The disk balancing algorithms which are current, nowadays, are more annoying
   than clever, and this is a consequence of the seeking capabilities of the disks.
   On devices which cannot seek, like big tape drives, the story was quite
   different, and one had to be very clever to ensure (far in advance) that each
   tape movement will be the most effective possible (that is, will best
   participate in "progressing" the merge).  Some tapes were even able to read
   backwards, and this was also used to avoid the rewinding time. Believe me, real
   good tape sorts were quite spectacular to watch! From all times, sorting has
   always been a Great Art! :-)
