From 9a81333152ccc4ff6916400baad78eae43194c39 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 09:00:29 -0500
Subject: [PATCH 01/15] Remove distracting and wordy logistic_map example.

---
 Doc/library/itertools.rst | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 35a71335b35fb6..40915aadfea52f 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -133,10 +133,9 @@ loops that truncate the stream.
     There are a number of uses for the *func* argument. It can be set to
     :func:`min` for a running minimum, :func:`max` for a running maximum, or
     :func:`operator.mul` for a running product. Amortization tables can be
-    built by accumulating interest and applying payments. First-order
-    `recurrence relations `_
-    can be modeled by supplying the initial value in the iterable and using only
-    the accumulated total in *func* argument::
+    built by accumulating interest and applying payments:
+
+    .. doctest::

       >>> data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
       >>> list(accumulate(data, operator.mul)) # running product
@@ -149,17 +148,6 @@ loops that truncate the stream.
       >>> list(accumulate(cashflows, lambda bal, pmt: bal*1.05 + pmt))
       [1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001]

-      # Chaotic recurrence relation https://en.wikipedia.org/wiki/Logistic_map
-      >>> logistic_map = lambda x, _: r * x * (1 - x)
-      >>> r = 3.8
-      >>> x0 = 0.4
-      >>> inputs = repeat(x0, 36) # only the initial value is used
-      >>> [format(x, '.2f') for x in accumulate(inputs, logistic_map)]
-      ['0.40', '0.91', '0.30', '0.81', '0.60', '0.92', '0.29', '0.79', '0.63',
-       '0.88', '0.39', '0.90', '0.33', '0.84', '0.52', '0.95', '0.18', '0.57',
-       '0.93', '0.25', '0.71', '0.79', '0.63', '0.88', '0.39', '0.91', '0.32',
-       '0.83', '0.54', '0.95', '0.20', '0.60', '0.91', '0.30', '0.80', '0.60']
-
     See :func:`functools.reduce` for a similar function that returns only the
     final accumulated value.

From 655e624e9c817d74256a10dc3c3c989ea1627ed4 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 09:36:14 -0500
Subject: [PATCH 02/15] Make the groupby() code equivalent more readable

---
 Doc/library/itertools.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 40915aadfea52f..e567f3540c2f0e 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -437,14 +437,17 @@ loops that truncate the stream.
       class groupby:
           # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
           # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
+
           def __init__(self, iterable, key=None):
               if key is None:
                   key = lambda x: x
               self.keyfunc = key
               self.it = iter(iterable)
               self.tgtkey = self.currkey = self.currvalue = object()
+
           def __iter__(self):
               return self
+
           def __next__(self):
               self.id = object()
               while self.currkey == self.tgtkey:
@@ -452,6 +455,7 @@ loops that truncate the stream.
                  self.currkey = self.keyfunc(self.currvalue)
              self.tgtkey = self.currkey
              return (self.currkey, self._grouper(self.tgtkey, self.id))
+
          def _grouper(self, tgtkey, id):
              while self.id is id and self.currkey == tgtkey:
                  yield self.currvalue

From b1a632a36f97a7a4901831463cac2aec5d8771f8 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 09:40:52 -0500
Subject: [PATCH 03/15] Break islice() docs into smaller paragraphs

---
 Doc/library/itertools.rst | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index e567f3540c2f0e..3f06baa4b83696 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -474,10 +474,17 @@ loops that truncate the stream.
    Afterward, elements are returned consecutively unless *step* is set higher than
    one which results in items being skipped. If *stop* is ``None``, then iteration
    continues until the iterator is exhausted, if at all; otherwise, it stops at the
-   specified position. Unlike regular slicing, :func:`islice` does not support
-   negative values for *start*, *stop*, or *step*. Can be used to extract related
-   fields from data where the internal structure has been flattened (for example, a
-   multi-line report may list a name field on every third line). Roughly equivalent to::
+   specified position.
+
+   If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
+   then the step defaults to one.
+
+   Unlike regular slicing, :func:`islice` does not support negative values for
+   *start*, *stop*, or *step*. Can be used to extract related fields from
+   data where the internal structure has been flattened (for example, a
+   multi-line report may list a name field on every third line).
+
+   Roughly equivalent to::

       def islice(iterable, *args):
           # islice('ABCDEFG', 2) --> A B
@@ -504,8 +511,6 @@ loops that truncate the stream.
               for i, element in zip(range(i + 1, stop), iterable):
                   pass

-   If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
-   then the step defaults to one.

 .. function:: pairwise(iterable)

From f874f0c7cfe8a65619626cbd0dfbe722c815bc9e Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 09:56:22 -0500
Subject: [PATCH 04/15] Minor wording improvements for the combinatoric iterators

---
 Doc/library/itertools.rst | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 3f06baa4b83696..2266c99f5b2ba2 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -229,10 +229,10 @@ loops that truncate the stream.

    The combination tuples are emitted in lexicographic ordering according to
    the order of the input *iterable*. So, if the input *iterable* is sorted,
-   the combination tuples will be produced in sorted order.
+   the output tuples will be produced in sorted order.

    Elements are treated as unique based on their position, not on their
-   value. So if the input elements are unique, there will be no repeat
+   value. So if the input elements are unique, there will be no repeated
    values in each combination.

    Roughly equivalent to::
@@ -278,7 +278,7 @@ loops that truncate the stream.

    The combination tuples are emitted in lexicographic ordering according to
    the order of the input *iterable*. So, if the input *iterable* is sorted,
-   the combination tuples will be produced in sorted order.
+   the output tuples will be produced in sorted order.

    Elements are treated as unique based on their position, not on their
    value. So if the input elements are unique, the generated combinations
@@ -539,13 +539,13 @@ loops that truncate the stream.
    of the *iterable* and all possible full-length permutations
    are generated.

-   The permutation tuples are emitted in lexicographic ordering according to
+   The permutation tuples are emitted in lexicographic order according to
    the order of the input *iterable*. So, if the input *iterable* is sorted,
-   the combination tuples will be produced in sorted order.
+   the output tuples will be produced in sorted order.

    Elements are treated as unique based on their position, not on their
-   value. So if the input elements are unique, there will be no repeat
-   values in each permutation.
+   value. So if the input elements are unique, there will be no repeated
+   values within a permutation.

    Roughly equivalent to::

From 2cd06a0c279fb1df11c50b796da511c35ce0ea59 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 10:00:51 -0500
Subject: [PATCH 05/15] Remove redundant text covered by the example below.

---
 Doc/library/itertools.rst | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 2266c99f5b2ba2..fd984153a6f1f1 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -625,9 +625,7 @@ loops that truncate the stream.
 .. function:: repeat(object[, times])

    Make an iterator that returns *object* over and over again. Runs indefinitely
-   unless the *times* argument is specified. Used as argument to :func:`map` for
-   invariant parameters to the called function. Also used with :func:`zip` to
-   create an invariant part of a tuple record.
+   unless the *times* argument is specified.

    Roughly equivalent to::

From 6e655d6075b7dfca66ee5c510c190f4b5b61e450 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 10:01:45 -0500
Subject: [PATCH 06/15] Break string of sentences into two paragraphs

---
 Doc/library/itertools.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index fd984153a6f1f1..e7fe640ac29472 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -648,9 +648,12 @@ loops that truncate the stream.

    Make an iterator that computes the function using arguments obtained from
    the iterable. Used instead of :func:`map` when argument parameters are already
-   grouped in tuples from a single iterable (the data has been "pre-zipped"). The
-   difference between :func:`map` and :func:`starmap` parallels the distinction
-   between ``function(a,b)`` and ``function(*c)``. Roughly equivalent to::
+   grouped in tuples from a single iterable (when the data has been
+   "pre-zipped").
+
+   The difference between :func:`map` and :func:`starmap` parallels the
+   distinction between ``function(a,b)`` and ``function(*c)``. Roughly
+   equivalent to::

       def starmap(function, iterable):
           # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000

From 62f4ff6cf2c80ee70e710b97acb0ba72c07c906c Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 10:04:29 -0500
Subject: [PATCH 07/15] Remove unnecessary words in tee() docs

---
 Doc/library/itertools.rst | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index e7fe640ac29472..d79797427c7ca2 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -681,9 +681,7 @@ loops that truncate the stream.

    The following Python code helps explain what *tee* does (although the actual
    implementation is more complex and uses only a single underlying
-   :abbr:`FIFO (first-in, first-out)` queue).
-
-   Roughly equivalent to::
+   :abbr:`FIFO (first-in, first-out)` queue)::

         def tee(iterable, n=2):
             it = iter(iterable)
@@ -700,7 +698,7 @@ loops that truncate the stream.
                     yield mydeque.popleft()
             return tuple(gen(d) for d in deques)

-   Once :func:`tee` has made a split, the original *iterable* should not be
+   Once :func:`tee` has been created, the original *iterable* should not be
    used anywhere else; otherwise, the *iterable* could get advanced without
    the tee objects being informed.

From 81e294a5033225c75e56bdf07a4a7802da50a049 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 10:43:19 -0500
Subject: [PATCH 08/15] Improve iter_index() recipe to handle non-sequence iterable inputs

---
 Doc/library/itertools.rst | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index d79797427c7ca2..9681992bee39f2 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -844,15 +844,25 @@ which incur interpreter overhead.
            for k in range(len(roots) + 1)
        ]

-   def iter_index(seq, value, start=0):
-       "Return indices where a value occurs in a sequence."
+   def iter_index(iterable, value, start=0):
+       "Return indices where a value occurs in a sequence or iterable."
        # iter_index('AABCADEAF', 'A') --> 0 1 4 7
-       i = start - 1
        try:
-           while True:
-               yield (i := seq.index(value, i+1))
-       except ValueError:
-           pass
+           seq_index = iterable.index
+       except AttributeError:
+           # Slow path for general iterables
+           it = islice(iterable, start, None)
+           for i, element in enumerate(it, start):
+               if element is value or element == value:
+                   yield i
+       else:
+           # Fast path for sequences
+           i = start - 1
+           try:
+               while True:
+                   yield (i := seq_index(value, i+1))
+           except ValueError:
+               pass

    def sieve(n):
        "Primes less than n"
@@ -1192,6 +1202,18 @@ which incur interpreter overhead.
     []
     >>> list(iter_index('', 'X'))
     []
+    >>> list(iter_index('AABCADEAF', 'A', 1))
+    [1, 4, 7]
+    >>> list(iter_index(iter('AABCADEAF'), 'A', 1))
+    [1, 4, 7]
+    >>> list(iter_index('AABCADEAF', 'A', 2))
+    [4, 7]
+    >>> list(iter_index(iter('AABCADEAF'), 'A', 2))
+    [4, 7]
+    >>> list(iter_index('AABCADEAF', 'A', 10))
+    []
+    >>> list(iter_index(iter('AABCADEAF'), 'A', 10))
+    []

     >>> list(sieve(30))
     [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

From 78d49a47a8968a66cf1242e42ac2a977f1f037ed Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 11:47:18 -0500
Subject: [PATCH 09/15] Describe the purpose of the recipes section

---
 Doc/library/itertools.rst | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 9681992bee39f2..06b5f2378b0ab2 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -752,14 +752,28 @@ Itertools Recipes
 This section shows recipes for creating an extended toolset using the existing
 itertools as building blocks.

+The primary purpose of the itertools recipes is educational. The recipes show
+various ways of thinking about individual tools -- for example, that
+``chain.from_iterable`` is related to the concept of flattening. The recipes
+also give ideas about ways that the tools can be combined -- for example, how
+`compress()` and `range()` can work together. The recipes also show patterns
+for using itertools with the :mod:`operator` and :mod:`collections` modules as
+well as with the built-in itertools such as ``map()``, ``filter()``,
+``reversed()``, and `enumerate()`.
+
+A secondary purpose of the recipes is to serve as an incubator. The
+``accumulate()``, ``compress()``, and ``pairwise()`` itertools started out as
+recipes. Currently, the ``iter_index()`` recipe is being tested to see
+whether it proves its worth.
+
 Substantially all of these recipes and many, many others can be installed from
 the `more-itertools project `_ found
 on the Python Package Index::

     python -m pip install more-itertools

-The extended tools offer the same high performance as the underlying toolset.
-The superior memory performance is kept by processing elements one at a time
+Many of the recipes offer the same high performance as the underlying toolset.
+Superior memory performance is kept by processing elements one at a time
 rather than bringing the whole iterable into memory all at once. Code volume is
 kept small by linking the tools together in a functional style which helps
 eliminate temporary variables. High speed is retained by preferring

From 3019dfc7707cceacedd8b2181deb7df4682e3344 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:05:10 -0500
Subject: [PATCH 10/15] Note better alternative for unique_everseen()

---
 Doc/library/itertools.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 06b5f2378b0ab2..6bdc05975ff9fd 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -998,16 +998,19 @@ which incur interpreter overhead.
        # unique_everseen('AAAABBBCCDAABBB') --> A B C D
        # unique_everseen('ABBCcAD', str.lower) --> A B C D
        seen = set()
-       seen_add = seen.add
        if key is None:
            for element in filterfalse(seen.__contains__, iterable):
-               seen_add(element)
+               seen.add(element)
                yield element
+           # Note: The steps shown above are intended to demonstrate
+           # filterfalse(). For order preserving deduplication,
+           # a better solution is:
+           # yield from dict.fromkeys(iterable)
        else:
            for element in iterable:
                k = key(element)
                if k not in seen:
-                   seen_add(k)
+                   seen.add(k)
                    yield element

    def unique_justseen(iterable, key=None):

From 256bb591ba618d0b3759156dd1e30a3e12deb0e0 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:11:28 -0500
Subject: [PATCH 11/15] Missing article

---
 Doc/library/itertools.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 6bdc05975ff9fd..65c4c3dbf5d4e6 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -698,7 +698,7 @@ loops that truncate the stream.
                     yield mydeque.popleft()
             return tuple(gen(d) for d in deques)

-   Once :func:`tee` has been created, the original *iterable* should not be
+   Once a :func:`tee` has been created, the original *iterable* should not be
    used anywhere else; otherwise, the *iterable* could get advanced without
    the tee objects being informed.

From ac0cc45ccfe3ca92c77eda6a8d526f6277ceb2a2 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:12:36 -0500
Subject: [PATCH 12/15] Markup

---
 Doc/library/itertools.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 65c4c3dbf5d4e6..59f47f63f98d21 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -759,7 +759,7 @@ also give ideas about ways that the tools can be combined -- for example, how
 `compress()` and `range()` can work together. The recipes also show patterns
 for using itertools with the :mod:`operator` and :mod:`collections` modules as
 well as with the built-in itertools such as ``map()``, ``filter()``,
-``reversed()``, and `enumerate()`.
+``reversed()``, and ``enumerate()``.

 A secondary purpose of the recipes is to serve as an incubator. The
 ``accumulate()``, ``compress()``, and ``pairwise()`` itertools started out as

From 78f0586f04e9ef383b1ea539881e43445822fa73 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:25:53 -0500
Subject: [PATCH 13/15] Add imports for the new doctests

---
 Doc/library/itertools.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 59f47f63f98d21..9976eff33f0c68 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -10,6 +10,10 @@
 .. testsetup::

    from itertools import *
+   import collections
+   import math
+   import operator
+   import random

 --------------

From 1dff762459644bf4f92d147ed31fe5a968417bf0 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:27:03 -0500
Subject: [PATCH 14/15] Doctest the example

---
 Doc/library/itertools.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index 9976eff33f0c68..caf5e53abac83c 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -643,7 +643,9 @@ loops that truncate the stream.
               yield object

    A common use for *repeat* is to supply a stream of constant values to *map*
-   or *zip*::
+   or *zip*:
+
+   .. doctest::

       >>> list(map(pow, range(10), repeat(2)))
       [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

From 5138ddd03b996033ab04c075d3fe0ab116a91ce3 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Tue, 18 Oct 2022 12:33:38 -0500
Subject: [PATCH 15/15] Sphinx requires an actual unicode em dash

---
 Doc/library/itertools.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst
index caf5e53abac83c..07bb08625375e4 100644
--- a/Doc/library/itertools.rst
+++ b/Doc/library/itertools.rst
@@ -759,9 +759,9 @@ This section shows recipes for creating an extended toolset using the existing
 itertools as building blocks.

 The primary purpose of the itertools recipes is educational. The recipes show
-various ways of thinking about individual tools -- for example, that
+various ways of thinking about individual tools — for example, that
 ``chain.from_iterable`` is related to the concept of flattening. The recipes
-also give ideas about ways that the tools can be combined -- for example, how
+also give ideas about ways that the tools can be combined — for example, how
 `compress()` and `range()` can work together. The recipes also show patterns
 for using itertools with the :mod:`operator` and :mod:`collections` modules as
 well as with the built-in itertools such as ``map()``, ``filter()``,