Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for TrustedLen (trusted_len) #37572

Open
bluss opened this issue Nov 3, 2016 · 36 comments
Open

Tracking issue for TrustedLen (trusted_len) #37572

bluss opened this issue Nov 3, 2016 · 36 comments
Labels
A-iterators Area: Iterators B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@bluss
Copy link
Member

bluss commented Nov 3, 2016

Implemented in #37306

Open Questions:

    • Naming: something with Iterator in the name?
    • Naming: size vs len
    • Does TrustedLen pose any requirements on the ExactSizeIterator impl if one exists for the same type?
    • Does TrustedLen pose any requirements on the Clone impl if one exists for the same type?
    • Check all the specific impls, including complicated ones like Flatten on arrays or FlatMap that returns arrays.
@bluss bluss added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Nov 3, 2016
@est31
Copy link
Member

est31 commented Nov 8, 2016

Oh, finally this thing is there!

I was avoiding collect() like the plague because of this missing.

@photino
Copy link

photino commented Nov 9, 2016

I think it is better named TrustedLenIterator, since we don't call ExactSizeIterator as ExactSize.

@jethrogb
Copy link
Contributor

Or perhaps TrustedSizeIterator?

@cyplo
Copy link
Contributor

cyplo commented May 6, 2017

As the referenced code is merged - is this still an issue ? thank you !

@bluss
Copy link
Member Author

bluss commented May 6, 2017

Yes, tracking issues keep note on the code while it is still an unstable feature; they are closed when the feature is stabilized or removed.

@alexbool
Copy link
Contributor

alexbool commented May 6, 2017

By the way, what is holding stabilization right now?

@vadixidav
Copy link
Contributor

vadixidav commented May 30, 2017

I feel that ExactSizeIterator should be addressed in the documentation for TrustedLen. Although the default implementation should be correct, TrustedLen should also specify that any override in ExactSizeIterator (or rather that the trait overall) should also fulfill the contract to be implemented.

Any thoughts?

@Mark-Simulacrum Mark-Simulacrum added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Jul 22, 2017
@ishitatsuyuki
Copy link
Contributor

Do we need a RFC for stabilization? Also, is TrustedLen meant to be the successor (replacement) of ExactSizeIterator?

@bluss
Copy link
Member Author

bluss commented Nov 11, 2017

It serves a different purpose, so I don't think it is a successor. Good start towards stabilization would be to incorporate the naming suggestions from this tracking issue I think.

@BatmanAoD
Copy link
Member

Unless I misunderstand (which is likely), this trait indicates that the number of iterations can be calculated in advance, and that it will be a finite number, possibly greater than max usize.

In that case, I would suggest one of FiniteSizeIterator or KnownSizeIterator.

  • ...SizeIterator is consistent with ExactSizeIterator, as mentioned above.
    • "Size" is possibly ambiguous since it usually refers to bytes-of-memory, so I think ...LenIterator would be a better name for both traits, but I also think that consistency is a more important consideration.
    • If and when Tracking issue for trait aliases #41517 is completed, ...SizeIterator traits could be renamed to ...LenIterator and the old names can be aliased to the new ones, then deprecated in the next epoch.
  • "Known" may be a bit of a misnomer, since the length isn't technically required to be cached prior to calling size_hint, and as far as I can tell even calling size_hint won't necessarily actually calculate the exact number if it's larger than the maximum usize. But the value must in principle be knowable, and KnownSizeIterator seems cleaner and more user-friendly than KnowableSizeIterator.
  • "Finite", by implicit contrast to "infinite" or "non-finite", nicely avoids implying that the number is small enough to be useful as, e.g., an array length--which is the primary important distinguishing characteristic from ExactSizeIterator.

I also considered BoundedSizeIterator, but this seems strictly inferior to FiniteSizeIterator since it carries the same correct connotation of having a possibly large-size but fails to indicate that the upper and lower bounds should match.

By the way, what exactly does size_hint return as the lower-bound when the actual length is too large to fit in a usize?

@sfackler
Copy link
Member

usize::max_value()

@scottmcm
Copy link
Member

The Trusted part here makes me think of rust-lang/rfcs#2237 (comment). It sortof feels like the things currently specializing on TrustedLen want instead of specialize on I: unsafe ExactSizeIterator as a way to say "I need to trust that this actually does what its documentation says it does".

is TrustedLen meant to be the successor (replacement) of ExactSizeIterator?

As currently spec'd, they're complements of each other. Speaking roughly, ExactSizeIterator can be shrunk but not grown, whereas TrustedLen can be grown but not shrunk. (For specific examples, Skip propagates ExactSizeIterator but not TrustedLen, and Chain propagates TrustedLen but not ExactSizeIterator.)

@SimonSapin
Copy link
Contributor

The libs team discussed this and the consensus was that, since this trait was introduced in order to specialize on it, it should stay unstable until trait specialization itself becomes closer to stabilization.

@Phlosioneer
Copy link
Contributor

Phlosioneer commented Apr 1, 2018

ExactSizeIterator should be a strict subset of TrustedLen. Therefore, it should make sense to define impl<T: ExactSizeIterator> TrustedLen for T. Is there any reason we haven't already done that?

@sfackler
Copy link
Member

sfackler commented Apr 1, 2018

TrustedLen is an unsafe trait and ExactSizeIterator is not.

@scottmcm
Copy link
Member

scottmcm commented Apr 1, 2018

@Phlosioneer

ExactSizeIterator should be a strict subset of TrustedLen

That's not quite true; see the example in #37572 (comment).

@Phlosioneer
Copy link
Contributor

Phlosioneer commented Apr 1, 2018

TrustedLen is an unsafe trait and ExactSizeIterator is not.

Ah. So perhaps eventually.

@scottmcm
I saw that example. I agree, they're not the same, and must be propagated differently. But if you already have an ExactSizeIterator, then you clearly also satisfy the requirements of a TrustedLen where the length fits inside of a usize.

The type doesn't actually mean anything inherently; it's only a refinement on the return value of size_hint.

@clarfonthey
Copy link
Contributor

clarfonthey commented Apr 22, 2018

IMHO we're missing UnboundedIterator, which was one of the ideas TrustedLen aimed to solve. It seems silly to have an unsafe trait for this when a safe one would suffice. (Silly note on semantics: "unbounded" here means can't be bounded by a usize, which may still be finite.)

As far as TrustedLen goes, I'd go with BoundedIterator, where the size_hint is (usize, usize) rather than (usize, Option<usize>). That would essentially replace TrustedLen, minus the unbounded case.

@Phlosioneer
Copy link
Contributor

Phlosioneer commented Feb 1, 2019

I think there are only two things blocking stability for this trait. First, there's questions about what the name should be. Second, we need to improve documentation to make the distinction between this and ExactSizeIterator crystal clear.

There are also open questions here about methods that may be included, but they can be handled as separate issues to allow the trait to continue toward stabilization.

Here's what is not in question:

I suggest the lang team (or lib team? I get the teams all mixed up) should consider the options presented here and revise the name. Then we can change the name in nightly, and see if there are any comments for a few months. This allows us to make small, steady steps forward.

Edit: I spotted the comment #37572 (comment) that said the team wanted to block this until trait specialization is done. I'd still like to make small steps forward, but if they want to hold off then I understand. Sorry for the disruption.

@Coder-256

This comment has been minimized.

@scottmcm

This comment has been minimized.

Alexhuszagh referenced this issue in Alexhuszagh/rust-lexical Sep 2, 2019
@dtolnay
Copy link
Member

dtolnay commented Jan 18, 2020

Implementing TrustedLen for Cycle<I> where I: Clone + TrustedLen as proposed in #68352 would require that we attach a requirement to TrustedLen that if a Clone impl exists then a cloned iterator will have the same length as the original.

I believe that we do not want to require this guarantee on TrustedLen impls, but I am raising it here because this would need to be settled prior to stabilization; let me know if anyone would like to make a case.

@WaffleLapkin
Copy link
Member

I think that TrustedLen should require trusted_len method:

fn trusted_len(&self) -> Option<usize>;

instead of relying on Iterator::size_hint, just like ExactSizeIterator do. Option<usize> is mush easyer to understand (Some(n) - n elements, None - more than usize::MAX elements) than (usize, Option<usize>) ((n, Some(n)) - n elements, (usize::MAX, None) - more than usize::MAX elements, (x, None) - error)

@KodrAus KodrAus added A-iterators Area: Iterators Libs-Tracked Libs issues that are tracked on the team's project board. labels Jul 31, 2020
@AnttiParaoanu
Copy link

If TrustedLen would require ExactSizeIterator to be implemented correctly, it would be possible to have the following impl for Skip:

unsafe impl<I> TrustedLen for Skip<I> where I: ExactSizeIterator + TrustedLen {}

Implemented correctly for type I means that I.size_hint().1 == Some(I.len()). Since the length of I must be an usize, Skip<I> can calculate its own exact length.

@sdroege
Copy link
Contributor

sdroege commented Sep 2, 2020

Closely related to TrustedLen is TrustedRandomAccess, which doesn't seem to have a tracking issue. Should there be one, should it forever stay unstable or should it become part of this one?

@the8472
Copy link
Member

the8472 commented Mar 30, 2021

In some cases it would be useful to know when an iterator has a known length that is guaranteed to be <= usize::MAX. E.g. when you're collecting and want to transform it into a counted loop. Currently this needs a fallback code path for the None case which is awkward because it's probably going to run out of memory anyway (except for ZSTs) and bloats the implementation.

Having a stricter TrustedUsizeLen would be helpful. TrustedRandomAccess can be abused for that purpose since it happens to include that guarantee too but it is implemented on fewer types. I think it would be better to extract this orthogonal feature from TrustedRandomAccess into TrustedUsizeLen, which then could enable

let a: Iterator + TrustedLen + TrustedUsizeLen = ...;
let b: Iterator + TrustedLen + TrustedUsizeLen = ...;
let chained: Iterator + TrustedLen  = a.chain(b);
let length_recovered: Iterator + TrustedLen + TrustedUsizeLen = chained.take(usize::MAX)

I.e. Take could implement TrustedUsizeLen when its inner iterator is only TrustedLen.

This in turn could enable specializations on TrustedRandomAccess + TrustedUsizeLen for iterators that were previously not randomly accessible because the size couldn't be guaranteed at compile time to fit into a usize.

@jethrogb
Copy link
Contributor

Even usize::MAX is probably going to run out of memory though. Your actual maximum length you can handle is likely much lower than that. This depends on runtime behavior (the amount of RAM available on the system) so you won't be able to handle this at compile time anyway.

@the8472
Copy link
Member

the8472 commented Mar 30, 2021

The take(usize::MAX) would just be needed to guarantee at compile time that the iterator length always fits into usize, as an upper bound. The actual size can still be lower than that if each link in the chain is small enough. The issue with TrustedLen is that it can return None in principle, even if a concrete instantiation of an iterator chain never will, and that prevents some specializations because they require a concrete bound.

@ssomers
Copy link
Contributor

ssomers commented Dec 26, 2021

The current description confuses me. Some annotations for my future self and hopefully others:

  • "An iterator that reports an accurate length using size_hint."
    → Accurate in the sense of reliable, trustworthy, not more than in the sense of precise, detailed.
  • "The iterator reports a size hint where it is either exact (lower bound is equal to upper bound), or the upper bound is None. The upper bound must only be None if the actual iterator length is larger than usize::MAX. In that case, the lower bound must be usize::MAX, resulting in an Iterator::size_hint() of (usize::MAX, None)."
    → In short, exact or "larger than usize::MAX" (which includes infinite).
  • "The iterator must produce exactly the number of elements it reported or diverge before reaching the end."
    → This must be meant to apply only if the iterator reports an exact size hint. If the iterator reports (usize::MAX, None), the previous paragraph said it must produce more than usize::MAX (or diverge, presumably) and comments in rc.rs and sync.rs corroborate.

Why is the definition not "ExactSizeIterator without mercy"? To allow chains of TrustedLen iterators to be TrustedLen too, without panicking or failing on overflow. Also, to allow repeating with a finite count (e.g. repeat("blah").take(3)) to be TrustedLen (playground). Neither of these are or could easily be ExactSizeIterator.

When is a properly implemented iterator TrustedLen but not ExactSizeIterator? Apart from the adapters discussed above, when it can yield more than usize::MAX elements, like Repeat or like range<u64> when usize is smaller.

When is a properly implemented iterator ExactSizeIterator but not TrustedLen? Apart from the shortening adapters discussed above (Skip and StepBy), there are plenty of examples in the standard library of ExactSizeIterator but I don't see why they're not TrustedLen. Map is but Inspect isn't, some iterators outside the core crate are and some aren't, iterators on a character or string like EscapeDefault are not.

@the8472
Copy link
Member

the8472 commented Dec 26, 2021

→ Accurate in the sense of reliable, trustworthy, not in the sense of precise, detailed.

Considering the nature of size_hint (normally allowing loose bounds) it is also promises being precise in the sense of having tight bounds to the extent that the integer type allows.

and comments in rc.rs and sync.rs corroborate.

Well, those comments are just me interpreting TrustedLen, so to not turn this into a game of telephone I think a better anchor is the behavior of Chain which was one of the motivating uses for TrustedLen. A chain of two trusted len iters only returns (usize::MAX, None) if the length truly exceeds usize and will immediately become finite when the iterator does.

I should have updated TrustedLen docs when I did that.

When is a properly implemented iterator ExactSizeIterator but not TrustedLen?

We could conceivably mark some things as ExactSizeIterator on the assumption that for example Clone is well-behaved but can't do that for TrustedLen because the latter requires a stronger contract. E.g. with some helper traits that could be done for Take<Cycle<T>>.

@ssomers
Copy link
Contributor

ssomers commented Dec 28, 2021

We could conceivably mark some things as ExactSizeIterator on the assumption that for example Clone is well-behaved but can't do that for TrustedLen

Granted, it is a reason why a possible ExactSizeIterator would not be TrustedLen, though that #68352 case is quite specific to the Cycle adapter, that isn't a true iterator adapter but an Iterator+Clone adapter. But it's far from the reason why so many ExactSizeIterator in the standard library are not TrustedLen - they aren't even adapters.

@the8472
Copy link
Member

the8472 commented Dec 28, 2021

Oh sure I was giving a reason why an adapter cannot be TrustedLen while still being ExactSizeIterator. I guess those you're referring to could be but nobody has done the work or nobody needed those optimizations.

ExactSizeIterator enables some additional trait bounds, e.g. backwards iteration on some adapters. TrustedLen on the other hand only provides performance improvements, so it's less obvious when it could be useful.

@ssomers
Copy link
Contributor

ssomers commented Dec 28, 2021

I was beginning to think TrustedLen is only marginally useful anyway, but then godbolt shows that an iterator like std::iter::repeat("blah").take(17) is completely evaluated at compile time, but an extra .skip(0) kills TrustedLen and thwarts optimization.

@tgnottingham
Copy link
Contributor

What do you think about adding a trusted_len method to this trait, which defaults to returning the upper bound of size_hint (as an Option<usize>), possibly with a debug assertion that the lower and upper bounds are equal?

Adding trusted_len creates the possibilty of a bug where it and size_hint are inconsistent with one another, but it eliminates bugs where clients of the trait incorrectly depend on the lower bound of size_hint, rather than the upper bound. For each implementation of TrustedLen, there are potentially many client uses of size_hint, so having the implementation handle it once in trusted_len may reduce the chance of bugs, particularly with the default implementation in place.

@the8472
Copy link
Member

the8472 commented Apr 10, 2022

It could be an associated function instead TrustedLen::len(&Self) rather than TrustedLen::len(&self), then there's no issue with it being overridden or inconsistent with size_hint.

@HadrienG2
Copy link

HadrienG2 commented Mar 3, 2023

Former comment Just as a data point for why this is useful, I just had to create a copy of this trait for [a fun project of mine](https://users.rust-lang.org/t/iteratorilp-speeding-up-iterator-reductions-with-instruction-level-parallelism/90289) that wouldn't work without some unsafe assertion that `ExactSizeIterator::len()` is implemented correctly.

I believe I have used due dilligence in trying every safe alternative to if Some(item) = iter.next() { item } else { unreachable_unchecked() }, but nothing else would produce satisfactory codegen.

Turns out I was wrong, and my project requires only an accurate lower bound, so a strict subset of TrustedLen's guarantees that is honored by any correctly implemented iterator. The best way to handle this would probably be some kind of ?unsafe trait Iterator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-iterators Area: Iterators B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests