Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an assertion to HashSet::get_or_insert_with #518

Closed
wants to merge 1 commit into from

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Apr 9, 2024

Since the Rust libs-api team considers it problematic for HashSet<T>
to be unreliable with well-behaved T (not user-controlled), we should
not allow mismatches to be inserted through get_or_insert_with.

Since the Rust libs-api team considers it problematic for `HashSet<T>`
to be unreliable with well-behaved `T` (not user-controlled), we should
not allow mismatches to be inserted through `get_or_insert_with`.
@cuviper
Copy link
Member Author

cuviper commented Apr 9, 2024

I didn't notice #400 before, but I think that's doing a lot more than is really needed for the immediate problem. For instance, I don't think it's important to compare the hashes, because that calculation is not under user control here. And value equality already implies hash equality, especially for known-good non-user types.

@JustForFun88
Copy link
Contributor

JustForFun88 commented Apr 9, 2024

I didn't notice #400 before, but I think that's doing a lot more than is really needed for the immediate problem. For instance, I don't think it's important to compare the hashes, because that calculation is not under user control here. And value equality already implies hash equality, especially for known-good non-user types.

I can correct the code if necessary.

@JustForFun88
Copy link
Contributor

I didn't notice #400 before, but I think that's doing a lot more than is really needed for the immediate problem. For instance, I don't think it's important to compare the hashes, because that calculation is not under user control here. And value equality already implies hash equality, especially for known-good non-user types.

I removed unnecessary code. Now the comparison is carried out by values. I don’t think this will affect performance. I use the from_key_hashed_nocheck method, so the hash is calculated only once, where the old version of the function calculated the hash twice (separately inside from_key and inside or_insert_with methods)

@Amanieu
Copy link
Member

Amanieu commented Apr 9, 2024

For some background, this came up while I was trying to remove RawEntry from hashbrown (by moving it under an off-by-default cargo feature). The only internal uses of the raw entry API were from the HashSet::get_or_insert* methods. I can rewrite get_or_insert and get_or_insert_owned to use Entry and EntryRef (with some minor changes). But for get_or_insert_with I would have to fall back to the RawTable API, which I would rather avoid: HashSet should only expose operations that can also be performed on a normal HashMap.

@JustForFun88
Copy link
Contributor

I can rewrite get_or_insert and get_or_insert_owned to use Entry and EntryRef (with some minor changes). But for get_or_insert_with I would have to fall back to the RawTable API, which I would rather avoid: HashSet should only expose operations that can also be performed on a normal HashMap.

Well, get_or_insert can be rewritten to Entry by slightly changing OccupiedEntry::key, namely by adding the lifetime 'a to the key K (pub fn key(&self) -> &'a K).

But get_or_insert_owned is not so easy to solve. First, you need to use EntryRef, not Entry. Secondly, all existing VacantEntryRef methods work with K: From<&Q>, and not with Q: ToOwned<Owned = K>.

You need to either rewrite get_or_insert_owned, or change the signatures of VacantEntryRef to ToOwned, or add a new method for VacantEntryRef (ala from_owned_with).

@cuviper
Copy link
Member Author

cuviper commented Apr 10, 2024

Well, get_or_insert can be rewritten to Entry by slightly changing OccupiedEntry::key, namely by adding the lifetime 'a to the key K (pub fn key(&self) -> &'a K).

That would be unsound, because a user could follow that with a remove call, making that reference dangling. That's why the raw entry has fn into_key(self) -> &'a mut K, consuming the entry so it can no longer be used.

We could add a similar OccupiedEntry::into_key(self) -> &'a K though.

@JustForFun88
Copy link
Contributor

That would be unsound, because a user could follow that with a remove call, making that reference dangling. That's why the raw entry has fn into_key(self) -> &'a mut K, consuming the entry so it can no longer be used.

That's right, I forgot about the remove. Well, in any case, we will have to expand the existing HashMap API, if we want remove RawEntry

@cuviper
Copy link
Member Author

cuviper commented Sep 18, 2024

Subsumed by #555.

@cuviper cuviper closed this Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants