Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUST-283 Replace linked-hash-map with indexmap #247

Merged
merged 5 commits into from
Apr 21, 2021

Conversation

patrickfreed
Copy link
Contributor

RUST-283

This PR updates Document to wrap an IndexMap (from the indexmap crate) instead of LinkedHashMap from the now defunct linked-hash-map crate.

As part of that, I also updated the Entry API to match that of IndexMap / std::HashMap, since ours used to be slightly different.

@@ -39,11 +39,12 @@ decimal128 = ["decimal"]
name = "bson"

[dependencies]
ahash = "0.7.2"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ahash crate provides a fast yet safe hashing algorithm. In fact, I think it's now the default one used by std::HashMap: https://github.com/rust-lang/hashbrown/blob/master/Cargo.toml. In 2.0, indexmap looks like they will be updating their default to use it, so we can remove this dependency then (indexmap-rs/indexmap#135)

@patrickfreed
Copy link
Contributor Author

The clippy failures appear to be on master too, I'll fix them separately.

@patrickfreed patrickfreed marked this pull request as ready for review April 8, 2021 23:26
/// A view into a single entry in a map, which may either be vacant or occupied.
///
/// This enum is constructed from the entry method on HashMap.
pub enum Entry<'a> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this to match both IndexMap and std::HashMap's Entry API. I updated just the parts that are breaking and left the remaining bits to future, additive work. I'll file a ticket once this is merged for us to remember to do the rest of it. It's basically just adding a few methods to OccupiedEntry and VacantEntry.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from looking at the docs for std::HashMap's OccupiedEntry type, I assume that will cover a way to obtain the value in an entry?

this does make me a little confused about the old API. how were users expected to obtain an entry's underlying value? going back to the document and looking up the entry's key? that seems important to me, but maybe I'm misunderstanding how people would use this API. (I know we inherited this BSON library, so maybe you don't know the answer there, I'm mainly just curious... and this subject is kind of relevant for the phone screens I'm doing these days...)

anyway, definitely support having an API consistent w/ the standard library's HashMap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, OccupiedEntry will have a fn get() -> &Bson on it which will return the value.

In the old API (and the new one, for now), it looks like you had to use or_insert or or_insert_with to get the value out from an entry, inserting into the into it if it wasn't occupied. It was definitely incomplete and missing a lot of the useful parts of the std::HashMap one.

The Entry API looks mostly useful for mutating / inspecting the map itself without having to compute the hash of an input multiple times. I remember we were doing that in the driver and a user actually submitted a PR to update it to use entry and be a little more efficient.

let mut map = HashMap::new();
// old
if map.get("key").is_none() {
    map.insert("key".to_string(), "value".to_string());
}

// new
if let Entry::Vacant(entry) = map.entry("key") {
    entry.insert("value".to_string());
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh I see. thanks for explanation! cool, seems like the new API should be a nice usability improvement then.

Copy link

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple questions, but generally LGTM!

chrono = "0.4"
rand = "0.7"
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
linked-hash-map = "0.5.3"
indexmap = "1.6.2"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the README for the crate it notes that insertion order is not preserved if anything is deleted (though it's not clear to me what the behavior in that case is). I don't see us deleting from the index map anywhere, but is there a chance we'd need to do so in the future and might do so without realizing it would break things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch. It looks like we want to use shift_remove if we want to preserve the order. It's a little bit slower, but this seems like the correct default. I updated remove to use it and made the test for it a little more robust.

/// A view into a single entry in a map, which may either be vacant or occupied.
///
/// This enum is constructed from the entry method on HashMap.
pub enum Entry<'a> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from looking at the docs for std::HashMap's OccupiedEntry type, I assume that will cover a way to obtain the value in an entry?

this does make me a little confused about the old API. how were users expected to obtain an entry's underlying value? going back to the document and looking up the entry's key? that seems important to me, but maybe I'm misunderstanding how people would use this API. (I know we inherited this BSON library, so maybe you don't know the answer there, I'm mainly just curious... and this subject is kind of relevant for the phone screens I'm doing these days...)

anyway, definitely support having an API consistent w/ the standard library's HashMap.

Copy link

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

/// A view into a single entry in a map, which may either be vacant or occupied.
///
/// This enum is constructed from the entry method on HashMap.
pub enum Entry<'a> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh I see. thanks for explanation! cool, seems like the new API should be a nice usability improvement then.

@patrickfreed patrickfreed merged commit ea17c56 into mongodb:master Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants