Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impl Serde for PackedPatternsV1; export Serde impls for PluralElements #5592

Closed
wants to merge 5 commits into from

Conversation

sffc
Copy link
Member

@sffc sffc commented Sep 24, 2024

Depends on #5580

I know @robertbastian previously was not a fan of exporting a Serde impl for PluralElements. I can work around this by moving PluralElementsInner to the icu_plurals::provider module. However, I think it is fine for crates to export Serde impls for their core types. We do this in various other places. The main difference/downside in this particular case is that we have a bitpacked representation in the provider module that most ICU4X devs should be using; I am using the PluralElements impl only for human-readable serialization.

Copy link

dpulls bot commented Sep 25, 2024

🎉 All dependencies have been resolved !

@sffc
Copy link
Member Author

sffc commented Sep 25, 2024

@Manishearth Do you prefer the "human-readable" serialized format here, which requires shipping the builder code (which is not bad but not great), or would you prefer that I serialize something closer to the binary format (the header int and a sequence of patterns) such that I don't need to ship the builder?

@Manishearth
Copy link
Member

Manishearth commented Sep 25, 2024

Hmm, so I don't actually recall if llvm is smart enough to eliminate human readable deserialization code when working with postcard. My concern is codesize bloat for postcard users.

Otherwise having it be actually human readable seems like a good choice. I don't have a strong preference for it.

@@ -31,12 +33,20 @@ pub struct LengthPluralElements<T> {

/// A builder for a [`PackedPatternsV1`].
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] // human-readable only
#[cfg_attr(feature = "datagen", derive(serde::Serialize))] // human-readable only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a builder type should implement serde.

@@ -20,6 +20,8 @@ use crate::pattern::runtime::Pattern;

/// A field of [`PackedPatternsBuilder`].
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] // human-readable only
#[cfg_attr(feature = "datagen", derive(serde::Serialize))] // human-readable only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say this, but there's nothing stopping anyone from using this directly in a data struct, or with a non-human-readable format.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a type exported in a provider module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't stop anyone from using it in a data struct

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm closing this PR based on the other comments, but I don't think I agree with the premise "That doesn't stop anyone from using it in a data struct": I think it is our duty to help prevent common bugs in clients of ICU4X, but it seems wrong to restrict ICU4X development for the sake of preventing ICU4X developers from doing that. We have a high standard for code with thorough review processes and CI in place.

pub standard: LengthPluralElements<Pattern<'a>>,
/// Patterns for variant 0. If `None`, falls back to standard.
#[cfg_attr(feature = "serde", serde(borrow))]
#[cfg_attr(feature = "serde", serde(skip_serializing_if = "Option::is_none"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works for human-readable formats, which is not enforced anywhere.

@sffc
Copy link
Member Author

sffc commented Sep 25, 2024

Hmm, so I don't actually recall if llvm is smart enough to eliminate human readable deserialization code when working with postcard. My concern is codesize bloat for postcard users.

I'm pretty sure it does; the deserialize impl is generic in D: Deserializer, and is_human_readable is inline (it should be an associated constant but that feature of Rust probably predates Serde).

@sffc
Copy link
Member Author

sffc commented Sep 25, 2024

@robertbastian Your comments don't make a lot of sense to me. PackedPatternsBuilder is just the name of a type. I could easily name it UnpackedPatterns or something else. The type is in a provider module.

I can be convinced of avoiding Serde impls on types exported from a library. I don't see the argument for adding to code complexity for Serde impls on types that live in the provider module.

Comment on lines +887 to 893
///
/// # Serialization
///
/// Although this type implements Serde traits, ICU4X developers and clients
/// seeking a more efficient bitpacked representation should consider
/// [`crate::provider::PluralElementsPackedCow`].
pub struct PluralElements<T>(PluralElementsInner<T>);
Copy link
Member Author

@sffc sffc Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertbastian I added you to the review primarily to give you an opportunity to raise an objection to this change, in icu_plurals (I noted this in the OP). Given the lack of comment in this file, is it safe to assume that you are okay with this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, I do have objections. The serde impls on PluralElementsInner are currently only used in the human readable code path; this change uses them for all serializers through PluralElements. skip_serializing_if does not work for non-self-describing formats like postcard.

@sffc
Copy link
Member Author

sffc commented Sep 25, 2024

Requesting re-review for clarification of the nature of the comments.

@sffc
Copy link
Member Author

sffc commented Sep 30, 2024

OK the approach I put forth in this PR is not going to be compatible with @robertbastian's comments so I will close this PR and make a new one that uses "something closer to the binary format".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants