Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regular subtag APIs #1932

Merged
merged 7 commits into from
Sep 2, 2022
Merged

Regular subtag APIs #1932

merged 7 commits into from
Sep 2, 2022

Conversation

robertbastian
Copy link
Member

No description provided.

Copy link
Member

@zbraniecki zbraniecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: generally this looks awesome. It will make it easier to maintain the code. There's some cost to readability (macros are by nature a bit more challenging), but I think it's a good ROI.

My only larger concern is about performance of LSR subtags. I handcrafted parsing of each to a cycle. My reasoning is that by far the hottest paths in parsing of Locale are:

  1. Are the first couple characters a valid Language?
  2. Is the second subtag a Script or not?
  3. Is the second/third subtag a Region?

You may notice that each of the three has different codepath to attempt to always check length first and bail early. Then check first character if possible and bail early.
I also tested (criterion, non iai tho!) whether it is faster to first check ascii and conditionally convert to tinystr, or convert and then test - testing tinystr casing is much faster, but parsing to tinystr costs.

This whole microoptimization may be moot tho and maybe the actual result is not going to be affected in a significant way, but please, test it. With iai if possible.

Particularly, rejecting invalid first characters, and fast-skipping Script as commonly omitted - the handcrafting allowed us to not have separate Script::is_valid from parsing, because parsing was instruction-optimized for that.

You also have a couple places where you, for normalization reasons, repeat validation calls where they weren't before. It may or may not be needed, and I'm ok with some overhead in all subtags except of LID so please, test those. Maybe no-ops will be enough to avoid complicating the macro?

components/locid/src/helpers.rs Outdated Show resolved Hide resolved
components/locid/src/helpers.rs Outdated Show resolved Hide resolved
components/locid/src/subtags/region.rs Outdated Show resolved Hide resolved
@robertbastian
Copy link
Member Author

robertbastian commented Jun 15, 2022

Parsing now does:

  • raw slice length check
  • parse TinyStr
    - TinyStr length check (new, this can be smaller than the raw length)
  • validation
  • normalization

For Language and Script this is the same as before. Variant failed early if the slice length is 4 the first character is alpha, which it doesn't do anymore. Region needs to branch on the TinyStr's length for both validation and normalization, that used to be one branch and now it's two consecutive ones, hopefully the compiler can optimize it out. These can be joined if needed but it would complicate the macro for all subtags.

The from-raw path does:

  • parse TinyStr
  • TinyStr length check
  • validation+normalization check

This is identical to before for LSRV. For other tags this path might be new and validation and normalization are sometimes two TinyStr calls, as the combined predicate doesn't exist.

@robertbastian
Copy link
Member Author

Benchmarks look bad though and I don't know why...

@robertbastian
Copy link
Member Author

^ rebased on #2068

@robertbastian robertbastian requested review from Manishearth and removed request for nciric June 17, 2022 15:49
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I'm requesting changes because we shouldn't land unsafe code that isn't documented.

components/locid/src/helpers.rs Show resolved Hide resolved
components/locid/src/macros.rs Show resolved Hide resolved
utils/tinystr/src/ascii.rs Show resolved Hide resolved
utils/tinystr/src/ascii.rs Outdated Show resolved Hide resolved
utils/tinystr/src/ascii.rs Outdated Show resolved Hide resolved
@sffc sffc added this to the ICU4X 1.0 (Polish) milestone Jun 23, 2022
Manishearth
Manishearth previously approved these changes Jun 23, 2022
Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't have strong opinions on this component but looks good 👍

@sffc
Copy link
Member

sffc commented Jul 22, 2022

@robertbastian Can you determine whether this change can be decoupled from 1.0? If there are any methods that are being removed or renamed, you can do that in a standalone PR first

@robertbastian
Copy link
Member Author

robertbastian commented Aug 30, 2022

@CLAassistant
Copy link

CLAassistant commented Aug 30, 2022

CLA assistant check
All committers have signed the CLA.

@robertbastian
Copy link
Member Author

bench_langid_constr
  Instructions:                2946 (+4.394047%)
  L1 Accesses:                 3715 (+3.481894%)
  L2 Accesses:                    9 (No change)
  RAM Accesses:                  53 (-1.851852%)
  Estimated Cycles:            5615 (+1.628959%)

bench_langid_compare_components
  Instructions:                  14 (No change)
  L1 Accesses:                   19 (No change)
  L2 Accesses:                    3 (No change)
  RAM Accesses:                   1 (No change)
  Estimated Cycles:              69 (No change)

bench_langid_compare_components_str
  Instructions:                  32 (No change)
  L1 Accesses:                   41 (No change)
  L2 Accesses:                    3 (No change)
  RAM Accesses:                   2 (No change)
  Estimated Cycles:             126 (No change)

bench_langid_strict_cmp
  Instructions:                 661 (No change)
  L1 Accesses:                  762 (No change)
  L2 Accesses:                    9 (No change)
  RAM Accesses:                  16 (No change)
  Estimated Cycles:            1367 (No change)

bench_langid_matching
  Instructions:                 126 (No change)
  L1 Accesses:                  163 (No change)
  L2 Accesses:                    9 (No change)
  RAM Accesses:                  10 (No change)
  Estimated Cycles:             558 (No change)

bench_langid_matching_str
  Instructions:                1523 (+2.146211%)
  L1 Accesses:                 1785 (+1.825442%)
  L2 Accesses:                    7 (No change)
  RAM Accesses:                  27 (No change)
  Estimated Cycles:            2765 (+1.170874%)

bench_langid_serialize
  Instructions:                4563 (No change)
  L1 Accesses:                 6423 (+0.015571%)
  L2 Accesses:                   17 (No change)
  RAM Accesses:                  52 (-1.886792%)
  Estimated Cycles:            8328 (-0.406601%)

bench_langid_canonicalize
  Instructions:                7306 (+1.726539%)
  L1 Accesses:                 9928 (+1.275120%)
  L2 Accesses:                   18 (+5.882353%)
  RAM Accesses:                  71 (-2.739726%)
  Estimated Cycles:           12503 (+0.482199%)

langid/overview         time:   [3.4459 us 3.4749 us 3.5056 us]
                        change: [-1.4087% -0.4060% +0.6212%] (p = 0.44 > 0.05)
                        No change in performance detected.

langid/construct/langid time:   [990.92 ns 1.0023 us 1.0170 us]
                        change: [+5.3227% +6.5564% +8.1515%] (p = 0.00 < 0.05)
                        Performance has regressed.

langid/to_string/langid time:   [1.0699 us 1.0786 us 1.0886 us]
                        change: [-31.112% -29.571% -28.160%] (p = 0.00 < 0.05)
                        Performance has improved.
langid/to_string/langid/writeable
                        time:   [685.64 ns 689.68 ns 693.66 ns]
                        change: [+3.1572% +4.1634% +5.1570%] (p = 0.00 < 0.05)
                        Performance has regressed.

langid/compare/struct/langid
                        time:   [140.11 ns 140.99 ns 141.92 ns]
                        change: [-4.3621% -3.1981% -2.0392%] (p = 0.00 < 0.05)
                        Performance has improved.
langid/compare/str/langid
                        time:   [1.0888 us 1.0950 us 1.1018 us]
                        change: [+10.561% +11.239% +11.883%] (p = 0.00 < 0.05)
                        Performance has regressed.
langid/compare/strict_cmp/langid
                        time:   [456.62 ns 459.14 ns 461.62 ns]
                        change: [-2.8353% -1.6745% -0.7075%] (p = 0.00 < 0.05)
                        Change within noise threshold.

langid/canonicalize/langid
                        time:   [2.5731 us 2.5853 us 2.5983 us]
                        change: [-5.8140% -4.7544% -3.6143%] (p = 0.00 < 0.05)
                        Performance has improved.

locale/overview         time:   [4.6733 us 4.7106 us 4.7473 us]
                        change: [+2.3118% +3.3342% +4.3377%] (p = 0.00 < 0.05)
                        Performance has regressed.

locale/construct/locale time:   [1.9567 us 1.9715 us 1.9893 us]
                        change: [-2.1714% -0.9435% +0.2710%] (p = 0.13 > 0.05)
                        No change in performance detected.

locale/to_string/locale time:   [1.3082 us 1.3209 us 1.3366 us]
                        change: [-4.3982% -2.9053% -1.4882%] (p = 0.00 < 0.05)
                        Performance has improved.
locale/to_string/locale/writeable
                        time:   [1.0715 us 1.0770 us 1.0828 us]
                        change: [+0.3527% +1.3658% +2.3336%] (p = 0.01 < 0.05)
                        Change within noise threshold.

locale/compare/struct/locale
                        time:   [265.12 ns 266.57 ns 268.18 ns]
                        change: [-3.9578% -3.1652% -2.3480%] (p = 0.00 < 0.05)
                        Performance has improved.
locale/compare/str/locale
                        time:   [2.1938 us 2.2045 us 2.2159 us]
                        change: [+4.0290% +5.0860% +6.1479%] (p = 0.00 < 0.05)
                        Performance has regressed.
locale/compare/strict_cmp/locale
                        time:   [637.09 ns 644.99 ns 655.71 ns]
                        change: [-2.6526% -1.3131% +0.0698%] (p = 0.05 > 0.05)
                        No change in performance detected.

locale/canonicalize/locale
                        time:   [3.8947 us 3.9168 us 3.9424 us]
                        change: [+1.7613% +2.6642% +3.5496%] (p = 0.00 < 0.05)
                        Performance has regressed.

subtags/language/parse  time:   [196.31 ns 197.31 ns 198.32 ns]
                        change: [-0.6625% +0.3728% +1.4782%] (p = 0.50 > 0.05)
                        No change in performance detected.

subtags/script/parse    time:   [134.40 ns 135.13 ns 135.89 ns]
                        change: [-0.7296% +0.2002% +1.1192%] (p = 0.69 > 0.05)
                        No change in performance detected.

subtags/region/parse    time:   [164.77 ns 165.65 ns 166.63 ns]
                        change: [-21.710% -21.058% -20.438%] (p = 0.00 < 0.05)
                        Performance has improved.

subtags/variant/parse   time:   [212.03 ns 213.21 ns 214.52 ns]
                        change: [+10.139% +11.212% +12.197%] (p = 0.00 < 0.05)
                        Performance has regressed.

@sffc
Copy link
Member

sffc commented Aug 31, 2022

Benchmarks look OK to me.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: Nice work!

/// Compare with a potentially unnormalized BCP-47 string.
///
/// The return value is equivalent to what would happen if you first parsed the
/// BCP-47 string and then performed a structucal comparison.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// BCP-47 string and then performed a structucal comparison.
/// BCP-47 string and then performed a structural comparison.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

components/locid/src/helpers.rs Show resolved Hide resolved
pub const fn as_bytes(&self) -> &[u8] {
// Safe because `self.bytes.as_slice()` pointer-casts to `&[u8]`,
// and changing the length of that slice to self.len() < N is safe.
unsafe { core::mem::transmute((self.bytes.as_slice().as_ptr(), self.len())) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @Manishearth to review this unsafe block

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I want to move away from relying on the slice layout matching tuples (miri will eventually not like this, but also we'll have better slice APIs by then)

Can you instead take self.bytes.as_slice() and just transmute that reference? It should work.

Also, explicitly include the from/to types on the transmute::<_,_>() turbofish (and perhaps import core::mem somewhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.to_ascii_lowercase(),
s.is_ascii_lowercase()
&& s.is_ascii_alphanumeric()
&& (s.len() != 4 || s.all_bytes()[0].is_ascii_digit()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: This is a clearer way to express the invariant than we had before

 variant       = 5*8alphanum         ; registered variants
               / (DIGIT 3alphanum)

https://www.rfc-editor.org/rfc/bcp/bcp47.txt

@sffc
Copy link
Member

sffc commented Sep 2, 2022

I want this so I'm going to merge it.

@sffc sffc merged commit 6e6ea16 into unicode-org:main Sep 2, 2022
@robertbastian robertbastian deleted the locale branch September 5, 2022 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants