Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Language support improvements #161

Merged
merged 4 commits into from
Jan 29, 2021
Merged

Language support improvements #161

merged 4 commits into from
Jan 29, 2021

Conversation

dbrgn
Copy link
Collaborator

@dbrgn dbrgn commented Jan 29, 2021

After #125, I used the occasion to dig out valgrind and kcachegrind to do some performance optimization (something that's fun, but that I do way too seldom...)

To profile, I created a new standalone binary that included nothing but get_languages and this main:

fn main() {
    get_languages(Some("en_US.utf8".to_string()), Some("de:en:fr".to_string()));
}

I then built in release mode and counted the instructions for main (since the function call to get_languages was inlined).

Results (in CPU instructions and percentage reduced compared to previous version):

  • Initial: 6939 (100%)
  • Replace HashSet for deduplication with linear search based approach: 4801 (-31%)
  • Do not allocate language variable: 4796 (-0.1%)
  • Avoid processing fallback language by inserting it directly into lang_list, not locales: 4543 (-5.3%)
  • Do not collect locales into intermediate vec, instead use a chained iterator: 3900 (-14.2%)

Callees before and after:

before

after

All in all, a reduction in instruction count (for the specified env variables) by 44%, nice!

(Might be of interest to you, @niklasmohrin)

@dbrgn dbrgn self-assigned this Jan 29, 2021
@dbrgn dbrgn force-pushed the multilanguage-improvements branch 2 times, most recently from dfd980b to 7d953dc Compare January 29, 2021 11:03
@niklasmohrin
Copy link
Collaborator

niklasmohrin commented Jan 29, 2021

Hey, this is pretty cool. When first reading, I was wondering why I introduced some of these indirections in the first place, I hope these optimizations do not break the rules from upstream.(edit: obviously, the tests are passing so that seems all right)

I think I can see another optimization reading this. In the implementation of Dedup for Vec, all the items are cloned, but couldn't you move the entire Vec out of self into a temporary (say mem::replace) and then drain the temp and only push back the items that are not in the new self yet?

- Replace `Result` parameters with `Option`
- Replace `HashSet` for deduplication with linear search based approach
- Avoid intermediate allocations

This reduces the instruction count in release mode by almost 50%.
@dbrgn
Copy link
Collaborator Author

dbrgn commented Jan 29, 2021

I hope these optimizations do not break the rules from upstream.

I don't think so, I tried to ensure that the optimizations didnt' change the logic. And the tests should cover all cases, I think.

I think I can see another optimization reading this. In the implementation of Dedup for Vec, all the items are cloned

The items are cloned, but only once (to add them to already_seen). Furthermore, we're only using the method with a Vec<&str>, so only references (pointers) should get cloned.

Regarding the draining of the Vec, I'm not sure I understand you correctly. The vec (self) is already "drained" in-place by using the retain method. But if you can come up with an example, I'm happy to take a look!

@niklasmohrin
Copy link
Collaborator

This is how I would implement it, there are no clones.

niklasmohrin@708d717

@dbrgn
Copy link
Collaborator Author

dbrgn commented Jan 29, 2021

@niklasmohrin hah, this is quite nice!

For the test case above, it increases CPU instructions from 3900 to 3997 (probably because there are no duplicates, so the retain call doesn't need to do anything), but I think your version looks cleaner.

Co-authored-by: Niklas Mohrin <niklas.mohrin@gmail.com>
@dbrgn dbrgn merged commit 32ff6c0 into master Jan 29, 2021
@dbrgn dbrgn deleted the multilanguage-improvements branch January 29, 2021 20:14
@dbrgn dbrgn added this to the v1.5.0 milestone Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants