Improve .chars().count() #37888

Merged: 1 commit merged into rust-lang:master from the chars-count branch, Nov 21, 2016

bluss
Member

@bluss bluss commented Nov 19, 2016

Use a simpler loop to count the `char`s of a string: count the
number of non-continuation bytes. Use `count += <conditional>`, which the
compiler understands well and can apply loop optimizations to.
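
As a rough illustration of the approach described above (a minimal sketch, not the code from this PR), a UTF-8 continuation byte has the bit pattern `10xxxxxx`, so every other byte starts a new `char`. The function names `is_continuation` and `count_chars` are hypothetical, chosen only for this example.

```rust
/// Returns true if `byte` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
/// Hypothetical helper name, used only for this sketch.
fn is_continuation(byte: u8) -> bool {
    (byte & 0b1100_0000) == 0b1000_0000
}

/// Counts the chars of `s` by adding a 0-or-1 conditional for every byte.
/// The loop body has no data-dependent branch, which leaves the compiler
/// free to unroll or vectorize it.
fn count_chars(s: &str) -> usize {
    let mut count = 0;
    for &byte in s.as_bytes() {
        count += (!is_continuation(byte)) as usize;
    }
    count
}
```

Because a `&str` is always valid UTF-8, `count_chars(s)` should agree with `s.chars().count()` for any input.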

Benchmark descriptions and results for two configurations:

  • ascii: ASCII text
  • cy: Cyrillic text
  • jp: Japanese text
  • words ascii: counting each split_whitespace item from the ASCII text
  • words jp: counting each split_whitespace item from the jp text

x86-64 rustc -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff % 
 count_ascii        1,453 (1755 MB/s)  1,398 (1824 MB/s)           -55   -3.79% 
 count_cy           5,990 (856 MB/s)   2,545 (2016 MB/s)        -3,445  -57.51% 
 count_jp           3,075 (1169 MB/s)  1,772 (2029 MB/s)        -1,303  -42.37% 
 count_words_ascii  4,157 (521 MB/s)   1,797 (1205 MB/s)        -2,360  -56.77% 
 count_words_jp     3,337 (1071 MB/s)  1,772 (2018 MB/s)        -1,565  -46.90%

x86-64 rustc -Ctarget-feature=+avx -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff % 
 count_ascii        1,444 (1766 MB/s)  763 (3343 MB/s)            -681  -47.16% 
 count_cy           5,871 (874 MB/s)   1,527 (3360 MB/s)        -4,344  -73.99% 
 count_jp           2,874 (1251 MB/s)  1,073 (3351 MB/s)        -1,801  -62.67% 
 count_words_ascii  4,131 (524 MB/s)   1,871 (1157 MB/s)        -2,260  -54.71% 
 count_words_jp     3,253 (1099 MB/s)  1,331 (2686 MB/s)        -1,922  -59.08%
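
To make the benchmark descriptions concrete, here is a sketch of what the two kinds of benchmark plausibly measure. The actual benchmark source is not shown in this thread, so the function names `count_whole` and `count_words` are assumptions for illustration only.

```rust
/// Whole-text case (ascii, cy, jp): one count over a long string.
fn count_whole(text: &str) -> usize {
    text.chars().count()
}

/// "words" case: one count per whitespace-separated item, so the
/// per-call overhead on many short strings dominates the cost.
fn count_words(text: &str) -> usize {
    text.split_whitespace().map(|word| word.chars().count()).sum()
}
```

The two functions differ only by the whitespace characters, which `count_words` never sees. The words case is the one that stresses small inputs.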

I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR was always winning count_words_ascii in particular (counting
many small strings); this solution is an improvement without tradeoffs.
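
For readers wondering what a blocked variant might look like, the following is a hypothetical sketch that inspects 8 bytes at a time using plain `u64` arithmetic. It is not the algorithm explored for this PR (that code is not shown here), and the names `non_continuation_in_word` and `count_chars_blocked` are invented for the example.

```rust
/// A 1 in the lowest bit of every byte of a u64.
const LSB: u64 = 0x0101_0101_0101_0101;

/// Counts the non-continuation bytes among the 8 bytes packed into `w`.
/// A byte is a continuation byte when bit 7 is set and bit 6 is clear,
/// so a non-continuation byte has bit 7 clear or bit 6 set.
fn non_continuation_in_word(w: u64) -> usize {
    // Move "bit 7 clear" and "bit 6 set" into the low bit of each byte.
    let flags = (((!w) >> 7) | (w >> 6)) & LSB;
    // Multiplying by LSB sums the eight 0/1 bytes into the top byte.
    (flags.wrapping_mul(LSB) >> 56) as usize
}

/// Counts chars by processing full 8-byte blocks, then the tail bytewise.
fn count_chars_blocked(s: &str) -> usize {
    let mut chunks = s.as_bytes().chunks_exact(8);
    let mut count = 0;
    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        count += non_continuation_in_word(u64::from_le_bytes(buf));
    }
    // Fewer than 8 bytes remain; fall back to the simple per-byte loop.
    for &byte in chunks.remainder() {
        count += ((byte & 0xC0) != 0x80) as usize;
    }
    count
}
```

For strings shorter than 8 bytes the block loop never runs and only the setup cost remains, which is one way a scheme like this can lose on the many-small-strings case mentioned above.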

@rust-highfive
Collaborator

r? @brson

(rust_highfive has picked a reviewer for you, use r? to override)

@bluss
Member Author

bluss commented Nov 19, 2016

@alexcrichton
Member

@bors: r+

Nice wins!

@bors
Contributor

bors commented Nov 20, 2016

📌 Commit 5a3aa2f has been approved by alexcrichton

@bors
Contributor

bors commented Nov 20, 2016

⌛ Testing commit 5a3aa2f with merge fc2373c...

bors added a commit that referenced this pull request Nov 20, 2016
@bors bors merged commit 5a3aa2f into rust-lang:master Nov 21, 2016
@bluss bluss deleted the chars-count branch November 21, 2016 10:05
@brson brson added the relnotes label (marks issues that should be documented in the release notes of the next release) Nov 22, 2016
@llogiq
Contributor

llogiq commented Nov 28, 2016

I'm curious – bytecount is much faster than anything else at counting bytes, and should be adaptable to this situation (count bytes lower than 128) without perf loss.

@bluss
Member Author

bluss commented Nov 28, 2016

Go ahead and experiment. My comment was:

"I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR was always winning count_words_ascii in particular (counting
many small strings); this solution is an improvement without tradeoffs."

I'm leaving the door open to such improvements, but I suggest looking out for the small-input case as well.

@bluss
Member Author

bluss commented Nov 28, 2016

Oh, by the way @llogiq, did you see this comment? I wanted to tell you, due to its possible application in bytecount, that it can be beneficial (it was to me) to use this kind of raw pointer solution instead of computing separate slice parts up front. (Edit: Oh, I now see why you couldn't possibly see that comment.)

@bluss
Member Author

bluss commented Nov 29, 2016

By the way, it's not counting just bytes lower than 128, but any (non-)continuation byte.
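
A small sketch of the distinction, with hypothetical function names: the lead byte of a multi-byte sequence is 128 or greater, yet it is not a continuation byte, so it still has to be counted.

```rust
/// Counts only ASCII bytes (values below 128).
fn count_bytes_below_128(s: &str) -> usize {
    s.bytes().filter(|&b| b < 128).count()
}

/// Counts every byte that is not a continuation byte (10xxxxxx),
/// which includes ASCII bytes and the lead bytes of multi-byte chars.
fn count_non_continuation(s: &str) -> usize {
    s.bytes().filter(|&b| (b & 0xC0) != 0x80).count()
}
```

For `"naïve"`, where `ï` is encoded as the two bytes `0xC3 0xAF`, the first function yields 4 and the second yields 5, which matches the number of chars.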
