-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize DecodeString and Decode methods. #1
Conversation
In the DecodeString case it avoids unnecessary intermediate strings. In the Decode case it just avoids bytes.Map, which is likely slow due to having to make a function call for each character.
Note: I manually ran |
n, _, err = enc.decode(dst, src) | ||
func (enc *Encoding) Decode(dst, s []byte) (n int, err error) { | ||
stripped := make([]byte, 0, len(s)) | ||
for i := 0; i < len(s); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use for _, c := range s {
. Its faster because it optimizes out bounds checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can fix this one. But see below.
return | ||
} | ||
|
||
// DecodeString returns the bytes represented by the base32 string s. | ||
func (enc *Encoding) DecodeString(s string) ([]byte, error) { | ||
s = strings.Map(removeNewlinesMapper, s) | ||
stripped := make([]byte, 0, len(s)) | ||
for i := 0; i < len(s); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is it returns a rune
and not a byte
when using range for a string. Is there a way to do use range on the raw bytes or a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You would have to basically cast it to a []byte
first. At which point you might as well just call Decode([]byte(s))
in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. So I was trying to avoid any necessary allocation so is this one okay as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, one thing you could do is convert the string to bytes, then overwrite that same array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could do that. It will require an extra pass and seams like an overkill. Eventually the go compiler should be smart enough to eliminate the bounds checks. Most other optimizing compilers will in simple cases like this.
Do you really think it could make that much of a difference? If so I will benchmark it and let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wont take an extra pass:
strb := []byte(s)
off := 0
for i, b := range strb {
if b == '\n' || b == '\r' {
continue
}
strb[off] = b
off++
}
n, _, err := enc.decode(strb, strb[:off])
if err != nil {
return nil, err
}
return strb[:n], nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it will. The extra pass is in the conversion from the string
to a []byte
.
I think this is just adding complexity but since you seam to really want it I will run some benchmarks and report back. It may be tomorrow through before I do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converting a string to a byte array is pretty well optimized. If youre going to count that as an extra pass then you have to count allocating a byte buffer as an extra pass as go zeros every buffer you allocate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like I said, I will run some benchmarks. In many cases you can allocate zeroed memory without require a pass to zero it.
@whyrusleeping Okay, I used "range" in Decode() but left DecodeString() alone. If you are still not happy with this i can run some more tests to see if the index checking is really costing anything and if it would be better to convert to a []byte first. Thanks |
@whyrusleeping I ran some benchmarks. It turns out the index check is either optimized out or otherwise does not matter. Your version is faster because it only allocates once instead of twice (something that I missed yesterday, sorry), not because it avoids the index check. It also turns out that
|
Do this by calling the internal decode using the same byte array for the source and dest.
I just pushed a commit that uses the version of DecodeString uses by the I'm fine if we just go with that (basically your version). The time between the different versions (that only use a single alloc) is very small and various by 20ns each time I run it. |
For compassion I added the original implementation in the benchmarking code (a5bba06). Here are the numbers from another run
|
This LGTM, thanks @kevina! |
In the DecodeString case it avoids unnecessary intermediate strings.
In the Decode case it just avoids bytes.Map, which is likely slow due
to having to make a function call for each character.
In the benchmark in ipfs/kubo#3376 is was able to save about 40 ms of cpu time.