Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize DecodeString and Decode methods. #1

Merged
merged 3 commits into from
Nov 22, 2016
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 17 additions & 14 deletions base32.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,8 @@
package base32

import (
"bytes"
"io"
"strconv"
"strings"
)

/*
Expand Down Expand Up @@ -67,13 +65,6 @@ var HexEncoding = NewEncoding(encodeHex)
var RawStdEncoding = NewEncoding(encodeStd).WithPadding(NoPadding)
var RawHexEncoding = NewEncoding(encodeHex).WithPadding(NoPadding)

var removeNewlinesMapper = func(r rune) rune {
if r == '\r' || r == '\n' {
return -1
}
return r
}

/*
* Encoder
*/
Expand Down Expand Up @@ -344,17 +335,29 @@ func (enc *Encoding) decode(dst, src []byte) (n int, end bool, err error) {
// written. If src contains invalid base32 data, it will return the
// number of bytes successfully written and CorruptInputError.
// New line characters (\r and \n) are ignored.
func (enc *Encoding) Decode(dst, src []byte) (n int, err error) {
src = bytes.Map(removeNewlinesMapper, src)
n, _, err = enc.decode(dst, src)
func (enc *Encoding) Decode(dst, s []byte) (n int, err error) {
stripped := make([]byte, 0, len(s))
for i := 0; i < len(s); i++ {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use for _, c := range s {. Its faster because it optimizes out bounds checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix this one. But see below.

c := s[i]
if c != '\r' && c != '\n' {
stripped = append(stripped, c)
}
}
n, _, err = enc.decode(dst, stripped)
return
}

// DecodeString returns the bytes represented by the base32 string s.
func (enc *Encoding) DecodeString(s string) ([]byte, error) {
s = strings.Map(removeNewlinesMapper, s)
stripped := make([]byte, 0, len(s))
for i := 0; i < len(s); i++ {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is it returns a rune and not a byte when using range for a string. Is there a way to do use range on the raw bytes or a string.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would have to basically cast it to a []byte first. At which point you might as well just call Decode([]byte(s)) in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. So I was trying to avoid any necessary allocation so is this one okay as is?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, one thing you could do is convert the string to bytes, then overwrite that same array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do that. It will require an extra pass and seams like an overkill. Eventually the go compiler should be smart enough to eliminate the bounds checks. Most other optimizing compilers will in simple cases like this.

Do you really think it could make that much of a difference? If so I will benchmark it and let you know.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wont take an extra pass:

strb := []byte(s)
off := 0
for i, b := range strb {
  if b == '\n' || b == '\r' {
    continue
  }
  strb[off] = b
  off++
}

n, _, err := enc.decode(strb, strb[:off])
if err != nil {
  return nil, err
}
return strb[:n], nil

Copy link
Contributor Author

@kevina kevina Nov 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it will. The extra pass is in the conversion from the string to a []byte.

I think this is just adding complexity but since you seam to really want it I will run some benchmarks and report back. It may be tomorrow through before I do it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

converting a string to a byte array is pretty well optimized. If youre going to count that as an extra pass then you have to count allocating a byte buffer as an extra pass as go zeros every buffer you allocate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I said, I will run some benchmarks. In many cases you can allocate zeroed memory without require a pass to zero it.

c := s[i]
if c != '\r' && c != '\n' {
stripped = append(stripped, c)
}
}
dbuf := make([]byte, enc.DecodedLen(len(s)))
n, _, err := enc.decode(dbuf, []byte(s))
n, _, err := enc.decode(dbuf, stripped)
return dbuf[:n], err
}

Expand Down