question: how to correctly decode CBOR-encoded data that was encoded with cbor.StringToByteString #500

Closed
shanebishop opened this issue Feb 28, 2024 · 3 comments


@shanebishop

I see support for decoding and encoding CBOR data with byte strings was added in issue #446 in PR #465. I would like to use this feature in order to automatically transform input data that can include invalid UTF-8 into CBOR that I can successfully decode (if this is a misuse of the feature, please let me know).

I encoded some data in CBOR with the following encoding options:

options := cbor.CoreDetEncOptions()
options.Time = cbor.TimeRFC3339Nano
options.String = cbor.StringToByteString

I am now trying to decode the CBOR, but everything I have tried so far gives me an error.

This code gives me the error cbor: cannot unmarshal byte string into Go struct field wmd.WmdEvent.created of type time.Time:

decMode, err := cbor.DecOptions{ByteStringToString: cbor.ByteStringToStringAllowed}.DecMode()
if err != nil {
	return nil, err
}
err = decMode.Unmarshal(b.Bytes(), v)

When I use

cbor.DecOptions{DefaultByteStringType: reflect.TypeOf(cbor.ByteString(""))}.DecMode()

I get the error cbor: cannot unmarshal byte string into Go struct field wmd.WmdEvent.id of type string.

How can I correctly decode my CBOR data?

@benluddy
Contributor

Hi @shanebishop, have you tried:

cbor.DecOptions{
  ByteStringToString: cbor.ByteStringToStringAllowed,
  DefaultByteStringType: reflect.TypeOf(""),
}

ByteStringToStringAllowed should let you decode a CBOR byte string into the string field.

When decoding into time.Time, the current implementation effectively decodes the data item into interface{}, then tries to convert the resulting Go value to time.Time:

cbor/decode.go

Lines 1270 to 1300 in cfbd0ff

	var content interface{}
	content, err = d.parse(false)
	if err != nil {
		return
	}
	switch c := content.(type) {
	case nil:
		return
	case uint64:
		return time.Unix(int64(c), 0), nil
	case int64:
		return time.Unix(c, 0), nil
	case float64:
		if math.IsNaN(c) || math.IsInf(c, 0) {
			return
		}
		f1, f2 := math.Modf(c)
		return time.Unix(int64(f1), int64(f2*1e9)), nil
	case string:
		tm, err = time.Parse(time.RFC3339, c)
		if err != nil {
			tm = time.Time{}
			err = errors.New("cbor: cannot set " + c + " for time.Time: " + err.Error())
			return
		}
		return
	default:
		err = &UnmarshalTypeError{CBORType: t.String(), GoType: typeTime.String()}
		return
	}

In any case, the above configuration for DefaultByteStringType should allow you to decode an untagged byte string into time.Time. I'm not sure it's desirable that an option controlling decodes into interface{} has a side effect on decodes into time.Time. I was surprised to discover it for myself yesterday (#497 (comment)).
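
To make the side effect concrete, here is a minimal stdlib-only sketch of the conversion step quoted above. It is not the library's code: toTime is a hypothetical stand-in for the type switch, and the simplified error messages are assumptions for illustration. It shows that the same bytes fall through to the default branch when the intermediate value is a []byte, but reach time.Parse once the intermediate value is a Go string, which is roughly what setting DefaultByteStringType to reflect.TypeOf("") appears to cause.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"time"
)

// toTime is a hypothetical stand-in for the type switch quoted above.
// It takes the value that a decode into interface{} would have produced
// and tries to turn it into a time.Time.
func toTime(content interface{}) (time.Time, error) {
	switch c := content.(type) {
	case nil:
		return time.Time{}, nil
	case uint64:
		return time.Unix(int64(c), 0), nil
	case int64:
		return time.Unix(c, 0), nil
	case float64:
		if math.IsNaN(c) || math.IsInf(c, 0) {
			return time.Time{}, nil
		}
		sec, frac := math.Modf(c)
		return time.Unix(int64(sec), int64(frac*1e9)), nil
	case string:
		// time.Parse accepts fractional seconds in the input even though
		// the RFC3339 layout does not spell them out.
		tm, err := time.Parse(time.RFC3339, c)
		if err != nil {
			return time.Time{}, errors.New("cannot set " + c + " for time.Time: " + err.Error())
		}
		return tm, nil
	default:
		// A []byte lands here, which corresponds to the
		// "cannot unmarshal byte string" error from the question.
		return time.Time{}, fmt.Errorf("cannot convert %T into time.Time", content)
	}
}

func main() {
	raw := []byte("2024-02-28T12:34:56.123456789Z")

	// Intermediate value is []byte (the default for a CBOR byte string).
	_, err := toTime(raw)
	fmt.Println("as []byte:", err)

	// Intermediate value is string (assumed effect of the workaround).
	tm, err := toTime(string(raw))
	fmt.Println("as string:", tm.UTC(), err)
}
```

The sketch only models the conversion after the data item has been parsed; how the real decoder picks the intermediate Go type is an implementation detail of the library and may change between versions.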

@shanebishop
Author

Thanks @benluddy, your suggestion worked! 🎉

Now I just need to figure out how to get com.fasterxml.jackson.dataformat.cbor.databind.CBORMapper in the consuming Java application to play nice with the byte string-encoded CBOR, but that is outside the scope of this question.

@fxamacker
Owner

Hey @shanebishop, thanks for opening this issue!

Based on further discussion with @benluddy, the workaround currently relies on an unintended side effect 🐞 of implementation details.

A new feature/bugfix for this will be implemented this month (probably by adding a new decoding option).

I'll open a new issue and will also comment here when the new feature becomes available for you to try.
