Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: Transparent encoding and decoding of Go strings to and from CBOR byte strings #446

Closed
benluddy opened this issue Dec 7, 2023 · 3 comments
Assignees
Labels
enhancement New feature or request

Comments

@benluddy
Copy link
Contributor

benluddy commented Dec 7, 2023

Is your feature request related to a problem? Please describe.

I am attempting to add a CBOR encoding to an existing project that serves many API resources and supports JSON and Protobuf. API object types are primarily defined as Go struct types. Clients of the API that are written in Go can import and use these type definitions directly. Some clients can't use them because they don't know until runtime exactly what type they'll be handling, but are still able to perform useful work by relying on strict API conventions. Those clients are written to operate on "encoding/json"-compatible map[string]interface{} objects.

The existing encoders don't require that Go strings contain valid UTF-8 sequences, and I would like the proposed CBOR encoding to reject CBOR inputs that contain validity errors. This presents a few challenges:

  1. Encoding a Go string that doesn't contain a valid UTF-8 sequence produces CBOR with validity errors (i.e. we can potentially encode something that we refuse to decode).
  2. We can't change existing stable API types to use cbor.ByteString in place of string, since it would be a breaking change for consumers of the API types.
  3. In dynamic clients, an object may have been encoded to CBOR from a Go struct but is decoded into map[string]interface{}. For compatibility with JSON, untagged CBOR byte strings need to decode into interface{} as string and not []byte.

Describe the solution you'd like

I thought it would be most effective to group this proposal under one issue since it comes from a single use case. I hope that's OK.

  1. Add a new encode option controlling the CBOR string type of Go strings. The default would be "text string" and preserve the current behavior, and a second value would be "byte string".
  2. Add a new decode option controlling the Go type when decoding a CBOR byte string into interface{}. The default would be "[]byte", preserving the current behavior, but with the ability to select "string".
  3. Add a new decode option controlling whether or not to allow a CBOR byte string to be decoded into a destination value of type string. By default this can preserve the existing behavior.

Describe alternatives you've considered

  • Pre-validating and rejecting input strings containing invalid UTF-8 would make the CBOR encoder refuse to encode certain objects that the other supported encoders can encode.
  • Sanitizing invalid input strings in advance (as encoding/json does) incurs runtime overhead and isn't required by the existing Protobuf encoding.

Additional context

I'm happy to contribute all changes necessary to support this use case.

@fxamacker
Copy link
Owner

fxamacker commented Dec 11, 2023

Hi @benluddy, thanks for opening this issue! Please feel free to open a PR to add the proposed 3 non-default options! 👍

If my understanding is correct, you want to:

  • avoid overhead of checking/sanitizing UTF-8 string before encoding it, and
  • prevent encoding string containing invalid UTF-8 to CBOR text string

To achieve this, you want to add 3 options for Go string <--> CBOR byte string:

  • encode Go string to CBOR byte string
  • decode CBOR byte string as a Go string when decoding to interface{}
  • allow decoding CBOR byte string directly to Go string

Currently, the UTF8DecodeInvalid option allows decoding CBOR text strings containing invalid UTF-8 to Go string (invalid UTF-8 strings are well-formed but not valid).

Thanks again!

@dinhxuanvu
Copy link
Contributor

This feature has been completed.
/close

@fxamacker
Copy link
Owner

Closed by #465

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants