Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bindnode): allow custom type conversions with options #414

Merged
merged 13 commits into from
Jun 17, 2022

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented May 13, 2022

This arises out of the attempts to push bindnode down into go-fil-markets for go-data-transfer vouchers. There are some types that are difficult to handle without just replacing them and having a conversion layer.

(Skip to custom_test.go for the code TL;DR)

To give some idea of the challenges, here's the list of types in their hierarchies that become difficult:

  • retrievalmarket.DealResponse
    • go-state-types/abi/TokenAmount - go-state-types/big/Int - struct{*big.Int}
  • retrievalmarket.DealProposal
    • retrievalmarket.Params
      • go-state-types/abi/TokenAmount - go-state-types/big/Int - struct{*big.Int}
  • retrievalmarket.DealPayment
    • go-address/Address - struct{string}
    • specs-actors/actors/builtin/paych/SignedVoucher
      • go-address/Address - struct{string}
      • specs-actors/actors/builtin/paych/ModVerifyParams
        • go-address/Address - struct{string}
      • go-state-types/big/Int - struct{*big.Int}
      • go-state-types/crypto/crypto.Signature
        • this is a byteprefix union, SigType is first byte, Data is remainder, it could be read as Bytes and converted internally rather than implementing byteprefix

All of these types currently have their own custom, hand-rolled, UnmarshalCBOR() and MarshalCBOR() methods for cbor-gen. They're also not in go-fil-markets, so attempting to do anything direct with them (like adding custom methods) means chasing through lots of repos.

The approach I'm taking here is to add some options when calling bindnode, with these you can register converters for your custom types. In all of the above instances it's Bytes that we're dealing with, I think it's going to be the most common so I've only implemented that for now but I think a complete version of this would do it for all scalar kinds. Your schema still has to say that the field is Bytes / bytes but when traversing the Go type, if it encounters that type you registered for and it has the right kind then it'll run through your converter, either way.

The current version requires you supply two functions, a to and a from and they are checked for types. A previous iteration which you can see in 12a0ca4 had a converter type interface which had two functions and it'd need one for each kind we're allowing to register. I switched to supplying functions and validating with reflection for a slightly cleaner API. Plus, if you have existing functions like NewFromBytes() (which Address has) that matches the converter signature then you can just pass that in.

@rvagg
Copy link
Member Author

rvagg commented May 13, 2022

Something I'm pondering after this is to be able to do this same thing for but for Any. That's quite invasive unfortunately because it needs to be done at every AssignX() and every Lookup*(), and every iterator Next().

The obvious use-case now is for selectors, which need to be Any anyway. But for cbor-gen they are Deferreds. So I'd like to have a wrapper class that can hold an ipld.Node and has the Deferred methods as well. So it can act as both a bindnode type and a cbor-gen type at the same time and migration is relatively straightforward.

But, it also ends up being a backdoor for all sorts of whacky hacks. You just stop the schema short with an Any, and then defer to some other method of converting your types. You could even mix it with codegen or some other means of pulling in a typed builder. It'd allow you to do novel validation and conditionals too .. potentially lots of behaviours that we may or may not really want to happen, but such is the nature of pluggable systems.

node/bindnode/api.go Outdated Show resolved Hide resolved
Comment on lines 452 to 460
res := customConverter.toFunc.Call([]reflect.Value{w.val.Addr()})
if !res[1].IsNil() {
err, ok := res[1].Interface().(error)
if !ok {
return nil, fmt.Errorf("converter function did not return expected error: %v", res[1].Interface())
}
return nil, err
}
byts, ok := res[0].Interface().([]byte)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having to do the call in the abstract reflection world is going to be slower than if you can use a concrete function - i wonder if generics could be enough for defining these functions rather than needing to have it as a reflect.Value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I wasn't aware of the size of that cost .. looks like more than 70 times slower than a plain call. Which is not awesome. It's not the worst thing in the world but if we're trying to compete against cbor-gen we're already starting off on a back foot.

So a couple of options:

  1. Use interface{} where for the custom type, and pass in that type via other means (probably just a null pointer like the rest of bindnode uses). Internally it can cast through the various options to find the one that is being used (e.g. FromBytes([]byte) (interface{}, error) & ToBytes(interface{}) ([]byte, error). Then the API just needs that extra argument to pass the type. This steps back a little to the previous version of this API (12a0ca4); it's not terrible but more verbose is slightly annoying.
  2. Generics - this is the nice use-case for generics, but I don't believe we can even use reflection to access the concrete type being used, so we still need to derive that from somewhere. So we end up very similar to option 1, of needing to add the type. I think also we might have trouble casting to find out which one is being used and may have to use reflection to figure that out (like the current impl) but I'm speaking out of a bit of ignorance on what Go generics are giving us.
  3. Live with the cost and 🤞 that reflect: add Value.Caller golang/go#49340 gets implemented sometime soon.

I'll try option 1 out and see how it goes. I'm making this API as "EXPERIMENTAL" in its docs to give a bit of permission to tweak and change it as we figure out how well it works for what we need.

@rvagg rvagg mentioned this pull request May 16, 2022
@rvagg
Copy link
Member Author

rvagg commented May 16, 2022

OK, given the comment above about reflect function call performance, I've gone back to one option per kind for converters, so there's now a AddCustomTypeXConverter option for each scalar kind other than Null.

This now gives concrete functions to call and strong type checking by avoiding reflection. It makes the API more verbose, and it does force the use of interface{} on the user as a stand-in for their custom type, which is a little annoying (especially when you have to type check & convert within your converter functions). Plus you need to supply your type up-front that you want to register; so each of these option functions takes 3 arguments: type, convertFromFn, convertToFn.

I've implemented them all and added a variation for each in the main test that covers this. There's a bunch of whacky types is in there that are represented as one thing in the data model but another thing as a Go type. With them going both ways:

  • a Bool as an int enum with 2 variations
  • an Int that's the integer in words ("two thousand")
  • a Float that's a wrapped big.Float for the user
  • a String that's really an array of | separated []bytes
  • a Bytes that's really a struct wrapping a string (like cid.Cid and graphsync.RequestId)
  • a Bytes that's a wrapped big.Int, like go-state-types/big/Int
  • a Link that's a reversed hex form of the hash digest (Bitcoin ID style)

Copy link
Member

@willscott willscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it feels pretty verbose, this seems like it should be able to have much more reasonable performance on these conversions.

@rvagg
Copy link
Member Author

rvagg commented May 17, 2022

Considering shortening the names: from AddCustomTypeBoolConverter() to just BoolConverter(), TypedBoolConverter(), BoolTypeConverter(). Maybe there's a better term than "converter" too. It's all a bit too verbose now that I'm typing them out.

Copy link
Collaborator

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the code itself is quite verbose, the end result seems to do exactly what we want it to, and using it doesn't seem too bad. Especially when I look at the markets PR, I think our options list will be pretty standard across all of filecoin, so this should be relatively convenient to implement.

+1 approved

@rvagg
Copy link
Member Author

rvagg commented May 19, 2022

This is close to ready given the work in #415 that's now merged in. I've been putting it into use in filecoin-project/go-fil-markets#713 and have now switched back to go-data-transfer where I'm redoing filecoin-project/go-data-transfer#305 and I'll use this branch there too. That may expose bugs or other problems in the implementation.

@rvagg
Copy link
Member Author

rvagg commented May 23, 2022

I think I'll close #352 when this is merged; there may be further work to do but so far we're making it work fairly nicely for common Filecoin chain types with this PR.

@rvagg
Copy link
Member Author

rvagg commented Jun 10, 2022

I've pulled #429 into this branch to make it a larger bindnode updates branch that I can use in go-fil-markets and go-data-transfer, along with some other updates I'm about to propose.

This was referenced Jun 14, 2022
@rvagg rvagg merged commit b636575 into master Jun 17, 2022
@rvagg rvagg deleted the rvagg/bindnode-extend branch June 17, 2022 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

node/bindnode: provide a mode that's compatible with cbor-gen
3 participants