-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
selectors vs selector dmt vs selector builders: confusing constructor proliferation #192
Comments
If we're going to de-empahsizing the builders, we need to provide something equally user friendly for making selectors. The main thing I can see producing that is something built on There is a general misconception here that working with I understand the reasons for the builder interfaces, but we need to be realistic that we're several steps away from something that can be used to build data structures easily in everyday code. |
I agree with Hannah - Node and Assembler are powerful interfaces, but just don't compare to plain Go expressions for most use cases. I also sidestepped the interfaces in the HAMT ADL: https://github.com/ipld/go-ipld-adl-hamt/blob/b2c8ddbaab060c3deb13863eb9f37b9542992f64/node.go#L175 |
I really want to be clear that I don't think the builders are bad. I'm just really worried that building more library things on top of them is putting DSLs on a DSL, and I'm concerned that in turn will cause us to spend lots of effort polishing and refining something that, because of the additional layers of indirection, will be dangerously likely to evolve further and further away from the declarative, serializable stuff at the core. And it's the declarative serializable stuff at the core that we should care about the most, because that's what's relevant to:
If an application author wants to use golang DSLs for writing their selectors -- fine. But what about applications where users supply the selector from the command line (and we eschew #189 for now, or say they want the "full power" mode so they're using the full DMT JSON)? That's a totally different story. And I think in this story, the DSLs shouldn't be in the processing path at all, because they just don't add any value. And most of the helpful functions I'd want to add are probably in the territory of I'll edit the original post a bit to try to be clearer on this. w.r.t. the "Either codegen, or use of the new bindnode system, would let us have a named golang type for the selector dmt form." part of the original post: I should clarify a bit more why I'm concerned about that: I'm not necessarily claiming it'll add all the helper methods one would ever want, or that it negates any possible value of having a builder DSL. (Ya'll probably right, there's still journey to go there.) I'm claiming having a named type would make it easier to make library APIs more self-documenting and navigable, and encourage people to hold onto and pass around the right values. The reason I'm concerned about this is another piece of code @ribasushi brought up yesterday: https://github.com/filecoin-project/lotus/blob/efeaea76b97fd52203122ee33a7633da74ba9631/node/impl/client/client.go#L738-L739 -- what happened here is that it was insufficiently obvious to folks that storing the |
After thinking about this for a while: One of the things I want to do to help guide users to the right entrypoints for all these APIs is probably a package layout refactor. Notes on that in #236 . |
2022-05-10 conversation: we don't expect this to get done unless Bedrock takes it on. Bedrock work for selectors will happen in #236 (if it happens at all). |
There are several different types in circulation when working with selectors in go-ipld-prime. Some of this is essential complexity; some of it is not.
Of these types, they do very slightly different things. Some can be converted to others, but the conversations are not bidirectional. It's not clear which are meant to compose (and how) and which aren't. The ease of use is lower than it should be as a result of all this.
First, let's enumerate these different types and talk about them a bit. Afterwards, let's think about which of these we should emphasize or deemphasize when planning new APIs and writing documentation.
Enumerating
selector.Selector
)ipld.Node
that has to have a certain data convention inside)builder.SelectorSpec
, which is a wrapper for # 2 and also has a builder method for # 1.You'll instantly notice that # 2 isn't much of a type. It's pretty much our IPLD equivalent of golang's
interface{}
-- it's practically a wildcard. Nonetheless, this should count here, because selector.Parse()'s input parameter is...ipld.Node
.Why did we end up with all these?
# 1
(the "executable" selector) exposes the functions needed while actually executing.ipld.Node
tree out ofbasicnode
, which is what the "dmt" form (# 2) defacto is. This is important because Selectors sometimes return other Selectors during execution (e.g. recursion does this a lot).bindnode
option available. We might be able to revisit this now.# 2
(the "dmt", the declarative Data Model Tree defining the selector) exists... because it has to.ipld.Node
), and then... you're here; now you callselector.Parse(node)
with it.selector.Parse
.# 3
(selector/builder.*
) exists as a golang DSL to help create selector values.selector.Parse
on it, and returns you # 1.fluent/*
package family. We might be able to do this better now.What should we be emphasizing?
if we can unify 1 & 2...
... then we probably should do that. It makes a whole lot of complexity disappear in a puff.
Also, as mentioned earlier, this might be a lot more tractable now than it was when all of our Selectors code was written. We have codegen now, and/or
bindnode
, both of which provide ways to use memory more efficiently, making unification of the executable selector and the dmt form potentially viable. So, there's good reason to do a mental cachebust and reconsider if this is possible.I suspect the need for # 3 (the
builder.SelectorSpec
stuff) would disappear entirely too, if we could unify # 1 and # 2. There'd just be no more demand for it.A couple of things to watch out for in this, though:
selector.Selector
means it parsed" semantics? We do currently have features likeExploreRecursive
will check to make sure it contains anExploreRecursiveEdge
while we're in the process of going fromipld.Node
->selector.Selector
, and that's an example of something we can't replicate with Schema-based validation alone. We'd have to decide how critical this is. (I tend to lean pretty far towards "make the compiler help you enforce invariants about when things are known to be valid", but people may disagree about the cost/benefit here.)selector.Selector
types are already very immutable in their usage. If that wasn't the case, this would be a much bigger concern. I comment on this mostly for the sake of anyone reading this in the future and looking for guidance on their own designs for other systems; it may be a concern for you, future hypothetical reader.)... if we can't unify them:
... then we should still emphasize # 2 (the "dmt" form) as the central focus for most of our APIs.
The transition from dmt->executable is still one way, so composability is best served by emphasizing the dmt form.
(And the # 1 / executable form is also still not actually composible, itself, anyway -- again, on purpose -- because we made a point of disallowing
ExploreRecursiveEdge
without an appropriate parent, which has the additional affect of making it impossible to construct these from the leaves up.)but we should still get a golang type, so this feels better.
I think a problem that I'm observing from how folks are using these in the wild is... people are very reticent to hold onto or pass around the "dmt" form -- the
ipld.Node
value, in golang -- because it feels icky. It feels untyped. It's not obvious how useful theipld.Node
value with this data is, and it's not very discoverable how to use it by just looking at library API type signatures.That's a bummer. It also makes total sense. I'd get similar feelings looking at this. But it's a bummer, because it's nudging people away from using the thing we'd most like for them to use.
We can observe that sometimes this results in worse code when it's not obvious that the "dmt" form might be worth hanging onto: here is an example where selector builder code was copy-pasted between projects, because the upstream only kept the fully compiled selector (which is noncomposable).
But we have solutions for this. Either codegen, or use of the new
bindnode
system, would let us have a named golang type for the selector dmt form. That should then help with the obviousness problem -- library docs get much more discoverable when there's a more specific golang type name available.(We could also just use type aliases to make the library API documentation more discoverable... but this would not result in a lot of (scratch that: would not result in any) actual compile-time safety, so it's much less desirable than the codegen or bindnode options, which do give some compile-time safety in addition to a named type.)
what about
selector/builder.*
?I think we should avoid using these in the signatures of library functions that compose selector spec hunks.
It's perfectly innocuous to continue to use these in applications. But libraries are a different matter.
It's not that the builder system is bad, it's... just that the thing it is is a golang DSL. Building more library layers above it runs an increasing risk of making design errors where a design choice is nontranslatable to other languages, or starts to significantly diverge from the serial form. Building more library layers above this would also likely just become too many layers of indirection (DSLs can be great, but only when user-facing; something wrapping them is usually the beginning of a nasty complexity flowering).
Sidenote: I think if we wrote all this today, I'd probably have called
selector.Parse(ipld.Node) (selector.Selector, error)
asselector.Compile
instead. That evokes a lot more of the right connotations (such as it moving you towards "ready to run", and away from a "source" form, and, making that move in a probably not-entirely-reversable way).Ah well. Hindsight helps everything. Naming is no exception to that rule.
I'm not sure if this is an issue I wrote up with the expectation of immediate action in mind. But I thought it would be a useful hunk of thought to have as other discussions about selector ease-of-use are started. It was also fuelled in part by conversations today about APIs which have tried to use builder.SelectorSpec.
The text was updated successfully, but these errors were encountered: