Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topic handles and ownership #198

Open
Stebalien opened this issue Sep 18, 2019 · 10 comments
Open

Topic handles and ownership #198

Stebalien opened this issue Sep 18, 2019 · 10 comments

Comments

@Stebalien
Copy link
Member

Stebalien commented Sep 18, 2019

Motivation

  • It's not possible to forward messages on a topic without also subscribing.
  • It's not possible to instruct pubsub to remain connected to a topic's "mesh" without also subscribing. That means calling Publish without first calling Subscribe will often just drop the message.
  • It's difficult to reliably register/manage validators in the presence of multiple services as validators are registerd per-topic, globally (Global Validators #58).

Proposal

  • Introduce a Join function that returns a Topic handle (which can be closed to leave the topic). Joining a topic handle subscribes to the topic internally, performs any discovery necessary to find peers participating in the topic, and joins the topic mesh.
  • Make topic ownership exclusive. That is, only a single service will be able to "join" and/or participate in a topic at a time.
  • Bind validators to topic handles, unregistering them when the topic handle is closed.
type PubSub interface {
    // Join joins the topic. Messages can then be sent/received via the returned topic handle.
    // Only one service may join a topic at a time.
    Join(topic) (Topic, error)
}
type Topic interface {
    // Subscribe returns a subscription to messages on the topic.
    Subscribe(ops) Subscription
    // Publish publishes messages on the topic.
    Publish(message) error
    // Close leaves the topic.
    Close() error
    ...
}

Discussion

There are two somewhat orthogonal parts to this proposal: topic handles and exclusivity.

Topic Handles

The motivation for the topic handle is pretty clear. Unless we're in a fully connected network, Publish can't currently be used independently of Subscribe and topic handles try to resolve this issue.

Exclusivity

The concern here was: how do we make it easy for multiple services to reliably use a single pubsub service at the same time. The specific concern was validator registration.

We landed on three options:

  1. Global validators applied to a class of topics (e.g., using pattern matching).
  2. Per-subscription validators and independent subscriptions.
  3. Per-subscription validators and exclusive subscriptions.

Global Validators

This is the option I proposed in #58. The user would have to setup global topic validators and would be responsible for managing conflicts up-front. This is pretty close to how DHT validators work.

However, unlike the DHT, we expect services using pubsub to add new validators for new topic types at runtime. A single over-zealous validator registered by one service could interfere with other services.

Independent Subscriptions

Next we considered associating validators with subscriptions (or topic handles). The idea was to:

  1. Allow multiple services to "join" the same topic.
  2. Have each service joining the topic (optionally) register a validator associated with the subscription.

The catch was applying the validators independently. If two services, A and B, subscribe to the same topic but have different ideas of validity for messages that topic, naively rejecting all messages that fail A's validator would prevent B from receiving some messages it would consider valid.

The solution I proposed was treat these subscribers independently:

  1. For each subscriber, forward messages to that subscriber if that subscriber's validator successfully validates the message.
  2. For each message, forward the message if at least one subscriber validates the message.

This would effectively act like two independent pubsub nodes.

Exclusive Subscriptions

The funky part of the "independent subscriptions" proposal is that we could end up forwarding messages that one or more validators marked as invalid. This could, e.g., cause our peers to mark us as "bad" and drop us because we're forwarding bad messages.

We had two key insights:

  1. The real issue is that the topic string doesn't completely describe the topic. Unfortunately, there's not much we can do about this for now.
  2. This problem is likely academic. We couldn't think of any cases where multiple services would want to subscribe/publish on the same topics as some other service.

The second insight gives us a simple solution: only allow one service to "own" a topic at a time and allow that service to control all validators on that topic. This matches the behavior of protocol handlers, transports, etc.

This is also the safest option as we can upgrade to "independent subscriptions" at any time if it becomes an issue.

@aschmahmann
Copy link
Contributor

@Stebalien Thanks for the writing this up.

I wanted to flesh out the motivation behind the Topic Handles part a bit more:

  1. Currently the task of worrying about connecting to relevant PubSub peers is relegated to the developer instead of PubSub. However, when PubSub becomes capable of being responsible for its own peer discovery (Add Discovery #184) we need an "I care" indicator that's true even if we're not "Subscribed" to the topic. This could've been done in a number of ways including hacky approaches like requiring a registered topic validator (even if it's just "return true"), but Topic Handles are more explicit.
  2. Wanting Publish to take a context without resorting to the Options pattern (e.g. pubsub.Publish(topic, WithContext(ctx),...) for it since we'd like to make it mandatory+obvious+fit our existing patterns, but we don't want to break the existing Publish interface. This implies having the old functions that we can deprecate and new ones we can move to.
  3. It looks/feels better to pass around a topic handle then to call PubSub.Func(topic, ....) everywhere.

@vyzo
Copy link
Collaborator

vyzo commented Sep 19, 2019

This all sounds reasonable.

@raulk
Copy link
Member

raulk commented Oct 18, 2019

Strongly agree with this proposal. Here's a little sketch of how I see this design operating. Did I get it right?

image

@aschmahmann
Copy link
Contributor

The second insight gives us a simple solution: only allow one service to "own" a topic at a time and allow that service to control all validators on that topic. This matches the behavior of protocol handlers, transports, etc.

@raulk you're close but off by a bit. The validators will belong to the topic for now so they're defined on each trunk instead of the branches. Also, the proposed API in #184 bundles publishing with the topic handle since there's only one topic handle and there's only a need for one publishing handle.

@raulk
Copy link
Member

raulk commented Oct 18, 2019

Ok, that makes sense. Rationale: defining validators on the leaves (subscribers and publishers) would prevent us from attaching validators to relay-only topics (i.e. a topic we're helping disseminate, but we're not actively consuming from), because such topics have no leaves.

Put a different way: validators secure payloads flowing through a topic (topic handle), not local logic (subscriber, publisher).

@vyzo
Copy link
Collaborator

vyzo commented Oct 18, 2019

I think Relay only nodes must also validate -- otherwise they may forward invalid messages which offers attackers an easy avenue to inject invalid messages without being discovered.
Also, when there is blacklisting in place, the invalid message will get the relay only node blacklisted which is not what we want.

TL;DR: the validator must be attached to the topic, even for relay only nodes.

@aschmahmann
Copy link
Contributor

👍 to the validator must be attached to the topic, even for relay only nodes.

Also, we may at some point want to add the ability to auto-blacklist people who send us invalid messages regularly, which means relays would probably want to validate.

@vyzo
Copy link
Collaborator

vyzo commented Oct 18, 2019

Relays will almost certainly want to validate, otherwise they are open to attack.

@aschmahmann
Copy link
Contributor

Now that the PR starting off the Topic handlers has landed #184. There were a few DX changes I think we should consider

  • Topic should have a Topic() string function which returns the topic name
  • The Subscription object's Topic() function should return *Topic instead of string (breaking)
  • There should be a validator Topic option passed in as part of pubsub.Join(topic, opts...)
  • The un/register validator functions should be deprecated

Thoughts? @Stebalien @raulk @vyzo

@Stebalien
Copy link
Member Author

IMO, yes to all of those.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants