
On-demand pubsub #332

Open
Stebalien opened this issue May 12, 2020 · 7 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@Stebalien
Member

Currently, pubsub will start up as soon as it is initialized. Unfortunately, this means:

  • Running three goroutines per peer (at least?).
  • Processing subscriptions from peers.

This is true even if pubsub is enabled but not in use.

In order to enable pubsub by default in go-ipfs, we need to find some way for pubsub not to take up a bunch of resources when it isn't in use.

The MVP solution is on-demand startup. We can start pubsub on demand and stop it after some idle timeout once the last subscription closes. This should be fairly simple to implement and will make it possible for us to turn pubsub on by default without significantly increasing resource usage for nodes that aren't even using pubsub.
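
A minimal sketch of what that could look like in Go (hypothetical startPubsub/stopPubsub hooks and names; not the actual go-libp2p-pubsub API):

package ondemand

import (
	"sync"
	"time"
)

// startPubsub and stopPubsub are hypothetical hooks standing in for whatever
// actually constructs and tears down the pubsub router.
func startPubsub() { /* construct router, register stream handlers */ }
func stopPubsub()  { /* unregister handlers, stop goroutines */ }

// onDemand starts pubsub on the first subscription and schedules a stop
// after an idle timeout once the last subscription closes.
type onDemand struct {
	mu        sync.Mutex
	subs      int
	running   bool
	stopTimer *time.Timer
	idle      time.Duration // e.g. a few minutes
}

func (o *onDemand) subAdded() {
	o.mu.Lock()
	defer o.mu.Unlock()
	if o.stopTimer != nil {
		o.stopTimer.Stop() // cancel a pending shutdown
		o.stopTimer = nil
	}
	if !o.running {
		startPubsub()
		o.running = true
	}
	o.subs++
}

func (o *onDemand) subClosed() {
	o.mu.Lock()
	defer o.mu.Unlock()
	o.subs--
	if o.subs > 0 {
		return
	}
	// Last subscription closed: stop after the idle timeout unless a new
	// subscription shows up first.
	o.stopTimer = time.AfterFunc(o.idle, func() {
		o.mu.Lock()
		defer o.mu.Unlock()
		if o.subs == 0 && o.running {
			stopPubsub()
			o.running = false
		}
	})
}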

The ideal solution is idle peer detection. We don't really need to keep a stream/goroutine open per peer and could instead close streams to peers we haven't spoken with in a while. At the moment this will make the peer think we're dead, so we may need a protocol change to implement this correctly.
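
The idle-peer bookkeeping could look roughly like this (again a sketch with hypothetical types, not a real go-libp2p-pubsub interface); the hard part is the protocol change so the remote side treats a closed stream as "idle" rather than "dead":

package idlepeers

import (
	"sync"
	"time"
)

// peerStream stands in for a real libp2p stream to a single peer.
type peerStream interface {
	Close() error
}

type idleTracker struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time // peer ID -> last time we exchanged a message
	streams  map[string]peerStream
	maxIdle  time.Duration
}

// touch records activity on a peer's stream.
func (t *idleTracker) touch(peer string) {
	t.mu.Lock()
	t.lastSeen[peer] = time.Now()
	t.mu.Unlock()
}

// sweep closes streams to peers we haven't spoken with in a while.
func (t *idleTracker) sweep() {
	t.mu.Lock()
	defer t.mu.Unlock()
	for peer, seen := range t.lastSeen {
		if time.Since(seen) <= t.maxIdle {
			continue
		}
		if s, ok := t.streams[peer]; ok {
			s.Close()
			delete(t.streams, peer)
		}
	}
}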

@Stebalien Stebalien added the kind/enhancement A net-new feature or improvement to an existing feature label May 12, 2020
@Geo25rey

@Stebalien What do you think of each node starting and stopping pubsub outside of libp2p? For example, there could be an ipfs pubsub [start|stop] command. It seems like pubsub is an application-specific feature, so the applications that need pubsub would start and stop it themselves.

With this method, I could see a problem if more than one application uses pubsub at the same time. I think a good solution to that would be using a semaphore in the following way.

semaphore = 0
function start_pubsub() {
    semaphore += 1
    if (semaphore > 1)
        return // pubsub is already running
    else
        // start pubsub...
}

function stop_pubsub() {
    if (semaphore <= 0)
        return // pubsub isn't running...
    semaphore -= 1
    if (semaphore == 0)
        // stop pubsub
}
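
For example, with two apps sharing the daemon (a hypothetical trace of the counter, not real output):

start_pubsub() // app A: semaphore 0 -> 1, pubsub starts
start_pubsub() // app B: semaphore 1 -> 2, already running
stop_pubsub()  // app A exits: semaphore 2 -> 1, pubsub stays up
stop_pubsub()  // app B exits: semaphore 1 -> 0, pubsub actually stops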

I like this method more than a timeout since it requires less CPU time on average.

@Stebalien
Member Author

Stebalien commented Oct 20, 2020 via email

@Geo25rey

> Unfortunately, I don't trust apps to properly manage a semaphore (e.g., they can crash and never reduce it).

I meant the semaphore to be managed by libp2p or go-ipfs, not the application using the pubsub service. I do understand your concern, though. I forgot to account for the case where an app doesn't properly run stop_pubsub().

> The primary reason for a timeout is to reduce the cost of re-starting pubsub (e.g., application restart and/or configuration change). The connections held open by ipfs pubsub subscribe would effectively act as a semaphore/reference count.

So, the timeout would be relatively short (~10 seconds)? Also, I didn't realize starting pubsub was so taxing. Would adding a "suspend" state be useful?

@Stebalien
Member Author

Stebalien commented Oct 20, 2020 via email

@Geo25rey

I could see always keeping pubsub started. To suspend after the idle timeout, we could send a request to all peers asking them not to send any more data on the opened streams, and ignore any data received after the suspend state has started. To ignore incoming data, a relatively small buffer could continuously accept data from a blocking read syscall and do nothing with it.
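
A sketch of that drain loop in Go (treating the stream as a generic io.Reader):

package drain

import "io"

// drainStream reads and discards whatever a peer sends on a stream,
// using a small fixed buffer, until the stream is closed or errors.
func drainStream(s io.Reader) {
	buf := make([]byte, 512)
	for {
		if _, err := s.Read(buf); err != nil {
			return
		}
	}
}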

@Stebalien
Member Author

Unfortunately, that's not really going to help.

  • A lot of the cost is keeping the streams open.
  • If we keep the streams open but don't accept data, there isn't much of a point. When we resume, we'll need our peers' subscription lists, so we'll need to fetch them somehow.

@RubenKelevra

RubenKelevra commented Jan 31, 2021

@Stebalien wrote:

> At the moment this will make the peer think we're dead, so we may need a protocol change to implement this correctly.

Maybe just remove this expectation? Peers go online and offline all the time, so "dead" should maybe never be assumed, but instead checked on demand?

We could just keep track of peers who closed the connection, with an upper limit of something like 100 peers in the cache and a maximum age of around 24 hours, after which they disappear.

This would allow us to do reconnects without wasting much traffic, and just continue at the last state they had.
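
Something like a small bounded cache with a TTL could cover that (a sketch, not tied to any existing libp2p structure):

package peercache

import (
	"sync"
	"time"
)

// closedPeers remembers peers that closed their pubsub stream so a later
// reconnect can pick up from the last known state instead of starting over.
type closedPeers struct {
	mu      sync.Mutex
	entries map[string]time.Time // peer ID -> when the connection closed
	max     int                  // e.g. 100 peers
	ttl     time.Duration        // e.g. 24 * time.Hour
}

func (c *closedPeers) remember(peer string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Drop entries older than the TTL.
	for p, t := range c.entries {
		if time.Since(t) > c.ttl {
			delete(c.entries, p)
		}
	}
	// Evict the oldest entry if the cache is still full.
	if len(c.entries) >= c.max {
		var oldest string
		var oldestT time.Time
		for p, t := range c.entries {
			if oldest == "" || t.Before(oldestT) {
				oldest, oldestT = p, t
			}
		}
		delete(c.entries, oldest)
	}
	c.entries[peer] = time.Now()
}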

@libp2p libp2p deleted a comment from bitcard Feb 27, 2021