
How scalable is this? #34

Open
uriva opened this issue Aug 7, 2020 · 1 comment
Labels
question Further information is requested

Comments

uriva commented Aug 7, 2020

  1. Would it support a chunk of 5 billion nodes/edges?
  2. If each has minimal payload, how much time would the process take?
@jeffreylovitz
Contributor

Hi @uriva,

  1. If your server has enough RAM to store and query a graph with 5 billion entities, you should not have a problem running the bulk loader. It automatically divides the input into batches that populate a buffer of up to 2 gigabytes, and maintains a dictionary mapping every node to its identifier.
  2. I'd expect this to take dozens of hours, but there are too many factors in play to be precise. Generally, load time scales linearly with input size. Building a graph with about 5 million nodes, 5 million edges, and 20 million properties takes 220 seconds on my system, so scaling that up by a factor of 500 gives roughly 30 hours as a very rough estimate.
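The batching strategy described in point 1 can be sketched in a few lines. This is an illustrative Python sketch, not RedisGraph's actual implementation: the function names, the entity representation, and the buffer-cap structure are all assumptions; only the 2 GB cap and the node-to-identifier dictionary come from the comment above.

```python
# Hypothetical sketch of the batching described above: divide the input into
# batches whose total size fits a fixed buffer, while keeping a dictionary
# that maps every node to its internal identifier. Names are illustrative.

BUFFER_CAP = 2 * 1024 ** 3  # the 2 GB buffer limit mentioned above


def batch_entities(entities, size_of, cap=BUFFER_CAP):
    """Yield lists of entities whose summed sizes stay under `cap`."""
    batch, used = [], 0
    for entity in entities:
        size = size_of(entity)
        if batch and used + size > cap:
            yield batch
            batch, used = [], 0
        batch.append(entity)
        used += size
    if batch:
        yield batch


# The loader also needs a node -> ID dictionary so that edges can reference
# nodes that were loaded in earlier batches.
node_ids = {}


def intern_node(name):
    """Return the node's identifier, assigning a fresh one on first sight."""
    return node_ids.setdefault(name, len(node_ids))
```

With a small cap for demonstration, `batch_entities(["a" * 10] * 5, len, cap=25)` splits five 10-byte items into batches of 2, 2, and 1.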
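The arithmetic behind the estimate in point 2 can be checked directly: 5 million nodes plus 5 million edges is 10 million entities, and scaling to 5 billion is a factor of 500. A quick back-of-the-envelope computation, using only the numbers quoted above:

```python
# Back-of-the-envelope check of the linear-scaling estimate above:
# ~220 s for ~5M nodes + 5M edges (plus 20M properties), scaled 500x
# to reach roughly 5 billion entities.

measured_seconds = 220   # observed load time at the small scale
scale_factor = 500       # 5 billion entities / 10 million entities

estimated_seconds = measured_seconds * scale_factor   # 110,000 s
estimated_hours = estimated_seconds / 3600

print(f"~{estimated_hours:.1f} hours")  # ~30.6 hours
```

This matches the "about 30 hours" figure, with the usual caveat that real load time depends on payload size, hardware, and indexing.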

@jeffreylovitz jeffreylovitz added the question Further information is requested label Oct 1, 2020