Feature Request: Vector Operations #3068

Btibert3 · 2024-03-16T20:23:49Z

I just came across this project, and wow, I am impressed. Currently, Neo4j supports vector operations, more specifically, similarity calculations. It would be great if we could extend the concept of fixed-length lists to support similarity operations. Maybe that support already exists and I am overlooking how to achieve this with your stack, but this would be a great feature to help support RAG operations.

semihsalihoglu-uw · 2024-03-16T20:36:08Z

Thanks for raising this. I agree that we should support some of the common functions. We can follow DuckDB's array functions: https://duckdb.org/docs/sql/data_types/array.html#functions. Arrays in DuckDB are equivalent to our fixed-length list type, so I don't think there is a hurdle in supporting these.

I'll put this into our pipeline.

prrao87 · 2024-03-16T21:37:44Z

Hi @Btibert3, that makes a lot of sense. Could you elaborate a bit on the intended use case in the context of RAG? To you, how would the ideal implementation look from a graph query perspective?

Btibert3 · 2024-03-17T20:21:26Z

@semihsalihoglu-uw DuckDB is exactly what I had in my head.

@prrao87 Naive RAG lets you find entries (i.e. nodes) based on the similarity of the vector to the input query. We can go beyond this by further restricting the results by leveraging graph patterns. One example might be to show a list of products based on the user's input query (vector similarity) but further restrict/re-rank the results based on products the user hasn't purchased and behavior of other "similar" users, where similarity in this context is leveraging graph relationships. In this example, the results come from vector-based similarity and graph relationships.
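To make the product example concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the `recommend` function, the `purchases` dictionary standing in for purchase relationships, the "shares at least one purchase" definition of similar users, and the arbitrary 0.1 weight on the graph signal. None of it is an existing API of this project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def recommend(query_vec, product_vecs, purchases, user, top_k=3, graph_weight=0.1):
    """Rank products by vector similarity, adjusted by graph structure.

    product_vecs: dict product id -> embedding (hypothetical data layout)
    purchases:    dict user id -> set of product ids (stands in for graph edges)
    """
    owned = purchases.get(user, set())
    # "Similar" users here are those who share at least one purchase (an assumption).
    neighbours = [u for u, items in purchases.items() if u != user and items & owned]
    # Count how many similar users bought each product the user does not own.
    popularity = {}
    for u in neighbours:
        for p in purchases[u] - owned:
            popularity[p] = popularity.get(p, 0) + 1
    # Restrict to products not yet purchased, then re-rank with the graph signal.
    scored = [
        (cosine_similarity(query_vec, vec) + graph_weight * popularity.get(p, 0), p)
        for p, vec in product_vecs.items()
        if p not in owned
    ]
    scored.sort(reverse=True)
    return [p for _, p in scored[:top_k]]
```

In a graph database, the filtering and neighbour counting would be expressed as pattern matches in the query rather than dictionary lookups; the sketch only shows how the two signals combine.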

Another example would be to retrieve the most similar document chunk via vector search, and then improve the context window based on linked nodes and variable pattern matching, again using vector similarity but also the structure of the relationships in the graph.
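The document-chunk example can be sketched the same way. The `expand_context` function, the `links` adjacency dictionary, and the `hops` parameter below are hypothetical names chosen for illustration; they show the idea of picking the best chunk by similarity and then following relationships outward, not how any particular database implements it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def expand_context(query_vec, chunk_vecs, links, hops=1):
    """Return the best-matching chunk followed by chunks reachable within `hops` links.

    chunk_vecs: dict chunk id -> embedding (hypothetical data layout)
    links:      dict chunk id -> list of linked chunk ids (stands in for graph edges)
    """
    # Step 1: vector search picks the single most similar chunk.
    best = max(chunk_vecs, key=lambda c: cosine_similarity(query_vec, chunk_vecs[c]))
    # Step 2: breadth-first expansion over the links, up to `hops` levels deep.
    context = [best]
    frontier = [best]
    for _ in range(hops):
        next_frontier = []
        for chunk in frontier:
            for linked in links.get(chunk, []):
                if linked not in context:
                    context.append(linked)
                    next_frontier.append(linked)
        frontier = next_frontier
    return context
```

Raising `hops` widens the context window at the cost of pulling in less relevant chunks, which is the trade-off a variable-length pattern in a graph query would expose.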

acquamarin · 2024-03-18T15:25:04Z

@Btibert3 May I know which vector operations you are most interested in, so that we can prioritize implementing those?

Btibert3 · 2024-03-18T16:33:35Z

Sure thing.

Neo4j supports Euclidean and cosine: https://neo4j.com/docs/cypher-manual/current/indexes/semantic-indexes/vector-indexes/#indexes-vector-similarity
DuckDB supports dot product: https://duckdb.org/docs/test/functions/nested.html#list-functions
Pinecone supports all three of the above: https://docs.pinecone.io/docs/indexes#distance-metrics

In short I believe that you can go pretty far with those three.
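For reference, the three metrics can be written in a few lines of plain Python. The function names below are chosen for illustration only and are not the names used by any of the systems linked above.

```python
import math


def _check_lengths(a, b):
    # Fixed-length vectors must agree in dimension for any of these metrics.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    _check_lengths(a, b)
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """L2 distance; smaller means more similar."""
    _check_lengths(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product of the two vectors after normalizing each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product(a, b) / (norm_a * norm_b)
```

Note that for unit-length vectors the dot product equals the cosine similarity, so the three metrics differ mainly in how they treat vector magnitude.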

prrao87 · 2024-03-18T16:55:52Z

@acquamarin I'd start with cosine and then extend to Euclidean (L2) and then finally dot product, in that order. Cosine seems to be the most common metric used for similarity search in general.

Btibert3 · 2024-03-18T17:26:48Z

If those are the two being considered out of the gate, I completely agree with cosine.

Btibert3 · 2024-03-20T00:13:06Z

Wow! Very impressed.

hpvd · 2024-03-22T10:10:52Z

One week from request to implementation? Just unbelievable :-D
Many thanks for your work!

semihsalihoglu-uw assigned acquamarin Mar 16, 2024

semihsalihoglu-uw added the feature, good-warm-up, and high-priority labels Mar 16, 2024

manh9203 mentioned this issue Mar 18, 2024 in "Rework FIXED_LIST" #3057 (merged)

acquamarin mentioned this issue Mar 19, 2024 in "Implement array functions" #3087 (merged)

acquamarin closed this as completed in #3087 Mar 19, 2024
