Implicit casting and resolving literal types for ARRAY when doing similarity search #3248

prrao87 · 2024-04-10T17:44:39Z

Consider this similarity search case, where I want to return the node whose vector property is nearest to a query vector provided by the user. I'll use cosine similarity to demonstrate this.

import os
import shutil
import kuzu

if os.path.exists("./db"):
    shutil.rmtree("./db")

# Create database
db = kuzu.Database("./db")
conn = kuzu.Connection(db)

# Define schema
conn.execute("CREATE NODE TABLE Item(id UINT64, item STRING, price DOUBLE, vector DOUBLE[2], PRIMARY KEY (id))")

# Add data
conn.execute("MERGE (a:Item {id: 1, item: 'apple', price: 2.0, vector: cast([3.1, 4.1], 'DOUBLE[2]')})")
conn.execute("MERGE (b:Item {id: 2, item: 'banana', price: 1.0, vector: cast([5.9, 26.5], 'DOUBLE[2]')})")

# Run similarity search
res = conn.execute("MATCH (a:Item) RETURN a.item, a.price, array_cosine_similarity(a.vector, cast([6.0, 25.0], 'DOUBLE[2]')) AS sim ORDER BY sim DESC")
while res.has_next():
    row = res.get_next()
    print(row)

The query vector [6.0, 25.0] is closest to the banana vector, so I want banana returned as the most similar node. After some massaging, I got the example code above to work, and it returns this:

['banana', 1.0, 0.9998642653091405]
['apple', 2.0, 0.9163829638139936]
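As a sanity check on those scores, here is a minimal pure-Python sketch of cosine similarity (dot product divided by the product of the two vector norms). The function name `cosine_similarity` and the `items` dict are just for illustration and are not part of Kùzu; the vectors are the ones from the example above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same data as in the example above
query = [6.0, 25.0]
items = {"apple": [3.1, 4.1], "banana": [5.9, 26.5]}

# Rank items by similarity to the query vector, highest first
ranked = sorted(
    ((name, cosine_similarity(vec, query)) for name, vec in items.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, sim in ranked:
    print(name, sim)
```

The ranking should put banana first, and the two values should agree with the `sim` column above up to floating-point rounding.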

Verbosity and scope for errors

The fixed-list/array type is currently inconvenient and error-prone to use for similarity search, because every list literal has to be cast explicitly.
It would be a lot easier if we could simply write this instead, without performing the cast:

# Add data
conn.execute("MERGE (a:Item {id: 1, item: 'apple', price: 2.0, vector: [3.1, 4.1]})")
conn.execute("MERGE (b:Item {id: 2, item: 'banana', price: 1.0, vector: [5.9, 26.5]})")

# Run similarity search
res = conn.execute("MATCH (a:Item) RETURN a.item, a.price, array_cosine_similarity(a.vector, [6.0, 25.0]) AS sim ORDER BY sim DESC")

Can this be incorporated without any breaking changes to other functionality?
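Until an implicit cast exists, one workaround on the client side is to generate the explicit cast from a Python list, so the array size never has to be typed by hand. This is only a sketch: `array_literal` is a hypothetical helper, not part of the kuzu package, and it assumes the values are finite numbers that can be formatted as floats.

```python
def array_literal(values, child_type="DOUBLE"):
    """Build an explicit cast expression such as cast([6.0, 25.0], 'DOUBLE[2]').

    Hypothetical helper: the array size is taken from len(values), and each
    value is forced through float() so only numeric text ends up in the query.
    """
    items = ", ".join(repr(float(v)) for v in values)
    return f"cast([{items}], '{child_type}[{len(values)}]')"


query_vector = [6.0, 25.0]
query = (
    "MATCH (a:Item) RETURN a.item, a.price, "
    f"array_cosine_similarity(a.vector, {array_literal(query_vector)}) AS sim "
    "ORDER BY sim DESC"
)
print(query)
```

The resulting string is the same query used in the working example above, and can be passed to `conn.execute(query)`.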


mxwli · 2024-04-10T17:50:13Z

Probably the best way to incorporate this would be to resolve list literals to ARRAYS and allow implicit casting from ARRAYS to LISTS.

andyfengHKU · 2024-04-10T17:58:59Z

> Probably the best way to incorporate this would be to resolve list literals to ARRAYS and allow implicit casting from ARRAYS to LISTS.

Yeah, let's just implement this casting rule. It should be straightforward to do.
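For readers unfamiliar with what such a rule involves, here is a toy Python model of coercing a list literal to a fixed-size array type. It is an illustration only and not Kùzu's binder code: `ArrayType` and `coerce_literal` are hypothetical names, and the sketch assumes the only checks needed are that the literal's length matches the declared size and that each element converts to the child type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayType:
    """Toy stand-in for a fixed-size array type such as DOUBLE[2]."""
    child: str
    size: int


def coerce_literal(values, target):
    """Implicitly cast a list literal to `target`, or raise TypeError.

    Hypothetical rule: the cast is allowed only when the literal has exactly
    `target.size` elements and the child type is one this sketch knows about.
    """
    if len(values) != target.size:
        raise TypeError(
            f"cannot cast list of length {len(values)} to {target.child}[{target.size}]"
        )
    if target.child != "DOUBLE":
        raise TypeError(f"unsupported child type in this sketch: {target.child}")
    # Tuples model the fixed size; each element is converted to a float
    return tuple(float(v) for v in values)


vector_type = ArrayType("DOUBLE", 2)
print(coerce_literal([6, 25], vector_type))
```

With a rule of this kind in place, a literal like `[6.0, 25.0]` could be passed directly where a `DOUBLE[2]` is expected, while a literal of the wrong length would still be rejected.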

andyfengHKU · 2024-04-28T14:49:56Z

Should be fixed in #3394

prrao87 added the feature, question, usability, and frontend labels Apr 10, 2024

prrao87 assigned andyfengHKU and acquamarin Apr 10, 2024

prrao87 assigned mxwli and unassigned andyfengHKU and acquamarin Apr 10, 2024

prrao87 removed the question and feature labels Apr 10, 2024

mxwli mentioned this issue Apr 24, 2024

Add Implicit Casting from List to Array #3375

Merged

andyfengHKU mentioned this issue Apr 28, 2024

Fix issue 3248 #3394

Merged

andyfengHKU closed this as completed Apr 28, 2024
