Backport PR #2608 to release/v1.7 for feat: add usearch #2627

Conversation

vdaas-ci
Collaborator

  • Implement an interface for an additional ANN algorithm, “usearch”.

Description

  • Implemented internal/core/algorithm/usearch/*.go.
  • Edited Makefile and hack/docker/gen/main.go.

Related Issue

Versions

  • Vald Version: v1.7.13
  • Go Version: v1.23.0
  • Rust Version: v1.80.0
  • Docker Version: v27.1.1
  • Kubernetes Version: v1.30.3
  • Helm Version: v3.15.3
  • NGT Version: v2.2.4
  • Faiss Version: v1.8.0

Checklist

Special notes for your reviewer

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced version management for the usearch library, allowing users to check the current version.
    • Added installation support for usearch on both Linux and macOS systems.
  • Enhancements

    • Integrated usearch installation into the Docker build process.
    • Added a comprehensive Go API for managing vector indices with the usearch library.
    • Provided flexible configuration options for the usearch algorithm, enhancing usability.
  • Bug Fixes

    • Improved error handling with a custom error type for search operations.
  • Tests

    • Implemented unit tests to validate the functionality of the Usearch interface.
  • Version Update

    • Updated USEARCH_VERSION to 2.15.1, indicating new features and improvements.

* feat: add usearch

* style: format code with Gofumpt and Prettier

This commit fixes the style issues introduced in 58baee9 according to the output
from Gofumpt and Prettier.

Details: #2608

* feat: impl usearch installation cmd for ci/base container

* style: format code with Gofumpt and Prettier

This commit fixes the style issues introduced in 938cc12 according to the output
from Gofumpt and Prettier.

Details: #2608

* add: multiple vector test

* fix: add ldconfig to Makefile

* refactor: convert switch to map

---------

Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Co-authored-by: Hiroto Funakoshi <hiroto.funakoshi.hiroto@gmail.com>
Co-authored-by: Kiichiro YUKAWA <kyukawa315@gmail.com>
Co-authored-by: Yusuke Kato <kpango@vdaas.org>

cloudflare-workers-and-pages bot commented Sep 13, 2024

Deploying vald with Cloudflare Pages

Latest commit: bb89ddf
Status: ✅  Deploy successful!
Preview URL: https://669f4acc.vald.pages.dev
Branch Preview URL: https://backport-release-v1-7-featur-4ma7.vald.pages.dev

Contributor

coderabbitai bot commented Sep 13, 2024

Walkthrough

The changes introduce the management and installation of the usearch library within the project. Key updates include the addition of a USEARCH_VERSION variable in the Makefile, new Makefile targets for versioning and installation, and the incorporation of the usearch library as a dependency in the Go module. Additionally, new files are created to define the API for the usearch library, implement functional options for configuration, and establish error handling. These changes enhance the project's capabilities related to the usearch library.

Changes

| File | Change Summary |
|---|---|
| Makefile | Added USEARCH_VERSION, .PHONY targets, and installation functionality for usearch. |
| dockers/ci/base/Dockerfile | Included make usearch/install to install the usearch component during the Docker build. |
| go.mod | Added dependency for github.com/unum-cloud/usearch/golang at a specific version. |
| hack/actions/gen/main.go | Introduced usearchVersionPath constant for version path management. |
| hack/docker/gen/main.go | Added usearchPreprocess for installation in Docker container setups. |
| internal/core/algorithm/usearch/option.go | Created functional options for configuring the usearch algorithm. |
| internal/core/algorithm/usearch/usearch.go | Implemented Go API for managing a vector index with various methods for index and vector operations. |
| internal/core/algorithm/usearch/usearch_test.go | Added unit tests for the Usearch functionality to validate search operations. |
| internal/errors/usearch.go | Defined UsearchError type for error handling specific to usearch operations. |
| versions/USEARCH_VERSION | Introduced version 2.15.1 for the usearch software. |

Possibly related PRs

  • feat: add usearch #2608: The changes in this PR directly relate to the management and installation of the usearch library in the Makefile, including the addition of the USEARCH_VERSION variable and the usearch/install target, which are also present in the main PR.

Suggested labels

actions/backport/release/v1.7, type/feature, area/internal, area/makefile, size/XL

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f1da340 and bb89ddf.

Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
Files selected for processing (10)
  • Makefile (3 hunks)
  • dockers/ci/base/Dockerfile (1 hunks)
  • go.mod (1 hunks)
  • hack/actions/gen/main.go (1 hunks)
  • hack/docker/gen/main.go (3 hunks)
  • internal/core/algorithm/usearch/option.go (1 hunks)
  • internal/core/algorithm/usearch/usearch.go (1 hunks)
  • internal/core/algorithm/usearch/usearch_test.go (1 hunks)
  • internal/errors/usearch.go (1 hunks)
  • versions/USEARCH_VERSION (1 hunks)
Additional comments not posted (41)
versions/USEARCH_VERSION (1)

1-1: LGTM!

The version number 2.15.1 is correctly specified for the usearch library.

internal/errors/usearch.go (2)

1-16: The license header is correctly formatted.

The Apache License 2.0 header is correctly formatted and contains the appropriate information.


17-32: The custom error type and related functions are correctly implemented.

The UsearchError type, NewUsearchError constructor function, and Error method are correctly implemented and follow Go conventions.

dockers/ci/base/Dockerfile (1)

123-123: LGTM! The changes align with the PR objectives.

The addition of the make usearch/install command to install the usearch library is consistent with the PR objective to implement an interface for the usearch algorithm. The installation command follows the same pattern as other libraries like ngt and faiss, indicating consistency in the installation process.

The use of bind mounts and cache mounts in the RUN instruction helps optimize the build process and ensures the installation occurs in the correct context. The removal of source files at the end of the instruction keeps the build environment clean.

internal/core/algorithm/usearch/option.go (8)

45-53: LGTM!

The WithIndexPath option function is correctly implemented. It sets the idxPath field of the usearch struct and returns an error if the provided path is empty.


56-74: LGTM!

The WithQuantizationType option function is correctly implemented. It sets the quantizationType field of the usearch struct and returns an error if the provided quantization type is unsupported.


77-99: LGTM!

The WithMetricType option function is correctly implemented. It sets the metricType field of the usearch struct and returns an error if the provided metric type is unsupported. The normalization of the metric type is a good practice to handle variations in the input.
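The normalization step mentioned here, together with the "convert switch to map" commit, suggests a lookup along the lines of the sketch below. The Metric type, its constants, the accepted spellings, and parseMetric are all hypothetical stand-ins; they are not the identifiers of the usearch Go binding or of option.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Metric is a hypothetical stand-in for the metric constants of the real
// binding.
type Metric int

const (
	L2sq Metric = iota
	Cosine
	InnerProduct
)

// metricByName replaces a switch statement with a map lookup. The accepted
// spellings are assumptions.
var metricByName = map[string]Metric{
	"l2sq":         L2sq,
	"cos":          Cosine,
	"cosine":       Cosine,
	"ip":           InnerProduct,
	"innerproduct": InnerProduct,
}

// normalizer strips separators so "inner-product" and "inner_product" map to
// the same key.
var normalizer = strings.NewReplacer("-", "", "_", "", " ", "")

// parseMetric lower-cases and normalizes the name, then looks it up.
func parseMetric(name string) (Metric, error) {
	key := normalizer.Replace(strings.ToLower(name))
	m, ok := metricByName[key]
	if !ok {
		return 0, fmt.Errorf("unsupported metric type: %q", name)
	}
	return m, nil
}

func main() {
	for _, name := range []string{"Cos", "inner-product", "L2_SQ", "hamming"} {
		m, err := parseMetric(name)
		fmt.Println(name, m, err)
	}
}
```

A map keeps the supported spellings in one place, so adding a metric is a one-line change rather than a new case branch.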


102-112: LGTM!

The WithDimension option function is correctly implemented. It sets the dimension field of the usearch struct and returns an error if the provided dimension is invalid.


115-124: LGTM!

The WithConnectivity option function is correctly implemented. It sets the connectivity field of the usearch struct and returns an error if the provided connectivity is negative.


127-136: LGTM!

The WithExpansionAdd option function is correctly implemented. It sets the expansionAdd field of the usearch struct and returns an error if the provided expansion add is negative.


139-148: LGTM!

The WithExpansionSearch option function is correctly implemented. It sets the expansionSearch field of the usearch struct and returns an error if the provided expansion search is negative.


151-156: LGTM!

The WithMulti option function is correctly implemented. It sets the multi field of the usearch struct.
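The option functions reviewed above follow the functional options pattern. A minimal sketch of that pattern is shown below, assuming a simplified struct with only two fields; the index type, the default dimension, the error messages, and the rule that a dimension must be positive are assumptions, and only the option names WithIndexPath and WithDimension are taken from the review.

```go
package main

import (
	"errors"
	"fmt"
)

// index is a simplified, hypothetical stand-in for the usearch struct.
type index struct {
	idxPath   string
	dimension int
}

// Option mutates the index during construction and may reject bad input.
type Option func(*index) error

// WithIndexPath sets the index path and rejects an empty path.
func WithIndexPath(path string) Option {
	return func(i *index) error {
		if path == "" {
			return errors.New("index path must not be empty")
		}
		i.idxPath = path
		return nil
	}
}

// WithDimension sets the vector dimension; treating non-positive values as
// invalid is an assumption.
func WithDimension(dim int) Option {
	return func(i *index) error {
		if dim <= 0 {
			return fmt.Errorf("invalid dimension: %d", dim)
		}
		i.dimension = dim
		return nil
	}
}

// newIndex applies each option in order and stops at the first error.
func newIndex(opts ...Option) (*index, error) {
	i := &index{dimension: 128} // hypothetical default
	for _, opt := range opts {
		if err := opt(i); err != nil {
			return nil, err
		}
	}
	return i, nil
}

func main() {
	idx, err := newIndex(WithIndexPath("/tmp/usearch.idx"), WithDimension(9))
	fmt.Println(idx, err)

	_, err = newIndex(WithDimension(-1))
	fmt.Println(err)
}
```

Because each option returns an error, invalid configuration is reported at construction time instead of surfacing later during Add or Search.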

internal/core/algorithm/usearch/usearch.go (10)

60-75: LGTM!

The usearch struct is well-defined and contains all the necessary fields to represent the USearch index and its configuration. The use of a mutex for synchronization is a good practice.


79-81: LGTM!

The New function correctly initializes a new USearch instance by calling the gen function with the appropriate arguments.


83-85: LGTM!

The Load function correctly loads an existing USearch index by calling the gen function with the appropriate arguments.


87-128: LGTM!

The gen function correctly generates a new USearch instance or loads an existing one based on the isLoad parameter. It applies the provided options to the usearch struct and creates a new index or loads an existing one accordingly. The error handling is appropriate, and the function returns the created usearch instance.


131-140: LGTM!

The SaveIndex method correctly saves the USearch index to storage. It ensures thread safety by acquiring a write lock on the mutex and calls the Save method of the index with the appropriate path. The error handling is appropriate, and the function returns an error if the save operation fails.


143-152: LGTM!

The SaveIndexWithPath method correctly saves the USearch index to the specified path. It ensures thread safety by acquiring a write lock on the mutex and calls the Save method of the index with the provided path. The error handling is appropriate, and the function returns an error if the save operation fails.


155-163: LGTM!

The GetIndicesSize method correctly returns the number of vectors in the index. It ensures thread safety by acquiring a lock on the mutex and calls the Len method of the index to get the size. The error handling is appropriate, and the function returns the size and any error that occurred.


166-178: LGTM!

The Add method correctly adds a vector to the index. It ensures thread safety by acquiring a lock on the mutex and validates the dimension of the vector before adding it to the index. The error handling is appropriate, and the function returns an error if the dimensions are inconsistent or if the add operation fails.


181-189: LGTM!

The Reserve method correctly reserves memory for a given number of vectors. It ensures thread safety by acquiring a lock on the mutex and calls the Reserve method of the index with the provided vectorCount argument. The error handling is appropriate, and the function returns an error if the reserve operation fails.


192-211: LGTM!

The Search method correctly performs a nearest neighbor search and returns the results. It ensures thread safety by acquiring a lock on the mutex and validates the dimension of the query vector before performing the search. The error handling is appropriate, and the function returns an error if the dimensions are incompatible or if the search operation fails. It handles the case when the search results are empty and returns an appropriate error. The conversion of the search results to the algorithm.SearchResult format is correct.

internal/core/algorithm/usearch/usearch_test.go (11)

150-177: LGTM!

This test case correctly verifies that searching for the same vector returns the ID of the inserted vector with distance 0.


178-205: LGTM!

This test case correctly verifies that searching for a nearby vector returns the ID of the inserted vector with distance 1.


206-246: LGTM!

This test case correctly verifies that searching with a limit returns the IDs of the nearest vectors up to the limit.


247-289: LGTM!

This test case correctly verifies that searching with a limit returns the IDs of the nearest vectors in the correct order.


290-315: LGTM!

This test case correctly verifies that searching with a lower dimension vector returns an error.


316-341: LGTM!

This test case correctly verifies that searching with a higher dimension vector returns an error.


64-67: LGTM!

This helper function correctly creates a unique temporary directory for the index using t.TempDir().


127-148: LGTM!

This helper function correctly creates a Usearch instance, inserts the provided vectors, and returns the instance for further testing.


117-126: LGTM!

This helper function correctly checks the search result against the expected result by comparing the error and the result.


52-61: LGTM!

This helper function correctly closes the Usearch instance after each test, handling the case where the instance is nil.


1-379: LGTM!

The test file usearch_test.go is well-structured and covers important scenarios for the Search method. The test cases and helper functions are correctly implemented and ensure the correctness of the Search method.

hack/actions/gen/main.go (1)

316-316: LGTM!

The addition of the usearchVersionPath constant is a straightforward change that introduces a new versioning path for the USEARCH component. This is likely relevant for managing dependencies or configurations related to USEARCH.

hack/docker/gen/main.go (3)

244-246: LGTM!

The addition of the usearchPreprocess constant is a valid change for integrating the usearch library into the project's build process.


649-649: LGTM!

Appending usearchPreprocess to the Preprocess field of the vald-ci-container configuration is a valid change for ensuring the usearch library is installed in the CI container.


667-668: LGTM!

Appending usearchPreprocess to the Preprocess field of the vald-dev-container configuration is a valid change for ensuring the usearch library is installed in the dev container.

Makefile (3)

88-88: LGTM!

The USEARCH_VERSION variable is defined correctly by reading from the versions/USEARCH_VERSION file, following the established pattern in the Makefile.


607-610: LGTM!

The version/usearch target is defined correctly to print the usearch version by echoing the USEARCH_VERSION variable, following the established pattern in the Makefile.


686-700: LGTM!

The usearch/install target correctly handles the installation of the usearch library for both Linux and macOS operating systems. It efficiently downloads the pre-built binaries from GitHub releases and performs the necessary OS-specific steps to install the library and header files.

go.mod (1)

383-383: Approve the addition of the new usearch dependency.

The inclusion of the github.com/unum-cloud/usearch/golang dependency extends the project's capabilities by integrating with the usearch library. This change appears to be a deliberate enhancement to the project's functionality.

To ensure a smooth integration, please verify that the new dependency is compatible with the existing codebase and doesn't introduce any conflicts or breaking changes. Consider running the following script to search for potential issues:

Review the search results to confirm that the usearch package is being used correctly and consistently throughout the codebase. If any issues are identified, please address them to maintain the project's stability and integrity.

Verification successful

Integration of usearch package is appropriate and well-implemented.

The github.com/unum-cloud/usearch/golang package is actively used in the codebase, with various functions and methods utilizing its capabilities. The presence of test cases and configuration scripts involving usearch indicates that it is well-integrated into the project's workflow. There are no immediate signs of issues or conflicts with the integration.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash

# Search for import statements of the usearch package
rg --type go 'github.com/unum-cloud/usearch/golang' -A 5 

# Search for function calls from the usearch package
rg --type go '[^(]*github.com/unum-cloud/usearch/golang\.\w+\(' -A 5

Length of output: 979


Script:

#!/bin/bash

# Search for any usage of the usearch package by looking for struct or interface usage
rg --type go 'usearch\.\w+' -A 5

# Search for any comments or TODOs related to the usearch package
rg --type go 'usearch' -A 5

Length of output: 19239


@vdaas-ci
Collaborator Author

[CHATOPS:HELP] ChatOps commands.

  • 🙆‍♀️ /approve - approve
  • 🍱 /format - format codes and add licenses
  • /gen-test - generate test codes
  • 🏷️ /label - add labels
  • 🔚 2️⃣ 🔚 /label actions/e2e-deploy - run E2E deploy & integration test

codecov bot commented Sep 13, 2024

Codecov Report

Attention: Patch coverage is 0% with 8 lines in your changes missing coverage. Please review.

Please upload report for BASE (release/v1.7@f1da340). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/errors/usearch.go | 0.00% | 5 Missing ⚠️ |
| hack/docker/gen/main.go | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@               Coverage Diff               @@
##             release/v1.7    #2627   +/-   ##
===============================================
  Coverage                ?   24.37%           
===============================================
  Files                   ?      532           
  Lines                   ?    45856           
  Branches                ?        0           
===============================================
  Hits                    ?    11176           
  Misses                  ?    33932           
  Partials                ?      748           


@kpango kpango closed this Sep 17, 2024
@kpango kpango deleted the backport/release/v1.7/feature/agent/implement-usearch branch September 17, 2024 03:13