initial implementation for hnsw #1955

Merged: 10 commits, Aug 17, 2022

@MXueguang MXueguang commented Aug 3, 2022

indexed with 25 threads and default HNSW config
indexing time 12:20:00
efS 10, QPS 716, MRR 0.2376
efS 100, QPS 332, MRR 0.3103
efS 1000, QPS 58, MRR 0.3275

indexed with 1 thread and default HNSW config
indexing time 17:26:00
efS 10, QPS 762, MRR 0.2199
efS 100, QPS 331, MRR 0.3060
efS 1000, QPS 62, MRR 0.3266
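
For reference when reading the MRR column: MRR is the mean, over queries, of the reciprocal rank of the first relevant hit. The following is a minimal Java sketch of that definition, not the evaluation code used for these runs. The class name `MrrSketch`, the cutoff parameter `k`, and the toy data are illustrative assumptions; the evaluation tool and rank cutoff behind the numbers above are not stated in this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Anserini: computes MRR over in-memory runs.
public class MrrSketch {
    // Reciprocal rank of the first relevant doc within the top k hits, or 0 if none.
    static double reciprocalRank(List<String> ranked, Set<String> relevant, int k) {
        int limit = Math.min(k, ranked.size());
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                return 1.0 / (i + 1); // ranks are 1-based
            }
        }
        return 0.0;
    }

    // Mean of the per-query reciprocal ranks; assumes averaging over the queries in `runs`.
    static double meanReciprocalRank(Map<String, List<String>> runs,
                                     Map<String, Set<String>> qrels, int k) {
        if (runs.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (Map.Entry<String, List<String>> e : runs.entrySet()) {
            Set<String> relevant = qrels.getOrDefault(e.getKey(), Set.of());
            sum += reciprocalRank(e.getValue(), relevant, k);
        }
        return sum / runs.size();
    }

    public static void main(String[] args) {
        // Toy data: q1 finds its relevant doc at rank 2, q2 finds none.
        Map<String, List<String>> runs = Map.of(
            "q1", List.of("d3", "d1", "d7"),
            "q2", List.of("d2", "d9", "d4"));
        Map<String, Set<String>> qrels = Map.of(
            "q1", Set.of("d1"),
            "q2", Set.of("d5"));
        System.out.println(meanReciprocalRank(runs, qrels, 10)); // (0.5 + 0.0) / 2
    }
}
```

A higher efS lets the search examine more candidates, which is consistent with MRR rising as QPS falls in the runs above.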

Multi-threaded search works well.

Indexing currently seems very slow (slower than indexing with ES).
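
To put the indexing times in perspective, the speedup from 1 to 25 threads can be worked out directly from the two numbers reported above. This is a small Java sketch that assumes the times are in `HH:MM:SS` format (an assumption; the format is not stated); the class name `IndexingSpeedup` is illustrative.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper: compares the two indexing times reported in this comment.
public class IndexingSpeedup {
    // Parse an "HH:MM:SS" string (assumed format) into a Duration.
    static Duration parse(String hms) {
        String[] parts = hms.split(":");
        return Duration.ofHours(Long.parseLong(parts[0]))
                .plusMinutes(Long.parseLong(parts[1]))
                .plusSeconds(Long.parseLong(parts[2]));
    }

    // Ratio of the baseline time to the parallel time.
    static double speedup(String baseline, String parallel) {
        return (double) parse(baseline).getSeconds() / parse(parallel).getSeconds();
    }

    public static void main(String[] args) {
        int threads = 25;
        double s = speedup("17:26:00", "12:20:00"); // 1 thread vs 25 threads
        System.out.printf(Locale.ROOT, "speedup: %.2fx%n", s);
        System.out.printf(Locale.ROOT, "parallel efficiency: %.1f%%%n", 100.0 * s / threads);
    }
}
```

Under that assumption, 25 threads give only about a 1.4x speedup over a single thread, which suggests that indexing is not scaling with the thread count in this initial implementation.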

@lintool lintool left a comment

Initial comments...

Review threads (outdated):
pom.xml
pom.xml
src/main/java/io/anserini/index/IndexVector.java

codecov-commenter commented Aug 6, 2022

Codecov Report

Merging #1955 (2b400e2) into master (cc6337c) will decrease coverage by 2.53%.
The diff coverage is 0.00%.

@@             Coverage Diff              @@
##             master    #1955      +/-   ##
============================================
- Coverage     60.14%   57.60%   -2.54%     
  Complexity     1061     1061              
============================================
  Files           178      187       +9     
  Lines          9903    10339     +436     
  Branches       1371     1422      +51     
============================================
  Hits           5956     5956              
- Misses         3441     3877     +436     
  Partials        506      506              
| Impacted Files | Coverage Δ |
|---|---|
| .../java/io/anserini/collection/VectorCollection.java | 0.00% <0.00%> (ø) |
| src/main/java/io/anserini/index/IndexArgs.java | 100.00% <ø> (ø) |
| ...c/main/java/io/anserini/index/IndexVectorArgs.java | 0.00% <0.00%> (ø) |
| .../java/io/anserini/index/IndexVectorCollection.java | 0.00% <0.00%> (ø) |
| ...index/generator/LuceneVectorDocumentGenerator.java | 0.00% <0.00%> (ø) |
| ...main/java/io/anserini/search/SearchVectorArgs.java | 0.00% <0.00%> (ø) |
| ...ava/io/anserini/search/SearchVectorCollection.java | 0.00% <0.00%> (ø) |
| ...io/anserini/search/query/VectorQueryGenerator.java | 0.00% <0.00%> (ø) |
| ...i/search/topicreader/JsonIntVectorTopicReader.java | 0.00% <0.00%> (ø) |
| ...earch/topicreader/JsonStringVectorTopicReader.java | 0.00% <0.00%> (ø) |

@MXueguang MXueguang marked this pull request as ready for review August 16, 2022 16:14
@MXueguang MXueguang requested a review from lintool August 16, 2022 16:14
@MXueguang MXueguang changed the title [WIP] initial implementation for hnsw initial implementation for hnsw Aug 17, 2022

@lintool lintool left a comment

Looks good for now, but we'll need to circle back to add tests later.

@MXueguang MXueguang merged commit 02fa99d into castorini:master Aug 17, 2022