Add documentation for k-NN Faiss SQfp16 #6249

Merged
merged 21 commits into from
Mar 29, 2024

naveentatikonda
Member

@naveentatikonda naveentatikonda commented Jan 24, 2024

Description

Add documentation for the new k-NN Faiss encoder SQfp16, which uses scalar quantization to convert 32-bit float vectors into 16-bit float values, reducing memory usage with minimal loss of precision. It also boosts overall performance by enabling SIMD support (the vector dimension must be a multiple of 8) on Linux and macOS.

Issues Resolved

Closes #5038

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@kolchfa-aws kolchfa-aws self-assigned this Jan 24, 2024
@kolchfa-aws kolchfa-aws added v2.12.0 release-notes PR: Include this PR in the automated release notes 4 - Doc review PR: Doc review in progress labels Jan 24, 2024
Collaborator

@kolchfa-aws kolchfa-aws left a comment

Thank you, @naveentatikonda! A couple of suggestions, and then we'll move this PR to editorial review.

_search-plugins/knn/knn-index.md (3 outdated review threads, resolved)
@naveentatikonda
Member Author

@kolchfa-aws Thanks for reviewing it. I have addressed your review comments.

Collaborator

@natebower natebower left a comment

@kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_search-plugins/knn/knn-index.md (6 outdated review threads, resolved)
@naveentatikonda
Member Author

@kolchfa-aws @natebower I need to make more changes to this existing documentation. Will address all the review comments and update it on Monday.

@naveentatikonda
Member Author

Unfortunately, we need to postpone this feature to 2.13 due to some build-related issues. @kolchfa-aws, can you please help update the labels on the PR and the GitHub issue? Thanks!
opensearch-project/opensearch-build#4386 (comment)

@kolchfa-aws kolchfa-aws added v2.13.0 and removed 4 - Doc review PR: Doc review in progress v2.12.0 labels Feb 6, 2024
@hdhalter hdhalter added the 3 - Tech review PR: Tech review in progress label Mar 4, 2024
@hdhalter
Contributor

@naveentatikonda - Has anything changed, or is this content good to go? Thanks!

@naveentatikonda
Member Author

> @naveentatikonda - Has anything changed, or is this content good to go? Thanks!

This documentation needs to be updated. I will make changes this week. Thanks!

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
@kolchfa-aws
Collaborator

@naveentatikonda I addressed your comments.

kolchfa-aws and others added 7 commits March 22, 2024 14:10
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
@jmazanec15
Member

@kolchfa-aws can this be merged?

@kolchfa-aws
Collaborator

@jmazanec15 @naveentatikonda requested a tech review on this PR. Once the tech review is done, we will do an editorial review and then we'll merge.

@naveentatikonda
Member Author

> @kolchfa-aws can this be merged?

@jmazanec15 I'm waiting for @vamshin to review this PR before moving it to editorial review

## Lucene byte vector

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount of storage space needed. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).

## SIMD optimization for the Faiss engine

Starting with version 2.13, the k-NN plugin supports [Single Instruction Multiple Data (SIMD)](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) processing if the underlying hardware supports SIMD instructions (AVX2 on x64 architecture and Neon on ARM64 architecture). SIMD is supported by default on Linux machines only for the Faiss engine. SIMD architecture helps boost the overall performance by improving indexing throughput and reducing search latency.
Member

> SIMD is supported by default on Linux machines only for the Faiss engine.

SIMD should be CPU architecture dependent right? Why do we say only Linux machine?

Member Author

Yes, SIMD is CPU-architecture dependent. But right now we are running into issues on Windows because of compiler limitations, so we support SIMD only on Linux and on macOS (for development only). That's why we explicitly call out that it works on Linux.

- You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. faiss has
- several encoder types, but the plugin currently only supports *flat* and *pq* encoding.
+ You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. Faiss has
+ several encoder types, but the plugin currently only supports `flat`, `pq`, and `sq` encoding.
Member

> Faiss has several encoder types, but the plugin currently only supports flat, pq, and sq encoding

Maybe: "The k-NN plugin currently supports the `flat`, `pq`, and `sq` encoders from the Faiss library"?

Member Author

ack


Parameter name | Required | Default | Updatable | Description
:--- | :--- | :-- | :--- | :---
`type` | false | `fp16` | false | The type of scalar quantization to be used to encode 32-bit float vectors into the corresponding type. As of OpenSearch 2.13, only the `fp16` encoder type is supported. For the `fp16` encoder, vector values must be in the [-65504.0, 65504.0] range.
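
For illustration, a mapping that selects this encoder might be assembled as follows. This is a minimal sketch in Python that only builds and prints a request body. The helper `sq_fp16_mapping`, the field name `my_vector`, the dimension of 8, and the `hnsw`/`l2` choices are assumptions made for the example, and the nesting under `method.parameters.encoder` should be checked against the k-NN index documentation before use.

```python
import json


def sq_fp16_mapping(field="my_vector", dimension=8):
    """Build an index body that uses the Faiss `sq` encoder (hypothetical helper)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    # A multiple of 8, as the PR description requires for SIMD.
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",          # assumed method for the example
                        "engine": "faiss",
                        "space_type": "l2",      # assumed space type
                        "parameters": {
                            "encoder": {
                                "name": "sq",
                                # `fp16` is the default and only supported type.
                                "parameters": {"type": "fp16"},
                            }
                        },
                    },
                }
            }
        },
    }


body = sq_fp16_mapping()
print(json.dumps(body, indent=2))
```

Because `type` defaults to `fp16`, the inner `parameters` object could presumably be omitted; it is spelled out here only to show where the parameter lives.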
Member

> For the fp16 encoder, vector values must be in the [-65504.0, 65504.0] range.

By default fp16 encoder expects vector values to be in the [-65504.0, 65504.0] range.

Also, let's add the above as a note and probably bold/highlight it.

Member Author

ack

Collaborator

We normally don't format sentences as a note in the parameter table.

Member Author

Got it. Shall we add a note about this in the Faiss scalar quantization section?

@@ -221,6 +322,8 @@ If you want to use less memory and index faster than HNSW, while maintaining sim

If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

If you want to reduce the memory requirements by a factor of 2 (with very minimal loss of search quality) or by a factor of 4 (with a significant drop in search quality), consider vector quantization. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).
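
The factor-of-2 and factor-of-4 figures follow from the number of bytes stored per vector value. The arithmetic can be sketched as below, assuming the factor-of-4 option stores one byte per dimension (as the byte quantizer mentioned in the review comments does). The helper `raw_vector_bytes` and the corpus size are hypothetical, and the estimate counts raw vector storage only, not graph or index overhead.

```python
def raw_vector_bytes(num_vectors, dimension, bytes_per_value):
    """Raw storage for the vector values alone (hypothetical helper)."""
    return num_vectors * dimension * bytes_per_value


# Illustrative corpus: one million 256-dimensional vectors.
n, d = 1_000_000, 256

fp32_bytes = raw_vector_bytes(n, d, 4)  # 32-bit floats, unquantized
fp16_bytes = raw_vector_bytes(n, d, 2)  # 16-bit floats
byte_bytes = raw_vector_bytes(n, d, 1)  # 8-bit integers

print(fp32_bytes // fp16_bytes)  # reduction factor for fp16
print(fp32_bytes // byte_bytes)  # reduction factor for byte vectors
```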
Member

You can reduce the memory footprint by a factor of 2, with minimal loss in search quality, by using the `fp16` encoder (provide link?). If your vector values fit in the byte range [-128, 128], we recommend using the byte quantizer (provide link?) to cut the memory footprint by a factor of 4.

Member Author

ack

Collaborator

The byte range is [-128, 127], correct?

Member Author

Yes, the byte range is [-128, 127].

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Member

@vamshin vamshin left a comment

LGTM! Thanks

Collaborator

@natebower natebower left a comment

@naveentatikonda @kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_search-plugins/knn/knn-index.md (5 outdated review threads, resolved)

Optionally, you can specify the parameters in `method.parameters.encoder`. For more information about parameters within the `encoder` object, see [SQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#sq-parameters).

The `fp16` encoder converts 32-bit vectors into their 16-bit counterparts. For this encoder type, the vector values must be in the [-65504.0, 65504.0] range. To define handling out-of-range values, the preceding request specifies the `clip` parameter. By default, this parameter is `false` and any vectors containing out-of-range values are rejected. When `clip` is set to `true` (as in the preceding request), out-of-range vector values are rounded up or down so that they are in the supported range. For example, if the original 32-bit vector is `[65510.82, -65504.1]`, the vector will be indexed as a 16-bit vector `[65504.0, -65504.0]`.
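
The `clip` semantics described above can be mimicked in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the plugin's or Faiss's implementation. The function name `to_fp16` is hypothetical, and the conversion uses the standard library's half-precision `struct` format rather than anything from OpenSearch.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(values, clip=False):
    """Mimic the documented fp16 range check and clipping (hypothetical helper)."""
    result = []
    for v in values:
        if abs(v) > FP16_MAX:
            if not clip:
                # clip=false: vectors with out-of-range values are rejected.
                raise ValueError(f"{v} is outside [-{FP16_MAX}, {FP16_MAX}]")
            # clip=true: pull the value back to the nearest supported bound.
            v = max(-FP16_MAX, min(FP16_MAX, v))
        # Round-trip through binary16 ('e') to apply half-precision rounding.
        result.append(struct.unpack("<e", struct.pack("<e", v))[0])
    return result


print(to_fp16([65510.82, -65504.1], clip=True))  # [65504.0, -65504.0]
```

Calling `to_fp16([65510.82, -65504.1])` without `clip=True` raises `ValueError`, which corresponds to the default behavior of rejecting the vector.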
Collaborator

What do we mean by "To define handling"?

Collaborator

Reworded.

_search-plugins/knn/knn-vector-quantization.md (4 outdated review threads, resolved)
@hdhalter hdhalter added 5 - Editorial review PR: Editorial review in progress and removed 4 - Doc review PR: Doc review in progress labels Mar 29, 2024
kolchfa-aws and others added 2 commits March 29, 2024 10:54
Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
@kolchfa-aws kolchfa-aws merged commit 5d9edcb into opensearch-project:main Mar 29, 2024
3 checks passed
@hdhalter hdhalter added 3 - Done Issue is done/complete and removed 5 - Editorial review PR: Editorial review in progress labels Mar 29, 2024
Labels
3 - Done Issue is done/complete release-notes PR: Include this PR in the automated release notes v2.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DOC] Faiss Scalar Quantization FP16 (SQfp16)
7 participants