Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial reranker integration in RedisVL #139

Merged
merged 15 commits into from
May 1, 2024
Merged

Initial reranker integration in RedisVL #139

merged 15 commits into from
May 1, 2024

Conversation

tylerhutcherson
Copy link
Collaborator

@tylerhutcherson tylerhutcherson commented Apr 17, 2024

This PR introduces another utility class into RedisVL to improve the querying experience, and result quality, through integration with 3rd party rerankers. We have started with Cohere's reranking API and will also add additional options later on.

TODO (with Cohere):

  • Confirm proper Cohere API param usage. (Clarify best way to use max_chunks_per_doc)
  • Confirm backwards compatibility between Cohere v2 and v3. Best way to add this here?
  • Confirm public method names.
  • Finalize doc strings and examples.
  • Add to docs.
  • Add initial testing suite.

@tylerhutcherson tylerhutcherson added the enhancement New feature or request label Apr 17, 2024
@codecov-commenter
Copy link

codecov-commenter commented Apr 17, 2024

Codecov Report

Attention: Patch coverage is 92.50000% with 12 lines in your changes are missing coverage. Please review.

Project coverage is 88.60%. Comparing base (727d6dc) to head (c6a9a96).

Files Patch % Lines
redisvl/utils/rerank/cohere.py 90.56% 5 Missing ⚠️
redisvl/utils/rerank/base.py 84.61% 4 Missing ⚠️
redisvl/utils/vectorize/text/azureopenai.py 84.61% 2 Missing ⚠️
redisvl/utils/vectorize/base.py 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #139      +/-   ##
==========================================
+ Coverage   88.43%   88.60%   +0.16%     
==========================================
  Files          26       29       +3     
  Lines        1565     1667     +102     
==========================================
+ Hits         1384     1477      +93     
- Misses        181      190       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@nabsabraham nabsabraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some suggestions on what the documents need to look like to pass to the rerank call. TLDR:

  • If we want, we are free to send rerank json docs (ie, list of dicts) but we just need to make sure to pass rerank a list of rank_fields that are the keys in the json docs
  • If we want to instead handle the slicing of json fields on the client side, we can do so and then pass rerank a list of strings and then omit passing rank_fields
  • We should default to rerank-english-v3.0

Apologies if I'm missing some of the requirements we discussed a few months ago regarding the integration of the new reranker in redis.

redisvl/utils/rerank/cohere.py Outdated Show resolved Hide resolved
redisvl/utils/rerank/cohere.py Outdated Show resolved Hide resolved
redisvl/utils/rerank/cohere.py Outdated Show resolved Hide resolved
@tylerhutcherson
Copy link
Collaborator Author

left some suggestions on what the documents need to look like to pass to the rerank call. TLDR:

  • If we want, we are free to send rerank json docs (ie, list of dicts) but we just need to make sure to pass rerank a list of rank_fields that are the keys in the json docs
  • If we want to instead handle the slicing of json fields on the client side, we can do so and then pass rerank a list of strings and then omit passing rank_fields
  • We should default to rerank-english-v3.0

Apologies if I'm missing some of the requirements we discussed a few months ago regarding the integration of the new reranker in redis.

Great feedback. I definitely missed the API spec update for v3. My concern with moving to rank_fields was backwards compatibility and the right way to do that. But I think I can updated rank_by to be a list of strings. That way it can be flexible across different reranking implementations. I will work on an update here shortly, and tag you again :) thanks!

Copy link
Contributor

@nabsabraham nabsabraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very clean! lgtm 🎉 saw that the tests are not running because of the api_key but if they all run as expected, test_bad_input addresses the minor comment I left :)

redisvl/utils/rerank/cohere.py Show resolved Hide resolved
tests/integration/test_rerankers.py Outdated Show resolved Hide resolved
@tylerhutcherson tylerhutcherson merged commit 4cd53d0 into main May 1, 2024
20 checks passed
@tylerhutcherson tylerhutcherson deleted the reranker branch May 1, 2024 13:55
justin-cechmanek pushed a commit that referenced this pull request May 6, 2024
This PR introduces another utility class into RedisVL to improve the
querying experience, and result quality, through integration with 3rd
party rerankers. We have started with Cohere's reranking API and will
also add additional options later on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants