Add ability to clear ALL data associated with an index #179

Merged
merged 7 commits into main from feat/RAAE-172/clear-data-from-index
Jul 9, 2024

Conversation

tylerhutcherson
Collaborator

This PR introduces the clear() method to the core SearchIndex class in RedisVL. It clears out all data associated with an index while leaving the index itself in place.

Useful for manual cache eviction, manual session clearing, and more.

This PR also updates the extension classes to use the new clear() method as opposed to the SCAN ITER approach.
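To illustrate the behavior described above, here is a minimal sketch of "clear the data, keep the index" using paginated batch deletion. It is not the RedisVL implementation: `InMemoryIndex`, its `schema` attribute, and the returned count are hypothetical stand-ins so the example runs without a Redis server.

```python
from typing import Dict, Iterator, List


class InMemoryIndex:
    """Hypothetical stand-in for a search index (not the RedisVL class)."""

    def __init__(self, name: str, prefix: str) -> None:
        self.name = name
        self.prefix = prefix
        self.schema = {"index": name, "prefix": prefix}  # survives clear()
        self._docs: Dict[str, dict] = {}

    def load(self, docs: List[dict]) -> None:
        """Store each document under a prefixed key."""
        for i, doc in enumerate(docs, start=len(self._docs)):
            self._docs[f"{self.prefix}:{i}"] = doc

    def paginate(self, page_size: int) -> Iterator[List[str]]:
        """Yield document ids in pages of at most page_size."""
        ids = list(self._docs)  # snapshot, so deleting while paging is safe
        for start in range(0, len(ids), page_size):
            yield ids[start:start + page_size]

    def clear(self) -> int:
        """Delete every document in batches; the index definition is untouched."""
        total_deleted = 0
        for batch in self.paginate(page_size=500):
            for key in batch:
                del self._docs[key]
            total_deleted += len(batch)
        return total_deleted


index = InMemoryIndex(name="cache", prefix="doc")
index.load([{"text": f"entry {i}"} for i in range(1200)])
deleted = index.clear()
print(deleted, len(index._docs), index.schema["index"])
```

After `clear()` the sketch holds no documents, but `name` and `schema` are unchanged, so new data can be loaded straight away.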

@tylerhutcherson tylerhutcherson added the enhancement New feature or request label Jul 8, 2024
Collaborator

@bsbodden bsbodden left a comment

LGTM, left a couple of comments

Collaborator

Maybe we should let users control the batching parameter, but default it to 500?

Collaborator Author

I think if someone's goal is just to clear all of the data from an index, we should implement the best practice under the hood and not make the user worry about it. But it certainly wouldn't be hard to add an optional arg in the future if folks need it!
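For reference, the optional argument under discussion could look roughly like the sketch below. It assumes a `delete` callable that accepts keys as positional arguments and returns the number removed, in the style of redis-py's `delete(*names)`. The helper name `delete_in_batches` and the `batch_size` parameter are hypothetical and are not part of this PR.

```python
from typing import Callable, Iterable, List


def delete_in_batches(
    keys: Iterable[str],
    delete: Callable[..., int],
    batch_size: int = 500,  # hypothetical optional arg; 500 is the value hard-coded in the diff
) -> int:
    """Delete keys in chunks of batch_size and return the total removed."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")

    total_deleted = 0
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == batch_size:
            total_deleted += delete(*batch)
            batch = []
    if batch:  # flush the final partial batch
        total_deleted += delete(*batch)
    return total_deleted


# Usage with an in-memory stand-in for the Redis client's delete call.
store = {f"doc:{i}" for i in range(1200)}


def fake_delete(*names: str) -> int:
    removed = store.intersection(names)
    store.difference_update(removed)
    return len(removed)


print(delete_in_batches(sorted(store), fake_delete))
```

Callers who never pass `batch_size` get the built-in default, which matches the "best practice under the hood" position in the reply above.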


# Paginate using queries and delete in batches
for batch in self.paginate(
    FilterQuery(FilterExpression("*"), return_fields=["id"]), page_size=500

Collaborator

Unlikely scenario, but what if I want to get (some of) the data out before destroying the index, e.g. by making return_fields configurable as well?

Collaborator Author

I agree that someone might want to be able to export data from an index. However, I think that should be its own separate feature. For example, an index.export(file_path="file.json") method or something similar?

I think clear() should just be a simple method for clearing data out of your index.
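A rough sketch of what such a separate export step might look like is shown below. No `export` method exists in this PR. The function name `export_records`, its signature, and the JSON layout are assumptions made for illustration only.

```python
import json
from pathlib import Path
from typing import Iterable, Mapping


def export_records(records: Iterable[Mapping[str, object]], file_path: str) -> int:
    """Write records to a JSON file and return how many were written.

    Hypothetical helper: in practice the records would come from a paginated
    query over the index, and clear() would only be called afterwards.
    """
    rows = [dict(record) for record in records]
    Path(file_path).write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return len(rows)
```

Keeping export as its own step means clear() needs no extra parameters, and callers who do want a backup can run the export first and clear only once it has succeeded.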

@tylerhutcherson tylerhutcherson merged commit 9ca93ef into main Jul 9, 2024
20 checks passed
@tylerhutcherson tylerhutcherson deleted the feat/RAAE-172/clear-data-from-index branch July 9, 2024 15:15
Labels
enhancement New feature or request
2 participants