diff --git a/CHANGELOG.md b/CHANGELOG.md index e8e5e6d7b..fd7182e65 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -24,4 +24,5 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), ### Infrastructure ### Documentation ### Maintenance +* Remove benchmarks folder from k-NN repo [#2127](https://github.com/opensearch-project/k-NN/pull/2127) ### Refactoring diff --git a/benchmarks/README.md b/benchmarks/README.md new file mode 100644 index 000000000..2e642d41b --- /dev/null +++ b/benchmarks/README.md @@ -0,0 +1,4 @@ +## Benchmark Folder Tools Deprecated +All benchmark workloads have been moved to [OpenSearch Benchmark Workloads](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch). Please use the OSB tool to run the benchmarks. + +If you are still interested in using the old tool, the benchmarks have been moved to the [old-benchmarks branch](https://github.com/opensearch-project/k-NN/tree/old-benchmarks/benchmarks). diff --git a/benchmarks/osb/README.md b/benchmarks/osb/README.md deleted file mode 100644 index 0d0b05f8d..000000000 --- a/benchmarks/osb/README.md +++ /dev/null @@ -1,478 +0,0 @@ -# IMPORTANT NOTE: No new features will be added to this tool. This tool is currently in maintenance mode. All new features will be added to the [vector search workload](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch) -# OpenSearch Benchmarks for k-NN - -## Overview - -This directory contains code and configurations to run k-NN benchmarking -workloads using OpenSearch Benchmarks. - -The [extensions](extensions) directory contains common code shared between -procedures. The [procedures](procedures) directory contains the individual -test procedures for this workload. - -## Getting Started - -### OpenSearch Benchmarks Background - -OpenSearch Benchmark is a framework for performance benchmarking an OpenSearch -cluster. For more details, check out their -[repo](https://github.com/opensearch-project/opensearch-benchmark/). - -Before getting into the benchmarks, it is helpful to know a few terms: -1. Workload - Top-level description of a benchmark suite. A workload will have a `workload.json` file that defines the different components of the tests -2. Test Procedures - A workload can have a schedule of operations that run the test. However, a workload can also have several test procedures that define their own schedule of operations. This is helpful for sharing code between tests -3. Operation - An action against the OpenSearch cluster -4. Parameter source - Producer of parameters for OpenSearch operations -5. Runners - Code that will actually execute the OpenSearch operations - -### Setup - -OpenSearch Benchmarks requires Python 3.8 or greater to be installed. One of -the easier ways to do this is through Conda, a package and environment -management system for Python. - -First, follow the -[installation instructions](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) -to install Conda on your system. - -Next, create a Python 3.8 environment: -``` -conda create -n knn-osb python=3.8 -``` - -After the environment is created, activate it: -``` -source activate knn-osb -``` - -Lastly, clone the k-NN repo and install all required Python packages: -``` -git clone https://github.com/opensearch-project/k-NN.git -cd k-NN/benchmarks/osb -pip install -r requirements.txt -``` - -After all of this completes, you should be ready to run your first benchmark!
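Before moving on, it can help to confirm that the machine you will run benchmarks from can actually reach your cluster, since the next section assumes it can. Below is a minimal sketch using the `opensearch-py` client pulled in by `requirements.txt`; the host, port, and security settings are placeholders that you will need to adjust for your own endpoint:

```python
from opensearchpy import OpenSearch

# Placeholder endpoint -- substitute your cluster's URL and PORT
client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],
    use_ssl=False,
)

# Quick reachability check: prints cluster status, node count, etc.
print(client.cluster.health())
```

If this call fails, sort out connectivity (network access, security plugin settings) before attempting a benchmark run.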
- -### Running a benchmark - -Before running a benchmark, make sure you have the endpoint of your cluster and - that the machine you are running the benchmarks from can access it. - Additionally, ensure that all data has been pulled to the client. - -Currently, we support two test procedures for the k-NN workload: train-test and -no-train-test. The train-test procedure includes steps to train a model in its -schedule, while no-train-test does not. Both test procedures will index a data set -of vectors into an OpenSearch index and then run a set of queries against them. - -Once you have decided which test procedure you want to use, open up -[params/train-params.json](params/train-params.json) or -[params/no-train-params.json](params/no-train-params.json) and -fill out the parameters. Notice that at the bottom of `no-train-params.json` there -are several parameters that relate to training. Ignore these. They need to be -defined for the workload but are not used. - -Once the parameters are set, set the URL and PORT of your cluster and run the -following command to execute the test procedure. - -``` -export URL= -export PORT= -export PARAMS_FILE= -export PROCEDURE={no-train-test | train-test} - -opensearch-benchmark execute_test \ - --target-hosts $URL:$PORT \ - --workload-path ./workload.json \ - --workload-params ${PARAMS_FILE} \ - --test-procedure=${PROCEDURE} \ - --pipeline benchmark-only -``` - -## Current Procedures - -### No Train Test - -The No Train Test procedure is used to test `knn_vector` indices that do not -use an algorithm that requires training. - -#### Workflow - -1. Delete old resources in the cluster if they are present -2. Create an OpenSearch index with `knn_vector` configured to use the HNSW algorithm -3. Wait for cluster to be green -4. Ingest data set into the cluster -5. Refresh the index -6. 
Run queries from data set against the cluster - -#### Parameters - -| Name | Description | -|-----------------------------------------|--------------------------------------------------------------------------| -| target_index_name | Name of index to add vectors to | -| target_field_name | Name of field to add vectors to | -| target_index_body | Path to target index definition | -| target_index_primary_shards | Target index primary shards | -| target_index_replica_shards | Target index replica shards | -| target_index_dimension | Dimension of target index | -| target_index_space_type | Target index space type | -| target_index_bulk_size | Target index bulk size | -| target_index_bulk_index_data_set_format | Format of vector data set | -| target_index_bulk_index_data_set_path | Path to vector data set | -| target_index_bulk_index_clients | Clients to be used for bulk ingestion (must be divisor of data set size) | -| target_index_max_num_segments | Number of segments to merge target index down to before beginning search | -| target_index_force_merge_timeout | Timeout for of force merge requests in seconds | -| hnsw_ef_search | HNSW ef search parameter | -| hnsw_ef_construction | HNSW ef construction parameter | -| hnsw_m | HNSW m parameter | -| query_k | The number of neighbors to return for the search | -| query_clients | Number of clients to use for running queries | -| query_data_set_format | Format of vector data set for queries | -| query_data_set_path | Path to vector data set for queries | - -#### Metrics - -The result metrics of this procedure will look like: -``` ------------------------------------------------------- - _______ __ _____ - / ____(_)___ ____ _/ / / ___/_________ ________ - / /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \ - / __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/ -/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/ ------------------------------------------------------- - -| Metric | Task | Value | Unit | -|---------------------------------------------------------------:|------------------------:|------------:|-------:| -| Cumulative indexing time of primary shards | | 1.82885 | min | -| Min cumulative indexing time across primary shards | | 0.4121 | min | -| Median cumulative indexing time across primary shards | | 0.559617 | min | -| Max cumulative indexing time across primary shards | | 0.857133 | min | -| Cumulative indexing throttle time of primary shards | | 0 | min | -| Min cumulative indexing throttle time across primary shards | | 0 | min | -| Median cumulative indexing throttle time across primary shards | | 0 | min | -| Max cumulative indexing throttle time across primary shards | | 0 | min | -| Cumulative merge time of primary shards | | 5.89065 | min | -| Cumulative merge count of primary shards | | 3 | | -| Min cumulative merge time across primary shards | | 1.95945 | min | -| Median cumulative merge time across primary shards | | 1.96345 | min | -| Max cumulative merge time across primary shards | | 1.96775 | min | -| Cumulative merge throttle time of primary shards | | 0 | min | -| Min cumulative merge throttle time across primary shards | | 0 | min | -| Median cumulative merge throttle time across primary shards | | 0 | min | -| Max cumulative merge throttle time across primary shards | | 0 | min | -| Cumulative refresh time of primary shards | | 8.52517 | min | -| Cumulative refresh count of primary shards | | 29 | | -| Min cumulative refresh time across primary shards | | 2.64265 | min | -| Median cumulative refresh time across primary shards 
| | 2.93913 | min | -| Max cumulative refresh time across primary shards | | 2.94338 | min | -| Cumulative flush time of primary shards | | 0.00221667 | min | -| Cumulative flush count of primary shards | | 3 | | -| Min cumulative flush time across primary shards | | 0.000733333 | min | -| Median cumulative flush time across primary shards | | 0.000733333 | min | -| Max cumulative flush time across primary shards | | 0.00075 | min | -| Total Young Gen GC time | | 0.318 | s | -| Total Young Gen GC count | | 2 | | -| Total Old Gen GC time | | 0 | s | -| Total Old Gen GC count | | 0 | | -| Store size | | 1.43566 | GB | -| Translog size | | 1.53668e-07 | GB | -| Heap used for segments | | 0.00410843 | MB | -| Heap used for doc values | | 0.000286102 | MB | -| Heap used for terms | | 0.00121307 | MB | -| Heap used for norms | | 0 | MB | -| Heap used for points | | 0 | MB | -| Heap used for stored fields | | 0.00260925 | MB | -| Segment count | | 3 | | -| Min Throughput | custom-vector-bulk | 38005.8 | docs/s | -| Mean Throughput | custom-vector-bulk | 44827.9 | docs/s | -| Median Throughput | custom-vector-bulk | 40507.2 | docs/s | -| Max Throughput | custom-vector-bulk | 88967.8 | docs/s | -| 50th percentile latency | custom-vector-bulk | 29.5857 | ms | -| 90th percentile latency | custom-vector-bulk | 49.0719 | ms | -| 99th percentile latency | custom-vector-bulk | 72.6138 | ms | -| 99.9th percentile latency | custom-vector-bulk | 279.826 | ms | -| 100th percentile latency | custom-vector-bulk | 15688 | ms | -| 50th percentile service time | custom-vector-bulk | 29.5857 | ms | -| 90th percentile service time | custom-vector-bulk | 49.0719 | ms | -| 99th percentile service time | custom-vector-bulk | 72.6138 | ms | -| 99.9th percentile service time | custom-vector-bulk | 279.826 | ms | -| 100th percentile service time | custom-vector-bulk | 15688 | ms | -| error rate | custom-vector-bulk | 0 | % | -| Min Throughput | refresh-target-index | 0.01 | ops/s | -| Mean Throughput | refresh-target-index | 0.01 | ops/s | -| Median Throughput | refresh-target-index | 0.01 | ops/s | -| Max Throughput | refresh-target-index | 0.01 | ops/s | -| 100th percentile latency | refresh-target-index | 176610 | ms | -| 100th percentile service time | refresh-target-index | 176610 | ms | -| error rate | refresh-target-index | 0 | % | -| Min Throughput | knn-query-from-data-set | 444.17 | ops/s | -| Mean Throughput | knn-query-from-data-set | 601.68 | ops/s | -| Median Throughput | knn-query-from-data-set | 621.19 | ops/s | -| Max Throughput | knn-query-from-data-set | 631.23 | ops/s | -| 50th percentile latency | knn-query-from-data-set | 14.7612 | ms | -| 90th percentile latency | knn-query-from-data-set | 20.6954 | ms | -| 99th percentile latency | knn-query-from-data-set | 27.7499 | ms | -| 99.9th percentile latency | knn-query-from-data-set | 41.3506 | ms | -| 99.99th percentile latency | knn-query-from-data-set | 162.391 | ms | -| 100th percentile latency | knn-query-from-data-set | 162.756 | ms | -| 50th percentile service time | knn-query-from-data-set | 14.7612 | ms | -| 90th percentile service time | knn-query-from-data-set | 20.6954 | ms | -| 99th percentile service time | knn-query-from-data-set | 27.7499 | ms | -| 99.9th percentile service time | knn-query-from-data-set | 41.3506 | ms | -| 99.99th percentile service time | knn-query-from-data-set | 162.391 | ms | -| 100th percentile service time | knn-query-from-data-set | 162.756 | ms | -| error rate | knn-query-from-data-set | 0 | % | - - 
---------------------------------- -[INFO] SUCCESS (took 618 seconds) ---------------------------------- -``` - -### Train Test - -The Train Test procedure is used to test `knn_vector` indices that do use an -algorithm that requires training. - -#### Workflow - -1. Delete old resources in the cluster if they are present -2. Create an OpenSearch index with `knn_vector` configured to load with training data -3. Wait for cluster to be green -4. Ingest data set into the training index -5. Refresh the index -6. Train a model based on user provided input parameters -7. Create an OpenSearch index with `knn_vector` configured to use the model -8. Ingest vectors into the target index -9. Refresh the target index -10. Run queries from data set against the cluster - -#### Parameters - -| Name | Description | -|-----------------------------------------|--------------------------------------------------------------------------| -| target_index_name | Name of index to add vectors to | -| target_field_name | Name of field to add vectors to | -| target_index_body | Path to target index definition | -| target_index_primary_shards | Target index primary shards | -| target_index_replica_shards | Target index replica shards | -| target_index_dimension | Dimension of target index | -| target_index_space_type | Target index space type | -| target_index_bulk_size | Target index bulk size | -| target_index_bulk_index_data_set_format | Format of vector data set for ingestion | -| target_index_bulk_index_data_set_path | Path to vector data set for ingestion | -| target_index_bulk_index_clients | Clients to be used for bulk ingestion (must be divisor of data set size) | -| target_index_max_num_segments | Number of segments to merge target index down to before beginning search | -| target_index_force_merge_timeout | Timeout for of force merge requests in seconds | -| ivf_nlists | IVF nlist parameter | -| ivf_nprobes | IVF nprobe parameter | -| pq_code_size | PQ code_size parameter | -| pq_m | PQ m parameter | -| train_model_method | Method to be used for model (ivf or ivfpq) | -| train_model_id | Model ID | -| train_index_name | Name of index to put training data into | -| train_field_name | Name of field to put training data into | -| train_index_body | Path to train index definition | -| train_search_size | Search size to use when pulling training data | -| train_timeout | Timeout to wait for training to finish | -| train_index_primary_shards | Train index primary shards | -| train_index_replica_shards | Train index replica shards | -| train_index_bulk_size | Train index bulk size | -| train_index_data_set_format | Format of vector data set for training | -| train_index_data_set_path | Path to vector data set for training | -| train_index_num_vectors | Number of vectors to use from vector data set for training | -| train_index_bulk_index_clients | Clients to be used for bulk ingestion (must be divisor of data set size) | -| query_k | The number of neighbors to return for the search | -| query_clients | Number of clients to use for running queries | -| query_data_set_format | Format of vector data set for queries | -| query_data_set_path | Path to vector data set for queries | - -#### Metrics - -The result metrics of this procedure will look like: -``` ------------------------------------------------------- - _______ __ _____ - / ____(_)___ ____ _/ / / ___/_________ ________ - / /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \ - / __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/ -/_/ /_/_/ /_/\__,_/_/ 
/____/\___/\____/_/ \___/ ------------------------------------------------------- - -| Metric | Task | Value | Unit | -|---------------------------------------------------------------:|------------------------:|-----------:|-----------------:| -| Cumulative indexing time of primary shards | | 2.92382 | min | -| Min cumulative indexing time across primary shards | | 0.42245 | min | -| Median cumulative indexing time across primary shards | | 0.43395 | min | -| Max cumulative indexing time across primary shards | | 1.63347 | min | -| Cumulative indexing throttle time of primary shards | | 0 | min | -| Min cumulative indexing throttle time across primary shards | | 0 | min | -| Median cumulative indexing throttle time across primary shards | | 0 | min | -| Max cumulative indexing throttle time across primary shards | | 0 | min | -| Cumulative merge time of primary shards | | 1.36293 | min | -| Cumulative merge count of primary shards | | 20 | | -| Min cumulative merge time across primary shards | | 0.263283 | min | -| Median cumulative merge time across primary shards | | 0.291733 | min | -| Max cumulative merge time across primary shards | | 0.516183 | min | -| Cumulative merge throttle time of primary shards | | 0.701683 | min | -| Min cumulative merge throttle time across primary shards | | 0.163883 | min | -| Median cumulative merge throttle time across primary shards | | 0.175717 | min | -| Max cumulative merge throttle time across primary shards | | 0.186367 | min | -| Cumulative refresh time of primary shards | | 0.222217 | min | -| Cumulative refresh count of primary shards | | 67 | | -| Min cumulative refresh time across primary shards | | 0.03915 | min | -| Median cumulative refresh time across primary shards | | 0.039825 | min | -| Max cumulative refresh time across primary shards | | 0.103417 | min | -| Cumulative flush time of primary shards | | 0.0276833 | min | -| Cumulative flush count of primary shards | | 1 | | -| Min cumulative flush time across primary shards | | 0 | min | -| Median cumulative flush time across primary shards | | 0 | min | -| Max cumulative flush time across primary shards | | 0.0276833 | min | -| Total Young Gen GC time | | 0.074 | s | -| Total Young Gen GC count | | 8 | | -| Total Old Gen GC time | | 0 | s | -| Total Old Gen GC count | | 0 | | -| Store size | | 1.67839 | GB | -| Translog size | | 0.115145 | GB | -| Heap used for segments | | 0.0350914 | MB | -| Heap used for doc values | | 0.00771713 | MB | -| Heap used for terms | | 0.0101089 | MB | -| Heap used for norms | | 0 | MB | -| Heap used for points | | 0 | MB | -| Heap used for stored fields | | 0.0172653 | MB | -| Segment count | | 25 | | -| Min Throughput | delete-model | 25.45 | ops/s | -| Mean Throughput | delete-model | 25.45 | ops/s | -| Median Throughput | delete-model | 25.45 | ops/s | -| Max Throughput | delete-model | 25.45 | ops/s | -| 100th percentile latency | delete-model | 39.0409 | ms | -| 100th percentile service time | delete-model | 39.0409 | ms | -| error rate | delete-model | 0 | % | -| Min Throughput | train-vector-bulk | 49518.9 | docs/s | -| Mean Throughput | train-vector-bulk | 54418.8 | docs/s | -| Median Throughput | train-vector-bulk | 52984.2 | docs/s | -| Max Throughput | train-vector-bulk | 62118.3 | docs/s | -| 50th percentile latency | train-vector-bulk | 26.5293 | ms | -| 90th percentile latency | train-vector-bulk | 41.8212 | ms | -| 99th percentile latency | train-vector-bulk | 239.351 | ms | -| 99.9th percentile latency | train-vector-bulk | 348.507 | ms | 
-| 100th percentile latency | train-vector-bulk | 436.292 | ms | -| 50th percentile service time | train-vector-bulk | 26.5293 | ms | -| 90th percentile service time | train-vector-bulk | 41.8212 | ms | -| 99th percentile service time | train-vector-bulk | 239.351 | ms | -| 99.9th percentile service time | train-vector-bulk | 348.507 | ms | -| 100th percentile service time | train-vector-bulk | 436.292 | ms | -| error rate | train-vector-bulk | 0 | % | -| Min Throughput | refresh-train-index | 0.47 | ops/s | -| Mean Throughput | refresh-train-index | 0.47 | ops/s | -| Median Throughput | refresh-train-index | 0.47 | ops/s | -| Max Throughput | refresh-train-index | 0.47 | ops/s | -| 100th percentile latency | refresh-train-index | 2142.96 | ms | -| 100th percentile service time | refresh-train-index | 2142.96 | ms | -| error rate | refresh-train-index | 0 | % | -| Min Throughput | ivfpq-train-model | 0.01 | models_trained/s | -| Mean Throughput | ivfpq-train-model | 0.01 | models_trained/s | -| Median Throughput | ivfpq-train-model | 0.01 | models_trained/s | -| Max Throughput | ivfpq-train-model | 0.01 | models_trained/s | -| 100th percentile latency | ivfpq-train-model | 136563 | ms | -| 100th percentile service time | ivfpq-train-model | 136563 | ms | -| error rate | ivfpq-train-model | 0 | % | -| Min Throughput | custom-vector-bulk | 62384.8 | docs/s | -| Mean Throughput | custom-vector-bulk | 69035.2 | docs/s | -| Median Throughput | custom-vector-bulk | 68675.4 | docs/s | -| Max Throughput | custom-vector-bulk | 80713.4 | docs/s | -| 50th percentile latency | custom-vector-bulk | 18.7726 | ms | -| 90th percentile latency | custom-vector-bulk | 34.8881 | ms | -| 99th percentile latency | custom-vector-bulk | 150.435 | ms | -| 99.9th percentile latency | custom-vector-bulk | 296.862 | ms | -| 100th percentile latency | custom-vector-bulk | 344.394 | ms | -| 50th percentile service time | custom-vector-bulk | 18.7726 | ms | -| 90th percentile service time | custom-vector-bulk | 34.8881 | ms | -| 99th percentile service time | custom-vector-bulk | 150.435 | ms | -| 99.9th percentile service time | custom-vector-bulk | 296.862 | ms | -| 100th percentile service time | custom-vector-bulk | 344.394 | ms | -| error rate | custom-vector-bulk | 0 | % | -| Min Throughput | refresh-target-index | 28.32 | ops/s | -| Mean Throughput | refresh-target-index | 28.32 | ops/s | -| Median Throughput | refresh-target-index | 28.32 | ops/s | -| Max Throughput | refresh-target-index | 28.32 | ops/s | -| 100th percentile latency | refresh-target-index | 34.9811 | ms | -| 100th percentile service time | refresh-target-index | 34.9811 | ms | -| error rate | refresh-target-index | 0 | % | -| Min Throughput | knn-query-from-data-set | 0.9 | ops/s | -| Mean Throughput | knn-query-from-data-set | 453.84 | ops/s | -| Median Throughput | knn-query-from-data-set | 554.15 | ops/s | -| Max Throughput | knn-query-from-data-set | 681 | ops/s | -| 50th percentile latency | knn-query-from-data-set | 11.7174 | ms | -| 90th percentile latency | knn-query-from-data-set | 15.4445 | ms | -| 99th percentile latency | knn-query-from-data-set | 21.0682 | ms | -| 99.9th percentile latency | knn-query-from-data-set | 39.5414 | ms | -| 99.99th percentile latency | knn-query-from-data-set | 1116.33 | ms | -| 100th percentile latency | knn-query-from-data-set | 1116.66 | ms | -| 50th percentile service time | knn-query-from-data-set | 11.7174 | ms | -| 90th percentile service time | knn-query-from-data-set | 15.4445 | ms | -| 99th 
percentile service time | knn-query-from-data-set | 21.0682 | ms | -| 99.9th percentile service time | knn-query-from-data-set | 39.5414 | ms | -| 99.99th percentile service time | knn-query-from-data-set | 1116.33 | ms | -| 100th percentile service time | knn-query-from-data-set | 1116.66 | ms | -| error rate | knn-query-from-data-set | 0 | % | - - ---------------------------------- -[INFO] SUCCESS (took 281 seconds) ---------------------------------- -``` - -## Adding a procedure - -Adding additional benchmarks is very simple. First, place any custom parameter -sources or runners in the [extensions](extensions) directory so that other tests -can use them, and also update the [documentation](#custom-extensions) -accordingly. - -Next, create a new test procedure file and add the operations you want your test -to run. Lastly, be sure to update the documentation. - -## Custom Extensions - -OpenSearch Benchmarks is very extensible. To fit the plugin's needs, we add -custom parameter sources and custom runners. Parameter sources allow users to -supply custom parameters to an operation. Runners are what actually perform -the operations against OpenSearch. - -### Custom Parameter Sources - -Custom parameter sources are defined in [extensions/param_sources.py](extensions/param_sources.py). - -| Name | Description | Parameters | -|-------------------------|------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| bulk-from-data-set | Provides bulk payloads containing vectors from a data set for indexing | 1. data_set_format - (hdf5, bigann)<br>
2. data_set_path - path to data set
3. index - name of index for bulk ingestion
4. field - field to place vector in
5. bulk_size - vectors per bulk request
6. num_vectors - number of vectors to use from the data set. Defaults to the whole data set. | -| knn-query-from-data-set | Provides a query generated from a data set | 1. data_set_format - (hdf5, bigann)
2. data_set_path - path to data set
3. index - name of index to query against
4. field - field to query against<br>
5. k - number of results to return
6. dimension - size of vectors to produce
7. num_vectors - number of vectors to use from the data set. Defaults to the whole data set. | - - -### Custom Runners - -Custom runners are defined in [extensions/runners.py](extensions/runners.py). - -| Syntax | Description | Parameters | -|--------------------|-----------------------------------------------------|:-------------------------------------------------------------------------------------------------------------| -| custom-vector-bulk | Bulk index a set of vectors in an OpenSearch index. | 1. bulk-from-data-set | -| custom-refresh | Run refresh with retry capabilities. | 1. index - name of index to refresh
2. retries - number of times to retry the operation | -| train-model | Trains a model. | 1. body - model definition
2. timeout - time in seconds to wait for the model training to finish<br>
3. model_id - ID of model | -| delete-model | Deletes a model if it exists. | 1. model_id - ID of model | - -### Testing - -We have a set of unit tests for our extensions in -[tests](tests). To run all the tests, run the following -command: - -```commandline -python -m unittest discover ./tests -``` - -To run an individual test: -```commandline -python -m unittest tests.test_param_sources.VectorsFromDataSetParamSourceTestCase.test_partition_hdf5 -``` diff --git a/benchmarks/osb/__init__.py b/benchmarks/osb/__init__.py deleted file mode 100644 index e69de29bb..000000000 diff --git a/benchmarks/osb/extensions/__init__.py b/benchmarks/osb/extensions/__init__.py deleted file mode 100644 index e69de29bb..000000000 diff --git a/benchmarks/osb/extensions/data_set.py b/benchmarks/osb/extensions/data_set.py deleted file mode 100644 index 7e8058844..000000000 --- a/benchmarks/osb/extensions/data_set.py +++ /dev/null @@ -1,202 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -import os -import numpy as np -from abc import ABC, ABCMeta, abstractmethod -from enum import Enum -from typing import cast -import h5py -import struct - - -class Context(Enum): - """DataSet context enum. Can be used to add additional context for how a - data-set should be interpreted. - """ - INDEX = 1 - QUERY = 2 - NEIGHBORS = 3 - - -class DataSet(ABC): - """DataSet interface. Used for reading data-sets from files. - - Methods: - read: Read a chunk of data from the data-set - seek: Get to position in the data-set - size: Gets the number of items in the data-set - reset: Resets internal state of data-set to beginning - """ - __metaclass__ = ABCMeta - - BEGINNING = 0 - - @abstractmethod - def read(self, chunk_size: int): - pass - - @abstractmethod - def seek(self, offset: int): - pass - - @abstractmethod - def size(self): - pass - - @abstractmethod - def reset(self): - pass - - -class HDF5DataSet(DataSet): - """ Data-set format corresponding to `ANN Benchmarks - `_ - """ - - FORMAT_NAME = "hdf5" - - def __init__(self, dataset_path: str, context: Context): - file = h5py.File(dataset_path) - self.data = cast(h5py.Dataset, file[self.parse_context(context)]) - self.current = self.BEGINNING - - def read(self, chunk_size: int): - if self.current >= self.size(): - return None - - end_offset = self.current + chunk_size - if end_offset > self.size(): - end_offset = self.size() - - v = cast(np.ndarray, self.data[self.current:end_offset]) - self.current = end_offset - return v - - def seek(self, offset: int): - - if offset < self.BEGINNING: - raise Exception("Offset must be greater than or equal to 0") - - if offset >= self.size(): - raise Exception("Offset must be less than the data set size") - - self.current = offset - - def size(self): - return self.data.len() - - def reset(self): - self.current = self.BEGINNING - - @staticmethod - def parse_context(context: Context) -> str: - if context == Context.NEIGHBORS: - return "neighbors" - - if context == Context.INDEX: - return "train" - - if context == Context.QUERY: - return "test" - - raise Exception("Unsupported context") - - -class BigANNVectorDataSet(DataSet): - """ Data-set format for vector data-sets for `Big ANN Benchmarks - `_ - """ - - DATA_SET_HEADER_LENGTH = 8 - U8BIN_EXTENSION = "u8bin" - FBIN_EXTENSION = "fbin" - FORMAT_NAME = "bigann" - - BYTES_PER_U8INT = 1 - BYTES_PER_FLOAT = 4 - - def __init__(self, dataset_path: str): 
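- # The Big ANN binary layout assumed here: an 8-byte header holding two
- # little-endian uint32 values (vector count, then dimension), followed by
- # num_points * dimension values whose width depends on the file extension
- # (.u8bin = 1 byte per value, .fbin = 4 bytes; see _get_data_size below).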
- self.file = open(dataset_path, 'rb') - self.file.seek(BigANNVectorDataSet.BEGINNING, os.SEEK_END) - num_bytes = self.file.tell() - self.file.seek(BigANNVectorDataSet.BEGINNING) - - if num_bytes < BigANNVectorDataSet.DATA_SET_HEADER_LENGTH: - raise Exception("File is invalid") - - self.num_points = int.from_bytes(self.file.read(4), "little") - self.dimension = int.from_bytes(self.file.read(4), "little") - self.bytes_per_num = self._get_data_size(dataset_path) - - if (num_bytes - BigANNVectorDataSet.DATA_SET_HEADER_LENGTH) != self.num_points * \ - self.dimension * self.bytes_per_num: - raise Exception("File is invalid") - - self.reader = self._value_reader(dataset_path) - self.current = BigANNVectorDataSet.BEGINNING - - def read(self, chunk_size: int): - if self.current >= self.size(): - return None - - end_offset = self.current + chunk_size - if end_offset > self.size(): - end_offset = self.size() - - v = np.asarray([self._read_vector() for _ in - range(end_offset - self.current)]) - self.current = end_offset - return v - - def seek(self, offset: int): - - if offset < self.BEGINNING: - raise Exception("Offset must be greater than or equal to 0") - - if offset >= self.size(): - raise Exception("Offset must be less than the data set size") - - bytes_offset = BigANNVectorDataSet.DATA_SET_HEADER_LENGTH + \ - self.dimension * self.bytes_per_num * offset - self.file.seek(bytes_offset) - self.current = offset - - def _read_vector(self): - return np.asarray([self.reader(self.file) for _ in - range(self.dimension)]) - - def size(self): - return self.num_points - - def reset(self): - self.file.seek(BigANNVectorDataSet.DATA_SET_HEADER_LENGTH) - self.current = BigANNVectorDataSet.BEGINNING - - def __del__(self): - self.file.close() - - @staticmethod - def _get_data_size(file_name): - ext = file_name.split('.')[-1] - if ext == BigANNVectorDataSet.U8BIN_EXTENSION: - return BigANNVectorDataSet.BYTES_PER_U8INT - - if ext == BigANNVectorDataSet.FBIN_EXTENSION: - return BigANNVectorDataSet.BYTES_PER_FLOAT - - raise Exception("Unknown extension") - - @staticmethod - def _value_reader(file_name): - ext = file_name.split('.')[-1] - if ext == BigANNVectorDataSet.U8BIN_EXTENSION: - return lambda file: float(int.from_bytes(file.read(BigANNVectorDataSet.BYTES_PER_U8INT), "little")) - - if ext == BigANNVectorDataSet.FBIN_EXTENSION: - return lambda file: struct.unpack('= self.num_vectors + self.offset: - raise StopIteration - - if self.vector_batch is None or len(self.vector_batch) == 0: - self.vector_batch = self._batch_read(self.data_set) - if self.vector_batch is None: - raise StopIteration - vector = self.vector_batch.pop(0) - self.current += 1 - self.percent_completed = self.current / self.total - - return self._build_query_body(self.index_name, self.field_name, self.k, - vector) - - def _batch_read(self, data_set: DataSet): - return list(data_set.read(self.VECTOR_READ_BATCH_SIZE)) - - def _build_query_body(self, index_name: str, field_name: str, k: int, - vector) -> dict: - """Builds a k-NN query that can be used to execute an approximate nearest - neighbor search against a k-NN plugin index - Args: - index_name: name of index to search - field_name: name of field to search - k: number of results to return - vector: vector used for query - Returns: - A dictionary containing the body used for search, a set of request - parameters to attach to the search and the name of the index. 
- """ - return { - "index": index_name, - "request-params": { - "_source": { - "exclude": [field_name] - } - }, - "body": { - "size": k, - "query": { - "knn": { - field_name: { - "vector": vector, - "k": k - } - } - } - } - } - - -class BulkVectorsFromDataSetParamSource(VectorsFromDataSetParamSource): - """ Create bulk index requests from a data set of vectors. - - Attributes: - bulk_size: number of vectors per request - retries: number of times to retry the request when it fails - """ - - DEFAULT_RETRIES = 10 - - def __init__(self, workload, params, **kwargs): - super().__init__(params, Context.INDEX) - self.bulk_size: int = parse_int_parameter("bulk_size", params) - self.retries: int = parse_int_parameter("retries", params, - self.DEFAULT_RETRIES) - - def params(self): - """ - Returns: A bulk index parameter with vectors from a data set. - """ - if self.current >= self.num_vectors + self.offset: - raise StopIteration - - def action(doc_id): - return {'index': {'_index': self.index_name, '_id': doc_id}} - - partition = self.data_set.read(self.bulk_size) - body = bulk_transform(partition, self.field_name, action, self.current) - size = len(body) // 2 - self.current += size - self.percent_completed = self.current / self.total - - return { - "body": body, - "retries": self.retries, - "size": size - } diff --git a/benchmarks/osb/extensions/registry.py b/benchmarks/osb/extensions/registry.py deleted file mode 100644 index 5ce17ab6f..000000000 --- a/benchmarks/osb/extensions/registry.py +++ /dev/null @@ -1,13 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -from .param_sources import register as param_sources_register -from .runners import register as runners_register - - -def register(registry): - param_sources_register(registry) - runners_register(registry) diff --git a/benchmarks/osb/extensions/runners.py b/benchmarks/osb/extensions/runners.py deleted file mode 100644 index d048f80b0..000000000 --- a/benchmarks/osb/extensions/runners.py +++ /dev/null @@ -1,121 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -from opensearchpy.exceptions import ConnectionTimeout -from .util import parse_int_parameter, parse_string_parameter -import logging -import time - - -def register(registry): - registry.register_runner( - "custom-vector-bulk", BulkVectorsFromDataSetRunner(), async_runner=True - ) - registry.register_runner( - "custom-refresh", CustomRefreshRunner(), async_runner=True - ) - registry.register_runner( - "train-model", TrainModelRunner(), async_runner=True - ) - registry.register_runner( - "delete-model", DeleteModelRunner(), async_runner=True - ) - - -class BulkVectorsFromDataSetRunner: - - async def __call__(self, opensearch, params): - size = parse_int_parameter("size", params) - retries = parse_int_parameter("retries", params, 0) + 1 - - for _ in range(retries): - try: - await opensearch.bulk( - body=params["body"], - timeout='5m' - ) - - return size, "docs" - except ConnectionTimeout: - logging.getLogger(__name__)\ - .warning("Bulk vector ingestion timed out. 
Retrying") - - raise TimeoutError("Failed to submit bulk request in specified number " - "of retries: {}".format(retries)) - - def __repr__(self, *args, **kwargs): - return "custom-vector-bulk" - - -class CustomRefreshRunner: - - async def __call__(self, opensearch, params): - retries = parse_int_parameter("retries", params, 0) + 1 - - for _ in range(retries): - try: - await opensearch.indices.refresh( - index=parse_string_parameter("index", params) - ) - - return - except ConnectionTimeout: - logging.getLogger(__name__)\ - .warning("Custom refresh timed out. Retrying") - - raise TimeoutError("Failed to refresh the index in specified number " - "of retries: {}".format(retries)) - - def __repr__(self, *args, **kwargs): - return "custom-refresh" - - -class TrainModelRunner: - - async def __call__(self, opensearch, params): - # Train a model and wait for it training to complete - body = params["body"] - timeout = parse_int_parameter("timeout", params) - model_id = parse_string_parameter("model_id", params) - - method = "POST" - model_uri = "/_plugins/_knn/models/{}".format(model_id) - await opensearch.transport.perform_request(method, "{}/_train".format(model_uri), body=body) - - start_time = time.time() - while time.time() < start_time + timeout: - time.sleep(1) - model_response = await opensearch.transport.perform_request("GET", model_uri) - - if 'state' not in model_response.keys(): - continue - - if model_response['state'] == 'created': - #TODO: Return model size as well - return 1, "models_trained" - - if model_response['state'] == 'failed': - raise Exception("Failed to create model: {}".format(model_response)) - - raise Exception('Failed to create model: {} within timeout {} seconds' - .format(model_id, timeout)) - - def __repr__(self, *args, **kwargs): - return "train-model" - - -class DeleteModelRunner: - - async def __call__(self, opensearch, params): - # Delete model provided by model id - method = "DELETE" - model_id = parse_string_parameter("model_id", params) - uri = "/_plugins/_knn/models/{}".format(model_id) - - # Ignore if model doesnt exist - await opensearch.transport.perform_request(method, uri, params={"ignore": [400, 404]}) - - def __repr__(self, *args, **kwargs): - return "delete-model" diff --git a/benchmarks/osb/extensions/util.py b/benchmarks/osb/extensions/util.py deleted file mode 100644 index f7f6aab62..000000000 --- a/benchmarks/osb/extensions/util.py +++ /dev/null @@ -1,71 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -import numpy as np -from typing import List -from typing import Dict -from typing import Any - - -def bulk_transform(partition: np.ndarray, field_name: str, action, - offset: int) -> List[Dict[str, Any]]: - """Partitions and transforms a list of vectors into OpenSearch's bulk - injection format. - Args: - offset: to start counting from - partition: An array of vectors to transform. - field_name: field name for action - action: Bulk API action. - Returns: - An array of transformed vectors in bulk format. 
- """ - actions = [] - _ = [ - actions.extend([action(i + offset), None]) - for i in range(len(partition)) - ] - actions[1::2] = [{field_name: vec} for vec in partition.tolist()] - return actions - - -def parse_string_parameter(key: str, params: dict, default: str = None) -> str: - if key not in params: - if default is not None: - return default - raise ConfigurationError( - "Value cannot be None for param {}".format(key) - ) - - if type(params[key]) is str: - return params[key] - - raise ConfigurationError("Value must be a string for param {}".format(key)) - - -def parse_int_parameter(key: str, params: dict, default: int = None) -> int: - if key not in params: - if default: - return default - raise ConfigurationError( - "Value cannot be None for param {}".format(key) - ) - - if type(params[key]) is int: - return params[key] - - raise ConfigurationError("Value must be a int for param {}".format(key)) - - -class ConfigurationError(Exception): - """Exception raised for errors configuration. - - Attributes: - message -- explanation of the error - """ - - def __init__(self, message: str): - self.message = f'{message}' - super().__init__(self.message) diff --git a/benchmarks/osb/indices/faiss-index.json b/benchmarks/osb/indices/faiss-index.json deleted file mode 100644 index 2db4d34d4..000000000 --- a/benchmarks/osb/indices/faiss-index.json +++ /dev/null @@ -1,27 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": {{ target_index_primary_shards }}, - "number_of_replicas": {{ target_index_replica_shards }} - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": {{ target_index_dimension }}, - "method": { - "name": "hnsw", - "space_type": "{{ target_index_space_type }}", - "engine": "faiss", - "parameters": { - "ef_search": {{ hnsw_ef_search }}, - "ef_construction": {{ hnsw_ef_construction }}, - "m": {{ hnsw_m }} - } - } - } - } - } -} diff --git a/benchmarks/osb/indices/lucene-index.json b/benchmarks/osb/indices/lucene-index.json deleted file mode 100644 index 0a4ed868a..000000000 --- a/benchmarks/osb/indices/lucene-index.json +++ /dev/null @@ -1,26 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": {{ target_index_primary_shards }}, - "number_of_replicas": {{ target_index_replica_shards }} - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": {{ target_index_dimension }}, - "method": { - "name": "hnsw", - "space_type": "{{ target_index_space_type }}", - "engine": "lucene", - "parameters": { - "ef_construction": {{ hnsw_ef_construction }}, - "m": {{ hnsw_m }} - } - } - } - } - } -} diff --git a/benchmarks/osb/indices/model-index.json b/benchmarks/osb/indices/model-index.json deleted file mode 100644 index 0e92c8903..000000000 --- a/benchmarks/osb/indices/model-index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": {{ target_index_primary_shards | default(1) }}, - "number_of_replicas": {{ target_index_replica_shards | default(0) }} - } - }, - "mappings": { - "properties": { - "{{ target_field_name }}": { - "type": "knn_vector", - "model_id": "{{ train_model_id }}" - } - } - } -} diff --git a/benchmarks/osb/indices/nmslib-index.json b/benchmarks/osb/indices/nmslib-index.json deleted file mode 100644 index 4ceb57977..000000000 --- a/benchmarks/osb/indices/nmslib-index.json +++ /dev/null @@ -1,27 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "knn.algo_param.ef_search": {{ hnsw_ef_search }}, 
- "number_of_shards": {{ target_index_primary_shards }}, - "number_of_replicas": {{ target_index_replica_shards }} - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": {{ target_index_dimension }}, - "method": { - "name": "hnsw", - "space_type": "{{ target_index_space_type }}", - "engine": "nmslib", - "parameters": { - "ef_construction": {{ hnsw_ef_construction }}, - "m": {{ hnsw_m }} - } - } - } - } - } -} diff --git a/benchmarks/osb/indices/train-index.json b/benchmarks/osb/indices/train-index.json deleted file mode 100644 index 82af8215e..000000000 --- a/benchmarks/osb/indices/train-index.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": {{ train_index_primary_shards }}, - "number_of_replicas": {{ train_index_replica_shards }} - } - }, - "mappings": { - "properties": { - "{{ train_field_name }}": { - "type": "knn_vector", - "dimension": {{ target_index_dimension }} - } - } - } -} diff --git a/benchmarks/osb/operations/default.json b/benchmarks/osb/operations/default.json deleted file mode 100644 index ee33166f0..000000000 --- a/benchmarks/osb/operations/default.json +++ /dev/null @@ -1,53 +0,0 @@ -[ - { - "name": "ivfpq-train-model", - "operation-type": "train-model", - "model_id": "{{ train_model_id }}", - "timeout": {{ train_timeout }}, - "body": { - "training_index": "{{ train_index_name }}", - "training_field": "{{ train_field_name }}", - "dimension": {{ target_index_dimension }}, - "search_size": {{ train_search_size }}, - "max_training_vector_count": {{ train_index_num_vectors }}, - "method": { - "name":"ivf", - "engine":"faiss", - "space_type": "{{ target_index_space_type }}", - "parameters":{ - "nlist": {{ ivf_nlists }}, - "nprobes": {{ ivf_nprobes }}, - "encoder":{ - "name":"pq", - "parameters":{ - "code_size": {{ pq_code_size }}, - "m": {{ pq_m }} - } - } - } - } - } - }, - { - "name": "ivf-train-model", - "operation-type": "train-model", - "model_id": "{{ train_model_id }}", - "timeout": {{ train_timeout | default(1000) }}, - "body": { - "training_index": "{{ train_index_name }}", - "training_field": "{{ train_field_name }}", - "search_size": {{ train_search_size }}, - "dimension": {{ target_index_dimension }}, - "max_training_vector_count": {{ train_index_num_vectors }}, - "method": { - "name":"ivf", - "engine":"faiss", - "space_type": "{{ target_index_space_type }}", - "parameters":{ - "nlist": {{ ivf_nlists }}, - "nprobes": {{ ivf_nprobes }} - } - } - } - } -] diff --git a/benchmarks/osb/params/no-train-params.json b/benchmarks/osb/params/no-train-params.json deleted file mode 100644 index 58e4197fd..000000000 --- a/benchmarks/osb/params/no-train-params.json +++ /dev/null @@ -1,40 +0,0 @@ -{ - "target_index_name": "target_index", - "target_field_name": "target_field", - "target_index_body": "indices/nmslib-index.json", - "target_index_primary_shards": 3, - "target_index_replica_shards": 1, - "target_index_dimension": 128, - "target_index_space_type": "l2", - "target_index_bulk_size": 200, - "target_index_bulk_index_data_set_format": "hdf5", - "target_index_bulk_index_data_set_path": "", - "target_index_bulk_index_clients": 10, - "target_index_max_num_segments": 10, - "target_index_force_merge_timeout": 45.0, - "hnsw_ef_search": 512, - "hnsw_ef_construction": 512, - "hnsw_m": 16, - - "query_k": 10, - "query_clients": 10, - "query_data_set_format": "hdf5", - "query_data_set_path": "", - - "ivf_nlists": 1, - "ivf_nprobes": 1, - "pq_code_size": 1, - "pq_m": 1, - "train_model_method": "", - 
"train_model_id": "", - "train_index_name": "", - "train_field_name": "", - "train_index_body": "", - "train_search_size": 1, - "train_timeout": 1, - "train_index_bulk_size": 1, - "train_index_data_set_format": "", - "train_index_data_set_path": "", - "train_index_num_vectors": 1, - "train_index_bulk_index_clients": 1 -} diff --git a/benchmarks/osb/params/train-params.json b/benchmarks/osb/params/train-params.json deleted file mode 100644 index f55ed4333..000000000 --- a/benchmarks/osb/params/train-params.json +++ /dev/null @@ -1,38 +0,0 @@ -{ - "target_index_name": "target_index", - "target_field_name": "target_field", - "target_index_body": "indices/model-index.json", - "target_index_primary_shards": 3, - "target_index_replica_shards": 1, - "target_index_dimension": 128, - "target_index_space_type": "l2", - "target_index_bulk_size": 200, - "target_index_bulk_index_data_set_format": "hdf5", - "target_index_bulk_index_data_set_path": "", - "target_index_bulk_index_clients": 10, - "target_index_max_num_segments": 10, - "target_index_force_merge_timeout": 45.0, - "ivf_nlists": 10, - "ivf_nprobes": 1, - "pq_code_size": 8, - "pq_m": 8, - "train_model_method": "ivfpq", - "train_model_id": "test-model", - "train_index_name": "train_index", - "train_field_name": "train_field", - "train_index_body": "indices/train-index.json", - "train_search_size": 500, - "train_timeout": 5000, - "train_index_primary_shards": 1, - "train_index_replica_shards": 0, - "train_index_bulk_size": 200, - "train_index_data_set_format": "hdf5", - "train_index_data_set_path": "", - "train_index_num_vectors": 1000000, - "train_index_bulk_index_clients": 10, - - "query_k": 10, - "query_clients": 10, - "query_data_set_format": "hdf5", - "query_data_set_path": "" -} diff --git a/benchmarks/osb/procedures/no-train-test.json b/benchmarks/osb/procedures/no-train-test.json deleted file mode 100644 index 01985b914..000000000 --- a/benchmarks/osb/procedures/no-train-test.json +++ /dev/null @@ -1,73 +0,0 @@ -{% import "benchmark.helpers" as benchmark with context %} -{ - "name": "no-train-test", - "default": true, - "schedule": [ - { - "operation": { - "name": "delete-target-index", - "operation-type": "delete-index", - "only-if-exists": true, - "index": "{{ target_index_name }}" - } - }, - { - "operation": { - "name": "create-target-index", - "operation-type": "create-index", - "index": "{{ target_index_name }}" - } - }, - { - "name": "wait-for-cluster-to-be-green", - "operation": "cluster-health", - "request-params": { - "wait_for_status": "green" - } - }, - { - "operation": { - "name": "custom-vector-bulk", - "operation-type": "custom-vector-bulk", - "param-source": "bulk-from-data-set", - "index": "{{ target_index_name }}", - "field": "{{ target_field_name }}", - "bulk_size": {{ target_index_bulk_size }}, - "data_set_format": "{{ target_index_bulk_index_data_set_format }}", - "data_set_path": "{{ target_index_bulk_index_data_set_path }}" - }, - "clients": {{ target_index_bulk_index_clients }} - }, - { - "operation": { - "name": "refresh-target-index", - "operation-type": "custom-refresh", - "index": "{{ target_index_name }}", - "retries": 100 - } - }, - { - "operation": { - "name": "force-merge", - "operation-type": "force-merge", - "request-timeout": {{ target_index_force_merge_timeout }}, - "index": "{{ target_index_name }}", - "mode": "polling", - "max-num-segments": {{ target_index_max_num_segments }} - } - }, - { - "operation": { - "name": "knn-query-from-data-set", - "operation-type": "search", - "index": "{{ 
target_index_name }}", - "param-source": "knn-query-from-data-set", - "k": {{ query_k }}, - "field": "{{ target_field_name }}", - "data_set_format": "{{ query_data_set_format }}", - "data_set_path": "{{ query_data_set_path }}" - }, - "clients": {{ query_clients }} - } - ] -} diff --git a/benchmarks/osb/procedures/train-test.json b/benchmarks/osb/procedures/train-test.json deleted file mode 100644 index ca26db0b0..000000000 --- a/benchmarks/osb/procedures/train-test.json +++ /dev/null @@ -1,127 +0,0 @@ -{% import "benchmark.helpers" as benchmark with context %} -{ - "name": "train-test", - "default": false, - "schedule": [ - { - "operation": { - "name": "delete-target-index", - "operation-type": "delete-index", - "only-if-exists": true, - "index": "{{ target_index_name }}" - } - }, - { - "operation": { - "name": "delete-train-index", - "operation-type": "delete-index", - "only-if-exists": true, - "index": "{{ train_index_name }}" - } - }, - { - "operation": { - "operation-type": "delete-model", - "name": "delete-model", - "model_id": "{{ train_model_id }}" - } - }, - { - "operation": { - "name": "create-train-index", - "operation-type": "create-index", - "index": "{{ train_index_name }}" - } - }, - { - "name": "wait-for-train-index-to-be-green", - "operation": "cluster-health", - "request-params": { - "wait_for_status": "green" - } - }, - { - "operation": { - "name": "train-vector-bulk", - "operation-type": "custom-vector-bulk", - "param-source": "bulk-from-data-set", - "index": "{{ train_index_name }}", - "field": "{{ train_field_name }}", - "bulk_size": {{ train_index_bulk_size }}, - "data_set_format": "{{ train_index_data_set_format }}", - "data_set_path": "{{ train_index_data_set_path }}", - "num_vectors": {{ train_index_num_vectors }} - }, - "clients": {{ train_index_bulk_index_clients }} - }, - { - "operation": { - "name": "refresh-train-index", - "operation-type": "custom-refresh", - "index": "{{ train_index_name }}", - "retries": 100 - } - }, - { - "operation": "{{ train_model_method }}-train-model" - }, - { - "operation": { - "name": "create-target-index", - "operation-type": "create-index", - "index": "{{ target_index_name }}" - } - }, - { - "name": "wait-for-target-index-to-be-green", - "operation": "cluster-health", - "request-params": { - "wait_for_status": "green" - } - }, - { - "operation": { - "name": "custom-vector-bulk", - "operation-type": "custom-vector-bulk", - "param-source": "bulk-from-data-set", - "index": "{{ target_index_name }}", - "field": "{{ target_field_name }}", - "bulk_size": {{ target_index_bulk_size }}, - "data_set_format": "{{ target_index_bulk_index_data_set_format }}", - "data_set_path": "{{ target_index_bulk_index_data_set_path }}" - }, - "clients": {{ target_index_bulk_index_clients }} - }, - { - "operation": { - "name": "refresh-target-index", - "operation-type": "custom-refresh", - "index": "{{ target_index_name }}", - "retries": 100 - } - }, - { - "operation": { - "name": "force-merge", - "operation-type": "force-merge", - "request-timeout": {{ target_index_force_merge_timeout }}, - "index": "{{ target_index_name }}", - "mode": "polling", - "max-num-segments": {{ target_index_max_num_segments }} - } - }, - { - "operation": { - "name": "knn-query-from-data-set", - "operation-type": "search", - "index": "{{ target_index_name }}", - "param-source": "knn-query-from-data-set", - "k": {{ query_k }}, - "field": "{{ target_field_name }}", - "data_set_format": "{{ query_data_set_format }}", - "data_set_path": "{{ query_data_set_path }}" - }, - "clients": {{ 
query_clients }} - } - ] -} diff --git a/benchmarks/osb/requirements.in b/benchmarks/osb/requirements.in deleted file mode 100644 index a9e12b5d3..000000000 --- a/benchmarks/osb/requirements.in +++ /dev/null @@ -1,4 +0,0 @@ -opensearch-py -numpy -h5py -opensearch-benchmark diff --git a/benchmarks/osb/requirements.txt b/benchmarks/osb/requirements.txt deleted file mode 100644 index a220ee44f..000000000 --- a/benchmarks/osb/requirements.txt +++ /dev/null @@ -1,96 +0,0 @@ -# -# This file is autogenerated by pip-compile with python 3.8 -# To update, run: -# -# pip-compile -# -aiohttp==3.9.4 - # via opensearch-py -aiosignal==1.2.0 - # via aiohttp -async-timeout==4.0.2 - # via aiohttp -attrs==21.4.0 - # via - # aiohttp - # jsonschema -cachetools==4.2.4 - # via google-auth -certifi==2023.7.22 - # via - # opensearch-benchmark - # opensearch-py -frozenlist==1.3.0 - # via - # aiohttp - # aiosignal -google-auth==1.22.1 - # via opensearch-benchmark -google-crc32c==1.3.0 - # via google-resumable-media -google-resumable-media==1.1.0 - # via opensearch-benchmark -h5py==3.6.0 - # via -r requirements.in -idna==3.7 - # via yarl -ijson==2.6.1 - # via opensearch-benchmark -importlib-metadata==4.11.3 - # via jsonschema -jinja2==3.1.3 - # via opensearch-benchmark -jsonschema==3.1.1 - # via opensearch-benchmark -markupsafe==2.0.1 - # via - # jinja2 - # opensearch-benchmark -multidict==6.0.2 - # via - # aiohttp - # yarl -numpy==1.24.2 - # via - # -r requirements.in - # h5py -opensearch-benchmark==0.0.2 - # via -r requirements.in -opensearch-py[async]==1.0.0 - # via - # -r requirements.in - # opensearch-benchmark -psutil==5.8.0 - # via opensearch-benchmark -py-cpuinfo==7.0.0 - # via opensearch-benchmark -pyasn1==0.4.8 - # via - # pyasn1-modules - # rsa -pyasn1-modules==0.2.8 - # via google-auth -pyrsistent==0.18.1 - # via jsonschema -rsa==4.8 - # via google-auth -six==1.16.0 - # via - # google-auth - # google-resumable-media - # jsonschema -tabulate==0.8.7 - # via opensearch-benchmark -thespian==3.10.1 - # via opensearch-benchmark -urllib3==1.26.18 - # via opensearch-py -yappi==1.2.3 - # via opensearch-benchmark -yarl==1.7.2 - # via aiohttp -zipp==3.7.0 - # via importlib-metadata - -# The following packages are considered to be unsafe in a requirements file: -# setuptools diff --git a/benchmarks/osb/tests/__init__.py b/benchmarks/osb/tests/__init__.py deleted file mode 100644 index e69de29bb..000000000 diff --git a/benchmarks/osb/tests/data_set_helper.py b/benchmarks/osb/tests/data_set_helper.py deleted file mode 100644 index 2b144da49..000000000 --- a/benchmarks/osb/tests/data_set_helper.py +++ /dev/null @@ -1,197 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -from abc import ABC, abstractmethod - -import h5py -import numpy as np - -from osb.extensions.data_set import Context, HDF5DataSet, BigANNVectorDataSet - -""" Module containing utility classes and functions for working with data sets. - -Included are utilities that can be used to build data sets and write them to -paths. -""" - - -class DataSetBuildContext: - """ Data class capturing information needed to build a particular data set - - Attributes: - data_set_context: Indicator of what the data set is used for, - vectors: A 2D array containing vectors that are used to build data set. - path: string representing path where data set should be serialized to. 
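-        Example (hypothetical path): DataSetBuildContext(Context.INDEX,
-        create_random_2d_array(10, 128), "/tmp/data-set.hdf5") describes a
-        10 x 128 float32 data set destined for the "train" group of an HDF5 file.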
- """ - def __init__(self, data_set_context: Context, vectors: np.ndarray, path: str): - self.data_set_context: Context = data_set_context - self.vectors: np.ndarray = vectors #TODO: Validate shape - self.path: str = path - - def get_num_vectors(self) -> int: - return self.vectors.shape[0] - - def get_dimension(self) -> int: - return self.vectors.shape[1] - - def get_type(self) -> np.dtype: - return self.vectors.dtype - - -class DataSetBuilder(ABC): - """ Abstract builder used to create a build a collection of data sets - - Attributes: - data_set_build_contexts: list of data set build contexts that builder - will build. - """ - def __init__(self): - self.data_set_build_contexts = list() - - def add_data_set_build_context(self, data_set_build_context: DataSetBuildContext): - """ Adds a data set build context to list of contexts to be built. - - Args: - data_set_build_context: DataSetBuildContext to be added to list - - Returns: Updated DataSetBuilder - - """ - self._validate_data_set_context(data_set_build_context) - self.data_set_build_contexts.append(data_set_build_context) - return self - - def build(self): - """ Builds and serializes all data sets build contexts - - Returns: - - """ - [self._build_data_set(data_set_build_context) for data_set_build_context - in self.data_set_build_contexts] - - @abstractmethod - def _build_data_set(self, context: DataSetBuildContext): - """ Builds an individual data set - - Args: - context: DataSetBuildContext of data set to be built - - Returns: - - """ - pass - - @abstractmethod - def _validate_data_set_context(self, context: DataSetBuildContext): - """ Validates that data set context can be added to this builder - - Args: - context: DataSetBuildContext to be validated - - Returns: - - """ - pass - - -class HDF5Builder(DataSetBuilder): - - def __init__(self): - super(HDF5Builder, self).__init__() - self.data_set_meta_data = dict() - - def _validate_data_set_context(self, context: DataSetBuildContext): - if context.path not in self.data_set_meta_data.keys(): - self.data_set_meta_data[context.path] = { - context.data_set_context: context - } - return - - if context.data_set_context in \ - self.data_set_meta_data[context.path].keys(): - raise IllegalDataSetBuildContext("Path and context for data set " - "are already present in builder.") - - self.data_set_meta_data[context.path][context.data_set_context] = \ - context - - @staticmethod - def _validate_extension(context: DataSetBuildContext): - ext = context.path.split('.')[-1] - - if ext != HDF5DataSet.FORMAT_NAME: - raise IllegalDataSetBuildContext("Invalid file extension") - - def _build_data_set(self, context: DataSetBuildContext): - # For HDF5, because multiple data sets can be grouped in the same file, - # we will build data sets in memory and not write to disk until - # _flush_data_sets_to_disk is called - with h5py.File(context.path, 'a') as hf: - hf.create_dataset( - HDF5DataSet.parse_context(context.data_set_context), - data=context.vectors - ) - - -class BigANNBuilder(DataSetBuilder): - - def _validate_data_set_context(self, context: DataSetBuildContext): - self._validate_extension(context) - - # prevent the duplication of paths for data sets - data_set_paths = [c.path for c in self.data_set_build_contexts] - if any(data_set_paths.count(x) > 1 for x in data_set_paths): - raise IllegalDataSetBuildContext("Build context paths have to be " - "unique.") - - @staticmethod - def _validate_extension(context: DataSetBuildContext): - ext = context.path.split('.')[-1] - - if ext != 
BigANNVectorDataSet.U8BIN_EXTENSION and ext != \ - BigANNVectorDataSet.FBIN_EXTENSION: - raise IllegalDataSetBuildContext("Invalid file extension") - - if ext == BigANNVectorDataSet.U8BIN_EXTENSION and context.get_type() != \ - np.u8int: - raise IllegalDataSetBuildContext("Invalid data type for {} ext." - .format(BigANNVectorDataSet - .U8BIN_EXTENSION)) - - if ext == BigANNVectorDataSet.FBIN_EXTENSION and context.get_type() != \ - np.float32: - print(context.get_type()) - raise IllegalDataSetBuildContext("Invalid data type for {} ext." - .format(BigANNVectorDataSet - .FBIN_EXTENSION)) - - def _build_data_set(self, context: DataSetBuildContext): - num_vectors = context.get_num_vectors() - dimension = context.get_dimension() - - with open(context.path, 'wb') as f: - f.write(int.to_bytes(num_vectors, 4, "little")) - f.write(int.to_bytes(dimension, 4, "little")) - context.vectors.tofile(f) - - -def create_random_2d_array(num_vectors: int, dimension: int) -> np.ndarray: - rng = np.random.default_rng() - return rng.random(size=(num_vectors, dimension), dtype=np.float32) - - -class IllegalDataSetBuildContext(Exception): - """Exception raised when passed in DataSetBuildContext is illegal - - Attributes: - message -- explanation of the error - """ - - def __init__(self, message: str): - self.message = f'{message}' - super().__init__(self.message) - diff --git a/benchmarks/osb/tests/test_param_sources.py b/benchmarks/osb/tests/test_param_sources.py deleted file mode 100644 index cda730cee..000000000 --- a/benchmarks/osb/tests/test_param_sources.py +++ /dev/null @@ -1,353 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -import os -import random -import shutil -import string -import sys -import tempfile -import unittest - -# Add parent directory to path -import numpy as np - -sys.path.append(os.path.abspath(os.path.join(os.getcwd(), os.pardir))) - -from osb.tests.data_set_helper import HDF5Builder, create_random_2d_array, \ - DataSetBuildContext, BigANNBuilder -from osb.extensions.data_set import Context, HDF5DataSet -from osb.extensions.param_sources import VectorsFromDataSetParamSource, \ - QueryVectorsFromDataSetParamSource, BulkVectorsFromDataSetParamSource -from osb.extensions.util import ConfigurationError - -DEFAULT_INDEX_NAME = "test-index" -DEFAULT_FIELD_NAME = "test-field" -DEFAULT_CONTEXT = Context.INDEX -DEFAULT_TYPE = HDF5DataSet.FORMAT_NAME -DEFAULT_NUM_VECTORS = 10 -DEFAULT_DIMENSION = 10 -DEFAULT_RANDOM_STRING_LENGTH = 8 - - -class VectorsFromDataSetParamSourceTestCase(unittest.TestCase): - - def setUp(self) -> None: - self.data_set_dir = tempfile.mkdtemp() - - # Create a data set we know to be valid for convenience - self.valid_data_set_path = _create_data_set( - DEFAULT_NUM_VECTORS, - DEFAULT_DIMENSION, - DEFAULT_TYPE, - DEFAULT_CONTEXT, - self.data_set_dir - ) - - def tearDown(self): - shutil.rmtree(self.data_set_dir) - - def test_missing_params(self): - empty_params = dict() - self.assertRaises( - ConfigurationError, - lambda: VectorsFromDataSetParamSourceTestCase. 
- TestVectorsFromDataSetParamSource(empty_params, DEFAULT_CONTEXT) - ) - - def test_invalid_data_set_format(self): - invalid_data_set_format = "invalid-data-set-format" - - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": invalid_data_set_format, - "data_set_path": self.valid_data_set_path, - } - self.assertRaises( - ConfigurationError, - lambda: self.TestVectorsFromDataSetParamSource( - test_param_source_params, - DEFAULT_CONTEXT - ) - ) - - def test_invalid_data_set_path(self): - invalid_data_set_path = "invalid-data-set-path" - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": HDF5DataSet.FORMAT_NAME, - "data_set_path": invalid_data_set_path, - } - self.assertRaises( - FileNotFoundError, - lambda: self.TestVectorsFromDataSetParamSource( - test_param_source_params, - DEFAULT_CONTEXT - ) - ) - - def test_partition_hdf5(self): - num_vectors = 100 - - hdf5_data_set_path = _create_data_set( - num_vectors, - DEFAULT_DIMENSION, - HDF5DataSet.FORMAT_NAME, - DEFAULT_CONTEXT, - self.data_set_dir - ) - - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": HDF5DataSet.FORMAT_NAME, - "data_set_path": hdf5_data_set_path, - } - test_param_source = self.TestVectorsFromDataSetParamSource( - test_param_source_params, - DEFAULT_CONTEXT - ) - - num_partitions = 10 - vecs_per_partition = test_param_source.num_vectors // num_partitions - - self._test_partition( - test_param_source, - num_partitions, - vecs_per_partition - ) - - def test_partition_bigann(self): - num_vectors = 100 - float_extension = "fbin" - - bigann_data_set_path = _create_data_set( - num_vectors, - DEFAULT_DIMENSION, - float_extension, - DEFAULT_CONTEXT, - self.data_set_dir - ) - - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": "bigann", - "data_set_path": bigann_data_set_path, - } - test_param_source = self.TestVectorsFromDataSetParamSource( - test_param_source_params, - DEFAULT_CONTEXT - ) - - num_partitions = 10 - vecs_per_partition = test_param_source.num_vectors // num_partitions - - self._test_partition( - test_param_source, - num_partitions, - vecs_per_partition - ) - - def _test_partition( - self, - test_param_source: VectorsFromDataSetParamSource, - num_partitions: int, - vec_per_partition: int - ): - for i in range(num_partitions): - test_param_source_i = test_param_source.partition(i, num_partitions) - self.assertEqual(test_param_source_i.num_vectors, vec_per_partition) - self.assertEqual(test_param_source_i.offset, i * vec_per_partition) - - class TestVectorsFromDataSetParamSource(VectorsFromDataSetParamSource): - """ - Empty implementation of ABC VectorsFromDataSetParamSource so that we can - test the concrete methods. 
- """ - - def params(self): - pass - - -class QueryVectorsFromDataSetParamSourceTestCase(unittest.TestCase): - - def setUp(self) -> None: - self.data_set_dir = tempfile.mkdtemp() - - def tearDown(self): - shutil.rmtree(self.data_set_dir) - - def test_params(self): - # Create a data set - k = 12 - data_set_path = _create_data_set( - DEFAULT_NUM_VECTORS, - DEFAULT_DIMENSION, - DEFAULT_TYPE, - Context.QUERY, - self.data_set_dir - ) - - # Create a QueryVectorsFromDataSetParamSource with relevant params - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": DEFAULT_TYPE, - "data_set_path": data_set_path, - "k": k, - } - query_param_source = QueryVectorsFromDataSetParamSource( - None, test_param_source_params - ) - - # Check each - for i in range(DEFAULT_NUM_VECTORS): - self._check_params( - query_param_source.params(), - DEFAULT_INDEX_NAME, - DEFAULT_FIELD_NAME, - DEFAULT_DIMENSION, - k - ) - - # Assert last call creates stop iteration - self.assertRaises( - StopIteration, - lambda: query_param_source.params() - ) - - def _check_params( - self, - params: dict, - expected_index: str, - expected_field: str, - expected_dimension: int, - expected_k: int - ): - index_name = params.get("index") - self.assertEqual(expected_index, index_name) - body = params.get("body") - self.assertIsInstance(body, dict) - query = body.get("query") - self.assertIsInstance(query, dict) - query_knn = query.get("knn") - self.assertIsInstance(query_knn, dict) - field = query_knn.get(expected_field) - self.assertIsInstance(field, dict) - vector = field.get("vector") - self.assertIsInstance(vector, np.ndarray) - self.assertEqual(len(list(vector)), expected_dimension) - k = field.get("k") - self.assertEqual(k, expected_k) - - -class BulkVectorsFromDataSetParamSourceTestCase(unittest.TestCase): - - def setUp(self) -> None: - self.data_set_dir = tempfile.mkdtemp() - - def tearDown(self): - shutil.rmtree(self.data_set_dir) - - def test_params(self): - num_vectors = 49 - bulk_size = 10 - data_set_path = _create_data_set( - num_vectors, - DEFAULT_DIMENSION, - DEFAULT_TYPE, - Context.INDEX, - self.data_set_dir - ) - - test_param_source_params = { - "index": DEFAULT_INDEX_NAME, - "field": DEFAULT_FIELD_NAME, - "data_set_format": DEFAULT_TYPE, - "data_set_path": data_set_path, - "bulk_size": bulk_size - } - bulk_param_source = BulkVectorsFromDataSetParamSource( - None, test_param_source_params - ) - - # Check each payload returned - vectors_consumed = 0 - while vectors_consumed < num_vectors: - expected_num_vectors = min(num_vectors - vectors_consumed, bulk_size) - self._check_params( - bulk_param_source.params(), - DEFAULT_INDEX_NAME, - DEFAULT_FIELD_NAME, - DEFAULT_DIMENSION, - expected_num_vectors - ) - vectors_consumed += expected_num_vectors - - # Assert last call creates stop iteration - self.assertRaises( - StopIteration, - lambda: bulk_param_source.params() - ) - - def _check_params( - self, - params: dict, - expected_index: str, - expected_field: str, - expected_dimension: int, - expected_num_vectors_in_payload: int - ): - size = params.get("size") - self.assertEqual(size, expected_num_vectors_in_payload) - body = params.get("body") - self.assertIsInstance(body, list) - self.assertEqual(len(body) // 2, expected_num_vectors_in_payload) - - # Bulk payload has 2 parts: first one is the header and the second one - # is the body. 
The header will have the index name and the body will - # have the vector - for header, req_body in zip(*[iter(body)] * 2): - index = header.get("index") - self.assertIsInstance(index, dict) - index_name = index.get("_index") - self.assertEqual(index_name, expected_index) - - vector = req_body.get(expected_field) - self.assertIsInstance(vector, list) - self.assertEqual(len(vector), expected_dimension) - - -def _create_data_set( - num_vectors: int, - dimension: int, - extension: str, - data_set_context: Context, - data_set_dir -) -> str: - - file_name_base = ''.join(random.choice(string.ascii_letters) for _ in - range(DEFAULT_RANDOM_STRING_LENGTH)) - data_set_file_name = "{}.{}".format(file_name_base, extension) - data_set_path = os.path.join(data_set_dir, data_set_file_name) - context = DataSetBuildContext( - data_set_context, - create_random_2d_array(num_vectors, dimension), - data_set_path) - - if extension == HDF5DataSet.FORMAT_NAME: - HDF5Builder().add_data_set_build_context(context).build() - else: - BigANNBuilder().add_data_set_build_context(context).build() - - return data_set_path - - -if __name__ == '__main__': - unittest.main() diff --git a/benchmarks/osb/workload.json b/benchmarks/osb/workload.json deleted file mode 100644 index bd0d84195..000000000 --- a/benchmarks/osb/workload.json +++ /dev/null @@ -1,17 +0,0 @@ -{% import "benchmark.helpers" as benchmark with context %} -{ - "version": 2, - "description": "k-NN Plugin train workload", - "indices": [ - { - "name": "{{ target_index_name }}", - "body": "{{ target_index_body }}" - }, - { - "name": "{{ train_index_name }}", - "body": "{{ train_index_body }}" - } - ], - "operations": {{ benchmark.collect(parts="operations/*.json") }}, - "test_procedures": [{{ benchmark.collect(parts="procedures/*.json") }}] -} diff --git a/benchmarks/osb/workload.py b/benchmarks/osb/workload.py deleted file mode 100644 index 32e6ad02c..000000000 --- a/benchmarks/osb/workload.py +++ /dev/null @@ -1,18 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -# This code needs to be included at the top of every workload.py file. -# OpenSearch Benchmarks is not able to find other helper files unless the path -# is updated. -import os -import sys -sys.path.append(os.path.abspath(os.getcwd())) - -from extensions.registry import register as custom_register - - -def register(registry): - custom_register(registry) diff --git a/benchmarks/perf-tool/.pylintrc b/benchmarks/perf-tool/.pylintrc deleted file mode 100644 index 15bf4ccc3..000000000 --- a/benchmarks/perf-tool/.pylintrc +++ /dev/null @@ -1,443 +0,0 @@ -# This Pylint rcfile contains a best-effort configuration to uphold the -# best-practices and style described in the Google Python style guide: -# https://google.github.io/styleguide/pyguide.html -# -# Its canonical open-source location is: -# https://google.github.io/styleguide/pylintrc - -[MASTER] - -fail-under=9.0 - -# Files or directories to be skipped. They should be base names, not paths. -ignore=third_party - -# Files or directories matching the regex patterns are skipped. The regex -# matches against base names, not paths. -ignore-patterns= - -# Pickle collected data for later comparisons. -persistent=no - -# List of plugins (as comma separated values of python modules names) to load, -# usually to register additional checkers. -load-plugins= - -# Use multiple processes to speed up Pylint. 
-jobs=4 - -# Allow loading of arbitrary C extensions. Extensions are imported into the -# active Python interpreter and may run arbitrary code. -unsafe-load-any-extension=no - - -[MESSAGES CONTROL] - -# Only show warnings with the listed confidence levels. Leave empty to show -# all. Valid levels: HIGH, INFERENCE, INFERENCE_FAILURE, UNDEFINED -confidence= - -# Enable the message, report, category or checker with the given id(s). You can -# either give multiple identifier separated by comma (,) or put this option -# multiple time (only on the command line, not in the configuration file where -# it should appear only once). See also the "--disable" option for examples. -#enable= - -# Disable the message, report, category or checker with the given id(s). You -# can either give multiple identifiers separated by comma (,) or put this -# option multiple times (only on the command line, not in the configuration -# file where it should appear only once).You can also use "--disable=all" to -# disable everything first and then reenable specific checks. For example, if -# you want to run only the similarities checker, you can use "--disable=all -# --enable=similarities". If you want to run only the classes checker, but have -# no Warning level messages displayed, use"--disable=all --enable=classes -# --disable=W" -disable=abstract-method, - apply-builtin, - arguments-differ, - attribute-defined-outside-init, - backtick, - bad-option-value, - basestring-builtin, - buffer-builtin, - c-extension-no-member, - consider-using-enumerate, - cmp-builtin, - cmp-method, - coerce-builtin, - coerce-method, - delslice-method, - div-method, - duplicate-code, - eq-without-hash, - execfile-builtin, - file-builtin, - filter-builtin-not-iterating, - fixme, - getslice-method, - global-statement, - hex-method, - idiv-method, - implicit-str-concat-in-sequence, - import-error, - import-self, - import-star-module-level, - inconsistent-return-statements, - input-builtin, - intern-builtin, - invalid-str-codec, - locally-disabled, - long-builtin, - long-suffix, - map-builtin-not-iterating, - misplaced-comparison-constant, - missing-function-docstring, - metaclass-assignment, - next-method-called, - next-method-defined, - no-absolute-import, - no-else-break, - no-else-continue, - no-else-raise, - no-else-return, - no-init, # added - no-member, - no-name-in-module, - no-self-use, - nonzero-method, - oct-method, - old-division, - old-ne-operator, - old-octal-literal, - old-raise-syntax, - parameter-unpacking, - print-statement, - raising-string, - range-builtin-not-iterating, - raw_input-builtin, - rdiv-method, - reduce-builtin, - relative-import, - reload-builtin, - round-builtin, - setslice-method, - signature-differs, - standarderror-builtin, - suppressed-message, - sys-max-int, - too-few-public-methods, - too-many-ancestors, - too-many-arguments, - too-many-boolean-expressions, - too-many-branches, - too-many-instance-attributes, - too-many-locals, - too-many-nested-blocks, - too-many-public-methods, - too-many-return-statements, - too-many-statements, - trailing-newlines, - unichr-builtin, - unicode-builtin, - unnecessary-pass, - unpacking-in-except, - useless-else-on-loop, - useless-object-inheritance, - useless-suppression, - using-cmp-argument, - wrong-import-order, - xrange-builtin, - zip-builtin-not-iterating, - - -[REPORTS] - -# Set the output format. Available formats are text, parseable, colorized, msvs -# (visual studio) and html. You can also give a reporter class, eg -# mypackage.mymodule.MyReporterClass. 
-output-format=text - -# Put messages in a separate file for each module / package specified on the -# command line instead of printing them on stdout. Reports (if any) will be -# written in a file name "pylint_global.[txt|html]". This option is deprecated -# and it will be removed in Pylint 2.0. -files-output=no - -# Tells whether to display a full report or only the messages -reports=no - -# Python expression which should return a note less than 10 (10 is the highest -# note). You have access to the variables errors warning, statement which -# respectively contain the number of errors / warnings messages and the total -# number of statements analyzed. This is used by the global evaluation report -# (RP0004). -evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10) - -# Template used to display messages. This is a python new-style format string -# used to format the message information. See doc for all details -#msg-template= - - -[BASIC] - -# Good variable names which should always be accepted, separated by a comma -good-names=main,_ - -# Bad variable names which should always be refused, separated by a comma -bad-names= - -# Colon-delimited sets of names that determine each other's naming style when -# the name regexes allow several styles. -name-group= - -# Include a hint for the correct naming format with invalid-name -include-naming-hint=no - -# List of decorators that produce properties, such as abc.abstractproperty. Add -# to this list to register other decorators that produce valid properties. -property-classes=abc.abstractproperty,cached_property.cached_property,cached_property.threaded_cached_property,cached_property.cached_property_with_ttl,cached_property.threaded_cached_property_with_ttl - -# Regular expression matching correct function names -function-rgx=^(?:(?PsetUp|tearDown|setUpModule|tearDownModule)|(?P_?[A-Z][a-zA-Z0-9]*)|(?P_?[a-z][a-z0-9_]*))$ - -# Regular expression matching correct variable names -variable-rgx=^[a-z][a-z0-9_]*$ - -# Regular expression matching correct constant names -const-rgx=^(_?[A-Z][A-Z0-9_]*|__[a-z0-9_]+__|_?[a-z][a-z0-9_]*)$ - -# Regular expression matching correct attribute names -attr-rgx=^_{0,2}[a-z][a-z0-9_]*$ - -# Regular expression matching correct argument names -argument-rgx=^[a-z][a-z0-9_]*$ - -# Regular expression matching correct class attribute names -class-attribute-rgx=^(_?[A-Z][A-Z0-9_]*|__[a-z0-9_]+__|_?[a-z][a-z0-9_]*)$ - -# Regular expression matching correct inline iteration names -inlinevar-rgx=^[a-z][a-z0-9_]*$ - -# Regular expression matching correct class names -class-rgx=^_?[A-Z][a-zA-Z0-9]*$ - -# Regular expression matching correct module names -module-rgx=^(_?[a-z][a-z0-9_]*|__init__)$ - -# Regular expression matching correct method names -method-rgx=(?x)^(?:(?P_[a-z0-9_]+__|runTest|setUp|tearDown|setUpTestCase|tearDownTestCase|setupSelf|tearDownClass|setUpClass|(test|assert)_*[A-Z0-9][a-zA-Z0-9_]*|next)|(?P_{0,2}[A-Z][a-zA-Z0-9_]*)|(?P_{0,2}[a-z][a-z0-9_]*))$ - -# Regular expression which should only match function or class names that do -# not require a docstring. -no-docstring-rgx=(__.*__|main|test.*|.*test|.*Test)$ - -# Minimum line length for functions/classes that require docstrings, shorter -# ones are exempt. -docstring-min-length=10 - - -[TYPECHECK] - -# List of decorators that produce context managers, such as -# contextlib.contextmanager. Add to this list to register other decorators that -# produce valid context managers. 
-contextmanager-decorators=contextlib.contextmanager,contextlib2.contextmanager - -# Tells whether missing members accessed in mixin class should be ignored. A -# mixin class is detected if its name ends with "mixin" (case insensitive). -ignore-mixin-members=yes - -# List of module names for which member attributes should not be checked -# (useful for modules/projects where namespaces are manipulated during runtime -# and thus existing member attributes cannot be deduced by static analysis. It -# supports qualified module names, as well as Unix pattern matching. -ignored-modules= - -# List of class names for which member attributes should not be checked (useful -# for classes with dynamically set attributes). This supports the use of -# qualified names. -ignored-classes=optparse.Values,thread._local,_thread._local - -# List of members which are set dynamically and missed by pylint inference -# system, and so shouldn't trigger E1101 when accessed. Python regular -# expressions are accepted. -generated-members= - - -[FORMAT] - -# Maximum number of characters on a single line. -max-line-length=80 - -# TODO(https://github.com/PyCQA/pylint/issues/3352): Direct pylint to exempt -# lines made too long by directives to pytype. - -# Regexp for a line that is allowed to be longer than the limit. -ignore-long-lines=(?x)( - ^\s*(\#\ )??$| - ^\s*(from\s+\S+\s+)?import\s+.+$) - -# Allow the body of an if to be on the same line as the test if there is no -# else. -single-line-if-stmt=yes - -# List of optional constructs for which whitespace checking is disabled. `dict- -# separator` is used to allow tabulation in dicts, etc.: {1 : 1,\n222: 2}. -# `trailing-comma` allows a space between comma and closing bracket: (a, ). -# `empty-line` allows space-only lines. -no-space-check= - -# Maximum number of lines in a module -max-module-lines=99999 - -# String used as indentation unit. The internal Google style guide mandates 2 -# spaces. Google's externaly-published style guide says 4, consistent with -# PEP 8. Here, we use 2 spaces, for conformity with many open-sourced Google -# projects (like TensorFlow). -indent-string=' ' - -# Number of spaces of indent required inside a hanging or continued line. -indent-after-paren=4 - -# Expected format of line ending, e.g. empty (any line ending), LF or CRLF. -expected-line-ending-format= - - -[MISCELLANEOUS] - -# List of note tags to take in consideration, separated by a comma. -notes=TODO - - -[STRING] - -# This flag controls whether inconsistent-quotes generates a warning when the -# character used as a quote delimiter is used inconsistently within a module. -check-quote-consistency=yes - - -[VARIABLES] - -# Tells whether we should check for unused import in __init__ files. -init-import=no - -# A regular expression matching the name of dummy variables (i.e. expectedly -# not used). -dummy-variables-rgx=^\*{0,2}(_$|unused_|dummy_) - -# List of additional names supposed to be defined in builtins. Remember that -# you should avoid to define new builtins when possible. -additional-builtins= - -# List of strings which can identify a callback function by name. A callback -# name must start or end with one of those strings. -callbacks=cb_,_cb - -# List of qualified module names which can have objects that can redefine -# builtins. 
-redefining-builtins-modules=six,six.moves,past.builtins,future.builtins,functools - - -[LOGGING] - -# Logging modules to check that the string format arguments are in logging -# function parameter format -logging-modules=logging,absl.logging,tensorflow.io.logging - - -[SIMILARITIES] - -# Minimum lines number of a similarity. -min-similarity-lines=4 - -# Ignore comments when computing similarities. -ignore-comments=yes - -# Ignore docstrings when computing similarities. -ignore-docstrings=yes - -# Ignore imports when computing similarities. -ignore-imports=no - - -[SPELLING] - -# Spelling dictionary name. Available dictionaries: none. To make it working -# install python-enchant package. -spelling-dict= - -# List of comma separated words that should not be checked. -spelling-ignore-words= - -# A path to a file that contains private dictionary; one word per line. -spelling-private-dict-file= - -# Tells whether to store unknown words to indicated private dictionary in -# --spelling-private-dict-file option instead of raising a message. -spelling-store-unknown-words=no - - -[IMPORTS] - -# Deprecated modules which should not be used, separated by a comma -deprecated-modules=regsub, - TERMIOS, - Bastion, - rexec, - sets - -# Create a graph of every (i.e. internal and external) dependencies in the -# given file (report RP0402 must not be disabled) -import-graph= - -# Create a graph of external dependencies in the given file (report RP0402 must -# not be disabled) -ext-import-graph= - -# Create a graph of internal dependencies in the given file (report RP0402 must -# not be disabled) -int-import-graph= - -# Force import order to recognize a module as part of the standard -# compatibility libraries. -known-standard-library= - -# Force import order to recognize a module as part of a third party library. -known-third-party=enchant, absl - -# Analyse import fallback blocks. This can be used to support both Python 2 and -# 3 compatible code, which means that the block might have code that exists -# only in one or another interpreter, leading to false positives when analysed. -analyse-fallback-blocks=no - - -[CLASSES] - -# List of method names used to declare (i.e. assign) instance attributes. -defining-attr-methods=__init__, - __new__, - setUp - -# List of member names, which should be excluded from the protected access -# warning. -exclude-protected=_asdict, - _fields, - _replace, - _source, - _make - -# List of valid names for the first argument in a class method. -valid-classmethod-first-arg=cls, - class_ - -# List of valid names for the first argument in a metaclass class method. -valid-metaclass-classmethod-first-arg=mcs - - -[EXCEPTIONS] - -# Exceptions that will emit a warning when being caught. 
Defaults to -# "Exception" -overgeneral-exceptions=StandardError, - Exception, - BaseException diff --git a/benchmarks/perf-tool/.style.yapf b/benchmarks/perf-tool/.style.yapf deleted file mode 100644 index 39b663a7a..000000000 --- a/benchmarks/perf-tool/.style.yapf +++ /dev/null @@ -1,10 +0,0 @@ -[style] -COLUMN_LIMIT: 80 -DEDENT_CLOSING_BRACKETS: True -INDENT_DICTIONARY_VALUE: True -SPLIT_ALL_COMMA_SEPARATED_VALUES: True -SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED: True -SPLIT_BEFORE_CLOSING_BRACKET: True -SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN: True -SPLIT_BEFORE_FIRST_ARGUMENT: True -SPLIT_BEFORE_NAMED_ASSIGNS: True diff --git a/benchmarks/perf-tool/README.md b/benchmarks/perf-tool/README.md deleted file mode 100644 index 36f76bcdb..000000000 --- a/benchmarks/perf-tool/README.md +++ /dev/null @@ -1,449 +0,0 @@ -# IMPORTANT NOTE: No new features will be added to this tool . This tool is currently in maintanence mode. All new features will be added to [vector search workload]( https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch) - -# OpenSearch k-NN Benchmarking -- [Welcome!](#welcome) -- [Install Prerequisites](#install-prerequisites) -- [Usage](#usage) -- [Contributing](#contributing) - -## Welcome! - -This directory contains the code related to benchmarking the k-NN plugin. -Benchmarks can be run against any OpenSearch cluster with the k-NN plugin -installed. Benchmarks are highly configurable using the test configuration -file. - -## Install Prerequisites - -### Setup - -K-NN perf requires Python 3.8 or greater to be installed. One of -the easier ways to do this is through Conda, a package and environment -management system for Python. - -First, follow the -[installation instructions](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) -to install Conda on your system. - -Next, create a Python 3.8 environment: -``` -conda create -n knn-perf python=3.8 -``` - -After the environment is created, activate it: -``` -source activate knn-perf -``` - -Lastly, clone the k-NN repo and install all required python packages: -``` -git clone https://github.com/opensearch-project/k-NN.git -cd k-NN/benchmarks/perf-tool -pip install -r requirements.txt -``` - -After all of this completes, you should be ready to run your first performance benchmarks! - - -## Usage - -### Quick Start - -In order to run a benchmark, you must first create a test configuration yml -file. Checkout [this example](https://github.com/opensearch-project/k-NN/blob/main/benchmarks/perf-tool/sample-configs) file -for benchmarking *faiss*'s IVF method. This file contains the definition for -the benchmark that you want to run. At the top are -[test parameters](#test-parameters). These define high level settings of the -test, such as the endpoint of the OpenSearch cluster. - -Next, you define the actions that the test will perform. These actions are -referred to as steps. First, you can define "setup" steps. These are steps that -are run once at the beginning of the execution to configure the cluster how you -want it. These steps do not contribute to the final metrics. - -After that, you define the "steps". These are the steps that the test will be -collecting metrics on. Each step emits certain metrics. These are run -multiple times, depending on the test parameter "num_runs". At the end of the -execution of all of the runs, the metrics from each run are collected and -averaged. - -Lastly, you define the "cleanup" steps. 
The "cleanup" steps are executed after -each test run. For instance, if you are measuring index performance, you may -want to delete the index after each run. - -To run the test, execute the following command: -``` -python knn-perf-tool.py [--log LOGLEVEL] test config-path.yml output.json - ---log log level of tool, options are: info, debug, warning, error, critical -``` - -The output will be a json document containing the results. - -Additionally, you can get the difference between two test runs using the diff -command: -``` -python knn-perf-tool.py [--log LOGLEVEL] diff result1.json result2.json - ---log log level of tool, options are: info, debug, warning, error, critical -``` - -The output will be the delta between the two metrics. - -### Test Parameters - -| Parameter Name | Description | Default | -|----------------|------------------------------------------------------------------------------------|------------| -| endpoint | Endpoint OpenSearch cluster is running on | localhost | -| port | Port on which OpenSearch Cluster is running on | 9200 | -| test_name | Name of test | No default | -| test_id | String ID of test | No default | -| num_runs | Number of runs to execute steps | 1 | -| show_runs | Whether to output each run in addition to the total summary | false | -| setup | List of steps to run once before metric collection starts | [] | -| steps | List of steps that make up one test run. Metrics will be collected on these steps. | No default | -| cleanup | List of steps to run after each test run | [] | - -### Steps - -Included are the list of steps that are currently supported. Each step contains -a set of parameters that are passed in the test configuration file and a set -of metrics that the test produces. - -#### create_index - -Creates an OpenSearch index. - -##### Parameters -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| index_name | Name of index to create | No default | -| index_spec | Path to index specification | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end. | ms | - -#### disable_refresh - -Disables refresh for all indices in the cluster. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end. | ms | - -#### refresh_index - -Refreshes an OpenSearch index. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| index_name | Name of index to refresh | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end. | ms | -| store_kb | Size of index after refresh completes | KB | - -#### force_merge - -Force merges an index to a specified number of segments. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| index_name | Name of index to force merge | No default | -| max_num_segments | Number of segments to force merge to | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end. | ms | - -#### train_model - -Trains a model. 
- -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| model_id | Model id to set | Test | -| train_index | Index to pull training data from | No default | -| train_field | Field to pull training data from | No default | -| dimension | Dimension of model | No default | -| description | Description of model | No default | -| max_training_vector_count | Number of training vectors to used | No default | -| method_spec | Path to method specification | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end | ms | - -#### delete_model - -Deletes a model from the cluster. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| model_id | Model id to delete | Test | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end | ms | - -#### delete_index - -Deletes an index from the cluster. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| index_name | Name of index to delete | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Time to execute step end to end | ms | - -#### ingest - -Ingests a dataset of vectors into the cluster. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| index_name | Name of index to ingest into | No default | -| field_name | Name of field to ingest into | No default | -| bulk_size | Documents per bulk request | 300 | -| dataset_format | Format the data-set is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| dataset_path | Path to data-set | No default | -| doc_count | Number of documents to create from data-set | Size of the data-set | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Total time to ingest the dataset into the index.| ms | - -#### ingest_multi_field - -Ingests a dataset of multiple context types into the cluster. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- |-----------------------------------------------------------------------------------------------------------------------------------------------------------| ----------- | -| index_name | Name of index to ingest into | No default | -| field_name | Name of field to ingest into | No default | -| bulk_size | Documents per bulk request | 300 | -| dataset_path | Path to data-set | No default | -| doc_count | Number of documents to create from data-set | Size of the data-set | -| attributes_dataset_name | Name of dataset with additional attributes inside the main dataset | No default | -| attribute_spec | Definition of attributes, format is: [{ name: [name_val], type: [type_val]}] Order is important and must match order of attributes column in dataset file | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Total time to ingest the dataset into the index.| ms | - -#### ingest_nested_field - -Ingests a dataset with nested field into the cluster. 
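
Documents are built by grouping consecutive vectors that share the same parent doc id into a single parent document with a nested vector field. A minimal sketch of that grouping is shown below; the field names and payload shape are placeholders meant to illustrate the idea, not the tool's exact bulk output.

```
# Hypothetical illustration: group vectors by parent_id into one bulk action
# per parent document, with the vectors stored under a nested field.
vectors = [
    {"parent_id": 1, "vector": [0.1, 0.2, 0.3]},
    {"parent_id": 1, "vector": [0.4, 0.5, 0.6]},
    {"parent_id": 2, "vector": [0.7, 0.8, 0.9]},
]

grouped = {}
for entry in vectors:
    grouped.setdefault(entry["parent_id"], []).append(
        {"nested_vector_field": entry["vector"]}    # placeholder field name
    )

bulk_body = []
for parent_id, nested_docs in grouped.items():
    bulk_body.append({"index": {"_index": "target-index", "_id": parent_id}})
    bulk_body.append({"nested_field": nested_docs})  # placeholder field name
```
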
- -##### Parameters - -| Parameter Name | Description | Default | -| ----------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------- | -| index_name | Name of index to ingest into | No default | -| field_name | Name of field to ingest into | No default | -| dataset_path | Path to data-set | No default | -| attributes_dataset_name | Name of dataset with additional attributes inside the main dataset | No default | -| attribute_spec | Definition of attributes, format is: [{ name: [name_val], type: [type_val]}] Order is important and must match order of attributes column in dataset file. It should contains { name: 'parent_id', type: 'int'} | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Total time to ingest the dataset into the index.| ms | - -#### query - -Runs a set of queries against an index. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- | ----------- | ----------- | -| k | Number of neighbors to return on search | 100 | -| r | r value in Recall@R | 1 | -| index_name | Name of index to search | No default | -| field_name | Name field to search | No default | -| calculate_recall | Whether to calculate recall values | False | -| dataset_format | Format the dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| dataset_path | Path to dataset | No default | -| neighbors_format | Format the neighbors dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| neighbors_path | Path to neighbors dataset | No default | -| query_count | Number of queries to create from data-set | Size of the data-set | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- |---------------------------------------------------------------------------------------------------------| ----------- | -| took | Took times returned per query aggregated as total, p50, p90, p99, p99.9 and p100 (when applicable) | ms | -| memory_kb | Native memory k-NN is using at the end of the query workload | KB | -| recall@R | ratio of top R results from the ground truth neighbors that are in the K results returned by the plugin | float 0.0-1.0 | -| recall@K | ratio of results returned that were ground truth nearest neighbors | float 0.0-1.0 | - -#### query_with_filter - -Runs a set of queries with filter against an index. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| -| k | Number of neighbors to return on search | 100 | -| r | r value in Recall@R | 1 | -| index_name | Name of index to search | No default | -| field_name | Name field to search | No default | -| calculate_recall | Whether to calculate recall values | False | -| dataset_format | Format the dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. 
| 'hdf5' | -| dataset_path | Path to dataset | No default | -| neighbors_format | Format the neighbors dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| neighbors_path | Path to neighbors dataset | No default | -| neighbors_dataset | Name of filter dataset inside the neighbors dataset | No default | -| filter_spec | Path to filter specification | No default | -| filter_type | Type of filter format, we do support following types:
FILTER inner filter format for approximate k-NN search
SCRIPT score scripting with exact k-NN search and pre-filtering
BOOL_POST_FILTER Bool query with post-filtering | SCRIPT | -| score_script_similarity | Similarity function that has been used to index dataset. Used for SCRIPT filter type and ignored for others | l2 | -| query_count | Number of queries to create from data-set | Size of the data-set | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Took times returned per query aggregated as total, p50, p90 and p99 (when applicable) | ms | -| memory_kb | Native memory k-NN is using at the end of the query workload | KB | -| recall@R | ratio of top R results from the ground truth neighbors that are in the K results returned by the plugin | float 0.0-1.0 | -| recall@K | ratio of results returned that were ground truth nearest neighbors | float 0.0-1.0 | - - -#### query_nested_field - -Runs a set of queries with nested field against an index. - -##### Parameters - -| Parameter Name | Description | Default | -| ----------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| -| k | Number of neighbors to return on search | 100 | -| r | r value in Recall@R | 1 | -| index_name | Name of index to search | No default | -| field_name | Name field to search | No default | -| calculate_recall | Whether to calculate recall values | False | -| dataset_format | Format the dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| dataset_path | Path to dataset | No default | -| neighbors_format | Format the neighbors dataset is in. Currently hdf5 and bigann is supported. The hdf5 file must be organized in the same way that the ann-benchmarks organizes theirs. | 'hdf5' | -| neighbors_path | Path to neighbors dataset | No default | -| neighbors_dataset | Name of filter dataset inside the neighbors dataset | No default | -| query_count | Number of queries to create from data-set | Size of the data-set | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- | ----------- | ----------- | -| took | Took times returned per query aggregated as total, p50, p90 and p99 (when applicable) | ms | -| memory_kb | Native memory k-NN is using at the end of the query workload | KB | -| recall@R | ratio of top R results from the ground truth neighbors that are in the K results returned by the plugin | float 0.0-1.0 | -| recall@K | ratio of results returned that were ground truth nearest neighbors | float 0.0-1.0 | - -#### get_stats - -Gets the index stats. 
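
Both counts can be read from the standard index segments API. The snippet below is a rough sketch of that lookup (client setup and index name are placeholders; the tool's own implementation may differ):

```
# Illustrative: sum committed/search segment counts from GET /<index>/_segments.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
response = client.indices.segments(index="target-index")

num_committed_segments = 0
num_search_segments = 0
for shard_copies in response["indices"]["target-index"]["shards"].values():
    for shard in shard_copies:
        num_committed_segments += shard["num_committed_segments"]
        num_search_segments += shard["num_search_segments"]
```
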
- -##### Parameters - -| Parameter Name | Description | Default | -| ----------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| -| index_name | Name of index to search | No default | - -##### Metrics - -| Metric Name | Description | Unit | -| ----------- |-------------------------------------------------|------------| -| num_of_committed_segments | Total number of commited segments in the index | integer >= 0 | -| num_of_search_segments | Total number of search segments in the index | integer >= 0 | - -### Data sets - -This benchmark tool uses pre-generated data sets to run indexing and query workload. For some benchmark types existing dataset need to be -extended. Filtering is an example of use case where such dataset extension is needed. - -It's possible to use script provided with this repo to generate dataset and run benchmark for filtering queries. -You need to have existing dataset with vector data. This dataset will be used to generate additional attribute data and set of ground truth neighbours document ids. - -To generate dataset with attributes based on vectors only dataset use following command pattern: - -```commandline -python add-filters-to-dataset.py True False -``` - -To generate neighbours dataset for different filters based on dataset with attributes use following command pattern: - -```commandline -python add-filters-to-dataset.py False True -``` - -After that new dataset(s) can be referred from testcase definition in `ingest_extended` and `query_with_filter` steps. - -To generate dataset with parent doc id based on vectors only dataset, use following command pattern: -```commandline -python add-parent-doc-id-to-dataset.py -``` -This will generate neighbours dataset as well. This new dataset(s) can be referred from testcase definition in `ingest_nested_field` and `query_nested_field` steps. - -## Contributing - -### Linting - -Use pylint to lint the code: -``` -pylint knn-perf-tool.py okpt/**/*.py okpt/**/**/*.py -``` - -### Formatting - -We use yapf and the google style to format our code. After installing yapf, you can format your code by running: - -``` -yapf --style google knn-perf-tool.py okpt/**/*.py okpt/**/**/*.py -``` - -### Updating requirements - -Add new requirements to "requirements.in" and run `pip-compile` diff --git a/benchmarks/perf-tool/add-filters-to-dataset.py b/benchmarks/perf-tool/add-filters-to-dataset.py deleted file mode 100644 index 0624f7323..000000000 --- a/benchmarks/perf-tool/add-filters-to-dataset.py +++ /dev/null @@ -1,200 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -""" -Script builds complex dataset with additional attributes from exiting dataset that has only vectors. -Additional attributes are predefined in the script: color, taste, age. Only HDF5 format of vector dataset is supported. - -Output dataset file will have additional dataset 'attributes' with multiple columns, each column corresponds to one attribute -from an attribute set, and value is generated at random, e.g.: - -0: green None 71 -1: green bitter 28 - -there is no explicit index reference in 'attributes' dataset, index of the row corresponds to a document id. 
-For instance, in example above two rows of fields mapped to documents with ids '0' and '1'. - -If 'generate_filters' flag is set script generates additional dataset of neighbours (ground truth) for each filter type. -Output is a new file with several datasets, each dataset corresponds to one filter. Datasets are named 'neighbour_filter_X' -where X is 1 based index of particular filter. -Each dataset has rows with array of integers, where integer corresponds to -a document id from original dataset with additional fields. Array ca have -1 values that are treated as null, this is because -subset of filtered documents is same of smaller than original set. - -For example, dataset file content may look like : - -neighbour_filter_1: [[ 2, 5, -1], - [ 3, 1, -1], - [ 2 5, 7]] -neighbour_filter_2: [[-1, -1, -1], - [ 5, 6, -1], - [ 4, 2, 1]] - -In this case we do have datasets for two filters, 3 query results for each. [2, 5, -1] indicates that for first query -if filter 1 is used most similar document is with id 2, next similar is 5, and the rest do not pass filter 1 criteria. - -Example of script usage: - - create new hdf5 file with attribute dataset - add-filters-to-dataset.py ~/dev/opensearch/k-NN/benchmarks/perf-tool/dataset/data.hdf5 ~/dev/opensearch/datasets/data-with-attr True False - - create new hdf5 file with filter datasets - add-filters-to-dataset.py ~/dev/opensearch/k-NN/benchmarks/perf-tool/dataset/data-with-attr.hdf5 ~/dev/opensearch/datasets/data-with-filters False True -""" - -import getopt -import os -import random -import sys - -import h5py - -from osb.extensions.data_set import HDF5DataSet - - -class _Dataset: - """Type of dataset container for data with additional attributes""" - DEFAULT_TYPE = HDF5DataSet.FORMAT_NAME - - def create_dataset(self, source_dataset_path, out_file_path, generate_attrs: bool, generate_filters: bool) -> None: - path_elements = os.path.split(os.path.abspath(source_dataset_path)) - data_set_dir = path_elements[0] - - # For HDF5, because multiple data sets can be grouped in the same file, - # we will build data sets in memory and not write to disk until - # _flush_data_sets_to_disk is called - # read existing dataset - data_hdf5 = os.path.join(os.path.dirname(os.path.realpath('/')), source_dataset_path) - - with h5py.File(data_hdf5, "r") as hf: - - if generate_attrs: - data_set_w_attr = self.create_dataset_file(out_file_path, self.DEFAULT_TYPE, data_set_dir) - - possible_colors = ['red', 'green', 'yellow', 'blue', None] - possible_tastes = ['sweet', 'salty', 'sour', 'bitter', None] - max_age = 100 - - for key in hf.keys(): - if key not in ['neighbors', 'test', 'train']: - continue - data_set_w_attr.create_dataset(key, data=hf[key][()]) - - attributes = [] - for i in range(len(hf['train'])): - attr = [random.choice(possible_colors), random.choice(possible_tastes), - random.randint(0, max_age + 1)] - attributes.append(attr) - - data_set_w_attr.create_dataset('attributes', (len(attributes), 3), 'S10', data=attributes) - - data_set_w_attr.flush() - data_set_w_attr.close() - - if generate_filters: - attributes = hf['attributes'][()] - expected_neighbors = hf['neighbors'][()] - - data_set_filters = self.create_dataset_file(out_file_path, self.DEFAULT_TYPE, data_set_dir) - - def filter1(attributes, vector_idx): - if attributes[vector_idx][0].decode() == 'red' and int(attributes[vector_idx][2].decode()) >= 20: - return True - else: - return False - - self.apply_filter(expected_neighbors, attributes, data_set_filters, 'neighbors_filter_1', filter1) - - # filter 2 
- color = blue or None and taste = 'salty' - def filter2(attributes, vector_idx): - if (attributes[vector_idx][0].decode() == 'blue' or attributes[vector_idx][ - 0].decode() == 'None') and attributes[vector_idx][1].decode() == 'salty': - return True - else: - return False - - self.apply_filter(expected_neighbors, attributes, data_set_filters, 'neighbors_filter_2', filter2) - - # filter 3 - color and taste are not None and age is between 20 and 80 - def filter3(attributes, vector_idx): - if attributes[vector_idx][0].decode() != 'None' and attributes[vector_idx][ - 1].decode() != 'None' and 20 <= \ - int(attributes[vector_idx][2].decode()) <= 80: - return True - else: - return False - - self.apply_filter(expected_neighbors, attributes, data_set_filters, 'neighbors_filter_3', filter3) - - # filter 4 - color green or blue and taste is bitter and age is between (30, 60) - def filter4(attributes, vector_idx): - if (attributes[vector_idx][0].decode() == 'green' or attributes[vector_idx][0].decode() == 'blue') \ - and (attributes[vector_idx][1].decode() == 'bitter') \ - and 30 <= int(attributes[vector_idx][2].decode()) <= 60: - return True - else: - return False - - self.apply_filter(expected_neighbors, attributes, data_set_filters, 'neighbors_filter_4', filter4) - - # filter 5 color is (green or blue or yellow) or taste = sweet or age is between (30, 70) - def filter5(attributes, vector_idx): - if attributes[vector_idx][0].decode() == 'green' or attributes[vector_idx][0].decode() == 'blue' \ - or attributes[vector_idx][0].decode() == 'yellow' \ - or attributes[vector_idx][1].decode() == 'sweet' \ - or 30 <= int(attributes[vector_idx][2].decode()) <= 70: - return True - else: - return False - - self.apply_filter(expected_neighbors, attributes, data_set_filters, 'neighbors_filter_5', filter5) - - data_set_filters.flush() - data_set_filters.close() - - def apply_filter(self, expected_neighbors, attributes, data_set_w_filtering, filter_name, filter_func): - neighbors_filter = [] - filtered_count = 0 - for expected_neighbors_row in expected_neighbors: - neighbors_filter_row = [-1] * len(expected_neighbors_row) - idx = 0 - for vector_idx in expected_neighbors_row: - if filter_func(attributes, vector_idx): - neighbors_filter_row[idx] = vector_idx - idx += 1 - filtered_count += 1 - neighbors_filter.append(neighbors_filter_row) - overall_count = len(expected_neighbors) * len(expected_neighbors[0]) - perc = float(filtered_count / overall_count) * 100 - print('ground truth size for {} is {}, percentage {}'.format(filter_name, filtered_count, perc)) - data_set_w_filtering.create_dataset(filter_name, data=neighbors_filter) - return expected_neighbors - - def create_dataset_file(self, file_name, extension, data_set_dir) -> h5py.File: - data_set_file_name = "{}.{}".format(file_name, extension) - data_set_path = os.path.join(data_set_dir, data_set_file_name) - - data_set_w_filtering = h5py.File(data_set_path, 'a') - - return data_set_w_filtering - - -def main(argv): - opts, args = getopt.getopt(argv, "") - in_file_path = args[0] - out_file_path = args[1] - generate_attr = str2bool(args[2]) - generate_filters = str2bool(args[3]) - - worker = _Dataset() - worker.create_dataset(in_file_path, out_file_path, generate_attr, generate_filters) - - -def str2bool(v): - return v.lower() in ("yes", "true", "t", "1") - - -if __name__ == "__main__": - main(sys.argv[1:]) diff --git a/benchmarks/perf-tool/add-parent-doc-id-to-dataset.py b/benchmarks/perf-tool/add-parent-doc-id-to-dataset.py deleted file mode 100644 index 
a4acafd03..000000000 --- a/benchmarks/perf-tool/add-parent-doc-id-to-dataset.py +++ /dev/null @@ -1,291 +0,0 @@ -# Copyright OpenSearch Contributors -# SPDX-License-Identifier: Apache-2.0 - -""" -Script builds a complex dataset with additional attributes from an existing dataset that has only vectors. -Additional attributes are predefined in the script: color, taste, age, and parent doc id. Only the HDF5 format of vector dataset is supported. - -The output dataset file will have an additional dataset 'attributes' with multiple columns; each column corresponds to one attribute -from an attribute set, and each value is generated at random, e.g.: - -0: green None 71 1 -1: green bitter 28 1 -2: green bitter 28 1 -3: green bitter 28 2 -... - -There is no explicit index reference in the 'attributes' dataset; the index of the row corresponds to a document id. -For instance, in the example above the first two rows of fields map to documents with ids '0' and '1'. - -The parent doc ids are assigned in non-decreasing order. - -If the 'generate_filters' flag is set, the script generates additional datasets of neighbours (ground truth). -Output is a new file with three datasets, each of which corresponds to a certain type of query. -Dataset 'neighbour_nested' is the ground truth for a query without filtering. -Dataset 'neighbour_relaxed' is the ground truth for a query with filtering of (30 <= age <= 70) or color in ["green", "blue", "yellow"] or taste in ["sweet"] -Dataset 'neighbour_restricted' is the ground truth for a query with filtering of (30 <= age <= 60) and color in ["green", "blue"] and taste in ["bitter"] - - -Each dataset has rows with an array of integers, where each integer corresponds to -a document id from the original dataset with additional fields. - -Example of script usage: - - create new hdf5 file with attribute dataset - add-parent-doc-id-to-dataset.py ~/dev/opensearch/k-NN/benchmarks/perf-tool/dataset/data.hdf5 ~/dev/opensearch/datasets/data-nested.hdf5 - -""" -import getopt -import multiprocessing -import random -import sys -from multiprocessing import Process -from typing import cast -import traceback - -import h5py -import numpy as np - - -class MyVector: - def __init__(self, vector, id, color=None, taste=None, age=None, parent_id=None): - self.vector = vector - self.id = id - self.age = age - self.color = color - self.taste = taste - self.parent_id = parent_id - - def apply_restricted_filter(self): - return (30 <= self.age <= 60) and self.color in ["green", "blue"] and self.taste in ["bitter"] - - def apply_relaxed_filter(self): - return (30 <= self.age <= 70) or self.color in ["green", "blue", "yellow"] or self.taste in ["sweet"] - - def __str__(self): - return f'Vector : {self.vector}, id : {self.id}, color: {self.color}, taste: {self.taste}, age: {self.age}, parent_id: {self.parent_id}\n' - - def __repr__(self): - return f'Vector : {self.vector}, id : {self.id}, color: {self.color}, taste: {self.taste}, age: {self.age}, parent_id: {self.parent_id}\n' - -class HDF5DataSet: - def __init__(self, file_path, key): - self.file_name = file_path - self.file = h5py.File(self.file_name) - self.key = key - self.data = cast(h5py.Dataset, self.file[key]) - self.metadata = None - self.metadata = cast(h5py.Dataset, self.file["attributes"]) if key == "train" else None - print(f'Keys in the file are {self.file.keys()}') - - def read(self, start, end=None): - if end is None: - end = self.data.len() - values = cast(np.ndarray, self.data[start:end]) - metadata = cast(list, self.metadata[start:end]) if self.metadata is not None else None - 
if metadata is not None: - print(metadata) - vectors = [] - i = 0 - for value in values: - if self.metadata is None: - vector = MyVector(value, i) - else: - # color, taste, age, and parent id - vector = MyVector(value, i, str(metadata[i][0].decode()), str(metadata[i][1].decode()), - int(metadata[i][2]), int(metadata[i][3])) - vectors.append(vector) - i = i + 1 - return vectors - - def read_neighbors(self, start, end): - return cast(np.ndarray, self.data[start:end]) - - def size(self): - return self.data.len() - - def close(self): - self.file.close() - -class _Dataset: - def run(self, source_path, target_path) -> None: - # Add attributes - print(f'Adding attributes started.') - with h5py.File(source_path, "r") as in_file: - out_file = h5py.File(target_path, "w") - possible_colors = ['red', 'green', 'yellow', 'blue', None] - possible_tastes = ['sweet', 'salty', 'sour', 'bitter', None] - max_age = 100 - min_field_size = 10 - max_field_size = 10 - - # Copy train and test data - for key in in_file.keys(): - if key not in ['test', 'train']: - continue - out_file.create_dataset(key, data=in_file[key][()]) - - # Generate attributes - attributes = [] - field_size = random.randint(min_field_size, max_field_size) - parent_id = 1 - field_count = 0 - for i in range(len(in_file['train'])): - attr = [random.choice(possible_colors), random.choice(possible_tastes), - random.randint(0, max_age + 1), parent_id] - attributes.append(attr) - field_count += 1 - if field_count >= field_size: - field_size = random.randint(min_field_size, max_field_size) - field_count = 0 - parent_id += 1 - out_file.create_dataset('attributes', (len(attributes), 4), 'S10', data=attributes) - - out_file.flush() - out_file.close() - - print(f'Adding attributes completed.') - - - # Calculate ground truth - print(f'Calculating ground truth started.') - cpus = multiprocessing.cpu_count() - total_clients = min(8, cpus) # 1 # 10 - hdf5Data_train = HDF5DataSet(target_path, "train") - train_vectors = hdf5Data_train.read(0, hdf5Data_train.size()) - hdf5Data_train.close() - print(f'Train vector size: {len(train_vectors)}') - - hdf5Data_test = HDF5DataSet(target_path, "test") - total_queries = hdf5Data_test.size() # 10000 - dis = [] * total_queries - - for i in range(total_queries): - dis.insert(i, []) - - queries_per_client = int(total_queries / total_clients + 0.5) - if queries_per_client == 0: - queries_per_client = total_queries - - processes = [] - test_vectors = hdf5Data_test.read(0, total_queries) - hdf5Data_test.close() - tasks_that_are_done = multiprocessing.Queue() - for client in range(total_clients): - start_index = int(client * queries_per_client) - if start_index + queries_per_client <= total_queries: - end_index = int(start_index + queries_per_client) - else: - end_index = total_queries - - print(f'Start Index: {start_index}, end Index: {end_index}') - print(f'client is : {client}') - p = Process(target=queryTask, args=( - train_vectors, test_vectors, start_index, end_index, client, total_queries, tasks_that_are_done)) - processes.append(p) - p.start() - if end_index >= total_queries: - print(f'Exiting end Index : {end_index} total_queries: {total_queries}') - break - - # wait for tasks to be completed - print('Waiting for all tasks to be completed') - j = 0 - # This is required because threads can hang if the data sent from the sub process increases by a certain limit - # https://stackoverflow.com/questions/21641887/python-multiprocessing-process-hangs-on-join-for-large-queue - while j < total_queries: - while not 
tasks_that_are_done.empty(): - calculatedDis = tasks_that_are_done.get() - i = 0 - for d in calculatedDis: - if d: - dis[i] = d - j = j + 1 - i = i + 1 - - for p in processes: - if p.is_alive(): - p.join() - else: - print("Process was not alive hence shutting down") - - data_set_file = h5py.File(target_path, "a") - for type in ['nested', 'relaxed', 'restricted']: - results = [] - for d in dis: - r = [] - for i in range(min(10000, len(d[type]))): - r.append(d[type][i]['id']) - results.append(r) - - - data_set_file.create_dataset("neighbour_" + type, (len(results), len(results[0])), data=results) - data_set_file.flush() - data_set_file.close() - -def calculateL2Distance(point1, point2): - return np.linalg.norm(point1 - point2) - - -def queryTask(train_vectors, test_vectors, startIndex, endIndex, process_number, total_queries, tasks_that_are_done): - print(f'Starting Process number : {process_number}') - all_distances = [] * total_queries - for i in range(total_queries): - all_distances.insert(i, {}) - try: - test_vectors = test_vectors[startIndex:endIndex] - i = startIndex - for test in test_vectors: - distances = [] - values = {} - for value in train_vectors: - values[value.id] = value - distances.append({ - "dis": calculateL2Distance(test.vector, value.vector), - "id": value.parent_id - }) - - distances.sort(key=lambda vector: vector['dis']) - seen_set_nested = set() - seen_set_restricted = set() - seen_set_relaxed = set() - nested = [] - restricted = [] - relaxed = [] - for sub_i in range(len(distances)): - id = distances[sub_i]['id'] - # Check if the number has been seen before - if len(nested) < 1000 and id not in seen_set_nested: - # If not seen before, mark it as seen - seen_set_nested.add(id) - nested.append(distances[sub_i]) - if len(restricted) < 1000 and id not in seen_set_restricted and values[id].apply_restricted_filter(): - seen_set_restricted.add(id) - restricted.append(distances[sub_i]) - if len(relaxed) < 1000 and id not in seen_set_relaxed and values[id].apply_relaxed_filter(): - seen_set_relaxed.add(id) - relaxed.append(distances[sub_i]) - - all_distances[i]['nested'] = nested - all_distances[i]['restricted'] = restricted - all_distances[i]['relaxed'] = relaxed - print(f"Process {process_number} queries completed: {i + 1 - startIndex}, queries left: {endIndex - i - 1}") - i = i + 1 - except: - print( - f"Got exception while running the thread: {process_number} with startIndex: {startIndex} endIndex: {endIndex} ") - traceback.print_exc() - tasks_that_are_done.put(all_distances) - print(f'Exiting Process number : {process_number}') - - -def main(argv): - opts, args = getopt.getopt(argv, "") - in_file_path = args[0] - out_file_path = args[1] - - worker = _Dataset() - worker.run(in_file_path, out_file_path) - -if __name__ == "__main__": - main(sys.argv[1:]) \ No newline at end of file diff --git a/benchmarks/perf-tool/dataset/data-nested.hdf5 b/benchmarks/perf-tool/dataset/data-nested.hdf5 deleted file mode 100644 index 4223d7281..000000000 Binary files a/benchmarks/perf-tool/dataset/data-nested.hdf5 and /dev/null differ diff --git a/benchmarks/perf-tool/dataset/data-with-attr-with-filters.hdf5 b/benchmarks/perf-tool/dataset/data-with-attr-with-filters.hdf5 deleted file mode 100644 index 01df75f83..000000000 Binary files a/benchmarks/perf-tool/dataset/data-with-attr-with-filters.hdf5 and /dev/null differ diff --git a/benchmarks/perf-tool/dataset/data-with-attr.hdf5 b/benchmarks/perf-tool/dataset/data-with-attr.hdf5 deleted file mode 100644 index 22873b06c..000000000 Binary 
files a/benchmarks/perf-tool/dataset/data-with-attr.hdf5 and /dev/null differ diff --git a/benchmarks/perf-tool/dataset/data.hdf5 b/benchmarks/perf-tool/dataset/data.hdf5 deleted file mode 100644 index c9268606d..000000000 Binary files a/benchmarks/perf-tool/dataset/data.hdf5 and /dev/null differ diff --git a/benchmarks/perf-tool/knn-perf-tool.py b/benchmarks/perf-tool/knn-perf-tool.py deleted file mode 100644 index 48eedc427..000000000 --- a/benchmarks/perf-tool/knn-perf-tool.py +++ /dev/null @@ -1,10 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -"""Script for user to run the testing tool.""" - -import okpt.main - -okpt.main.main() diff --git a/benchmarks/perf-tool/okpt/__init__.py b/benchmarks/perf-tool/okpt/__init__.py deleted file mode 100644 index c3bffc54c..000000000 --- a/benchmarks/perf-tool/okpt/__init__.py +++ /dev/null @@ -1,6 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - diff --git a/benchmarks/perf-tool/okpt/diff/diff.py b/benchmarks/perf-tool/okpt/diff/diff.py deleted file mode 100644 index 23f424ab9..000000000 --- a/benchmarks/perf-tool/okpt/diff/diff.py +++ /dev/null @@ -1,142 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Provides the Diff class.""" - -from enum import Enum -from typing import Any, Dict, Tuple - - -class InvalidTestResultsError(Exception): - """Exception raised when the test results are invalid. - - The results can be invalid if they have different fields, non-numeric - values, or if they don't follow the standard result format. - """ - def __init__(self, msg: str): - self.message = msg - super().__init__(self.message) - - -def _is_numeric(a) -> bool: - return isinstance(a, (int, float)) - - -class TestResultFields(str, Enum): - METADATA = 'metadata' - RESULTS = 'results' - TEST_PARAMETERS = 'test_parameters' - - -class TestResultNames(str, Enum): - BASE = 'base_result' - CHANGED = 'changed_result' - - -class Diff: - """Diff class for validating and diffing two test result files. - - Methods: - diff: Returns the diff between two test results. (changed - base) - """ - def __init__( - self, - base_result: Dict[str, - Any], - changed_result: Dict[str, - Any], - metadata: bool - ): - """Initializes test results and validate them.""" - self.base_result = base_result - self.changed_result = changed_result - self.metadata = metadata - - # make sure results have proper test result fields - is_valid, key, result = self._validate_keys() - if not is_valid: - raise InvalidTestResultsError( - f'{result} has a missing or invalid key `{key}`.' - ) - - self.base_results = self.base_result[TestResultFields.RESULTS] - self.changed_results = self.changed_result[TestResultFields.RESULTS] - - # make sure results have the same fields - is_valid, key, result = self._validate_structure() - if not is_valid: - raise InvalidTestResultsError( - f'key `{key}` is not present in {result}.' - ) - - # make sure results have numeric values - is_valid, key, result = self._validate_types() - if not is_valid: - raise InvalidTestResultsError( - f'key `{key}` in {result} points to a non-numeric value.' 
- ) - - def _validate_keys(self) -> Tuple[bool, str, str]: - """Ensure both test results have `metadata` and `results` keys.""" - check_keydict = lambda key, res: key in res and isinstance( - res[key], dict) - - # check if results have a `metadata` field and if `metadata` is a dict - if self.metadata: - if not check_keydict(TestResultFields.METADATA, self.base_result): - return (False, TestResultFields.METADATA, TestResultNames.BASE) - if not check_keydict(TestResultFields.METADATA, - self.changed_result): - return ( - False, - TestResultFields.METADATA, - TestResultNames.CHANGED - ) - # check if results have a `results` field and `results` is a dict - if not check_keydict(TestResultFields.RESULTS, self.base_result): - return (False, TestResultFields.RESULTS, TestResultNames.BASE) - if not check_keydict(TestResultFields.RESULTS, self.changed_result): - return (False, TestResultFields.RESULTS, TestResultNames.CHANGED) - return (True, '', '') - - def _validate_structure(self) -> Tuple[bool, str, str]: - """Ensure both test results have the same keys.""" - for k in self.base_results: - if not k in self.changed_results: - return (False, k, TestResultNames.CHANGED) - for k in self.changed_results: - if not k in self.base_results: - return (False, k, TestResultNames.BASE) - return (True, '', '') - - def _validate_types(self) -> Tuple[bool, str, str]: - """Ensure both test results have numeric values.""" - for k, v in self.base_results.items(): - if not _is_numeric(v): - return (False, k, TestResultNames.BASE) - for k, v in self.changed_results.items(): - if not _is_numeric(v): - return (False, k, TestResultNames.BASE) - return (True, '', '') - - def diff(self) -> Dict[str, Any]: - """Return the diff between the two test results. (changed - base)""" - results_diff = { - key: self.changed_results[key] - self.base_results[key] - for key in self.base_results - } - - # add metadata if specified - if self.metadata: - return { - f'{TestResultNames.BASE}_{TestResultFields.METADATA}': - self.base_result[TestResultFields.METADATA], - f'{TestResultNames.CHANGED}_{TestResultFields.METADATA}': - self.changed_result[TestResultFields.METADATA], - 'diff': - results_diff - } - return results_diff diff --git a/benchmarks/perf-tool/okpt/io/args.py b/benchmarks/perf-tool/okpt/io/args.py deleted file mode 100644 index f8c5d8809..000000000 --- a/benchmarks/perf-tool/okpt/io/args.py +++ /dev/null @@ -1,178 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Parses and defines command line arguments for the program. - -Defines the subcommands `test` and `diff` and the corresponding -files that are required by each command. - -Functions: - define_args(): Define the command line arguments. - get_args(): Returns a dictionary of the command line args. 
-""" - -import argparse -import sys -from dataclasses import dataclass -from io import TextIOWrapper -from typing import Union - -_read_type = argparse.FileType('r') -_write_type = argparse.FileType('w') - - -def _add_config(parser, name, **kwargs): - """"Add configuration file path argument.""" - opts = { - 'type': _read_type, - 'help': 'Path of configuration file.', - 'metavar': 'config_path', - **kwargs, - } - parser.add_argument(name, **opts) - - -def _add_result(parser, name, **kwargs): - """"Add results files paths argument.""" - opts = { - 'type': _read_type, - 'help': 'Path of one result file.', - 'metavar': 'result_path', - **kwargs, - } - parser.add_argument(name, **opts) - - -def _add_results(parser, name, **kwargs): - """"Add results files paths argument.""" - opts = { - 'nargs': '+', - 'type': _read_type, - 'help': 'Paths of result files.', - 'metavar': 'result_paths', - **kwargs, - } - parser.add_argument(name, **opts) - - -def _add_output(parser, name, **kwargs): - """"Add output file path argument.""" - opts = { - 'type': _write_type, - 'help': 'Path of output file.', - 'metavar': 'output_path', - **kwargs, - } - parser.add_argument(name, **opts) - - -def _add_metadata(parser, name, **kwargs): - opts = { - 'action': 'store_true', - **kwargs, - } - parser.add_argument(name, **opts) - - -def _add_test_cmd(subparsers): - test_parser = subparsers.add_parser('test') - _add_config(test_parser, 'config') - _add_output(test_parser, 'output') - - -def _add_diff_cmd(subparsers): - diff_parser = subparsers.add_parser('diff') - _add_metadata(diff_parser, '--metadata') - _add_result( - diff_parser, - 'base_result', - help='Base test result.', - metavar='base_result' - ) - _add_result( - diff_parser, - 'changed_result', - help='Changed test result.', - metavar='changed_result' - ) - _add_output(diff_parser, '--output', default=sys.stdout) - - -@dataclass -class TestArgs: - log: str - command: str - config: TextIOWrapper - output: TextIOWrapper - - -@dataclass -class DiffArgs: - log: str - command: str - metadata: bool - base_result: TextIOWrapper - changed_result: TextIOWrapper - output: TextIOWrapper - - -def get_args() -> Union[TestArgs, DiffArgs]: - """Define, parse and return command line args. - - Returns: - A dict containing the command line args. - """ - parser = argparse.ArgumentParser( - description= - 'Run performance tests against the OpenSearch plugin and various ANN ' - 'libaries.' - ) - - def define_args(): - """Define tool commands.""" - - # add log level arg - parser.add_argument( - '--log', - default='info', - type=str, - choices=['debug', - 'info', - 'warning', - 'error', - 'critical'], - help='Log level of the tool.' 
- ) - - subparsers = parser.add_subparsers( - title='commands', - dest='command', - help='sub-command help' - ) - subparsers.required = True - - # add subcommands - _add_test_cmd(subparsers) - _add_diff_cmd(subparsers) - - define_args() - args = parser.parse_args() - if args.command == 'test': - return TestArgs( - log=args.log, - command=args.command, - config=args.config, - output=args.output - ) - else: - return DiffArgs( - log=args.log, - command=args.command, - metadata=args.metadata, - base_result=args.base_result, - changed_result=args.changed_result, - output=args.output - ) diff --git a/benchmarks/perf-tool/okpt/io/config/parsers/base.py b/benchmarks/perf-tool/okpt/io/config/parsers/base.py deleted file mode 100644 index 795aab1b2..000000000 --- a/benchmarks/perf-tool/okpt/io/config/parsers/base.py +++ /dev/null @@ -1,67 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Base Parser class. - -Classes: - BaseParser: Base class for config parsers. - -Exceptions: - ConfigurationError: An error in the configuration syntax. -""" - -import os -from io import TextIOWrapper - -import cerberus - -from okpt.io.utils import reader - - -class ConfigurationError(Exception): - """Exception raised for errors in the tool configuration. - - Attributes: - message -- explanation of the error - """ - - def __init__(self, message: str): - self.message = f'{message}' - super().__init__(self.message) - - -def _get_validator_from_schema_name(schema_name: str): - """Get the corresponding Cerberus validator from a schema name.""" - curr_file_dir = os.path.dirname(os.path.abspath(__file__)) - schemas_dir = os.path.join(os.path.dirname(curr_file_dir), 'schemas') - schema_file_path = os.path.join(schemas_dir, f'{schema_name}.yml') - schema_obj = reader.parse_yaml_from_path(schema_file_path) - return cerberus.Validator(schema_obj) - - -class BaseParser: - """Base class for config parsers. - - Attributes: - validator: Cerberus validator for a particular schema - errors: Cerberus validation errors (if any are found during validation) - - Methods: - parse: Parse config. - """ - - def __init__(self, schema_name: str): - self.validator = _get_validator_from_schema_name(schema_name) - self.errors = '' - - def parse(self, file_obj: TextIOWrapper): - """Convert file object to dict, while validating against config schema.""" - config_obj = reader.parse_yaml(file_obj) - is_config_valid = self.validator.validate(config_obj) - if not is_config_valid: - raise ConfigurationError(self.validator.errors) - - return self.validator.document diff --git a/benchmarks/perf-tool/okpt/io/config/parsers/test.py b/benchmarks/perf-tool/okpt/io/config/parsers/test.py deleted file mode 100644 index c47e30ecc..000000000 --- a/benchmarks/perf-tool/okpt/io/config/parsers/test.py +++ /dev/null @@ -1,81 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Provides ToolParser. - -Classes: - ToolParser: Tool config parser. 
-""" -from dataclasses import dataclass -from io import TextIOWrapper -from typing import List - -from okpt.io.config.parsers import base -from okpt.test.steps.base import Step, StepConfig -from okpt.test.steps.factory import create_step - - -@dataclass -class TestConfig: - test_name: str - test_id: str - endpoint: str - port: int - timeout: int - num_runs: int - show_runs: bool - setup: List[Step] - steps: List[Step] - cleanup: List[Step] - - -class TestParser(base.BaseParser): - """Parser for Test config. - - Methods: - parse: Parse and validate the Test config. - """ - - def __init__(self): - super().__init__('test') - - def parse(self, file_obj: TextIOWrapper) -> TestConfig: - """See base class.""" - config_obj = super().parse(file_obj) - - implicit_step_config = dict() - if 'endpoint' in config_obj: - implicit_step_config['endpoint'] = config_obj['endpoint'] - - if 'port' in config_obj: - implicit_step_config['port'] = config_obj['port'] - - # Each step should have its own parse - take the config object and check if its valid - setup = [] - if 'setup' in config_obj: - setup = [create_step(StepConfig(step["name"], step, implicit_step_config)) for step in config_obj['setup']] - - steps = [create_step(StepConfig(step["name"], step, implicit_step_config)) for step in config_obj['steps']] - - cleanup = [] - if 'cleanup' in config_obj: - cleanup = [create_step(StepConfig(step["name"], step, implicit_step_config)) for step - in config_obj['cleanup']] - - test_config = TestConfig( - endpoint=config_obj['endpoint'], - port=config_obj['port'], - timeout=config_obj['timeout'], - test_name=config_obj['test_name'], - test_id=config_obj['test_id'], - num_runs=config_obj['num_runs'], - show_runs=config_obj['show_runs'], - setup=setup, - steps=steps, - cleanup=cleanup - ) - - return test_config diff --git a/benchmarks/perf-tool/okpt/io/config/parsers/util.py b/benchmarks/perf-tool/okpt/io/config/parsers/util.py deleted file mode 100644 index 454fec5a0..000000000 --- a/benchmarks/perf-tool/okpt/io/config/parsers/util.py +++ /dev/null @@ -1,116 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. 
- -"""Utility functions for parsing""" - - -from okpt.io.config.parsers.base import ConfigurationError -from okpt.io.dataset import HDF5DataSet, BigANNNeighborDataSet, \ - BigANNVectorDataSet, DataSet, Context - - -def parse_dataset(dataset_format: str, dataset_path: str, - context: Context, custom_context=None) -> DataSet: - if dataset_format == 'hdf5': - return HDF5DataSet(dataset_path, context, custom_context) - - if dataset_format == 'bigann' and context == Context.NEIGHBORS: - return BigANNNeighborDataSet(dataset_path) - - if dataset_format == 'bigann': - return BigANNVectorDataSet(dataset_path) - - raise Exception("Unsupported data-set format") - - -def parse_string_param(key: str, first_map, second_map, default) -> str: - value = first_map.get(key) - if value is not None: - if type(value) is str: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - value = second_map.get(key) - if value is not None: - if type(value) is str: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - if default is None: - raise ConfigurationError("{} must be set".format(key)) - return default - - -def parse_int_param(key: str, first_map, second_map, default) -> int: - value = first_map.get(key) - if value is not None: - if type(value) is int: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - value = second_map.get(key) - if value is not None: - if type(value) is int: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - if default is None: - raise ConfigurationError("{} must be set".format(key)) - return default - - -def parse_bool_param(key: str, first_map, second_map, default) -> bool: - value = first_map.get(key) - if value is not None: - if type(value) is bool: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - value = second_map.get(key) - if value is not None: - if type(value) is bool: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - if default is None: - raise ConfigurationError("{} must be set".format(key)) - return default - - -def parse_dict_param(key: str, first_map, second_map, default) -> dict: - value = first_map.get(key) - if value is not None: - if type(value) is dict: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - value = second_map.get(key) - if value is not None: - if type(value) is dict: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - if default is None: - raise ConfigurationError("{} must be set".format(key)) - return default - - -def parse_list_param(key: str, first_map, second_map, default) -> list: - value = first_map.get(key) - if value is not None: - if type(value) is list: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - value = second_map.get(key) - if value is not None: - if type(value) is list: - return value - raise ConfigurationError("Invalid type for {}".format(key)) - - if default is None: - raise ConfigurationError("{} must be set".format(key)) - return default diff --git a/benchmarks/perf-tool/okpt/io/config/schemas/test.yml b/benchmarks/perf-tool/okpt/io/config/schemas/test.yml deleted file mode 100644 index 4d5c21a15..000000000 --- a/benchmarks/perf-tool/okpt/io/config/schemas/test.yml +++ /dev/null @@ -1,35 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source 
license. - -# defined using the cerberus validation API -# https://docs.python-cerberus.org/en/stable/index.html -endpoint: - type: string - default: "localhost" -port: - type: integer - default: 9200 -timeout: - type: integer - default: 60 -test_name: - type: string -test_id: - type: string -num_runs: - type: integer - default: 1 - min: 1 - max: 10000 -show_runs: - type: boolean - default: false -setup: - type: list -steps: - type: list -cleanup: - type: list diff --git a/benchmarks/perf-tool/okpt/io/dataset.py b/benchmarks/perf-tool/okpt/io/dataset.py deleted file mode 100644 index 001563bab..000000000 --- a/benchmarks/perf-tool/okpt/io/dataset.py +++ /dev/null @@ -1,222 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Defines DataSet interface and implements particular formats - -A DataSet is the basic functionality that it can be read in chunks, or -read completely and reset to the start. - -Currently, we support HDF5 formats from ann-benchmarks and big-ann-benchmarks -datasets. - -Classes: - HDF5DataSet: Format used in ann-benchmarks - BigANNNeighborDataSet: Neighbor format for big-ann-benchmarks - BigANNVectorDataSet: Vector format for big-ann-benchmarks -""" -import os -from abc import ABC, ABCMeta, abstractmethod -from enum import Enum -from typing import cast -import h5py -import numpy as np - -import struct - - -class Context(Enum): - """DataSet context enum. Can be used to add additional context for how a - data-set should be interpreted. - """ - INDEX = 1 - QUERY = 2 - NEIGHBORS = 3 - CUSTOM = 4 - - -class DataSet(ABC): - """DataSet interface. Used for reading data-sets from files. 
- - Methods: - read: Read a chunk of data from the data-set - size: Gets the number of items in the data-set - reset: Resets internal state of data-set to beginning - """ - __metaclass__ = ABCMeta - - @abstractmethod - def read(self, chunk_size: int): - pass - - @abstractmethod - def size(self): - pass - - @abstractmethod - def reset(self): - pass - - -class HDF5DataSet(DataSet): - """ Data-set format corresponding to `ANN Benchmarks - `_ - """ - - def __init__(self, dataset_path: str, context: Context, custom_context=None): - file = h5py.File(dataset_path) - self.data = cast(h5py.Dataset, file[self._parse_context(context, custom_context)]) - self.current = 0 - - def read(self, chunk_size: int): - if self.current >= self.size(): - return None - - end_i = self.current + chunk_size - if end_i > self.size(): - end_i = self.size() - - v = cast(np.ndarray, self.data[self.current:end_i]) - self.current = end_i - return v - - def size(self): - return self.data.len() - - def reset(self): - self.current = 0 - - @staticmethod - def _parse_context(context: Context, custom_context=None) -> str: - if context == Context.NEIGHBORS: - return "neighbors" - - if context == Context.INDEX: - return "train" - - if context == Context.QUERY: - return "test" - - if context == Context.CUSTOM: - return custom_context - - raise Exception("Unsupported context") - - -class BigANNNeighborDataSet(DataSet): - """ Data-set format for neighbor data-sets for `Big ANN Benchmarks - `_""" - - def __init__(self, dataset_path: str): - self.file = open(dataset_path, 'rb') - self.file.seek(0, os.SEEK_END) - num_bytes = self.file.tell() - self.file.seek(0) - - if num_bytes < 8: - raise Exception("File is invalid") - - self.num_queries = int.from_bytes(self.file.read(4), "little") - self.k = int.from_bytes(self.file.read(4), "little") - - # According to the website, the number of bytes that will follow will - # be: num_queries X K x sizeof(uint32_t) bytes + num_queries X K x - # sizeof(float) - if (num_bytes - 8) != 2 * (self.num_queries * self.k * 4): - raise Exception("File is invalid") - - self.current = 0 - - def read(self, chunk_size: int): - if self.current >= self.size(): - return None - - end_i = self.current + chunk_size - if end_i > self.size(): - end_i = self.size() - - v = [[int.from_bytes(self.file.read(4), "little") for _ in - range(self.k)] for _ in range(end_i - self.current)] - - self.current = end_i - return v - - def size(self): - return self.num_queries - - def reset(self): - self.file.seek(8) - self.current = 0 - - -class BigANNVectorDataSet(DataSet): - """ Data-set format for vector data-sets for `Big ANN Benchmarks - `_ - """ - - def __init__(self, dataset_path: str): - self.file = open(dataset_path, 'rb') - self.file.seek(0, os.SEEK_END) - num_bytes = self.file.tell() - self.file.seek(0) - - if num_bytes < 8: - raise Exception("File is invalid") - - self.num_points = int.from_bytes(self.file.read(4), "little") - self.dimension = int.from_bytes(self.file.read(4), "little") - bytes_per_num = self._get_data_size(dataset_path) - - if (num_bytes - 8) != self.num_points * self.dimension * bytes_per_num: - raise Exception("File is invalid") - - self.reader = self._value_reader(dataset_path) - self.current = 0 - - def read(self, chunk_size: int): - if self.current >= self.size(): - return None - - end_i = self.current + chunk_size - if end_i > self.size(): - end_i = self.size() - - v = np.asarray([self._read_vector() for _ in - range(end_i - self.current)]) - self.current = end_i - return v - - def 
_read_vector(self): - return np.asarray([self.reader(self.file) for _ in - range(self.dimension)]) - - def size(self): - return self.num_points - - def reset(self): - self.file.seek(8) # Seek to 8 bytes to skip re-reading metadata - self.current = 0 - - @staticmethod - def _get_data_size(file_name): - ext = file_name.split('.')[-1] - if ext == "u8bin": - return 1 - - if ext == "fbin": - return 4 - - raise Exception("Unknown extension") - - @staticmethod - def _value_reader(file_name): - ext = file_name.split('.')[-1] - if ext == "u8bin": - return lambda file: float(int.from_bytes(file.read(1), "little")) - - if ext == "fbin": - return lambda file: struct.unpack(' TextIOWrapper: - """Given a file path, get a readable file object. - - Args: - file path - - Returns: - Writeable file object - """ - return open(path, 'r', encoding='UTF-8') - - -def parse_yaml(file: TextIOWrapper) -> Dict[str, Any]: - """Parses YAML file from file object. - - Args: - file: file object to parse - - Returns: - A dict representing the YAML file. - """ - return yaml.load(file, Loader=yaml.SafeLoader) - - -def parse_yaml_from_path(path: str) -> Dict[str, Any]: - """Parses YAML file from file path. - - Args: - path: file path to parse - - Returns: - A dict representing the YAML file. - """ - file = reader.get_file_obj(path) - return parse_yaml(file) - - -def parse_json(file: TextIOWrapper) -> Dict[str, Any]: - """Parses JSON file from file object. - - Args: - file: file object to parse - - Returns: - A dict representing the JSON file. - """ - return json.load(file) - - -def parse_json_from_path(path: str) -> Dict[str, Any]: - """Parses JSON file from file path. - - Args: - path: file path to parse - - Returns: - A dict representing the JSON file. - """ - file = reader.get_file_obj(path) - return json.load(file) diff --git a/benchmarks/perf-tool/okpt/io/utils/writer.py b/benchmarks/perf-tool/okpt/io/utils/writer.py deleted file mode 100644 index 1f14bfd94..000000000 --- a/benchmarks/perf-tool/okpt/io/utils/writer.py +++ /dev/null @@ -1,40 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -"""Provides functions for writing to file. - -Functions: - get_file_obj(): Get a writeable file object. - write_json(): Writes a python dictionary to a JSON file -""" - -import json -from io import TextIOWrapper -from typing import Any, Dict, TextIO, Union - - -def get_file_obj(path: str) -> TextIOWrapper: - """Get a writeable file object from a file path. - - Args: - file path - - Returns: - Writeable file object - """ - return open(path, 'w', encoding='UTF-8') - - -def write_json(data: Dict[str, Any], - file: Union[TextIOWrapper, TextIO], - pretty=False): - """Writes a dictionary to a JSON file. - - Args: - data: A dict to write to JSON. - file: Path of output file. - """ - indent = 2 if pretty else 0 - json.dump(data, file, indent=indent) diff --git a/benchmarks/perf-tool/okpt/main.py b/benchmarks/perf-tool/okpt/main.py deleted file mode 100644 index 3e6e022d4..000000000 --- a/benchmarks/perf-tool/okpt/main.py +++ /dev/null @@ -1,55 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. 
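As a concrete illustration of the Big ANN vector layout handled by `BigANNVectorDataSet` above, the following standalone sketch (the file path is hypothetical; it assumes the same uint32 header followed by float32 values for the `fbin` extension) reads the header, validates the file size, and decodes one vector:

```python
import os
import struct

# Hypothetical path to a big-ann-benchmarks float32 vector file.
path = "data/base.fbin"

with open(path, "rb") as f:
    # First 8 bytes: number of points and dimension, little-endian uint32 each.
    num_points = int.from_bytes(f.read(4), "little")
    dimension = int.from_bytes(f.read(4), "little")
    bytes_per_num = 4  # 'fbin' stores float32; 'u8bin' would store 1 byte per value

    # Same sanity check as BigANNVectorDataSet: header + packed values.
    expected = 8 + num_points * dimension * bytes_per_num
    actual = os.fstat(f.fileno()).st_size
    if actual != expected:
        raise ValueError(f"unexpected file size: {actual} != {expected}")

    # Read the first vector one float32 at a time, as the _value_reader lambda does.
    first_vector = [struct.unpack('<f', f.read(4))[0] for _ in range(dimension)]
    print(num_points, dimension, first_vector[:4])
```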
- -""" Runner script that serves as the main controller of the testing tool.""" - -import logging -import sys -from typing import cast - -from okpt.diff import diff -from okpt.io import args -from okpt.io.config.parsers import test -from okpt.io.utils import reader, writer -from okpt.test import runner - - -def main(): - """Main function of entry module.""" - cli_args = args.get_args() - output = cli_args.output - if cli_args.log: - log_level = getattr(logging, cli_args.log.upper()) - logging.basicConfig(level=log_level) - - if cli_args.command == 'test': - cli_args = cast(args.TestArgs, cli_args) - - # parse config - parser = test.TestParser() - test_config = parser.parse(cli_args.config) - logging.info('Configs are valid.') - - # run tests - test_runner = runner.TestRunner(test_config=test_config) - test_result = test_runner.execute() - - # write test results - logging.debug( - f'Test Result:\n {writer.write_json(test_result, sys.stdout, pretty=True)}' - ) - writer.write_json(test_result, output, pretty=True) - elif cli_args.command == 'diff': - cli_args = cast(args.DiffArgs, cli_args) - - # parse test results - base_result = reader.parse_json(cli_args.base_result) - changed_result = reader.parse_json(cli_args.changed_result) - - # get diff - diff_result = diff.Diff(base_result, changed_result, - cli_args.metadata).diff() - writer.write_json(data=diff_result, file=output, pretty=True) diff --git a/benchmarks/perf-tool/okpt/test/__init__.py b/benchmarks/perf-tool/okpt/test/__init__.py deleted file mode 100644 index ff4fd04d1..000000000 --- a/benchmarks/perf-tool/okpt/test/__init__.py +++ /dev/null @@ -1,5 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. diff --git a/benchmarks/perf-tool/okpt/test/profile.py b/benchmarks/perf-tool/okpt/test/profile.py deleted file mode 100644 index d96860f9a..000000000 --- a/benchmarks/perf-tool/okpt/test/profile.py +++ /dev/null @@ -1,86 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Provides decorators to profile functions. - -The decorators work by adding a `measureable` (time, memory, etc) field to a -dictionary returned by the wrapped function. So the wrapped functions must -return a dictionary in order to be profiled. -""" -import functools -import time -from typing import Callable - - -class TimerStoppedWithoutStartingError(Exception): - """Error raised when Timer is stopped without having been started.""" - - def __init__(self): - super().__init__() - self.message = 'Timer must call start() before calling end().' - - -class _Timer(): - """Timer class for timing. - - Methods: - start: Starts the timer. - end: Stops the timer and returns the time elapsed since start. - - Raises: - TimerStoppedWithoutStartingError: Timer must start before ending. - """ - - def __init__(self): - self.start_time = None - - def start(self): - """Starts the timer.""" - self.start_time = time.perf_counter() - - def end(self) -> float: - """Stops the timer. - - Returns: - The time elapsed in milliseconds. 
- """ - # ensure timer has started before ending - if self.start_time is None: - raise TimerStoppedWithoutStartingError() - - elapsed = (time.perf_counter() - self.start_time) * 1000 - self.start_time = None - return elapsed - - -def took(f: Callable): - """Profiles a functions execution time. - - Args: - f: Function to profile. - - Returns: - A function that wraps the passed in function and adds a time took field - to the return value. - """ - - @functools.wraps(f) - def wrapper(*args, **kwargs): - """Wrapper function.""" - timer = _Timer() - timer.start() - result = f(*args, **kwargs) - time_took = timer.end() - - # if result already has a `took` field, don't modify the result - if isinstance(result, dict) and 'took' in result: - return result - # `result` may not be a dictionary, so it may not be unpackable - elif isinstance(result, dict): - return {**result, 'took': time_took} - return {'took': time_took} - - return wrapper diff --git a/benchmarks/perf-tool/okpt/test/runner.py b/benchmarks/perf-tool/okpt/test/runner.py deleted file mode 100644 index 150154691..000000000 --- a/benchmarks/perf-tool/okpt/test/runner.py +++ /dev/null @@ -1,107 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Provides a test runner class.""" -import logging -import platform -import sys -from datetime import datetime -from typing import Any, Dict, List - -import psutil - -from okpt.io.config.parsers import test -from okpt.test.test import Test, get_avg - - -def _aggregate_runs(runs: List[Dict[str, Any]]): - """Aggregates and averages a list of test results. - - Args: - results: A list of test results. - num_runs: Number of times the tests were ran. - - Returns: - A dictionary containing the averages of the test results. - """ - aggregate: Dict[str, Any] = {} - for run in runs: - for key, value in run.items(): - if key in aggregate: - aggregate[key].append(value) - else: - aggregate[key] = [value] - - aggregate = {key: get_avg(value) for key, value in aggregate.items()} - return aggregate - - -class TestRunner: - """Test runner class for running tests and aggregating the results. - - Methods: - execute: Run the tests and aggregate the results. - """ - - def __init__(self, test_config: test.TestConfig): - """"Initializes test state.""" - self.test_config = test_config - self.test = Test(test_config) - - def _get_metadata(self): - """"Retrieves the test metadata.""" - svmem = psutil.virtual_memory() - return { - 'test_name': - self.test_config.test_name, - 'test_id': - self.test_config.test_id, - 'date': - datetime.now().strftime('%m/%d/%Y %H:%M:%S'), - 'python_version': - sys.version, - 'os_version': - platform.platform(), - 'processor': - platform.processor() + ', ' + - str(psutil.cpu_count(logical=True)) + ' cores', - 'memory': - str(svmem.used) + ' (used) / ' + str(svmem.available) + - ' (available) / ' + str(svmem.total) + ' (total)', - } - - def execute(self) -> Dict[str, Any]: - """Runs the tests and aggregates the results. - - Returns: - A dictionary containing the aggregate of test results. 
- """ - logging.info('Setting up tests.') - self.test.setup() - logging.info('Beginning to run tests.') - runs = [] - for i in range(self.test_config.num_runs): - logging.info( - f'Running test {i + 1} of {self.test_config.num_runs}' - ) - runs.append(self.test.execute()) - - logging.info('Finished running tests.') - aggregate = _aggregate_runs(runs) - - # add metadata to test results - test_result = { - 'metadata': - self._get_metadata(), - 'results': - aggregate - } - - # include info about all test runs if specified in config - if self.test_config.show_runs: - test_result['runs'] = runs - - return test_result diff --git a/benchmarks/perf-tool/okpt/test/steps/base.py b/benchmarks/perf-tool/okpt/test/steps/base.py deleted file mode 100644 index 829980421..000000000 --- a/benchmarks/perf-tool/okpt/test/steps/base.py +++ /dev/null @@ -1,60 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -"""Provides base Step interface.""" - -from dataclasses import dataclass -from typing import Any, Dict, List - -from okpt.test import profile - - -@dataclass -class StepConfig: - step_name: str - config: Dict[str, object] - implicit_config: Dict[str, object] - - -class Step: - """Test step interface. - - Attributes: - label: Name of the step. - - Methods: - execute: Run the step and return a step response with the label and - corresponding measures. - """ - - label = 'base_step' - - def __init__(self, step_config: StepConfig): - self.step_config = step_config - - def _action(self): - """Step logic/behavior to be executed and profiled.""" - pass - - def _get_measures(self) -> List[str]: - """Gets the measures for a particular test""" - pass - - def execute(self) -> List[Dict[str, Any]]: - """Execute step logic while profiling various measures. - - Returns: - Dict containing step label and various step measures. - """ - action = self._action - - # profile the action with measure decorators - add if necessary - action = getattr(profile, 'took')(action) - - result = action() - if isinstance(result, dict): - return [{'label': self.label, **result}] - - raise ValueError('Invalid return by a step') diff --git a/benchmarks/perf-tool/okpt/test/steps/factory.py b/benchmarks/perf-tool/okpt/test/steps/factory.py deleted file mode 100644 index 2033f2672..000000000 --- a/benchmarks/perf-tool/okpt/test/steps/factory.py +++ /dev/null @@ -1,50 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. 
-"""Factory for creating steps.""" - -from okpt.io.config.parsers.base import ConfigurationError -from okpt.test.steps.base import Step, StepConfig - -from okpt.test.steps.steps import CreateIndexStep, DisableRefreshStep, RefreshIndexStep, DeleteIndexStep, \ - TrainModelStep, DeleteModelStep, ForceMergeStep, ClearCacheStep, IngestStep, IngestMultiFieldStep, \ - IngestNestedFieldStep, QueryStep, QueryWithFilterStep, QueryNestedFieldStep, GetStatsStep, WarmupStep - - -def create_step(step_config: StepConfig) -> Step: - if step_config.step_name == CreateIndexStep.label: - return CreateIndexStep(step_config) - elif step_config.step_name == DisableRefreshStep.label: - return DisableRefreshStep(step_config) - elif step_config.step_name == RefreshIndexStep.label: - return RefreshIndexStep(step_config) - elif step_config.step_name == TrainModelStep.label: - return TrainModelStep(step_config) - elif step_config.step_name == DeleteModelStep.label: - return DeleteModelStep(step_config) - elif step_config.step_name == DeleteIndexStep.label: - return DeleteIndexStep(step_config) - elif step_config.step_name == IngestStep.label: - return IngestStep(step_config) - elif step_config.step_name == IngestMultiFieldStep.label: - return IngestMultiFieldStep(step_config) - elif step_config.step_name == IngestNestedFieldStep.label: - return IngestNestedFieldStep(step_config) - elif step_config.step_name == QueryStep.label: - return QueryStep(step_config) - elif step_config.step_name == QueryWithFilterStep.label: - return QueryWithFilterStep(step_config) - elif step_config.step_name == QueryNestedFieldStep.label: - return QueryNestedFieldStep(step_config) - elif step_config.step_name == ForceMergeStep.label: - return ForceMergeStep(step_config) - elif step_config.step_name == ClearCacheStep.label: - return ClearCacheStep(step_config) - elif step_config.step_name == GetStatsStep.label: - return GetStatsStep(step_config) - elif step_config.step_name == WarmupStep.label: - return WarmupStep(step_config) - - raise ConfigurationError(f'Invalid step {step_config.step_name}') diff --git a/benchmarks/perf-tool/okpt/test/steps/steps.py b/benchmarks/perf-tool/okpt/test/steps/steps.py deleted file mode 100644 index 99b2728dc..000000000 --- a/benchmarks/perf-tool/okpt/test/steps/steps.py +++ /dev/null @@ -1,987 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# -# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. -"""Provides steps for OpenSearch tests. - -Some OpenSearch operations return a `took` field in the response body, -so the profiling decorators aren't needed for some functions. 
-""" -import json -from abc import abstractmethod -from typing import Any, Dict, List - -import numpy as np -import requests -import time - -from opensearchpy import OpenSearch, RequestsHttpConnection - -from okpt.io.config.parsers.base import ConfigurationError -from okpt.io.config.parsers.util import parse_string_param, parse_int_param, parse_dataset, parse_bool_param, \ - parse_list_param -from okpt.io.dataset import Context -from okpt.io.utils.reader import parse_json_from_path -from okpt.test.steps import base -from okpt.test.steps.base import StepConfig - - -class OpenSearchStep(base.Step): - """See base class.""" - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.endpoint = parse_string_param('endpoint', step_config.config, - step_config.implicit_config, - 'localhost') - default_port = 9200 if self.endpoint == 'localhost' else 80 - self.port = parse_int_param('port', step_config.config, - step_config.implicit_config, default_port) - self.timeout = parse_int_param('timeout', step_config.config, {}, 60) - self.opensearch = get_opensearch_client(str(self.endpoint), - int(self.port), int(self.timeout)) - - -class CreateIndexStep(OpenSearchStep): - """See base class.""" - - label = 'create_index' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - index_spec = parse_string_param('index_spec', step_config.config, {}, - None) - self.body = parse_json_from_path(index_spec) - if self.body is None: - raise ConfigurationError('Index body must be passed in') - - def _action(self): - """Creates an OpenSearch index, applying the index settings/mappings. - - Returns: - An OpenSearch index creation response body. - """ - self.opensearch.indices.create(index=self.index_name, body=self.body) - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - -class DisableRefreshStep(OpenSearchStep): - """See base class.""" - - label = 'disable_refresh' - - def _action(self): - """Disables the refresh interval for an OpenSearch index. - - Returns: - An OpenSearch index settings update response body. 
- """ - self.opensearch.indices.put_settings( - body={'index': { - 'refresh_interval': -1 - }}) - - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - -class RefreshIndexStep(OpenSearchStep): - """See base class.""" - - label = 'refresh_index' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - - def _action(self): - while True: - try: - self.opensearch.indices.refresh(index=self.index_name) - return {'store_kb': get_index_size_in_kb(self.opensearch, - self.index_name)} - except: - pass - - def _get_measures(self) -> List[str]: - return ['took', 'store_kb'] - - -class ForceMergeStep(OpenSearchStep): - """See base class.""" - - label = 'force_merge' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - self.max_num_segments = parse_int_param('max_num_segments', - step_config.config, {}, None) - - def _action(self): - while True: - try: - self.opensearch.indices.forcemerge( - index=self.index_name, - max_num_segments=self.max_num_segments) - return {} - except: - pass - - def _get_measures(self) -> List[str]: - return ['took'] - -class ClearCacheStep(OpenSearchStep): - """See base class.""" - - label = 'clear_cache' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - - def _action(self): - while True: - try: - self.opensearch.indices.clear_cache( - index=self.index_name) - return {} - except: - pass - - def _get_measures(self) -> List[str]: - return ['took'] - - -class WarmupStep(OpenSearchStep): - """See base class.""" - - label = 'warmup_operation' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, {}, - None) - - def _action(self): - """Performs warmup operation on an index.""" - warmup_operation(self.endpoint, self.port, self.index_name) - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - -class TrainModelStep(OpenSearchStep): - """See base class.""" - - label = 'train_model' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - self.model_id = parse_string_param('model_id', step_config.config, {}, - 'Test') - self.train_index_name = parse_string_param('train_index', - step_config.config, {}, None) - self.train_index_field = parse_string_param('train_field', - step_config.config, {}, - None) - self.dimension = parse_int_param('dimension', step_config.config, {}, - None) - self.description = parse_string_param('description', step_config.config, - {}, 'Default') - self.max_training_vector_count = parse_int_param( - 'max_training_vector_count', step_config.config, {}, 10000000000000) - - method_spec = parse_string_param('method_spec', step_config.config, {}, - None) - self.method = parse_json_from_path(method_spec) - if self.method is None: - raise ConfigurationError('method must be passed in') - - def _action(self): - """Train a model for an index. 
- - Returns: - The trained model - """ - - # Build body - body = { - 'training_index': self.train_index_name, - 'training_field': self.train_index_field, - 'description': self.description, - 'dimension': self.dimension, - 'method': self.method, - 'max_training_vector_count': self.max_training_vector_count - } - - # So, we trained the model. Now we need to wait until we have to wait - # until the model is created. Poll every - # 1/10 second - requests.post('http://' + self.endpoint + ':' + str(self.port) + - '/_plugins/_knn/models/' + str(self.model_id) + '/_train', - json.dumps(body), - headers={'content-type': 'application/json'}) - - sleep_time = 0.1 - timeout = 100000 - i = 0 - while i < timeout: - time.sleep(sleep_time) - model_response = get_model(self.endpoint, self.port, self.model_id) - if 'state' in model_response.keys() and model_response['state'] == \ - 'created': - return {} - i += 1 - - raise TimeoutError('Failed to create model') - - def _get_measures(self) -> List[str]: - return ['took'] - - -class DeleteModelStep(OpenSearchStep): - """See base class.""" - - label = 'delete_model' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - self.model_id = parse_string_param('model_id', step_config.config, {}, - 'Test') - - def _action(self): - """Train a model for an index. - - Returns: - The trained model - """ - delete_model(self.endpoint, self.port, self.model_id) - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - -class DeleteIndexStep(OpenSearchStep): - """See base class.""" - - label = 'delete_index' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - - def _action(self): - """Delete the index - - Returns: - An empty dict - """ - delete_index(self.opensearch, self.index_name) - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - -class BaseIngestStep(OpenSearchStep): - """See base class.""" - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - self.field_name = parse_string_param('field_name', step_config.config, - {}, None) - self.bulk_size = parse_int_param('bulk_size', step_config.config, {}, - 300) - self.implicit_config = step_config.implicit_config - dataset_format = parse_string_param('dataset_format', - step_config.config, {}, 'hdf5') - dataset_path = parse_string_param('dataset_path', step_config.config, - {}, None) - self.dataset = parse_dataset(dataset_format, dataset_path, - Context.INDEX) - - self.input_doc_count = parse_int_param('doc_count', step_config.config, {}, - self.dataset.size()) - self.doc_count = min(self.input_doc_count, self.dataset.size()) - - def _action(self): - - def action(doc_id): - return {'index': {'_index': self.index_name, '_id': doc_id}} - - # Maintain minimal state outside of this loop. 
For large data sets, too - # much state may cause out of memory failure - for i in range(0, self.doc_count, self.bulk_size): - partition = self.dataset.read(self.bulk_size) - self._handle_data_bulk(partition, action, i) - self.dataset.reset() - - return {} - - def _get_measures(self) -> List[str]: - return ['took'] - - @abstractmethod - def _handle_data_bulk(self, partition, action, i): - pass - - -class IngestStep(BaseIngestStep): - """See base class.""" - - label = 'ingest' - - def _handle_data_bulk(self, partition, action, i): - if partition is None: - return - body = bulk_transform(partition, self.field_name, action, i) - bulk_index(self.opensearch, self.index_name, body) - - -class IngestMultiFieldStep(BaseIngestStep): - """See base class.""" - - label = 'ingest_multi_field' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - dataset_path = parse_string_param('dataset_path', step_config.config, - {}, None) - - self.attributes_dataset_name = parse_string_param('attributes_dataset_name', - step_config.config, {}, None) - - self.attributes_dataset = parse_dataset('hdf5', dataset_path, - Context.CUSTOM, self.attributes_dataset_name) - - self.attribute_spec = parse_list_param('attribute_spec', - step_config.config, {}, []) - - self.partition_attr = self.attributes_dataset.read(self.doc_count) - self.action_buffer = None - - def _handle_data_bulk(self, partition, action, i): - if partition is None: - return - body = self.bulk_transform_with_attributes(partition, self.partition_attr, self.field_name, - action, i, self.attribute_spec) - bulk_index(self.opensearch, self.index_name, body) - - def bulk_transform_with_attributes(self, partition: np.ndarray, partition_attr, field_name: str, - action, offset: int, attributes_def) -> List[Dict[str, Any]]: - """Partitions and transforms a list of vectors into OpenSearch's bulk - injection format. - Args: - partition: An array of vectors to transform. - partition_attr: dictionary of additional data to transform - field_name: field name for action - action: Bulk API action. - offset: to start counting from - attributes_def: definition of additional doc fields - Returns: - An array of transformed vectors in bulk format. 
- """ - actions = [] - _ = [ - actions.extend([action(i + offset), None]) - for i in range(len(partition)) - ] - idx = 1 - part_list = partition.tolist() - for i in range(len(partition)): - actions[idx] = {field_name: part_list[i]} - attr_idx = i + offset - attr_def_idx = 0 - for attribute in attributes_def: - attr_def_name = attribute['name'] - attr_def_type = attribute['type'] - - if attr_def_type == 'str': - val = partition_attr[attr_idx][attr_def_idx].decode() - if val != 'None': - actions[idx][attr_def_name] = val - elif attr_def_type == 'int': - val = int(partition_attr[attr_idx][attr_def_idx].decode()) - actions[idx][attr_def_name] = val - attr_def_idx += 1 - idx += 2 - - return actions - - -class IngestNestedFieldStep(BaseIngestStep): - """See base class.""" - - label = 'ingest_nested_field' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - dataset_path = parse_string_param('dataset_path', step_config.config, - {}, None) - - self.attributes_dataset_name = parse_string_param('attributes_dataset_name', - step_config.config, {}, None) - - self.attributes_dataset = parse_dataset('hdf5', dataset_path, - Context.CUSTOM, self.attributes_dataset_name) - - self.attribute_spec = parse_list_param('attribute_spec', - step_config.config, {}, []) - - self.partition_attr = self.attributes_dataset.read(self.doc_count) - - if self.dataset.size() != self.doc_count: - raise ValueError("custom doc_count is not supported for nested field") - self.action_buffer = None - self.action_parent_id = None - self.count = 0 - - def _handle_data_bulk(self, partition, action, i): - if partition is None: - return - body = self.bulk_transform_with_nested(partition, self.partition_attr, self.field_name, - action, i, self.attribute_spec) - if len(body) > 0: - bulk_index(self.opensearch, self.index_name, body) - - def bulk_transform_with_nested(self, partition: np.ndarray, partition_attr, field_name: str, - action, offset: int, attributes_def) -> List[Dict[str, Any]]: - """Partitions and transforms a list of vectors into OpenSearch's bulk - injection format. - Args: - partition: An array of vectors to transform. - partition_attr: dictionary of additional data to transform - field_name: field name for action - action: Bulk API action. - offset: to start counting from - attributes_def: definition of additional doc fields - Returns: - An array of transformed vectors in bulk format. - """ - # offset is index of start row. We need number of parent doc - 1. - # The number of parent document can be calculated by using partition_attr data. - # We need to keep the last parent doc aside so that additional data can be added later. 
- parent_id_idx = next((index for (index, d) in enumerate(attributes_def) if d.get('name') == 'parent_id'), None) - if parent_id_idx is None: - raise ValueError("parent_id should be provided as attribute spec") - if attributes_def[parent_id_idx]['type'] != 'int': - raise ValueError("parent_id should be int type") - - first_index = offset - last_index = offset + len(partition) - 1 - num_of_actions = int(partition_attr[last_index][parent_id_idx].decode()) - int(partition_attr[first_index][parent_id_idx].decode()) - if self.action_buffer is None: - self.action_buffer = {"nested_field": []} - self.action_parent_id = int(partition_attr[first_index][parent_id_idx].decode()) - - actions = [] - _ = [ - actions.extend([action(i + self.action_parent_id), None]) - for i in range(num_of_actions) - ] - - idx = 1 - part_list = partition.tolist() - for i in range(len(partition)): - self.count += 1 - nested = {field_name: part_list[i]} - attr_idx = i + offset - attr_def_idx = 0 - current_parent_id = None - for attribute in attributes_def: - attr_def_name = attribute['name'] - attr_def_type = attribute['type'] - if attr_def_name == "parent_id": - current_parent_id = int(partition_attr[attr_idx][attr_def_idx].decode()) - attr_def_idx += 1 - continue - - if attr_def_type == 'str': - val = partition_attr[attr_idx][attr_def_idx].decode() - if val != 'None': - nested[attr_def_name] = val - elif attr_def_type == 'int': - val = int(partition_attr[attr_idx][attr_def_idx].decode()) - nested[attr_def_name] = val - attr_def_idx += 1 - - if self.action_parent_id == current_parent_id: - self.action_buffer["nested_field"].append(nested) - else: - actions.extend([action(self.action_parent_id), self.action_buffer]) - self.action_buffer = {"nested_field": []} - self.action_buffer["nested_field"].append(nested) - self.action_parent_id = current_parent_id - idx += 2 - - if self.count == self.doc_count: - actions.extend([action(self.action_parent_id), self.action_buffer]) - - return actions - - -class BaseQueryStep(OpenSearchStep): - """See base class.""" - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.k = parse_int_param('k', step_config.config, {}, 100) - self.r = parse_int_param('r', step_config.config, {}, 1) - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - self.field_name = parse_string_param('field_name', step_config.config, - {}, None) - self.calculate_recall = parse_bool_param('calculate_recall', - step_config.config, {}, False) - dataset_format = parse_string_param('dataset_format', - step_config.config, {}, 'hdf5') - dataset_path = parse_string_param('dataset_path', - step_config.config, {}, None) - self.dataset = parse_dataset(dataset_format, dataset_path, - Context.QUERY) - - input_query_count = parse_int_param('query_count', - step_config.config, {}, - self.dataset.size()) - self.query_count = min(input_query_count, self.dataset.size()) - - self.neighbors_format = parse_string_param('neighbors_format', - step_config.config, {}, 'hdf5') - self.neighbors_path = parse_string_param('neighbors_path', - step_config.config, {}, None) - - def _action(self): - - results = {} - query_responses = [] - for _ in range(self.query_count): - query = self.dataset.read(1) - if query is None: - break - query_responses.append( - query_index(self.opensearch, self.index_name, - self.get_body(query[0]) , self.get_exclude_fields())) - - results['took'] = [ - float(query_response['took']) for query_response in query_responses - ] - results['client_time'] = [ - 
float(query_response['client_time']) for query_response in query_responses - ] - results['memory_kb'] = get_cache_size_in_kb(self.endpoint, self.port) - - if self.calculate_recall: - ids = [[int(hit['_id']) - for hit in query_response['hits']['hits']] - for query_response in query_responses] - results['recall@K'] = recall_at_r(ids, self.neighbors, - self.k, self.k, self.query_count) - self.neighbors.reset() - results[f'recall@{str(self.r)}'] = recall_at_r( - ids, self.neighbors, self.r, self.k, self.query_count) - self.neighbors.reset() - - self.dataset.reset() - - return results - - def _get_measures(self) -> List[str]: - measures = ['took', 'memory_kb', 'client_time'] - - if self.calculate_recall: - measures.extend(['recall@K', f'recall@{str(self.r)}']) - - return measures - - @abstractmethod - def get_body(self, vec): - pass - - def get_exclude_fields(self): - return [self.field_name] - -class QueryStep(BaseQueryStep): - """See base class.""" - - label = 'query' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - self.neighbors = parse_dataset(self.neighbors_format, self.neighbors_path, - Context.NEIGHBORS) - self.implicit_config = step_config.implicit_config - - def get_body(self, vec): - return { - 'size': self.k, - 'query': { - 'knn': { - self.field_name: { - 'vector': vec, - 'k': self.k - } - } - } - } - - -class QueryWithFilterStep(BaseQueryStep): - """See base class.""" - - label = 'query_with_filter' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - neighbors_dataset = parse_string_param('neighbors_dataset', - step_config.config, {}, None) - - self.neighbors = parse_dataset(self.neighbors_format, self.neighbors_path, - Context.CUSTOM, neighbors_dataset) - - self.filter_type = parse_string_param('filter_type', step_config.config, {}, 'SCRIPT') - self.filter_spec = parse_string_param('filter_spec', step_config.config, {}, None) - self.score_script_similarity = parse_string_param('score_script_similarity', step_config.config, {}, 'l2') - - self.implicit_config = step_config.implicit_config - - def get_body(self, vec): - filter_json = json.load(open(self.filter_spec)) - if self.filter_type == 'FILTER': - return { - 'size': self.k, - 'query': { - 'knn': { - self.field_name: { - 'vector': vec, - 'k': self.k, - 'filter': filter_json - } - } - } - } - elif self.filter_type == 'SCRIPT': - return { - 'size': self.k, - 'query': { - 'script_score': { - 'query': { - 'bool': { - 'filter': filter_json - } - }, - 'script': { - 'source': 'knn_score', - 'lang': 'knn', - 'params': { - 'field': self.field_name, - 'query_value': vec, - 'space_type': self.score_script_similarity - } - } - } - } - } - elif self.filter_type == 'BOOL_POST_FILTER': - return { - 'size': self.k, - 'query': { - 'bool': { - 'filter': filter_json, - 'must': [ - { - 'knn': { - self.field_name: { - 'vector': vec, - 'k': self.k - } - } - } - ] - } - } - } - else: - raise ConfigurationError('Not supported filter type {}'.format(self.filter_type)) - -class QueryNestedFieldStep(BaseQueryStep): - """See base class.""" - - label = 'query_nested_field' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - neighbors_dataset = parse_string_param('neighbors_dataset', - step_config.config, {}, None) - - self.neighbors = parse_dataset(self.neighbors_format, self.neighbors_path, - Context.CUSTOM, neighbors_dataset) - - self.implicit_config = step_config.implicit_config - - def get_body(self, vec): - return { - 'size': self.k, - 'query': { - 
'nested': { - 'path': 'nested_field', - 'query': { - 'knn': { - 'nested_field.' + self.field_name: { - 'vector': vec, - 'k': self.k - } - } - } - } - } - } - -class GetStatsStep(OpenSearchStep): - """See base class.""" - - label = 'get_stats' - - def __init__(self, step_config: StepConfig): - super().__init__(step_config) - - self.index_name = parse_string_param('index_name', step_config.config, - {}, None) - - def _action(self): - """Get stats for cluster/index etc. - - Returns: - Stats with following info: - - number of committed and search segments in the index - """ - results = {} - segment_stats = get_segment_stats(self.opensearch, self.index_name) - shards = segment_stats["indices"][self.index_name]["shards"] - num_of_committed_segments = 0 - num_of_search_segments = 0; - for shard_key in shards.keys(): - for segment in shards[shard_key]: - num_of_committed_segments += segment["num_committed_segments"] - num_of_search_segments += segment["num_search_segments"] - - results['committed_segments'] = num_of_committed_segments - results['search_segments'] = num_of_search_segments - return results - - def _get_measures(self) -> List[str]: - return ['committed_segments', 'search_segments'] - -# Helper functions - (AKA not steps) -def bulk_transform(partition: np.ndarray, field_name: str, action, - offset: int) -> List[Dict[str, Any]]: - """Partitions and transforms a list of vectors into OpenSearch's bulk - injection format. - Args: - offset: to start counting from - partition: An array of vectors to transform. - field_name: field name for action - action: Bulk API action. - Returns: - An array of transformed vectors in bulk format. - """ - actions = [] - _ = [ - actions.extend([action(i + offset), None]) - for i in range(len(partition)) - ] - actions[1::2] = [{field_name: vec} for vec in partition.tolist()] - return actions - - -def delete_index(opensearch: OpenSearch, index_name: str): - """Deletes an OpenSearch index. - - Args: - opensearch: An OpenSearch client. - index_name: Name of the OpenSearch index to be deleted. - """ - opensearch.indices.delete(index=index_name, ignore=[400, 404]) - - -def get_model(endpoint, port, model_id): - """ - Retrieve a model from an OpenSearch cluster - Args: - endpoint: Endpoint OpenSearch is running on - port: Port OpenSearch is running on - model_id: ID of model to be deleted - Returns: - Get model response - """ - response = requests.get('http://' + endpoint + ':' + str(port) + - '/_plugins/_knn/models/' + model_id, - headers={'content-type': 'application/json'}) - return response.json() - - -def delete_model(endpoint, port, model_id): - """ - Deletes a model from OpenSearch cluster - Args: - endpoint: Endpoint OpenSearch is running on - port: Port OpenSearch is running on - model_id: ID of model to be deleted - Returns: - Deleted model response - """ - response = requests.delete('http://' + endpoint + ':' + str(port) + - '/_plugins/_knn/models/' + model_id, - headers={'content-type': 'application/json'}) - return response.json() - - -def warmup_operation(endpoint, port, index): - """ - Performs warmup operation on index to load native library files - of that index to reduce query latencies. - Args: - endpoint: Endpoint OpenSearch is running on - port: Port OpenSearch is running on - index: index name - Returns: - number of shards the plugin succeeded and failed to warm up. 
- """ - response = requests.get('http://' + endpoint + ':' + str(port) + - '/_plugins/_knn/warmup/' + index, - headers={'content-type': 'application/json'}) - return response.json() - - -def get_opensearch_client(endpoint: str, port: int, timeout=60): - """ - Get an opensearch client from an endpoint and port - Args: - endpoint: Endpoint OpenSearch is running on - port: Port OpenSearch is running on - timeout: timeout for OpenSearch client, default value 60 - Returns: - OpenSearch client - - """ - # TODO: fix for security in the future - return OpenSearch( - hosts=[{ - 'host': endpoint, - 'port': port - }], - use_ssl=False, - verify_certs=False, - connection_class=RequestsHttpConnection, - timeout=timeout, - ) - - -def recall_at_r(results, neighbor_dataset, r, k, query_count): - """ - Calculates the recall@R for a set of queries against a ground truth nearest - neighbor set - Args: - results: 2D list containing ids of results returned by OpenSearch. - results[i][j] i refers to query, j refers to - result in the query - neighbor_dataset: 2D dataset containing ids of the true nearest - neighbors for a set of queries - r: number of top results to check if they are in the ground truth k-NN - set. - k: k value for the query - query_count: number of queries - Returns: - Recall at R - """ - correct = 0.0 - total_num_of_results = 0 - for query in range(query_count): - true_neighbors = neighbor_dataset.read(1) - if true_neighbors is None: - break - true_neighbors_set = set(true_neighbors[0][:k]) - true_neighbors_set.discard(-1) - min_r = min(r, len(true_neighbors_set)) - total_num_of_results += min_r - for j in range(min_r): - if results[query][j] in true_neighbors_set: - correct += 1.0 - - return correct / total_num_of_results - - -def get_index_size_in_kb(opensearch, index_name): - """ - Gets the size of an index in kilobytes - Args: - opensearch: opensearch client - index_name: name of index to look up - Returns: - size of index in kilobytes - """ - return int( - opensearch.indices.stats(index_name, metric='store')['indices'] - [index_name]['total']['store']['size_in_bytes']) / 1024 - - -def get_cache_size_in_kb(endpoint, port): - """ - Gets the size of the k-NN cache in kilobytes - Args: - endpoint: endpoint of OpenSearch cluster - port: port of endpoint OpenSearch is running on - Returns: - size of cache in kilobytes - """ - response = requests.get('http://' + endpoint + ':' + str(port) + - '/_plugins/_knn/stats', - headers={'content-type': 'application/json'}) - stats = response.json() - - keys = stats['nodes'].keys() - - total_used = 0 - for key in keys: - total_used += int(stats['nodes'][key]['graph_memory_usage']) - return total_used - - -def query_index(opensearch: OpenSearch, index_name: str, body: dict, - excluded_fields: list): - start_time = round(time.time()*1000) - queryResponse = opensearch.search(index=index_name, - body=body, - _source_excludes=excluded_fields) - end_time = round(time.time() * 1000) - queryResponse['client_time'] = end_time - start_time - return queryResponse - - -def bulk_index(opensearch: OpenSearch, index_name: str, body: List): - return opensearch.bulk(index=index_name, body=body) - -def get_segment_stats(opensearch: OpenSearch, index_name: str): - return opensearch.indices.segments(index=index_name) diff --git a/benchmarks/perf-tool/okpt/test/test.py b/benchmarks/perf-tool/okpt/test/test.py deleted file mode 100644 index c947545ad..000000000 --- a/benchmarks/perf-tool/okpt/test/test.py +++ /dev/null @@ -1,188 +0,0 @@ -# SPDX-License-Identifier: Apache-2.0 -# 
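For readers skimming this removal, the alternating action/document layout that the deleted `bulk_transform` helper produced for the Bulk API can be pictured with a small, self-contained sketch. This is an illustrative re-implementation, not the removed function itself; the index name, field name, vectors and ids are placeholder values.

```python
from typing import Any, Dict, List

def bulk_transform(vectors: List[List[float]], index_name: str, field_name: str,
                   offset: int) -> List[Dict[str, Any]]:
    """Interleave an index action with its document for each vector (Bulk API layout)."""
    body: List[Dict[str, Any]] = []
    for i, vec in enumerate(vectors):
        body.append({'index': {'_index': index_name, '_id': i + offset}})  # action metadata
        body.append({field_name: vec})                                     # document source
    return body

# Two 2-d placeholder vectors ingested starting at doc id 100:
print(bulk_transform([[0.1, 0.2], [0.3, 0.4]], 'target_index', 'target_field', offset=100))
```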
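Similarly, the recall@R computation performed by the deleted `recall_at_r` helper boils down to the fraction of returned ids that fall inside the ground-truth neighbor sets. A minimal sketch, with invented ids purely for illustration and without the dataset-reading details of the original:

```python
from typing import List

def recall_at_r(results: List[List[int]], ground_truth: List[List[int]], r: int) -> float:
    """Share of the top-r returned ids that appear in the true nearest-neighbor sets."""
    correct, total = 0, 0
    for returned, truth in zip(results, ground_truth):
        truth_set = set(truth)
        top_r = min(r, len(truth_set))
        total += top_r
        correct += sum(1 for doc_id in returned[:top_r] if doc_id in truth_set)
    return correct / total if total else -1.0

# Two queries, r=2: the first finds both true neighbors, the second finds one of two.
print(recall_at_r([[1, 2], [9, 4]], [[1, 2, 3], [3, 4, 5]], r=2))  # 0.75
```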
-# The OpenSearch Contributors require contributions made to -# this file be licensed under the Apache-2.0 license or a -# compatible open source license. - -"""Provides a base Test class.""" -from math import floor -from typing import Any, Dict, List - -from okpt.io.config.parsers.test import TestConfig -from okpt.test.steps.base import Step - - -def get_avg(values: List[Any]): - """Get average value of a list. - - Args: - values: A list of values. - - Returns: - The average value in the list. - """ - valid_total = len(values) - running_sum = 0.0 - - for value in values: - if value == -1: - valid_total -= 1 - continue - running_sum += value - - if valid_total == 0: - return -1 - return running_sum / valid_total - - -def _pxx(values: List[Any], p: float): - """Calculates the pXX statistics for a given list. - - Args: - values: List of values. - p: Percentile (between 0 and 1). - - Returns: - The corresponding pXX metric. - """ - lowest_percentile = 1 / len(values) - highest_percentile = (len(values) - 1) / len(values) - - # return -1 if p is out of range or if the list doesn't have enough elements - # to support the specified percentile - if p < 0 or p > 1: - return -1.0 - elif p < lowest_percentile or p > highest_percentile: - if p == 1.0 and len(values) > 1: - return float(values[len(values) - 1]) - return -1.0 - else: - return float(values[floor(len(values) * p)]) - - -def _aggregate_steps(step_results: List[Dict[str, Any]], - measure_labels=None): - """Aggregates the steps for a given Test. - - The aggregation process extracts the measures from each step and calculates - the total time spent performing each step measure, including the - percentile metrics, if possible. - - The aggregation process also extracts the test measures by simply summing - up the respective step measures. - - A step measure is formatted as `{step_name}_{measure_name}`, for example, - {bulk_index}_{took} or {query_index}_{memory}. The braces are not included - in the actual key string. - - Percentile/Total step measures are give as - `{step_name}_{measure_name}_{percentile|total}`. - - Test measures are just step measure sums so they just given as - `test_{measure_name}`. - - Args: - steps: List of test steps to be aggregated. - measures: List of step metrics to account for. - - Returns: - A complete test result. 
- """ - if measure_labels is None: - measure_labels = ['took'] - test_measures = { - f'test_{measure_label}': 0 - for measure_label in measure_labels - } - step_measures: Dict[str, Any] = {} - - # iterate over all test steps - for step in step_results: - step_label = step['label'] - - step_measure_labels = list(step.keys()) - step_measure_labels.remove('label') - - # iterate over all measures in each test step - for measure_label in step_measure_labels: - - step_measure = step[measure_label] - step_measure_label = f'{measure_label}' if step_label == 'get_stats' else f'{step_label}_{measure_label}' - - # Add cumulative test measures from steps to test measures - if measure_label in measure_labels: - test_measures[f'test_{measure_label}'] += sum(step_measure) if \ - isinstance(step_measure, list) else step_measure - - if step_measure_label in step_measures: - _ = step_measures[step_measure_label].extend(step_measure) \ - if isinstance(step_measure, list) else \ - step_measures[step_measure_label].append(step_measure) - else: - step_measures[step_measure_label] = step_measure if \ - isinstance(step_measure, list) else [step_measure] - - aggregate = {**test_measures} - # calculate the totals and percentile statistics for each step measure - # where relevant - for step_measure_label, step_measure in step_measures.items(): - step_measure.sort() - - aggregate[step_measure_label + '_total'] = float(sum(step_measure)) - - p50 = _pxx(step_measure, 0.50) - if p50 != -1: - aggregate[step_measure_label + '_p50'] = p50 - p90 = _pxx(step_measure, 0.90) - if p90 != -1: - aggregate[step_measure_label + '_p90'] = p90 - p99 = _pxx(step_measure, 0.99) - if p99 != -1: - aggregate[step_measure_label + '_p99'] = p99 - p99_9 = _pxx(step_measure, 0.999) - if p99_9 != -1: - aggregate[step_measure_label + '_p99.9'] = p99_9 - p100 = _pxx(step_measure, 1.00) - if p100 != -1: - aggregate[step_measure_label + '_p100'] = p100 - - return aggregate - - -class Test: - """A base Test class, representing a collection of steps to profiled and - aggregated. - - Methods: - setup: Performs test setup. Usually for steps not intended to be - profiled. - run_steps: Runs the test steps, aggregating the results into the - `step_results` instance field. - cleanup: Perform test cleanup. Useful for clearing the state of a - persistent process like OpenSearch. Cleanup steps are executed after - each run. - execute: Runs steps, cleans up, and aggregates the test result. - """ - def __init__(self, test_config: TestConfig): - """Initializes the test state. 
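The measure-naming convention documented above (`{step_name}_{measure_name}` keys with `_total` and `_pXX` suffixes) is easiest to see with a small, simplified sketch; the step label and latency values are invented, and the percentile handling is cruder than the deleted `_pxx` helper (no out-of-range checks).

```python
from math import floor
from typing import Dict, List

def pxx(sorted_values: List[float], p: float) -> float:
    """Value at percentile p (0 <= p < 1) of an already-sorted list."""
    return float(sorted_values[floor(len(sorted_values) * p)])

def aggregate_step(step_label: str, took_ms: List[float]) -> Dict[str, float]:
    """Roll one step's raw latencies into total and percentile measures."""
    values = sorted(took_ms)
    return {
        f'{step_label}_took_total': sum(values),
        f'{step_label}_took_p50': pxx(values, 0.50),
        f'{step_label}_took_p90': pxx(values, 0.90),
        f'{step_label}_took_p99': pxx(values, 0.99),
    }

# A hypothetical 'query' step that recorded four latencies:
print(aggregate_step('query', [12.0, 15.0, 11.0, 30.0]))
# {'query_took_total': 68.0, 'query_took_p50': 15.0, 'query_took_p90': 30.0, 'query_took_p99': 30.0}
```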
- """ - self.test_config = test_config - self.setup_steps: List[Step] = test_config.setup - self.test_steps: List[Step] = test_config.steps - self.cleanup_steps: List[Step] = test_config.cleanup - - def setup(self): - _ = [step.execute() for step in self.setup_steps] - - def _run_steps(self): - step_results = [] - _ = [step_results.extend(step.execute()) for step in self.test_steps] - return step_results - - def _cleanup(self): - _ = [step.execute() for step in self.cleanup_steps] - - def execute(self): - results = self._run_steps() - self._cleanup() - return _aggregate_steps(results) diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/index.json b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/index.json deleted file mode 100644 index 7e8ddda8e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/index.json +++ /dev/null @@ -1,27 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1, - "knn.algo_param.ef_search": 100 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": 128, - "method": { - "name": "hnsw", - "space_type": "l2", - "engine": "faiss", - "parameters": { - "ef_construction": 256, - "m": 16 - } - } - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json deleted file mode 100644 index 3e04d12c4..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json +++ /dev/null @@ -1,42 +0,0 @@ -{ - "bool": - { - "should": - [ - { - "range": - { - "age": - { - "gte": 30, - "lte": 70 - } - } - }, - { - "term": - { - "color": "green" - } - }, - { - "term": - { - "color": "blue" - } - }, - { - "term": - { - "color": "yellow" - } - }, - { - "term": - { - "taste": "sweet" - } - } - ] - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml deleted file mode 100644 index ba8850e1d..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml +++ /dev/null @@ -1,40 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss HNSW Relaxed Filter Test" -test_id: "Faiss HNSW Relaxed Filter Test" -num_runs: 3 -show_runs: false -steps: - - name: delete_index - index_name: target_index - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-hnsw/filtering/relaxed-filter/index.json - - name: ingest_multi_field - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - attributes_dataset_name: attributes - attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ] - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query_with_filter - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - neighbors_format: hdf5 - neighbors_path: 
dataset/sift-128-euclidean-with-relaxed-filters.hdf5 - neighbors_dataset: neighbors_filter_5 - filter_spec: release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json - filter_type: FILTER diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/index.json b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/index.json deleted file mode 100644 index 7e8ddda8e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/index.json +++ /dev/null @@ -1,27 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1, - "knn.algo_param.ef_search": 100 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": 128, - "method": { - "name": "hnsw", - "space_type": "l2", - "engine": "faiss", - "parameters": { - "ef_construction": 256, - "m": 16 - } - } - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json deleted file mode 100644 index 9e6356f1c..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json +++ /dev/null @@ -1,44 +0,0 @@ -{ - "bool": - { - "must": - [ - { - "range": - { - "age": - { - "gte": 30, - "lte": 60 - } - } - }, - { - "term": - { - "taste": "bitter" - } - }, - { - "bool": - { - "should": - [ - { - "term": - { - "color": "blue" - } - }, - { - "term": - { - "color": "green" - } - } - ] - } - } - ] - } -} \ No newline at end of file diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml b/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml deleted file mode 100644 index 94f4073c7..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml +++ /dev/null @@ -1,40 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss HNSW Restrictive Filter Test" -test_id: "Faiss HNSW Restrictive Filter Test" -num_runs: 3 -show_runs: false -steps: - - name: delete_index - index_name: target_index - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-hnsw/filtering/restrictive-filter/index.json - - name: ingest_multi_field - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - attributes_dataset_name: attributes - attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ] - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query_with_filter - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean-with-restrictive-filters.hdf5 - neighbors_dataset: neighbors_filter_4 - filter_spec: release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json - filter_type: FILTER diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/index.json 
b/benchmarks/perf-tool/release-configs/faiss-hnsw/index.json deleted file mode 100644 index 7e8ddda8e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/index.json +++ /dev/null @@ -1,27 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1, - "knn.algo_param.ef_search": 100 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": 128, - "method": { - "name": "hnsw", - "space_type": "l2", - "engine": "faiss", - "parameters": { - "ef_construction": 256, - "m": 16 - } - } - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/index.json b/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/index.json deleted file mode 100644 index 338ceb1f4..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/index.json +++ /dev/null @@ -1,35 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1, - "knn.algo_param.ef_search": 100 - } - }, - "mappings": { - "_source": { - "excludes": ["nested_field"] - }, - "properties": { - "nested_field": { - "type": "nested", - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": 128, - "method": { - "name": "hnsw", - "space_type": "l2", - "engine": "faiss", - "parameters": { - "ef_construction": 256, - "m": 16 - } - } - } - } - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/simple-nested-test.yml b/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/simple-nested-test.yml deleted file mode 100644 index 151b2014d..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/nested/simple/simple-nested-test.yml +++ /dev/null @@ -1,37 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss HNSW Nested Field Test" -test_id: "Faiss HNSW Nested Field Test" -num_runs: 3 -show_runs: false -steps: - - name: delete_index - index_name: target_index - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-hnsw/nested/simple/index.json - - name: ingest_nested_field - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-nested.hdf5 - attributes_dataset_name: attributes - attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' }, { name: 'parent_id', type: 'int'} ] - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query_nested_field - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-nested.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean-nested.hdf5 - neighbors_dataset: neighbour_nested \ No newline at end of file diff --git a/benchmarks/perf-tool/release-configs/faiss-hnsw/test.yml b/benchmarks/perf-tool/release-configs/faiss-hnsw/test.yml deleted file mode 100644 index c4740acf5..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnsw/test.yml +++ /dev/null @@ -1,35 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss HNSW Test" -test_id: "Faiss HNSW Test" -num_runs: 3 -show_runs: false -steps: - - name: delete_index - index_name: target_index - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-hnsw/index.json 
- - name: ingest - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean.hdf5 diff --git a/benchmarks/perf-tool/release-configs/faiss-hnswpq/index.json b/benchmarks/perf-tool/release-configs/faiss-hnswpq/index.json deleted file mode 100644 index 479703412..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnswpq/index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "model_id": "test-model" - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnswpq/method-spec.json b/benchmarks/perf-tool/release-configs/faiss-hnswpq/method-spec.json deleted file mode 100644 index 2d67bf2df..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnswpq/method-spec.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "name":"hnsw", - "engine":"faiss", - "space_type": "l2", - "parameters":{ - "ef_construction": 256, - "m": 16, - "encoder": { - "name": "pq", - "parameters": { - "m": 16 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-hnswpq/test.yml b/benchmarks/perf-tool/release-configs/faiss-hnswpq/test.yml deleted file mode 100644 index f573ede9c..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnswpq/test.yml +++ /dev/null @@ -1,59 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss HNSW PQ Test" -test_id: "Faiss HNSW PQ Test" -num_runs: 3 -show_runs: false -setup: - - name: delete_index - index_name: train_index - - name: create_index - index_name: train_index - index_spec: release-configs/faiss-hnswpq/train-index-spec.json - - name: ingest - index_name: train_index - field_name: train_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - doc_count: 50000 - - name: refresh_index - index_name: train_index -steps: - - name: delete_model - model_id: test-model - - name: delete_index - index_name: target_index - - name: train_model - model_id: test-model - train_index: train_index - train_field: train_field - dimension: 128 - method_spec: release-configs/faiss-hnswpq/method-spec.json - max_training_vector_count: 50000 - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-hnswpq/index.json - - name: ingest - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean.hdf5 diff --git a/benchmarks/perf-tool/release-configs/faiss-hnswpq/train-index-spec.json 
b/benchmarks/perf-tool/release-configs/faiss-hnswpq/train-index-spec.json deleted file mode 100644 index 804a5707e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-hnswpq/train-index-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": 24, - "number_of_replicas": 0 - } - }, - "mappings": { - "properties": { - "train_field": { - "type": "knn_vector", - "dimension": 128 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/index.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/index.json deleted file mode 100644 index ade7fa377..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "model_id": "test-model" - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/method-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/method-spec.json deleted file mode 100644 index 51ae89877..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/method-spec.json +++ /dev/null @@ -1,9 +0,0 @@ -{ - "name":"ivf", - "engine":"faiss", - "space_type": "l2", - "parameters":{ - "nlist": 128, - "nprobes": 8 - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-spec.json deleted file mode 100644 index 3e04d12c4..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-spec.json +++ /dev/null @@ -1,42 +0,0 @@ -{ - "bool": - { - "should": - [ - { - "range": - { - "age": - { - "gte": 30, - "lte": 70 - } - } - }, - { - "term": - { - "color": "green" - } - }, - { - "term": - { - "color": "blue" - } - }, - { - "term": - { - "color": "yellow" - } - }, - { - "term": - { - "taste": "sweet" - } - } - ] - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-test.yml b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-test.yml deleted file mode 100644 index adb25a04d..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-test.yml +++ /dev/null @@ -1,64 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss IVF Relaxed Filter Test" -test_id: "Faiss IVF Relaxed Filter Test" -num_runs: 3 -show_runs: false -setup: - - name: delete_index - index_name: train_index - - name: create_index - index_name: train_index - index_spec: release-configs/faiss-ivf/filtering/relaxed-filter/train-index-spec.json - - name: ingest - index_name: train_index - field_name: train_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - doc_count: 50000 - - name: refresh_index - index_name: train_index -steps: - - name: delete_model - model_id: test-model - - name: delete_index - index_name: target_index - - name: train_model - model_id: test-model - train_index: train_index - train_field: train_field - dimension: 128 - method_spec: release-configs/faiss-ivf/filtering/relaxed-filter/method-spec.json - max_training_vector_count: 50000 - - name: create_index - index_name: target_index - 
index_spec: release-configs/faiss-ivf/filtering/relaxed-filter/index.json - - name: ingest_multi_field - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - attributes_dataset_name: attributes - attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ] - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query_with_filter - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean-with-relaxed-filters.hdf5 - neighbors_dataset: neighbors_filter_5 - filter_spec: release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-spec.json - filter_type: FILTER diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/train-index-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/train-index-spec.json deleted file mode 100644 index 137fac9d8..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/relaxed-filter/train-index-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "train_field": { - "type": "knn_vector", - "dimension": 128 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/index.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/index.json deleted file mode 100644 index ade7fa377..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "model_id": "test-model" - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/method-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/method-spec.json deleted file mode 100644 index 51ae89877..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/method-spec.json +++ /dev/null @@ -1,9 +0,0 @@ -{ - "name":"ivf", - "engine":"faiss", - "space_type": "l2", - "parameters":{ - "nlist": 128, - "nprobes": 8 - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-spec.json deleted file mode 100644 index 9e6356f1c..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-spec.json +++ /dev/null @@ -1,44 +0,0 @@ -{ - "bool": - { - "must": - [ - { - "range": - { - "age": - { - "gte": 30, - "lte": 60 - } - } - }, - { - "term": - { - "taste": "bitter" - } - }, - { - "bool": - { - "should": - [ - { - "term": - { - "color": "blue" - } - }, - { - "term": - { - "color": "green" - } - } - ] - } - } - ] - } -} \ No newline at end of file diff --git 
a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-test.yml b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-test.yml deleted file mode 100644 index bad047eab..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-test.yml +++ /dev/null @@ -1,64 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss IVF restrictive Filter Test" -test_id: "Faiss IVF restrictive Filter Test" -num_runs: 3 -show_runs: false -setup: - - name: delete_index - index_name: train_index - - name: create_index - index_name: train_index - index_spec: release-configs/faiss-ivf/filtering/restrictive-filter/train-index-spec.json - - name: ingest - index_name: train_index - field_name: train_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - doc_count: 50000 - - name: refresh_index - index_name: train_index -steps: - - name: delete_model - model_id: test-model - - name: delete_index - index_name: target_index - - name: train_model - model_id: test-model - train_index: train_index - train_field: train_field - dimension: 128 - method_spec: release-configs/faiss-ivf/filtering/restrictive-filter/method-spec.json - max_training_vector_count: 50000 - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-ivf/filtering/restrictive-filter/index.json - - name: ingest_multi_field - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - attributes_dataset_name: attributes - attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ] - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query_with_filter - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean-with-attr.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean-with-restrictive-filters.hdf5 - neighbors_dataset: neighbors_filter_4 - filter_spec: release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-spec.json - filter_type: FILTER diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/train-index-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/train-index-spec.json deleted file mode 100644 index 804a5707e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/filtering/restrictive-filter/train-index-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": 24, - "number_of_replicas": 0 - } - }, - "mappings": { - "properties": { - "train_field": { - "type": "knn_vector", - "dimension": 128 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/index.json b/benchmarks/perf-tool/release-configs/faiss-ivf/index.json deleted file mode 100644 index 479703412..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "model_id": "test-model" - } - } - } -} diff --git 
a/benchmarks/perf-tool/release-configs/faiss-ivf/method-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/method-spec.json deleted file mode 100644 index 51ae89877..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/method-spec.json +++ /dev/null @@ -1,9 +0,0 @@ -{ - "name":"ivf", - "engine":"faiss", - "space_type": "l2", - "parameters":{ - "nlist": 128, - "nprobes": 8 - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/test.yml b/benchmarks/perf-tool/release-configs/faiss-ivf/test.yml deleted file mode 100644 index 367c42594..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/test.yml +++ /dev/null @@ -1,59 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss IVF" -test_id: "Faiss IVF" -num_runs: 3 -show_runs: false -setup: - - name: delete_index - index_name: train_index - - name: create_index - index_name: train_index - index_spec: release-configs/faiss-ivf/train-index-spec.json - - name: ingest - index_name: train_index - field_name: train_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - doc_count: 50000 - - name: refresh_index - index_name: train_index -steps: - - name: delete_model - model_id: test-model - - name: delete_index - index_name: target_index - - name: train_model - model_id: test-model - train_index: train_index - train_field: train_field - dimension: 128 - method_spec: release-configs/faiss-ivf/method-spec.json - max_training_vector_count: 50000 - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-ivf/index.json - - name: ingest - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean.hdf5 diff --git a/benchmarks/perf-tool/release-configs/faiss-ivf/train-index-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivf/train-index-spec.json deleted file mode 100644 index 804a5707e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivf/train-index-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": 24, - "number_of_replicas": 0 - } - }, - "mappings": { - "properties": { - "train_field": { - "type": "knn_vector", - "dimension": 128 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivfpq/index.json b/benchmarks/perf-tool/release-configs/faiss-ivfpq/index.json deleted file mode 100644 index 479703412..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivfpq/index.json +++ /dev/null @@ -1,17 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "model_id": "test-model" - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivfpq/method-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivfpq/method-spec.json deleted file mode 100644 index 204b0a653..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivfpq/method-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "name":"ivf", - "engine":"faiss", - 
"space_type": "l2", - "parameters":{ - "nlist": 128, - "nprobes": 8, - "encoder": { - "name": "pq", - "parameters": { - "m": 16, - "code_size": 8 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/faiss-ivfpq/test.yml b/benchmarks/perf-tool/release-configs/faiss-ivfpq/test.yml deleted file mode 100644 index c3f63348b..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivfpq/test.yml +++ /dev/null @@ -1,59 +0,0 @@ -endpoint: [ENDPOINT] -port: [PORT] -test_name: "Faiss IVF PQ Test" -test_id: "Faiss IVF PQ Test" -num_runs: 3 -show_runs: false -setup: - - name: delete_index - index_name: train_index - - name: create_index - index_name: train_index - index_spec: release-configs/faiss-ivfpq/train-index-spec.json - - name: ingest - index_name: train_index - field_name: train_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - doc_count: 50000 - - name: refresh_index - index_name: train_index -steps: - - name: delete_model - model_id: test-model - - name: delete_index - index_name: target_index - - name: train_model - model_id: test-model - train_index: train_index - train_field: train_field - dimension: 128 - method_spec: release-configs/faiss-ivfpq/method-spec.json - max_training_vector_count: 50000 - - name: create_index - index_name: target_index - index_spec: release-configs/faiss-ivfpq/index.json - - name: ingest - index_name: target_index - field_name: target_field - bulk_size: 500 - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - - name: refresh_index - index_name: target_index - - name: force_merge - index_name: target_index - max_num_segments: 1 - - name: warmup_operation - index_name: target_index - - name: query - k: 100 - r: 1 - calculate_recall: true - index_name: target_index - field_name: target_field - dataset_format: hdf5 - dataset_path: dataset/sift-128-euclidean.hdf5 - neighbors_format: hdf5 - neighbors_path: dataset/sift-128-euclidean.hdf5 diff --git a/benchmarks/perf-tool/release-configs/faiss-ivfpq/train-index-spec.json b/benchmarks/perf-tool/release-configs/faiss-ivfpq/train-index-spec.json deleted file mode 100644 index 804a5707e..000000000 --- a/benchmarks/perf-tool/release-configs/faiss-ivfpq/train-index-spec.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "settings": { - "index": { - "number_of_shards": 24, - "number_of_replicas": 0 - } - }, - "mappings": { - "properties": { - "train_field": { - "type": "knn_vector", - "dimension": 128 - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/index.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/index.json deleted file mode 100644 index 7a9ff2890..000000000 --- a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/index.json +++ /dev/null @@ -1,26 +0,0 @@ -{ - "settings": { - "index": { - "knn": true, - "number_of_shards": 24, - "number_of_replicas": 1 - } - }, - "mappings": { - "properties": { - "target_field": { - "type": "knn_vector", - "dimension": 128, - "method": { - "name": "hnsw", - "space_type": "l2", - "engine": "lucene", - "parameters": { - "ef_construction": 256, - "m": 16 - } - } - } - } - } -} diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json deleted file mode 100644 index 3e04d12c4..000000000 --- 
a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json
+++ /dev/null
@@ -1,42 +0,0 @@
-{
-  "bool":
-  {
-    "should":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 30,
-            "lte": 70
-          }
-        }
-      },
-      {
-        "term":
-        {
-          "color": "green"
-        }
-      },
-      {
-        "term":
-        {
-          "color": "blue"
-        }
-      },
-      {
-        "term":
-        {
-          "color": "yellow"
-        }
-      },
-      {
-        "term":
-        {
-          "taste": "sweet"
-        }
-      }
-    ]
-  }
-}
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml
deleted file mode 100644
index 3bbb99a0f..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml
+++ /dev/null
@@ -1,38 +0,0 @@
-endpoint: [ENDPOINT]
-port: [PORT]
-test_name: "Lucene HNSW Relaxed Filter Test"
-test_id: "Lucene HNSW Relaxed Filter Test"
-num_runs: 3
-show_runs: false
-steps:
-  - name: delete_index
-    index_name: target_index
-  - name: create_index
-    index_name: target_index
-    index_spec: release-configs/lucene-hnsw/filtering/relaxed-filter/index.json
-  - name: ingest_multi_field
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-with-attr.hdf5
-    attributes_dataset_name: attributes
-    attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ]
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 1
-  - name: query_with_filter
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-with-attr.hdf5
-    neighbors_format: hdf5
-    neighbors_path: dataset/sift-128-euclidean-with-relaxed-filters.hdf5
-    neighbors_dataset: neighbors_filter_5
-    filter_spec: release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-spec.json
-    filter_type: FILTER
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/index.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/index.json
deleted file mode 100644
index 7a9ff2890..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/index.json
+++ /dev/null
@@ -1,26 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "number_of_shards": 24,
-      "number_of_replicas": 1
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "dimension": 128,
-        "method": {
-          "name": "hnsw",
-          "space_type": "l2",
-          "engine": "lucene",
-          "parameters": {
-            "ef_construction": 256,
-            "m": 16
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json
deleted file mode 100644
index 9e6356f1c..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json
+++ /dev/null
@@ -1,44 +0,0 @@
-{
-  "bool":
-  {
-    "must":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 30,
-            "lte": 60
-          }
-        }
-      },
-      {
-        "term":
-        {
-          "taste": "bitter"
-        }
-      },
-      {
-        "bool":
-        {
-          "should":
-          [
-            {
-              "term":
-              {
-                "color": "blue"
-              }
-            },
-            {
-              "term":
-              {
-                "color": "green"
-              }
-            }
-          ]
-        }
-      }
-    ]
-  }
-}
\ No newline at end of file
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml b/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml
deleted file mode 100644
index aa4c5193f..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml
+++ /dev/null
@@ -1,38 +0,0 @@
-endpoint: [ENDPOINT]
-port: [PORT]
-test_name: "Lucene HNSW Restrictive Filter Test"
-test_id: "Lucene HNSW Restrictive Filter Test"
-num_runs: 3
-show_runs: false
-steps:
-  - name: delete_index
-    index_name: target_index
-  - name: create_index
-    index_name: target_index
-    index_spec: release-configs/lucene-hnsw/filtering/restrictive-filter/index.json
-  - name: ingest_multi_field
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-with-attr.hdf5
-    attributes_dataset_name: attributes
-    attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ]
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 1
-  - name: query_with_filter
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-with-attr.hdf5
-    neighbors_format: hdf5
-    neighbors_path: dataset/sift-128-euclidean-with-restrictive-filters.hdf5
-    neighbors_dataset: neighbors_filter_4
-    filter_spec: release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-spec.json
-    filter_type: FILTER
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/index.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/index.json
deleted file mode 100644
index 7a9ff2890..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/index.json
+++ /dev/null
@@ -1,26 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "number_of_shards": 24,
-      "number_of_replicas": 1
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "dimension": 128,
-        "method": {
-          "name": "hnsw",
-          "space_type": "l2",
-          "engine": "lucene",
-          "parameters": {
-            "ef_construction": 256,
-            "m": 16
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/index.json b/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/index.json
deleted file mode 100644
index b41b51c77..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/index.json
+++ /dev/null
@@ -1,34 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "number_of_shards": 24,
-      "number_of_replicas": 1
-    }
-  },
-  "mappings": {
-    "_source": {
-      "excludes": ["nested_field"]
-    },
-    "properties": {
-      "nested_field": {
-        "type": "nested",
-        "properties": {
-          "target_field": {
-            "type": "knn_vector",
-            "dimension": 128,
-            "method": {
-              "name": "hnsw",
-              "space_type": "l2",
-              "engine": "lucene",
-              "parameters": {
-                "ef_construction": 256,
-                "m": 16
-              }
-            }
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/simple-nested-test.yml b/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/simple-nested-test.yml
deleted file mode 100644
index be825487a..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/nested/simple/simple-nested-test.yml
+++ /dev/null
@@ -1,37 +0,0 @@
-endpoint: [ENDPOINT]
-port: [PORT]
-test_name: "Lucene HNSW Nested Field Test"
-test_id: "Lucene HNSW Nested Field Test"
-num_runs: 3
-show_runs: false
-steps:
-  - name: delete_index
-    index_name: target_index
-  - name: create_index
-    index_name: target_index
-    index_spec: release-configs/lucene-hnsw/nested/simple/index.json
-  - name: ingest_nested_field
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-nested.hdf5
-    attributes_dataset_name: attributes
-    attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' }, { name: 'parent_id', type: 'int'} ]
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 1
-  - name: warmup_operation
-    index_name: target_index
-  - name: query_nested_field
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean-nested.hdf5
-    neighbors_format: hdf5
-    neighbors_path: dataset/sift-128-euclidean-nested.hdf5
-    neighbors_dataset: neighbour_nested
\ No newline at end of file
diff --git a/benchmarks/perf-tool/release-configs/lucene-hnsw/test.yml b/benchmarks/perf-tool/release-configs/lucene-hnsw/test.yml
deleted file mode 100644
index b253ee08e..000000000
--- a/benchmarks/perf-tool/release-configs/lucene-hnsw/test.yml
+++ /dev/null
@@ -1,33 +0,0 @@
-endpoint: [ENDPOINT]
-port: [PORT]
-test_name: "Lucene HNSW"
-test_id: "Lucene HNSW"
-num_runs: 3
-show_runs: false
-steps:
-  - name: delete_index
-    index_name: target_index
-  - name: create_index
-    index_name: target_index
-    index_spec: release-configs/lucene-hnsw/index.json
-  - name: ingest
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean.hdf5
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 1
-  - name: query
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean.hdf5
-    neighbors_format: hdf5
-    neighbors_path: dataset/sift-128-euclidean.hdf5
diff --git a/benchmarks/perf-tool/release-configs/nmslib-hnsw/index.json b/benchmarks/perf-tool/release-configs/nmslib-hnsw/index.json
deleted file mode 100644
index eb714c5c8..000000000
--- a/benchmarks/perf-tool/release-configs/nmslib-hnsw/index.json
+++ /dev/null
@@ -1,27 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "number_of_shards": 24,
-      "number_of_replicas": 1,
-      "knn.algo_param.ef_search": 100
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "dimension": 128,
-        "method": {
-          "name": "hnsw",
-          "space_type": "l2",
-          "engine": "nmslib",
-          "parameters": {
-            "ef_construction": 256,
-            "m": 16
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/release-configs/nmslib-hnsw/test.yml b/benchmarks/perf-tool/release-configs/nmslib-hnsw/test.yml
deleted file mode 100644
index 94ad9b131..000000000
--- a/benchmarks/perf-tool/release-configs/nmslib-hnsw/test.yml
+++ /dev/null
@@ -1,35 +0,0 @@
-endpoint: [ENDPOINT]
-port: [PORT]
-test_name: "Nmslib HNSW Test"
-test_id: "Nmslib HNSW Test"
-num_runs: 3
-show_runs: false
-steps:
-  - name: delete_index
-    index_name: target_index
-  - name: create_index
-    index_name: target_index
-    index_spec: release-configs/nmslib-hnsw/index.json
-  - name: ingest
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean.hdf5
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 1
-  - name: warmup_operation
-    index_name: target_index
-  - name: query
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: dataset/sift-128-euclidean.hdf5
-    neighbors_format: hdf5
-    neighbors_path: dataset/sift-128-euclidean.hdf5
diff --git a/benchmarks/perf-tool/release-configs/run_all_tests.sh b/benchmarks/perf-tool/release-configs/run_all_tests.sh
deleted file mode 100755
index e65d5b5c4..000000000
--- a/benchmarks/perf-tool/release-configs/run_all_tests.sh
+++ /dev/null
@@ -1,102 +0,0 @@
-#!/bin/bash
-set -e
-
-# Description:
-# Run a performance test for release
-# Dataset should be available in perf-tool/dataset before running this script
-#
-# Example:
-# ./run-test.sh --endpoint localhost
-#
-# Usage:
-# ./run-test.sh \
-# --endpoint
-# --port 80 \
-# --num-runs 3 \
-# --outputs ~/outputs
-
-while [ "$1" != "" ]; do
-    case $1 in
-        -url | --endpoint ) shift
-                            ENDPOINT=$1
-                            ;;
-        -p | --port )       shift
-                            PORT=$1
-                            ;;
-        -n | --num-runs )   shift
-                            NUM_RUNS=$1
-                            ;;
-        -o | --outputs )    shift
-                            OUTPUTS=$1
-                            ;;
-        * )                 echo "Unknown parameter"
-                            echo $1
-                            exit 1
-                            ;;
-    esac
-    shift
-done
-
-if [ ! -n "$ENDPOINT" ]; then
-    echo "--endpoint should be specified"
-    exit
-fi
-
-if [ ! -n "$PORT" ]; then
-    PORT=80
-    echo "--port is not specified. Using default values $PORT"
-fi
-
-if [ ! -n "$NUM_RUNS" ]; then
-    NUM_RUNS=3
-    echo "--num-runs is not specified. Using default values $NUM_RUNS"
-fi
-
-if [ ! -n "$OUTPUTS" ]; then
-    OUTPUTS="$HOME/outputs"
-    echo "--outputs is not specified. Using default values $OUTPUTS"
-fi
-
-
-curl -X PUT "http://$ENDPOINT:$PORT/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
-{
-  "persistent" : {
-    "knn.algo_param.index_thread_qty" : 4
-  }
-}
-'
-
-TESTS="./release-configs/faiss-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml
-./release-configs/faiss-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml
-./release-configs/faiss-hnsw/nested/simple/simple-nested-test.yml
-./release-configs/faiss-hnsw/test.yml
-./release-configs/faiss-hnswpq/test.yml
-./release-configs/faiss-ivf/filtering/relaxed-filter/relaxed-filter-test.yml
-./release-configs/faiss-ivf/filtering/restrictive-filter/restrictive-filter-test.yml
-./release-configs/faiss-ivf/test.yml
-./release-configs/faiss-ivfpq/test.yml
-./release-configs/lucene-hnsw/filtering/relaxed-filter/relaxed-filter-test.yml
-./release-configs/lucene-hnsw/filtering/restrictive-filter/restrictive-filter-test.yml
-./release-configs/lucene-hnsw/nested/simple/simple-nested-test.yml
-./release-configs/lucene-hnsw/test.yml
-./release-configs/nmslib-hnsw/test.yml"
-
-if [ ! -d $OUTPUTS ]
-then
-    mkdir $OUTPUTS
-fi
-
-for TEST in $TESTS
-do
-    ORG_FILE=$TEST
-    NEW_FILE="$ORG_FILE.tmp"
-    OUT_FILE=$(grep test_id $ORG_FILE | cut -d':' -f2 | sed -r 's/^ "|"$//g' | sed 's/ /_/g')
-    echo "cp $ORG_FILE $NEW_FILE"
-    cp $ORG_FILE $NEW_FILE
-    sed -i "/^endpoint:/c\endpoint: $ENDPOINT" $NEW_FILE
-    sed -i "/^port:/c\port: $PORT" $NEW_FILE
-    sed -i "/^num_runs:/c\num_runs: $NUM_RUNS" $NEW_FILE
-    python3 knn-perf-tool.py test $NEW_FILE $OUTPUTS/$OUT_FILE
-    #Sleep for 1 min to cool down cpu from the previous run
-    sleep 60
-done
diff --git a/benchmarks/perf-tool/requirements.in b/benchmarks/perf-tool/requirements.in
deleted file mode 100644
index fd3555aab..000000000
--- a/benchmarks/perf-tool/requirements.in
+++ /dev/null
@@ -1,7 +0,0 @@
-Cerberus
-opensearch-py
-PyYAML
-numpy
-h5py
-requests
-psutil
diff --git a/benchmarks/perf-tool/requirements.txt b/benchmarks/perf-tool/requirements.txt
deleted file mode 100644
index fdfe205f8..000000000
--- a/benchmarks/perf-tool/requirements.txt
+++ /dev/null
@@ -1,37 +0,0 @@
-#
-# This file is autogenerated by pip-compile with python 3.9
-# To update, run:
-#
-#    pip-compile
-#
-cerberus==1.3.4
-    # via -r requirements.in
-certifi==2024.7.4
-    # via
-    #   opensearch-py
-    #   requests
-charset-normalizer==2.0.4
-    # via requests
-h5py==3.3.0
-    # via -r requirements.in
-idna==3.7
-    # via requests
-numpy==1.24.2
-    # via
-    #   -r requirements.in
-    #   h5py
-opensearch-py==1.0.0
-    # via -r requirements.in
-psutil==5.8.0
-    # via -r requirements.in
-pyyaml==5.4.1
-    # via -r requirements.in
-requests==2.32.0
-    # via -r requirements.in
-urllib3==1.26.18
-    # via
-    #   opensearch-py
-    #   requests
-
-# The following packages are considered to be unsafe in a requirements file:
-# setuptools
diff --git a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/index-spec.json b/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/index-spec.json
deleted file mode 100644
index 5542ef387..000000000
--- a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/index-spec.json
+++ /dev/null
@@ -1,17 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "number_of_shards": 3,
-      "number_of_replicas": 0
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "model_id": "test-model"
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/method-spec.json b/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/method-spec.json
deleted file mode 100644
index 1aa7f809f..000000000
--- a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/method-spec.json
+++ /dev/null
@@ -1,8 +0,0 @@
-{
-  "name":"ivf",
-  "engine":"faiss",
-  "parameters":{
-    "nlist":16,
-    "nprobes": 4
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/test.yml b/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/test.yml
deleted file mode 100644
index 027ba8683..000000000
--- a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/test.yml
+++ /dev/null
@@ -1,62 +0,0 @@
-endpoint: localhost
-test_name: faiss_sift_ivf
-test_id: "Test workflow for faiss ivf"
-num_runs: 3
-show_runs: true
-setup:
-  - name: delete_model
-    model_id: test-model
-  - name: delete_index
-    index_name: target_index
-  - name: delete_index
-    index_name: train_index
-  - name: create_index
-    index_name: train_index
-    index_spec: sample-configs/faiss-sift-ivf/train-index-spec.json
-  - name: ingest
-    index_name: train_index
-    field_name: train_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean.hdf5
-  - name: refresh_index
-    index_name: train_index
-steps:
-  - name: train_model
-    model_id: test-model
-    train_index: train_index
-    train_field: train_field
-    dimension: 128
-    method_spec: sample-configs/faiss-sift-ivf/method-spec.json
-    max_training_vector_count: 1000000000
-  - name: create_index
-    index_name: target_index
-    index_spec: sample-configs/faiss-sift-ivf/index-spec.json
-  - name: ingest
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean.hdf5
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 10
-  - name: warmup_operation
-    index_name: target_index
-  - name: query
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean.hdf5
-    neighbors_format: hdf5
-    neighbors_path: ../dataset/sift-128-euclidean.hdf5
-cleanup:
-  - name: delete_model
-    model_id: test-model
-  - name: delete_index
-    index_name: target_index
diff --git a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/train-index-spec.json b/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/train-index-spec.json
deleted file mode 100644
index 00a418e4f..000000000
--- a/benchmarks/perf-tool/sample-configs/faiss-sift-ivf/train-index-spec.json
+++ /dev/null
@@ -1,16 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "number_of_shards": 3,
-      "number_of_replicas": 0
-    }
-  },
-  "mappings": {
-    "properties": {
-      "train_field": {
-        "type": "knn_vector",
-        "dimension": 128
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/filter-spec/filter-1-spec.json b/benchmarks/perf-tool/sample-configs/filter-spec/filter-1-spec.json
deleted file mode 100644
index f529de4fe..000000000
--- a/benchmarks/perf-tool/sample-configs/filter-spec/filter-1-spec.json
+++ /dev/null
@@ -1,24 +0,0 @@
-{
-  "bool":
-  {
-    "must":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 20,
-            "lte": 100
-          }
-        }
-      },
-      {
-        "term":
-        {
-          "color": "red"
-        }
-      }
-    ]
-  }
-}
\ No newline at end of file
diff --git a/benchmarks/perf-tool/sample-configs/filter-spec/filter-2-spec.json b/benchmarks/perf-tool/sample-configs/filter-spec/filter-2-spec.json
deleted file mode 100644
index 9d4514e62..000000000
--- a/benchmarks/perf-tool/sample-configs/filter-spec/filter-2-spec.json
+++ /dev/null
@@ -1,40 +0,0 @@
-{
-  "bool":
-  {
-    "must":
-    [
-      {
-        "term":
-        {
-          "taste": "salty"
-        }
-      },
-      {
-        "bool":
-        {
-          "should":
-          [
-            {
-              "bool":
-              {
-                "must_not":
-                {
-                  "exists":
-                  {
-                    "field": "color"
-                  }
-                }
-              }
-            },
-            {
-              "term":
-              {
-                "color": "blue"
-              }
-            }
-          ]
-        }
-      }
-    ]
-  }
-}
\ No newline at end of file
diff --git a/benchmarks/perf-tool/sample-configs/filter-spec/filter-3-spec.json b/benchmarks/perf-tool/sample-configs/filter-spec/filter-3-spec.json
deleted file mode 100644
index d69f8768e..000000000
--- a/benchmarks/perf-tool/sample-configs/filter-spec/filter-3-spec.json
+++ /dev/null
@@ -1,30 +0,0 @@
-{
-  "bool":
-  {
-    "must":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 20,
-            "lte": 80
-          }
-        }
-      },
-      {
-        "exists":
-        {
-          "field": "color"
-        }
-      },
-      {
-        "exists":
-        {
-          "field": "taste"
-        }
-      }
-    ]
-  }
-}
\ No newline at end of file
diff --git a/benchmarks/perf-tool/sample-configs/filter-spec/filter-4-spec.json b/benchmarks/perf-tool/sample-configs/filter-spec/filter-4-spec.json
deleted file mode 100644
index 822d63b37..000000000
--- a/benchmarks/perf-tool/sample-configs/filter-spec/filter-4-spec.json
+++ /dev/null
@@ -1,44 +0,0 @@
-{
-  "bool":
-  {
-    "must":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 30,
-            "lte": 60
-          }
-        }
-      },
-      {
-        "term":
-        {
-          "taste": "bitter"
-        }
-      },
-      {
-        "bool":
-        {
-          "should":
-          [
-            {
-              "term":
-              {
-                "color": "blue"
-              }
-            },
-            {
-              "term":
-              {
-                "color": "green"
-              }
-            }
-          ]
-        }
-      }
-    ]
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/filter-spec/filter-5-spec.json b/benchmarks/perf-tool/sample-configs/filter-spec/filter-5-spec.json
deleted file mode 100644
index 3e04d12c4..000000000
--- a/benchmarks/perf-tool/sample-configs/filter-spec/filter-5-spec.json
+++ /dev/null
@@ -1,42 +0,0 @@
-{
-  "bool":
-  {
-    "should":
-    [
-      {
-        "range":
-        {
-          "age":
-          {
-            "gte": 30,
-            "lte": 70
-          }
-        }
-      },
-      {
-        "term":
-        {
-          "color": "green"
-        }
-      },
-      {
-        "term":
-        {
-          "color": "blue"
-        }
-      },
-      {
-        "term":
-        {
-          "color": "yellow"
-        }
-      },
-      {
-        "term":
-        {
-          "taste": "sweet"
-        }
-      }
-    ]
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/index-spec.json b/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/index-spec.json
deleted file mode 100644
index 83ea79b15..000000000
--- a/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/index-spec.json
+++ /dev/null
@@ -1,27 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "refresh_interval": "10s",
-      "number_of_shards": 30,
-      "number_of_replicas": 0
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "dimension": 128,
-        "method": {
-          "name": "hnsw",
-          "space_type": "l2",
-          "engine": "lucene",
-          "parameters": {
-            "ef_construction": 100,
-            "m": 16
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/test.yml b/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/test.yml
deleted file mode 100644
index aa2ee6389..000000000
--- a/benchmarks/perf-tool/sample-configs/lucene-sift-hnsw-filter/test.yml
+++ /dev/null
@@ -1,41 +0,0 @@
-endpoint: localhost
-test_name: lucene_sift_hnsw
-test_id: "Test workflow for lucene hnsw"
-num_runs: 1
-show_runs: false
-setup:
-  - name: delete_index
-    index_name: target_index
-steps:
-  - name: create_index
-    index_name: target_index
-    index_spec: sample-configs/lucene-sift-hnsw-filter/index-spec.json
-  - name: ingest_multi_field
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean-with-attr.hdf5
-    attributes_dataset_name: attributes
-    attribute_spec: [ { name: 'color', type: 'str' }, { name: 'taste', type: 'str' }, { name: 'age', type: 'int' } ]
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 10
-  - name: query_with_filter
-    k: 10
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean-with-attr.hdf5
-    neighbors_format: hdf5
-    neighbors_path: ../dataset/sift-128-euclidean-with-attr-with-filters.hdf5
-    neighbors_dataset: neighbors_filter_1
-    filter_spec: sample-configs/filter-spec/filter-1-spec.json
-    query_count: 100
-cleanup:
-  - name: delete_index
-    index_name: target_index
\ No newline at end of file
diff --git a/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/index-spec.json b/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/index-spec.json
deleted file mode 100644
index 75abe7baa..000000000
--- a/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/index-spec.json
+++ /dev/null
@@ -1,28 +0,0 @@
-{
-  "settings": {
-    "index": {
-      "knn": true,
-      "knn.algo_param.ef_search": 512,
-      "refresh_interval": "10s",
-      "number_of_shards": 1,
-      "number_of_replicas": 0
-    }
-  },
-  "mappings": {
-    "properties": {
-      "target_field": {
-        "type": "knn_vector",
-        "dimension": 128,
-        "method": {
-          "name": "hnsw",
-          "space_type": "l2",
-          "engine": "nmslib",
-          "parameters": {
-            "ef_construction": 512,
-            "m": 16
-          }
-        }
-      }
-    }
-  }
-}
diff --git a/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/test.yml b/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/test.yml
deleted file mode 100644
index 6d96bf80c..000000000
--- a/benchmarks/perf-tool/sample-configs/nmslib-sift-hnsw/test.yml
+++ /dev/null
@@ -1,38 +0,0 @@
-endpoint: localhost
-test_name: nmslib_sift_hnsw
-test_id: "Test workflow for nmslib hnsw"
-num_runs: 2
-show_runs: false
-setup:
-  - name: delete_index
-    index_name: target_index
-steps:
-  - name: create_index
-    index_name: target_index
-    index_spec: sample-configs/nmslib-sift-hnsw/index-spec.json
-  - name: ingest
-    index_name: target_index
-    field_name: target_field
-    bulk_size: 500
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean.hdf5
-  - name: refresh_index
-    index_name: target_index
-  - name: force_merge
-    index_name: target_index
-    max_num_segments: 10
-  - name: warmup_operation
-    index_name: target_index
-  - name: query
-    k: 100
-    r: 1
-    calculate_recall: true
-    index_name: target_index
-    field_name: target_field
-    dataset_format: hdf5
-    dataset_path: ../dataset/sift-128-euclidean.hdf5
-    neighbors_format: hdf5
-    neighbors_path: ../dataset/sift-128-euclidean.hdf5
-cleanup:
-  - name: delete_index
-    index_name: target_index