Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[saved objects] Adds bulkDelete API #139680

Merged

Conversation

TinaHeiligers
Copy link
Contributor

@TinaHeiligers TinaHeiligers commented Aug 29, 2022

Resolves #30503

Summary

The saved objects service has API's for updating, getting, and resolving many documents in one request but is missing similar functionality for deleting many documents.

There are a few use cases for a new API that would delete documents in bulk:

  • bulk delete alert saved objects
  • bulk delete telemetry-related documents
  • replacement for clearing the kibana index with esArchiver

This PR adds a bulkDelete API with the following characteristics:
Multi-namespace saved objects shared to more than one space need to be deleted by force since the requestor in the active space may not be aware of potential repercussions.

How force in bulk delete behaves:

We implement a force-deletion at a request level:

  • when force is false (or not added as a query), the document isn't deleted and the API returns an error for that document.
  • with force as true: will force delete the document.
    The requestor must be authorized to delete saved objects in all the spaces from which the object would be removed when using force. This applies to all objects requested in the bulk delete operation.

Once a document has been deleted, the API deletes any legacy URL aliases that may have existed for the object.
The behavior differs from that of deleting a single document with the saved objects delete API in that force is applied to all the objects sent in the request, as opposed to applying to one particular document.

Other notable differences between delete and bulkDelete:

Errors:

  • bulkDelete returns errors in the response for each document that couldn't be deleted, whereas delete returns an empty object.
  • bulkDelete returns a bad request for deleting hidden objects, whereas delete throws a not-found error.

Response:

  • bulkDelete responds with an array of statuses, containing a summary of the outcome for deleting an object.The response per-object consistd of a success status as either true or false and will always be present. If an object couldn't be deleted (success: false), the corresponding status entry also includes the error for that object.

How to test this:

Testing bulk delete saved objects that don't exist in the current callee's space:

  1. start es and start kibana
  2. create a new space "Non default"
  3. change spaces into "Non default"
  4. load all three sample data fixtures
  5. open the browser's network tab
  6. navigate to Saved Objects application and open the network request for the _find?perPage=50&page=1&fields... request
  7. copy the contents of the saved_objects array from the response
  8. Using a tool of choice (I used postman), issue a _bulk_delete request to saved objects from the default space:
Issue a `_bulk_delete` request to saved objects from the default space
POST localhost:5601/{basepath}/api/saved_objects/_bulk_delete
body (as raw JSON):
[
  {"type": "visualization", "id": "314c6f60-2224-11e8-b802-5bcf64c2cfb4"},
  {"type": "lens", "id": "16b1d7d0-ea71-11eb-8b4b-f7b600de0f7d"},
  ...rest  
]
Observe that the request fails with:
{
    "statuses": [
        {
            "success": false,
            "id": "314c6f60-2224-11e8-b802-5bcf64c2cfb4",
            "type": "visualization",
            "error": {
                "message": "Unable to delete saved object id: 314c6f60-2224-11e8-b802-5bcf64c2cfb4, type: visualization that exists in multiple namespaces, use the \"force\" option to delete all saved objects: Bad Request",
                "statusCode": 400,
                "error": "Bad Request"
            }
        },
        {
            "success": false,
            "id": "16b1d7d0-ea71-11eb-8b4b-f7b600de0f7d",
            "type": "lens",
            "error": {
                "message": "Unable to delete saved object id: 16b1d7d0-ea71-11eb-8b4b-f7b600de0f7d, type: lens that exists in multiple namespaces, use the \"force\" option to delete all saved objects: Bad Request",
                "statusCode": 400,
                "error": "Bad Request"
            }
        }
    ]
}
  1. Send the request again but from the "Non default" space:
    POST localhost:5601/{basepath}/s/non-default/api/saved_objects/_bulk_delete
Observer that the request is successful:
{
    "statuses": [
        {
            "success": true,
            "id": "d3d7af60-4c81-11e8-b3d7-01146121b73d",
            "type": "index-pattern"
        },
        {
            "success": true,
            "id": "b05ee7f1-0341-43f1-b345-f6cd3a944254",
            "type": "visualization"
        }
    ]
}

Testing legacy URL alias deletion

There's no easy way to test that bulkDelete also deletes legacy URL aliases that may be present for multi-namespace saved objects.

You can use staging data following the instructions for testing bulkResolve. I found I had to create a new user a system_indices_superuser role add the staging data to kibana.

Checklist

Delete any items that are not applicable to this PR.

Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.

When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:

Risk Probability Severity Mitigation/Notes
Code should gracefully handle failures for users with different privileges High High Integration tests will verify that saved objects are not deleted when not permitted to delete them in any situation where the request would violate the privileges for the callee.
Consumers need to know the implications of using force to delete multi-namespace saved objects Medium High Consumers need to opt-in to using force explicitly
Performance regression when deleting multi-namespace saved objects with many legacy URL aliases Medium High Consumers need to be aware that deleting multi-namespace objects also deletes legacy URL aliases, which is done one at a time. THere will be a follow up issue to optimise the legacyUrlAlias deletion

@TinaHeiligers TinaHeiligers added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc release_note:plugin_api_changes Contains a Plugin API changes section for the breaking plugin API changes section. Feature:Saved Objects backport:skip This commit does not require backporting release_note:feature Makes this part of the condensed release notes v8.5.0 labels Aug 29, 2022
@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@pgayvallet pgayvallet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work so far! Overall the implementation looks good, I only have one question about it (this one), the rest are mostly NITs.

As a side note, I found the review of the actual implementation of the SOR.bulkDelete quite... challenging. Maybe we could find a way to extract parts of it to utility methods or something? Note that as long as the implem is working, it's fully optional

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

Copy link
Contributor Author

@TinaHeiligers TinaHeiligers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self review.

* @returns array BulkDeleteExpectedBulkGetResult[]
* @internal
*/
private presortObjectsByNamespaceType(objects: SavedObjectsBulkDeleteObject[]) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted this from the main body of the bulkDelete implementation to make it easier to follow. The intent is to eventually extract all the pre-flight handling to another module, similar to what's been done for bulkResolve.

}

/**
* Fetch multi-namespace saved objects
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted as a pre-flight check. For now, these helpers are in the main class and can be moved at a later stage to another module.

* @returns array of objects sorted by expected delete success or failure result
* @internal
*/
private getExpectedBulkDeleteMultiNamespaceDocsResults(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also be moved to a module at a later stage.

objects: SavedObjectsBulkDeleteObject[],
options?: SavedObjectsBulkDeleteOptions
) {
return await this.options.baseClient.bulkDelete(objects, options);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to delete, the bulk operation doesn't return any potentially sensitive data and can be passed directly to the base client.

throw error;
}
const response = await this.baseClient.bulkDelete(objects, options);
response?.statuses.forEach(({ id, type, success, error }) => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to other bulk operations, we need to track the operation for each object being deleted.

import { expectResponses, getUrlPrefix, getTestTitle } from '../lib/saved_object_test_utils';
import { ExpectResponseBody, TestCase, TestDefinition, TestSuite, TestUser } from '../lib/types';

export interface BulkDeleteTestDefinition extends TestDefinition {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the bulk operation were based on those for single delete and modified accordingly.

{ ...CASES.ALIAS_DELETE_EXCLUSIVE, force: true },
{ ...CASES.DOES_NOT_EXIST, ...fail404() },
];
const hiddenType = [{ ...CASES.HIDDEN, ...fail400() }]; // this behavior diverges from `delete`, which throws 404
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the comment, attempting to delete a hidden type, this behavior differs slightly from that of delete, to make it clear the hidden types aren't necessarily "missing" but rather that they cannot be deleted using the api.

@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@jeramysoucy jeramysoucy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I ran all of the tests as I reviewed the code. Fond some very minor nits which I commented on. Manually tested not-found, different space, same space, and force required paths.

);
});

it(`returns an for an object when the object's type is hidden`, async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing word..."returns an error for..."

@@ -237,6 +262,15 @@ describe('404s from proxies', () => {
);
});

it('handles `bulkDelete` requests that are successful when the proxy passes through the product header', async () => {
const docsToGet = myBulkDeleteTypeDocs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Total nit, would expect const docsToDelete

@@ -273,7 +286,7 @@ export class SpacesSavedObjectsClient implements SavedObjectsClientContract {

async bulkUpdate<T = unknown>(
objects: Array<SavedObjectsBulkUpdateObject<T>> = [],
options: SavedObjectsBaseOptions = {}
options: SavedObjectsBulkDeleteOptions = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not have changed - I think you had already caught this one during our walkthrough.

@@ -127,6 +141,7 @@ export interface SavedObjectsRepositoryOptions {
export const DEFAULT_REFRESH_SETTING = 'wait_for';
export const DEFAULT_RETRY_COUNT = 3;

const MAX_CONCURRENT_ALIAS_DELETIONS = 10;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same limit on the number of concurrent alias deletions as used for updating objects' spaces. We could export the constant from here and reuse that as a default everywhere we implement a pMap.

}).catch((err) => {
this._logger.error(`Unable to delete aliases when deleting an object: ${err.message}`);
});
await pMap(objectsToDeleteAliasesFor, mapper, { concurrency: MAX_CONCURRENT_ALIAS_DELETIONS });
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #117056 (comment), there isn't a bulk API for deleting legacy URL aliases for many objects since the es query/script would need to handle different namespaces and delete behavior for each saved object that no longer exists. To minimize the number of changes adding a bulkDelete API introduces, I kept it simple, I'm using a pMap, as is done in updateObjectSpaces.

Copy link
Contributor

@pgayvallet pgayvallet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, LGTM!

The only thing is, as discussed on slack, we should open a follow-up issue to try to improve the performances by trying to achieve 'bulk' removal of the legacy alias during a bulk delete.

@@ -2044,6 +2046,517 @@ describe('SavedObjectsRepository', () => {
});
});

describe('#bulkDelete', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I die a little every time I open this file 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're working on it! Long term plan is to modularize it. I've already started pulling out common code as I've been working on the new extension unit tests which will reside in separate files. Once we merge my changes with the packages refactor we can continue to improve the disaster.

namespaces = actualResult!._source.namespaces ?? [
SavedObjectsUtils.namespaceIdToString(namespace),
];
const useForce = force && force === true ? true : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: const useForce = force === true?

const useForce = force && force === true ? true : false;
// the document is shared to more than one space and can only be deleted by force.
if (
useForce === false &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: !useForce && ...

}
// the following check should be redundant since we're retrieving the docs from elasticsearch but we check just to make sure
// @ts-expect-error MultiGetHit is incorrectly missing _id, _source
if (docFound && !this.rawDocExistsInNamespace(actualResult, namespace)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: we exited previously on the !docFound condition, so docFound can be removed from this condition


if (rawResponse.result === 'deleted') {
// `namespaces` should only exist in the expectedResult.value if the type is multi-namespace.
if (namespaces) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: this condition should always be true here, right? I'm fine keeping, but then we should throw in a else block as we did for if (bulkDeleteResponse === undefined) {

Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition should always be true here, right?

Not necessarily. Only multi-namespace documents have a namespaces array because we're only using mget for those namespace types. Single namespace and agnostic types aren't fetched from es in the mget call.

@@ -109,6 +109,7 @@ describe('features', () => {
actions.savedObject.get('all-savedObject-all-1', 'update'),
actions.savedObject.get('all-savedObject-all-1', 'bulk_update'),
actions.savedObject.get('all-savedObject-all-1', 'delete'),
actions.savedObject.get('all-savedObject-all-1', 'bulk_delete'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yea, I forgot about this file too.

@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

@TinaHeiligers TinaHeiligers enabled auto-merge (squash) September 20, 2022 14:26
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/core-saved-objects-api-browser 66 75 +9
@kbn/core-saved-objects-api-server 126 137 +11
@kbn/core-saved-objects-api-server-internal 49 51 +2
@kbn/core-saved-objects-server-internal 66 69 +3
@kbn/core-usage-data-server 135 142 +7
total +32

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 362.3KB 363.1KB +803.0B
Unknown metric groups

API count

id before after diff
@kbn/core-saved-objects-api-browser 94 106 +12
@kbn/core-saved-objects-api-server 289 308 +19
@kbn/core-saved-objects-api-server-internal 68 71 +3
@kbn/core-saved-objects-server-internal 66 69 +3
@kbn/core-usage-data-server 146 153 +7
core 2657 2684 +27
total +71

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Saved Objects release_note:plugin_api_changes Contains a Plugin API changes section for the breaking plugin API changes section. Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Delete saved objects in bulk
7 participants