[Response Ops][Alerting] Only load maintenance windows when there are alerts during rule execution and caching loaded maintenance windows #192573

Merged
merged 22 commits into from
Sep 26, 2024
Changes from 16 commits
a544490
Only loading maintenance windows if there are new alerts to persist
ymao1 Sep 12, 2024
6f71723
Fixing types
ymao1 Sep 12, 2024
71b7538
Need to load when there are any alerts not just new
ymao1 Sep 13, 2024
e097e99
Merging in main
ymao1 Sep 13, 2024
8d82191
Need to load when there are any alerts not just new
ymao1 Sep 13, 2024
3ef5f22
Merging in main
ymao1 Sep 16, 2024
9a5f463
wip
ymao1 Sep 16, 2024
9293e37
Caching maintenance windows
ymao1 Sep 16, 2024
c5af98f
Fixing tests
ymao1 Sep 16, 2024
cbb7c82
Fixing types
ymao1 Sep 16, 2024
18f8e10
Merge branch 'main' of github.com:elastic/kibana into alerting/load-m…
ymao1 Sep 16, 2024
1b98138
Taking cache interval into account when querying for maintenance windows
ymao1 Sep 18, 2024
2044596
Merge branch 'main' of github.com:elastic/kibana into alerting/load-m…
ymao1 Sep 18, 2024
57fd492
Merge branch 'main' into alerting/load-maintenance-windows-later
elasticmachine Sep 19, 2024
3dd0eac
[CI] Auto-commit changed files from 'yarn openapi:bundle'
kibanamachine Sep 19, 2024
fc94013
Merge branch 'main' into alerting/load-maintenance-windows-later
elasticmachine Sep 19, 2024
46e48cf
Fixing test
ymao1 Sep 20, 2024
99556c2
Merge branch 'main' of github.com:elastic/kibana into alerting/load-m…
ymao1 Sep 20, 2024
5e3d400
Merge branch 'main' into alerting/load-maintenance-windows-later
elasticmachine Sep 23, 2024
120df74
Merge branch 'main' of github.com:elastic/kibana into alerting/load-m…
ymao1 Sep 25, 2024
883d909
PR feedback
ymao1 Sep 25, 2024
3e43f91
Merge branch 'main' into alerting/load-maintenance-windows-later
elasticmachine Sep 26, 2024
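The commits above mention caching loaded maintenance windows and taking a cache interval into account. As a rough illustration of that general idea only, here is a minimal time-based cache sketch. Every name in it (`TtlCache`, `Loader`, the injected `now` clock) is hypothetical and is not taken from this PR; the actual maintenance windows service may work quite differently.

```typescript
// Hypothetical sketch: TtlCache and Loader are illustrative names, not the
// PR's implementation. Values are reloaded once they are older than ttlMs.
type Loader<T> = (key: string) => Promise<T>;

interface CacheEntry<T> {
  value: T;
  loadedAt: number; // clock reading (ms) when the value was fetched
}

class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly load: Loader<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: Loader<T>, ttlMs: number, now: () => number = () => Date.now()) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise reload and store it.
  async get(key: string): Promise<T> {
    const cached = this.entries.get(key);
    if (cached !== undefined && this.now() - cached.loadedAt < this.ttlMs) {
      return cached.value;
    }
    const value = await this.load(key);
    this.entries.set(key, { value, loadedAt: this.now() });
    return value;
  }
}
```

Injecting the clock is a design choice made here so the expiry behaviour can be exercised in a unit test without waiting on real time.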
@@ -156,8 +156,6 @@ export function createAlertFactory<
autoRecoverAlerts,
// flappingSettings.enabled is false, as we only want to use this function to get the recovered alerts
flappingSettings: DISABLE_FLAPPING_SETTINGS,
// no maintenance window IDs are passed as we only want to use this function to get recovered alerts
maintenanceWindowIds: [],
});
return Object.keys(currentRecoveredAlerts ?? {}).map(
(alertId: string) => currentRecoveredAlerts[alertId]
217 changes: 183 additions & 34 deletions x-pack/plugins/alerting/server/alerts_client/alerts_client.test.ts

Large diffs are not rendered by default.

80 changes: 44 additions & 36 deletions x-pack/plugins/alerting/server/alerts_client/alerts_client.ts
@@ -38,7 +38,6 @@ import type { AlertRule, LogAlertsOpts, ProcessAlertsOpts, SearchResult } from '
import {
IAlertsClient,
InitializeExecutionOpts,
ProcessAndLogAlertsOpts,
TrackedAlerts,
ReportedAlert,
ReportedAlertData,
@@ -62,11 +61,10 @@ import {
} from './lib';
import { isValidAlertIndexName } from '../alerts_service';
import { resolveAlertConflicts } from './lib/alert_conflict_resolver';
import { MaintenanceWindow } from '../application/maintenance_window/types';
import {
filterMaintenanceWindows,
filterMaintenanceWindowsIds,
} from '../task_runner/get_maintenance_windows';
} from '../task_runner/maintenance_windows';

// Term queries can take up to 10,000 terms
const CHUNK_SIZE = 10000;
@@ -77,6 +75,10 @@ export interface AlertsClientParams extends CreateAlertsClientParams {
dataStreamAdapter: DataStreamAdapter;
}

interface AlertsAffectedByMaintenanceWindows {
alertIds: string[];
maintenanceWindowIds: string[];
}
export class AlertsClient<
AlertData extends RuleAlertData,
LegacyState extends AlertInstanceState,
@@ -121,7 +123,14 @@ export class AlertsClient<
LegacyContext,
ActionGroupIds,
RecoveryActionGroupId
>({ logger: this.options.logger, ruleType: this.options.ruleType });
>({
alertingEventLogger: this.options.alertingEventLogger,
logger: this.options.logger,
maintenanceWindowsService: this.options.maintenanceWindowsService,
request: this.options.request,
ruleType: this.options.ruleType,
spaceId: this.options.spaceId,
});
this.indexTemplateAndPattern = getIndexTemplateAndPattern({
context: this.options.ruleType.alerts?.context!,
namespace: this.options.ruleType.alerts?.isSpaceAware
@@ -301,45 +310,25 @@ export class AlertsClient<
return this.legacyAlertsClient.checkLimitUsage();
}

public processAlerts(opts: ProcessAlertsOpts) {
this.legacyAlertsClient.processAlerts(opts);
public async processAlerts(opts: ProcessAlertsOpts) {
Review comment (Member):

Complete aside, but if the occasional event loop delays we've seen in rules are caused by processAlerts(), making this async may help: it chunks things up a bit and gives other Node events a chance to run.
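A small sketch of the "chunk things up" idea from the comment above, under stated assumptions: `processInChunks` and `yieldToEventLoop` are hypothetical helpers, not part of this PR. One caveat worth keeping in mind is that awaiting an already-resolved promise only defers to the microtask queue, so the sketch yields through a zero-delay timer to let pending timers and I/O callbacks run between chunks.

```typescript
// Hypothetical helpers for illustration only; neither exists in this PR.

// Resolve on a later turn of the event loop via a zero-delay timer.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, 0);
  });
}

// Run `handle` over `items` in chunks, yielding between chunks.
// Returns how many times the function yielded.
async function processInChunks<T>(
  items: T[],
  chunkSize: number,
  handle: (item: T) => void
): Promise<number> {
  let yields = 0;
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      handle(item);
    }
    // Only yield when more chunks remain.
    if (start + chunkSize < items.length) {
      await yieldToEventLoop();
      yields++;
    }
  }
  return yields;
}
```

Whether this would help in practice depends on where the time in processAlerts() is actually spent, which this diff does not show.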

await this.legacyAlertsClient.processAlerts(opts);
}

public logAlerts(opts: LogAlertsOpts) {
this.legacyAlertsClient.logAlerts(opts);
}

public processAndLogAlerts(opts: ProcessAndLogAlertsOpts) {
this.legacyAlertsClient.processAndLogAlerts(opts);
}

public getProcessedAlerts(
type: 'new' | 'active' | 'activeCurrent' | 'recovered' | 'recoveredCurrent'
) {
return this.legacyAlertsClient.getProcessedAlerts(type);
}

public async persistAlerts(maintenanceWindows?: MaintenanceWindow[]): Promise<{
alertIds: string[];
maintenanceWindowIds: string[];
} | null> {
public async persistAlerts(): Promise<AlertsAffectedByMaintenanceWindows> {
// Persist alerts first
await this.persistAlertsHelper();

// Try to update the persisted alerts with maintenance windows with a scoped query
let updateAlertsMaintenanceWindowResult = null;
try {
updateAlertsMaintenanceWindowResult = await this.updateAlertsMaintenanceWindowIdByScopedQuery(
maintenanceWindows ?? []
);
} catch (e) {
this.options.logger.debug(
`Failed to update alert matched by maintenance window scoped query ${this.ruleInfoMessage}`,
this.logTags
);
}

return updateAlertsMaintenanceWindowResult;
return await this.updatePersistedAlertsWithMaintenanceWindowIds();
}

public getAlertsToSerialize() {
@@ -692,18 +681,39 @@ export class AlertsClient<
}
}

private async updateAlertsMaintenanceWindowIdByScopedQuery(
maintenanceWindows: MaintenanceWindow[]
) {
private async updatePersistedAlertsWithMaintenanceWindowIds(): Promise<AlertsAffectedByMaintenanceWindows> {
// check if there are any alerts
const newAlerts = Object.values(this.legacyAlertsClient.getProcessedAlerts('new'));
const activeAlerts = Object.values(this.legacyAlertsClient.getProcessedAlerts('active'));
const recoveredAlerts = Object.values(this.legacyAlertsClient.getProcessedAlerts('recovered'));

// return if there are no alerts written
if (
(!newAlerts.length && !activeAlerts.length && !recoveredAlerts.length) ||
!this.options.maintenanceWindowsService
) {
return {
alertIds: [],
maintenanceWindowIds: [],
};
}

const { maintenanceWindows } =
await this.options.maintenanceWindowsService.loadMaintenanceWindows({
eventLogger: this.options.alertingEventLogger,
request: this.options.request,
ruleTypeCategory: this.ruleType.category,
spaceId: this.options.spaceId,
});

const maintenanceWindowsWithScopedQuery = filterMaintenanceWindows({
maintenanceWindows,
maintenanceWindows: maintenanceWindows ?? [],
withScopedQuery: true,
});
const maintenanceWindowsWithoutScopedQueryIds = filterMaintenanceWindowsIds({
maintenanceWindows,
maintenanceWindows: maintenanceWindows ?? [],
withScopedQuery: false,
});

if (maintenanceWindowsWithScopedQuery.length === 0) {
return {
alertIds: [],
@@ -723,8 +733,6 @@ export class AlertsClient<
const alertsAffectedByScopedQuery: string[] = [];
const appliedMaintenanceWindowIds: string[] = [];

const newAlerts = Object.values(this.getProcessedAlerts('new'));

for (const [scopedQueryMaintenanceWindowId, alertIds] of Object.entries(aggsResult)) {
// Go through matched alerts, find the in memory object
alertIds.forEach((alertId) => {