Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detections] - Auto refresh all rules/monitoring tables #82062

Merged
merged 11 commits into from
Nov 6, 2020

Conversation

yctercero
Copy link
Contributor

@yctercero yctercero commented Oct 29, 2020

Summary

This PR addresses #63865 . Please read the issue for more detail, but essentially, stale data on the tables and use of relative date format leads to confusion as to whether the table was auto refreshing or not.

Went with the third suggestion in the issue ticket. The rules and monitoring tables auto refresh every 1 minute. An updated at relative data line was added in the header of the tables to give users a sense of how fresh the data is. I did end up changing the Last updated column to use standard format date as opposed to relative time because it seemed with the addition of the last table update, there were a lot of shifting relative dates.

Update: per @FrankHassanabad suggestion, updated UI to use EUI's refresh component, giving user option to change interval or stop all together. It also uses local storage to store the interval selection so if a user navigates away and comes back to the page, they don't have to reset the interval every time.

Update 2.0: After chatting with the team this feature includes the following:

  • User can turn auto refresh on/off from the all rules page
  • User can use Advanced Settings to turn auto refresh on/off (persists session), choose a refresh interval (defaults to 1 min and cannot be set to less than 1 min), and set their idle timeout (defaults to 45 min)
  • By default the page reloads every 1 min
  • After 45 min of no activity, idle modal pops up, pausing refresh and asking if user is still there

Initial table load

init_table_load

Refresh

sample_update

Idle modal

idle_modal

Advanced settings

Screen Shot 2020-11-05 at 5 30 54 PM

Advanced settings - user adds interval lower than 1 min

Screen Shot 2020-11-04 at 9 53 50 PM

Things to test

  • test that advanced settings take on page
  • test that you cannot set refresh interval below 1 min
  • test that you can use switch on rules page to pause auto refresh
  • update x-pack/plugins/security_solution/public/detections/pages/detection_engine/rules/all/index.tsx line 380 to a shorter interval like 3000 --> set interval --> leave window idle --> idle modal should pop up and pause interval --> hit continue and interval should continue
    • if you are actively using the window (it's listening for mousemove or keydown), the modal should not pop up

I also tested that filters stick on rules refresh. A bug was found with sort on the page, will be addressed separately.

Note

Checklist

@yctercero yctercero added release_note:enhancement Team:SIEM v8.0.0 Feature:Detection Rules Anything related to Security Solution's Detection Rules Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v7.11 labels Oct 29, 2020
@yctercero yctercero self-assigned this Oct 29, 2020
@yctercero yctercero marked this pull request as ready for review October 29, 2020 18:58
@yctercero yctercero requested review from a team as code owners October 29, 2020 18:58
@elasticmachine
Copy link
Contributor

Pinging @elastic/siem (Team:SIEM)

@FrankHassanabad
Copy link
Contributor

I see 3 things being updated, is that what you want? 3 back end requests each time?
Screen Shot 2020-10-29 at 1 24 31 PM

Also, I would suggest adding something to "advanced settings" to change/configure the amount of time it takes and be able to turn it off with something such as a -1 to be on the safe side for customers.

@yctercero
Copy link
Contributor Author

I see 3 things being updated, is that what you want? 3 back end requests each time?
Screen Shot 2020-10-29 at 1 24 31 PM

Also, I would suggest adding something to "advanced settings" to change/configure the amount of time it takes and be able to turn it off with something such as a -1 to be on the safe side for customers.

It should be the 3 calls as it's updating both all rules and monitoring table. If you see the initial render, it makes those three calls to get the necessary info for both tables.

@marrasherrier and I had chatted about giving the user the option to opt out of the auto-refresh. Is this something we want to reconsider Marra?

@FrankHassanabad
Copy link
Contributor

The "refresh every" if it was available like it is from the date picker would be a nice consistency.
Screen Shot 2020-10-29 at 1 34 29 PM

@timroes timroes added v7.11.0 and removed v7.11 labels Oct 30, 2020
@peluja1012
Copy link
Contributor

I found a couple of potential problems in testing the functionality of this improvement. First, it seems like clicking on "Stop" currently has no effect. See gif below.

rules_autorefresh_1 mov

I also noticed that when the interval is low, for example 1 second, the refresh/update calls never complete and table doesn't actually get updated (see gif below). It might be better to set a minimum or to only allow minutes and hours. Thoughts?

rules_autorefresh_2 mov

@marrasherrier
Copy link
Contributor

marrasherrier commented Nov 2, 2020

My only concern with this design is that it's not immediately obvious what the new dropdown does (until you click on it).

Screen Shot 2020-11-02 at 1 41 09 PM

@peluja1012 @MikePaquette @lindseypoli do you think this is an issue? we can't add text to the dropdown values or icon due to EUI constraints. we could add a label above, but it may look a bit awkward.

@yctercero
Copy link
Contributor Author

I found a couple of potential problems in testing the functionality of this improvement. First, it seems like clicking on "Stop" currently has no effect. See gif below.

I also noticed that when the interval is low, for example 1 second, the refresh/update calls never complete and table doesn't actually get updated (see gif below). It might be better to set a minimum or to only allow minutes and hours. Thoughts?

@peluja1012 thanks for the feedback! For your first point - there's a couple bugs in that version you checked out. I was trying to do a quick POC of using the refresh component, but that should all now be fixed.

For your second point, super valid. I actually checked in with the EUI team and they mentioned that restricting intervals is not yet an option and opened a ticket here - elastic/eui#4211 If I finish the other tickets, I can try to circle back and see if it's something I could try to add to the EUI component. We do have the option to "intercept" the interval change and warn the user about auto refresh intervals of < ~30 seconds.

@marrasherrier I tagged you in the slack convo about customizations with EUI. It seems that one of the reasons there isn't as much customization available is because they want to keep it the same across the app, but I think a discussion ticket was opened to discuss it. elastic/eui#4213

@@ -0,0 +1,47 @@
/*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The files under components/last_updated were just moved from under timeline. Not new code.

@@ -210,7 +210,7 @@ export const getColumns = ({
getEmptyTagValue()
) : (
<LocalizedDateTooltip fieldName={i18n.COLUMN_LAST_UPDATE} date={new Date(value)}>
<FormattedRelative value={value} />
<FormattedDate value={value} fieldName={'last rule update date'} />
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this here because all the relative times (updated at, last run, last update, etc) felt overwhelming. @marrasherrier what do you think?

@@ -85,27 +93,24 @@ export const allRulesReducer = (
};
}
case 'updateRules': {
if (state.rules != null) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this as at no point is it typed or set to null or undefined, it's defaulted to empty array.

if (!isRefreshPaused) {
idleTimeoutId.current = setTimeout(() => {
setShowIdleModal(true);
}, 2700000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set idle timeout to 45 minutes. Any thoughts on this?

const { loading: isLoadingRulesStatuses, rulesStatuses } = useRulesStatuses(rules);
const history = useHistory();
const [, dispatchToaster] = useStateToaster();
const mlCapabilities = useMlCapabilities();
const [allRulesTab, setAllRulesTab] = useState(AllRulesTabs.rules);
const { formatUrl } = useFormatUrl(SecurityPageName.detections);

// Auto rules info refresh refs
const idleTimeoutId = useRef<NodeJS.Timeout | null>(null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a weird world between NodeJS and the browser. The browser puts all the globals on window and then NodeJS puts them all on global. But we run the unit tests on the backend within NodeJS and not really the front end.

What's interesting is actually a 3rd way called globalThis:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis
https://www.typescriptlang.org/docs/handbook/release-notes/typescript-3-4.html#type-checking-for-globalthis

I don't know if/when that will catch on enough, but in the meantime you can remove that NodeJS.Timeout with a handy trick like this:

const idleTimeoutId = useRef<ReturnType<typeof setTimeout> | null>(null);

It's interesting to change that ReturnType to be window.setTimeout instead but notice that TypeScript didn't really replace it all the way correctly, so it's a bit of workaround to avoid just having the words NodeJS in your front end types.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

@kbn/optimizer bundle module count

id before after diff
securitySolution 2066 2068 +2

async chunks size

id before after diff
securitySolution 7.8MB 7.8MB +8.5KB

page load bundle size

id before after diff
securitySolution 243.0KB 243.3KB +317.0B

Saved Objects .kibana field count

id before after diff
_data_stream_timestamp - 1 +1
_doc_count - 1 +1
action - 5 +5
action_task_params - 3 +3
alert - 30 +30
apm-indices - 8 +8
application_usage_daily - 2 +2
canvas-element - 8 +8
canvas-workpad - 5 +5
canvas-workpad-template - 8 +8
cases - 38 +38
cases-comments - 18 +18
cases-configure - 19 +19
cases-user-actions - 10 +10
config - 2 +2
dashboard - 17 +17
endpoint:user-artifact - 10 +10
endpoint:user-artifact-manifest - 5 +5
epm-packages - 15 +15
exception-list - 39 +39
exception-list-agnostic - 39 +39
file-upload-telemetry - 2 +2
fleet-agent-actions - 9 +9
fleet-agent-events - 11 +11
fleet-agents - 25 +25
fleet-enrollment-api-keys - 10 +10
graph-workspace - 9 +9
index-pattern - 3 +3
ingest_manager_settings - 6 +6
ingest-agent-policies - 11 +11
ingest-outputs - 11 +11
ingest-package-policies - 35 +35
kql-telemetry - 3 +3
lens - 7 +7
lens-ui-telemetry - 5 +5
map - 7 +7
ml-job - 6 +6
ml-telemetry - 3 +3
monitoring-telemetry - 2 +2
namespace - 1 +1
namespaces - 1 +1
originId - 1 +1
query - 6 +6
references - 4 +4
sample-data-telemetry - 3 +3
search - 9 +9
siem-detection-engine-rule-actions - 8 +8
siem-detection-engine-rule-status - 12 +12
siem-ui-timeline - 90 +90
siem-ui-timeline-note - 8 +8
siem-ui-timeline-pinned-event - 7 +7
space - 9 +9
tag - 4 +4
telemetry - 9 +9
timelion-sheet - 13 +13
tsvb-validation-telemetry - 2 +2
type - 1 +1
ui-metric - 2 +2
updated_at - 1 +1
upgrade-assistant-reindex-operation - 18 +18
upgrade-assistant-telemetry - 13 +13
url - 6 +6
visualization - 9 +9
total +685

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cy.get(ASYNC_LOADING_PROGRESS).should('not.exist');
};

// when using, ensure you've called cy.clock prior in test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still valid? Didn't see a cy.clock anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted in followup #83023

describe('AllRules', () => {
it('renders AllRulesUtilityBar total rules and selected rules', () => {
const wrapper = mount(
<ThemeProvider theme={() => ({ eui: euiDarkVars, darkMode: true })}>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: In tests I think the pattern of declaring a theme at the top:

const theme = () => ({ eui: euiDarkVars, darkMode: true });

And then using it everywhere is better:

<ThemeProvider theme={theme}>

It's just less syntax noise and repetition. At least that's the pattern I follow at the moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks, that's a nice pattern. Updated in #83023

expect(mockSwitch).toHaveBeenCalledTimes(1);
});
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all these tests btw 👍

'xpack.securitySolution.uiSettings.rulesTableRefreshDescription',
{
defaultMessage:
'<p>Enables auto refresh on the all rules and monitoring tables, in milliseconds</p>',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first time I have seen HTML mixed in with a defaultMessage like this? Is this correct or are the <p> going to end up in the translation file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it's fine and they translate around it:

"xpack.securitySolution.uiSettings.defaultAnomalyScoreDescription": "<p>機械学習ジョブの異常がこの値を超えると、セキュリティアプリに表示されます。</p><p>有効な値:0 ~ 100。</p>",

So no changes with this PR needed, but that just feels weird and off to me and maybe in a follow up or something one of us should just fix these weird things going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I followed the existing pattern in that file but then also saw the i18n documentation have instances of it.

"value": ${DEFAULT_RULE_REFRESH_INTERVAL_VALUE},
"idleTimeout": ${DEFAULT_RULE_REFRESH_IDLE_VALUE}
}`,
category: ['securitySolution'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if this whole file should have used APP_ID from constants at some point instead of the hard coded names of securitySolution. No changes if you don't want to, but I would try to update that and see if it works? I am curious if has to be this hard coded name and cannot use the variable we have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and changed it to use APP_ID in #83023 - it seemed to work just fine on testing it.

Copy link
Contributor

@FrankHassanabad FrankHassanabad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked it out, tested each scenario per the docs about tests and everything looks to be functioning as expected.

Few pointers, questions, nits, but this is a really awesome improvement so I don't want those to drown out all the good that is coming from this PR.

👍

@yctercero
Copy link
Contributor Author

Thanks @FrankHassanabad ! I'm gonna go ahead and merge and address the various things you pointed out in a follow up PR.

@yctercero yctercero merged commit e53da76 into elastic:master Nov 6, 2020
yctercero added a commit to yctercero/kibana that referenced this pull request Nov 6, 2020
…ables (elastic#82062)

## Summary

This PR addresses elastic#63865 . Please read the issue for more detail, but essentially, stale data on the tables and use of relative date format leads to confusion as to whether the table was auto refreshing or not.
yctercero added a commit that referenced this pull request Nov 7, 2020
…ables (#82062) (#82890)

## Summary

This PR addresses #63865 . Please read the issue for more detail, but essentially, stale data on the tables and use of relative date format leads to confusion as to whether the table was auto refreshing or not.
gmmorris added a commit to gmmorris/kibana that referenced this pull request Nov 9, 2020
* master: (68 commits)
  [Fleet] Make stream id unique in agent policy (elastic#82447)
  skip flaky suite (elastic#82915)
  skip flaky suite (elastic#75794)
  Copy `dateAsStringRt` to observability plugin (elastic#82839)
  [Maps] rename connected_components/map folder to mb_map (elastic#82897)
  [Security Solution] Fix EventsViewer DnD cypress tests (elastic#82619)
  [Security Solution] Adds logging and performance fan out API for threat/Indicator matching (elastic#82546)
  Implemented Alerting health status pusher by using task manager and status pooler for Kibana status plugins 'kibanahost/api/status' (elastic#79056)
  [APM] Adds new configuration 'xpack.apm.maxServiceEnvironments' (elastic#82090)
  Move single use function in line (elastic#82885)
  [ML] Add unsigned_long support to data frame analytics and anomaly detection (elastic#82636)
  Add flot_chart dependency from shared_deps to Shareable Runtime (elastic#81649)
  [Security Solution][Detections] - Auto refresh all rules/monitoring tables (elastic#82062)
  [APM] Fix apm e2e runner script commands (elastic#82798)
  [Ingest Manager] Move cache functions to from registry to archive (elastic#82871)
  Update webpack-dev-server and webpack-cli (elastic#82844)
  [Uptime] Migrate to new es client (elastic#82003)
  Move parseAndVerify* functions to validation.ts (elastic#82845)
  Remove yeoman & yo (elastic#82825)
  [Canvas] Fix elements not being updated properly when filter is changed on workpad (elastic#81863)
  ...
@yctercero
Copy link
Contributor Author

Thanks @FrankHassanabad ! I'm gonna go ahead and merge and address the various things you pointed out in a follow up PR.

Followed up in #83023

@yctercero yctercero deleted the auto_refresh_rules branch November 9, 2020 23:45
yctercero added a commit that referenced this pull request Nov 10, 2020
…ules (#83023)

### Summary

This is a follow up cleanup PR to #82062 . There were a few comments I hadn't gotten to and wanted to follow up on.
yctercero added a commit to yctercero/kibana that referenced this pull request Nov 10, 2020
…ules (elastic#83023)

### Summary

This is a follow up cleanup PR to elastic#82062 . There were a few comments I hadn't gotten to and wanted to follow up on.
yctercero added a commit that referenced this pull request Nov 11, 2020
…ules (#83023) (#83089)

### Summary

This is a follow up cleanup PR to #82062 . There were a few comments I hadn't gotten to and wanted to follow up on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Detection Rules Anything related to Security Solution's Detection Rules release_note:enhancement Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:SIEM v7.11.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SIEM] [Detections] Stale data in Rules/Monitoring table implies Rule isn't running
7 participants