Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detections]Indicator Match Enrichment #89899

Merged
merged 43 commits into from
Feb 12, 2021

Conversation

rylnd
Copy link
Contributor

@rylnd rylnd commented Feb 1, 2021

Summary

This work adds enrichment to alerts generated by indicator match detection rules.

Features

  • All indicator match alerts will be enriched with threat.indicator.matched, containing metadata about the circumstances of that event(s)/indicator(s) being matched.
  • If the indicator index conform to the prospective ECS threat intel fields, those will automatically be added to the resulting detection alert.

Followup to this PR:

Reviewer Notes:

In general: create an indicator index, create a rule, generate some events, and observe the resultant, enriched signals. Specifically:

  1. Generate an indicator index. We can use es_archive to loading the mappings/data used by the integration tests in this PR:

    node scripts/es_archiver.js load threat_intel --dir x-pack/test/functional/es_archives/filebeat --es-url http://elastic:changeme@localhost:9200 --kibana-url http://elastic:changeme@localhost:5601
    
  2. Create some indicator data. The simplest approach here is to create an indicator(s) with some known value (e.g. your computer's hostname) that we can leverage later. In dev tools:

    POST filebeat-8.0.0-2021.01.26-000001/_doc/
    {
      "@timestamp": "2021-02-01T00:41:06.527Z",
      "threat": {
        "indicator": {
          "domain": "my-laptop-hostname.local",
          "provider": "local-testing",
    
    
      "other_fields": "with values you want to see in the signal"
        }
      }
    }
  3. Generate source events with values matching your indicator fields above. As an example, I run auditbeat, knowing that it will generate events with our laptop's hostname in host.name. However, you can also manually insert some documents into an index of your own creation; just ensure that they contain values that will cause the rule to generate a signal.

  4. Create an indicator match rule against your indicator index and event data:

    Detections_-_Kibana

  5. Peep those enriched signals:
    Note: we're temporarily using dev tools here due to some ongoing fields display issues, and the _source filter is for demo cleanliness, please remove

    GET .siem-signals-default/_search
    {
      "query": {
        "term": {
          "signal.rule.name": {
            "value": "Review Test"
          }
        }
      },
      "_source": "threat.indicator"
    }

    Dev_Tools_-_Elastic

Checklist

For maintainers

When this moves to individual rule-specific data transformations this
will be a little more explicit/configurable; for now to keep changes
minimal, we're using dependency injection to pass a function, which will
default to the identity function (e.g. a no-op).
This is what allows us to enrich the threat match signals using only the
signal search response.
This gives us the information we need to enrich our signals after
they've been queried without having to perform a complicated reverse
query.
Adds assertions to the existing test, and fleshes out another test for a
multi-match signal.
* single indicator matching multiple events
* multiple indicators matching a single event
* multiple indicators, multiple events
* placeholder for deduplication logic

This also adds some descriptions to our threat intel documents, to give
a little context around how they're meant to function within the tests,
particularly as relates to the auditbeat/hosts data on which it is meant
to function.
This handles the situation where the indicator match search has returned
the same signal multiple times due to the source event matching
different indicators in different query batches. In this case, we want
to generate a single signal with all matched indicators.
We were previously adding the indicator's field to matched.field,
instead of the corresponding event field that matched the indicator.

In the normal case, the expectation is that the indicator field is
self-evident, and thus we want to know the other side of the match on
the event itself.

Updates tests accordingly.
This could occur if the indicator index is updated while a rule is being
run.
This just verifies that the enrichment function gets invoked with search
results.
@@ -2457,6 +2457,144 @@
"ignore_above": 1024,
"type": "keyword"
},
"indicator": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need to be updated with the latest from elastic/ecs#1127 before this gets merged/released.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: as of today, these are up to date with the RFC.

I made both of these before we were clear on the direction we were
taking here.
}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As talked about over video conference let's see if we can trim this and the mappings down to their essence and only have what we absolutely need in these for testing.

This will make maintainers lives easier and make things easier to understand. If someone needs to test other portions or parts of threat intelligence we should encourage them to create and use different index mappings and data sets specific to those tests and avoid mixing extra baggage between tests.

It seems to be the better way for this to all work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; I will trim down this file, and also create a smaller auditbeat index containing just the few events that we care about here (plus 1-2 extras to ensure no false positives). In a followup PR 😜

},
"number_of_replicas": "0",
"number_of_shards": "1",
"refresh_interval": "5s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting though. I think the refresh_interval should stay at 5s for most correctness tests. I might clean up my other tests with a 5s refresh interval to reduce flakiness and increase the amount of time for things to be showing up within the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't a conscious choice, but was instead cribbed from another existing archive. I think we want as small an interval as possible, no? I see values of 5s, 1ms, and -1 in the existing x-pack FTR archives.

Ensure that we throw an error if the indicator field is either a
primitive or an array of primitives.
These values are already defaulted in the parent, and the types are
correct in that these cannot be undefined.
existingSignalHit could not be undefined on line 30 here, but typescript
could not infer this from the !acc.has() call.
We were using a map previously in order to use .has() for a predicate,
but code has since been refactored to make that unnecessary.
These are being typed implicitly and verified against SignalSourceHit[]
on the assignment below, but this makes the types explicit and surfaces
a type error here instead of the subsequent assignment.
@rylnd
Copy link
Contributor Author

rylnd commented Feb 10, 2021

@elasticmachine merge upstream

kibanamachine and others added 3 commits February 10, 2021 17:39
These references were moved into buildThreatEnrichment
I copied the entirety of the `threat` mappings in order to get the
`threat.indicator` ones, but it looks like these were added at some
point too.

I'd rather these not be added incidentally. If we need them, we should
do so explicitly.
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@rylnd rylnd merged commit 7a55267 into elastic:master Feb 12, 2021
rylnd added a commit to rylnd/kibana that referenced this pull request Feb 12, 2021
)

* Adds basic integration test for threat enrichment

* Update signals mappings with indicator fields

* Simplify some ternaries with Math.min

* Remove outdated comments

* Add notes from walkthrough with devin

* Add an enrichment hook to the current signal creation pipeline

When this moves to individual rule-specific data transformations this
will be a little more explicit/configurable; for now to keep changes
minimal, we're using dependency injection to pass a function, which will
default to the identity function (e.g. a no-op).

* Add utility functions for encoding/decoding our threat query

This is what allows us to enrich the threat match signals using only the
signal search response.

* Add a name to each threat match filter clause

This gives us the information we need to enrich our signals after
they've been queried without having to perform a complicated reverse
query.

* Adds functions for signal enrichment of threat indicators

* Wire up threat enrichment to threat match rules

* Fleshes out threat match integration tests

Adds assertions to the existing test, and fleshes out another test for a
multi-match signal.

* Add more test cases to indicator match integration tests

* single indicator matching multiple events
* multiple indicators matching a single event
* multiple indicators, multiple events
* placeholder for deduplication logic

This also adds some descriptions to our threat intel documents, to give
a little context around how they're meant to function within the tests,
particularly as relates to the auditbeat/hosts data on which it is meant
to function.

* Implement signal deduplification

This handles the situation where the indicator match search has returned
the same signal multiple times due to the source event matching
different indicators in different query batches. In this case, we want
to generate a single signal with all matched indicators.

* Move default indicator path to constant

* Testing some edge cases with signal enrichment

* Cover and test edge cases with threat enrichment generation

* Fix logical error in TI enrichment

We were previously adding the indicator's field to matched.field,
instead of the corresponding event field that matched the indicator.

In the normal case, the expectation is that the indicator field is
self-evident, and thus we want to know the other side of the match on
the event itself.

Updates tests accordingly.

* Document behavior when an indicator matched but is absent on enrichment

This could occur if the indicator index is updated while a rule is being
run.

* Add followup note

* Add basic unit test for our enrichment function

This just verifies that the enrichment function gets invoked with search
results.

* Update license headers for new files

* Remove unused threatintel archive

I made both of these before we were clear on the direction we were
taking here.

* Bump signals version to allows some updates in patch releases

* Fix typings of threat list item

We were conflating the type of the underlying document with the type of
the search response for that document. This is now addressed with two
types: ThreatListDoc and ThreatListItem, respectively.

ThreatListDoc isn't the most distinguishing name but it avoids a lot of
unnecessary renaming for the existing concept of ThreatListItem.

* Update test mock to be aware of (but not care about) named queries

* Remove/update outdated comments

This code was modified to perform two searches instead of one; at that
time, a lot of this code was duplicated and modified slightly, and these
misleading comments were a result. I removed the ones that were no
longer relevant, but left a TODO for one that could be a bug.

* Remove outdated comment

Documents will always have _id.

* Update enriched signals' total to account for deduplication

If a given signal matched on multiple indicators in different loops of
our indicator query, it may appear multiple times. Our enrichment
performs the merging of those duplicated results, but did not previously
update the response's total field to account for this.

I don't believe that anything downstream is actually using this field and that we
are instead operating on the length of hits and the response from the
bulk create request, but this keeps things consistent in case that
changes.

* Remove development comments

* Add JSDoc for our special template version constant

* Remove outdated comments

* Add an additional test permutation for error cases

Ensure that we throw an error if the indicator field is either a
primitive or an array of primitives.

* Remove unnecessary coalescing

These values are already defaulted in the parent, and the types are
correct in that these cannot be undefined.

* Move logic to build threat enrichment function into helper

* Refactor code to allow typescript to infer our type narrowing

existingSignalHit could not be undefined on line 30 here, but typescript
could not infer this from the !acc.has() call.

* Use a POJO over a Map

We were using a map previously in order to use .has() for a predicate,
but code has since been refactored to make that unnecessary.

* Explicitly type our enriched signals

These are being typed implicitly and verified against SignalSourceHit[]
on the assignment below, but this makes the types explicit and surfaces
a type error here instead of the subsequent assignment.

* Add an explanatory note about these test results

* Remove unused imports

These references were moved into buildThreatEnrichment

* Remove threat mappings accidentally brought in with indicator work

I copied the entirety of the `threat` mappings in order to get the
`threat.indicator` ones, but it looks like these were added at some
point too.

I'd rather these not be added incidentally. If we need them, we should
do so explicitly.

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
@rylnd rylnd deleted the threat_enrichment_simple branch February 12, 2021 19:01
rylnd added a commit that referenced this pull request Feb 12, 2021
…91264)

* Adds basic integration test for threat enrichment

* Update signals mappings with indicator fields

* Simplify some ternaries with Math.min

* Remove outdated comments

* Add notes from walkthrough with devin

* Add an enrichment hook to the current signal creation pipeline

When this moves to individual rule-specific data transformations this
will be a little more explicit/configurable; for now to keep changes
minimal, we're using dependency injection to pass a function, which will
default to the identity function (e.g. a no-op).

* Add utility functions for encoding/decoding our threat query

This is what allows us to enrich the threat match signals using only the
signal search response.

* Add a name to each threat match filter clause

This gives us the information we need to enrich our signals after
they've been queried without having to perform a complicated reverse
query.

* Adds functions for signal enrichment of threat indicators

* Wire up threat enrichment to threat match rules

* Fleshes out threat match integration tests

Adds assertions to the existing test, and fleshes out another test for a
multi-match signal.

* Add more test cases to indicator match integration tests

* single indicator matching multiple events
* multiple indicators matching a single event
* multiple indicators, multiple events
* placeholder for deduplication logic

This also adds some descriptions to our threat intel documents, to give
a little context around how they're meant to function within the tests,
particularly as relates to the auditbeat/hosts data on which it is meant
to function.

* Implement signal deduplification

This handles the situation where the indicator match search has returned
the same signal multiple times due to the source event matching
different indicators in different query batches. In this case, we want
to generate a single signal with all matched indicators.

* Move default indicator path to constant

* Testing some edge cases with signal enrichment

* Cover and test edge cases with threat enrichment generation

* Fix logical error in TI enrichment

We were previously adding the indicator's field to matched.field,
instead of the corresponding event field that matched the indicator.

In the normal case, the expectation is that the indicator field is
self-evident, and thus we want to know the other side of the match on
the event itself.

Updates tests accordingly.

* Document behavior when an indicator matched but is absent on enrichment

This could occur if the indicator index is updated while a rule is being
run.

* Add followup note

* Add basic unit test for our enrichment function

This just verifies that the enrichment function gets invoked with search
results.

* Update license headers for new files

* Remove unused threatintel archive

I made both of these before we were clear on the direction we were
taking here.

* Bump signals version to allows some updates in patch releases

* Fix typings of threat list item

We were conflating the type of the underlying document with the type of
the search response for that document. This is now addressed with two
types: ThreatListDoc and ThreatListItem, respectively.

ThreatListDoc isn't the most distinguishing name but it avoids a lot of
unnecessary renaming for the existing concept of ThreatListItem.

* Update test mock to be aware of (but not care about) named queries

* Remove/update outdated comments

This code was modified to perform two searches instead of one; at that
time, a lot of this code was duplicated and modified slightly, and these
misleading comments were a result. I removed the ones that were no
longer relevant, but left a TODO for one that could be a bug.

* Remove outdated comment

Documents will always have _id.

* Update enriched signals' total to account for deduplication

If a given signal matched on multiple indicators in different loops of
our indicator query, it may appear multiple times. Our enrichment
performs the merging of those duplicated results, but did not previously
update the response's total field to account for this.

I don't believe that anything downstream is actually using this field and that we
are instead operating on the length of hits and the response from the
bulk create request, but this keeps things consistent in case that
changes.

* Remove development comments

* Add JSDoc for our special template version constant

* Remove outdated comments

* Add an additional test permutation for error cases

Ensure that we throw an error if the indicator field is either a
primitive or an array of primitives.

* Remove unnecessary coalescing

These values are already defaulted in the parent, and the types are
correct in that these cannot be undefined.

* Move logic to build threat enrichment function into helper

* Refactor code to allow typescript to infer our type narrowing

existingSignalHit could not be undefined on line 30 here, but typescript
could not infer this from the !acc.has() call.

* Use a POJO over a Map

We were using a map previously in order to use .has() for a predicate,
but code has since been refactored to make that unnecessary.

* Explicitly type our enriched signals

These are being typed implicitly and verified against SignalSourceHit[]
on the assignment below, but this makes the types explicit and surfaces
a type error here instead of the subsequent assignment.

* Add an explanatory note about these test results

* Remove unused imports

These references were moved into buildThreatEnrichment

* Remove threat mappings accidentally brought in with indicator work

I copied the entirety of the `threat` mappings in order to get the
`threat.indicator` ones, but it looks like these were added at some
point too.

I'd rather these not be added incidentally. If we need them, we should
do so explicitly.

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants