Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic filtering Implementation #830

Closed
wants to merge 10 commits into from

Conversation

monicasarbu
Copy link
Contributor

The proposal for the generic filtering can be found here: #451

In this implementation two actions are implemented:

  • include_fields
  • drop_fields

and both accept fields as argument:

- include_fields:
    fields: ["proc.mem", "proc.cpu", "cmdline"]

Each event MUST contain a few fields that cannot be removed. For now, these "mandatory" fields are: ["@timestamp", "beat", type", "count"]. If you configure one of these fields in the drop_fields, it will be ignored.

A field can be a dictionary or a value. Example: `proc.cpu.total_p", "proc", "proc.cpu".

Special cases to consider:

  • Ignore the fields that are not available in the event. Example:
- include_fields: 
      fields: ["xx", "proc.mem", "proc.cpu"]
  • If none of the include fields is part of the event, drop the event as it contains no information except the "mandatory" fields. In the previous example all the events that have type != process will be dropped as they don't contain any of the fields: proc.mem, proc.cpu.
  • Filter the event by applying multiple filter rules defined under `filter. We start with a copy of the event and apply the filter rules one by to the filtered event.

event -> filter rule 1 -> event 1 -> filter rule 2 -> event 2 ...

- include_fields:
      fields: ["proc"]
- include_fields:
      fields: ["proc.cpu.total_p", "proc.mem.rss_p", "proc.cmdline"]
  • The generic filtering implementation relies that the event contains only primitives and MapStr, so before calling the filter rules the event checker is called that verifies:
    • the event doesn't contain struct
    • the event doesn't contain pointer

Testing:

  • if works with topbeat
  • if works with filebeat
  • if works with packetbeat

@monicasarbu monicasarbu changed the title Generic filtering Generic filtering (Phase 2) Implementation Jan 24, 2016
@monicasarbu monicasarbu changed the title Generic filtering (Phase 2) Implementation Generic filtering Implementation Jan 25, 2016
@ruflin ruflin added the in progress Pull request is currently in progress. label Jan 25, 2016
@monicasarbu monicasarbu force-pushed the generic_filtering branch 2 times, most recently from 672c482 to 4115a5c Compare January 26, 2016 18:43
keyPart := keyParts[i]

if _, ok := mapp[keyPart]; ok {
mapp = mapp[keyPart].(MapStr)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

panic possible here.

@monicasarbu monicasarbu force-pushed the generic_filtering branch 5 times, most recently from 4fa7a64 to 2d3b680 Compare January 27, 2016 23:33
filters, err := filter.New(b.Config.Filter)
if err != nil {
fmt.Printf("Error Initialising filters: %v\n", err)
logp.Critical(err.Error())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed we spell "initializing" two different ways in messages in this file. We should stop choose one way and change them all.

return nil
}

func (m MapStr) Clone() MapStr {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace it with copystructure?

Both actions receive fields as argument that can be of type map or a value. For example: "proc", "proc.cpu.total_p".
Each event MUST contain a few fields that cannot be removed. For now, these "mandatory" fields are: ["@timestamp",
"beat", type", "count"]. If you configure one of these fields in the drop_fields, it will be ignored.
In case there is a field to include that is not available in the event, then just ignore the field. If none of the
fields to include are part of the event, then drop the event as it contains no information except the "mandatory" fields.

When multiple filtering rules are defined, we start with a copy of the event and apply the filtering rules one by to the event.
   event -> filter rule 1 -> event 1 -> filter rule 2 -> event 2 ...
where event is the initial event, event1 is the event resulted after applying "filter rule1" and it's considered input for the
"filter rule 2" and so on.
The event checker verifies if no field has the struct or pointer type and it's called
by the publisher just before publishing the event.
Check if the event (of type MapStr) passed to the publisher contains the expected types. If not, then
convert the values by calling json.marshall & json.unmarshall.
@monicasarbu
Copy link
Contributor Author

Closing this PR in favor of #1120.

@monicasarbu monicasarbu closed this Mar 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in progress Pull request is currently in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants