Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for setting fields in the config file #6024

Merged
merged 2 commits into from
Mar 13, 2018

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Jan 9, 2018

So far if the user wanted to modify the generated template, he either had to create his own template.json file or modify our fields.yml file. The problem with changing the fields.yml file is that with new versions it was hard to keep these up-to-date. This change allows to specify the few fields which should be set as part of the config.

Setting fields is especially useful in the case of Metricbeat for modules like Http or Prometheus where the data is user specific and we don't know the structure in advance.

This change also has affects on the generation of the index pattern in Kibana.

The configuration looks as following:

setup.template.append_fields:
  - name: test.name
    type: keyword
  - name: test.hostname
    type: long

I would have preferred to use setup.template.fields: but that is already taken by the path to the file.

Notes:

  • For this change to happen the template and index pattern must be overwritten
  • Overwriting existing fields is not allowed

@ruflin ruflin added discuss Issue needs further discussion. libbeat labels Jan 9, 2018
@exekias
Copy link
Contributor

exekias commented Jan 9, 2018

I like the idea 🌮, it will enable users to map their own stuff, which is nice. I would call it extra_fields, to make it clear that it's adding fields on top of the default ones.

I'm worried about allowing overwrites, I cannot think of any case that doesn't break our events. I would say you can override fully y replacing fields.yaml if that's what you want. That will ensure someone adding just some fields cannot break anything.

@tsg
Copy link
Contributor

tsg commented Jan 10, 2018

Two thoughts on this:

  • It might be nicer to bring the configuration at the module level. That way, we can in the future support simple customisations to Filebeat modules by modifying only the configuration file. But that can come later.
  • Just adding those fields won't be enough, because the template needs to be overwritten, which means that the overwrite flag needs to be set to true at least once, and a new index needs to be created.

I was working/thinking of a similar development for the Filebeat modules pipelines. In that case, as a solution to the issue above, we could allow the user to provide a custom string to add to the pipeline ID. So something like this:

  pipeline:
    append_processors:
      - rename: ...
    custom_token: mypipeline

Then the pipeline ID created would be of the form: filebeat-<module>-<version>-mypipeline which means it would be applied automatically for the Beats with that configuration, and not for the other.

I'm not sure if we can translate this idea to the templates as well, but just wanted to put it out there.

Naming suggestion: the config setting could be called append_fields to point out that the fields are appending to the existing one, not replacing them. Or would replacing be supported as well?

@ruflin
Copy link
Member Author

ruflin commented Jan 12, 2018

I agree we should only append fields and not even allow to overwrite them. As template inheritance might still be around in the future in ES, we potentially could load separate templates for these additional fields. Like this we would only have to overwrite the additional template.

I agree the support for seems to be most important inside modules. So far I was mainly thinking of prometheus or http module in metricbeat, but Filebeat is also a good use case.

Long term I'm hoping the fields.yml is part of the binary already (#4834), so the user will not even be tempted to touch it. If he wants to add some fields, he can do it in the beats config file directly either on the global level, module or prospector level. If he has many fields, he should also be able to provide a path to one or many additional fields.yml files. And then an option should be available to overwrite the template or not. In the best case, this is only for the parts the user modified.

As soon as we have the fields.yml as part of the Golang code, we can split it up internally into one file for each module which we already have for development. For the modules this would allow then to either use the fields.yml in the binary or if a file exists in the module directory, to use the one in the file. This will be tricky with overwriting etc. but I think there are a few ideas how we can solve that (see below).

It would be nice if we could detect on the beat side if the template in ES is identical to the local one or not. Or if we have multiple templates, if one does not exist or one has changed. So in the default case we could at least report that the template is not identical or one of the templates does not exist or is not identical. Then we could offer the user to only add non existing templates or overwrite the ones which are not identical which would mainly be used for development I hope.

Moving forward I suggest to put the feature on a global level out there to see how people use it and disallow overwriting existing fields. I would start with append_fields as config option, mark is as experimental, wait for user feedback and revise it in combination with our development on modules again.

@@ -134,3 +134,26 @@ func (f Fields) hasKey(keys []string) bool {
}
return false
}

func (f Fields) GetKeys() []string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method Fields.GetKeys should have comment or be unexported

@ruflin ruflin force-pushed the add-template-fields branch 2 times, most recently from 6978561 to 3e402ca Compare January 31, 2018 01:57
@ruflin ruflin changed the title [Discuss] Add support for setting fields in the config file Add support for setting fields in the config file Feb 6, 2018
@ruflin ruflin added review and removed discuss Issue needs further discussion. labels Feb 6, 2018
@ruflin
Copy link
Member Author

ruflin commented Feb 6, 2018

This is now ready for a round of reviews. I added the feature as experimental for now so we can start playing around with it.

@@ -99,6 +99,44 @@ func (f Fields) HasKey(key string) bool {
return f.hasKey(keys)
}

// HasNode checks if inside fields the given node exists
// In contrast to HasKey it not only compares the leave nodes but
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/leave/leaf/ since it's singular in this context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

for _, test := range tests {
_, err := appendFields(test.fields, test.appendFields)
if test.error {
fmt.Printf("%v\n\n", test.fields)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Logf instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just removed it, was only for debugging and a left over.

@ruflin
Copy link
Member Author

ruflin commented Feb 12, 2018

@andrewkroh Ready for an other look, please squash.

@ruflin ruflin force-pushed the add-template-fields branch 2 times, most recently from 7302a11 to 07312fa Compare February 27, 2018 06:55
So far if the user wanted to modify the generated template, he either had to create his own template.json file or modify our `fields.yml` file. The problem with changing the `fields.yml` file is that with new versions it was hard to keep these up-to-date. This change allows to specify the few fields which should be set as part of the config.

Setting fields is especially useful in the case of Metricbeat for modules like Http or Prometheus where the data is user specific and we don't know the structure in advance.

This change also has affects on the generation of the index pattern in Kibana.

The configuration looks as following:

```
setup.template.append_fields:
  - name: test.name
    type: keyword
  - name: test.hostname
    type: long
```

I would have preferred to use `setup.template.fields:` but that is already taken by the path to the file.

Notes:
* For this change to happen the template and index pattern must be overwritten
* Overwriting existing fields is not allowed
@@ -82,3 +82,7 @@ setup.template.overwrite: false
setup.template.settings:
_source.enabled: false
----------------------------------------------------------------------

*`setup.template.append_fields`*:: A list of of fields to be added to the template and Kibana index pattern.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the changelog comment this should probably be marked as experimental.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

experimental flag added.

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrewkroh andrewkroh requested a review from tsg March 13, 2018 14:05
Copy link
Contributor

@tsg tsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, happy to have this feature in.

@andrewkroh andrewkroh merged commit b640832 into elastic:master Mar 13, 2018
@ruflin ruflin deleted the add-template-fields branch March 13, 2018 14:28
ruflin added a commit to ruflin/beats that referenced this pull request Apr 10, 2018
The rename processor allows to rename fields before they are indexed to standardise on names or move fields around. This becomes useful when building filebeat modules which read from json files. With the rename processor no ingest pipeline is needed to follow the naming schema. This should make some modules simpler to build. It's also useful in combination with elastic#6024 to rename some fields according to the schema.

```
processors:
- rename:
    fields:
     - from: "a"
       to: "b"
```

Intention of rename
* Adjust fields to mapping
* Prevent conflicts like `a` and `a.b` by renaming `a` to `a.value`

Limitations
* Will not overwrite keys
ph pushed a commit that referenced this pull request Apr 12, 2018
* Add rename fields processor

The rename processor allows to rename fields before they are indexed to standardise on names or move fields around. This becomes useful when building filebeat modules which read from json files. With the rename processor no ingest pipeline is needed to follow the naming schema. This should make some modules simpler to build. It's also useful in combination with #6024 to rename some fields according to the schema.

```
processors:
- rename:
    fields:
     - from: "a"
       to: "b"
```

Intention of rename
* Adjust fields to mapping
* Prevent conflicts like `a` and `a.b` by renaming `a` to `a.value`

Limitations
* Will not overwrite keys
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants