Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make fields.yml part of the binary #4834

Merged
merged 2 commits into from
May 15, 2018

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Aug 7, 2017

Currently the fields.yml file is shipped as its own file and has to be place in the right directory to be used. As fields have become a central part of beats I suggest to make them part of the binary which will allow us to also split up the fields.yml inside the binary into multiple part (for example one per module) and use it for testing like fields checks.

The behavior from the user perspective is as following:

  • By default internal fields.yml data is used
  • If fields.yml path is set in the config file, it is loaded from disk

The golang files are generated when running make update. The content of fields.yml is stored as a compressed string.

Each asset is currently prefixed by the Beat name. The current fields registry is a basic implementation that likely needs to be extended when we add fields per module or processors.

@ruflin ruflin added discuss Issue needs further discussion. in progress Pull request is currently in progress. labels Aug 7, 2017
@tsg
Copy link
Contributor

tsg commented Aug 8, 2017

I think we need to move into the direction where we no longer concatenate all the module's fields.yml into one large file, but have them per module. This allows the compiling of "partial" mapping templates and index patterns, with only the fields that are in use.

Note sure if this is conflicting or is just orthogonal with this PR, but I wanted us to keep that in mind during the discussion.

@ruflin
Copy link
Member Author

ruflin commented Aug 8, 2017

I like the idea of having the "partial" fields.yml locally. The main thing that I see as a blocker to load partial mappings is that with ES 6.0 inheritance of templates was removed. This means if someone starts with the mysql module and then adds the redis module one large template with a new index would have to be used. One way out of this would be to have one index per module.

@ruflin
Copy link
Member Author

ruflin commented Oct 6, 2017

@andrewkroh go-bindata causes an issue here as it does not generate with which goimports is happy. I remember you had your own implementation of loading file data. Can you point me to it again?

@ruflin
Copy link
Member Author

ruflin commented Oct 9, 2017

@andrewkroh Thanks. I took your implementation and modified it a bit to be used here.

@ruflin ruflin added libbeat review and removed discuss Issue needs further discussion. in progress Pull request is currently in progress. labels Oct 9, 2017
@tsg
Copy link
Contributor

tsg commented Oct 9, 2017

@ruflin There's a probably related test failure here: https://travis-ci.org/elastic/beats/jobs/285439037#L1872

Copy link
Contributor

@tsg tsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. I think you want to remove the packaging of fields.yml in each platform (for example here), and then test the packages.

@ruflin
Copy link
Member Author

ruflin commented Oct 9, 2017

I was thinking of removing fields.yml from the package only when the export of fields.yml is possible. I added it as a task to this issue here: #5347

@ruflin
Copy link
Member Author

ruflin commented Oct 9, 2017

@tsg Have a look again, tests should be fixed. Travis failure is related to elastic/logstash#8456

)

func init() {
if err := assets.AddFields("fields.yml", Asset); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If two Beats were combined into the same process would this still work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AddFields would work as currently the name is not used at all internally. What would cause problems is that the fields from libbeat would be duplicated. For now I kept the fields.yml exactly as it is and a 1-1 copy is added to each file. For some long terms ideas like splitting up for libbeat or modules, see PR description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that you would want the fields from each Beat to be stored independently such that each template (from asset.GetFields) contains only the fields for that Beat. But maybe that is moot if all beats in one process share a single output (not sure how that would work).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitively a valid point but I would ignore it for now.

@@ -147,6 +148,7 @@ func NewBeat(name, indexPrefix, v string) (*Beat, error) {
Hostname: hostname,
UUID: uuid.NewV4(),
},
Fields: assets.GetFields(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the beat name could be a param? (thinking of two beats in the same process)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking it cause some issues with the collectbeat from @vjsamuel ? Did not think about that TBH.

@@ -237,7 +237,7 @@ func TestOverwrite(t *testing.T) {
Enabled: true,
Fields: absPath + "/fields.yml",
})
loader, err := NewLoader(config, client, beatInfo)
loader, err := NewLoader(config, client, beatInfo, []byte{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pass nil instead of allocating an empty slice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

func load(cfg *ucfg.Config) (common.Fields, error) {
keys := []common.Field{}

cfg.Unpack(&keys)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

}

func load(cfg *ucfg.Config) (common.Fields, error) {
keys := []common.Field{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use var keys common.Field will Unpack allocate this for you?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested and seems to work.

@andrewkroh
Copy link
Member

andrewkroh commented Oct 9, 2017

Just a few notes from discussion:

You could auto-detect if {path.home}/fields.yml exists and use it.

You might want to compress the fields.yml data internally and then extract it on first use.

@ruflin ruflin added in progress Pull request is currently in progress. and removed review labels Oct 9, 2017
}
}

func Asset() ([]byte, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function Asset should have comment or be unexported

}
}

func Asset() ([]byte, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function Asset should have comment or be unexported

return output, nil
}

// Load the given input and generates the input based on it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment on exported method Template.LoadBytes should be of the form "LoadBytes ..."

}

func NewLoader(cfg *common.Config, client ESClient, beatInfo beat.Info) (*Loader, error) {
func NewLoader(cfg *common.Config, client ESClient, beatInfo beat.Info, fields []byte) (*Loader, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function NewLoader should have comment or be unexported

return nil
}

func GetFields() []byte {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function GetFields should have comment or be unexported


var FieldsRegistry [][]byte

func AddFields(name string, asset func() ([]byte, error)) error {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function AddFields should have comment or be unexported

@@ -0,0 +1,23 @@
package assets

var FieldsRegistry [][]byte

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported var FieldsRegistry should have comment or be unexported

}
}

func Asset() ([]byte, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function Asset should have comment or be unexported

}
}

func Asset() ([]byte, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function Asset should have comment or be unexported

}
}

func Asset() ([]byte, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported function Asset should have comment or be unexported

@ruflin
Copy link
Member Author

ruflin commented Apr 25, 2018

This PR is ready for an other round of reviews.

@ruflin ruflin mentioned this pull request Apr 26, 2018
@ruflin ruflin force-pushed the add-fields-to-package branch 4 times, most recently from 4f7db2d to cd27521 Compare April 27, 2018 15:41
@ruflin
Copy link
Member Author

ruflin commented May 1, 2018

jenkins, test this

@ruflin ruflin force-pushed the add-fields-to-package branch 5 times, most recently from f18e304 to 7c80fe0 Compare May 8, 2018 08:52
@ruflin
Copy link
Member Author

ruflin commented May 8, 2018

One temporary issue of this change is that it will increase the PR conflicts every time fields are changed. But as soon as each module loads the fields locally, this should disappear again.

Currently the fields.yml file is shipped as its own file and has to be place in the right directory to be used. As fields have become a central part of beats I suggest to make them part of the binary which will allow us to also split up the fields.yml inside the binary into multiple part (for example one per module) and use it for testing like fields checks.

The behavior from the user perspective is as following:

* By default internal `fields.yml` data is used
* If `fields.yml` path is set in the config file, it is loaded from disk

The golang files are generated when running `make update`. The content of fields.yml is stored as a compressed string.

Each asset is currently prefixed by the Beat name. The current fields registry is a basic implementation that likely needs to be extended when we add fields per module or processors.
Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, there are some valid comments by hound, apart from that: 👍

@exekias exekias merged commit 726fb00 into elastic:master May 15, 2018

var tpl = template.Must(template.New("normalizations").Parse(`
// ${ES_BEATS}/dev-tools/cmd/asset/asset.go
// MACHINE GENERATED BY THE ABOVE COMMAND; DO NOT EDIT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be changed to Code generated by beats/dev-tools/cmd/asset/asset.go - DO NOT EDIT. Or something that matches this pattern golang/go#13560 (comment). Then tools like IntelliJ and golint will recognize it as being generated.

@@ -4,6 +4,7 @@ import (
"os"

"{beat_path}/cmd"
_ "{beat_path}/include"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that unused imports should be separated into their own group and have a comment explaining it's existence. Similar to the golint warning, but golint exempts main packages.

@ruflin ruflin deleted the add-fields-to-package branch May 15, 2018 13:36
@ruflin
Copy link
Member Author

ruflin commented May 15, 2018

@andrewkroh Will do a follow up PR with your requested changes.

ruflin added a commit to ruflin/beats that referenced this pull request May 18, 2018
Follow up from review comments in elastic#4834
andrewkroh pushed a commit that referenced this pull request May 18, 2018
Follow up from review comments in #4834
stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
Make fields.yml part of the binary

Currently the fields.yml file is shipped as its own file and has to be place in the right directory to be used. As fields have become a central part of beats I suggest to make them part of the binary which will allow us to also split up the fields.yml inside the binary into multiple part (for example one per module) and use it for testing like fields checks.
stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
Make fields.yml part of the binary

Currently the fields.yml file is shipped as its own file and has to be place in the right directory to be used. As fields have become a central part of beats I suggest to make them part of the binary which will allow us to also split up the fields.yml inside the binary into multiple part (for example one per module) and use it for testing like fields checks.
stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants