Add implementation of FSWatcher and FSScanner for filestream #21444

Merged
kvch merged 2 commits into elastic:master from feature-filebeat-fswatch-for-filestream on Oct 2, 2020

kvch
Contributor

@kvch kvch commented Oct 1, 2020

What does this PR do?

This PR adds the implementation for FSWatcher and FSScanner for the filestream input.

The implementation of FSScanner is called fileScanner. It is responsible for

  • resolving recursive globs on creation
  • normalizing glob patterns on creation
  • finding files that match the configured paths and returning FileInfo for them

This is a refactored version of the log input's scanner and globber functions.

The implementation of FSWatcher is called fileWatcher. It checks the file list returned by fileScanner and creates events based on the result.

Why is it important?

It is required for the filestream input.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Related #20243

@botelastic botelastic bot added the needs_team label ("Indicates that the issue/PR needs a Team:* label") on Oct 1, 2020
@kvch kvch changed the title from "Add implementation of FSWatch for file scanner" to "Add implementation of FSWatcher and FSScanner for filestream" on Oct 1, 2020
@kvch kvch added the Team:Services label ("(Deprecated) Label for the former Integrations-Services team") and removed the needs_team label ("Indicates that the issue/PR needs a Team:* label") on Oct 1, 2020
@kvch kvch marked this pull request as ready for review October 1, 2020 10:04
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@kvch
Contributor Author

kvch commented Oct 1, 2020

jenkins run tests

@kvch kvch requested review from faec and ruflin October 1, 2020 10:05
@elasticmachine
Collaborator

elasticmachine commented Oct 1, 2020

💔 Tests Failed

Build stats

  • Build Cause: [Pull request #21444 updated]

  • Start Time: 2020-10-02T08:48:35.503+0000

  • Duration: 60 min 51 sec

Test stats 🧪

Test Results

  • Failed: 2
  • Passed: 2017
  • Skipped: 131
  • Total: 2150

Test errors

  • Name: Build&Test / filebeat-build / TestFileWatchNewDeleteModified – filestream
    • Age: 1
    • Duration: 0.02
    • Error Details: Failed
  • Name: Build&Test / filebeat-build / TestFileWatchNewDeleteModified/two_modified_files – filestream
    • Age: 1
    • Duration: 0.01
    • Error Details: Failed

Steps errors

  • Name: mage build test
    • Description: mage build test
    • Duration: 2 min 56 sec
    • Start Time: 2020-10-02T09:15:16.194+0000
  • Name: Notifies GitHub of the status of a Pull Request
    • Description: script returned exit code 1
    • Duration: 0 min 1 sec
    • Start Time: 2020-10-02T09:17:20.272+0000
  • Name: Extract
    • Description: tar -xpf source.tgz
    • Duration: 1 min 39 sec
    • Start Time: 2020-10-02T09:15:25.575+0000
  • Name: Error signal
    • Description: untar: step failed with error script returned exit code 1
    • Duration: 0 min 0 sec
    • Start Time: 2020-10-02T09:16:04.694+0000
  • Name: Notifies GitHub of the status of a Pull Request
    • Description: untar: step failed with error script returned exit code 1
    • Duration: 0 min 2 sec
    • Start Time: 2020-10-02T09:16:05.075+0000
  • Name: Extract
    • Description: tar -xpf source.tgz
    • Duration: 0 min 27 sec
    • Start Time: 2020-10-02T09:16:02.947+0000
  • Name: Error signal
    • Description: untar: step failed with error script returned exit code 1
    • Duration: 0 min 0 sec
    • Start Time: 2020-10-02T09:16:29.846+0000
  • Name: Notifies GitHub of the status of a Pull Request
    • Description: untar: step failed with error script returned exit code 1
    • Duration: 0 min 1 sec
    • Start Time: 2020-10-02T09:16:30.170+0000

Log output (last 100 lines)

[2020-10-02T09:46:48.104Z] Unable to find image 'alpine:3.4' locally
[2020-10-02T09:46:48.686Z] 3.4: Pulling from library/alpine
[2020-10-02T09:46:48.947Z] c1e54eec4b57: Pulling fs layer
[2020-10-02T09:46:49.517Z] c1e54eec4b57: Download complete
[2020-10-02T09:46:49.783Z] c1e54eec4b57: Pull complete
[2020-10-02T09:46:49.783Z] Digest: sha256:b733d4a32c4da6a00a84df2ca32791bb03df95400243648d8c539e7b4cce329c
[2020-10-02T09:46:49.783Z] Status: Downloaded newer image for alpine:3.4
[2020-10-02T09:46:51.989Z] + python .ci/scripts/pre_archive_test.py
[2020-10-02T09:46:53.898Z] Copy ./x-pack/filebeat/build into build/x-pack/filebeat/build
[2020-10-02T09:46:53.909Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats/build
[2020-10-02T09:46:53.919Z] WARNING: Unknown parameter(s) found for class type 'hudson.tasks.junit.pipeline.JUnitResultsStep': id,stashedTestReports
[2020-10-02T09:46:53.923Z] Recording test results
[2020-10-02T09:46:55.281Z] Stashed 4 file(s)
[2020-10-02T09:46:55.291Z] Archiving artifacts
[2020-10-02T09:46:55.878Z] + python .ci/scripts/search_system_tests.py
[2020-10-02T09:46:55.894Z] [INFO] system-tests='build/x-pack/filebeat/build/system-tests'. If no empty then let's create a tarball
[2020-10-02T09:46:56.223Z] + tar --version
[2020-10-02T09:46:56.527Z] + tar --exclude=x-pack-filebeat--system-tests-linux.tgz -czf x-pack-filebeat--system-tests-linux.tgz build/x-pack/filebeat/build/system-tests
[2020-10-02T09:47:23.127Z] Archiving artifacts
[2020-10-02T09:47:34.647Z] Client: Docker Engine - Community
[2020-10-02T09:47:34.647Z]  Version:           19.03.13
[2020-10-02T09:47:34.647Z]  API version:       1.40
[2020-10-02T09:47:34.647Z]  Go version:        go1.13.15
[2020-10-02T09:47:34.647Z]  Git commit:        4484c46d9d
[2020-10-02T09:47:34.647Z]  Built:             Wed Sep 16 17:02:36 2020
[2020-10-02T09:47:34.647Z]  OS/Arch:           linux/amd64
[2020-10-02T09:47:34.647Z]  Experimental:      false
[2020-10-02T09:47:34.647Z] 
[2020-10-02T09:47:34.647Z] Server: Docker Engine - Community
[2020-10-02T09:47:34.647Z]  Engine:
[2020-10-02T09:47:34.647Z]   Version:          19.03.13
[2020-10-02T09:47:34.647Z]   API version:      1.40 (minimum version 1.12)
[2020-10-02T09:47:34.647Z]   Go version:       go1.13.15
[2020-10-02T09:47:34.647Z]   Git commit:       4484c46d9d
[2020-10-02T09:47:34.647Z]   Built:            Wed Sep 16 17:01:06 2020
[2020-10-02T09:47:34.647Z]   OS/Arch:          linux/amd64
[2020-10-02T09:47:34.647Z]   Experimental:     false
[2020-10-02T09:47:34.647Z]  containerd:
[2020-10-02T09:47:34.647Z]   Version:          1.3.7
[2020-10-02T09:47:34.647Z]   GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
[2020-10-02T09:47:34.647Z]  runc:
[2020-10-02T09:47:34.647Z]   Version:          1.0.0-rc10
[2020-10-02T09:47:34.647Z]   GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
[2020-10-02T09:47:34.647Z]  docker-init:
[2020-10-02T09:47:34.647Z]   Version:          0.18.0
[2020-10-02T09:47:34.647Z]   GitCommit:        fec3683
[2020-10-02T09:47:45.101Z] [INFO] unstashV2: JOB_GCS_BUCKET is set. bucket param got precedency instead.
[2020-10-02T09:47:45.111Z] [INFO] unstashV2: JOB_GCS_CREDENTIALS is set. credentialsId param got precedency instead.
[2020-10-02T09:47:45.177Z] [Google Cloud Storage Plugin] Found 1 files to download from pattern: gs://beats-ci-temp/Beats/beats/PR-21444-3/source/source.tgz
[2020-10-02T09:47:45.196Z] [Google Cloud Storage Plugin] Downloading: Beats/beats/PR-21444-3/source/source.tgz to local path: /var/lib/jenkins/workspace/Beats_beats_PR-21444/source.tgz
[2020-10-02T09:47:55.237Z] + tar --version
[2020-10-02T09:47:55.546Z] + tar -xpf source.tgz
[2020-10-02T09:48:05.904Z] + rm source.tgz
[2020-10-02T09:48:05.961Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats
[2020-10-02T09:48:05.970Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats/uncategorized-1601629887098
[2020-10-02T09:48:06.022Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats/filebeat-build-1601630236022
[2020-10-02T09:48:06.060Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats/x-pack-filebeat-build-1601632014963
[2020-10-02T09:48:06.401Z] + cat
[2020-10-02T09:48:06.401Z] + /usr/local/bin/runbld ./runbld-test-reports --job-name elastic+beats+pull-request
[2020-10-02T09:48:06.401Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-10-02T09:48:13.007Z] runbld>>> runbld started
[2020-10-02T09:48:13.007Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-10-02T09:48:14.393Z] runbld>>> The following profiles matched the job 'elastic+beats+pull-request' in order of occurrence in the config (last value wins).
[2020-10-02T09:48:14.393Z] runbld>>> Matches in the system config:
[2020-10-02T09:48:14.393Z] runbld>>> - Matched ^elastic\+beats
[2020-10-02T09:48:14.393Z] runbld>>> - Matched ^elastic\+beats\+pull-request
[2020-10-02T09:48:15.799Z] runbld>>> Debug logging enabled.
[2020-10-02T09:48:15.799Z] runbld>>> Storing result
[2020-10-02T09:48:16.060Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-10-02T09:48:16.061Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20201002094815-28161F10
[2020-10-02T09:48:16.061Z] runbld>>> Adding system facts.
[2020-10-02T09:48:17.004Z] runbld>>> Adding vcs info for the latest commit:  5818f01de2c3b25b56ed89f174ca7b3b7a15935a
[2020-10-02T09:48:17.005Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-10-02T09:48:17.005Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-10-02T09:48:17.266Z] Processing JUnit reports with runbld...
[2020-10-02T09:48:17.266Z] + echo 'Processing JUnit reports with runbld...'
[2020-10-02T09:48:17.527Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-10-02T09:48:17.527Z] runbld>>> DURATION: 32ms
[2020-10-02T09:48:17.527Z] runbld>>> STDOUT: 40 bytes
[2020-10-02T09:48:17.527Z] runbld>>> STDERR: 49 bytes
[2020-10-02T09:48:17.527Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-10-02T09:48:17.527Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-21444
[2020-10-02T09:48:18.468Z] runbld>>> Storing build metadata: 
[2020-10-02T09:48:18.468Z] runbld>>> Adding test report.
[2020-10-02T09:48:18.468Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-21444/src/github.com/elastic/beats
[2020-10-02T09:48:19.413Z] runbld>>> Found 5 test output files
[2020-10-02T09:48:19.984Z] runbld>>> Test output logs contained: Errors: 0 Failures: 2 Tests: 2150 Skipped: 122
[2020-10-02T09:48:20.245Z] runbld>>> Storing result
[2020-10-02T09:48:20.245Z] runbld>>> FAILURES: 2
[2020-10-02T09:48:20.818Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-10-02T09:48:20.818Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20201002094815-28161F10
[2020-10-02T09:48:20.818Z] runbld>>> Email notification disabled by environment variable.
[2020-10-02T09:48:20.818Z] runbld>>> Slack notification disabled by environment variable.
[2020-10-02T09:48:26.250Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-21444
[2020-10-02T09:48:26.315Z] [INFO] getVaultSecret: Getting secrets
[2020-10-02T09:48:26.391Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-10-02T09:48:26.990Z] + chmod 755 generate-build-data.sh
[2020-10-02T09:48:26.990Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21444/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21444/runs/3 FAILURE 3591226
[2020-10-02T09:48:26.990Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21444/runs/3/steps/?limit=10000 -o steps-info.json
[2020-10-02T09:48:28.334Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21444/runs/3/tests/?status=FAILED -o tests-errors.json

@kvch
Contributor Author

kvch commented Oct 1, 2020

Failing tests are unrelated.

Contributor

@faec faec left a comment

Several suggestions, but nothing complex enough to require another review pass. Approved.

7 review comments on filebeat/input/filestream/fswatch.go (all outdated and resolved)
@kvch kvch force-pushed the feature-filebeat-fswatch-for-filestream branch from c8ad6f6 to fb3dbab Compare October 2, 2020 08:47
@kvch kvch added the needs_backport label ("PR is waiting to be backported to other branches.") on Oct 2, 2020
@kvch kvch merged commit a119083 into elastic:master Oct 2, 2020
kvch added a commit to kvch/beats that referenced this pull request Oct 2, 2020
…#21444)

(cherry picked from commit a119083)
@kvch kvch added the v7.10.0 label and removed the needs_backport label ("PR is waiting to be backported to other branches.") on Oct 2, 2020
@v1v
Member

v1v commented Oct 2, 2020

The test failures might not be relevant, but the step failures are: symlinks are breaking the Windows builds.

v1v added a commit to v1v/beats that referenced this pull request Oct 2, 2020
…ne-2.0-arm

* upstream/master: (54 commits)
  [CI] Change x-pack/auditbeat build events (comments, labels) (elastic#21463)
  [CI] changeset from elastic#20603 was not added to CI2.0 (elastic#21464)
  Add new log file reader for filestream input (elastic#21450)
  [CI] Send slack message with build status (elastic#21428)
  Remove duplicated sources url in dependencies report (elastic#21462)
  Add implementation of FSWatcher and FSScanner for filestream (elastic#21444)
  [Ingest Manager] Split index restrictions into type,dataset, namespace parts (elastic#21406)
  Update Filebeat module expected logs files (elastic#21454)
  Edit SQL module docs and fix broken doc structure (elastic#21233)
  [Ingest Manager] Send snapshot flag together with metadata (elastic#21285)
  Revert "[JJBB] Set shallow cloning to 10 (elastic#21409)" (elastic#21447)
  [JJBB] Use reference repo for fast checkouts (elastic#21410)
  Add initial skeleton of filestream input (elastic#21427)
  Initial spec file for apm-server (elastic#21225)
  [Ingest Manager] Upgrade Action: make source URI optional (elastic#21372)
  Add field limit check for AWS Cloudtrail flattened fields (elastic#21388)
  [Winlogbeat] Move winlogbeat javascript processor to libbeat (elastic#21402)
  ci: pipeline to generate the changelog (elastic#21426)
  [JJBB] Set shallow cloning to 10 (elastic#21409)
  docs: add link to release notes for 7.9.2 (elastic#21405) (elastic#21419)
  ...
v1v added a commit to v1v/beats that referenced this pull request Oct 2, 2020
…ci-build-label-support

* upstream/master:
  [CI] Change x-pack/auditbeat build events (comments, labels) (elastic#21463)
  [CI] changeset from elastic#20603 was not added to CI2.0 (elastic#21464)
  Add new log file reader for filestream input (elastic#21450)
  [CI] Send slack message with build status (elastic#21428)
  Remove duplicated sources url in dependencies report (elastic#21462)
  Add implementation of FSWatcher and FSScanner for filestream (elastic#21444)
  [Ingest Manager] Split index restrictions into type,dataset, namespace parts (elastic#21406)
  Update Filebeat module expected logs files (elastic#21454)
  Edit SQL module docs and fix broken doc structure (elastic#21233)
  [Ingest Manager] Send snapshot flag together with metadata (elastic#21285)
  Revert "[JJBB] Set shallow cloning to 10 (elastic#21409)" (elastic#21447)
  [JJBB] Use reference repo for fast checkouts (elastic#21410)
  Add initial skeleton of filestream input (elastic#21427)
  Initial spec file for apm-server (elastic#21225)
  [Ingest Manager] Upgrade Action: make source URI optional (elastic#21372)
  Add field limit check for AWS Cloudtrail flattened fields (elastic#21388)
  [Winlogbeat] Move winlogbeat javascript processor to libbeat (elastic#21402)
  ci: pipeline to generate the changelog (elastic#21426)
kvch added a commit that referenced this pull request Oct 5, 2020
…ner for filestream (#21468)

* Add implementation of FSWatcher and FSScanner for filestream (#21444)

(cherry picked from commit a119083)

* Do not run symlink tests on Windows (#21472)
Labels: Team:Services, v7.10.0
4 participants