Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auditbeat Processor to enrich auditd events with session view information #37640

Merged
merged 55 commits into from
Apr 5, 2024

Conversation

mjwolf
Copy link
Contributor

@mjwolf mjwolf commented Jan 16, 2024

Proposed commit message

This adds a add_session_metadata processor to auditbeat which will enrich auditd process events with information needed to enable the Kibana session viewer on the events.

In this implementation, eBPF is used to collect information on all processes running on the system, which are added to a process database. When a process event is run through the processor, the DB will be read to retrieve information on the processes related to the process in the event (the processes's parent, session leader,process group leader and entry leader will be retrieved). Then the event will be enriched with the metadata on these related processes, which can enable use of session view on the data

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation Documentation is being worked on separately
    - [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

How to test this PR locally

By using the below auditd rules and adding the processor configuration, a local auditbeat instance, on a supported Linux system will send ECS events that will work with Kibana session viewer. So it can be tested end-to-end by using this configuration, working interactively with a console session and viewing the session in Kibana.

Audit Rules:

## Executions.
-a always,exit -F arch=b64 -S execve,execveat -k exec
-a always,exit -F arch=b64 -S exit_group

## Set_sid
-a always,exit -F arch=b64 -S setsid

/etc/auditbeat/auditbeat.yml processor config

auditbeat.modules:

- module: auditd
  processors:
    - add_session_metadata:
       replace_fields: true
       backend: "ebpf"

Related issues

Screenshots

Example of Session View with data provided by Auditbeat and this processor

Screenshot 2024-01-15 at 4 49 02 PM

This processor will enrich process events with additional infomation needed to
enable session view in Kibana.

This processor can be run on Linux systems, and will use eBPF to enrich auditd events
for process exec and exit events. The additional fields that will be added are information
on process parent, session leader and process group leader.
Calculate and append entry leader information to enriched processes.
@mjwolf mjwolf added enhancement backport-skip Skip notification from the automated backport with mergify labels Jan 16, 2024
@mjwolf mjwolf requested review from a team as code owners January 16, 2024 01:37
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 16, 2024
@mergify mergify bot assigned mjwolf Jan 16, 2024
@elasticmachine
Copy link
Collaborator

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2024-01-16T01:37:45.735+0000

  • Duration: 8 min 37 sec

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@elasticmachine
Copy link
Collaborator

💔 Build Failed

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Duration: 130 min 43 sec

Pipeline error 1

This error is likely related to the pipeline itself. Click here
and then you will see the error (either incorrect syntax or an invalid configuration).

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

pid = uint32(nr)
case uint32:
pid = v
case int, int8, int16, int32, int64:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious how likely are we to get these types (and below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this should only be int. In the future, this processor might be used with other beats, so I tried to prepare by accepting all types for pid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here generics would be helpful to eliminate the usage of reflect

Copy link
Contributor

mergify bot commented Jan 17, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b session_view_processor_ebpf upstream/session_view_processor_ebpf
git merge upstream/main
git push upstream session_view_processor_ebpf

@elasticmachine
Copy link
Collaborator

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2024-01-17T21:56:37.481+0000

  • Duration: 67 min 37 sec

Test stats 🧪

Test Results
Failed 0
Passed 27500
Skipped 1906
Total 29406

Steps errors 29

Expand to view the steps failures

Show only the first 10 steps failures

x-pack/auditbeat-rhel-9-rhel-9 - mage build unitTest
  • Took 2 min 44 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-rhel-9-rhel-9 - mage build unitTest
  • Took 0 min 6 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-rhel-9-rhel-9 - mage build unitTest
  • Took 0 min 2 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 2 min 53 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 0 min 15 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 2 min 49 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
Error signal
  • Took 0 min 0 sec . View more details here
  • Description: Error 'hudson.AbortException: script returned exit code 1'

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@elasticmachine
Copy link
Collaborator

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2024-01-17T21:36:44.936+0000

  • Duration: 87 min 54 sec

Test stats 🧪

Test Results
Failed 0
Passed 19782
Skipped 942
Total 20724

Steps errors 6

Expand to view the steps failures

auditbeat-crosscompile - make -C auditbeat crosscompile
  • Took 2 min 45 sec . View more details here
  • Description: make -C auditbeat crosscompile
auditbeat-crosscompile - make -C auditbeat crosscompile
  • Took 0 min 2 sec . View more details here
  • Description: make -C auditbeat crosscompile
auditbeat-crosscompile - make -C auditbeat crosscompile
  • Took 0 min 2 sec . View more details here
  • Description: make -C auditbeat crosscompile
filebeat-goIntegTest - mage goIntegTest
  • Took 5 min 44 sec . View more details here
  • Description: mage goIntegTest
filebeat-goIntegTest - mage goIntegTest
  • Took 16 min 42 sec . View more details here
  • Description: mage goIntegTest
Error signal
  • Took 0 min 0 sec . View more details here
  • Description: Error 'hudson.AbortException: script returned exit code 2'

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

Remove the DB interface, as there will only be one implementation for it
@mjwolf mjwolf force-pushed the session_view_processor_ebpf branch from c9cf706 to bf38e89 Compare January 18, 2024 00:11
@elasticmachine
Copy link
Collaborator

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Start Time: 2024-01-17T22:57:22.423+0000

  • Duration: 81 min 59 sec

Test stats 🧪

Test Results
Failed 97
Passed 28087
Skipped 1958
Total 30142

Test errors 97

Expand to view the tests failures

> Show only the first 10 test failures

Build&Test / filebeat-pythonIntegTest / test_keystore_with_present_key – filebeat.tests.system.test_keystore.TestKeystore
    Expand to view the error details

     Failed: Timeout >90.0s 
    

    Expand to view the stacktrace

     self = <test_keystore.TestKeystore testMethod=test_keystore_with_present_key>
    
        def test_keystore_with_present_key(self):
            """
            Test that we correctly do string replacement with values from the keystore
            """
        
            key = "elasticsearch_host"
            secret = "myeleasticsearchsecrethost"
        
            self.render_config_template(keystore_path=self.keystore_path, elasticsearch={
                'host': "${%s}:9200" % key
            })
        
            exit_code = self.run_beat(extra_args=["keystore", "create"])
            assert exit_code == 0
        
            self.add_secret(key, secret, True)
            proc = self.start_beat()
            self.wait_until(lambda: self.log_contains("myeleasticsearchsecrethost"))
            assert self.log_contains(secret)
    >       proc.kill_and_wait()
    
    tests/system/test_keystore.py:36: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    ../libbeat/tests/system/beat/beat.py:145: in kill_and_wait
        return self.wait()
    ../libbeat/tests/system/beat/beat.py:124: in wait
        return self.proc.wait()
    /usr/lib/python3.10/subprocess.py:1209: in wait
        return self._wait(timeout=timeout)
    /usr/lib/python3.10/subprocess.py:1959: in _wait
        (pid, sts) = self._try_wait(0)
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <Popen: returncode: None args: ['/var/lib/jenkins/workspace/PR-37640-5-e523d...>
    wait_flags = 0
    
        def _try_wait(self, wait_flags):
            """All callers to this function MUST hold self._waitpid_lock."""
            try:
    >           (pid, sts) = os.waitpid(self.pid, wait_flags)
    E           Failed: Timeout >90.0s
    
    /usr/lib/python3.10/subprocess.py:1917: Failed 
    

Build&Test / filebeat-pythonIntegTest / test_c_style_log – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_c_style_log>
    
        def test_c_style_log(self):
            """
            Test that multi lines for c style log works
            It checks that all lines following a line with \\ are appended to the previous line
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern="\\\\$",
                match="before"
            )
        
            os.mkdir(self.working_dir + "/log/")
            self.copy_files(["logs/multiline-c-log.log"],
                            target_dir="log")
        
            proc = self.start_beat()
        
            # wait for the "Skipping file" log message
    >       self.wait_until(
                lambda: self.output_has(lines=4),
                max_timeout=10)
    
    tests/system/test_multiline.py:64: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_c_style_log>
    cond = <function Test.test_c_style_log.<locals>.<lambda> at 0x7f1017e995a0>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_close_timeout_with_multiline – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_close_timeout_with_multiline>
    
        def test_close_timeout_with_multiline(self):
            """
            Test if multiline events are split up with close_timeout
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern=r"^\[",
                negate="true",
                match="after",
                close_timeout="2s",
            )
        
            os.mkdir(self.working_dir + "/log/")
        
            testfile = self.working_dir + "/log/test.log"
        
            with open(testfile, 'wb', 0) as file:
                file.write(b"[2015] hello world")
                file.write(b"\n")
                file.write(b"  First Line\n")
                file.write(b"  Second Line\n")
        
            proc = self.start_beat()
        
            # Wait until harvester is closed because of timeout
            # This leads to the partial event above to be sent
            self.wait_until(
                lambda: self.log_contains(
                    "Closing harvester because close_timeout was reached"),
                max_timeout=15)
        
            # Because of the timeout the following two lines should be put together
            with open(testfile, 'ab', 0) as file:
                file.write(b"  This should not be third\n")
                file.write(b"  This should not be fourth\n")
                # This starts a new pattern
                file.write(b"[2016] Hello world\n")
                # This line should be appended
                file.write(b"  First line again\n")
        
    >       self.wait_until(
                lambda: self.output_has(lines=3),
                max_timeout=10)
    
    tests/system/test_multiline.py:284: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_close_timeout_with_multiline>
    cond = <function Test.test_close_timeout_with_multiline.<locals>.<lambda> at 0x7f1017e9a4d0>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_consecutive_newline – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_consecutive_newline>
    
            def test_consecutive_newline(self):
                """
                Test if consecutive multilines have an affect on multiline
                """
                self.render_config_template(
                    path=os.path.abspath(self.working_dir) + "/log/*",
                    multiline=True,
                    multiline_type="pattern",
                    pattern=r"^\[",
                    negate="true",
                    match="after",
                    close_timeout="2s",
                )
        
                logentry1 = """[2016-09-02 19:54:23 +0000] Started 2016-09-02 19:54:23 +0000 "GET" for /gaq?path=%2FCA%2FFallbrook%2F1845-Acacia-Ln&referer=http%3A%2F%2Fwww.xxxxx.com%2FAcacia%2BLn%2BFallbrook%2BCA%2Baddresses&search_bucket=none&page_controller=v9%2Faddresses&page_action=show at 23.235.47.31
        X-Forwarded-For:72.197.227.93, 23.235.47.31
        Processing by GoogleAnalyticsController#index as JSON
        
          Parameters: {"path"=>"/CA/Fallbrook/1845-Acacia-Ln", "referer"=>"http://www.xxxx.com/Acacia+Ln+Fallbrook+CA+addresses", "search_bucket"=>"none", "page_controller"=>"v9/addresses", "page_action"=>"show"}
        Completed 200 OK in 5ms (Views: 1.9ms)""".encode("utf-8")
                logentry2 = """[2016-09-02 19:54:23 +0000] Started 2016-09-02 19:54:23 +0000 "GET" for /health_check at xxx.xx.44.181
        X-Forwarded-For:
        SetAdCodeMiddleware.default_ad_code referer
        SetAdCodeMiddleware.default_ad_code path /health_check
        SetAdCodeMiddleware.default_ad_code route """.encode("utf-8")
        
                os.mkdir(self.working_dir + "/log/")
        
                testfile = self.working_dir + "/log/test.log"
        
                with open(testfile, 'bw', 0) as file:
                    file.write(logentry1 + b"\n")
                    file.write(logentry2 + b"\n")
        
                proc = self.start_beat()
        
    >           self.wait_until(
                    lambda: self.output_has(lines=2),
                    max_timeout=10)
    
    tests/system/test_multiline.py:334: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_consecutive_newline>
    cond = <function Test.test_consecutive_newline.<locals>.<lambda> at 0x7f1017813520>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_java_elasticsearch_log – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_java_elasticsearch_log>
    
        def test_java_elasticsearch_log(self):
            """
            Test that multi lines for java logs works.
            It checks that all lines which do not start with [ are append to the last line starting with [
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern=r"^\[",
                negate="true",
                match="after"
            )
        
            os.mkdir(self.working_dir + "/log/")
            self.copy_files(["logs/elasticsearch-multiline-log.log"],
                            target_dir="log")
        
            proc = self.start_beat()
        
            # wait for the "Skipping file" log message
    >       self.wait_until(
                lambda: self.output_has(lines=20),
                max_timeout=10)
    
    tests/system/test_multiline.py:33: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_java_elasticsearch_log>
    cond = <function Test.test_java_elasticsearch_log.<locals>.<lambda> at 0x7f1017663250>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_max_bytes – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_max_bytes>
    
        def test_max_bytes(self):
            """
            Test the maximum number of bytes that is sent
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern=r"^\[",
                negate="true",
                match="after",
                max_bytes=60
            )
        
            os.mkdir(self.working_dir + "/log/")
            self.copy_files(["logs/elasticsearch-multiline-log.log"],
                            target_dir="log")
        
            proc = self.start_beat()
        
    >       self.wait_until(
                lambda: self.output_has(lines=20),
                max_timeout=10)
    
    tests/system/test_multiline.py:225: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_max_bytes>
    cond = <function Test.test_max_bytes.<locals>.<lambda> at 0x7f1017811630>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_max_lines – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_max_lines>
    
        def test_max_lines(self):
            """
            Test the maximum number of lines that is sent by multiline
            All further lines are discarded
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern=r"^\[",
                negate="true",
                match="after",
                max_lines=3
            )
        
            os.mkdir(self.working_dir + "/log/")
            self.copy_files(["logs/elasticsearch-multiline-log.log"],
                            target_dir="log")
        
            proc = self.start_beat()
        
    >       self.wait_until(
                lambda: self.output_has(lines=20),
                max_timeout=10)
    
    tests/system/test_multiline.py:141: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_max_lines>
    cond = <function Test.test_max_lines.<locals>.<lambda> at 0x7f1017fda5f0>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_rabbitmq_multiline_log – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_rabbitmq_multiline_log>
    
            def test_rabbitmq_multiline_log(self):
                """
                Test rabbitmq multiline log
                Special about this log file is that it has empty new lines
                """
                self.render_config_template(
                    path=os.path.abspath(self.working_dir) + "/log/*",
                    multiline=True,
                    multiline_type="pattern",
                    pattern="^=[A-Z]+",
                    match="after",
                    negate="true",
                )
        
                logentry = """=ERROR REPORT==== 3-Feb-2016::03:10:32 ===
        connection <0.23893.109>, channel 3 - soft error:
        {amqp_error,not_found,
                    "no queue 'bucket-1' in vhost '/'",
                    'queue.declare'}
        
        
        """
                os.mkdir(self.working_dir + "/log/")
        
                proc = self.start_beat()
        
                testfile = self.working_dir + "/log/rabbitmq.log"
                file = open(testfile, 'w')
                iterations = 3
                for n in range(0, iterations):
                    file.write(logentry)
                file.close()
        
                # wait for the "Skipping file" log message
    >           self.wait_until(
                    lambda: self.output_has(lines=3),
                    max_timeout=10)
    
    tests/system/test_multiline.py:109: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_rabbitmq_multiline_log>
    cond = <function Test.test_rabbitmq_multiline_log.<locals>.<lambda> at 0x7f1017fd9120>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_timeout – filebeat.tests.system.test_multiline.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds: 
    

    Expand to view the stacktrace

     self = <test_multiline.Test testMethod=test_timeout>
    
        def test_timeout(self):
            """
            Test that data is sent after timeout
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/log/*",
                multiline=True,
                multiline_type="pattern",
                pattern=r"^\[",
                negate="true",
                match="after",
            )
        
            os.mkdir(self.working_dir + "/log/")
        
            testfile = self.working_dir + "/log/test.log"
            file = open(testfile, 'wb', 0)
        
            file.write(b"[2015] hello world")
            file.write(b"\n")
            file.write(b"  First Line\n")
            file.write(b"  Second Line\n")
        
            proc = self.start_beat()
        
    >       self.wait_until(
                lambda: self.output_has(lines=1),
                max_timeout=10)
    
    tests/system/test_multiline.py:185: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_multiline.Test testMethod=test_timeout>
    cond = <function Test.test_timeout.<locals>.<lambda> at 0x7f101768e3b0>
    max_timeout = 10, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 10 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Build&Test / filebeat-pythonIntegTest / test_condition – filebeat.tests.system.test_processors.Test
    Expand to view the error details

     beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 20 seconds: 
    

    Expand to view the stacktrace

     self = <test_processors.Test testMethod=test_condition>
    
        def test_condition(self):
            """
            Check condition in processors
            """
            self.render_config_template(
                path=os.path.abspath(self.working_dir) + "/test*.log",
                processors=[{
                    "drop_event": {
                        "when": "not.contains.message: test",
                    },
                }]
            )
            with open(self.working_dir + "/test1.log", "w") as f:
                f.write("test1 message\n")
        
            with open(self.working_dir + "/test2.log", "w") as f:
                f.write("test2 message\n")
        
            filebeat = self.start_beat()
    >       self.wait_until(lambda: self.output_has(lines=2))
    
    tests/system/test_processors.py:113: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <test_processors.Test testMethod=test_condition>
    cond = <function Test.test_condition.<locals>.<lambda> at 0x7f1017f3f130>
    max_timeout = 20, poll_interval = 0.1, name = 'cond', err_msg = ''
    
        def wait_until(self, cond, max_timeout=20, poll_interval=0.1, name="cond", err_msg=""):
            """
            TODO: this can probably be a "wait_until_output_count", among other things, since that could actually use `self`, and this can become an internal function
            Waits until the cond function returns true,
            or until the max_timeout is reached. Calls the cond
            function every poll_interval seconds.
        
            If the max_timeout is reached before cond() returns
            true, an exception is raised.
            """
            start = datetime.now()
            while not cond():
                if datetime.now() - start > timedelta(seconds=max_timeout):
                    print("Test has failed, here are the Beat logs")
                    for l in self.get_log_lines():
                        print(l)
    >               raise WaitTimeoutError(
                        f"Timeout waiting for condition '{name}'. Waited {max_timeout} seconds: {err_msg}")
    E               beat.beat.WaitTimeoutError: Timeout waiting for condition 'cond'. Waited 20 seconds:
    
    ../libbeat/tests/system/beat/beat.py:449: WaitTimeoutError 
    

Steps errors 17

Expand to view the steps failures

Show only the first 10 steps failures

auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
filebeat-pythonIntegTest - mage pythonIntegTest
  • Took 55 min 46 sec . View more details here
  • Description: mage pythonIntegTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 2 min 42 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 0 min 15 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2022-windows-2022 - mage build unitTest
  • Took 0 min 15 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 2 min 49 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
x-pack/auditbeat-windows-2016-windows-2016 - mage build unitTest
  • Took 0 min 16 sec . View more details here
  • Description: mage build unitTest
Error signal
  • Took 0 min 0 sec . View more details here
  • Description: Error 'hudson.AbortException: script returned exit code 2'

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

Comment on lines 60 to 62
func ReduceTimestampPrecision(timeNs uint64) uint64 {
return timeNs - (timeNs % (uint64(time.Second.Nanoseconds()) / ticksPerSecond))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not be expressed with time.Duration(timeNs).Truncate(time.Second/time.Duration(ticksPerSecond))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it could. I've made this change and changed all related code to work with time.Duration

@mjwolf
Copy link
Contributor Author

mjwolf commented Feb 22, 2024

Thanks for the good suggestions @andrewkroh. I've made some of the changes, and I'll continue working on the rest

@pierrehilbert
Copy link
Collaborator

@mjwolf what are the news here?

@norrietaylor
Copy link
Member

@pierrehilbert Mike is on PTO; he will be back at his desk next week.

@pierrehilbert
Copy link
Collaborator

Thx @norrietaylor, no urgency here, just want to ensure we don't have forgotten work in progress :-)

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

cc @mjwolf

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 14, 2024

💔 Build Failed

Failed CI Steps

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 14, 2024

💔 Build Failed

Failed CI Steps

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

cc @mjwolf

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 14, 2024

💔 Build Failed

Failed CI Steps

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

cc @mjwolf

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mjwolf

@mjwolf mjwolf requested a review from andrewkroh March 18, 2024 15:26
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good. I just have few minor comments.

return nil, fmt.Errorf("pid %v not found in db: %w", pid, err)
}

result := ev.Clone()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very expensive operation to perform for each event. The semantics of Beat processors are not well-defined in the interface. Some of the processor clone in order create a transaction-like behavior in order to be able to undo partial changes in case of any failure.

At a minimum, the Clone operation should be performed immediately before the point at which it is necessary (so right before the first action that mutates the event (MergeFieldsDeep)).

An alternative approach would be to perform some pre-check on the destination map to see if the merge operation can be done safely. Then there would be no need to undo a partial merge operation so we could entirely avoid the clone. We don't need do this now, but we should plan to memory profile this before making it GA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the Clone to just before MergeFieldsDeep.

But I think it might be possible to remove this completely. It looks like MergeFieldsDeep won't return error when underRoot is true (other than the first precondition check), although there's a comment saying the opposite: https://github.com/elastic/elastic-agent-libs/blob/v0.8.0/mapstr/mapstr.go#L375-L380.

Do you know of any conditions where it could return error and leave the event in an inconsistant state?

But I'll leave the clone in for now, and reinvestigate if it's needed when performance testing is done before GA

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can fail when a field exists in the map that is not an object. Like if the event had a process: 123 field and you wanted to merge in {process: {pid: 123}}.

x-pack/auditbeat/processors/sessionmd/timeutils/time.go Outdated Show resolved Hide resolved
* Move event to after some conditions are checked
* Use sync.OnceValues in time utils
* Add issue reference in comment
@mjwolf mjwolf requested a review from andrewkroh March 28, 2024 23:57
@mjwolf mjwolf enabled auto-merge (squash) April 2, 2024 23:03
@mjwolf mjwolf merged commit 3203e40 into elastic:main Apr 5, 2024
46 of 47 checks passed
zeynepyz pushed a commit to zeynepyz/beats that referenced this pull request Apr 7, 2024
…tion (elastic#37640)

This adds a add_session_metadata processor to auditbeat which will enrich auditd process events with information needed to enable the Kibana session viewer on the events.

In this implementation, eBPF is used to collect information on all processes running on the system, which are added to a process database. When a process event is run through the processor, the DB will be read to retrieve information on the processes related to the process in the event (the processes's parent, session leader,process group leader and entry leader will be retrieved). Then the event will be enriched with the metadata on these related processes, which can enable use of session view on the data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auditbeat backport-skip Skip notification from the automated backport with mergify enhancement Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants