Fix test cluster performance #4780

Merged
davidjiglesias merged 1 commit into 4.8.0 from feature/4298-fix-performance-test
Feb 8, 2024

Conversation

javiersanchz

Related issue
#4298

Description

performance/test_cluster/test_cluster_performance/test_cluster_performance.py::test_cluster_performance FAILED

>           pytest.fail(f"Stats could not be retrieved, '{artifacts_path}' path may not exist, it is empty or it may not"
                        f" follow the proper structure.")
E           Failed: Stats could not be retrieved, '/mnt/efs/tmp/CLUSTER-Workload_benchmarks_metrics/B_263' path may not exist, it is empty or it may not follow the proper structure.

Once the changes were applied:

Test_cluster_performance
(qa) wazuh@javier:~/Git/wazuh-qa/wazuh-qa/tests/performance/test_cluster$ python3 -m pytest test_cluster_performance --artifacts_path='/home/wazuh/Descargas/artifacts' --n_workers=25 --n_agents=50000 --html=report.html --self-contained-html
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.9.16, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/wazuh/Git/wazuh-qa/wazuh-qa/tests, configfile: pytest.ini
plugins: metadata-2.0.4, testinfra-5.0.0, html-3.1.1
collected 1 item                                                                                                                                                                                           

test_cluster_performance/test_cluster_performance.py F                                                                                                                                               [100%]

================================================================================================= FAILURES =================================================================================================
_________________________________________________________________________________________ test_cluster_performance _________________________________________________________________________________________

artifacts_path = '/home/wazuh/Descargas/artifacts', n_workers = 25, n_agents = 50000

    def test_cluster_performance(artifacts_path, n_workers, n_agents):
        """Check that a cluster environment did not exceed certain thresholds.
    
        This test obtains various statistics (mean, max, regression coefficient) from CSVs with
        data generated in a cluster environment (resources used and duration of tasks). These
        statistics are compared with thresholds established in the data folder.
    
        Args:
            artifacts_path (str): Path where CSVs with cluster information can be found.
            n_workers (int): Number of workers folders that are expected inside the artifacts path.
            n_agents (int): Number of agents in the cluster environment.
        """
        if None in (artifacts_path, n_workers, n_agents):
            pytest.fail("Parameters '--artifacts_path=<path> --n_workers=<n_workers> --n_agents=<n_agents>' are required.")
    
        # Check if there are threshold data for the specified number of workers and agents.
        selected_conf = f"{n_workers}w_{n_agents}a"
        if selected_conf not in configurations:
            pytest.fail(f"This is not a supported configuration: {selected_conf}. "
                        f"Supported configurations are: {', '.join(configurations.keys())}.")
    
        # Check if path exists and if expected number of workers matches what is found inside artifacts.
        try:
            cluster_info = ClusterEnvInfo(artifacts_path).get_all_info()
        except FileNotFoundError:
            pytest.fail(f"Path '{artifacts_path}' could not be found or it may not follow the proper structure.")
    
        if cluster_info.get('worker_nodes', 0) != int(n_workers):
            pytest.fail(f"Information of {n_workers} workers was expected inside the artifacts folder, but "
                        f"{cluster_info.get('worker_nodes', 0)} were found.")
    
        # Calculate stats from data inside artifacts path.
        data = {'tasks': ClusterCSVTasksParser(artifacts_path).get_stats(),
                'resources': ClusterCSVResourcesParser(artifacts_path).get_stats()}
    
        if not data['tasks'] or not data['resources']:
            pytest.fail(f"Stats could not be retrieved, '{artifacts_path}' path may not exist, it is empty or it may not"
                        f" follow the proper structure.")
    
        # Compare each stat with its threshold.
        for data_name, data_stats in data.items():
            for phase, files in data_stats.items():
                for file, columns in files.items():
                    for column, nodes in columns.items():
                        for node_type, stats in nodes.items():
                            for stat, value in stats.items():
                                th_value = configurations[selected_conf][data_name][phase][file][column][node_type][stat]
                                if value[1] >= th_value:
                                    exceeded_thresholds.append({'value': value[1], 'threshold': th_value, 'stat': stat,
                                                                'column': column, 'node': value[0], 'file': file,
                                                                'phase': phase})
    
        try:
>           assert not exceeded_thresholds, 'Some thresholds were exceeded:\n- ' + '\n- '.join(
                '{stat} {column} {value} >= {threshold} ({node}, {file}, {phase})'.format(**item) for item in
                exceeded_thresholds)
E               AssertionError: Some thresholds were exceeded:
E                 - reg_cof FD 0.023233327721228512 >= 0.02 (worker_8, wazuh-clusterd, setup_phase)
E                 - mean FD 117.15853658536585 >= 103.4 (master, wazuh-clusterd, setup_phase)
E                 - reg_cof FD 0.5827280708755685 >= 0.33 (worker_16, wazuh-clusterd, stable_phase)
E                 - mean FD 70.8 >= 59 (master, wazuh-clusterd, stable_phase)
E                 - max FD 120.0 >= 70.5 (master, wazuh-clusterd, stable_phase)
E               assert not [{'column': 'FD', 'file': 'wazuh-clusterd', 'node': 'worker_8', 'phase': 'setup_phase', ...}, {'column': 'FD', 'file':...ase': 'stable_phase', ...}, {'column': 'FD', 'file': 'wazuh-clusterd', 'node': 'master', 'phase': 'stable_phase', ...}]

test_cluster_performance/test_cluster_performance.py:85: AssertionError
------------------------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------------------------

Setup phase took 0:20:03s (2023/07/06 14:54:59 - 2023/07/06 15:15:02).
Stable phase took 0:04:52s (2023/07/06 15:15:02 - 2023/07/06 15:19:54).
------------------------------------------------- generated html file: file:///home/wazuh/Git/wazuh-qa/wazuh-qa/tests/performance/test_cluster/report.html -------------------------------------------------
========================================================================================= short test summary info ==========================================================================================
FAILED test_cluster_performance/test_cluster_performance.py::test_cluster_performance - AssertionError: Some thresholds were exceeded:
============================================================================================ 1 failed in 0.73s =============================================================================================
  • The failure in the output above is now expected: the test reaches the threshold comparison and fails because certain thresholds were exceeded, as described in the test docstring.
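
For readers following the traceback, the threshold comparison can be sketched in isolation. This is a minimal sketch, not the test's actual code: the helper names `find_exceeded_thresholds` and `format_exceeded` are hypothetical, the `'resources'` key for the FD column is an assumption, and the `reg_cof` entry is invented to show a stat that stays under its threshold. The `mean` and `max` values are taken from the output above.

```python
def find_exceeded_thresholds(data, thresholds):
    """Return one dict per stat whose value is >= its threshold.

    Each leaf of `data` is assumed to be a (node_name, value) tuple,
    matching the value[0] / value[1] usage in the traceback.
    """
    exceeded = []
    for data_name, data_stats in data.items():
        for phase, files in data_stats.items():
            for file, columns in files.items():
                for column, nodes in columns.items():
                    for node_type, stats in nodes.items():
                        for stat, (node, value) in stats.items():
                            th_value = thresholds[data_name][phase][file][column][node_type][stat]
                            if value >= th_value:
                                exceeded.append({'value': value, 'threshold': th_value, 'stat': stat,
                                                 'column': column, 'node': node, 'file': file,
                                                 'phase': phase})
    return exceeded


def format_exceeded(exceeded):
    """Format each exceeded threshold the same way the assertion message does."""
    return ['{stat} {column} {value} >= {threshold} ({node}, {file}, {phase})'.format(**item)
            for item in exceeded]


# 'mean' and 'max' come from the report above; 'reg_cof' is a made-up passing value.
data = {'resources': {'stable_phase': {'wazuh-clusterd': {'FD': {'master': {
    'mean': ('master', 70.8), 'max': ('master', 120.0), 'reg_cof': ('master', 0.1)}}}}}}
thresholds = {'resources': {'stable_phase': {'wazuh-clusterd': {'FD': {'master': {
    'mean': 59, 'max': 70.5, 'reg_cof': 0.33}}}}}}

exceeded = find_exceeded_thresholds(data, thresholds)
for line in format_exceeded(exceeded):
    print(line)
```

Because the comparison is `>=`, a value exactly equal to its threshold is also reported as exceeded.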

@javiersanchz javiersanchz self-assigned this Dec 19, 2023
@GGP1 GGP1 self-requested a review December 19, 2023 14:46

@GGP1 GGP1 left a comment

Even though the PR description contains a test that runs without issues, the path was passed manually and we cannot assert that the workload benchmark metrics pipeline is doing that correctly.

The error indicated that the artifacts_path was incorrect, something that wasn't changed here. So we should run the pipeline and validate that the script is executed as expected.

Also, please update the changelog and commit messages to comply with the check requirements.
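
The error message quoted in the description covers three different causes (path missing, path empty, wrong structure). A minimal sketch of how a pipeline step might tell the first two apart before running the test follows. The function name `describe_artifacts_path` and its return labels are hypothetical and not part of wazuh-qa; the expected folder structure is not described in this thread, so it is not checked here.

```python
from pathlib import Path


def describe_artifacts_path(artifacts_path):
    """Classify an artifacts path as 'missing', 'not_a_directory', 'empty' or 'ok'.

    Hypothetical helper: it only checks existence and emptiness, not the
    layout expected by the CSV parsers.
    """
    path = Path(artifacts_path)
    if not path.exists():
        return 'missing'
    if not path.is_dir():
        return 'not_a_directory'
    # iterdir() is lazy, so any() stops at the first entry found.
    if not any(path.iterdir()):
        return 'empty'
    return 'ok'


if __name__ == '__main__':
    # Path taken from the failure in the PR description.
    print(describe_artifacts_path('/mnt/efs/tmp/CLUSTER-Workload_benchmarks_metrics/B_263'))
```

Running a check like this inside the pipeline, rather than passing the path by hand, would show which of the causes applies to the path the pipeline actually builds.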


@GGP1 GGP1 left a comment

Even though there were no changes to the pipeline where the error occurred, the team decided that running the performance test manually was enough.

Please modify the commits so they comply with the convention.

  • "Change path syntax artifacts_path" should be "fix: Change path syntax artifacts_path"
  • "Add changes to CHANGELOG" should be "docs: Add changes to CHANGELOG"
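
The convention requested above can be sketched as a small check on the commit subject line. This is an illustrative sketch only: `check_commit_subject` is a hypothetical helper, and the allowed types are limited to the two that appear in this thread (`fix` and `docs`); the repository's actual check may accept more types or apply further rules.

```python
import re

# Only the types mentioned in this review are listed; this is an assumption,
# not the repository's real configuration.
ALLOWED_TYPES = ('fix', 'docs')

# "<type>: <subject>", with a lowercase type and a non-empty subject.
SUBJECT_PATTERN = re.compile(r'^(?P<type>[a-z]+): (?P<subject>\S.*)$')


def check_commit_subject(subject, allowed=ALLOWED_TYPES):
    """Return True if the subject line looks like '<type>: <subject>'."""
    match = SUBJECT_PATTERN.match(subject)
    return match is not None and match.group('type') in allowed


if __name__ == '__main__':
    for subject in ('Change path syntax artifacts_path',
                    'fix: Change path syntax artifacts_path',
                    'Add changes to CHANGELOG',
                    'docs: Add changes to CHANGELOG'):
        print(subject, '->', check_commit_subject(subject))
```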

GGP1 previously approved these changes Dec 22, 2023

@GGP1 GGP1 left a comment

LGTM


@Selutario Selutario left a comment

Looks good. However, the release type for this issue is patch, so this PR should target the 4.8.2 branch instead.

@javiersanchz javiersanchz force-pushed the feature/4298-fix-performance-test branch from 756f33d to a89eb90 Compare January 26, 2024 11:19
@javiersanchz javiersanchz changed the base branch from master to 4.8.2 January 26, 2024 11:20
Selutario previously approved these changes Jan 26, 2024
@javiersanchz javiersanchz force-pushed the feature/4298-fix-performance-test branch from a89eb90 to 8595425 Compare January 29, 2024 16:31
@javiersanchz javiersanchz changed the base branch from 4.8.2 to 4.8.0 January 29, 2024 16:31
@javiersanchz javiersanchz force-pushed the feature/4298-fix-performance-test branch from 8595425 to aa4a27a Compare January 29, 2024 17:36
@davidjiglesias davidjiglesias merged commit 578027a into 4.8.0 Feb 8, 2024
4 checks passed
@davidjiglesias davidjiglesias deleted the feature/4298-fix-performance-test branch February 8, 2024 14:19
Successfully merging this pull request may close these issues.

Performance test is not working in Workload benchmark test
5 participants