Integration tests for simulate_prod fail when running in parallel #1209

Open
GernotMaier opened this issue Oct 15, 2024 · 2 comments · May be fixed by #1211

Comments

@GernotMaier
Contributor

Issue noticed by Tobias during work on #1137.

Running the simulate_prod integration tests in parallel makes them fail.

The error messages are typically:

INFO::simulator(l450)::get_file_list::Getting list of hist files
Traceback (most recent call last):
  File "/workdir/env/bin/simtools-simulate-prod", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/workdir/external/simtools/simtools/applications/simulate_prod.py", line 201, in main
    pack_for_register(logger, simulator, args_dict)
  File "/workdir/external/simtools/simtools/applications/simulate_prod.py", line 166, in pack_for_register
    tar.add(file_to_tar, arcname=Path(file_to_tar).name)
  File "/usr/lib64/python3.11/tarfile.py", line 2171, in add
    tarinfo = self.gettarinfo(name, arcname)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/tarfile.py", line 2044, in gettarinfo
    statres = os.lstat(name)
              ^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-root/pytest-17/popen-gw2/test-data2/simtools-simulate-prod-gamma_20_deg_az0deg_south_check_output/simtools-output/simtel/logs/run000001_gamma_za20deg_azm000deg_South_alpha_check_output.hdata.zst'

Comparing the sim_telarray log files shows that sim_telarray did not execute correctly in the failing runs. The error above about missing histogram files therefore occurs because these files were never generated.
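
As an illustration of the failure mode in the traceback, here is a minimal sketch of a packing step that checks for missing files before calling `tar.add`, so that a failed upstream run is reported by name instead of surfacing as a `FileNotFoundError` from `os.lstat`. The function name `pack_existing_files` and its arguments are hypothetical and are not taken from simtools.

```python
import tarfile
from pathlib import Path


def pack_existing_files(tar_path, files_to_tar):
    """Add only existing files to a gzip tarball and return the missing ones.

    Hypothetical helper for illustration; not part of simtools.
    """
    # Collect files that were never generated (e.g. by a failed upstream run).
    missing = [str(f) for f in files_to_tar if not Path(f).is_file()]
    with tarfile.open(tar_path, "w:gz") as tar:
        for f in files_to_tar:
            if Path(f).is_file():
                # Store each file under its base name, as in the traceback above.
                tar.add(f, arcname=Path(f).name)
    return missing
```

A caller could log the returned list, or raise an error that names the run whose outputs are absent.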

An inspection of the configuration files (for simtools, corsika, sim_telarray) and the run scripts does not reveal anything notable. Paths are consistently set everywhere to the correct temporary directory generated by pytest.

I propose to analyse whether sim_telarray uses temporary directories that are shared by all runs, which might interfere when running in parallel.
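
One way to probe this hypothesis is to launch each run with its own private temporary directory and see whether the parallel failures disappear. The sketch below does this through the `TMPDIR` environment variable. Whether sim_telarray honors `TMPDIR` at all is an assumption that has not been verified here, and `run_with_private_tmpdir` is a hypothetical helper name.

```python
import os
import subprocess
import tempfile


def run_with_private_tmpdir(command):
    """Run a command with TMPDIR set to a directory used by this run only.

    Assumption: the launched tool places its scratch files under TMPDIR.
    """
    with tempfile.TemporaryDirectory(prefix="run-tmp-") as private_tmp:
        # Copy the current environment and override TMPDIR for the child only.
        env = dict(os.environ, TMPDIR=private_tmp)
        result = subprocess.run(
            command, env=env, capture_output=True, text=True, check=False
        )
    # The private directory is removed once the command has finished.
    return result
```

If the failures persist with isolated directories, shared scratch space under `TMPDIR` is unlikely to be the cause, although hard-coded paths would not be covered by this check.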

@orelgueta
Contributor

There are two unrelated issues that show up when running in parallel. One of them is related to creating the tarball, and I think it is now solved in the fix_parallel_running branch. I will take a look at the other issue as well (at some point).

@orelgueta
Contributor

Actually, after merging main into fix_parallel_running (once #1137 had been merged to main), I cannot reproduce the first problem. @GernotMaier, @tobiaskleiner, can you please check whether you can reproduce the problem in fix_parallel_running? If not, I will open a PR with this small fix.

In any case, I will look into the output files from productions in the future, because I want to make their names a bit more consistent.

2 participants