Running the simulate_prod integration tests in parallel makes them fail.
The error messages are typically:
INFO::simulator(l450)::get_file_list::Getting list of hist files
Traceback (most recent call last):
  File "/workdir/env/bin/simtools-simulate-prod", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/workdir/external/simtools/simtools/applications/simulate_prod.py", line 201, in main
    pack_for_register(logger, simulator, args_dict)
  File "/workdir/external/simtools/simtools/applications/simulate_prod.py", line 166, in pack_for_register
    tar.add(file_to_tar, arcname=Path(file_to_tar).name)
  File "/usr/lib64/python3.11/tarfile.py", line 2171, in add
    tarinfo = self.gettarinfo(name, arcname)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/tarfile.py", line 2044, in gettarinfo
    statres = os.lstat(name)
              ^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-root/pytest-17/popen-gw2/test-data2/simtools-simulate-prod-gamma_20_deg_az0deg_south_check_output/simtools-output/simtel/logs/run000001_gamma_za20deg_azm000deg_South_alpha_check_output.hdata.zst'
Comparing the sim_telarray log files shows that sim_telarray was not executed correctly in the failing runs. The error above about missing histogram files therefore occurs because these files are never generated.
An inspection of the configuration files (for simtools, corsika, sim_telarray) and the run scripts does not reveal anything notable. Paths are consistently set to the correct temporary directory generated by pytest.
I propose to analyse whether sim_telarray uses the same temporary directories for all runs, which might interfere when the tests run in parallel.
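To make this kind of failure easier to read, the packing step could check for missing inputs before creating the tarball and report all of them at once. The following is only a minimal sketch, not the simtools implementation: the function name pack_existing_files and its arguments are hypothetical, and only the tar.add call mirrors the line shown in the traceback.

```python
import tarfile
from pathlib import Path


def pack_existing_files(tar_path, files_to_tar):
    """Pack files into a gzip tarball, failing early if any input is missing.

    Hypothetical helper for illustration; not part of simtools.
    """
    missing = [str(f) for f in files_to_tar if not Path(f).is_file()]
    if missing:
        # Report every missing file, since this usually means that an
        # upstream step (e.g. the simulation run) did not produce its output.
        raise FileNotFoundError(
            "Cannot create tarball, missing input files: " + ", ".join(missing)
        )
    with tarfile.open(tar_path, "w:gz") as tar:
        for file_to_tar in files_to_tar:
            # Same call as in the traceback: store only the file name.
            tar.add(file_to_tar, arcname=Path(file_to_tar).name)
    return tar_path
```

With such a check, the error would name all missing files up front and point at the failed simulation step rather than at tarfile internals.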
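One way to test this hypothesis could be to start each run with its own scratch directory and check whether the failures disappear. The sketch below is an assumption-laden illustration: run_isolated is a hypothetical helper, and whether sim_telarray (or the scripts that call it) actually honours the TMPDIR environment variable has not been verified here.

```python
import os
import subprocess
import tempfile


def run_isolated(cmd, base_dir):
    """Run cmd with TMPDIR pointing to a fresh directory below base_dir.

    Hypothetical helper for illustration; it only helps if the program
    being run takes its temporary directory from TMPDIR (an assumption).
    """
    # mkdtemp creates a uniquely named directory, so parallel runs never share it.
    scratch = tempfile.mkdtemp(prefix="run-", dir=base_dir)
    env = dict(os.environ, TMPDIR=scratch)
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)
    return scratch, result
```

If the parallel tests pass with isolated scratch directories but fail without them, that would support the idea of a shared temporary location; if they still fail, the cause lies elsewhere.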
There are two unrelated issues which pop up when running in parallel. One of them is related to creating the tarball and I think it is solved now in the fix_parallel_running branch. I will take a look at the other issue as well (at some point).
Actually, after merging main into fix_parallel_running (once #1137 had been merged to main), I can no longer reproduce the first problem. @GernotMaier, @tobiaskleiner, can you please check whether you can reproduce the problem in fix_parallel_running? If not, I will open a PR with this small fix.
In any case, I will look into the output files from productions in the future, because I want to make their names a bit more consistent.
Issue noticed by Tobias during work on #1137.