no qos driving to invalid qos specification #258
Comments
Hi, |
Hi @guipenedo, here is the content of the generated sbatch script: `#!/bin/bash #SBATCH --account=XXX(hiddenAccount)XXX #SBATCH --cpus-per-task=1 |
It seems that indeed there is no |
The server documentation says "do not try to specify any qos, it's done automatically". So, like me, you can't see any mention of qos other than what I already commented out in the two scripts I mentioned? I looked through the imported libraries just in case, but I couldn't find anything. (Many thanks for the help.)
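One way to double-check where a qos could still be coming from is to scan the generated launch script for `#SBATCH --qos` (or `-q`) directives and to look at the `SBATCH_QOS` environment variable, which sbatch documents as equivalent to `--qos`. The sketch below is only an illustration: the helper names `find_qos_directives` and `find_qos_env` are hypothetical and are not part of datatrove or Slurm.

```python
import os
import re

# Matches active directives such as "#SBATCH --qos=normal", "#SBATCH --qos normal"
# or "#SBATCH -q normal". Lines starting with "##SBATCH" are treated as commented out.
QOS_DIRECTIVE = re.compile(r"^\s*#SBATCH\s+(?:--qos(?:=|\s+)|-q\s+)(\S+)")


def find_qos_directives(script_text):
    """Return the qos values requested by active #SBATCH lines in a script (hypothetical helper)."""
    values = []
    for line in script_text.splitlines():
        match = QOS_DIRECTIVE.match(line)
        if match:
            values.append(match.group(1))
    return values


def find_qos_env(environ=None):
    """Return the value of SBATCH_QOS if it is set, else None (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return environ.get("SBATCH_QOS")


if __name__ == "__main__":
    # Example script text, invented for illustration only.
    example = "#!/bin/bash\n#SBATCH --cpus-per-task=1\n#SBATCH --qos=normal\n"
    print("qos directives:", find_qos_directives(example))
    print("SBATCH_QOS:", find_qos_env())
```

If both checks come back empty and the `sbatch` command line itself has no `--qos` flag, the qos is probably being chosen or rejected on the cluster side, which is a question for the admins.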
I think your error message can also mean that the specific combination of resources you are requesting is not allowed. I suggest you send the cluster admins your sbatch script and ask them whether they can spot any issues.
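While waiting for the admins, it may help to ask Slurm to validate the script without submitting it. sbatch has a `--test-only` option for this. The sketch below assumes `sbatch` is on the PATH of the machine where it runs; the function names and the example path are hypothetical, and the exact message the cluster prints will depend on its configuration.

```python
import shutil
import subprocess


def build_test_only_command(script_path):
    """Build the sbatch command that validates a script without submitting a job."""
    return ["sbatch", "--test-only", script_path]


def validate_script(script_path):
    """Run the validation and return (return code, combined output), or None without sbatch."""
    if shutil.which("sbatch") is None:
        # Not on a Slurm login node, so there is nothing to run.
        return None
    result = subprocess.run(
        build_test_only_command(script_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    # "launch_script.slurm" is a placeholder path for the generated launch script.
    print(validate_script("launch_script.slurm"))
```

The output of this check can be attached to the message sent to the cluster admins together with the script itself.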
Thank you for the advice. I contacted them yesterday, I'm waiting for an answer! |
Hi everyone,
I want to run deduplication, so for now I'm running tests with minhash_deduplication.py. I'm using a server where I need to add account and constraint info, so I added it to the script (modifying slurm.py as well). My problem now is that I cannot specify any qos on that server; it is set automatically.
I tried commenting out everything related to qos in those two scripts, but I still get this error:
2024-07-22 21:46:08.585 | INFO | datatrove.executor.slurm:launch_job:235 - Launching dependency job "mh3"
2024-07-22 21:46:08.585 | INFO | datatrove.executor.slurm:launch_job:235 - Launching dependency job "mh2"
2024-07-22 21:46:08.585 | INFO | datatrove.executor.slurm:launch_job:235 - Launching dependency job "mh1"
2024-07-22 21:46:08.591 | INFO | datatrove.executor.slurm:launch_job:270 - Launching Slurm job mh1 (1 tasks) with launch script "/lus/work/CT10/lig3801/sevain/try_datatrove//signatures/launch_script.slurm"
sbatch: error: INFO : As you didn't ask threads_per_core in your request: 2 was taken as default
sbatch: error: INFO : As you didn't ask ntasks or ntasks_per-node in your request, 1 task was taken as default
sbatch: error: Batch job submission failed: Invalid qos specification
Traceback (most recent call last):
  File "/lus/work/CT10/lig3801/sevain/try_datatrove/./minhash_deduplication.py", line 116, in <module>
    stage4.run()
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 188, in run
    self.launch_job()
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 236, in launch_job
    self.depends.launch_job()
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 236, in launch_job
    self.depends.launch_job()
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 236, in launch_job
    self.depends.launch_job()
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 283, in launch_job
    self.job_id = launch_slurm_job(launch_file_contents, *args)
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/site-packages/datatrove/executor/slurm.py", line 375, in launch_slurm_job
    return subprocess.check_output(["sbatch", *args, f.name]).decode("utf-8").split()[-1]
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/lus/home/CT10/lig3801/sevain/.conda/envs/datatrove/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sbatch', '--export=NONE,RUN_OFFSET=0', '/tmp/tmpnif55bvt']' returned non-zero exit status 1.
How can I have a qos problem when I am not supposed to specify one?
It's driving me insane. If anyone could provide any help, I would be grateful!
Thanks