tag 1.1.1 fails with qiime time zone issue #114

Closed
marchoeppner opened this issue Dec 11, 2019 · 11 comments · Fixed by nf-core/configs#108

Comments

@marchoeppner

Running tag 1.1.1 on a Linux host with Singularity 3.3.0 produces the following error:

"Timezone offset does not match system offset: 0 != 3600. Please, check your config files."

This seems related to: https://forum.qiime2.org/t/error-while-running-qiime2-valueerror-timezone-offset-does-not-match-system-offset-0-25200-please-check-your-config-files/11418/11

Running v1.1.0 produces no such error, so maybe this issue was introduced when the versions were bumped?

@marchoeppner added the bug label (Something isn't working) on Dec 11, 2019
@apeltzer
Member

Super weird stuff. Does it help if you do an export TZ=... as written in the QIIME2 forum?

@d4straub
Collaborator

Same problem on cfc and binac!
export TZ='Europe/Berlin' didn't do the trick for the pipeline.
However, when I run singularity shell nfcore-ampliseq-1.1.1.img, the qiime import fails at first, and export TZ='Europe/Berlin' does help in that case.
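
For context, here is a minimal Python sketch of why exporting TZ can change the outcome inside the container shell: the variable changes the UTC offset that the process itself reports. The helper name local_offset_with_tz is hypothetical. The POSIX-style values "UTC0" and "CET-1" are used instead of 'Europe/Berlin' only so that the sketch does not depend on an installed time zone database. It assumes a Unix-like system, since time.tzset() is not available on Windows.

```python
import os
import time


def local_offset_with_tz(tz_value):
    """Return the local UTC offset in seconds while TZ is set to tz_value."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_value
    time.tzset()  # re-read TZ; Unix only
    try:
        return time.localtime().tm_gmtoff
    finally:
        # Restore the caller's setting so the sketch has no side effects.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


if __name__ == "__main__":
    # "UTC0" means UTC; "CET-1" is POSIX notation for a zone one hour east of UTC.
    for value in ("UTC0", "CET-1"):
        print(value, local_offset_with_tz(value))
```

A process that never receives the variable keeps whatever offset its own configuration files imply, which would fit the observation that the export helps in an interactive shell but not in the pipeline.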

@apeltzer
Member

Sorry to say this but this is utter nonsense behavior of QIIME2.

Asking for some ideas on nf-core/core now, maybe someone has an idea.

@apeltzer
Member

Ok plan for fix:

1.) We will add the TZ / envWhitelist setting for Binac to test this; @d4straub will figure out whether that helps in addition to exporting a proper time zone first.
2.) We'll open an issue in the QIIME2 forum for them to fix this. It's really not proper behaviour to require users to set this inside their containers for the sake of reproducibility.
3.) Once this is fine, @d4straub will add something to the troubleshooting section in the current dev branch, so that the next ampliseq release will have some info about this.
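
The idea behind step 1 can be sketched in Python as an analogy. This is not how Nextflow or Singularity implement it. It assumes the containerized process only sees environment variables that are explicitly passed through, so a TZ exported on the host has no effect unless it is on the list. The function names child_env and tz_seen_by_child are hypothetical.

```python
import os
import subprocess
import sys


def child_env(parent_env, whitelist):
    """Keep only the whitelisted variables from the parent environment."""
    return {key: value for key, value in parent_env.items() if key in whitelist}


def tz_seen_by_child(env):
    """Start a child Python process with `env` and report the TZ it sees."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ.get('TZ', 'unset'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    parent = dict(os.environ, TZ="Europe/Berlin")
    everything_but_tz = set(parent) - {"TZ"}
    # Without TZ on the whitelist the child never sees the exported value.
    print(tz_seen_by_child(child_env(parent, everything_but_tz)))
    # With TZ whitelisted the value is passed through.
    print(tz_seen_by_child(child_env(parent, everything_but_tz | {"TZ"})))
```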

@d4straub
Collaborator

1.) This failed on binac:

export TZ='Europe/Berlin'
nextflow pull nf-core/ampliseq
nextflow run nf-core/ampliseq -r 1.1.1 -profile test,binac,singularity

with: Timezone offset does not match system offset: 0 != 3600. Please, check your config files.

2.) Amended the existing issue on GitHub and referenced this one.

3.) Unfortunately not solved.

@d4straub
Collaborator

Weird, 1.1.1 works locally with nextflow run nf-core/ampliseq -r 1.1.1 -profile test,singularity

@thermokarst
Contributor

> Sorry to say this but this is utter nonsense behavior of QIIME2.

Hi @apeltzer - this isn't behavior from QIIME 2, rather, this exception is coming from the tzlocal package, and has to do with a misconfigured environment. This is apparently a fairly common issue for this package - please see regebro/tzlocal#79 for more detail. Thanks!

@apeltzer
Member

Thanks @thermokarst for the info on this matter! The question that arises for me here is why this was introduced in QIIME2: we never had trouble prior to QIIME2 v2019.07 (the 1.1.1 ampliseq release uses 2019.10 as a basis), and only now found out that this breaks the entire release on multiple systems (?).

Does "misconfigured environment" mean that multiple systems, both here and at a different site (as reported by @marchoeppner), are not configured correctly? I fear that is just an easy excuse by the tzlocal developer here, also as there is zero further information on what is misconfigured in the issue you linked above.

Don't get me wrong here: it's just weird that a tool that worked flawlessly for multiple projects now fails entirely on our clusters, which run thousands of jobs per day for multiple pipelines, frameworks, etc., just because the tzlocal Python package considers them "misconfigured"? 🤔

I think the issue you opened here is a good approach: qiime2/qiime2#510, to at least fail gracefully / warn the user within QIIME2. Thank you for that!

In the meantime: can we do anything to get release 1.1.1 running? We have tried multiple approaches, but couldn't find a solution so far.

To be honest about the tool: is this check something that is crucially necessary?

@thermokarst
Contributor

Hey @apeltzer, really sorry to hear you're dealing with this right now. For what it's worth, we have dealt with our fair share of secondary and tertiary dependency issues, never any fun!

> The question that arises to me here is why this was introduced in QIIME2 - we never had trouble prior QIIME2 v2019.07 (the 1.1.1 ampliseq release uses 2019.10 as a basis), but just now found out that this breaks the entire release on multiple systems (?).

This is an interesting piece of evidence --- nothing changed on our end between those two releases (tzlocal was added to Q2 over 3 years ago, and the version of tzlocal appears to be unchanged, too). Digging in a bit further on the regebro/tzlocal#79 issue, both of the "common reasons" listed there sound feasible to me. So, for the first one:

> the OS config has been updated, but you are running an old version of pytz.

Is it possible that something changed with the base image for your singularity container in this new problematic release? Did you change to a new base distro, or change to a different release of the distro?

> You are running under some sort of "almost virtualized" OS

This sounds to me like Singularity might fit into this category. For what it's worth, I have only heard of this timezone mismatch issue from QIIME 2 users who are using Singularity. Personally, I have no experience with Singularity, so I don't know how that problem might present itself, but I suspect that it is probably pretty easy for file-based config to clash with env-var based config here. Another bit of interesting evidence: I haven't yet seen this issue present itself in a vanilla Docker container...

> I fear that is just an easy excuse by the tzlocal developer here, also as there is zero further information what is misconfigured in the issue you linked above.

Yeah, that new issue is meant to be a discussion thread issue (apparently). I cruised through the tzlocal issue tracker for a while, sounds like this error message is intentional, and likely won't be removed, since it appears to exclusively reveal issues related to timezone config mismatches (for example, one timezone set in /etc, another set by TZ env var).
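
The kind of mismatch described above can be illustrated with a small Python sketch. This is not tzlocal's actual code, and the function names are hypothetical. It compares the UTC offset of a configured zone with the offset the system reports, and builds a message in the same shape as the error quoted in this issue.

```python
import time
from datetime import datetime, timedelta, timezone


def zone_offset_seconds(tz, when):
    """UTC offset, in seconds, of the tzinfo `tz` at the aware datetime `when`."""
    return int(when.astimezone(tz).utcoffset().total_seconds())


def system_offset_seconds():
    """UTC offset, in seconds, that the C library reports for local time now."""
    return time.localtime().tm_gmtoff


def offset_mismatch(tz_offset, system_offset):
    """Return a message if the two offsets disagree, otherwise None."""
    if tz_offset == system_offset:
        return None
    return (
        "Timezone offset does not match system offset: "
        f"{tz_offset} != {system_offset}."
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical scenario: the config files say UTC,
    # while the environment implies a zone one hour east of UTC.
    configured = zone_offset_seconds(timezone.utc, now)
    from_env = zone_offset_seconds(timezone(timedelta(hours=1)), now)
    print(offset_mismatch(configured, from_env))
    print("offset reported on this machine:", system_offset_seconds())
```

Under that reading, "0 != 3600" would mean that one source of configuration resolves to UTC while the other resolves to a zone one hour ahead of it.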

> In the meanwhile: Can we do anything to get release 1.1.1 running? We have tried multiple approaches, but couldn't find a solution so far.

Have you reached out to the tzlocal author on the thread I linked to above? @regebro appears to have started that thread for cases much like this one - for interactively debugging. It sounds like @regebro is familiar with Docker (based on some issues I read on their tracker), perhaps they are also familiar with singularity? I am happy to keep debugging with you in the meantime, too!

@d4straub
Collaborator

PR #116 should fix the pipeline. Depending on the system, --qiime_timezone may have to be specified!

@apeltzer
Member

> Hey @apeltzer, really sorry to hear you're dealing with this right now - for what it's worth, we have dealt with our fair share of secondary and tertiary dependency issues, never any fun!

I guess everyone in Bioinformatics has these issues from time to time ;-)

> > The question that arises to me here is why this was introduced in QIIME2 - we never had trouble prior QIIME2 v2019.07 (the 1.1.1 ampliseq release uses 2019.10 as a basis), but just now found out that this breaks the entire release on multiple systems (?).

> This is an interesting piece of evidence --- nothing changed on our end between those two releases (tzlocal was added to Q2 over 3 years ago, and the version of tzlocal appears to be unchanged, too). Digging in a bit further on the regebro/tzlocal#79 issue, both of the "common reasons" listed there sound feasible to me. So, for the first one:

> > the OS config has been updated, but you are running an old version of pytz.

> Is it possible that something changed with the base image for your singularity container in this new problematic release? Did you change to a new base distro, or change to a different release of the distro?

We use a predefined miniconda3 environment in our base image for all nf-core pipelines. The image we use in nf-core/tools is updated every now and then, and yes, there were most likely updates in between. The Docker image we are importing can be found here: https://hub.docker.com/r/continuumio/miniconda/tags

> > You are running under some sort of "almost virtualized" OS

> This sounds to me like singularity might fit into this category. For what it's worth, I have only heard of this timezone mismatch issue from QIIME 2 users who are using singularity. Personally, I have no experience with singularity, so I don't know how that problem might present itself, but I suspect that it is probably pretty easy for file-based config to clash with env-var based config here. Another bit of interesting evidence - I haven't yet seen this issue present itself in a vanilla docker container...

Our users run the pipeline with Singularity for security reasons: Docker is not allowed or enabled on most HPC systems, because a user who is able to run Docker can in principle escalate their permissions to root, which is of course undesired on a multi-tenant system such as an HPC cluster. They use the Singularity pull feature, which can fetch a Docker Hub image directly and create a Singularity image from it.

> > I fear that is just an easy excuse by the tzlocal developer here, also as there is zero further information what is misconfigured in the issue you linked above.

> Yeah, that new issue is meant to be a discussion thread issue (apparently). I cruised through the tzlocal issue tracker for a while, sounds like this error message is intentional, and likely won't be removed, since it appears to exclusively reveal issues related to timezone config mismatches (for example, one timezone set in /etc, another set by TZ env var).

> Have you reached out to the tzlocal author on the thread I linked to above? @regebro appears to have started that thread for cases much like this one - for interactively debugging. It sounds like @regebro is familiar with Docker (based on some issues I read on their tracker), perhaps they are also familiar with singularity? I am happy to keep debugging with you in the meantime, too!

I will do that now, I guess. We will release 1.1.2 soon to have a preliminary fix for this matter, but we still don't fully understand what the problem is.

Thanks a lot for your help / ideas / comments!
