Suggestions for improving prediction workflow documentation on https://reat.readthedocs.io/ #35
Comments
Thanks for the feedback. The documentation has lagged behind development, and there are a few holes that need filling.
As described in reat/docs/modules/prediction/index.rst Line 26 in 3852eee
They need to be referenced in the evm weights files as When providing these via
the order provided will determine which run corresponds to AUGUSTUS_RUN1, AUGUSTUS_RUN2, etc. Please let us know of any other issues. |
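To make the ordering rule concrete, here is a minimal Python sketch. It assumes only what is stated above: the first run supplied becomes AUGUSTUS_RUN1, the second AUGUSTUS_RUN2, and so on. The helper name `label_augustus_runs` and the file names are hypothetical and are not part of reat.

```python
def label_augustus_runs(run_files):
    """Map AUGUSTUS_RUN<n> labels to run files, numbering from 1 in the order given."""
    return {f"AUGUSTUS_RUN{i}": path for i, path in enumerate(run_files, start=1)}


# Hypothetical file names, used only for illustration.
runs = label_augustus_runs(["first_run.cfg", "second_run.cfg"])
for label, path in runs.items():
    print(label, "<-", path)
```

Swapping the two file names swaps which run each label refers to, so the labels used in the EVM weights file have to follow the same order.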
@swarbred reat/annotation/prediction_module/main.wdl Line 560 in be8c679
I guess the reason behind this is that my genome has lots of scaffolds and contigs. I am not sure whether I can solve the problem by replacing this line with:
Do you have any suggestions? |
You are hitting the ARG_MAX limit. I was aware that we would likely need to change this (call-JoinAugustus is also a cat). Perhaps the easiest way is to use xargs
That should work, but we will make the change ASAP; we are very likely going to get the same error on the wheat annotation we are running. |
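For anyone meeting the same limit outside the workflow: ARG_MAX caps the total size of the arguments passed to a single command, so a `cat` over thousands of per-scaffold files on one command line can fail. The sketch below is not the change made in reat; it only illustrates the same idea as the xargs suggestion in Python, appending files one at a time so that no long argument list is ever built. The function name and paths are hypothetical.

```python
import shutil


def concatenate_files(input_paths, output_path):
    """Append each input file to output_path, in the order given, streaming the bytes."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Usage (hypothetical paths):
#   from pathlib import Path
#   concatenate_files(sorted(Path("augustus_out").glob("*.gff")), "joined.gff")
```

Because the file list lives in a Python list rather than on a command line, its length is limited only by memory.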
@dadrasarmin |
2nd location to update
|
@swarbred Best regards, |
One other thing that I forgot to mention: for the transcriptome and homology workflows, the "Configurable computational resources available" suggested in the documentation work completely fine. However, the one suggested for the prediction workflow causes an error, and the program stops immediately. Also, one computing resource is required to run the prediction workflow (ei_prediction.augustus_resources), but the documentation does not mention that it is obligatory. I changed the resource usage by changing the scripts instead of using
As a comparison we can see the difference between homology: reat/annotation/homology_module/main.wdl Lines 25 to 28 in c8c680f
The transcriptome workflow:
And Prediction workflow: reat/annotation/prediction_module/main.wdl Line 568 in c8c680f
I guess the problem could be solved by substituting RuntimeAttr? resources with subprocesses names like RuntimeAttr? ExecuteEVMCommand.resources
|
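Since a missing required resource entry stops the run, a quick pre-flight check on the compute inputs JSON can help. This is a sketch under stated assumptions: the only key taken from this thread is `ei_prediction.augustus_resources`; the helper name is hypothetical, and the field inside the example object is a placeholder, not a real reat field name.

```python
import json

# The only required key named in this thread.
REQUIRED_KEYS = ["ei_prediction.augustus_resources"]


def missing_resource_keys(json_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent from a compute inputs JSON document."""
    config = json.loads(json_text)
    return [key for key in required if key not in config]


# "placeholder_field" is illustrative only; consult the reat docs for real field names.
example = '{"ei_prediction.augustus_resources": {"placeholder_field": 1}}'
print(missing_resource_keys(example))
```

An empty list means every key in `REQUIRED_KEYS` is present; it says nothing about whether the values themselves are valid.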
For info, here is an example compute_inputs.json for prediction. We will check what is given in the documentation. The line reat/annotation/prediction_module/main.wdl Line 568 in c8c680f
looks correct to me as is. The doc shows
and this should be
|
Besides the
If I take a look at
Therefore, |
Hi @dadrasarmin, I appreciate your detailed checks. We are making the necessary changes as we speak and carrying out the tests before merging them. The |
Hi @gemygk, as I said in my first comment, I really appreciate what you have built, and the results I got recently are excellent. That is why I investigated the code in more detail and came up with a few suggestions. |
Thanks @dadrasarmin . It is the result of hard work from @ljyanesm and the team. Please do let us know if you have any issues. In the meantime, we will try to get these changes into the repo. |
Updated the documentation to resolve the following issue: EI-CoreBioinformatics#35
Updated the values as described by @swarbred in the following issue: EI-CoreBioinformatics#35
Add a clarification related to the issue: EI-CoreBioinformatics#35
Hi reat developers,
First, I want to thank you for developing this great tool. I participated in your workshop at EI, and I am very motivated to use reat. I struggled over the last few days to run the prediction workflow, and I have four suggestions for the documentation.
I thought I could pass one of the portcullis outputs to --introns; however, my run failed multiple times. After some investigation, I noticed I had to provide the gff3 file instead of the bed file. I think it would be nice to mention this in the documentation.
reat/annotation/prediction_module/main.wdl Line 1617 in be8c679
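A small check before launching can catch the bed/gff3 mix-up early. The sketch below is a heuristic, not something reat does: it accepts a file that declares `##gff-version 3`, or whose first data line has nine tab-separated columns with integer start and end coordinates in columns 4 and 5. The function name is hypothetical.

```python
def looks_like_gff3(path):
    """Heuristically decide whether a file is GFF3 rather than BED."""
    with open(path) as handle:
        for raw in handle:
            line = raw.rstrip("\r\n")
            if line.startswith("##gff-version"):
                parts = line.split()
                return len(parts) > 1 and parts[1].startswith("3")
            if not line or line.startswith("#"):
                continue  # skip blank lines and other comments
            fields = line.split("\t")
            # GFF3 data lines have 9 columns; columns 4 and 5 are start and end.
            return len(fields) == 9 and fields[3].isdigit() and fields[4].isdigit()
    return False
```

A BED12 junction file fails this test because its lines have twelve columns, so the wrong file can be rejected before it is passed to --introns.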
On the documentation page, in the section "Evidence Modeler default weights file", we can find an example like:
When I looked at call-EVM/execute/Scafold_name/evm.out.log, I noticed that
There are a few typos here, probably because some variable names changed during development. I am not sure what to do with the homology and transcriptome models, so I put them under the OTHER_PREDICTIONS tag for the moment:
hq_protein_alignment -> hq_protein
lq_protein_alignment -> lq_protein
lq_asssembly -> lq_assembly
TRANSCRIPT homology_models 10 -> OTHER_PREDICTIONS homology_models 10
TRANSCRIPT transcriptome_models 10 -> OTHER_PREDICTIONS transcriptome_models 10
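The renames listed above can be applied to an existing weights file mechanically. The following Python sketch assumes each entry is three whitespace-separated fields (class, source name, weight) and writes them back tab-separated. The tag is spelled exactly as in this issue, so check it against the EVM documentation before relying on it. The function and dictionary names are hypothetical.

```python
# Source-name fixes, copied from the list in this issue.
SOURCE_RENAMES = {
    "hq_protein_alignment": "hq_protein",
    "lq_protein_alignment": "lq_protein",
    "lq_asssembly": "lq_assembly",
}
# Class changes, copied from the list in this issue (tag spelled as written there).
CLASS_OVERRIDES = {
    "homology_models": "OTHER_PREDICTIONS",
    "transcriptome_models": "OTHER_PREDICTIONS",
}


def fix_weights_line(line):
    """Apply the renames to one weights-file line; leave anything else untouched."""
    fields = line.split()
    if len(fields) != 3 or line.lstrip().startswith("#"):
        return line
    evidence_class, source, weight = fields
    source = SOURCE_RENAMES.get(source, source)
    evidence_class = CLASS_OVERRIDES.get(source, evidence_class)
    return "\t".join([evidence_class, source, weight])


print(fix_weights_line("TRANSCRIPT homology_models 10"))
```

Lines that do not have exactly three fields, and comment lines, pass through unchanged, so the function can be mapped over a whole file.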
Strange warning/error for $EVM_HOME/EvmUtils/write_EVM_commands.pl:
For this tool, I think the EVM code is somehow broken. If I run just $EVM_HOME/EvmUtils/write_EVM_commands.pl, I can see that there is an option -S. However, if I run $EVM_HOME/EvmUtils/write_EVM_commands.pl -S, I get: Unknown option: S
I just changed the -S option in this line of the code and everything worked smoothly, without any warning/error.
reat/annotation/prediction_module/main.wdl Line 1504 in be8c679
In the "Configuring Augustus runs" section:
I found this line confusing.
reat/docs/modules/prediction/index.rst
Line 48 in 3852eee
In the last section of the same document, it is mentioned that we have to name our AUGUSTUS+hints predictions like AUGUSTUS_RUN1. However, here it is written in lower-case letters (augustus_run#). In my case, the job only proceeds successfully if I use capital letters in the EVM weights, in the file names, and likewise when calling reat with --augustus_runs AUGUSTUS_RUN1.
Best regards,
Armin