
issue with tensorflow #76

Closed · corkdagga opened this issue Apr 16, 2024 · 13 comments

@corkdagga

Hi,

I installed deepconsensus[cpu]==1.2.0 using pip within a conda environment (I do not have sudo privileges, so I cannot install from source).

I installed using "conda install deepconsensus[cpu]=1.2.0 python==3.9" to get around the installation error described in issue #69.

The installation itself worked, but when I run deepconsensus I get the following error:

2024-04-16 13:26:55.972174: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-04-16 13:26:56.053962: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-16 13:26:56.722497: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-16 13:26:56.727045: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-16 13:26:59.776981: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:

TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).

For more information see: tensorflow/addons#2807

warnings.warn(
Traceback (most recent call last):
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/bin/deepconsensus", line 8, in <module>
    sys.exit(run())
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/cli.py", line 118, in run
    app.run(main, flags_parser=parse_flags)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/cli.py", line 103, in main
    app.run(quick_inference.main, argv=passed)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 977, in main
    outcome_counter = run()
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 803, in run
    params = model_utils.read_params_from_json(checkpoint_path=FLAGS.checkpoint)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/models/model_utils.py", line 444, in read_params_from_json
    json.load(tf.io.gfile.GFile(json_path, 'r'))
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
    self._preread_check()
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: model/params.json; No such file or directory

I tried updating tensorflow to version 2.13.0, but that didn't fix the problem. I have googled a lot, and it seems to be a common problem, but I could not find any solution so far.

Any help getting this problem solved would be great.

@pichuan (Collaborator) commented Apr 18, 2024

Hi @corkdagga, from the log above, it seems like you might be missing this file:

    self._read_buf = _pywrap_file_io.BufferedInputStream(
    tensorflow.python.framework.errors_impl.NotFoundError: model/params.json; No such file or directory

Can you first check whether you have that file?
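
For example (a quick sketch; model/ here is whatever path you pass to --checkpoint, resolved relative to the directory you launch deepconsensus from):

    # params.json must sit next to the checkpoint files
    ls -l model/params.json model/checkpoint.index model/checkpoint.data-00000-of-00001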

@corkdagga (Author) commented Apr 25, 2024

Hi Pichuan,

Sorry, I don't have much bioinformatics experience. Could you please let me know where I might be able to find the file?

@pichuan (Collaborator) commented Apr 28, 2024

Hi @corkdagga ,
Have you followed the steps on https://github.com/google/deepconsensus/blob/r1.2/docs/quick_start.md ?

This section has the path of the model, including that file: https://github.com/google/deepconsensus/blob/r1.2/docs/quick_start.md#download-example-data

Let me know if that works!
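
Roughly, the download step there looks like this (the exact gs:// path is listed in the linked section; the placeholder below is illustrative only):

    # fetch the model files named in the quick start into a local model/ directory
    gsutil cp -r gs://<model-path-from-quick-start>/* model/
    ls model/  # expect: checkpoint.data-00000-of-00001, checkpoint.index, params.json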

@corkdagga (Author) commented Apr 30, 2024

Hi @pichuan,

In the 'Quick Start for DeepConsensus' document I followed the steps for running ccs and actc. I cannot use Docker, so I skipped the Docker steps: I am running on an HPC where Docker is not available as a module, so I do not think it is possible for me to install it. Instead, I installed DeepConsensus, ccs, and actc independently and ran both tools with the settings given in the 'Quick Start for DeepConsensus' to generate the files needed.

I am unsure how I can follow the rest of the steps without Docker...

For the model: I was able to follow the steps and have now downloaded the model, but I am unsure how to get DeepConsensus to find it. Where should I place the following files:

n1000.subreads.bam
model/checkpoint.data-00000-of-00001
model/checkpoint.index
model/params.json

@pichuan (Collaborator) commented Apr 30, 2024

Hi @corkdagga ,
In this step: https://github.com/google/deepconsensus/blob/r1.2/docs/quick_start.md#run-deepconsensus

It shows that you can use:

  --checkpoint=model/checkpoint \

Let me know if that works for you.

@corkdagga (Author)

Hi again @pichuan

I made some progress, but it was still unsuccessful. I used the following command; the result is below. The output was far too large to copy in full, so I have copied only the last page:

srun deepconsensus run --subreads_to_ccs=PD049.CCS.actc.bam --ccs_bam=m54089_200615_125054.CCS.bam --checkpoint=/data/horse/ws/pada358b-genome_assembly/DC_model/model/checkpoint.index --output=PD049pacbio.output_DC.fastq

[... many pages of <tf.Variable ...> weight dumps for the transformer encoder stack (query/key/value and output-transform kernels, feed-forward kernels and biases for layers 3-4), ending with:]

<tf.Variable 'encoder_only_learned_values_transformer/Transformer/encode/encoder_stack/layer_4/ffn/pre_post_processing_wrapper_9/feed_forward_network_4/output_layer/bias:0' shape=(280,) dtype=float32, numpy=array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)>: ['encoder_only_learned_values_transformer/Transformer/encode/encoder_stack/layer_4/ffn/pre_post_processing_wrapper_9/feed_forward_network_4/output_layer/bias']

srun: error: n1491: task 0: Exited with exit code 1

@pichuan (Collaborator) commented May 3, 2024

Hi @corkdagga,

In your update, you said you used --checkpoint=/data/horse/ws/pada358b-genome_assembly/DC_model/model/checkpoint.index

Can you try using only the prefix, as https://github.com/google/deepconsensus/blob/r1.2/docs/quick_start.md#run-deepconsensus suggests? That is: --checkpoint=/data/horse/ws/pada358b-genome_assembly/DC_model/model/checkpoint

I don't think it will work if you pass in the .index file.

Thanks!
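
(In other words, TensorFlow checkpoints are addressed by a shared prefix rather than a single file. A sketch, assuming the model directory from your earlier message:

    ls /data/horse/ws/pada358b-genome_assembly/DC_model/model/
    # checkpoint.data-00000-of-00001  checkpoint.index  params.json
    # pass the shared prefix, not one of the concrete files:
    # --checkpoint=/data/horse/ws/pada358b-genome_assembly/DC_model/model/checkpoint
)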

@corkdagga (Author) commented May 3, 2024

Hi @pichuan

DeepConsensus ran successfully for a while, but unfortunately the following error then appeared:

I0503 10:20:37.394586 140737354053440 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.321 seconds
I0503 10:20:37.467836 140737354053440 quick_inference.py:931] Processed 47000 ZMWs in 2796.068 seconds
I0503 10:20:42.877036 140737354053440 quick_inference.py:693] Example summary: ran model=30 (4.73%; 0.168s) skip=604 (95.27%; 0.033s) total=634.
I0503 10:20:42.886625 140737354053440 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.193 seconds
I0503 10:20:42.926704 140737354053440 quick_inference.py:931] Processed 47100 ZMWs in 2801.527 seconds
I0503 10:20:48.733838 140737354053440 quick_inference.py:693] Example summary: ran model=117 (18.34%; 0.406s) skip=521 (81.66%; 0.030s) total=638.
I0503 10:20:48.742551 140737354053440 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.441 seconds
I0503 10:20:48.783353 140737354053440 quick_inference.py:931] Processed 47200 ZMWs in 2807.384 seconds
Traceback (most recent call last):
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/bin/deepconsensus", line 8, in <module>
    sys.exit(run())
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/cli.py", line 118, in run
    app.run(main, flags_parser=parse_flags)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/cli.py", line 103, in main
    app.run(quick_inference.main, argv=passed)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 977, in main
    outcome_counter = run()
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 912, in run
    for zmw, subreads, dc_config, window_widths in input_file_generator:
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 480, in stream_bam
    for input_data in proc_feeder():
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 1309, in proc_feeder
    for read_set in subread_grouper:
  File "/data/horse/ws/pada358b-genome_assembly/conda/envs/DCpy39/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 73, in __next__
    read = next(self.bam_reader)
  File "pysam/libcalignmentfile.pyx", line 1876, in pysam.libcalignmentfile.AlignmentFile.__next__
OSError: error -3 while reading file
srun: error: n1555: task 0: Exited with exit code 1

@corkdagga (Author) commented May 7, 2024

Hi again @pichuan

A small update regarding the message above.

I reinstalled deepconsensus locally on a new computer with a GPU. I installed it with pip (deepconsensus[gpu]==1.2.0) and everything seemed to go OK (I did, however, have difficulty installing with Docker).

Anyway, it installed correctly with pip, and I again ran DeepConsensus on the same data as before. However, I again received an error at exactly the same position as last time, after processing 47200 ZMWs. Therefore I suspect the error is most likely in my input files. Do you agree, and do you know how I could check and then fix that? Below is a portion of the output from the new run.

Thanks!

I0507 08:57:24.521475 128752640931648 quick_inference.py:693] Example summary: ran model=53 (8.48%; 0.150s) skip=572 (91.52%; 0.047s) total=625.
I0507 08:57:24.533131 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.262 seconds
I0507 08:57:24.570044 128752640931648 quick_inference.py:931] Processed 46100 ZMWs in 2692.503 seconds
I0507 08:57:30.603274 128752640931648 quick_inference.py:693] Example summary: ran model=80 (12.58%; 0.224s) skip=556 (87.42%; 0.091s) total=636.
I0507 08:57:30.621727 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.502 seconds
I0507 08:57:30.665697 128752640931648 quick_inference.py:931] Processed 46200 ZMWs in 2698.599 seconds
I0507 08:57:36.891413 128752640931648 quick_inference.py:693] Example summary: ran model=78 (11.42%; 0.224s) skip=605 (88.58%; 0.054s) total=683.
I0507 08:57:36.908673 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.544 seconds
I0507 08:57:36.957104 128752640931648 quick_inference.py:931] Processed 46300 ZMWs in 2704.890 seconds
I0507 08:57:43.048226 128752640931648 quick_inference.py:693] Example summary: ran model=144 (21.69%; 0.258s) skip=520 (78.31%; 0.059s) total=664.
I0507 08:57:43.059175 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.392 seconds
I0507 08:57:43.107503 128752640931648 quick_inference.py:931] Processed 46400 ZMWs in 2711.041 seconds
I0507 08:57:48.900568 128752640931648 quick_inference.py:693] Example summary: ran model=75 (11.65%; 0.237s) skip=569 (88.35%; 0.048s) total=644.
I0507 08:57:48.911521 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.338 seconds
I0507 08:57:48.953398 128752640931648 quick_inference.py:931] Processed 46500 ZMWs in 2716.886 seconds
I0507 08:57:55.150791 128752640931648 quick_inference.py:693] Example summary: ran model=63 (10.77%; 0.163s) skip=522 (89.23%; 0.042s) total=585.
I0507 08:57:55.160457 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.324 seconds
I0507 08:57:55.208146 128752640931648 quick_inference.py:931] Processed 46600 ZMWs in 2723.141 seconds
I0507 08:58:02.075521 128752640931648 quick_inference.py:693] Example summary: ran model=80 (12.12%; 0.264s) skip=580 (87.88%; 0.076s) total=660.
I0507 08:58:02.086779 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.637 seconds
I0507 08:58:02.146010 128752640931648 quick_inference.py:931] Processed 46700 ZMWs in 2730.079 seconds
I0507 08:58:07.874739 128752640931648 quick_inference.py:693] Example summary: ran model=70 (11.25%; 0.184s) skip=552 (88.75%; 0.053s) total=622.
I0507 08:58:07.886029 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.266 seconds
I0507 08:58:07.928063 128752640931648 quick_inference.py:931] Processed 46800 ZMWs in 2735.861 seconds
I0507 08:58:14.496442 128752640931648 quick_inference.py:693] Example summary: ran model=120 (17.14%; 0.284s) skip=580 (82.86%; 0.049s) total=700.
I0507 08:58:14.508665 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.474 seconds
I0507 08:58:14.554958 128752640931648 quick_inference.py:931] Processed 46900 ZMWs in 2742.488 seconds
I0507 08:58:20.421319 128752640931648 quick_inference.py:693] Example summary: ran model=48 (7.78%; 0.175s) skip=569 (92.22%; 0.060s) total=617.
I0507 08:58:20.439472 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.281 seconds
I0507 08:58:20.486044 128752640931648 quick_inference.py:931] Processed 47000 ZMWs in 2748.419 seconds
I0507 08:58:25.846841 128752640931648 quick_inference.py:693] Example summary: ran model=30 (4.73%; 0.131s) skip=604 (95.27%; 0.049s) total=634.
I0507 08:58:25.857717 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.219 seconds
I0507 08:58:25.896061 128752640931648 quick_inference.py:931] Processed 47100 ZMWs in 2753.829 seconds
I0507 08:58:31.649722 128752640931648 quick_inference.py:693] Example summary: ran model=117 (18.34%; 0.252s) skip=521 (81.66%; 0.043s) total=638.
I0507 08:58:31.660274 128752640931648 quick_inference.py:770] Processed a batch of 100 ZMWs in 1.349 seconds
I0507 08:58:31.704504 128752640931648 quick_inference.py:931] Processed 47200 ZMWs in 2759.638 seconds
Traceback (most recent call last):
  File "/home/gulderlab/miniconda3/envs/DC/bin/deepconsensus", line 8, in <module>
    sys.exit(run())
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/cli.py", line 118, in run
    app.run(main, flags_parser=parse_flags)
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/cli.py", line 103, in main
    app.run(quick_inference.main, argv=passed)
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 977, in main
    outcome_counter = run()
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 912, in run
    for zmw, subreads, dc_config, window_widths in input_file_generator:
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 480, in stream_bam
    for input_data in proc_feeder():
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 1309, in proc_feeder
    for read_set in subread_grouper:
  File "/home/gulderlab/miniconda3/envs/DC/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 73, in __next__
    read = next(self.bam_reader)
  File "pysam/libcalignmentfile.pyx", line 1876, in pysam.libcalignmentfile.AlignmentFile.__next__
OSError: error -3 while reading file

@corkdagga (Author)

Hi @pichuan,

Just wanted to check in and see whether you have any ideas about the errors described above.

@pichuan (Collaborator) commented May 15, 2024

Hi @corkdagga, I agree that this looks more like an issue with your input file.
Can you check your input file?
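
For instance (a sketch, assuming samtools is available; pysam's "error -3" typically points at a truncated or corrupt BGZF block, so an integrity check is a reasonable first step):

    # exits non-zero and names the offending file if a BAM is truncated or corrupt
    samtools quickcheck -v PD049.CCS.actc.bam m54089_200615_125054.CCS.bam \
      && echo 'all BAMs look intact'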

By the way, I can't remember - did you try going through Quick Start (using the inputs provided there) and confirm that your current setup works?

@corkdagga (Author)

Hi @pichuan

I will have to check the file itself over the weekend (sorry), but I will get back to you about that.

To generate the files, I used the following commands based on the quick start:

srun ccs -j 12 --min-rq=0.88 m54089_200615_125054.subreads.bam PD049.CCS.bam

srun actc -j 12 m54089_200615_125054.subreads.bam PD049.CCS.bam PD049.CCS.actc.bam

and then ran DeepConsensus using the recommended input:

    deepconsensus run \
      --subreads_to_ccs=${shard_id}.subreads_to_ccs.bam \
      --ccs_bam=${shard_id}.ccs.bam \
      --checkpoint=model/checkpoint \

I am not performing any sharding, so for the --subreads_to_ccs= and --ccs_bam= arguments I am using the actc and ccs files generated earlier; the full pipeline is sketched below.
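
End to end, the unsharded pipeline is therefore (a sketch; the filenames are the ones from my commands above, and the output name is the one I used earlier):

    # ccs produces the consensus reads; actc aligns the subreads back to them
    ccs -j 12 --min-rq=0.88 m54089_200615_125054.subreads.bam PD049.CCS.bam
    actc -j 12 m54089_200615_125054.subreads.bam PD049.CCS.bam PD049.CCS.actc.bam
    deepconsensus run \
      --subreads_to_ccs=PD049.CCS.actc.bam \
      --ccs_bam=PD049.CCS.bam \
      --checkpoint=model/checkpoint \
      --output=PD049pacbio.output_DC.fastq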

I am not sure whether this information helps you spot an error I am making. Otherwise, I will check the file over the weekend and try regenerating the ccs and actc files; perhaps that will help.

Thanks!

@corkdagga (Author)

Hi @pichuan

I have done some more work on the problem. I think my issues are with the starting files and not DeepConsensus. You can close the issue if you like. Thanks for all the help.

pichuan closed this as completed May 25, 2024