Error running example notebook #4

00dylan00 · 2024-07-04T14:29:54Z

Hi! I ran the downstream_task_example.ipynb notebook but ran into the following issue:

parameters, forward_fn, tokenizer, config, mlm_config = get_pretrained_downstream_model(
    model_name="tcga_5_cohorts",
    checkpoint_directory="../checkpoints/",
)

This returned the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 parameters, forward_fn, tokenizer, config, mlm_config = get_pretrained_downstream_model(
      2     model_name="tcga_5_cohorts",
      3     checkpoint_directory="../checkpoints/",
      4 )
      5 forward_fn = hk.transform(forward_fn)

File /aloy/home/ddalton/projects/multiomics-open-research/multiomics_open_research/bulk_rna_bert/downstream/pretrained.py:93, in get_pretrained_downstream_model(model_name, compute_dtype, param_dtype, output_dtype, checkpoint_directory)
     90     embeddings_layer_to_use = mlm_config.num_layers
     91 mlm_config.embeddings_layers_to_save = (embeddings_layer_to_use,)
---> 93 tokenizer = BinnedExpressionTokenizer(
     94     gene_expression_bins=np.array(mlm_config.rnaseq_tokenizer_bins),
     95     prepend_cls_token=False,
     96 )
     98 if model_name not in MODEL_NAME_TO_HEAD_NAME:
     99     raise ValueError(f"Model {model_name} not supported.")

TypeError: BinnedExpressionTokenizer.__init__() got an unexpected keyword argument 'gene_expression_bins'

I assumed the parameter gene_expression_bins might correspond to n_expressions_bins, but renaming it only produced more errors.

As a side note, installation went smoothly with pip install -e . as suggested in the README, although I had to add sys.path.append(os.path.dirname(os.path.abspath(os.getcwd()))) when running the example notebook for Python to find the package.

Thanks!
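For readers who hit a similar "unexpected keyword argument" TypeError, one way to narrow it down is to inspect which keyword arguments a constructor actually accepts before calling it. The sketch below uses only the standard library. ExampleTokenizer is a hypothetical stand-in with an invented signature, not the real BinnedExpressionTokenizer; in practice you would pass the imported class instead.

```python
import inspect


class ExampleTokenizer:
    """Hypothetical stand-in; its signature is invented for illustration."""

    def __init__(self, n_expressions_bins: int, prepend_cls_token: bool = False):
        self.n_expressions_bins = n_expressions_bins
        self.prepend_cls_token = prepend_cls_token


def accepted_keywords(cls) -> list:
    """Return the parameter names the class constructor declares (without self)."""
    return list(inspect.signature(cls).parameters)


def unexpected_keywords(cls, kwargs: dict) -> list:
    """Return the keys in kwargs that the constructor would reject."""
    params = inspect.signature(cls).parameters
    # A constructor that declares **kwargs accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(key for key in kwargs if key not in params)


# Mirrors the failing call from the traceback against the stand-in class.
call_kwargs = {"gene_expression_bins": [0.0, 1.0], "prepend_cls_token": False}
rejected = unexpected_keywords(ExampleTokenizer, call_kwargs)
```

Comparing accepted_keywords(...) with the keywords used at the call site shows whether the mismatch is a simple rename or a change in what the constructor expects, which a rename alone would not fix.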


Maxenceglrd · 2024-07-11T13:32:07Z

Hi! Thanks a lot for your interest in this repository and for having raised this issue.

This is indeed a mistake in the tokenizer loading; it will be corrected shortly.

Concerning your installation side note, can you make sure that the Python kernel you use to run the notebook is the same environment in which you installed the repository as a package (using pip install -e .)?

Maxence
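The kernel check suggested above can be sketched with the standard library alone. Run inside a notebook cell, it reports which interpreter the kernel uses and whether a package can be found without any sys.path changes. The package name multiomics_open_research is an assumption taken from the file path in the traceback, and describe_environment is a hypothetical helper, not part of the repository.

```python
import importlib.util
import sys


def describe_environment(package_name: str) -> dict:
    """Report the running interpreter and whether a top-level package is importable."""
    # find_spec returns None when a top-level package cannot be located.
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,
        "importable": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


# Assumed package name, inferred from the traceback path.
info = describe_environment("multiomics_open_research")
print(info)
```

If "interpreter" points to a different environment than the one where pip install -e . was run, or "importable" is False, the kernel is likely not the environment holding the editable install, which would explain the need for the sys.path.append workaround.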

00dylan00 · 2024-07-21T10:54:07Z

Fantastic, it works perfectly!

How would one fine-tune this model?

Maxenceglrd mentioned this issue Jul 11, 2024

Fix tokenizer loading for downstream task #5

Merged

Maxenceglrd linked a pull request Jul 11, 2024 that will close this issue

Fix tokenizer loading for downstream task #5

Merged

Maxenceglrd closed this as completed in #5 Jul 18, 2024
