Use drug class instead of gene name to query for high level class #278

Merged (6 commits) on Jul 18, 2023

Contributor

@lvreynoso commented Jul 17, 2023

Obtains the names of high-level drug classes for each gene by using the gene's drug class to query the ontology file.

Sample output:
[Image: Screenshot 2023-07-17 at 17 07 33]
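
The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the merged code: it assumes the ontology is a dict keyed by drug class name whose entries may carry a 'highLevelDrugClasses' list, and it passes the ontology in as a second parameter for self-containment. The toy ontology and its class labels are made up.

```python
def get_high_level_classes(drug_class_strings, ontology):
    """Collect high-level drug classes for semicolon-separated drug class strings.

    ASSUMPTION: `ontology` maps a drug class name to a dict that may hold a
    'highLevelDrugClasses' list. The real ontology file may be shaped differently.
    """
    high_level = set()
    for joined in drug_class_strings:
        # Each string may hold several drug classes separated by semicolons.
        for drug_class in joined.split(';'):
            entry = ontology.get(drug_class.strip(), {})
            high_level.update(entry.get('highLevelDrugClasses', []))
    return sorted(high_level)


# Hypothetical ontology excerpt, invented for illustration only.
toy_ontology = {
    'rifamycin antibiotic': {'highLevelDrugClasses': ['class A (made up)']},
    'peptide antibiotic': {'highLevelDrugClasses': ['class B (made up)']},
    'unclassified': {},  # entry without high-level classes
}

result = get_high_level_classes(
    ['rifamycin antibiotic; peptide antibiotic'], toy_ontology
)
```

Entries that are missing from the ontology, or that lack the key, simply contribute nothing, which mirrors the empty-list fallback in the code under review.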

@lvreynoso marked this pull request as ready for review July 18, 2023 00:22
    if 'highLevelDrugClasses' not in ontology[gene_name]:
        return []
    return ontology[gene_name]['highLevelDrugClasses']
def get_high_level_classes(drug_classes_union):
Contributor


What's the form of drug_classes_union? A set of strings like "rifamycin antibiotic; peptide antibiotic"? Wondering why it would ever have length > 1.

Contributor Author


Its type is list[str], but there is usually only one element in the list: a string containing drug classes separated by semicolons, e.g.:

[Image: Screenshot 2023-07-18 at 09 15 36]

The variable is created by this line of code:

dc = remove_na(set(sub_df['Drug Class_contig_amr']).union(set(sub_df['Drug Class_kma_amr'])))

This can result in two elements in the list if the two columns differ.
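
To illustrate how the union yields one element when the two sources agree and two when they differ, here is a small sketch. Plain lists stand in for the two DataFrame columns, and `remove_na` is a hypothetical stand-in that drops None and NaN; the pipeline's actual helper may behave differently.

```python
import math


def remove_na(values):
    """Hypothetical stand-in for the pipeline's remove_na helper: drop None/NaN."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


# Plain lists stand in for the 'Drug Class_contig_amr' and
# 'Drug Class_kma_amr' columns of one gene's rows (values are illustrative).
contig_col = ['rifamycin antibiotic; peptide antibiotic', float('nan')]
kma_same = ['rifamycin antibiotic; peptide antibiotic']
kma_diff = ['peptide antibiotic']

# Both sources agree: the set union collapses to a single string.
one = remove_na(set(contig_col).union(set(kma_same)))

# The sources differ: both strings survive the union.
two = remove_na(set(contig_col).union(set(kma_diff)))
```

Because the result comes from a set, the order of the two elements in the second case is not guaranteed.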

Contributor Author

@lvreynoso Jul 18, 2023


I'm hoping that soon I can move the Python code out to a separate file and upgrade the container's Python version to >= 3.9 to make type hinting a lot easier to add.
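
For context on the version requirement: from Python 3.9 onward, built-in collections such as `list` can be subscripted directly in annotations, whereas earlier versions need the aliases from `typing`. A minimal sketch follows; the function names are hypothetical and not taken from this pull request.

```python
from typing import List


# Older Python 3 versions: the generic alias has to come from `typing`.
def split_drug_classes_legacy(joined: str) -> List[str]:
    return [part.strip() for part in joined.split(';') if part.strip()]


# Python >= 3.9: the built-in `list` can be subscripted directly.
def split_drug_classes(joined: str) -> list[str]:
    return [part.strip() for part in joined.split(';') if part.strip()]


classes = split_drug_classes('rifamycin antibiotic; peptide antibiotic')
```

Both functions behave identically at runtime; only the annotation syntax differs.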

Contributor Author


In the meantime, that's probably a misleading variable name; I'll change it to something better.

Contributor


Ahh, that makes more sense: it would have length > 1 because of the two sources. Thanks for explaining!

@lvreynoso merged commit 3ce07d9 into main Jul 18, 2023
11 checks passed
@lvreynoso deleted the lvreynoso/hldc_from_dc_query branch July 18, 2023 18:11