Use drug class instead of gene name to query for high level class #278
…high level drug class values
workflows/amr/run.wdl
    if 'highLevelDrugClasses' not in ontology[gene_name]:
        return []
    return ontology[gene_name]['highLevelDrugClasses']
def get_high_level_classes(drug_classes_union):
What's the form of drug_classes_union? A set of strings like "rifamycin antibiotic; peptide antibiotic"? Wondering why it would ever have length > 1.
Its type is list[str], but there is usually only one element in the list: a string containing drug classes separated by semicolons, e.g. the "rifamycin antibiotic; peptide antibiotic" string quoted above.
The variable is created by this line of code:
dc = remove_na(set(sub_df['Drug Class_contig_amr']).union(set(sub_df['Drug Class_kma_amr'])))
This can result in two elements in the list if the two columns differ.
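A minimal sketch of how that union can end up with two elements. It uses a plain dict of lists as a stand-in for the DataFrame (column indexing looks the same), made-up rows, and a hypothetical remove_na that drops missing values; the real helper in run.wdl is not shown in this thread and may behave differently.

```python
def remove_na(values):
    """Hypothetical stand-in: drop None/NaN entries and return a sorted list."""
    # v == v is False only for NaN, so this filters both None and float('nan').
    return sorted(v for v in values if v is not None and v == v)

# Made-up rows; only the two column names come from the quoted line.
sub_df = {
    "Drug Class_contig_amr": ["rifamycin antibiotic; peptide antibiotic", None],
    "Drug Class_kma_amr": ["rifamycin antibiotic", float("nan")],
}

# Same expression as in the thread: union of the values from both sources.
dc = remove_na(set(sub_df["Drug Class_contig_amr"]).union(set(sub_df["Drug Class_kma_amr"])))
print(dc)
```

When both columns carry the same string, the set union collapses them and the list has a single element; here the two sources disagree, so both strings survive.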
I'm hoping that soon I can move the Python code out to a separate file and upgrade the container's Python version to >= 3.9, which would make type hints a lot easier to add.
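For context, Python 3.9 and later accept built-in generics such as list[str] directly in annotations (PEP 585), with no typing import. A small sketch of what that could look like; split_drug_classes is a hypothetical helper, not a function from this PR.

```python
def split_drug_classes(drug_class_strings: list[str]) -> list[str]:
    """Split semicolon-separated drug class strings into individual names."""
    names: list[str] = []
    for entry in drug_class_strings:
        # Strip whitespace around each name and skip empty fragments.
        names.extend(part.strip() for part in entry.split(";") if part.strip())
    return names

print(split_drug_classes(["rifamycin antibiotic; peptide antibiotic"]))
```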
In the meantime, that's probably a misleading variable name; I'll change it to something better.
Ahh, that makes more sense: it can have length > 1 because there are two sources. Thanks for explaining!
Obtains the names of high-level drug classes for each gene by using the gene's drug class to query the ontology file.
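A rough sketch of that kind of lookup, not the code in this PR. It assumes the ontology maps each drug class name to an entry with a 'highLevelDrugClasses' list (that key appears in the diff; keying by drug class name and the sample entries are assumptions) and that each input element is a semicolon-separated string. The function name and the explicit ontology parameter are hypothetical.

```python
def lookup_high_level_classes(drug_class_strings, ontology):
    """Collect high-level classes for semicolon-separated drug class strings."""
    high_level = set()
    for entry in drug_class_strings:
        for drug_class in entry.split(";"):
            name = drug_class.strip()
            # Classes missing from the ontology, or lacking the key, add nothing.
            high_level.update(ontology.get(name, {}).get("highLevelDrugClasses", []))
    return sorted(high_level)

# Invented ontology entries, for illustration only.
ontology = {
    "rifamycin antibiotic": {"highLevelDrugClasses": ["Rifamycins"]},
    "peptide antibiotic": {"highLevelDrugClasses": ["Peptides"]},
    "unmapped class": {},
}

result = lookup_high_level_classes(
    ["rifamycin antibiotic; peptide antibiotic", "rifamycin antibiotic"],
    ontology,
)
print(result)
```

Using a set means a class reported by both sources is only counted once, and sorting keeps the result stable between runs.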
Sample output: