Skip to content

LaRA: Large-language-model based semantic similarity for RAre disease

License

Notifications You must be signed in to change notification settings

4pygmalion/LaRa

Repository files navigation

LaRA: Large-language-model based semantic similarity for RAre disease

Semantic similarity between patient's HPOs and rare diseases' HPOs

overview

Install

  • TODO: SimCLR 설치하기

Embedding vector 구하기

  1. HPO definition에 대한 embedding vector 구하기
$ python3 concept_embedding.py -k [OpenAI key] -e hpo_definition -i 0.18 --resume
  1. HPO name 대한 embedding vector 구하기
$ python3 concept_embedding.py -k [OpenAI key] -e hpo_name -i 0.18 --resume

HPO 핸들링

1. 온톨로지 생성

>>> from core.data_model import Ontology
>>> from core.io_ops import load_pickle
>>> from ontology_src import ARTIFACT_PATH

>>> hpos = load_pickle(ARTIFACT_PATH["hpo_name"])
>>> my_ontology = Ontology(hpos)

2. HPO의 집합 생성(HPOs) 및 벡터화

>>> patient_hpos = my_ontology(["HP:0000005", "HP:0000006"])
>>> patient_hpos.vector
array([[-0.00792581,  0.01476753, -0.00463055, ..., -0.00431325,
         0.01031215, -0.02622988],
       [-0.00972291,  0.02108144, -0.00561761, ..., -0.00913226,
         0.00773679, -0.01496731]])

3. Iteration

>>> for hpo in hpos:
>>>     print(hpo)
HPO(id='HP:0000005', name='Mode of inheritance', ...)
HPO(id='HP:0000006', name='Autosomal dominant inheritance', ...,)

4. Membership

>>> "HP:0000005" in patient_hpos
True
>>> HPO(id="HP:0000005",...) in patient_hpos
True

5. Index(getitem)

>>> patient_hpos["HP:0000005"]
HPO(id='HP:0000005', name='Mode of inheritance', ...)

Dependency

$ git clone https://github.com/4pygmalion/SemanticSimilarity.git
$ cd SemanticSimilarity
$ pip install .

About

LaRA: Large-language-model based semantic similarity for RAre disease

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published