
Can CXR-BERT be used / fine-tuned for text generation? #886

Open

PabloMessina opened this issue May 9, 2023 · 3 comments
Labels
hi-ml-multimodal: Issues related to the hi-ml-multimodal package

Comments


PabloMessina commented May 9, 2023

I have many experiments in mind where I need to condition a Transformer Decoder on some input (e.g. image features, discrete binary labels, a one-hot vector representing some concept, a question, etc.) in order to generate an output (e.g. a report, an answer). I have already implemented many of these ideas with my own custom Transformer Decoder built on PyTorch's standard implementation. However, I would now like to leverage existing pre-trained language models instead of a custom implementation that always starts from scratch.

Thus, I was wondering if there is an easy way to adapt CXR-BERT (or any other model you would recommend) for text generation, given some input. For example, say I have a binary vector encoding certain information, and I want to fine-tune CXR-BERT to generate a paragraph verbalizing the information contained in that vector. The paragraph could be, for example, a radiology report, so fine-tuning a model like CXR-BERT for report generation should plausibly outperform a PyTorch Transformer Decoder trained from scratch.
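For reference, this is roughly the kind of conditioning I mean with the standard PyTorch decoder; all sizes, names, and the label-projection layer below are made up purely for illustration:

import torch
import torch.nn as nn

# Toy setup: a decoder that cross-attends to a single "memory" token derived from a binary label vector.
vocab_size, d_model, num_labels = 30522, 768, 14

embed = nn.Embedding(vocab_size, d_model)
label_proj = nn.Linear(num_labels, d_model)              # binary vector -> one memory token
decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (2, 32))           # (batch, seq_len) target report tokens
labels = torch.randint(0, 2, (2, num_labels)).float()    # (batch, num_labels) binary conditioning

memory = label_proj(labels).unsqueeze(1)                 # (batch, 1, d_model), used as cross-attention memory
causal_mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
logits = lm_head(decoder(embed(tokens), memory, tgt_mask=causal_mask))  # (batch, seq_len, vocab_size)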

Questions:

  • Is this something that can be easily accomplished?
  • Are there examples of adapting CXR-BERT for text generation?
  • What if I need a custom input that conditions the text generation, such as a binary vector?

Thank you very much in advance.

ant0nsc added the hi-ml-multimodal label on May 11, 2023

ant0nsc (Collaborator) commented May 11, 2023

@fepegar could you route that question please?

fepegar (Contributor) commented May 11, 2023

@corcra @Shruthi42 @ozan-oktay @qianchu

Could you please share your thoughts?

qianchu commented May 23, 2023

Hello, you can run the following:

from transformers import BertLMHeadModel

# Load the CXR-BERT weights into a BERT model with a language-modelling head,
# configured as a decoder (causal attention).
model = BertLMHeadModel.from_pretrained("<cxr-bert model path>", is_decoder=True)

This initialises a decoder model, which you can then fine-tune for generation.
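For conditioning on something like a binary vector, one possible (untested) sketch is to also enable cross-attention and feed a projected conditioning vector in as encoder_hidden_states. The checkpoint path is the same placeholder as above, and the projection layer and sizes are illustrative rather than anything CXR-BERT ships with:

import torch
from transformers import BertLMHeadModel

# Load the checkpoint as a decoder and add cross-attention layers
# (these are newly/randomly initialised and learned during fine-tuning).
model = BertLMHeadModel.from_pretrained(
    "<cxr-bert model path>",
    is_decoder=True,
    add_cross_attention=True,
)

num_labels = 14                                            # assumed size of the binary vector
condition_proj = torch.nn.Linear(num_labels, model.config.hidden_size)

binary_vec = torch.randint(0, 2, (1, num_labels)).float()  # (batch, num_labels)
encoder_states = condition_proj(binary_vec).unsqueeze(1)   # (batch, 1, hidden) sequence the decoder attends to

input_ids = torch.tensor([[101, 7592, 2088, 102]])         # dummy token ids, just for illustration

outputs = model(
    input_ids=input_ids,
    encoder_hidden_states=encoder_states,
    labels=input_ids,                                       # causal LM loss; labels are shifted internally
)
outputs.loss.backward()

Since the cross-attention weights start from random initialisation, all of the conditioning behaviour has to be learned during fine-tuning; only the self-attention and embedding weights come from the pre-trained checkpoint.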
