Initialization of Type Words #30
Thanks for your attention. The choice of initialization has some influence on the final model performance, but the impact is not significant. To establish a stronger correspondence between the type words and the task at hand, the initialization should be adapted to the characteristics of the dataset.

Got it. So, did you use a different set of initialization words while training on the SemEval dataset, or the same ones?

The same ones.

Hi buddy, do you have any further questions?
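The advice above to adapt the initialization to the dataset could be made concrete by deriving candidate type words from the dataset's own relation labels. The sketch below is purely illustrative (the helper `type_words_from_labels` is not part of this repository); it splits SemEval-2010 Task 8 style labels such as `Cause-Effect(e1,e2)` into their component words:

```python
import re

def type_words_from_labels(labels):
    """Derive candidate type words from relation label names.

    Strips argument markers like "(e1,e2)" and splits the remaining
    label on hyphens/underscores/colons. Hypothetical helper, shown
    only to illustrate one way of adapting the word list per dataset.
    """
    words = []
    for label in labels:
        base = re.sub(r"\(.*\)", "", label)       # drop "(e1,e2)" suffix
        for part in re.split(r"[-_:]", base):
            part = part.strip().lower()
            if part and part != "other" and part not in words:
                words.append(part)
    return words

# SemEval-2010 Task 8 style labels
semeval_labels = ["Cause-Effect(e1,e2)", "Component-Whole(e2,e1)",
                  "Entity-Origin(e1,e2)", "Other"]
print(type_words_from_labels(semeval_labels))
# ['cause', 'effect', 'component', 'whole', 'entity', 'origin']
```

The resulting word list could then replace the hard-coded `["person", "organization", ...]` list when running on SemEval, so the type-word embeddings start from vocabulary that actually matches the dataset's relation inventory.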
I had a question regarding the initialization of type words. According to the code:
```python
if self.args.init_type_words:
    so_word = [a[0] for a in self.tokenizer(["[obj]", "[sub]"], add_special_tokens=False)['input_ids']]
    meaning_word = [a[0] for a in self.tokenizer(["person", "organization", "location", "date", "country"], add_special_tokens=False)['input_ids']]
```
The meaning words are initialized with certain entity types. While these are the probable entity types for the TACRED dataset, the same is not true for the SemEval dataset.
I wanted to know how this initialization affects the working of the algorithm on other datasets like SemEval. Should we change this initialization based on the dataset?
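One way the quoted initialization could be made dataset-dependent is to select the word list per dataset before running the same first-subword extraction. The sketch below is hypothetical (the `MEANING_WORDS` table, the SemEval word choices, and the stub tokenizer are illustrative, not from the repository; the stub only stands in for a real HuggingFace tokenizer so the pattern can run without downloading a model):

```python
# Hypothetical per-dataset type-word tables. Only the TACRED list
# matches the quoted snippet; the SemEval entries are illustrative.
MEANING_WORDS = {
    "tacred":  ["person", "organization", "location", "date", "country"],
    "semeval": ["cause", "effect", "component", "whole", "entity"],
}

class StubTokenizer:
    """Minimal stand-in for a HuggingFace tokenizer: maps each word to
    a single made-up input id so the `[a[0] for a in ...]` pattern
    from the snippet can be demonstrated without a real vocabulary."""
    vocab = {"person": 101, "organization": 102, "location": 103,
             "date": 104, "country": 105, "cause": 201, "effect": 202,
             "component": 203, "whole": 204, "entity": 205}

    def __call__(self, words, add_special_tokens=False):
        return {"input_ids": [[self.vocab[w]] for w in words]}

def meaning_word_ids(dataset, tokenizer):
    # Same first-subword extraction as the quoted code, but the word
    # list is chosen per dataset instead of being hard-coded.
    words = MEANING_WORDS[dataset]
    return [a[0] for a in tokenizer(words, add_special_tokens=False)["input_ids"]]

tok = StubTokenizer()
print(meaning_word_ids("tacred", tok))   # [101, 102, 103, 104, 105]
print(meaning_word_ids("semeval", tok))  # [201, 202, 203, 204, 205]
```

With a real tokenizer, the returned ids would then seed the type-word embeddings exactly as in the original code path, just from words that reflect the target dataset.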