PiSchool/trallie


Trallie

Trallie (“Transfer learning for information extraction”) improves information extraction (IE) for search over textual asset descriptions. It does away with costly human annotation by leveraging the ability of LLMs to follow natural language (NL) guidelines, understand labels, and manipulate NL the way they manipulate code.

Problem: Natural language descriptions of assets and resources are here to stay, both as legacy data and as flexible catch-alls. Clustering and categorizing them so that structured search queries can run over them traditionally requires information extraction (IE); RAG and dense embedding matching offer only partial solutions. IE is often bottlenecked by costly human annotation, if only to provide few-shot examples of categories.

Ambition: Trallie brings the transfer learning and world understanding afforded by LLMs to make information extraction agile. We deliver multilingual, IE-fine-tuned checkpoints of various open model architectures and, for reproducibility, our full fine-tuning recipe, including prompt templates.
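
Example: To make the idea of extraction driven by NL guidelines concrete, here is a minimal sketch in Python. It is not Trallie's actual prompt template or API. The field names, the prompt wording, and the `fake_llm` stand-in are assumptions made for illustration only. It shows how labels and guidelines written in plain language could replace annotated examples: they are assembled into a prompt, and the model's JSON answer is filtered down to the requested fields.

```python
import json

# Hypothetical field schema: label -> natural language guideline.
FIELDS = {
    "manufacturer": "Company that built the asset.",
    "year": "Year of manufacture, as an integer.",
    "location": "Where the asset is currently stored.",
}


def build_extraction_prompt(description: str, fields: dict) -> str:
    """Assemble a zero-shot extraction prompt from labels and guidelines."""
    guideline_lines = "\n".join(
        f"- {name}: {guideline}" for name, guideline in fields.items()
    )
    return (
        "Extract the following fields from the asset description.\n"
        "Answer with a single JSON object; use null for missing fields.\n\n"
        f"Fields:\n{guideline_lines}\n\n"
        f"Description:\n{description}\n"
    )


def parse_extraction(raw_output: str, fields: dict) -> dict:
    """Keep only the requested fields; anything missing becomes None."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {name: data.get(name) for name in fields}


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call (assumption); returns a canned answer."""
    return '{"manufacturer": "Siemens", "year": 1998, "colour": "grey"}'


prompt = build_extraction_prompt("Grey Siemens turbine built in 1998.", FIELDS)
record = parse_extraction(fake_llm(prompt), FIELDS)
print(record)
```

Filtering the answer against the schema keeps downstream search indexes stable even when the model returns extra or malformed keys; a real pipeline would replace `fake_llm` with a call to one of the fine-tuned checkpoints.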

Impact: Transfer learning and natural language input make Trallie applicable to legacy and low-resource scenarios. We expect it to improve the discoverability of hidden asset collections, the plurality of sources (through easier access to search tools), and trust and privacy.

Team: At Pi School, our experience of rapid prototyping in AI, acquired over more than 100 AI projects, gives us an advantage in exploiting the rapidly moving state of the art.
