Add CoNLL-U language support #3810

Open
wants to merge 1 commit into
base: master

@Querela Querela commented Jul 11, 2024

See issue #3790.

This pull request adds support for the CoNLL-U format.

Known issues:

  • Not all token classes will be listed on faq.html#how-do-i-know-which-tokens-i-can-style-for because of conditional parsing. CoNLL-U has 10 tab-separated columns, and a regular expression in Prism can't directly target the N-th field (except perhaps with a lookbehind that matches the N-1 preceding tab-separated values). Token classes are therefore assigned in the after-tokenize hook. Some columns are then parsed further for additional structure (key-value pairs in feats/deps/misc).
  • It's not completely KISS: instead of treating every field as a generic "column value", I tried to parse almost everything I could, so each field is handled as "the x-th column with this substructure".
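
To illustrate the position-based idea described above, here is a minimal sketch in plain JavaScript. It is not the code from this pull request: the helper names `labelColumns` and `parseKeyValues` and the lowercase type names are assumptions made for illustration. Only the column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) comes from the CoNLL-U format itself.

```javascript
// The 10 CoNLL-U columns, in order. The lowercase names are illustrative
// token types, not necessarily the class names used by this pull request.
const CONLLU_COLUMNS = [
  'id', 'form', 'lemma', 'upos', 'xpos',
  'feats', 'head', 'deprel', 'deps', 'misc',
];

// Hypothetical helper: label each tab-separated field by its position.
// Returns null for comment lines and for lines without exactly 10 fields.
function labelColumns(line) {
  if (line.startsWith('#')) {
    return null;
  }
  const fields = line.split('\t');
  if (fields.length !== CONLLU_COLUMNS.length) {
    return null;
  }
  return fields.map((content, i) => ({ type: CONLLU_COLUMNS[i], content }));
}

// Hypothetical helper: split a FEATS or MISC value such as
// "Case=Nom|Number=Plur" into key-value pairs. "_" marks an empty field.
// DEPS uses a different "head:relation" shape and is not handled here.
function parseKeyValues(field) {
  if (field === '_') {
    return [];
  }
  return field.split('|').map((pair) => {
    const eq = pair.indexOf('=');
    return eq === -1
      ? { key: pair, value: null }
      : { key: pair.slice(0, eq), value: pair.slice(eq + 1) };
  });
}
```

In Prism, logic along these lines would presumably run inside a callback registered with `Prism.hooks.add('after-tokenize', ...)`, rewriting `env.tokens` once the line has been split. Doing the work by column index is what avoids needing a single regular expression that targets the N-th field.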

I was not sure about the current state of the v2 release, so I based this on the current default branch, master.

No JS Changes

Generated by 🚫 dangerJS against 768c8f1
