[Feature suggestion]: Part-of-speech highlighting #38

SichangHe · 2024-07-27T15:47:09Z

Hey, since this is a text language server, would you be interested in adding syntax highlighting?

I did an experiment (https://github.com/SichangHe/natural_syntax), but it sucks to install because of Rust-BERT and libtorch. I guess something like https://github.com/flairNLP/flair/ would be better, so it would make sense to have it in Python instead.

Please let me know what you think, and then I could share more information.

Thanks!
Steven Hé (Sīchàng)

hangyav · 2024-08-03T05:42:07Z

Hi,

Thanks for the suggestion. Could you tell me a bit more about the use case for this feature? Would you use it only in special cases or in general? Is it only POS-based highlighting, or are there other options as well? File types other than plain text, e.g., Markdown or LaTeX, have syntax highlighting of their own. How would these play together (technically, I guess you use either the editor's or the LSP's highlighting)?

SichangHe · 2024-08-03T09:11:16Z

Could you tell me a bit more about the use case for this feature? Would you use it only in special cases or in general?

Highlighting code definitely makes it easier to read (after you get used to it); highlighting text may have a similar effect. This shows especially when the reader is fatigued.

Is it only POS-based highlighting, or are there other options as well?

I have not explored other options. I did see an attention-based highlighting approach, but it looked very inaccurate.

File types other than plain text, e.g., Markdown or LaTeX, have syntax highlighting of their own. How would these play together (technically, I guess you use either the editor's or the LSP's highlighting)?

This is a good question. It is a pain in the ass. To actually support text with markup, you would need to parse it, remove the markup (unless your model can deal with markup), feed the plain text into the model, map the results back to the original text, and then add the markup highlighting on top. LSP semantic highlights are usually set to override regex-based and Tree-sitter highlighting.
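
For illustration, the strip, tag, and map-back steps could be sketched roughly as below. This is only a minimal sketch under loud assumptions: the `MARKUP` regex covers a tiny hypothetical Markdown subset (emphasis markers and backticks), `toy_tagger` is a stand-in for a real POS model, and none of the names come from textLSP or natural_syntax.

```python
import re

# Assumption: a deliberately tiny Markdown subset (emphasis markers, backticks).
# Real markup would need a proper parser rather than a regex.
MARKUP = re.compile(r"[*_`]+")


def strip_markup(text):
    """Return (plain, offsets), where offsets[i] is the index in `text` of plain[i]."""
    plain_chars = []
    offsets = []
    pos = 0
    for match in MARKUP.finditer(text):
        # Keep everything between the previous markup run and this one.
        for i in range(pos, match.start()):
            plain_chars.append(text[i])
            offsets.append(i)
        pos = match.end()
    # Keep the tail after the last markup run.
    for i in range(pos, len(text)):
        plain_chars.append(text[i])
        offsets.append(i)
    return "".join(plain_chars), offsets


def toy_tagger(plain):
    """Hypothetical stand-in for a POS model: yields (start, end, tag) over plain text."""
    for match in re.finditer(r"\w+", plain):
        tag = "CAPITALIZED" if match.group()[0].isupper() else "OTHER"
        yield match.start(), match.end(), tag


def highlight_spans(text):
    """Tag the stripped text, then map each span back to offsets in the original."""
    plain, offsets = strip_markup(text)
    spans = []
    for start, end, tag in toy_tagger(plain):
        # Map the first and last character of the token back to the original text.
        spans.append((offsets[start], offsets[end - 1] + 1, tag))
    return spans


text = "I met **Ada** in `London`."
for start, end, tag in highlight_spans(text):
    print(start, end, tag, repr(text[start:end]))
```

The resulting spans index into the original text, so the markup highlighting can still be layered on top. One caveat of this simple mapping: if markup sits inside a word, the mapped span will include the markup characters.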

With that said, this is non-trivial. I am throwing it out here to see if there is interest in it.

hangyav · 2024-08-11T06:59:39Z

Full POS-based highlighting is a bit too much for me personally, but highlighting, e.g., just named entities could be interesting.

That said, I'm worried about mixing this with markup highlighting. One of the main goals of textLSP is to support markup languages. Currently I'd say it is quite easy to add new file types, but if we also need to implement markup highlighting, that might make things difficult and textLSP quite heavy (although with Tree-sitter it might not be that difficult, but I'm not sure).

I am not totally against the idea but as you can see from the frequency of my answers, unfortunately I don't have as much time to spend on the project as I'd like to. So if you are still interested despite the markup highlighting issues, could you draft a proposal of how this could be implemented in the current framework?
