Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Typst #6379

Merged
merged 15 commits into from
Sep 11, 2023
Merged

Add Typst #6379

merged 15 commits into from
Sep 11, 2023

Conversation

michidk
Copy link
Contributor

@michidk michidk commented Apr 16, 2023

Add the Typst language, fixes #6358.

Checklist:

Notes

As I executed the add-grammar script I got the following error:

$ script/add-grammar --replace typst-lsp  https://github.com/nvarner/typst-lsp/blob/master/addons/vscode/typst.tmLanguage.json
...
source 'vendor/grammars/typst-lsp' contains no grammar files

@michidk michidk requested a review from a team as a code owner April 16, 2023 11:39
Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

The .typ extension is used by other languages but Linguist doesn't know this. Accordingly, at least one other language needs to get .typ associated with it so this PR doesn't cause all users to be identified as "typst".

As XML seems to be a common user, I recommend adding the extension and a sample to XML in this PR too.

That said, the .typ extension as returned by your search and heuristic is not popular enough for inclusion yet.

lib/linguist/languages.yml Outdated Show resolved Hide resolved
lib/linguist/heuristics.yml Outdated Show resolved Hide resolved
@michidk
Copy link
Contributor Author

michidk commented Apr 17, 2023

I'll remove typst - makes sense.

That said, the .typ extension as returned by your search and heuristic is not popular enough for inclusion yet.

I think you mistyped, you meant that .typst is not popular enough, right? Because .typ returns over 200 results (with the heuristic): https://github.com/search?q=path%3A*.typ+AND+%2F%5E%23%28%3F%3Aimport%7Cshow%7Clet%7Cset%29%2F&type=code

So if I don't add another language, the heuristic is just ignored and it will always match to typst?

@michidk
Copy link
Contributor Author

michidk commented Apr 17, 2023

Where should I add the XML .typ sample to? Typst or XML?

@lildude
Copy link
Member

lildude commented Apr 17, 2023

I think you mistyped, you meant that .typst is not popular enough, right? Because .typ returns over 200 results: https://github.com/search?q=path%3A*.typ+AND+%2F%5E%23%28%3F%3Aimport%7Cshow%7Clet%7Cset%29%2F&type=code

Nope. I meant .typ. The 200 limit applies to unique :user/:repo combinations which would be the case if this file only ever occurred once per repo, but this isn't the case as it's clear two repos are having a disproportionate influence on the usage.

So if I don't add another language, the heuristic is just ignored and it will always match to typst?

Yup because the extension strategy is used before heuristics and the extension will only match one language so things won't go beyond that strategy:

https://github.com/github/linguist/blob/5a0c74277548122267d84283910abd5e0b89380e/lib/linguist.rb#L61-L71

Where should I add the XML .typ sample to? Typst or XML?

XML. You'll need to add the extension to the XML language entry too.

@michidk michidk requested a review from lildude April 17, 2023 15:17
@michidk michidk mentioned this pull request Apr 17, 2023
@lildude lildude changed the title Add typst Add Typst Apr 17, 2023
@lildude
Copy link
Member

lildude commented Apr 17, 2023

As I executed the add-grammar script I got the following error:

$ script/add-grammar --replace typst-lsp  https://github.com/nvarner/typst-lsp/blob/master/addons/vscode/typst.tmLanguage.json
...
source 'vendor/grammars/typst-lsp' contains no grammar files

This is happening because of the layout of the repo. Linguist expects repos to be actual grammar repos as used by the relevant editors and as such will only look in certain locations for the grammar files as the editors do:

https://github.com/github/linguist/blob/5a0c74277548122267d84283910abd5e0b89380e/tools/grammars/compiler/loader.go#L119-L128

As this is a json formatted grammar, it needs to be in a directory called /grammars or /syntaxes or in the root of the directly. This isn't the case here.

You're need to either find another grammar, restructure the layout of the grammar repo, or split the grammar out into its own repo.

@michidk
Copy link
Contributor Author

michidk commented Apr 18, 2023

So the original grammar from https://github.com/nvarner/typst-lsp had the following errors:

  • Invalid regex in grammar: source.typst (in grammars/tmlanguage.json) contains a malformed regex (regex "((#)if|(?<=(}|])\s*)else)\b": lookbehind assertion is not fixed length (at offset 19))
  • Invalid regex in grammar: source.typst (in grammars/tmlanguage.json) contains a malformed regex (regex "(?<=#[[:alpha:]_][[:alnum:]_-]*!...": lookbehind assertion is not fixed length (at offset 33))
  • Invalid regex in grammar: source.typst (in grammars/tmlanguage.json) contains a malformed regex (regex "\+|\*|/|(?<![[:alpha:]_][[:alnum...": POSIX named classes are supported only within a class (at offset 43))
  • Invalid regex in grammar: source.typst (in grammars/tmlanguage.json) contains a malformed regex (regex "(?<=\bshow\s*)\b[[:alpha:]_][[:a...": lookbehind assertion is not fixed length (at offset 13))
  • Invalid regex in grammar: source.typst (in grammars/tmlanguage.json) contains a malformed regex (regex "(?<=\b[[:alpha:]_][[:alnum:]_-]*...": lookbehind assertion is not fixed length (at offset 34))

I now made a fork (https://github.com/michidk/typst-grammar) that has the required structure and fixes the issues.

The script runs and completes now without errors:

OK! added grammar source 'vendor/grammars/typst-grammar'
        new scope: source.typst
Caching grammar license
Caching dependency records for linguist

@michidk michidk requested a review from lildude April 18, 2023 10:49
@michidk
Copy link
Contributor Author

michidk commented Apr 22, 2023

@lildude Is there some concrete way to measure the popularity of this language as defined by linguist? Is there a concrete number we can see somewhere? You mentioned that GitHub search cannot be used, since it does not count unique repositories using a file type. Then how is this measured? Or is this just guestimated from the search results?

@lildude
Copy link
Member

lildude commented Apr 23, 2023

There isn't a concrete number to look at. The popularity is currently approximated from the search results. See #5756

@michidk
Copy link
Contributor Author

michidk commented Apr 26, 2023

@lildude Can you please execute the workflow?

@lildude
Copy link
Member

lildude commented Apr 27, 2023

@lildude Can you please execute the workflow?

Sure, but it's with stating this won't be merged until usage has met the required levels. If you want to see how CI is behaving in the interim, run the tests locally.

@michidk michidk requested a review from a team as a code owner July 3, 2023 17:46
@michidk
Copy link
Contributor Author

michidk commented Aug 11, 2023

@lildude The search now shows 2k files.

@lildude lildude added this pull request to the merge queue Sep 11, 2023
Merged via the queue into github-linguist:master with commit 28aa3fc Sep 11, 2023
5 checks passed
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Language: Typst
3 participants