Modula-2 file extension '.def' is missing #6451

Open
trijezdci opened this issue Jun 16, 2023 · 5 comments

Comments

@trijezdci

Modula-2 keeps interface and implementation in separate files.

Interfaces (historically but incorrectly called definition modules) have file extension '.def'.
Implementation and program modules have file extension '.mod'.

This is similar to C's '.h' and '.c' files, but unlike in C, the modules are NOT preprocessor includes: they are separate compilation units, and it is entirely impossible to write Modula-2 libraries without '.def' files. This is not a matter of choice or style; it is an absolute requirement.

Not supporting '.def' as a file extension for Modula-2 means that syntax highlighting for Modula-2 is effectively broken, because the interface half of every library goes unrecognised.

So, can you please add recognition for the '.def' file extension.

I had reported this many years ago, but apparently nothing has happened since and I can't even find the report now. It would be nice if this wasn't just swept under the carpet again. It should be rather easy to add.

Thank you in advance.

@lildude
Member

lildude commented Jun 16, 2023

> So, can you please add recognition for the '.def' file extension.
>
> I had reported this many years ago, but apparently nothing has happened since and I can't even find the report now.

I can, with a very easy search: #3657 😉 It was closed automatically back when we used that bot. That issue also went way into the weeds, well outside Linguist's scope.

As in my response then, the same applies now: Linguist is a community-driven project, so in short: if you want it, submit a PR to add support, or implement an override and wait patiently until someone else does.

You can find details for adding support in the CONTRIBUTING.md file.

One thing to keep in mind is that .def is a very generic extension likely to be used by many languages. We can add support, but it will probably require a very precise heuristic (i.e. a regex) that identifies these files reliably, to reduce the chances of misclassifying other languages.
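As a rough illustration of what such a content heuristic could look like, here is a minimal Python sketch. It is not Linguist's actual rule format or pattern; the names `DEF_MODULE_HEADER` and `looks_like_modula2_def` are hypothetical. It assumes only that a Modula-2 interface file contains a `DEFINITION MODULE <name>;` header at the start of a line, which is the syntax described elsewhere in this thread.

```python
import re

# Hypothetical heuristic (not taken from Linguist): a Modula-2 interface
# file has a header line of the form "DEFINITION MODULE <name>;".
DEF_MODULE_HEADER = re.compile(
    r"^\s*DEFINITION\s+MODULE\s+[A-Za-z][A-Za-z0-9_]*\s*;",
    re.MULTILINE,
)


def looks_like_modula2_def(source: str) -> bool:
    """Return True if the text contains a definition module header.

    Comments are not stripped, so a header that appears at the start of a
    line inside a comment would also match; a production rule would need
    to be stricter.
    """
    return DEF_MODULE_HEADER.search(source) is not None


modula2_text = (
    "(* Stack interface *)\n"
    "DEFINITION MODULE Stack;\n"
    "PROCEDURE Push(x: INTEGER);\n"
    "END Stack.\n"
)
# A Windows module-definition file is one other well-known use of ".def".
windows_def_text = "LIBRARY mylib\nEXPORTS\n    foo @1\n"

print(looks_like_modula2_def(modula2_text))
print(looks_like_modula2_def(windows_def_text))
```

The pattern deliberately requires both keywords, an identifier and the terminating semicolon, so that a stray word such as "MODULE" in an unrelated file is less likely to trigger a match.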

If you're not prepared to add support yourself, please add the "Feature request" issue template to the OP of this issue and fill it in, as you should have used it when opening this issue.

@trijezdci
Author

trijezdci commented Jun 16, 2023

I have forked the Sublime Text files that Linguist appears to be using for Modula-2 and made several corrections so that they reflect the classic Modula-2 language originally published by Prof. Niklaus Wirth at ETH Zurich in his book "Programming in Modula-2", published by Springer-Verlag. This book contains the language report, and the dialect it describes is known as PIM Modula-2, where PIM is shorthand for the title of the book.

https://github.com/trijezdci/Sublime-Modula-2

However, both .def and .mod are already in those files, so this will not fix the issue of not recognising the file extension.

I am assuming there is at least one other place (within the Linguist repo) where .def needs to be added.

I can do that if you can confirm where this needs to be added.

As for the argument that .def might be used by other languages, I don't think that is justification not to support .def for Modula-2. As I mentioned, it is not a style choice, it is ESSENTIAL: if it isn't there, then Modula-2 isn't supported, it is that simple. If the language is to be supported, then .def must be supported.

Besides, I have implemented and contributed multi-dialect Modula-2 support to vi and Vim. There, disambiguation was needed for .mod because other languages use that extension, but none used .def.

I am quite happy to add disambiguation, though, because this would likely make it possible to support multiple dialects, as was the case when GitHub (and Bitbucket) still used Pygments. I had contributed multi-dialect Modula-2 support to Pygments, which used (and still uses) a comment at the beginning of a source file with a dialect tag to tell the renderer for which dialect the file should be rendered. I did the same for Modula-2 support in vi/Vim, and the maintainer of GNU Modula-2 did the same for Emacs.
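To make the dialect-tag idea concrete, here is a minimal Python sketch of reading such a tag from the first comment in a file. The tag syntax `(*!m2pim*)` and the dialect names in `KNOWN_DIALECTS` are assumptions made for illustration and should be checked against the Pygments documentation; `dialect_from_tag` is a hypothetical helper, not part of Pygments, Vim or Linguist.

```python
import re

# Assumed tag format: a comment such as (*!m2pim*) at the very start of
# the file, optionally preceded by whitespace.
DIALECT_TAG = re.compile(r"\A\s*\(\*!([a-z0-9]+)\*\)")

# Illustrative dialect names only; not an authoritative list.
KNOWN_DIALECTS = {"m2pim", "m2iso", "m2r10"}


def dialect_from_tag(source: str, default: str = "m2pim") -> str:
    """Return the dialect named by a leading tag comment, else the default."""
    match = DIALECT_TAG.match(source)
    if match is None:
        return default
    tag = match.group(1)
    # Unknown tags fall back to the default rather than raising an error.
    return tag if tag in KNOWN_DIALECTS else default


print(dialect_from_tag("(*!m2iso*)\nDEFINITION MODULE Foo;\nEND Foo.\n"))
print(dialect_from_tag("DEFINITION MODULE Foo;\nEND Foo.\n"))
```

Falling back to a default keeps untagged files renderable, which matters because most existing sources would carry no tag at all.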

However, I need somebody who understands Linguist to point me to where such disambiguation code is to be added and where the feature is documented so I can read up on how to do this. I will also need to know how to add multiple grammars for the same language.

regards
benjamin

@trijezdci
Author

If you can replace the current incorrect Sublime Text definitions for M2 with the corrected fork I made, then I would also like to know a little more about which features in the Sublime Text definitions actually have an impact on Linguist. For example, there are some definitions in there that are quite obviously for text completion, and I doubt that Linguist uses that information at all. There are probably other such things that are only relevant for editors, not for rendering, and thus probably ignored by Linguist. If we can identify them, then I would like to remove all of those.

@trijezdci
Author

One more thing on file extensions. In classic Modula-2 and also in ISO Modula-2, the interface files were incorrectly called definition modules. Being interfaces, they contain declarations, while the implementation files contain the corresponding definitions. Unfortunately, the mistake in nomenclature is already in the syntax, as the interface files start with DEFINITION MODULE. It's counter-intuitive, but that's also a reason why you can't just change the file extension to something else, even though that is of course easier to do in a compiler than changing syntax.

The incorrect nomenclature along with its syntax has been corrected in the 2010 revision of Modula-2 where the interfaces are called interface modules and their syntax is INTERFACE MODULE. However, since most Modula-2 users are accustomed to the file naming, the .def file extension remains supported.

@lildude
Member

lildude commented Jun 16, 2023

> However, both .def and .mod are already in those files, so this will not fix the issue of not recognising the file extension.
>
> I am assuming there is at least one other place (within the Linguist repo) where .def needs to be added.

Yup. The appropriate language within the languages.yml file.

> As for the argument that .def might be used by other languages, I don't think that is justification not to support .def for Modula-2. As I mentioned, it is not a style choice, it is ESSENTIAL: if it isn't there, then Modula-2 isn't supported, it is that simple. If the language is to be supported, then .def must be supported.

I agree that it might be essential. However, we need to differentiate between Modula-2's use and everything else; you wouldn't want your .def files identified as Ruby, for example, just because someone else said this extension is used by Ruby. The same applies the other way round: if we add .def to Modula-2 without either a heuristic that limits it to this language, or adding it to other languages at the same time so the classifier can make a guess based on the samples, EVERY .def file will be identified as this language. This would be moot if the extension were unique to the language, but I fear .def probably isn't, hence the requirement.

It's important to keep in mind that Linguist analyses files in isolation. It makes no consideration, nor should it, for other files in the repo or directory structure as a repo or gist can legitimately contain only a single file and people would want that to be identified as correctly as possible.
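The reasoning above can be sketched in a few lines of Python. This is a toy model, not Linguist's implementation: the `EXTENSIONS` and `HEURISTICS` tables, the placeholder language name and the `classify` function are all hypothetical. It looks at one file name and its content in isolation, and shows both outcomes: an extension with a single claimant is assigned unconditionally, while a shared extension needs a content rule.

```python
import os
import re

# Hypothetical extension table. "Other-Def-Language" is a placeholder for
# any other language that also claims ".def".
EXTENSIONS = {
    ".def": ["Modula-2", "Other-Def-Language"],
    ".mod": ["Modula-2"],  # sole claimant in this toy table
}

# Hypothetical content rules, tried in order for shared extensions.
HEURISTICS = {
    ".def": [
        ("Modula-2",
         re.compile(r"^\s*DEFINITION\s+MODULE\s+\w+\s*;", re.MULTILINE)),
    ],
}


def classify(filename, source):
    """Classify a single file using only its own name and content."""
    ext = os.path.splitext(filename)[1]
    candidates = EXTENSIONS.get(ext, [])
    if len(candidates) == 1:
        # A unique claimant wins regardless of what the file contains.
        return candidates[0]
    for language, pattern in HEURISTICS.get(ext, []):
        if language in candidates and pattern.search(source):
            return language
    # Undecided: a real classifier might now guess from sample files.
    return None


print(classify("Stack.def", "DEFINITION MODULE Stack;\nEND Stack.\n"))
print(classify("exports.def", "LIBRARY mylib\nEXPORTS\n    foo\n"))
print(classify("Main.mod", "not Modula-2 at all"))
```

The last call is the failure mode described above: with only one language registered for an extension, every file with that extension is assigned to it, whatever its content.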

> I am quite happy to add disambiguation, though, because this would likely make it possible to support multiple dialects, as was the case when GitHub (and Bitbucket) still used Pygments. I had contributed multi-dialect Modula-2 support to Pygments, which used (and still uses) a comment at the beginning of a source file with a dialect tag to tell the renderer for which dialect the file should be rendered. I did the same for Modula-2 support in vi/Vim, and the maintainer of GNU Modula-2 did the same for Emacs.
>
> However, I need somebody who understands Linguist to point me to where such disambiguation code is to be added and where the feature is documented so I can read up on how to do this. I will also need to know how to add multiple grammars for the same language.

See my comments in the discussion you started.

> If you can replace the current incorrect Sublime Text definitions for M2 with the corrected fork I made, then I would also like to know a little more about which features in the Sublime Text definitions actually have an impact on Linguist.

We only use the syntax highlighting parts of a grammar and, as I mentioned in the discussion, these need to be TextMate-compatible grammars, which Sublime Text 2 happens to implement. Also mentioned there: how to write and maintain TextMate-compatible grammars is outside the scope of Linguist, though @Alhadis is quite the expert and may be able to offer tips and help. TextMate has its own documentation (though it was a bit poor the last time I looked), as does VS Code.

dkearns added a commit to dkearns/linguist that referenced this issue Jan 10, 2024
Improve *.mod pattern to match implementation modules.  These were only
being incidentally matched via the /END\./ pattern branch.

See github-linguist#3657 and github-linguist#6451 for related discussions.
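
For readers unfamiliar with the distinction the commit message draws, the following Python sketch shows the difference between matching a module header and relying on an incidental `END.`-style match. It is illustrative only and is not the pattern from the referenced commit; `MOD_HEADER` and `module_kind` are hypothetical names.

```python
import re

# Illustrative only: matches "IMPLEMENTATION MODULE <name>;" or
# "MODULE <name>;" at the start of a line. Module priorities and other
# dialect-specific header forms are not handled.
MOD_HEADER = re.compile(
    r"^\s*(IMPLEMENTATION\s+)?MODULE\s+[A-Za-z][A-Za-z0-9_]*\s*;",
    re.MULTILINE,
)


def module_kind(source):
    """Return 'implementation', 'program', or None if no header is found."""
    match = MOD_HEADER.search(source)
    if match is None:
        return None
    return "implementation" if match.group(1) else "program"


print(module_kind("IMPLEMENTATION MODULE Stack;\nEND Stack.\n"))
print(module_kind("MODULE Hello;\nBEGIN\nEND Hello.\n"))
print(module_kind("set X;\nparam n;\n"))
```

Matching the header directly ties the decision to the construct that defines the file, instead of to a closing line that other languages could plausibly contain.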

2 participants