Edit BASIC heuristic to avoid false positives + new sample #6320

Merged
merged 2 commits into from
Mar 14, 2023

DecimalTurn
Contributor

@DecimalTurn DecimalTurn commented Mar 14, 2023

There are currently at least two occurrences of false positives with the BASIC heuristic ('^\s*\d+') introduced in #5166:

  1. If a VBA file contains line numbers, it will automatically be identified as BASIC. It is not that common for VBA code to have line numbers, but some people like to use them to improve error reporting.
    eg. https://github.com/Sven-Bo/Integrate-ChatGPT-in-Excel-using-VBA/blob/master/mChatGPT.bas#L34

  2. In VBA (or VB6), an underscore can be used as a line continuation character, so a statement can be split across lines. A number can then end up being the first thing on the continuation line, which matches the heuristic.
    eg. https://github.com/dragokas/hijackthis/blob/7560e67b5e0d14f53ea25bbfb88587cbfadca1c8/src/modPermissions.bas#L905

The good thing is that neither of these issues can occur at the start of the file: VBA doesn't allow line numbering outside of a procedure, and the first line of a file can't be a continuation of a previous line.

Hence, I'm proposing to add \A at the start of the heuristic (which gives ^\A\s*\d+) so that it only matches at the very start of the file. I also added a sample so that we have two samples for BASIC with the .bas extension.
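
To illustrate the difference, here is a rough sketch in Python rather than the Ruby that Linguist actually uses. It assumes `re.MULTILINE` is a fair stand-in for Ruby's behaviour, where `^` matches at the start of every line. The sample strings and the `looks_like_basic` helper are made up for this example and are not part of Linguist.

```python
import re

# Assumption: re.MULTILINE emulates Ruby, where ^ matches at every line start.
OLD_HEURISTIC = re.compile(r"^\s*\d+", re.MULTILINE)
# \A only matches at the very beginning of the input, so the leading ^ is harmless.
NEW_HEURISTIC = re.compile(r"^\A\s*\d+", re.MULTILINE)


def looks_like_basic(source: str, pattern: re.Pattern) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in the source."""
    return pattern.search(source) is not None


# Case 1: VBA procedure that uses line numbers for error reporting.
VBA_WITH_LINE_NUMBERS = (
    "Option Explicit\n"
    "Sub Demo()\n"
    "10  On Error GoTo Handler\n"
    "20  Debug.Print 1 / 0\n"
    "End Sub\n"
)

# Case 2: VBA statement split with "_" so a number starts the next line.
VBA_WITH_CONTINUATION = (
    "Sub Demo()\n"
    "    total = 1 + _\n"
    "        2\n"
    "End Sub\n"
)

# Classic line-numbered BASIC, which should still be detected.
CLASSIC_BASIC = '10 PRINT "HELLO"\n20 GOTO 10\n'

if __name__ == "__main__":
    samples = {
        "VBA with line numbers": VBA_WITH_LINE_NUMBERS,
        "VBA with continuation": VBA_WITH_CONTINUATION,
        "classic BASIC": CLASSIC_BASIC,
    }
    for name, text in samples.items():
        old = looks_like_basic(text, OLD_HEURISTIC)
        new = looks_like_basic(text, NEW_HEURISTIC)
        print(f"{name}: old={old} new={new}")
```

In this sketch the old pattern flags both VBA snippets as BASIC, while the anchored pattern only flags the file that starts with a line number.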

Description

Checklist:

  • I am fixing a misclassified language
    • I have included a new sample for the misclassified language:
    • I have included a change to the heuristics to distinguish my language from others using the same extension.

@DecimalTurn DecimalTurn requested a review from a team as a code owner March 14, 2023 02:05
Member

@lildude lildude left a comment

Good explanation and makes sense to me. Thanks.

@lildude lildude merged commit 508cb81 into github-linguist:master Mar 14, 2023
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024

2 participants