Fix regex in language and locale recognition #490

Closed
4 changes: 2 additions & 2 deletions Schemata/sarif-schema-2.1.0.json
@@ -2354,7 +2354,7 @@
"description": "The language of the messages emitted into the log file during this run (expressed as an ISO 639-1 two-letter lowercase culture code) and an optional region (expressed as an ISO 3166-1 two-letter uppercase subculture code associated with a country or region). The casing is recommended but not required (in order for this data to conform to RFC5646).",
"type": "string",
"default": "en-US",
- "pattern": "^[a-zA-Z]{2}|^[a-zA-Z]{2}-[a-zA-Z]{2}]?$"
+ "pattern": "^[a-zA-Z]{2}(-[a-zA-Z]{2})?$"

Looks good! Also, we could consider the pattern below. It's two characters shorter and, I think, a little more readable. As if readability in regex could be easily quantified. :)

(?i)^[a-z]{2}(-[a-z]{2})?$

@eddynaka, @yongyan-gh, we need to get this change into an rtm.6 revision of the SARIF schema as published in the SDK and schemastore.org.

@eddynaka, @yongyan-gh, I don't think we pushed these changes into an rtm.6 revision. Let's discuss this offline today; it would be good to close on this issue asap. @aeisenberg, sorry for the delay. We do have all the SARIF errata prepared, but the holidays (and a need to rerender all errata in an OASIS-approved template) have introduced delays. We're picking it up again now, however.
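
For reference, the three patterns discussed in this thread can be compared with a small script. This is only a sketch: it uses Python's `re` module, which is not the ECMA-262 dialect that JSON Schema recommends for `pattern`, and the helper `accepts` and the sample list are made up for illustration. The inline `(?i)` flag in particular is Python/.NET-style syntax and may not be accepted by ECMAScript-based validators, so it is worth checking against the validators in use before adopting it.

```python
import re

# Patterns copied verbatim from the diff and the review comment.
OLD = r"^[a-zA-Z]{2}|^[a-zA-Z]{2}-[a-zA-Z]{2}]?$"   # before this PR
NEW = r"^[a-zA-Z]{2}(-[a-zA-Z]{2})?$"               # after this PR
SHORT = r"(?i)^[a-z]{2}(-[a-z]{2})?$"               # reviewer's suggestion


def accepts(pattern: str, value: str) -> bool:
    """Hypothetical helper: JSON Schema "pattern" is unanchored,
    so re.search is the closest analogue in Python."""
    return re.search(pattern, value) is not None


# Illustrative inputs (not taken from the SARIF test suite).
SAMPLES = ["en", "en-US", "EN-us", "english", "en-USA", "en-US]", "e"]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(
            f"{sample!r:10}"
            f" old={accepts(OLD, sample)!s:5}"
            f" new={accepts(NEW, sample)!s:5}"
            f" short={accepts(SHORT, sample)!s:5}"
        )
```

The old pattern accepts any string that merely starts with two letters, because its first alternative (`^[a-zA-Z]{2}`) has no end anchor; the stray `]` in the second alternative is also read as a literal character. The new pattern anchors both ends and makes the region group optional as a unit.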

},

"versionControlProvenance": {
@@ -3060,7 +3060,7 @@
"description": "The language of the messages emitted into the log file during this run (expressed as an ISO 639-1 two-letter lowercase language code) and an optional region (expressed as an ISO 3166-1 two-letter uppercase subculture code associated with a country or region). The casing is recommended but not required (in order for this data to conform to RFC5646).",
"type": "string",
"default": "en-US",
- "pattern": "^[a-zA-Z]{2}|^[a-zA-Z]{2}-[a-zA-Z]{2}]?$"
+ "pattern": "^[a-zA-Z]{2}(-[a-zA-Z]{2})?$"
},

"contents": {