Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kebab-case definition produces collision when mapped to snake_case #118

Open
sunfishcode opened this issue Oct 18, 2022 · 5 comments
Open

Comments

@sunfishcode
Copy link
Member

sunfishcode commented Oct 18, 2022

The component-model grammar has this for <name>:

name           ::= <word>
                 | <name>-<word>
word           ::= [a-z][0-9a-z]*
                 | [A-Z][0-9A-Z]*

That accepts both "apple" and "APPLE". This leads to collisions when converting to snake_case for languages which use that convention.

@lukewagner
Copy link
Member

Great point! So I suppose what we need to do here is tighten up the uniqueness condition to be case insensitive, so that you could have one or the other but not both.

alexcrichton pushed a commit to bytecodealliance/wit-bindgen that referenced this issue Oct 19, 2022
* Implement the component-model lexing rules for identifiers.

The [component-model grammer] for kebab-case identifiers now looks
like this:

```
name           ::= <word>
                 | <name>-<word>
word           ::= [a-z][0-9a-z]*
                 | [A-Z][0-9A-Z]*
```

Implement the rules. This continues to use XID rules for the initial
lexing, as that corresponds to what users might accidentally use, so
that we can issue appropriate errors in those cases. The precise
grammer is validated in a separate step.

[component-model grammer]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md#instance-definitions

* Add more lexing tests.

* Update the tests in tests/codegen/conventions.wit.

* Comment out identifiers that collide when mapped to snake_case, for now.

See WebAssembly/component-model#118.
@tschneidereit
Copy link
Member

Treating identifiers as case-insensitive is what we'd talked about in the past precisely to avoid this issue.

I guess the main question is if we allow lower- and upper-case and treat identifiers as case-insensitive, or we enforce all-lower-case. The latter might be better, because it makes for cheaper and purely local validation, instead of multiple identifiers influencing each other.

@lukewagner
Copy link
Member

All-upper-case words are allowed in kebab-names to indicate that the word is an acronym, which the language casing scheme may treat differently than normal words. I think, in either case, the uniqueness condition is equally local, since you have to build a set of all names in the scope (imports or exports) to detect dupes.

@tschneidereit
Copy link
Member

All-upper-case words are allowed in kebab-names to indicate that the word is an acronym, which the language casing scheme may treat differently than normal words.

Ok, that's fair. We had in the past concluded that we'd not cover that case and require e.g. HttpRequest instead of HTTPRequest, but I see how people might feel strongly about enabling the latter. It does lead to strictly better bindings at no further cost, too, I suppose, since you're right about thee locality being the same, yeah.

sunfishcode added a commit to sunfishcode/wit-bindgen that referenced this issue Oct 19, 2022
Implement case-insensitive comparisons for kebab-case names as described
in the component-model thread:

WebAssembly/component-model#118

This also adds previously missing validation for duplicate names in some
contexts.
@sunfishcode
Copy link
Member Author

I opened bytecodealliance/wit-bindgen#385 with a prototype implementation of this.

sunfishcode added a commit to sunfishcode/wit-bindgen that referenced this issue Nov 16, 2022
Implement case-insensitive comparisons for kebab-case names as described
in the component-model thread:

WebAssembly/component-model#118

This also adds previously missing validation for duplicate names in some
contexts.
alexcrichton pushed a commit to alexcrichton/wasm-tools that referenced this issue Nov 16, 2022
…alliance#382)

* Implement the component-model lexing rules for identifiers.

The [component-model grammer] for kebab-case identifiers now looks
like this:

```
name           ::= <word>
                 | <name>-<word>
word           ::= [a-z][0-9a-z]*
                 | [A-Z][0-9A-Z]*
```

Implement the rules. This continues to use XID rules for the initial
lexing, as that corresponds to what users might accidentally use, so
that we can issue appropriate errors in those cases. The precise
grammer is validated in a separate step.

[component-model grammer]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md#instance-definitions

* Add more lexing tests.

* Update the tests in tests/codegen/conventions.wit.

* Comment out identifiers that collide when mapped to snake_case, for now.

See WebAssembly/component-model#118.
rvolosatovs pushed a commit to bytecodealliance/wrpc that referenced this issue May 23, 2024
* Implement the component-model lexing rules for identifiers.

The [component-model grammer] for kebab-case identifiers now looks
like this:

```
name           ::= <word>
                 | <name>-<word>
word           ::= [a-z][0-9a-z]*
                 | [A-Z][0-9A-Z]*
```

Implement the rules. This continues to use XID rules for the initial
lexing, as that corresponds to what users might accidentally use, so
that we can issue appropriate errors in those cases. The precise
grammer is validated in a separate step.

[component-model grammer]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md#instance-definitions

* Add more lexing tests.

* Update the tests in tests/codegen/conventions.wit.

* Comment out identifiers that collide when mapped to snake_case, for now.

See WebAssembly/component-model#118.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants