[Enhancement] Remove LexerSnapshot #185

Open
JordanBoltonMN opened this issue Aug 21, 2020 · 2 comments
Comments

@JordanBoltonMN
Contributor

Is your feature request related to a problem? Please describe.
LexerSnapshot exists for historical reasons that no longer apply, so there is no real reason to keep it.

Describe the solution you'd like
Update the parser to take a lexer state and read tokens directly from there.

Describe alternatives you've considered
N/A

Additional context
N/A

@JordanBoltonMN JordanBoltonMN added the enhancement New feature or request label Aug 21, 2020
@JordanBoltonMN JordanBoltonMN self-assigned this Aug 21, 2020
@mattmasson
Member

Was the lexer snapshot added to support incremental lexing? Or can we still do that with state? I'm still hoping we'll eventually be able to support parser-based tokenization + semantic highlighting.

@JordanBoltonMN
Contributor Author

Originally it was added to help support incremental lexing, but it is no longer needed.

The lexer creates tokens on a per-line basis, which requires some additional token kinds such as MultilineCommentStart and MultilineCommentEnd. Whenever one line is updated, the lexer conditionally updates the subsequent lines. E.g., if you start a multiline comment on one line, the subsequent line turns into a MultilineCommentContent.
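As a rough illustration of that per-line behavior, here is a minimal sketch. Apart from the three MultilineComment* token kinds, every name in it (`LineMode`, `LexedLine`, `lexLine`, `lexAll`, `updateLine`, the `Other` kind) is hypothetical and not the project's actual API. It also lumps everything that is not part of an unterminated multiline comment into a single `Other` token.

```typescript
// Hypothetical sketch: not the real lexer, only the line-mode propagation idea.
type LineMode = "Default" | "MultilineComment";

interface LineToken {
    kind: "MultilineCommentStart" | "MultilineCommentContent" | "MultilineCommentEnd" | "Other";
    text: string;
}

interface LexedLine {
    text: string;
    modeAtEnd: LineMode; // the mode the next line starts in
    tokens: LineToken[];
}

// Lex one line, given the mode inherited from the previous line.
function lexLine(text: string, modeAtStart: LineMode): LexedLine {
    const tokens: LineToken[] = [];
    let mode: LineMode = modeAtStart;
    let index = 0;

    while (index < text.length) {
        if (mode === "MultilineComment") {
            const close = text.indexOf("*/", index);
            if (close === -1) {
                tokens.push({ kind: "MultilineCommentContent", text: text.slice(index) });
                index = text.length;
            } else {
                tokens.push({ kind: "MultilineCommentEnd", text: text.slice(index, close + 2) });
                index = close + 2;
                mode = "Default";
            }
            continue;
        }

        const open = text.indexOf("/*", index);
        const close = open === -1 ? -1 : text.indexOf("*/", open + 2);
        if (open !== -1 && close === -1) {
            // A comment opens here and does not close on this line.
            if (open > index) tokens.push({ kind: "Other", text: text.slice(index, open) });
            tokens.push({ kind: "MultilineCommentStart", text: text.slice(open) });
            mode = "MultilineComment";
        } else {
            // Simplification: the rest of the line is one opaque token.
            tokens.push({ kind: "Other", text: text.slice(index) });
        }
        index = text.length;
    }

    return { text, modeAtEnd: mode, tokens };
}

// Lex a whole document from scratch.
function lexAll(texts: string[]): LexedLine[] {
    const lines: LexedLine[] = [];
    let mode: LineMode = "Default";
    for (const text of texts) {
        const line = lexLine(text, mode);
        lines.push(line);
        mode = line.modeAtEnd;
    }
    return lines;
}

// Replace one line, then re-lex following lines only while the end mode keeps changing.
// Returns how many lines were re-lexed.
function updateLine(lines: LexedLine[], lineNumber: number, newText: string): number {
    let modeAtStart: LineMode = lines[lineNumber - 1]?.modeAtEnd ?? "Default";
    let relexed = 0;
    for (let current = lineNumber; current < lines.length; current += 1) {
        const previous = lines[current];
        const updated = lexLine(current === lineNumber ? newText : previous.text, modeAtStart);
        lines[current] = updated;
        relexed += 1;
        if (updated.modeAtEnd === previous.modeAtEnd) break; // later lines are unaffected
        modeAtStart = updated.modeAtEnd;
    }
    return relexed;
}
```

In this sketch, editing a line to open a comment re-lexes the following lines until one of them ends in the same mode it ended in before the edit, which is the "conditionally updates the subsequent lines" part.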

When you want to actually parse something, you first need to try creating a LexerSnapshot. The snapshot attempt iterates over all of the tokens and does the following:

  • Validates and combines multiline tokens into a single token
  • Puts all comments into a collection
  • Provides a helper function to get a [lineNumber, columnNumber] pair from a Token
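The three responsibilities above can be sketched as a single pass over the per-line tokens. This is an assumption-laden illustration, not the actual LexerSnapshot code: `createSnapshot`, `positionOf`, the `Identifier` kind, and all field names are hypothetical, and multiline comments are the only multiline token handled.

```typescript
// Hypothetical sketch of a snapshot-style pass; names and shapes are assumptions.
interface LineToken {
    kind: "MultilineCommentStart" | "MultilineCommentContent" | "MultilineCommentEnd" | "Identifier";
    text: string;
    lineNumber: number;
    columnNumber: number;
}

interface SnapshotToken {
    text: string;
    lineNumber: number;
    columnNumber: number;
}

interface Snapshot {
    tokens: SnapshotToken[];   // what the parser would consume
    comments: SnapshotToken[]; // comments kept in a separate collection
}

function createSnapshot(lineTokens: LineToken[]): Snapshot {
    const tokens: SnapshotToken[] = [];
    const comments: SnapshotToken[] = [];
    let open: SnapshotToken | undefined; // multiline comment being assembled

    for (const token of lineTokens) {
        switch (token.kind) {
            case "MultilineCommentStart":
                open = { text: token.text, lineNumber: token.lineNumber, columnNumber: token.columnNumber };
                break;

            case "MultilineCommentContent":
            case "MultilineCommentEnd":
                // Validate: content or end pieces need a preceding start piece.
                if (open === undefined) throw new Error("multiline comment piece without a start");
                open.text += "\n" + token.text;
                if (token.kind === "MultilineCommentEnd") {
                    comments.push(open); // combined into a single token
                    open = undefined;
                }
                break;

            case "Identifier":
                tokens.push({ text: token.text, lineNumber: token.lineNumber, columnNumber: token.columnNumber });
                break;
        }
    }

    // Validate: every started comment must have been closed.
    if (open !== undefined) throw new Error("unterminated multiline comment");
    return { tokens, comments };
}

// Helper returning a [lineNumber, columnNumber] pair for a token.
function positionOf(token: SnapshotToken): [number, number] {
    return [token.lineNumber, token.columnNumber];
}
```

The combined comment keeps the position of its first piece, so `positionOf` points at where the comment starts.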

All of these could be moved into the parser, and LexerSnapshot could be removed. The trade-off is that the parser becomes slightly more complex, but the complexity of having LexerSnapshot at all goes away. It also removes an O(n) pass over the tokens.
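One way the parser side could look is a small reader that pulls tokens straight from the lexer state on demand, so there is no separate up-front pass. This is only a sketch of the idea: `LexerState`, `TokenReader`, `PositionedToken`, and the token kinds here are hypothetical shapes, and merging multiline pieces is left out for brevity (it would live in the same loop).

```typescript
// Hypothetical sketch: a parser-side reader over per-line lexer state.
interface LineToken {
    kind: "Comment" | "Identifier";
    text: string;
    columnNumber: number;
}

interface LexerState {
    lines: { tokens: LineToken[] }[];
}

interface PositionedToken {
    text: string;
    position: [number, number]; // [lineNumber, columnNumber]
}

class TokenReader {
    readonly comments: PositionedToken[] = [];
    private readonly state: LexerState;
    private lineNumber = 0;
    private tokenIndex = 0;

    constructor(state: LexerState) {
        this.state = state;
    }

    // Returns the next non-comment token, or undefined at end of input.
    // Comments met along the way are collected as a side effect.
    next(): PositionedToken | undefined {
        while (this.lineNumber < this.state.lines.length) {
            const lineTokens = this.state.lines[this.lineNumber].tokens;
            if (this.tokenIndex >= lineTokens.length) {
                // Line exhausted: move to the start of the next line.
                this.lineNumber += 1;
                this.tokenIndex = 0;
                continue;
            }

            const token = lineTokens[this.tokenIndex];
            this.tokenIndex += 1;
            const positioned: PositionedToken = {
                text: token.text,
                position: [this.lineNumber, token.columnNumber],
            };

            if (token.kind === "Comment") {
                this.comments.push(positioned);
                continue;
            }
            return positioned;
        }
        return undefined;
    }
}
```

Each token is still visited once, but the work happens while the parser is already walking the tokens instead of in a dedicated pass beforehand.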
