[BUG] Inconsistent and maybe buggy parsing of generalized-identifier compared to Power-BI #355

Open
UliPlabst opened this issue Feb 9, 2023 · 1 comment
Labels
bug Something isn't working

Hi, a user of powerqueryformatter.com filed this issue with me a couple of days ago.
He points out that he cannot name a column of a table type with just a digit, like this:

[...] type table [1 = text, 2 = text, 3 = text, 4 = text, 5 = text] 

The parser rejects it with the error "Expected to find a identifier, but a numeric literal was found instead", but in Power-BI it works.
The issue is not pressing, as the digits can be escaped using quoted-identifier, but I thought I'd let you know.
I dug into the language specification, and the relevant rules are:

table-type:
      "table" row-type
row-type:
      "[" field-specification-list? "]"
field-specification-list:
      field-specification
      field-specification "," field-specification-list
field-specification:
      optional? field-name field-type-specification?
field-type-specification:       //this branch is not relevant
      "=" field-type
field-name:
      generalized-identifier
      quoted-identifier
generalized-identifier:
      generalized-identifier-part
      generalized-identifier separated only by blanks (U+0020) generalized-identifier-part
generalized-identifier-part:
      generalized-identifier-segment
      decimal-digit-character generalized-identifier-segment
generalized-identifier-segment:
      keyword-or-identifier
      keyword-or-identifier dot-character keyword-or-identifier
keyword-or-identifier:
      letter-character
      underscore-character
      identifier-start-character identifier-part-characters
letter-character:
      A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
identifier-start-character:
      letter-character
      underscore-character
decimal-digit-character:                  
      A Unicode character of the class Nd

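To me, it seems that single-digit identifiers do not conform to the spec: both branches of generalized-identifier-part require a generalized-identifier-segment, which must start with a letter or an underscore. So either the spec is wrong or the Power-BI parser is wrong. Also, looking at generalized-identifier-part, the second branch in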

generalized-identifier-part:
      generalized-identifier-segment
      decimal-digit-character generalized-identifier-segment

suggests that the identifier 1a should be valid, but it does not parse. If I understand the spec correctly, this is a bug.
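To make the reading of the grammar above concrete, here is a minimal TypeScript sketch that transcribes the quoted rules into a regular expression. It is not how microsoft/powerquery-parser or Power-BI actually tokenize; the function name `isSpecGeneralizedIdentifier` is hypothetical, and since `identifier-part-characters` is not among the rules quoted above, the character classes used for it (letters, Nd, Pc, Mn, Mc, Cf) are an assumption.

```typescript
// Sketch only: a literal transcription of the grammar rules quoted in this issue.

// identifier-start-character: letter-character | underscore-character
const START = "[\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}_]";

// identifier-part-characters: ASSUMPTION, this rule is not quoted in the issue.
// Approximated as letters, decimal digits, connectors (includes "_"),
// combining marks and formatting characters.
const PART = "[\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}\\p{Nd}\\p{Pc}\\p{Mn}\\p{Mc}\\p{Cf}]";

// keyword-or-identifier: a start character followed by zero or more part characters
const KEYWORD_OR_IDENTIFIER = START + PART + "*";

// generalized-identifier-segment: keyword-or-identifier, optionally ". keyword-or-identifier"
const SEGMENT = KEYWORD_OR_IDENTIFIER + "(?:\\." + KEYWORD_OR_IDENTIFIER + ")?";

// generalized-identifier-part: optional single decimal digit, then a segment
const GI_PART = "\\p{Nd}?" + SEGMENT;

// generalized-identifier: parts separated only by blanks (U+0020)
const GENERALIZED_IDENTIFIER = new RegExp("^" + GI_PART + "(?: +" + GI_PART + ")*$", "u");

function isSpecGeneralizedIdentifier(text: string): boolean {
    return GENERALIZED_IDENTIFIER.test(text);
}

console.log(isSpecGeneralizedIdentifier("1"));       // false: no segment follows the digit
console.log(isSpecGeneralizedIdentifier("1a"));      // true: digit followed by a segment
console.log(isSpecGeneralizedIdentifier("Column1")); // true: ordinary identifier
```

Under this reading, a bare digit is rejected by the spec while `1a` is accepted, which is the opposite of what is reported here: Power-BI accepts `1`, and this parser rejects `1a`.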

Expected behavior
Consistency between language specification, microsoft/powerquery-parser and Power-BI.
Parsing of generalized-identifier according to spec

Actual behavior
The parser, Power-BI, and the language specification are inconsistent with each other.
An identifier such as 1b (a digit followed by a letter) does not parse as a generalized-identifier in a table type.

To Reproduce
Please include the following:

  • (Required) The Power Query script that triggers the issue.
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Pcy5DcAwDEPRXVS7iZ1MY6jIfcf7dxYVgwWLBwI/Z+kk+EbR8CvaJirZZqq3LdRgW12xVTYK2ylUDgqVk0LlcqVWuSn8D4W9FCofhUoR1Qo=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    GroupedRows = 
        Table.Group ( 
            Source, 
            {"Column1"}, 
            {
                { 
                    "Transform", 
                    each 
                        Table.PromoteHeaders ( 
                            Table.Transpose ( 
                                _[[Column2], [Column3]] 
                            ) 
                        ), 
                        type table [1 = text, 2 = text, 3 = text, 4 = text, 5 = text] 
                }
            }
        ),
    ExpandedCount = 
        Table.ExpandTableColumn ( 
            GroupedRows, 
            "Transform", 
            {"1", "2", "3", "4", "5"}, 
            {"1", "2", "3", "4", "5"}
        )
in
    ExpandedCount
@bgribaudo
Contributor

Unfortunately, I believe the grammar is incorrect in its definition of generalized identifiers (as in, it does not align with how the parser used by Power BI/Excel works). See https://github.com/MicrosoftDocs/powerquery-docs/issues/30.

This project uses modified rules for generalized identifiers, but, apparently, they don't match exactly with how Power BI/Excel work either. :-(

It would be really neat if the official grammar's definition could be corrected. :-)
