Respect escaped commas #78

Open
AlmightyLks opened this issue Nov 27, 2023 · 6 comments

@AlmightyLks

Would love to see this plugin respect escaped quotes and not falsely format them

image

image


At the same time, I've been messing with another use-case dealing with commas and whitelisting escaped commas. I've used the following regex, and it might be applicable here 😄

(?<!\\),

image
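
To illustrate what the lookbehind above does, here is a minimal Python sketch. It is not part of the plugin; the function name `split_escaped` and the sample row (a patient row with the comma escaped by a backslash) are made up for illustration.

```python
import re

# Pattern from the comment above: a comma that is NOT preceded by a backslash.
UNESCAPED_COMMA = re.compile(r"(?<!\\),")

def split_escaped(line: str) -> list[str]:
    # Split on unescaped commas only, then strip the backslash
    # from every escaped comma inside the resulting fields.
    return [field.replace("\\,", ",") for field in UNESCAPED_COMMA.split(line)]

# Hypothetical row in the backslash-escaped style described in this issue.
print(split_escaped(r"1248,19-04-1992,M,Dijk\, van"))
# ['1248', '19-04-1992', 'M', 'Dijk, van']
```

One caveat of this simple approach: a field that ends in an escaped backslash directly before a separator would be misread, because the lookbehind only inspects a single preceding character.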

@BdR76
Owner

BdR76 commented Dec 2, 2023

Thanks for posting the suggestion. The plug-in does support escaping of quote characters by using double quote characters, see example data below

LogId,LogDate,LogTime,Type,Description
17584,28-11-2023,00:11:18.170,Error,Internal server error (500)
17585,28-11-2023,00:11:18.056,Warning,LoadDataSource process not ready
17586,28-11-2023,00:39:42.373,Error,"File ""c:\temp\labext_mcl_hb_1.csv"" not found"
17587,28-11-2023,00:51:02.831,Error,"File ""c:\temp\labext_mcl_hb_2.csv"" not found"
17588,28-11-2023,01:19:16.629,Warning,LoadDataSource process not ready

And any value that also contains the separator character (be it comma, semicolon etc.) can be escaped by adding quotes around the value as a whole, see the following example data:

PatientId,BirthDate,Sex,Lastname
1086,30-09-2002,M,Meijer
1248,19-04-1992,M,"Dijk, van"
2459,18-09-2000,M,Bakker
2499,11-05-2005,F,Visser
2907,27-10-1984,F,"Berg, van der"
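
For reference, the two quoting conventions shown above are the ones that standard CSV readers understand. The sketch below uses Python's built-in `csv` module (not the plugin itself) on rows copied from the examples; the variable names are arbitrary.

```python
import csv
import io

# Rows taken from the patient example above: the separator inside a value
# is protected by quoting the value as a whole.
SAMPLE = (
    'PatientId,BirthDate,Sex,Lastname\n'
    '1248,19-04-1992,M,"Dijk, van"\n'
    '2907,27-10-1984,F,"Berg, van der"\n'
)
rows = list(csv.reader(io.StringIO(SAMPLE)))
print(rows[1])  # ['1248', '19-04-1992', 'M', 'Dijk, van']

# Row taken from the log example above: a quote inside a quoted value
# is escaped by doubling it.
LOG_LINE = r'17586,28-11-2023,00:39:42.373,Error,"File ""c:\temp\labext_mcl_hb_1.csv"" not found"'
log_row = next(csv.reader([LOG_LINE]))
print(log_row[4])  # File "c:\temp\labext_mcl_hb_1.csv" not found
```

With the default dialect the backslashes in the file path are kept as ordinary characters, since no escape character is configured.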

As for using a backslash to escape certain characters: that is common in code (C++, Java, Python etc.), but I haven't seen it used in practice in csv data files. Supporting it would require quite some effort, because both the lexer (syntax highlighting) and the parser/validator would have to be changed.

So is this an actual use-case? Is there a system or data supplier that formats its data using a backslash to escape commas?

@AlmightyLks
Author

Sorry for the late reply 😄
I understand your stance, and I can't find any word about it in the CSV RFC (RFC 4180) either, so I am on the shorter end here obviously

Our decades-old desktop software seems to rely on that detail, and it doesn't seem to cause problems programmatically
However, when trying to view and/or edit these files with proper highlighting / formatting in CSVLint, I notice that about 1 in 20 lines falls out of format because of it
And I don't think we can teach our brittle piece of software to use quotes for escaping, if only for backwards-compatibility reasons
That's where my use-case comes from

We can close the issue if you, understandably, don't want to pick up on it

@awdhbit

awdhbit commented Sep 26, 2024

Thanks for this tool btw. This is very good for csv analysis.

I also have another use case, where the csv exported from a widely used application has:

  • Single quote which is escaped like "
  • Newlines that are also escaped in strings, like this: "2 A\\n\\nATDR Time-delay"

I am just replacing those as a workaround for now. That works fine, but it would help to have this as part of the tool.

Of course this depends on the effort that is required, leave it to you!

@BdR76
Owner

BdR76 commented Sep 26, 2024

Thanks for posting your feedback. I don't quite understand the first point though, about the single quote escaped like "? Do you mean like in row 1054 below? If not, can you post an example of the type of data you are working with?

Patid,DoB,Name,Sex,Length,Weight,Remark
1003,31-12-1973,Jansen,Female,182,71.0,"Rescheduled to september"
1054,03-09-1966,"Van 't Hoff",Male,176,98.1,""
1248,15-11-1985,De Vries,Male,169,92.3,"Excluded\\nNoshow twice"
1444,13-10-1994,Mulder,Female,189,72.0,""
1656,22-05-1972,De Graaf,Male,184,87.4,"Excluded due to allergies"

As for the second point: if newlines are provided in the data as \\n and those values are already properly enclosed in double quotes ", then you can do a normal Find-And-Replace in Notepad++. I don't see how that would be added as a feature in the CSV Lint plugin

@awdhbit

awdhbit commented Sep 26, 2024

Thanks for the fast response and sorry that I was not so clear. There was a typo in fact.
The first issue is illustrated by the screenshot below. As you can see, it's not able to ignore the quote (inches) by using the escape chars, and therefore creates an additional column.
image
This was solved by using a double quote instead of a single quote:
image

The second problem can be solved with the replacement of newlines as you mentioned so that's solved.
Thanks

@BdR76
Owner

BdR76 commented Sep 27, 2024

@AlmightyLks @awdhbit fyi if you only need the syntax highlighting then you could take a look at the Regex Trainer plugin.

With the regular expression below, the syntax highlighting kind of works for csv data with escaped commas.

(?:\\.|[^\\,\r\n])+
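
To show what this expression matches outside of Notepad++, here is a small Python sketch. It only demonstrates the regex itself and makes no claim about how the Regex Trainer plugin applies it; the helper name `fields` and the sample row are made up for illustration.

```python
import re

# Regex from the comment above: a field is a run of either
# "backslash + any character" or any character that is not
# a backslash, comma, or line break.
FIELD = re.compile(r"(?:\\.|[^\\,\r\n])+")

def fields(line: str) -> list[str]:
    # findall returns the full match here because the group is non-capturing.
    return FIELD.findall(line)

# Hypothetical row with a backslash-escaped comma in the last field.
print(fields(r"1248,19-04-1992,M,Dijk\, van"))
# ['1248', '19-04-1992', 'M', 'Dijk\\, van']
```

Note that the `+` quantifier requires at least one character, so empty fields (as in `a,,b`) produce no match at all. That is fine for highlighting, but it means this expression alone is not enough to count columns.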
