Support literal newlines in csv #2329

Merged 3 commits on Oct 22, 2022

@strRM strRM commented Oct 20, 2022

According to RFC4180, newlines should be supported provided they are quoted.

In RFC4180 mode, if a LF or CRLF sequence is found in a double-quoted cell, then the sequence is copied into the text for the cell.

Using pop_back is a much more efficient way to remove the trailing CR from
a string.
Using a function to read the next line simplifies the main loop a little
and sets us up for supporting newlines in CSV cells.
According to RFC4180, newlines are allowed in CSV cells, provided they
are enclosed in double quotes.

Read another line if we encounter a newline character in a CSV cell. If
the cell contains CRLF, we reproduce CRLF in the text for the cell.

codecov bot commented Oct 20, 2022

Codecov Report

Merging #2329 (7c6f381) into master (b698a0b) will decrease coverage by 0.01%.
The diff coverage is 58.33%.


@@            Coverage Diff             @@
##           master    #2329      +/-   ##
==========================================
- Coverage   77.43%   77.41%   -0.02%     
==========================================
  Files         468      468              
  Lines       29162    29177      +15     
==========================================
+ Hits        22581    22588       +7     
- Misses       6581     6589       +8     

| Impacted Files | Coverage Δ |
|---|---|
| src/include/souffle/io/ReadStreamCSV.h | 82.05% <58.33%> (-3.51%) ⬇️ |
| ...ouffle/datastructure/ConcurrentInsertOnlyHashMap.h | 82.11% <0.00%> (+0.81%) ⬆️ |

@quentin quentin merged commit 841c741 into souffle-lang:master Oct 22, 2022

strRM commented Oct 22, 2022

Thanks. This helps a lot. We'll probably have more commits like this in the future.

@strRM strRM deleted the support-literal-newlines-in-csv branch October 22, 2022 19:21