
Releases: eltorocorp/permissivecsv

3.0.0

16 Nov 01:54

This release contains two primary changes over version 2.x.

Compatibility with v2.x should only be broken if the consumer is using any custom header detection logic.

If the consuming system uses only HeaderCheckAssumeHeaderExists or HeaderCheckAssumeNoHeader, then v3.0.0 should be compatible with v2.x (this does not preclude future v3.x releases from breaking further from v2.x).

Reader replaces ReadSeeker

Scanner now relies on an io.Reader rather than an io.ReadSeeker.

Since every io.ReadSeeker also satisfies io.Reader, existing callers can keep passing the same values, so this change should not cause any issues with backwards compatibility.

v2.x used a ReadSeeker to simplify some of the header detection logic. In practice, however, a ReadSeeker was often awkward to work with, because many data sources (particularly stream-oriented ones) present their data only through a Reader. This forced us to read large amounts of data into memory and then wrap that data in a seeker before passing it to the Scanner. Because the ReadSeeker served only to simplify header detection, we decided it was better to take a step back and simplify the header detection capabilities in exchange for accepting a Reader, which is easier to use in general.

Simplified HeaderCheck

This change breaks compatibility with v2.x.

HeaderCheck (header detection) logic now only evaluates the first record, and no longer allows a comparison between the first two records. This should be sufficient in many header detection cases.
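
As a rough sketch of what a check based only on the first record can look like, consider the heuristic below. The `firstRecordCheck` type and the `noNumericFields` function are hypothetical names invented for this example; they are assumptions and may not match the package's actual HeaderCheck declaration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// firstRecordCheck is a hypothetical type for this sketch only. It
// receives the first record and reports whether it looks like a header.
type firstRecordCheck func(firstRecord []string) bool

// noNumericFields treats the first record as a header when it is
// non-empty and none of its fields parse as a number. This is only one
// possible heuristic; it fails for data whose rows are all text.
func noNumericFields(firstRecord []string) bool {
	if len(firstRecord) == 0 {
		return false
	}
	for _, field := range firstRecord {
		if _, err := strconv.ParseFloat(strings.TrimSpace(field), 64); err == nil {
			return false
		}
	}
	return true
}

func main() {
	var check firstRecordCheck = noNumericFields
	fmt.Println(check([]string{"name", "age"}))
	fmt.Println(check([]string{"alice", "42"}))
}
```

A check like this needs no look-ahead to the second record, which is why it can work on a plain io.Reader without seeking.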

As noted earlier in these notes, the initial implementation of the header detection logic was clumsy, and relied on a ReadSeeker to enable the comparison between the first two records. Requiring a ReadSeeker made permissivecsv a bit of a pain to use when many systems supply only an io.Reader, and it led to a lot of gymnastics and boilerplate code in consuming systems. The advantages of a two-record header check were significantly outweighed by the disadvantages of requiring a ReadSeeker.

It is possible that a more robust header detection capability will be reintroduced in a later version, but for now, the clearer path is to simplify.