fix(csv-parse): preserve multi-byte record delimiter in raw output #491

Open
Jian-Zhang08 wants to merge 1 commit into adaltas:master from Jian-Zhang08:fix/csv-parse-raw-crlf-delimiter

@Jian-Zhang08

With { raw: true }, only the first byte of the record delimiter was appended to the raw buffer (in the per-char loop), and the parser then advanced pos past the remaining delimiter bytes before emitting the record. For multi-byte delimiters such as Windows "\r\n" this dropped the trailing byte, so raw was 'a,b\r' instead of 'a,b\r\n'.

Append the remaining record-delimiter bytes to the raw buffer when a delimiter is detected, so multi-byte delimiters are preserved in full. Single-byte delimiters are unaffected: they have no remaining bytes, so the loop that appends them never runs.
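
To make the before/after behavior concrete, here is a minimal sketch of the idea. It is a toy record splitter, not csv-parse's actual code: the function `splitRaw` and its `fixed` flag are hypothetical names used only for illustration, and it ignores quoting, escaping, and streaming entirely.

```javascript
// Toy record splitter illustrating the raw-buffer behavior.
// NOT csv-parse's implementation; `splitRaw` and `fixed` are hypothetical.
function splitRaw(input, recordDelimiter, fixed) {
  const records = [];
  let raw = "";
  let pos = 0;
  while (pos < input.length) {
    // The per-char loop appends the current character to the raw buffer,
    // which covers only the first character of a delimiter.
    raw += input[pos];
    if (input.startsWith(recordDelimiter, pos)) {
      if (fixed) {
        // Append the remaining delimiter characters. For a single-character
        // delimiter the loop body does not run.
        for (let i = 1; i < recordDelimiter.length; i++) {
          raw += input[pos + i];
        }
      }
      // Advance past the whole delimiter, then emit the record.
      pos += recordDelimiter.length;
      records.push(raw);
      raw = "";
      continue;
    }
    pos++;
  }
  // Emit a trailing record that has no final delimiter.
  if (raw.length > 0) records.push(raw);
  return records;
}

const before = splitRaw("a,b\r\nc,d\r\n", "\r\n", false);
const after = splitRaw("a,b\r\nc,d\r\n", "\r\n", true);
console.log(JSON.stringify(before)); // ["a,b\r","c,d\r"]
console.log(JSON.stringify(after)); // ["a,b\r\n","c,d\r\n"]
```

With `fixed` set to `false` the trailing `\n` is lost from each raw record, matching the symptom described above. With `fixed` set to `true` the full `\r\n` is kept, and a single-character delimiter such as `\n` gives the same result either way.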

Adds a regression test and rebuilds dist.

Fixes #332
