Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: Multiline support breaks .env files with quotes in comments #12

Closed
BlackDex opened this issue Aug 23, 2022 · 10 comments · Fixed by #16 or #18
Closed

Regression: Multiline support breaks .env files with quotes in comments #12

BlackDex opened this issue Aug 23, 2022 · 10 comments · Fixed by #16 or #18

Comments

@BlackDex
Copy link
Contributor

Hello there,

First thanks for keeping this crate alive.
I just tried to update to the latest version, but it broke my environment.
It looks like the new multiline support also checks within comments, and that breaks everything.

It is very simple to reproduce this by modifying dotenv/tests/common/mod.rs to this:

pub fn make_test_dotenv() -> io::Result<TempDir> {
    tempdir_with_dotenv(
        r#"
        # Start of .env file
        # Comment line with single ' quote
        # Comment line with double " quote
        TESTKEY=test_val
        # End of .env file
        "#
    )
}

That will create a multiline .env file to test with. And it will break.
When you remove the two lines having a double and single quote it will run again.

I didn't had time to look into this my self, but maybe @hoijui or @allan2 knows a quick fix for this :).

For now i will stick with the previous version of this crate.

@allan2
Copy link
Owner

allan2 commented Aug 24, 2022

Thanks for reporting this. I will look into it.

@allan2 allan2 pinned this issue Aug 24, 2022
@allan2
Copy link
Owner

allan2 commented Aug 24, 2022

Hi @BlackDex,

Maybe I am misunderstanding the issue. I was unable to reproduce.

.env

# Start of .env file
# Comment line with single ' quote
# Comment line with double " quote
TESTKEY=test_val
# End of .env file

main.rs

use dotenvy::{dotenv, dotenv_iter, Error};

fn main() -> Result<(), Error> {
    dotenv()?;
    for item in dotenv_iter()? {
        let (key, val) = item?;
        println!("{key}={val}");  // prints `TESTKEY=test_val`
    }
    Ok(())
}

Could you provide more info or an example?

@BlackDex
Copy link
Contributor Author

@allan2 If i use the example i made above and run cargo test it breaks for me

@BlackDex
Copy link
Contributor Author

@allan2, i created a new branch on my fork to illustrate it.
master...BlackDex:dotenvy:quote_issue

With that branch, within the dotenv directory both cargo test and cargo run --example quote_issue will fail

@BlackDex
Copy link
Contributor Author

BlackDex commented Aug 24, 2022

The issue here is that the new Multiline iter checks every line, also if it starts with a #.
Which for multiline variables could be needed of course, but it should exclude lines that start with a # (And any type of whitespace in front of it) when not within a StrongOpen i think.

Example of when it could be needed:

# This is a comment
CODEGEN_TEST_VAR1="hello!"
CODEGEN_TEST_VAR2="'quotes within quotes'"
# This is a comment with a quote '
CODEGEN_TEST_MULTILINE1="First Line
Second Line"
CODEGEN_TEST_MULTILINE1="# First Line Comment
Second Line
#Third Line Comment
Fourth Line
"

domodwyer added a commit to domodwyer/dotenvy that referenced this issue Aug 29, 2022
Fixes the line parser to respect comments when evaluating quote state.

Fixes allan2#12.
allan2 pushed a commit that referenced this issue Aug 30, 2022
Fixes a regression caused by multiline support (#12) 

Comments are now respected when evaluating quote state
@allan2 allan2 unpinned this issue Aug 30, 2022
@BlackDex
Copy link
Contributor Author

BlackDex commented Aug 30, 2022

@allan2 Unfortunately this is still an issue.

It even has an other regression now.
Take this for example:

# Start of .env file
# Comment line with single ' quote
# Comment line with double " quote
 # Comment line, starts with space with double " quote
 CODEGEN_TEST_VAR1="hello!"
CODEGEN_TEST_VAR2="'quotes within quotes'"
CODEGEN_TEST_VAR3="double quoted with # hash in value"
CODEGEN_TEST_VAR4='single quoted with # hash in value'
CODEGEN_TEST_VAR5=not_quoted_with_#_hash_in_value
CODEGEN_TEST_VAR5=not_quoted_with_comment_beheind #_hash_in_value
# End of .env file

This worked in the 0.15.1 version just fine. Now it breaks.

And the following example also breaks with the multiline feature enabled:

# Start of .env file
# Comment line with single ' quote
# Comment line with double " quote
 # Comment line, starts with space with double " quote
 CODEGEN_TEST_VAR1="hello!"
CODEGEN_TEST_VAR2="'quotes within quotes'"
CODEGEN_TEST_VAR3="double quoted with # hash in value"
CODEGEN_TEST_VAR4='single quoted with # hash in value'
CODEGEN_TEST_VAR5=not_quoted_with_#_hash_in_value
CODEGEN_TEST_VAR5=not_quoted_with_comment_beheind #_hash_in_value
CODEGEN_TEST_MULTILINE1="First Line
Second Line"
CODEGEN_TEST_MULTILINE2="# First Line Comment
Second Line
#Third Line Comment
Fourth Line
"
# End of .env file

If i just leave the CODEGEN_TEST_MULTILINE1 there, and remove the rest that will work.
Now, i don't know if you want to support Lines starting with a # within a multiline quoted string, but besides lines starting with a # string, they could be anywhere within the quoted string of course, which still breaks.

@LeoniePhiline
Copy link
Contributor

LeoniePhiline commented Aug 30, 2022

I am also having an issue with KEY='ABC#DEF':

What's happening:

The line iterator, via eval_end_state(), in the iter module returns the line as KEY='ABC, truncating at the #.

Despite the strong quotes ', which should require no escaping, it is necessary to escape the # as \# to get the full line.

However, this creates problems down the line, as the line iterator is wrong, but the line parsing later on is correct.

What I tried:

So I escaped the source line to read KEY='ABC\#DEF'. (Added \.)

However, then parse_value() in the parse module dumps the \ into the final result (extracting \# rather than # into the final value).
This happens because parse_value() correctly does not expect any escaping (and never sets escaped = true) while within strong_quote.

The only two possible outcomes are therefore:

  • If # is (correctly) not escaped within strong quotes: an incorrect value ABC instead of the correct value ABC#DEF.
  • If # is (incorrectly) escaped as \# despite strong quotes: an incorrect value ABC\#DEF instead of the correct value ABC#DEF.

This is a regression:

As mentioned by BlackDex above, this is a 0.15.3 regression.
The issue is introduced in https://github.com/allan2/dotenvy/pull/16/files#diff-2630ab209b8952cf63a6f52f41cef74ba4a5ab035d861895016911ecde5c94a1

@allan2 allan2 reopened this Aug 30, 2022
@BlackDex
Copy link
Contributor Author

@allan2 @LeoniePhiline i might have a fix for this: master...BlackDex:dotenvy:quote_issue

@BlackDex
Copy link
Contributor Author

BlackDex commented Sep 3, 2022

Hmm, doesn't seem like a very good fix though.
I'm encoutering some other issues with other test .env files I'm currently trying out.

BlackDex added a commit to BlackDex/dotenvy that referenced this issue Sep 5, 2022
This PR fixes allan2#12

Added a WhiteSpace ParseState and check for a comment after a
whitespace. Also return the current position of the iter to correctly
strip the comments of from the line to prevent quoted comments or
non-whitespace surrounded hash chars.

Also updated the tests to match these cases.
And added an example code to easily show the output.
BlackDex added a commit to BlackDex/dotenvy that referenced this issue Sep 5, 2022
This PR fixes allan2#12

Added a WhiteSpace ParseState and check for a comment after a
whitespace. Also return the current position of the iter to correctly
strip the comments of from the line to prevent quoted comments or
non-whitespace surrounded hash chars.

Also updated the tests to match these cases.
And added an example code to easily show the output.
@BlackDex
Copy link
Contributor Author

BlackDex commented Sep 5, 2022

@allan2 i think i have a fix for this, please verify the PR #18 if i missed something.

allan2 pushed a commit that referenced this issue Sep 19, 2022
This PR fixes #12

Added a WhiteSpace ParseState and check for a comment after a
whitespace. Also return the current position of the iter to correctly
strip the comments of from the line to prevent quoted comments or
non-whitespace surrounded hash chars.

Also updated the tests to match these cases.
And added an example code to easily show the output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants