-
-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use RequestBuilder to identify invalid urls #1460
base: master
Are you sure you want to change the base?
Conversation
lychee-lib/src/types/input.rs
Outdated
@@ -132,7 +133,8 @@ impl Input { | |||
) -> Result<Self> { | |||
let source = if value == STDIN { | |||
InputSource::Stdin | |||
} else if let Ok(url) = Url::parse(value) { | |||
} else if let (Ok(_), Ok(url)) = (Request::builder().uri(value).body(()), Url::parse(value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your version from the issue description was a bit more readable to me, but that's probably just personal taste. Both works, so it's up to you to decide if you like to change it back.
Apart from that, can we add a comment here why this is needed? Essentially the main gist from the issue to let people know that it's a workaround for Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to this because the other version caused some tests to break on Linux. Apparently there are cases that RequestBuilder catches but Url doesn't, and vice versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. It would also sounds like an important fact to document here.
Good job! Just a minor ask before we can merge this. |
I made an adjustment to be more explicit. Relying on the behavior of RequestBuilder hides the intent a bit, so instead I explicitly say that I'm rejecting Urls that don't have an http (or https) scheme. I also adjusted the last part of the function to handle Windows behavior better. The function originally assumed that missing a scheme could have been a mistake and adds https, but on Windows this can never happen because the separators in Windows filepaths are not valid for Urls. Finally, I added some Windows-only tests to validate that the behavior is correct. |
After the changes, though, some tests are failing that don't fail on Windows. I will need to investigate to understand the errors. They're related to something called "wayback", which I have never heard of and so I have no idea what the behavior is supposed to be. |
Oh, the I suggest to ignore these particular errors for the moment. Perhaps we can even disable them entirely. |
Yup, looks like you're right. I forced a test rerun and it passed. |
// The extra use of Request::builder is to weed out Windows filepaths with a drive specifier, which Url will erroneously parse successfully | ||
// We still use Url::parse because it catches some other edge cases that Request::builder does not | ||
// This could be improved with further refinement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this comment is no longer needed? At least it looks like we don't use Request::builder
anymore.
})?; | ||
match Url::parse(value) { | ||
// Weed out non-http schemes, including Windows drive specifiers, which will be successfully parsed by the Url crate | ||
Ok(url) if url.scheme().starts_with("http") => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should be a bit more cautious here.
Perhaps something like this?
Ok(url) if url.scheme() == "http" || url.scheme() == "https" => {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is just to weed out some false positives like httpinvalid
. Maybe we could also add a test for that?)
Attempts to fix this issue.
This has resulted in some failing tests on Linux. Will revise.
Edit:
Fixed Linux tests. Let me know if you have any suggestions for improvement.