Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exact URL Matching. Allow Syntactically Correct Patterns. Better Tests #25

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

da3dsoul
Copy link

As requested, I separated out new features from the bug fixes. I will submit separate PRs for those to be discussed after this is in.

Because of the git mess, I would suggest squashing this on merge.

@da3dsoul
Copy link
Author

@dagwieers here you go. Sorry it took so long. Life and stuff....

lib/tests.py Outdated
f = lambda a, b: None
plugin.route("/foo/<a>/<b>")(f)
assert plugin.url_for(f, a=1, b=2) == plugin.base_url + "/foo/1/2"
f = lambda a, b2: None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the relevance is of renaming b to b2 in these tests ;-)

Copy link
Contributor

@dagwieers dagwieers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the unneeded changes (b vs b2) this looks fine.

@da3dsoul
Copy link
Author

The b2 stuff is to make sure that nothing fails with numbers in the pattern name, as that didn't work before.

@dagwieers
Copy link
Contributor

@da3dsoul Right, then I would use more descriptive names and make that more explicit. Otherwise it may be lost on people like myself ;-)

@da3dsoul
Copy link
Author

Name it what? var_num2 ?

@da3dsoul
Copy link
Author

Without a comment, I don't see how naming it different will be any better

@dagwieers
Copy link
Contributor

I think var_num2 would be better than b2 indeed.

@dagwieers
Copy link
Contributor

@da3dsoul Thank you.

@da3dsoul
Copy link
Author

Can we get this looked at for merging? I want to have this one (which shouldn't be contentious) so that other ones, such as script support and argument conversion or strict typing can be discussed.

@da3dsoul
Copy link
Author

I didn't notice that last one had an issue. Fixed and wrote a test case for it.

@da3dsoul
Copy link
Author

@danwieers it's ready again. Fingers crossed....

Copy link
Contributor

@dagwieers dagwieers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@da3dsoul Looking through the code I think it would be easier to accept some of it if this was broken down to even smaller PRs. Like the improved regexps or the better tests are no-brainers to me. But by putting everything in one large PR you make it an all-or-nothing proposition, which is a lot harder to decide on (and easier to postpone to review later).

It also makes it a lot easier to review if it consists of a single body of work. And the PR could document in detail why this small change is required (in isolation). But by mixing things with something potentially controversial or added complexity, you make the whole PR a controversial piece. And it doesn't help bisecting either.

That said I also do mix various small pieces together from time to time, typically one-liners or no-brainers, but try to split off anything controversial or complex that touches code all over the place making it hard to review.

lib/tests.py Outdated
plugin.route("/foo/<a>/<var_with_num_underscore2>")(f)
plugin.route("/foo/a/b")(g)

# due to the unpredictable sorting of dict, just do it 100 times to see if it fails
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this is needed, dict-ordering is not guaranteed (in Python 3.5 or older) but it is deterministic. It is not random.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not in Py3, but it was an issue before this PR, so it's there to prove that it's fixed with the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think this is a problem on Python 2 either. Again, the ordering is not guaranteed, but it is deterministic. So every single run for the 100 runs will be identical.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would think, right? I expected that, since every other language I've worked with is like that. I can tell you that, in my attempts just to find this issue, I proved the opposite on Python 2.7. When it says that it can't guarantee order there, for the first time I've seen, it really means it.

Just for the sake of it, I'll write up a test case and run it. If it yields consistent results, then I'll concede.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS C:\Users\da3ds> & "C:/Program Files/Python37/python.exe" c:/Users/da3ds/Documents/loop_test.py
apples
PS C:\Users\da3ds> & "C:/Program Files/Python37/python.exe" c:/Users/da3ds/Documents/loop_test.py
apples
PS C:\Users\da3ds> & "C:/Program Files/Python37/python.exe" c:/Users/da3ds/Documents/loop_test.py
apples
PS C:\Users\da3ds> & "C:/Program Files/Python27/python.exe" c:/Users/da3ds/Documents/loop_test.py
beetroot
PS C:\Users\da3ds> & "C:/Program Files/Python27/python.exe" c:/Users/da3ds/Documents/loop_test.py
beetroot
PS C:\Users\da3ds> & "C:/Program Files/Python27/python.exe" c:/Users/da3ds/Documents/loop_test.py
beetroot

You win. I don't know what was different before that was giving unreliable results, but it definitely was. That was a long time ago, a different installation of Windows, etc.

@@ -96,8 +106,8 @@ def url_for(self, func, *args, **kwargs):
path = rule.make_path(*args, **kwargs)
if path is not None:
return self.url_for_path(path)
raise RoutingError("No known paths to '{0}' with args {1} and "
"kwargs {2}".format(func.__name__, args, kwargs))
raise RoutingError("No known paths to '{0}' with args {1} "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a good reason for this change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyLint was complaining about line length and marking my PR with a red X

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I have a PR open to raise this to 160. But in that case I understand.

lib/routing.py Outdated
@@ -127,33 +137,48 @@ def run(self, argv=None):
self.path = self.path.rstrip('/')
if len(argv) > 2:
self.args = parse_qs(argv[2].lstrip('?'))
self.args = dict((k, list(uq(uq(v2)) for v2 in v)) for k, v in self.args.items())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the double quoting and unquoting?
We need to document the reason for added complexity.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it in the test for some reason....idk it was late.

# we wanted double quote for the +, %, and any others that might be in the string

It won't touch ASCII, but double unquoting doesn't hurt if there's a chance of invalid parsing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why would there be any invalid parsing? I would like to see concrete examples where the current implementation failed.

lib/routing.py Outdated
@@ -24,9 +24,9 @@
except ImportError:
from urllib.parse import urlsplit, parse_qs
try:
from urllib import urlencode
from urllib import urlencode, quote as q, unquote as uq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer the original names as it improves code readability.
Only when there are namespace problems or for backward compatibility renaming is warranted.

For example, q is a well-known debugging tool: https://pypi.org/project/q/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess. It might just be my preference, but I hate long lines with repeated function names. By your standards, would you prefer this?

quoted_string = double_quote(s)

def double_quote(s):
    return quote(quote(s))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't even determined when or why this double quoting is needed.

@da3dsoul
Copy link
Author

da3dsoul commented Jan 11, 2020

That's fair. I was trying to keep it just small fixes, but I kept encountering more little things that were easy to tack on. If this didn't take 6 months, no doubt most of these would have been in smaller PRs. It's a pain to make a new branch for every little fix, then possibly deal with merge conflicts after that.

I didn't think the double quoting would be debatable. In every other router I've seen, they just automatically do it to avoid Research & Development ∝ Coffee × Time² ÷ (Problems + Effort) from causing havoc on the system. Note that it has both unicode, which a single quote will escape, as well as a + and &, which a double quote will escape

@da3dsoul da3dsoul changed the title Exact URL Matching. Allow Syntactically Correct Patterns. Better Tests Exact URL Matching. Allow Syntactically Correct Patterns. Properly Escape Strings. Better Tests Jan 11, 2020
@dagwieers
Copy link
Contributor

@da3dsoul I think most (if not all) of the individual changes are non-conflicting.

@da3dsoul
Copy link
Author

In this case, yes, the changes are non-conflicting. It's a habit, as I've run into the issue far too many times

@da3dsoul
Copy link
Author

Yeah....I don't know what I was thinking with quoting stuff. Probably too much programming and not enough sleep. After running tests, I agree that the thought was convoluted.
NOTE FOR FUTURE ME:

dict -> urlencode is escaped into query string
query string -> parse_qs is unescaped back into dict
Technically a list of tuples, as a query can have multiple keys

@da3dsoul da3dsoul changed the title Exact URL Matching. Allow Syntactically Correct Patterns. Properly Escape Strings. Better Tests Exact URL Matching. Allow Syntactically Correct Patterns. Better Tests Jan 12, 2020
@da3dsoul
Copy link
Author

da3dsoul commented Jan 12, 2020

I remember at least part of why I did it. url_for(f, a, b) does not encode properly. That will be a different discussion. Basically, the best way to handle that without having a breaking change is to quote_plus in make_path(), then on parse, loop unquote_plus until the string can not be unquoted further. If plugins are expecting to handle it themselves, then that is probably the best way, lest we notify the 25ish devs using it.

Okay, yeah. I was quoting more than necessary, but it was needed for something. I'll make an issue and link it. #32

@dagwieers
Copy link
Contributor

@da3dsoul No worries, I have this more often than I would like to, especially if this is code I worked on months before. You forget about the inner workings, and are left with the resulting code. But if this happens to you after a few months, it means it's hard for anyone who didn't know what cases are being fixed with some change. And that is why we need to add notes to those pieces of code so we remember later (or reviewers understand the cases covered). And if this code is up to become refactored, we need to ensure that the refactored code still works correctly for those (special) cases.

Sometimes it is sufficient to have unit tests for all these corner cases, but for reviewers some hints or notes can help digest complexity. And happy reviewers makes code merge faster ;-)

@da3dsoul
Copy link
Author

Considering how I had it done, I probably didn't really know where the problem was, only that there was an escape issue. IDK, but I figured it out and cleaned it up for the future...and wrote it down

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants