url_parse chokes on Unicode characters #36

Closed
crw opened this issue Oct 25, 2016 · 2 comments

crw commented Oct 25, 2016

I am trying to parse this Textile link: "Chögyam Trungpa":https://www.google.com/search?q=Chögyam+Trungpa

Here is the exception I am seeing:

Traceback (most recent call last):
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/newrelic-2.72.0.52/newrelic/hooks/framework_flask.py", line 98, in _nr_wrapper_Flask_handle_exception_
    return wrapped(*args, **kwargs)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/newrelic-2.72.0.52/newrelic/hooks/framework_flask.py", line 40, in _nr_wrapper_handler_
    return wrapped(*args, **kwargs)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/lib/auth.py", line 15, in decorated_function
    return f(*args, **kwargs)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/__init__.py", line 116, in list_posts
    updated_at=int(time.time())
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/templating.py", line 134, in render_template
    context, ctx.app)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/flask/templating.py", line 116, in _render
    rv = template.render(context)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/newrelic-2.72.0.52/newrelic/api/function_trace.py", line 98, in dynamic_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/jinja2/environment.py", line 989, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/jinja2/environment.py", line 754, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/list_posts.html", line 1, in top-level template code
    {% extends "chat_page.html" %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/chat_page.html", line 1, in top-level template code
    {% extends "layout.html" %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/templates/layout.html", line 27, in top-level template code
    {% block content %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/chat_page.html", line 13, in block "content"
    {% block main_column %}{% endblock %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/list_posts.html", line 11, in block "main_column"
    {% include('_post_list.html') %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/_post_list.html", line 9, in top-level template code
    {% include('_post_item.html') %}
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/web/blueprint/chat/templates/_post_item.html", line 56, in top-level template code
    <span class="value">{{ post.get_display_message()|safe }}</span>
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/hyper/model/base.py", line 411, in get_display_message
    html_type='html5'
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 1367, in textile_restricted
    text)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 251, in parse
    text = self.block(text)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 456, in block
    block = Block(self, tag, atts, ext, cite, line)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/objects/block.py", line 29, in __init__
    self.process()
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/objects/block.py", line 121, in process
    self.content = self.textile.graf(self.content)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 582, in graf
    text = self.links(text)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 605, in links
    return self.replaceLinks(text)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 718, in replaceLinks
    text = re.compile(pattern, flags=re.X | re.U).sub(self.fLink, text)
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 874, in fLink
    url = self.shelveURL(self.encode_url(urlunsplit(uri_parts)))
  File "/usr/local/www/hyper/env/local/lib/python2.7/site-packages/textile/core.py", line 929, in encode_url
    query = quote(unquote(parsed.query), b'=&?/')
  File "/usr/lib/python2.7/urllib.py", line 1299, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf6'

Based on the information at https://stackoverflow.com/questions/15115588/urllib-quote-throws-keyerror, I believe the following line needs to call .encode('utf-8') on the query string before quoting it:

textile/core.py line 929

        query = quote(unquote(parsed.query), b'=&?/')

to

        query = quote(unquote(parsed.query.encode('utf-8')), b'=&?/')

That is a quick fix (for Python 2), but I am not entirely sure what this bit of code is doing, so there may be a better fix.

Edit: that change makes many unit tests fail on Python 3, so it is a no-go. The unit tests do pass on Python 2.
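
One possible way to reconcile the two interpreters is to encode the query only on Python 2, where urllib.quote raises KeyError for non-ASCII unicode input, and to leave it as text on Python 3, where urllib.parse.quote accepts str directly. The following is a minimal sketch under that assumption; quote_query is a hypothetical helper name, and this is not necessarily what the referenced commit does.

```python
import sys

if sys.version_info[0] >= 3:
    from urllib.parse import quote, unquote
else:
    from urllib import quote, unquote


def quote_query(query):
    """Percent-encode a URL query string on Python 2 and Python 3.

    Hypothetical helper for illustration; not part of textile.
    """
    # On Python 2, urllib.quote fails with KeyError on non-ASCII unicode
    # input, so hand it UTF-8 bytes instead. The version check runs first,
    # so the name `unicode` is never evaluated on Python 3.
    if sys.version_info[0] < 3 and isinstance(query, unicode):  # noqa: F821
        query = query.encode('utf-8')
    # Unquote first so already-encoded input is not encoded twice, then
    # quote with the same safe characters as the original line.
    return quote(unquote(query), '=&?/')


print(quote_query(u'q=Ch\u00f6gyam+Trungpa'))
```

Note that '+' is not in the safe set carried over from the original line, so it is percent-encoded as well; whether that is desirable for query strings is a separate question.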

ikirudennis added a commit that referenced this issue Oct 26, 2016
I'm not too happy about it though.

crw commented Oct 27, 2016

Thank you very much! I did not expect such a quick fix. Much appreciated!

ikirudennis commented

Thank you for providing a clear test case, and even an attempt at a fix.
