Bug Fix: Solved bug where html string failed to be encoded #605

Open: wants to merge 1 commit into base: master

dhwanils95

The error is in the following function, where .decode() was called with the encoding passed positionally instead of through the named parameter eventual_encoding.
pptx/compat/python3.py:

def to_unicode(text):
    """Return *text* as a (unicode) str.

    *text* can be str or bytes. A bytes object is assumed to be encoded as UTF-8.
    If *text* is a str object it is returned unchanged.
    """
    if isinstance(text, str):
        return text
    try:
        # Original call:
        # return text.decode("utf-8")
        # Updated call, passing the encoding by keyword:
        return text.decode(eventual_encoding="utf-8")
    except AttributeError:
        raise TypeError("expected unicode string, got %s value %s" % (type(text), text))

This happens because, in the latest version of bs4, indent_level is the first parameter of decode(). When the parameter name is omitted, 'utf-8' is bound to indent_level instead of eventual_encoding, which causes an error on the line indent_space = (' ' * (indent_level - 1)).

See the function definition, and the line marked with a comment, in
bs4/element.py:

     def decode(self, indent_level=None,
                eventual_encoding=DEFAULT_OUTPUT_ENCODING,
                formatter="minimal"):
         ....
         ....
         space = ''
         indent_space = ''
         if indent_level is not None:
             indent_space = (' ' * (indent_level - 1))  # fails when indent_level is 'utf-8'
         if pretty_print:
             space = indent_space
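
The mismatch can be reproduced without installing bs4. The sketch below assumes a hypothetical stand-in class, FakeTag, that copies only the parameter order of the decode() definition quoted above; it is not bs4 code and the returned markup is made up for illustration.

```python
class FakeTag:
    """Hypothetical stand-in that mirrors only the parameter order of decode()."""

    def decode(self, indent_level=None, eventual_encoding="utf-8",
               formatter="minimal"):
        indent_space = ""
        if indent_level is not None:
            # Same arithmetic as the quoted line: needs an int, not a str.
            indent_space = " " * (indent_level - 1)
        return indent_space + "<p>hello</p>"


tag = FakeTag()

# Positional call: "utf-8" is bound to indent_level, so the subtraction fails.
try:
    tag.decode("utf-8")
    positional_error = None
except TypeError as exc:
    positional_error = exc

# Keyword call: the encoding reaches the parameter it was meant for.
keyword_result = tag.decode(eventual_encoding="utf-8")

print(type(positional_error).__name__)
print(keyword_result)
```

The positional call raises TypeError because a str cannot have an int subtracted from it, while the keyword call returns normally.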

So I modified the call to pass the encoding through the named parameter.
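
One point that may be worth checking before merging: the docstring says *text* can be bytes, and the built-in bytes.decode() accepts encoding and errors as keywords but not eventual_encoding, so a keyword-only call would raise TypeError for plain bytes. The sketch below is one possible way to cover both cases, not the change in this PR. The names to_unicode_sketch and MarkupLike are hypothetical and exist only for illustration.

```python
def to_unicode_sketch(text):
    """Return *text* as str (hypothetical variant of to_unicode)."""
    if isinstance(text, str):
        return text
    if isinstance(text, (bytes, bytearray)):
        # Built-in decode(): the encoding is the first parameter.
        return text.decode("utf-8")
    try:
        # Objects whose decode() takes the encoding as eventual_encoding,
        # as in the bs4 definition quoted above.
        return text.decode(eventual_encoding="utf-8")
    except AttributeError:
        raise TypeError(
            "expected unicode string, got %s value %s" % (type(text), text)
        )


class MarkupLike:
    """Hypothetical object with a bs4-style decode() signature."""

    def decode(self, indent_level=None, eventual_encoding="utf-8",
               formatter="minimal"):
        return "<b>bold</b>"


print(to_unicode_sketch("plain"))
print(to_unicode_sketch(b"caf\xc3\xa9"))
print(to_unicode_sketch(MarkupLike()))
```

Checking for bytes first keeps the original behaviour for encoded input, and only non-bytes objects take the keyword path.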
