Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added __repr__ and a helper function #10

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

KOLANICH
Copy link
Contributor

No description provided.

@GreyCat
Copy link
Member

GreyCat commented Mar 15, 2017

Thanks for the contribution!

Unfortunately, this code breaks many things:

  • First and foremost, it fails with an error:
     33 
     34     def __repr__(self):
---> 35         return "".join((self.__class__.__name__, "(", ", ".join(self._reprGeneratorForAllProps(self)), ")"))
     36 
     37     def __exit__(self, *args, **kwargs):

TypeError: _reprGeneratorForAllProps() takes exactly 1 argument (2 given)
  • If it was working, this code dumps only seq items, but it will ignore instances, as they are stored in _m_foo like members, starting with _, which you skip
  • Adding support for instances might be tricky, as you'll need to avoid problems with infinite recursion to be encountered in cases like these:
meta:
  id: rec
seq:
  - id: some
    type: u1
instances:
  self_ref:
    value: _root
  • Naming:
  • Complexity:
    • cramming 4-5-6 statements into a single line is not a good idea
    • Iadding extra function in KaitaiStruct class that will be inherited everywhere is probably not a good idea as well
  • Formatting:

kaitaistruct.py Outdated
@@ -27,6 +27,13 @@ def __init__(self, stream):
def __enter__(self):
return self

def _reprGeneratorForAllProps(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name doesn't comply with pep8.

@arekbulski
Copy link
Member

Forgive me for being outright dismissive but... is __repr__ really that much needed? @GreyCat I noticed that you and I run our forums completely differently. In Construct issues, I would have closed this PR after no more than a day of deliberation and half the arguments. You seem to keep issues open forever for everyone to see and participate. I have to admit, and very ahem deliberately, that I appreciate the way you run things. But lets close this one and move on? 🤔

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Jan 19, 2018

__repr__ is definitely needed. It is a pain to deal (especially do debug in ipython shell) with anything without __repr__. Feel free to fix that PR any way you like, I don't have time to work on it for now.

@arekbulski
Copy link
Member

I just fixed the style/pep8, didnt change how it works. Better?

@KOLANICH
Copy link
Contributor Author

I have added a guard against recursive objects.

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Mar 20, 2018

Broadened scope to instances. @arekbulski @GreyCat @funkyfuture could you test and review it, it seems it's ready.

kaitaistruct.py Outdated
@@ -15,6 +15,18 @@
#
__version__ = '0.8'

from RecursionSafe import RecursionSafe

recSafe=RecursionSafe()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does flake8 not complain about that symbol name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recSafe will be fixed to rec_safe. I don't use flake.

kaitaistruct.py Outdated
"""Generator to use in own __repr__ functions."""
return (
"".join(( str(k), "=", repr(getattr(self, k)) ))
for k in dir(self)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know if it's usable in this context, but vars is often inferior over dir and might make test in the following line unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vars doesn't return properties.

kaitaistruct.py Outdated
return (
"".join(( str(k), "=", repr(getattr(self, k)) ))
for k in dir(self)
if k[0] != "_" and not hasattr(KaitaiStruct, k) and not isinstance(getattr(self, k), type)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k.startswith('_') would be more expressive.

Copy link
Contributor Author

@KOLANICH KOLANICH Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's longer and twice slower.

@@ -26,6 +38,16 @@ def __enter__(self):
def __exit__(self, *args, **kwargs):
self.close()

def __repr__(self):
return "".join(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are too much line-breaks for my gusto.

setup.py Outdated
@@ -44,4 +44,6 @@
],
keywords='kaitai,struct,construct,ksy,declarative,data structure,data format,file format,packet format,binary,parser,parsing,unpack,development',
py_modules=["kaitaistruct"],
requires=["RecursionSafe"],
dependency_links=["git+https://github.com/KOLANICH/RecursionSafe.py.git"],
Copy link

@funkyfuture funkyfuture Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wouldn't trust a python library with camelcase naming and .py suffix. ;-)

also, i wouldn't outsource 45 lines of code to create a dependency, but rather put it in an utils module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also:

Requirements
Python 3.

Which is, obviously, a deal-breaker for inclusion here anyway :(

Python 2 is dead, stop raping its corpse.

Unfortunately, I see the exact opposite :(

Copy link

@funkyfuture funkyfuture left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my 2 cents.

@KOLANICH KOLANICH force-pushed the __repr__ branch 3 times, most recently from 8c117dd to 9d712b1 Compare March 21, 2018 11:56
@arekbulski
Copy link
Member

I dont even understand how this code works, nor why its so lengthy. Also its not providing different str and repr versions. I would suggest considering using Construct as reference:

@arekbulski
Copy link
Member

I appreciate the initiative, but honestly I dont like the implementation. If you would be willing to wait, then I will offer alternative implementation based on Construct.

Copy link
Member

@arekbulski arekbulski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont like this implementation overall, but I dont have any specific points to make.

@KOLANICH
Copy link
Contributor Author

There were some errors, I have fixed them.

@KOLANICH KOLANICH force-pushed the __repr__ branch 4 times, most recently from 8192392 to d4aee68 Compare March 22, 2018 19:49
@KOLANICH
Copy link
Contributor Author

KOLANICH commented Mar 29, 2018

@GreyCat @arekbulski I was advised with a good replacement for a lock and truncation which are already a part of python stdlib :)

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Mar 29, 2018

reprlib is missing on python 2, but I guess may be backported and put into pip, the name is free https://pypi.python.org/pypi/reprlib

@arekbulski
Copy link
Member

I have issues with this implementation:

  • reprlib is Python 3 only but the runtime is supposed to be Python 2/3
  • your code uses different indentation than rest of the script
  • I dont particularly like the usage of dir, we could ask @GreyCat to change the compiler so that generated structs have something similar to __slots__, so you would just iterate that.
  • it evalutates unevaluated (lazy) instances? this could lead to failure.

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Mar 30, 2018

reprlib is Python 3 only but the runtime is supposed to be Python 2/3

Again: we can try to backport it and put into pip.

your code uses different indentation than rest of the script

fixed. Thank you for bringing it up.

I dont particularly like the usage of dir, we could ask @GreyCat to change the compiler so that generated structs have something similar to slots, so you would just iterate that.

__slots__ cannot be properties

it evalutates unevaluated (lazy) instances? this could lead to failure.

It does, it could. But reprlib provides recursion lock and truncation, so I guess there will be no infinite recursions and rendering too much values of sequences. About reading large files into memory which can cause consuming all the memory in machine - I guess we should map them, not cache them and use references to them instead of copies (I wonder if it is possible in python), this would lead into consuming only a single page. Maybe we need a volatile hint to optimize this.

@arekbulski
Copy link
Member

__slots__ cannot be properties

But thats not what I said or meant. We could have the compiler generate KaitaiStruct-s that have something similar to slots, a list of names, to iterate that instead of filtering dir:

class Enum0(KaitaiStruct):
    __slotsalike__ = ["pet_1","pet_2"]
    def _read(self):
        self.pet_1 = self._root.Animal(self._io.read_u4le())
        self.pet_2 = self._root.Animal(self._io.read_u4le())

then

-        for k in dir(self)
-        if k[0] != "_" and not hasattr(KaitaiStruct, k) and not isinstance(getattr(self, k), type)
+    for k in __slotsalike__: 
+    yield getattr(self, k)

This would not only have better performance, if you care about that in this case that is, but it might also be useful to end users as they might have a use for a list of field names.

@arekbulski
Copy link
Member

But reprlib provides recursion lock and truncation,

That does not address the issue I meant. You could have a pointer instance that refers to outside of stream size, which will cause failure if evaluated. Its not about the size of the output.

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Mar 30, 2018

You could have a pointer instance that refers to outside of stream size, which will cause failure if evaluated. Its not about the size of the output.

Out of scope - its responsibility of a ksy developer to have all the needed ifs.

But thats not what I said or meant. We could have the compiler generate KaitaiStruct-s that have something similar to slots, a list of names, to iterate that instead of filtering dir:

I agree. I guess we need non-volatile fields in _fields, properties in _props and everything possible in __slots__. And tuples should be used, not lists.

And I guess that collection of some of this stuff can be done in runtime without any changes to compiler.

@arekbulski
Copy link
Member

@GreyCat Could you share your position with us? How about we have the compiler attach a list of IDs to each generated Struct, similar to slots?

@GreyCat
Copy link
Member

GreyCat commented Apr 1, 2018

Compiler actually already does something similar when doing --debug, but not for Python. I have no idea what you guys mean by "non-volatile fields in _fields, properties in _props and everything possible in __slots__", i.e. what is "non-volatile fields", "properties", "everything possible".

We generate an array called something like "seq_fields", i.e.:

Could you explain, what's wrong with current solution - i.e. is it's reflection slow, unreliable, or something else is the matter? As far as I have guessed, __slots__ is some special array in Python?

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Apr 1, 2018

I have no idea what you guys mean by "non-volatile fields in _fields, properties in _props and everything possible in __slots__", i.e. what is "non-volatile fields", "properties", "everything possible".

Non-volatile fields are the fields (members in seq) without volatile hint. _props is a tuple of names of everything implemented as @propertyes: instances, volatile fields, etc. __slots__ is a special collection listing the names of dumb (non-@property members), if it is present, only the names listed in it are allowed (unless you have some magic method overloading it), __dict__ is unavailable and it is claimed that memory footprint is reduced.

is it's reflection slow, unreliable, or something else is the matter

we process strings to determine if a name to be included, it's ugly, and may be slow

One more considiration: I guess we'll have inheritance somewhen, so the thing should be compatible to it without much overhead.

@arekbulski
Copy link
Member

SEQ_FIELDS is exactly what I am talking about. The only issue with it is that it should be included even without debug mode and available in Python.

On Kolanich behalf: non-volatile is just a synonym for cached (like instances are cached).

@z3ntu
Copy link

z3ntu commented Nov 4, 2020

Hi, what's the state of this PR? I'm interest in this as well as the default string representation is not really useful at the moment <MyBinaryType object at 0xb18605b0> (and I wanted to spare myself writing it myself for a quick prototype 😉)

@KOLANICH
Copy link
Contributor Author

KOLANICH commented Nov 5, 2020

The status of this PR is that I personally use it, but in a form of a separate branch on top of master. On large files it may cause problems, since text is generated first, then is output, we don't evaluate size of the text that will be generated before generating it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants