Save box attributes when downloading the TFDS version of VOC #668
`distutils` has been deprecated in Python 3.10. Fortunately, the only name we're using from there is `strtobool`, and conveniently, `attrs` has recently added a function with the same semantics and the same sets of true and false values. So just use that instead. Note that `strtobool` is still used in `setup.py`, and we can't use `attrs` there. I'm hoping that setuptools will eventually provide a replacement.
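For reference, the semantics being preserved can be sketched without either library. The `attrs` function referred to is presumably `attrs.converters.to_bool`; the helper below, `str_to_bool`, is a hypothetical stand-in that only illustrates the accepted true and false strings, and it is not the code used in this PR.

```python
# Hypothetical stand-in illustrating the strtobool-style semantics.
# The accepted values mirror those of distutils.util.strtobool.
_TRUE_VALUES = {"y", "yes", "t", "true", "on", "1"}
_FALSE_VALUES = {"n", "no", "f", "false", "off", "0"}


def str_to_bool(value: str) -> bool:
    """Convert a human-friendly truth string to a bool (case-insensitive)."""
    normalized = value.lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    # Anything outside the two sets is rejected rather than guessed at.
    raise ValueError(f"invalid truth value {value!r}")
```

One difference worth keeping in mind: `distutils.util.strtobool` returns the integers `1` and `0`, while this sketch returns real `bool` values.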
The main nuisance with this is that Datumaro represents the pose as a string, while TFDS represents it as an index, and to convert the latter into the former, we need access to the list of poses. To implement this, allow adapters access to a persistent state object that can be used to store and retrieve arbitrary data during conversion. We can then save the list of names during category creation and retrieve it when we convert box annotations.
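A minimal sketch of the idea, assuming the state object is a plain attribute bag. The names `create_categories` and `convert_box`, and the dictionary shape of the TFDS box, are hypothetical and chosen only for illustration; the actual adapter code in this PR may be structured differently.

```python
from types import SimpleNamespace


def create_categories(state, pose_feature_names):
    # Category creation runs before any annotation is converted,
    # so this is where the list of pose names gets saved.
    state.pose_names = list(pose_feature_names)


def convert_box(state, tfds_box):
    # TFDS stores the pose as an index into the feature's name list;
    # Datumaro wants the name itself, so look it up in the saved list.
    return {"pose": state.pose_names[tfds_box["pose"]]}


# One state object lives for the duration of a single conversion.
state = SimpleNamespace()
create_categories(state, ["frontal", "left", "rear"])
box_attributes = convert_box(state, {"pose": 1})
```

Passing the state explicitly keeps the conversion functions free of module-level globals, so two conversions running side by side do not share data.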
@frozen
class _AttributeMemberMapping:
    member_name: str
    attribute_name: str = field()
Suggested change:
- attribute_name: str = field()
+ attribute_name: str
I need the field object to be able to call `attribute_name.default`.
_AttributeMemberMapping('is_difficult', 'difficult'),
_AttributeMemberMapping('is_truncated', 'truncated'),
_AttributeMemberMapping('pose',
    value_converter=lambda idx, state: state.pose_names[idx]),
This probably needs to be aligned with https://github.com/openvinotoolkit/datumaro/blob/develop/datumaro/plugins/voc_format/format.py#L50
The only difference was the case of the string. I changed the adapter to convert the pose names to title case to better match the original dataset.
TFDS returns the pose names in lower case, but they are capitalized in the original dataset. Convert them to title case so that the output of the extractor better resembles the original dataset. Remove `_SaveFeatureClassList`, since it's not useful anymore (and it's unclear whether it'll ever be useful again).
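The case conversion itself is small. A sketch, assuming the names arrive as a list of lower-case strings; `normalize_pose_names` is a hypothetical helper name, and the sample values are illustrative:

```python
def normalize_pose_names(tfds_names):
    # str.title() capitalizes the first letter of each word,
    # which turns "frontal" into "Frontal".
    return [name.title() for name in tfds_names]


pose_names = normalize_pose_names(["frontal", "left", "unspecified"])
```

Doing this once, when the names are saved, means every later index lookup already yields the title-case form.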