Create cvat_sdk.datasets, a framework-agnostic version of cvat_sdk.pytorch #6428

Merged
merged 4 commits into from
Jul 18, 2023

@SpecLad SpecLad commented Jul 5, 2023

Motivation and context

The new TaskDataset class provides conveniences like per-frame annotations, bulk data downloading, and caching without forcing a dependency on PyTorch or requiring awkward conformance to the PyTorch dataset interface. It also provides a few extra niceties, like easy access to labels and original frame numbers.
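
To illustrate what "per-frame annotations" means here, the following is a minimal, dependency-free sketch, not the SDK's implementation. It assumes annotations arrive as a flat list in which every shape carries its frame number; `Shape` and `group_by_frame` are hypothetical names invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Hypothetical stand-in for a shape annotation tied to one frame."""
    frame: int
    label_id: int
    points: tuple


def group_by_frame(shapes, frame_count):
    """Return one list of shapes per frame; frames without shapes get []."""
    per_frame = defaultdict(list)
    for shape in shapes:
        per_frame[shape.frame].append(shape)
    return [per_frame[i] for i in range(frame_count)]


# A flat annotation list, as a task-level export might provide it.
shapes = [
    Shape(frame=0, label_id=1, points=(0.0, 0.0, 10.0, 10.0)),
    Shape(frame=2, label_id=2, points=(5.0, 5.0, 8.0, 8.0)),
    Shape(frame=0, label_id=2, points=(1.0, 1.0, 4.0, 4.0)),
]
frames = group_by_frame(shapes, frame_count=3)
```

With this layout, a consumer can iterate over frames and find each frame's annotations in place, rather than filtering the flat list once per frame.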

Note that it's called TaskDataset rather than TaskVisionDataset, as my plan is to keep it domain-agnostic. The MediaElement class is extensible, and we can add, for example, support for point clouds, by adding another load_* method.
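
The extension idea could look roughly like the sketch below. It is an illustration only: `load_point_cloud`, `InMemoryImage`, and `InMemoryPointCloud` are hypothetical, and `load_image` returns raw bytes here purely to keep the example free of third-party dependencies.

```python
from abc import ABC, abstractmethod


class MediaElement(ABC):
    """Illustrative interface: each load_* method decodes one media type."""

    @abstractmethod
    def load_image(self) -> bytes:
        """Return the element as image data."""

    def load_point_cloud(self) -> list:
        # Hypothetical extension point. Elements that are not point
        # clouds keep this default and fail loudly when asked.
        raise NotImplementedError("this element is not a point cloud")


class InMemoryImage(MediaElement):
    def __init__(self, data: bytes):
        self._data = data

    def load_image(self) -> bytes:
        return self._data


class InMemoryPointCloud(MediaElement):
    def __init__(self, points):
        self._points = list(points)

    def load_image(self) -> bytes:
        raise NotImplementedError("this element is not an image")

    def load_point_cloud(self) -> list:
        return self._points
```

Existing image-only consumers keep calling `load_image` unchanged; only code that knows it is dealing with point clouds calls the new method.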

There is currently no ProjectDataset equivalent, although one could (and probably should) be added later. If we add one, we should probably also add a task_id field to Sample.
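
As a rough sketch of why a task_id field would help, assume a project-level dataset simply concatenates the samples of its tasks. The `Sample` below is a simplified stand-in (its field names are assumptions), and `merge_task_samples` is a hypothetical helper, not part of this PR.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """Simplified stand-in for a dataset sample."""
    frame_index: int
    frame_name: str
    task_id: Optional[int] = None  # hypothetical field discussed above


def merge_task_samples(samples_by_task):
    """Concatenate per-task samples, stamping each with its task ID."""
    merged = []
    for task_id, samples in sorted(samples_by_task.items()):
        for sample in samples:
            merged.append(replace(sample, task_id=task_id))
    return merged


merged = merge_task_samples({
    7: [Sample(0, "a.jpg")],
    3: [Sample(0, "b.jpg"), Sample(1, "c.jpg")],
})
```

Frame indexes restart at 0 in every task, so without the task ID two samples from different tasks could not be told apart after merging.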

How has this been tested?

SDK unit tests.

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

codecov bot commented Jul 7, 2023

Codecov Report

Merging #6428 (415233a) into develop (bf3e31d) will increase coverage by 0.03%.
The diff coverage is 97.32%.

@@             Coverage Diff             @@
##           develop    #6428      +/-   ##
===========================================
+ Coverage    81.70%   81.73%   +0.03%     
===========================================
  Files          334      337       +3     
  Lines        38446    38501      +55     
  Branches      3540     3540              
===========================================
+ Hits         31411    31470      +59     
+ Misses        7035     7031       -4     

Components     Coverage Δ
cvat-ui        75.16% <ø> (+0.03%) ⬆️
cvat-server    87.82% <97.32%> (+0.01%) ⬆️

@nmanovic nmanovic merged commit bc5036f into cvat-ai:develop Jul 18, 2023
34 checks passed
@SpecLad SpecLad deleted the sdk-datasets branch July 20, 2023 10:13
@azhavoro azhavoro mentioned this pull request Jul 27, 2023
bsekachev added a commit that referenced this pull request Jul 27, 2023
## \[2.5.2\] - 2023-07-27

### Added

- We've added support for multi-line text attributes
(<#6458>)
- You can now set a default attribute value for the SELECT and RADIO
types in the UI
  (<#6474>)
- \[SDK\] `cvat_sdk.datasets` is now available, providing a
framework-agnostic alternative to `cvat_sdk.pytorch`
  (<#6428>)
- We've introduced analytics for Jobs, Tasks, and Projects
(<#6371>)

### Changed

- \[Helm\] In Helm, we've added a configurable default storage option to
the chart (<#6137>)

### Removed

- \[Helm\] In Helm, we've eliminated the obligatory use of hardcoded
traefik ingress (<#6137>)

### Fixed

- Fixed an issue with calculating the number of objects on the
annotation view when frames are deleted
  (<#6493>)
- \[SDK\] In SDK, we've fixed the issue with creating attributes with
blank default values
  (<#6454>)
- \[SDK\] We've corrected a problem in SDK where it was altering input
data in models (<#6455>)
- Fixed exporting of hash for shapes and tags in a specific corner case
(<#6517>)
- Resolved the issue where 3D jobs couldn't be opened in validation mode
(<#6507>)
- Fixed SAM plugin (403 code for workers in organizations)
(<#6514>)
- Fixed the issue where the initial frame from the query parameter did
not open the specified frame in a job
  (<#6506>)
- Corrected the issue with the removal of the first keyframe
(<#6494>)
- Fixed the display of project previews on small screens and updated
stylelint & rules (<#6551>)
- Implemented server-side validation for attribute specifications
  (<#6447>)
- \[API\] Fixed API issue related to file downloading failures for
filenames with special characters
(<#6492>)
- \[Helm\] In Helm, we've resolved an issue with multiple caches
in the same RWX volume, which was preventing db migration from starting
(<#6137>)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Anastasia Yasakova <yasakova.an@gmail.com>
Co-authored-by: yasakova-anastasia <anastasia@cvat.ai>
Co-authored-by: Roman Donchenko <roman@cvat.ai>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Boris Sekachev <boris.sekachev@yandex.ru>
Co-authored-by: Maxim Zhiltsov <zhiltsov.max35@gmail.com>
Co-authored-by: Kirill Sizov <kirill.sizov@cvat.ai>
Co-authored-by: Nikita Manovich <nikita@cvat.ai>
Co-authored-by: Mariia Acoca <39969264+mdacoca@users.noreply.github.com>
Co-authored-by: Kirill Lakhov <kirill.9992@gmail.com>
Co-authored-by: Michael Kirpichev <mkirpic+github@gmail.com>
Co-authored-by: Michael Kirpichev <m.kirpichev@haut.ai>
Co-authored-by: Boris Sekachev <boris@cvat.ai>
PMazarovich pushed a commit to PMazarovich/cvat that referenced this pull request Aug 15, 2023
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Oct 25, 2023
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Oct 25, 2023