Add memory bounded datumaro data format detect to release 1.5.1 #1241


vinnamkim
Contributor

Summary

How to test

Already tested in the previous PRs.

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added the description of my changes into CHANGELOG.
  • I have updated the documentation accordingly

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

@vinnamkim vinnamkim added the ENHANCE Enhancement of existing features label Jan 9, 2024
@vinnamkim vinnamkim added this to the 1.5.1 milestone Jan 9, 2024
@vinnamkim vinnamkim marked this pull request as ready for review January 9, 2024 07:38
@vinnamkim vinnamkim requested review from a team as code owners January 9, 2024 07:38
@vinnamkim vinnamkim requested review from wonjuleee and removed request for a team January 9, 2024 07:38
- Ticket no. 127135 and 127136.
- Develop `JsonSectionPageMapper` to construct page maps for top-level
sections in a given JSON file.
- Enhance `DatumaroImporter.detect()`'s performance by replacing JSON
file parsing logic with the `JsonSectionPageMapper`.
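
To make the idea concrete, here is a minimal pure-Python sketch of mapping the top-level sections of a JSON object to byte ranges without building the parsed document. It is not the PR's implementation: `map_top_level_sections`, `chunk_size`, and the sample section names are hypothetical, and reading "page map" as an (offset, size) record per section is an assumption.

```python
import io
import json
from typing import BinaryIO, Dict, Tuple

QUOTE, BACKSLASH, COLON, COMMA = ord('"'), ord("\\"), ord(":"), ord(",")
OPEN = (ord("{"), ord("["))
CLOSE = (ord("}"), ord("]"))
WHITESPACE = tuple(b" \t\r\n")


def map_top_level_sections(
    stream: BinaryIO, chunk_size: int = 1 << 16
) -> Dict[str, Tuple[int, int]]:
    """Return {key: (offset, size)} for each value of a top-level JSON object.

    Assumes well-formed JSON whose root is an object. The file is scanned in
    fixed-size chunks, so memory use does not grow with the file size.
    """
    sections: Dict[str, Tuple[int, int]] = {}
    depth = 0                 # nesting depth of {} and []
    in_string = escaped = False
    key_bytes = None          # raw bytes of the top-level key being read
    key = None                # decoded key whose value is being scanned
    start = end = None        # byte range of the current top-level value
    pos = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for b in chunk:
            if in_string:
                if key_bytes is not None:
                    key_bytes.append(b)
                if escaped:
                    escaped = False
                elif b == BACKSLASH:
                    escaped = True
                elif b == QUOTE:
                    in_string = False
                    if key_bytes is not None:
                        key = json.loads(key_bytes.decode("utf-8"))
                        key_bytes = None
                if start is not None:
                    end = pos + 1
            elif b in WHITESPACE:
                pass
            elif depth == 1 and key is None and b == QUOTE:
                # Opening quote of a top-level key.
                in_string, key_bytes = True, bytearray(b'"')
            elif depth == 1 and start is None and b == COLON:
                pass  # separator between a top-level key and its value
            elif depth == 1 and start is not None and b in (COMMA, CLOSE[0]):
                # The current top-level value has ended; record its range.
                sections[key] = (start, end - start)
                key = start = end = None
                if b == CLOSE[0]:
                    depth -= 1
            else:
                # A byte of a value (or the root '{').
                if depth == 1 and key is not None and start is None:
                    start = pos
                if b == QUOTE:
                    in_string = True
                elif b in OPEN:
                    depth += 1
                elif b in CLOSE:
                    depth -= 1
                if start is not None:
                    end = pos + 1
            pos += 1
    return sections


# Usage: locate one section, then parse only that slice.
sample = json.dumps({"infos": {}, "items": [{"id": "0"}]}).encode()
page_map = map_top_level_sections(io.BytesIO(sample))
offset, size = page_map["items"]
items = json.loads(sample[offset : offset + size])
```

A byte-by-byte loop in Python is slow, so this only illustrates the technique: a format check can look at the section names and parse a small section, instead of loading the whole file.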

Our existing tests validate its functionality. For a performance
comparison, see the following.

- Before
```python
from time import perf_counter

import datumaro as dm

# Time a single format-detection call on the test dataset.
start = perf_counter()
detected_format = dm.Dataset.detect("ws_test/coco/datumaro")
dt = 1000.0 * (perf_counter() - start)
print(f"Duration for detecting Datumaro data format: {dt:.1f}ms, format={detected_format}")
```

```console
Duration for detecting Datumaro data format: 25784.5ms, format=datumaro
```

- After
```python
from time import perf_counter

import datumaro as dm

# Same measurement, run on the branch that uses JsonSectionPageMapper.
start = perf_counter()
detected_format = dm.Dataset.detect("ws_test/coco/datumaro")
dt = 1000.0 * (perf_counter() - start)
print(f"Duration for detecting Datumaro data format: {dt:.1f}ms, format={detected_format}")
```

```console
Duration for detecting Datumaro data format: 17234.7ms, format=datumaro
```

It saves ~8.5 secs (25784.5ms -> 17234.7ms).
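
As a quick check on the two timings above, the saving and the speedup can be computed directly. `summarize` is a hypothetical helper written for this note, not part of Datumaro.

```python
def summarize(before_ms: float, after_ms: float) -> str:
    """Describe the time saved and the speedup between two measurements."""
    saved_s = (before_ms - after_ms) / 1000.0
    speedup = before_ms / after_ms
    return f"saved {saved_s:.1f} s, speedup {speedup:.2f}x"


# Timings copied from the console output above.
print(summarize(25784.5, 17234.7))
```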

- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [x] I have added the description of my changes into
[CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the
[documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs)
accordingly

- [x] I submit _my code changes_ under the same [MIT
License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE)
that covers the project.
  Feel free to contact the maintainers if that's a concern.
- [x] I have updated the license header for each file (see an example
below).

```python
```

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
…mant (openvinotoolkit#1229)

- Ticket no. 127136

Refer to openvinotoolkit#1224 for
details on how we obtained the following results.

1. Performance

- Before

```console
Duration for detecting Datumaro data format: 25784.5ms, format=datumaro
```
- After

```console
Duration for detecting Datumaro data format: 5966.8ms, format=datumaro
```

2. Memory usage
- Before

![before](https://github.com/openvinotoolkit/datumaro/assets/26541465/9f6432f7-108d-4d9f-a535-f954bfd55f02)
- After

![after](https://github.com/openvinotoolkit/datumaro/assets/26541465/8ff7a1a4-6106-46cc-9f16-74a4979b8a3b)
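
The tooling behind the plots above is described in openvinotoolkit#1224 and is not reproduced here. As a rough stand-in, the sketch below records wall-clock time and peak Python-heap usage for a single call using only the standard library. `measure` is a hypothetical helper, and `tracemalloc` sees only allocations made through Python's allocator, so memory used by native extensions will not appear in its numbers.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure(fn: Callable[[], Any]) -> Tuple[Any, float, int]:
    """Run fn() once; return (result, elapsed milliseconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn()
    finally:
        elapsed_ms = 1000.0 * (time.perf_counter() - start)
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed_ms, peak_bytes


# Example with a stand-in workload; for the case in this PR one would pass
# something like: lambda: dm.Dataset.detect("ws_test/coco/datumaro")
result, elapsed_ms, peak_bytes = measure(lambda: [0] * 100_000)
print(f"{elapsed_ms:.1f}ms, peak={peak_bytes / 1024:.0f}KiB")
```

Tracing adds overhead, so timings taken this way are best compared only against other runs measured the same way.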

- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [x] I have added the description of my changes into
[CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the
[documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs)
accordingly

- [x] I submit _my code changes_ under the same [MIT
License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE)
that covers the project.
  Feel free to contact the maintainers if that's a concern.
- [x] I have updated the license header for each file (see an example
below).

```python
```

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

codecov bot commented Jan 9, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (375d184) 79.97% compared to head (8cbff63) 79.97%.

| Files | Patch % | Lines |
|---|---|---|
| .../plugins/data_formats/segment_anything/importer.py | 68.42% | 3 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@               Coverage Diff               @@
##           releases/1.5.0    #1241   +/-   ##
===============================================
  Coverage           79.97%   79.97%           
===============================================
  Files                 265      265           
  Lines               29705    29732   +27     
  Branches             5831     5833    +2     
===============================================
+ Hits                23756    23778   +22     
- Misses               4617     4619    +2     
- Partials             1332     1335    +3     

| Flag | Coverage Δ |
|---|---|
| macos-11_Python-3.8 | ? |
| ubuntu-20.04_Python-3.8 | 79.95% <84.21%> (-0.01%) ⬇️ |
| windows-2022_Python-3.8 | 79.94% <81.57%> (+<0.01%) ⬆️ |


@vinnamkim vinnamkim merged commit e426036 into openvinotoolkit:releases/1.5.0 Jan 11, 2024
6 of 7 checks passed
@yunchu yunchu modified the milestones: 1.5.1, 1.5.2 Jan 26, 2024