Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE-BRANCH] ImageField: add support to new fields of type image #5279

Merged
merged 43 commits into from
Sep 3, 2024

Conversation

jfcalvo
Copy link
Member

@jfcalvo jfcalvo commented Jul 19, 2024

Description

This PR is the first one related with the addition of a new field of type image including the following changes:

  • Changes in the backend to support a new image field type supporting URLs and Data URLs.
  • Changes in the frontend rendering this new image field for records.
  • Changes in the SDK to define datasets with image fields and create records specifying images.
  • Changes in the documentation with examples on how to use the SDK with this new type of field.

This new type will have the following characteristics:

  • A new filed type image is supported (without additional settings).
  • Records created with fields of type image are validated to have correct Web URLs or data URLs like:
    • https://argilla.io/image.jpeg:
      • HTTPS schema is present
      • A host and a path is present.
      • The URL is not longer than 2038 characters.
    • http://argilla.io/image.jpeg:
      • HTTP schema is present
      • A host and a path is present.
      • The URL is not longer than 2038 characters.
    • data:image/webp;base64,UklGRhgCAABXRUJQVlA4WA:
      • data schema is present
      • A path is present.
      • The Data URL is not longer than 5 million characters.
      • A valid MIME type is present.

Closes #5276

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested

  • Adding new tests to our suite checking this new field.

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@jfcalvo jfcalvo requested a review from frascuchon July 19, 2024 14:37
Copy link

codecov bot commented Jul 19, 2024

Codecov Report

Attention: Patch coverage is 96.10390% with 3 lines in your changes missing coverage. Please review.

Project coverage is 91.12%. Comparing base (745e57b) to head (452e0b4).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
...la-server/src/argilla_server/validators/records.py 94.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5279      +/-   ##
===========================================
- Coverage    91.20%   91.12%   -0.09%     
===========================================
  Files          139      139              
  Lines         5745     5816      +71     
===========================================
+ Hits          5240     5300      +60     
- Misses         505      516      +11     
Flag Coverage Δ
argilla-server 91.12% <96.10%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jfcalvo jfcalvo linked an issue Jul 19, 2024 that may be closed by this pull request
@frascuchon frascuchon added this to the v2.1.0 milestone Jul 29, 2024
jfcalvo and others added 5 commits July 31, 2024 10:35
…ine (#5355)

# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

This PR adds support to indexing datasets and records with image fields.

Note: A datasets reindex is required with this change.

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- New feature (non-breaking change which adds functionality)

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

This PR adds support to work with image fields using the SDK. 

## TODO (as separate PRs)

- [ ] Add documentation section
- [ ] Verify behavior using from/to_hub methods
- [ ] Add tooling to read data URLs from image files/folders (based on
logic defined in `image_to_html`)
- [ ] Add some media type validation (supported image types) 
- [ ] [Optional] Add tooling to re-scale images. The lower the image
sizes, the better the UI work. We can define a transparent re-scaling
process to support large images (currently the limit size in the backend
is 5MB for data URL)


**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- New feature (non-breaking change which adds functionality)

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: José Francisco Calvo <jose@argilla.io>
@burtenshaw burtenshaw changed the base branch from feat/image-field to develop August 6, 2024 10:34
@burtenshaw burtenshaw self-assigned this Aug 6, 2024
@burtenshaw burtenshaw changed the title feat: add support to new fields of type image [FEATURE] ImageField: add support to new fields of type image Aug 6, 2024
Copy link

Docs for this PR have been deployed hidden from versioning: https://docs.argilla.io/docs_5405-docs-tutorial-on-the-usage-of-image-fields

burtenshaw and others added 3 commits August 14, 2024 11:08
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

This PR changes the mapper to identify fields of all types (not just
text).

Closes #5406 

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

Closes #<issue_number>

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
@jfcalvo jfcalvo changed the title [FEATURE] ImageField: add support to new fields of type image [FEATURE-BRANCH] ImageField: add support to new fields of type image Aug 26, 2024
argilla/CHANGELOG.md Outdated Show resolved Hide resolved
)
```

1. The image can be referenced as either a remote url or a data uri.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an example using data URI would be great @sdiazlor

Copy link
Member

@frascuchon frascuchon Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@burtenshaw maybe we can include here the image_to_data_uri (or whatever) util function usage once you have implemented it

fields={"image": "https://example.com/image.jpg"},
),
rg.Record(
fields={"image": "data:image/png;base64,iV..."}, # (1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here about the imate-to-data-uri function

frascuchon and others added 7 commits August 27, 2024 15:43
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->
This PR introduces a utility into the SDK so that it automatically casts
PIL objects to base64 based on the datasets `Features`

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

Closes #5405 

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: burtenshaw <ben@argilla.io>
Co-authored-by: Paco Aranda <frascuchon@gmail.com>
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@frascuchon frascuchon merged commit 0f05270 into develop Sep 3, 2024
3 of 8 checks passed
@frascuchon frascuchon deleted the feat/add-image-field-support branch September 3, 2024 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[TASK] Add support for ImageField at argilla-server
6 participants