Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent special whitespaces in the names of entities #2665

Merged
merged 9 commits into from
May 7, 2024

Conversation

avishniakov
Copy link
Contributor

@avishniakov avishniakov commented May 3, 2024

Describe changes

I added masking using the banned set of symbols (only special characters \t\n\r\v\f are considered banned with this PR) for tags, models, artifacts, stacks, and stack components.
This impacts only the creation of new entities - old ones with invalid names will remain to work, as before, but new ones cannot be created using banned symbols.

Original motivation: symbols like \n would be allowed in API calls, but it breaks the Dashboard subsequently.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to docs, I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
  • If my changes require changes to the dashboard, these changes are communicated/requested.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced name validation across various components and requests to ensure names contain only allowed characters.
  • Bug Fixes

    • Implemented additional name validation logic in artifact, component, model, model version, stack, and tag requests to prevent invalid name entries.
  • Tests

    • Added new tests to verify that invalid names are correctly handled and trigger exceptions in various system components.

@github-actions github-actions bot added internal To filter out internal PRs and issues bug Something isn't working labels May 3, 2024
Copy link
Contributor

coderabbitai bot commented May 3, 2024

Important

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The recent updates across the ZenML codebase primarily focus on enhancing name validation for various entities. A new constant and class (NameValidatedModel) have been introduced to enforce name standards, affecting multiple request classes across different modules. This ensures that names conform to predefined character restrictions, enhancing data integrity and consistency within the system.

Changes

Files Affected Summary of Changes
constants.py, utils/string_utils.py Introduced ALLOWED_NAME_CHARACTERS and NameValidatedModel for name validation.
models/v2/core/.../artifact.py, component.py, model.py, model_version.py, stack.py, tag.py Extended various classes to inherit NameValidatedModel for enhanced name validation.
zen_stores/sql_zen_store.py Added validate_name calls for various parameters in different methods.
tests/integration/functional/zen_stores/test_zen_store.py Added tests for invalid name handling in different scenarios.
scripts/test-coverage-xml.sh Modified splitting algorithm parameter in pytest commands.

🐇✨📝
In the land of code where rabbits hop,
A new rule for names, oh what a swap!
Only letters, digits, they must contain,
To keep the chaos at bay, maintain.
Hopping through tests, with glee and might,
Ensuring every name is just right! 🌟🐰


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@strickvl
Copy link
Contributor

strickvl commented May 3, 2024

@coderabbitai review

Copy link
Contributor

@strickvl strickvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just nits. Otherwise looks good to me. I look to @bcdurak for the Pydantic perspective, however.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Out of diff range and nitpick comments (31)
tests/integration/functional/zen_stores/test_zen_store.py (31)

Line range hint 1-1: Add a module-level docstring to describe the purpose and scope of this test module.


Line range hint 148-148: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 383-383: Add a docstring to the TestAdminUser class to explain its purpose and functionality.


Line range hint 671-674: Ensure there is a blank line between the summary line and the description in the docstring for better readability.


Line range hint 712-715: Ensure there is a blank line between the summary line and the description in the docstring for better readability.


Line range hint 3029-3029: The first line of the docstring should end with a period, question mark, or exclamation point.


Line range hint 3291-3291: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 3723-3723: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 3774-3774: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 3802-3802: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 3843-3843: Remove the extra blank line after the function docstring to adhere to PEP 257.


Line range hint 4043-4043: Add a docstring to the TestModel class to explain its purpose and functionality.


Line range hint 4646-4646: Add a docstring to the TestTag class to explain its purpose and functionality.


Line range hint 4647-4647: Add a docstring to the test_create_pass method to explain its purpose and functionality.


Line range hint 4663-4663: Add a docstring to the test_create_bad_input method to explain its purpose and functionality.


Line range hint 4719-4719: Add a docstring to the test_create_duplicate method to explain its purpose and functionality.


Line range hint 4753-4753: Add a docstring to the test_get_tag_found method to explain its purpose and functionality.


Line range hint 4779-4779: Add a docstring to the test_get_tag_not_found method to explain its purpose and functionality.


Line range hint 4805-4805: Add a docstring to the test_list_tags method to explain its purpose and functionality.


Line range hint 4814-4814: Add a docstring to the test_update_tag method to explain its purpose and functionality.


Line range hint 4824-4824: Add a docstring to the test_create_tag_resource_pass method to explain its purpose and functionality.


Line range hint 4917-4917: Add a docstring to the TestTagResource class to explain its purpose and functionality.


Line range hint 4918-4918: Add a docstring to the test_create_tag_resource_pass method to explain its purpose and functionality.


Line range hint 4961-4961: Add a docstring to the test_create_tag_resource_fails_on_duplicate method to explain its purpose and functionality.


Line range hint 4988-4988: Add a docstring to the test_delete_tag_resource_pass method to explain its purpose and functionality.


Line range hint 4996-4996: Add a docstring to the test_delete_tag_resource_mismatch method to explain its purpose and functionality.


Line range hint 5006-5006: Add a docstring to the test_cascade_deletion method to explain its purpose and functionality.


Line range hint 5063-5063: Add a docstring to the TestRunMetadata class to explain its purpose and functionality.


Line range hint 5142-5142: Add a docstring to the TestModelVersionArtifactLinks class to explain its purpose and functionality.


Line range hint 5281-5281: Add a docstring to the TestModelVersionPipelineRunLinks class to explain its purpose and functionality.


Line range hint 5287-5287: Add a docstring to the test_link_create_pass method to explain its purpose and functionality.

src/zenml/utils/string_utils.py Outdated Show resolved Hide resolved
@avishniakov avishniakov changed the title Mask names of entities with allowed characters Prevent special whitespaces in the names of entities May 3, 2024
@avishniakov
Copy link
Contributor Author

Just nits. Otherwise looks good to me. I look to @bcdurak for the Pydantic perspective, however.

I made it in a bit more functional way, so no impact on Pydantic.

@avishniakov avishniakov requested a review from strickvl May 3, 2024 10:31
@avishniakov
Copy link
Contributor Author

@coderabbitai review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Out of diff range and nitpick comments (30)
tests/integration/functional/test_client.py (30)

Line range hint 1-1: Add a module-level docstring to describe the purpose and scope of this test module.


Line range hint 557-557: The docstring should end with a period.

-    """Test creating run metadata linked only to a pipeline run."""
+    """Test creating run metadata linked only to a pipeline run. """

Line range hint 635-635: The docstring should end with a period.

-    """Test creating metadata linked to a pipeline run and a stack component"""
+    """Test creating metadata linked to a pipeline run and a stack component. """

Line range hint 846-846: Remove the blank line after the function docstring.

-    """Test that artifact pruning works."""
-
+    """Test that artifact pruning works."""

Line range hint 991-991: Add a docstring to the TestArtifact class to describe its purpose.


Line range hint 1179-1181: Add a blank line between the summary line and the description in the docstring.

-    """Test that model version name is mutable."""
+    """Test that model version name is mutable.
+

Line range hint 1195-1195: Add a docstring to the TestModelVersion class to describe its purpose.


Line range hint 1309-1309: Add a docstring to the TestModelVersion class to describe its purpose.


Line range hint 1314-1314: Add a docstring to the test_get_model_version_by_name_found method.


Line range hint 1328-1328: Add a docstring to the test_get_model_version_by_id_found method.


Line range hint 1341-1341: Add a docstring to the test_get_model_version_by_index_found method.


Line range hint 1345-1345: Add a docstring to the test_get_model_version_by_stage_found method.


Line range hint 1374-1374: Add a docstring to the test_get_model_version_by_stage_not_found method.


Line range hint 1378-1378: Add a docstring to the test_get_model_version_not_found method.


Line range hint 1384-1384: Add a docstring to the test_create_model_version_pass method.


Line range hint 1388-1388: Add a docstring to the test_create_model_version_duplicate_fails method.


Line range hint 1481-1481: Add a docstring to the TestModelVersion class to describe its purpose.


Line range hint 1488-1488: Add a docstring to the test_get_model_version_by_name_found method.


Line range hint 1497-1497: Add a docstring to the test_get_model_version_by_id_found method.


Line range hint 1509-1509: Add a docstring to the test_get_model_version_by_index_found method.


Line range hint 1523-1523: Add a docstring to the test_get_model_version_by_stage_found method.


Line range hint 1533-1533: Add a docstring to the test_get_model_version_by_stage_not_found method.


Line range hint 1552-1552: Add a docstring to the test_get_model_version_not_found method.


Line range hint 1560-1560: Add a docstring to the test_create_model_version_pass method.


Line range hint 1564-1564: Add a docstring to the test_create_model_version_duplicate_fails method.


Line range hint 1595-1595: Add a docstring to the test_update_model_version method.


Line range hint 1603-1603: Add a docstring to the test_list_model_version method.


Line range hint 1672-1672: Add a docstring to the test_delete_model_version_found method.


Line range hint 1721-1721: Add a docstring to the test_delete_model_version_not_found method.


Line range hint 1733-1733: Add a docstring to the test_name_and_description_is_mutable method.

Copy link
Contributor

@bcdurak bcdurak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the changes. Thank you for making it less pydantic heavy as well, makes my job a lot easier. 😄 One question though, when it comes to named entities, I thought this approach would also go well with pipelines and pipeline runs too (not sure if I am forgetting about anything else). Is there a reason why we left those out?

Feel free to merge it. Going forward, I think we can implement a simple validator for names like this Pydantic v2 to generalize this even further.

@avishniakov
Copy link
Contributor Author

I like the changes. Thank you for making it less pydantic heavy as well, makes my job a lot easier. 😄 One question though, when it comes to named entities, I thought this approach would also go well with pipelines and pipeline runs too (not sure if I am forgetting about anything else). Is there a reason why we left those out?

Feel free to merge it. Going forward, I think we can implement a simple validator for names like this Pydantic v2 to generalize this even further.

Make sense, we have way more named entities indeed. I'll create a ticket with low prior to revisit this later.
The problem with making it as validator in the Pydantic is the fact that old "poorly" named entities should still work, so it might be tricky to restrict this on the model level.

@avishniakov avishniakov merged commit 164cc09 into develop May 7, 2024
56 of 57 checks passed
@avishniakov avishniakov deleted the bugfix/OSSK-531-mask-names-of-stack-components branch May 7, 2024 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working internal To filter out internal PRs and issues run-slow-ci
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants