Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eng 1187 Script to automatically run gitbook snippets #1172

Merged
merged 6 commits into from
Apr 11, 2023

Conversation

eunice-chan
Copy link
Contributor

@eunice-chan eunice-chan commented Apr 5, 2023

Describe your changes and why you are making these changes

Script to run through Gitbook Python code snippets. Run through all of them except the ones that require lots of set up (integrations) or have workflow UUIDs hard-coded in.

Also fixes bugs discovered via running through the code snippets:

  • The monthly function instance type check incorrect for day (DayOfWeek --> DayOfMonth).
  • Fix DayOfMonth exception statement from "Invalid hour value" to "Invalid day value".
  • Fix assertion for bounds to work when bound_value is falsey (e.g. 0).

TODO: Get integrated into our Github Actions. Will work as a follow up task to determine the frequency to run this & get it set up (https://linear.app/aqueducthq/issue/ENG-2748/integrate-into-github-actions).

Related issue number (if any)

Eng 1187

Loom demo (if any)

Script output:

>> OK        parameters.md
>> OK        quickstart-guide.md
>> SKIPPING  (no Python snippets found) metrics-and-checks.md
>> SKIPPING  (no Python snippets found) faqs.md
>> SKIPPING  (no Python snippets found) SUMMARY.md
>> OK        artifacts.md
>> SKIPPING  (no Python snippets found) the-aqueduct-philosophy.md
>> SKIPPING  (no Python snippets found) operators.md
>> OK        README.md
>> OK        metrics-and-checks/preset-metrics-and-checks.md
>> OK        metrics-and-checks/checks-ensuring-correctness.md
>> OK        metrics-and-checks/metrics-measuring-your-predictions.md
>> SKIPPING  operators/configuring-resource-constraints.md
>> OK        operators/file-dependencies-in-python.md
>> SKIPPING  operators/specifying-a-requirements.txt.md
>> OK        operators/lazy-vs-eager-execution.md
>> OK        operators/creating-a-python-operator.md
>> SKIPPING  api-reference
>> SKIPPING  api-reference/sdk-reference
>> SKIPPING  api-reference/sdk-reference/package-aqueduct
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.artifacts
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.constants
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.models
>> OK        workflows/managing-workflow-schedules.md
>> OK        workflows/editing-a-workflow.md
>> SKIPPING  (no Python snippets found) workflows/README.md
>> OK        workflows/workflow-versions.md
>> SKIPPING  workflows/deleting-a-workflow.md
>> OK        workflows/creating-a-workflow.md
>> SKIPPING  integrations
>> SKIPPING  integrations/on-demand-resources
>> SKIPPING  integrations/data-systems
>> SKIPPING  integrations/data-systems/sql-systems
>> SKIPPING  integrations/data-systems/non-sql-systems
>> SKIPPING  integrations/compute-systems
>> SKIPPING  integrations/compute-systems/on-demand-resources
>> SKIPPING  integrations/adding-an-integration
>> SKIPPING  example-workflows
>> OK        guides/porting-a-workflow-to-aqueduct.md
>> SKIPPING  (no Python snippets found) guides/conquering-the-python-environment.md
>> SKIPPING  (no Python snippets found) guides/README.md
>> SKIPPING  (no Python snippets found) guides/debugging-a-failed-workflow.md
>> SKIPPING  installation-and-configuration
>> SKIPPING  installation-and-configuration/installing-aqueduct
>> SKIPPING  (no Python snippets found) notifications/connecting-to-email.md
>> SKIPPING  (no Python snippets found) notifications/connecting-to-slack.md
>> SKIPPING  (no Python snippets found) notifications/README.md
Congrats! You've successfully ran all code snippets in the documentation.

Error Output:

>> FAILED    parameters.md
-------------------------[CODE]-------------------------
from aqueduct import op, Client
import pandas as pd

client = Client() 
db = client.integration("aqueduct_demo")

@op
def predict_churn(reviews: pd.DataFrame, country: str):
    ...

reviews = db.sql("Select * from customers")
country_param = client.create_param("country", default="Venezuela") 
churn = predict_churn(reviews, country_param)

# This will return the result of `predict_churn()` with default "Venezuela" as the country input.
churn.get()

# This will return the result of `predict_churn()` with "United Kingdom" as the country input.
churn.get(parameters={"country": "United Kingdom"})
# This will create a new parameter directly, named `country`, with default value `Venezuela`.
churn = predict_churn(reviews, "Venezuela")
-------------------------[FAILED BLOCK]-------------------------
churn = predict_churn(reviews, country_param)
flow = client.publish_flow("Churn Prediction", artifacts=[churn])

...

import tim
# Sleep for 10 seconds to wait for flow to finish.
time.sleep(10)

# This will trigger a new run of the already published flow "Churn Prediction",
# but with the country parameter value set to "China".
client.trigger(flow.id(), parameters={"country": "China"})
-------------------------[ERROR]-------------------------
WARNING:Operator `predict_churn`'s argument `country` is not an artifact type. We have implicitly created a parameter named `predict_churn:country` and your input will be used as its default value. This parameter will be used when running the function.
Traceback (most recent call last):
  File "/Users/eunicechan/Desktop/aqueduct/regression_tests/tests/gitbook/_Users_eunicechan_Desktop_gitbook_parameters.py", line 27, in <module>
    import tim
ModuleNotFoundError: No module named 'tim'

>> OK        quickstart-guide.md
>> SKIPPING  (no Python snippets found) metrics-and-checks.md
>> SKIPPING  (no Python snippets found) faqs.md
>> SKIPPING  (no Python snippets found) SUMMARY.md
>> OK        artifacts.md
...

Checklist before requesting a review

  • I have created a descriptive PR title. The PR title should complete the sentence "This PR...".
  • I have performed a self-review of my code.
  • I have included a small demo of the changes. For the UI, this would be a screenshot or a Loom video.
  • [N/A] If this is a new feature, I have added unit tests and integration tests.
  • [N/A] I have run the integration tests locally and they are passing.
  • I have run the linter script locally (See python3 scripts/run_linters.py -h for usage).
  • [N/A] All features on the UI continue to work correctly.
  • Added one of the following CI labels:
    • run_integration_test: Runs integration tests
    • skip_integration_test: Skips integration tests (Should be used when changes are ONLY documentation/UI)

@eunice-chan eunice-chan added the skip_integration_test Skips required integration test (only documentation/UI changes) label Apr 5, 2023
@vsreekanti
Copy link
Contributor

Should we run this as a part of the release process?

@eunice-chan
Copy link
Contributor Author

@vsreekanti I think that's a good time to test. I can look into integrating it into the release script.

Copy link
Contributor

@kenxu95 kenxu95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly readability concerns - this is a bit challenging for me to understand and give a good review on.

Can we just run this GH action on every Gitbook commit?

sdk/aqueduct/artifacts/numeric_artifact.py Outdated Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Outdated Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
@eunice-chan
Copy link
Contributor Author

@kenxu95 Happy to chat about how we can integrate this into our testing process.

I think running it at release makes sense because it is not as frequent and that way we can be sure the code snippets work with the release version of Aqueduct. However, I think running it on every commit could also be useful to ensure that we are not committing code that does not work and spreads out the work more.

Maybe we can do a combination or just have it at one part of the process.

@eunice-chan eunice-chan requested a review from kenxu95 April 9, 2023 03:43
Copy link
Contributor

@kenxu95 kenxu95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot better! Thanks for all the work :) Left a bunch of additional style/refactor suggestions, let me know if they don't make sense.

I think would should make a new github action in the gitbook repo as a follow up? I think it should be triggered on every PR and block the merging. How long does each run take?

regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
regression_tests/tests/gitbook/test_docs.py Show resolved Hide resolved
@eunice-chan
Copy link
Contributor Author

@kenxu95 It takes 173.07111406326294 s (about 3 minutes).

@eunice-chan eunice-chan merged commit 04267ea into main Apr 11, 2023
@kenxu95 kenxu95 deleted the eng-1187-investigate-an-automated-way-to-keep-all branch April 11, 2023 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip_integration_test Skips required integration test (only documentation/UI changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants