Eng 1187 Script to automatically run gitbook snippets #1172

eunice-chan · 2023-04-05T00:47:17Z

Describe your changes and why you are making these changes

Script to run through Gitbook Python code snippets. Run through all of them except the ones that require lots of set up (integrations) or have workflow UUIDs hard-coded in.

Also fixes bugs discovered via running through the code snippets:

The monthly function instance type check incorrect for day (DayOfWeek --> DayOfMonth).
Fix DayOfMonth exception statement from "Invalid hour value" to "Invalid day value".
Fix assertion for bounds to work when bound_value is falsey (e.g. 0).

TODO: Get integrated into our Github Actions. Will work as a follow up task to determine the frequency to run this & get it set up (https://linear.app/aqueducthq/issue/ENG-2748/integrate-into-github-actions).

Related issue number (if any)

Eng 1187

Loom demo (if any)

Script output:

>> OK        parameters.md
>> OK        quickstart-guide.md
>> SKIPPING  (no Python snippets found) metrics-and-checks.md
>> SKIPPING  (no Python snippets found) faqs.md
>> SKIPPING  (no Python snippets found) SUMMARY.md
>> OK        artifacts.md
>> SKIPPING  (no Python snippets found) the-aqueduct-philosophy.md
>> SKIPPING  (no Python snippets found) operators.md
>> OK        README.md
>> OK        metrics-and-checks/preset-metrics-and-checks.md
>> OK        metrics-and-checks/checks-ensuring-correctness.md
>> OK        metrics-and-checks/metrics-measuring-your-predictions.md
>> SKIPPING  operators/configuring-resource-constraints.md
>> OK        operators/file-dependencies-in-python.md
>> SKIPPING  operators/specifying-a-requirements.txt.md
>> OK        operators/lazy-vs-eager-execution.md
>> OK        operators/creating-a-python-operator.md
>> SKIPPING  api-reference
>> SKIPPING  api-reference/sdk-reference
>> SKIPPING  api-reference/sdk-reference/package-aqueduct
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.artifacts
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.constants
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations
>> SKIPPING  api-reference/sdk-reference/package-aqueduct/package-aqueduct.models
>> OK        workflows/managing-workflow-schedules.md
>> OK        workflows/editing-a-workflow.md
>> SKIPPING  (no Python snippets found) workflows/README.md
>> OK        workflows/workflow-versions.md
>> SKIPPING  workflows/deleting-a-workflow.md
>> OK        workflows/creating-a-workflow.md
>> SKIPPING  integrations
>> SKIPPING  integrations/on-demand-resources
>> SKIPPING  integrations/data-systems
>> SKIPPING  integrations/data-systems/sql-systems
>> SKIPPING  integrations/data-systems/non-sql-systems
>> SKIPPING  integrations/compute-systems
>> SKIPPING  integrations/compute-systems/on-demand-resources
>> SKIPPING  integrations/adding-an-integration
>> SKIPPING  example-workflows
>> OK        guides/porting-a-workflow-to-aqueduct.md
>> SKIPPING  (no Python snippets found) guides/conquering-the-python-environment.md
>> SKIPPING  (no Python snippets found) guides/README.md
>> SKIPPING  (no Python snippets found) guides/debugging-a-failed-workflow.md
>> SKIPPING  installation-and-configuration
>> SKIPPING  installation-and-configuration/installing-aqueduct
>> SKIPPING  (no Python snippets found) notifications/connecting-to-email.md
>> SKIPPING  (no Python snippets found) notifications/connecting-to-slack.md
>> SKIPPING  (no Python snippets found) notifications/README.md
Congrats! You've successfully ran all code snippets in the documentation.

Error Output:

>> FAILED    parameters.md
-------------------------[CODE]-------------------------
from aqueduct import op, Client
import pandas as pd

client = Client() 
db = client.integration("aqueduct_demo")

@op
def predict_churn(reviews: pd.DataFrame, country: str):
    ...

reviews = db.sql("Select * from customers")
country_param = client.create_param("country", default="Venezuela") 
churn = predict_churn(reviews, country_param)

# This will return the result of `predict_churn()` with default "Venezuela" as the country input.
churn.get()

# This will return the result of `predict_churn()` with "United Kingdom" as the country input.
churn.get(parameters={"country": "United Kingdom"})
# This will create a new parameter directly, named `country`, with default value `Venezuela`.
churn = predict_churn(reviews, "Venezuela")
-------------------------[FAILED BLOCK]-------------------------
churn = predict_churn(reviews, country_param)
flow = client.publish_flow("Churn Prediction", artifacts=[churn])

...

import tim
# Sleep for 10 seconds to wait for flow to finish.
time.sleep(10)

# This will trigger a new run of the already published flow "Churn Prediction",
# but with the country parameter value set to "China".
client.trigger(flow.id(), parameters={"country": "China"})
-------------------------[ERROR]-------------------------
WARNING:Operator `predict_churn`'s argument `country` is not an artifact type. We have implicitly created a parameter named `predict_churn:country` and your input will be used as its default value. This parameter will be used when running the function.
Traceback (most recent call last):
  File "/Users/eunicechan/Desktop/aqueduct/regression_tests/tests/gitbook/_Users_eunicechan_Desktop_gitbook_parameters.py", line 27, in <module>
    import tim
ModuleNotFoundError: No module named 'tim'

>> OK        quickstart-guide.md
>> SKIPPING  (no Python snippets found) metrics-and-checks.md
>> SKIPPING  (no Python snippets found) faqs.md
>> SKIPPING  (no Python snippets found) SUMMARY.md
>> OK        artifacts.md
...

Checklist before requesting a review

I have created a descriptive PR title. The PR title should complete the sentence "This PR...".
I have performed a self-review of my code.
I have included a small demo of the changes. For the UI, this would be a screenshot or a Loom video.
[N/A] If this is a new feature, I have added unit tests and integration tests.
[N/A] I have run the integration tests locally and they are passing.
I have run the linter script locally (See python3 scripts/run_linters.py -h for usage).
[N/A] All features on the UI continue to work correctly.
Added one of the following CI labels:
- run_integration_test: Runs integration tests
- skip_integration_test: Skips integration tests (Should be used when changes are ONLY documentation/UI)

vsreekanti · 2023-04-05T02:24:19Z

Should we run this as a part of the release process?

eunice-chan · 2023-04-05T20:22:01Z

@vsreekanti I think that's a good time to test. I can look into integrating it into the release script.

kenxu95

Mostly readability concerns - this is a bit challenging for me to understand and give a good review on.

Can we just run this GH action on every Gitbook commit?

sdk/aqueduct/artifacts/numeric_artifact.py

regression_tests/tests/gitbook/test_docs.py

eunice-chan · 2023-04-07T15:01:07Z

@kenxu95 Happy to chat about how we can integrate this into our testing process.

I think running it at release makes sense because it is not as frequent and that way we can be sure the code snippets work with the release version of Aqueduct. However, I think running it on every commit could also be useful to ensure that we are not committing code that does not work and spreads out the work more.

Maybe we can do a combination or just have it at one part of the process.

…nvestigate-an-automated-way-to-keep-all

kenxu95

A lot better! Thanks for all the work :) Left a bunch of additional style/refactor suggestions, let me know if they don't make sense.

I think would should make a new github action in the gitbook repo as a follow up? I think it should be triggered on every PR and block the merging. How long does each run take?

regression_tests/tests/gitbook/test_docs.py

eunice-chan · 2023-04-11T17:30:34Z

@kenxu95 It takes 173.07111406326294 s (about 3 minutes).

Script to automatically run gitbook snippets

c38dd3e

eunice-chan added the skip_integration_test Skips required integration test (only documentation/UI changes) label Apr 5, 2023

eunice-chan requested review from vsreekanti, kenxu95 and cw75 April 5, 2023 00:47

eunice-chan mentioned this pull request Apr 5, 2023

Eng 1187 Automatically run Gitbook code snippets RunLLM/gitbook#81

Merged

kenxu95 reviewed Apr 6, 2023

View reviewed changes

eunice-chan mentioned this pull request Apr 7, 2023

ENG-2757 Fix SDK function arg validation failures #1183

Merged

4 tasks

Remove SDK changes

83d3e46

eunice-chan added 4 commits April 7, 2023 08:27

Simplify test code

5d594f0

Add more comments

c25cf14

Merge branch 'main' of github.com:aqueducthq/aqueduct into eng-1187-i…

664d1a4

…nvestigate-an-automated-way-to-keep-all

Increase documentation coverage

0612fc0

eunice-chan requested a review from kenxu95 April 9, 2023 03:43

kenxu95 approved these changes Apr 10, 2023

View reviewed changes

eunice-chan merged commit 04267ea into main Apr 11, 2023

kenxu95 deleted the eng-1187-investigate-an-automated-way-to-keep-all branch April 11, 2023 18:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Eng 1187 Script to automatically run gitbook snippets #1172

Eng 1187 Script to automatically run gitbook snippets #1172

eunice-chan commented Apr 5, 2023 •

edited

Loading

vsreekanti commented Apr 5, 2023

eunice-chan commented Apr 5, 2023

kenxu95 left a comment

eunice-chan commented Apr 7, 2023

kenxu95 left a comment

eunice-chan commented Apr 11, 2023

Eng 1187 Script to automatically run gitbook snippets #1172

Eng 1187 Script to automatically run gitbook snippets #1172

Conversation

eunice-chan commented Apr 5, 2023 • edited Loading

Describe your changes and why you are making these changes

Related issue number (if any)

Loom demo (if any)

Checklist before requesting a review

vsreekanti commented Apr 5, 2023

eunice-chan commented Apr 5, 2023

kenxu95 left a comment

Choose a reason for hiding this comment

eunice-chan commented Apr 7, 2023

kenxu95 left a comment

Choose a reason for hiding this comment

eunice-chan commented Apr 11, 2023

eunice-chan commented Apr 5, 2023 •

edited

Loading