-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eng 2589 add documentation for the local data #1178
Eng 2589 add documentation for the local data #1178
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Some questions from my side:
|
@Fanjia-Yan, just a note that in addition to documenting here, you should also update the GitBook documentation: https://github.com/aqueducthq/gitbook |
"---\n", | ||
"### Local Data Parameters\n", | ||
"\n", | ||
"In the case that we haven't acquire accesses to the databases, or we want to test out a workflow with a subset of a dataset, we can consume a local JSON, CSV, or Parquet file to create a parameter Artifact that serve as a temporary data." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit grammar and cleanup, change to:
You can also create a parameter to seamlessly pass in local data to your flow.
In the following example, we pass in a local CSV file that the flow can consume as a table artifact.
" as_type=ArtifactType.TABLE,\n", | ||
" format=\"csv\")\n", | ||
"\n", | ||
"# The local data parameter we just create will have the same content as `reviews_table`. \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this comment and replace the following line with simply: data_param.get()
so the user sees the data.
"id": "a2e121b8", | ||
"metadata": {}, | ||
"source": [ | ||
"If you decides to publish flow using local data, set `use_local` to `True` in `publish_flow`." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit grammar: If you decide to publish a flow that refers to local data, you must set use_local
to True
in publish_flow
.
…dd-documentation-for-the-local-data
Describe your changes and why you are making these changes
This PR adds documentation for using local data by introducing local data parameter as an alternative method in
Accessing Data
section. The motivation for putting the option here is that a new user can leverage the quickstart tutorial to quickly create his own workflow without the complication of hooking his database or uploading the data through Aqueduct UI(which is not immediately obvious). I also reminded the user to set theuse_local
flag to `true when publishing the workflow.---Addition 1---
---Addition 2----
Related issue number (if any)
https://linear.app/aqueducthq/issue/ENG-2589/add-documentation-for-the-local-data-feature
Loom demo (if any)
[I can add one later]
Checklist before requesting a review
python3 scripts/run_linters.py -h
for usage).run_integration_test
: Runs integration testsskip_integration_test
: Skips integration tests (Should be used when changes are ONLY documentation/UI)