SNOW-982770: [Local Testing] Add support for rank related features. #1890

Merged

sfc-gh-jrose
Contributor

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-982770

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
  3. Please describe how your code solves the related issue.

    This PR does a few things to add support for rank related features to local testing.

  • Adds some metadata to ColumnEmulators so that they retain information about how they have been sorted. Arguably this could be a private variable, as I would not expect users to need direct access to this information outside of mocking their own functions.
  • Modifies windowed functions to retain calculated expression columns that are used for ordering when passing to a mocked rank function.
  • Adds the mocked rank functions and expands test examples to include more branches in the local testing logic.
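For readers unfamiliar with the rank family, the following is a minimal pure-Python sketch of the standard SQL semantics of RANK, DENSE_RANK, PERCENT_RANK, and CUME_DIST over a single partition. It is not the code from this PR: the function names and the assumption that `keys` is already sorted in ascending window order are illustrative only.

```python
def rank(keys):
    """RANK(): tied keys share a rank and the next rank skips ahead (1, 2, 2, 4)."""
    out = []
    for i, key in enumerate(keys):
        # Reuse the previous rank when the ordering key did not change.
        out.append(out[-1] if i > 0 and key == keys[i - 1] else i + 1)
    return out


def dense_rank(keys):
    """DENSE_RANK(): tied keys share a rank and no ranks are skipped (1, 2, 2, 3)."""
    out = []
    current = 0
    for i, key in enumerate(keys):
        if i == 0 or key != keys[i - 1]:
            current += 1
        out.append(current)
    return out


def percent_rank(keys):
    """PERCENT_RANK(): (rank - 1) / (rows - 1), or 0.0 for a one-row partition."""
    n = len(keys)
    return [0.0 if n == 1 else (r - 1) / (n - 1) for r in rank(keys)]


def cume_dist(keys):
    """CUME_DIST(): fraction of rows whose key is <= the current row's key."""
    n = len(keys)
    return [sum(1 for other in keys if other <= key) / n for key in keys]


# Hypothetical partition, already sorted by the ORDER BY key.
keys = [10, 20, 20, 30]
print(rank(keys))        # [1, 2, 2, 4]
print(dense_rank(keys))  # [1, 2, 2, 3]
print(cume_dist(keys))   # [0.25, 0.75, 0.75, 1.0]
```

Every function here depends on knowing the order of the rows in the partition, which is why the description above talks about retaining sort information and the ordering expression columns.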

@sfc-gh-jrose sfc-gh-jrose added the NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md label Jul 9, 2024
@sfc-gh-jrose sfc-gh-jrose requested a review from a team July 9, 2024 21:18
@github-actions github-actions bot added the local testing Local Testing issues/PRs label Jul 9, 2024
@sfc-gh-jrose sfc-gh-jrose removed the NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md label Jul 10, 2024
@sfc-gh-jrose sfc-gh-jrose force-pushed the jrose_snow_982770_local_testing_rank_related_functions branch from 34a49e2 to aaacd32 Compare July 11, 2024 16:03
@sfc-gh-jrose sfc-gh-jrose marked this pull request as ready for review July 11, 2024 16:04
@sfc-gh-jrose sfc-gh-jrose requested a review from a team as a code owner July 11, 2024 16:04
CHANGELOG.md: outdated review thread, resolved
src/snowflake/snowpark/mock/_functions.py: outdated review thread, resolved
  def test_cume_dist(session):
      Utils.check_answer(
          TestData.xyz(session).select(
-             cume_dist().over(Window.partition_by(col("X")).order_by(col("Y")))
+             "X", "Y", cume_dist().over(Window.partition_by(col("X")).order_by(col("Y")))
Contributor

I see some changes in the test, such as adding "X" and "Y" to the select statement and changing sort from False to True in check_answer.

What's the purpose of these changes? Was the previous test failing with this implementation, or are they meant to add test coverage?

Contributor Author

I've enabled the sort because our current logic in _plan.py does not result in the same ordering as the server side when sorting and partitioning windows. Fixing that would be a lot more scope than I think this PR should include. I'm happy to open a ticket to track that work in the future.

With sorting enabled, the "X" and "Y" columns are required to avoid ambiguous results.
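The ambiguity can be illustrated with a small pure-Python sketch. It assumes an order-insensitive check that sorts both row lists before comparing them; `same_rows_ignoring_order` is a hypothetical stand-in rather than the actual `Utils.check_answer`, and the row values are made up.

```python
def same_rows_ignoring_order(actual, expected):
    """Hypothetical stand-in for an order-insensitive row comparison."""
    return sorted(actual) == sorted(expected)


# Rows are (X, Y, cume_dist) tuples with made-up values.
expected = [(1, 1, 0.5), (1, 2, 1.0), (2, 1, 1.0)]
# A buggy result that swaps the function values inside partition X == 1.
buggy = [(1, 1, 1.0), (1, 2, 0.5), (2, 1, 1.0)]

# Comparing only the function column hides the bug, because both
# results contain the same multiset of values {0.5, 1.0, 1.0}.
only_function_column = same_rows_ignoring_order(
    [(row[2],) for row in buggy],
    [(row[2],) for row in expected],
)

# Keeping X and Y ties each value to its row, so the swap is detected.
with_key_columns = same_rows_ignoring_order(buggy, expected)

print(only_function_column)  # True: the bug goes unnoticed
print(with_key_columns)      # False: the bug is caught
```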

Contributor

Thanks, now I understand why columns X and Y were introduced.

By the way, does the server guarantee the same ordering, or can it return rows in different orders? I'm wondering if we should mention this in the limitations or known gaps section of the local testing framework.

Contributor Author

I believe the server order is consistent, but I'm not sure whether it is guaranteed.

@sfc-gh-jrose sfc-gh-jrose merged commit 9d41463 into main Jul 19, 2024
34 checks passed
@sfc-gh-jrose sfc-gh-jrose deleted the jrose_snow_982770_local_testing_rank_related_functions branch July 19, 2024 15:27
@github-actions github-actions bot locked and limited conversation to collaborators Jul 19, 2024
Labels
local testing Local Testing issues/PRs
3 participants