Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reflection regression #64

Merged
merged 2 commits into from
Nov 2, 2015
Merged

Reflection regression #64

merged 2 commits into from
Nov 2, 2015

Conversation

jklukas
Copy link
Member

@jklukas jklukas commented Oct 30, 2015

Iteration on #62. Because this branch is on the main repo, travis tests should run.

This adds creation of the schema, but the new test fails with NoSuchTableError: other_schema.basic. I don't understand why this happens.

cc @berdario, @graingert

@jklukas
Copy link
Member Author

jklukas commented Oct 30, 2015

@berdario - Am I correct in understanding that the problem you've seen is that databases in alternate schemas don't get reflected properly?

I have successfully used the most recent version of the dialect to reflect tables from schemas other than public, so I'm trying to figure out exactly what the failure mode is.

I'm not sure whether the failing test here is due to a defect of the dialect, or a defect of the test framework.

@berdario
Copy link
Contributor

Yup, the problem I've seen is that tables don't get reflected properly... The outermost error I get is requested table(s) not available... with version 0.1.2 it seems to work fine

Since I found the problem here with a schema/table that is actually in use I think it's perfectly normal that the test fails, and thus it should be a defect in the dialect

@jklukas
Copy link
Member Author

jklukas commented Oct 30, 2015

@berdario: In your case, is the schema you're reflecting a table from not in the search_path? I'm beginning to suspect that the problem here has to do with the concept of "visibility". For all the work that I've done with this dialect, the remote schema has been on the search path, so the tables are considered visible. That might explain why I haven't seen this before.

@berdario
Copy link
Contributor

Indeed, if I execute show search_path;, I don't see the schemas that I'm currently using

@jklukas
Copy link
Member Author

jklukas commented Oct 31, 2015

I think this is the important gotcha (from http://docs.aws.amazon.com/redshift/latest/dg/r_PG_TABLE_DEF.html):

PG_TABLE_DEF only returns information about tables that are visible to the user. If PG_TABLE_DEF does not return the expected results, verify that the search_path parameter is set correctly to include the relevant schemas.

I'm looking at some options for solving this.

-- We temporarily set the search_path here to include
-- all non-system schemas because pg_table_def doesn't show
-- results for schemas not on the search_path.
SET LOCAL search_path TO {search_path};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ridiculous that we have to do this, but it works and I think it'll be robust.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what was postgres default reflection doing before to get the tables?

@jklukas
Copy link
Member Author

jklukas commented Oct 31, 2015

I just pushed an amended commit that actually fixes #62. @berdario's test case now passes; again, thank you for reporting the issue and writing the test.

@berdario and @graingert - Does this look good to you?

@graingert
Copy link
Member

Does this table https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html have anything useful in?

@graingert
Copy link
Member

@jklukas
Copy link
Member Author

jklukas commented Oct 31, 2015

My understanding of the situation is the pg_table_def is the only place in Redshift that holds a complete column-level account of distkey, sortkey, and encoding configuration. SVV_TABLE_INFO and table_info.sql present the same partial view of this info (they both present a table-level view that tells whether any column encoding is set, what the first element of the sort key is, etc.).

pg_table_def is named as if it's a Postgres table, but it's Redshift-specific. It appears to be the only pg_* table that's Redshift-specific, and the only pg_* table that has this search_path restriction.

The PG dialect doesn't have this visibility problem because it doesn't have to access the pg_table_def table.

You can see that AWS also needs to manipulate the search_path in their column encoding utility, which I take as confirmation that search_path manipulation is the only approach here that's going to be comprehensive.

The SET made here to modify the search_path specifies LOCAL so that it only takes effect within that transaction and shouldn't affect the search_path of the session overall.

@graingert
Copy link
Member

Is this connection always a connection? Could it be an engine? I think you might need to use a nested transaction here

@jklukas
Copy link
Member Author

jklukas commented Nov 2, 2015

Added a nested transaction per @graingert's recommendation.

@jklukas
Copy link
Member Author

jklukas commented Nov 2, 2015

Removed print statement in test case, per @graingert's suggestion.

@jklukas
Copy link
Member Author

jklukas commented Nov 2, 2015

There's a lot of discussion happening on 94723ec which isn't automatically pulled in here anymore because the commit has been amended.

@graingert
Copy link
Member

ah nuts I meant to comment on the PR

@jklukas
Copy link
Member Author

jklukas commented Nov 2, 2015

As discussed in 94723ec, we now use a contextual_connect to ensure we aren't jumping to different connections on the same engine, and we explicitly set the search_path back to its original value.

There's also now a test to verify that the search_path change doesn't leak.

@jklukas
Copy link
Member Author

jklukas commented Nov 2, 2015

Note that since we now set the search_path back, there's no need for ensuring a transaction, so I'm not explicitly creating one.

@graingert
Copy link
Member

That makes sense, I'm happy for you to merge this when you're ready

jklukas added a commit that referenced this pull request Nov 2, 2015
@jklukas jklukas merged commit b95c5e6 into master Nov 2, 2015
@graingert graingert deleted the reflection-regression branch December 4, 2015 14:26
@jbasko
Copy link

jbasko commented Sep 14, 2018

This LOCAL keyword breaks it for me #145

When I remove it (SET search_path ...), it works. I have autocommit=True if that's relevant.

@jbasko
Copy link

jbasko commented Sep 14, 2018

What also works is putting the SET LOCAL search_path ... inside the same query string passed to cc.execute as in:

cc.execute("""
    SET LOCAL search_path TO %s;
    SELECT ....
""" % modified_search_path)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants