-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support GROUP BY alias of the expression in the SELECT list. #5581
Conversation
That's the expected behavior. In ANSI SQL, that's not a valid query. The expressions in the select clause are "evaluated" after group by, window functions, etc. so their names are not in scope for those operations. |
@martint yes, we just migrate lots of application from oracle/mysql to presto (currently running to 0.144.1), found that many query failed due to this, so I changed this to fit for those legacy queries. |
This feature is problematic. Per the SQL standard, the GROUP BY column names must be resolved based on the input relations. The additional resolution against output names would have to happen after input names. The consequence is that adding a column to a table can break existing queries. Consider the following: CREATE TABLE t (a bigint);
SELECT a AS b, count(*)
FROM t
GROUP BY b;
ALTER TABLE t ADD COLUMN b bigint; Re-running the query would now fail with an error: I'm having trouble coming up with an example, but there might be other queries that change behavior and thus silently return incorrect answers. |
Note that there's a standard, portable way to do this (without repeating a complex expression in the GROUP BY clause): SELECT (a + b + c) AS x, (d * e) AS y, count(*)
FROM t
GROUP BY 1, 2; I realize this doesn't help with applications that have existing queries, but might help for the future. |
PostgreSQL supports this, but resolves names against the input relations first, which illustrates the problem I described above:
|
Actually, |
Yeah, new column definition b with the same name of the alias will surely lead to the SQL failure, which is expected. This could be categorized to the kinds of "syntactic sugar", which oracle/postgre/mysql implemented in the past decades. |
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions! |
Seems that group by alias will fail currently.
presto:default> select count(cname), cid as id from ss_dev.course_rt group by id;
Query 20160705_122534_00018_ej4np failed: line 1:63: Column '"id"' cannot be resolved
select count(cname), cid as id from ss_dev.course_rt group by id