[PROPOSAL] Prepare Statement Improvement #3065

Closed
andyfengHKU opened this issue Mar 15, 2024 · 0 comments
andyfengHKU commented Mar 15, 2024

At a high level, our query execution is split into 2 stages.

  • Stage 1: Convert a string into a preparedStatement. This stage includes parsing, binding and planning.
  • Stage 2: Convert the preparedStatement into a result. This stage includes mapping and executing. Note that we put mapping into Stage 2 instead of Stage 1 because we don't yet have a mechanism to reset operator state. The difficulty with resetting is mostly about testing.

For auto-committed transactions, stage 1 creates an auto-committed transaction and stage 2 ends that transaction.

For manual transactions, we don't need to do anything special because transaction behaviour is controlled by the user through transaction statements.

Transaction

The problem with the current handling of transactions for prepared statements shows up in the following 2 cases. We assume auto-committed transactions in the description below.
Case 1: prepare multiple times

PREPARE stmt1;
PREPARE stmt2;
EXECUTE stmt2;

When preparing stmt1, we start tx1. Then, when preparing stmt2, we try to start tx2 without committing tx1, which leads to an exception.
Case 2: prepare once and execute multiple times

PREPARE stmt;
EXECUTE stmt;
EXECUTE stmt;

When executing stmt for the first time, we commit the tx created in stage 1. When we execute stmt for the second time, we can no longer find a valid tx.
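The two cases above can be reproduced with a toy model. This is only a sketch: `CurrentDesignConnection` and `throwsRuntimeError` are hypothetical names, and the single boolean stands in for the real transaction state, which is assumed here to allow only one active tx per connection.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the current behaviour under auto-commit:
// prepare() opens a tx and leaves it open, execute() closes it.
class CurrentDesignConnection {
public:
    // Stage 1: begins an auto-committed tx without committing it.
    void prepare() {
        if (txActive_) {
            throw std::runtime_error("cannot begin tx: previous tx not committed");
        }
        txActive_ = true;
    }

    // Stage 2: relies on the tx opened in stage 1 and ends it.
    void execute() {
        if (!txActive_) {
            throw std::runtime_error("no valid tx for execution");
        }
        txActive_ = false; // commit
    }

    bool hasActiveTx() const { return txActive_; }

private:
    bool txActive_ = false;
};

// Returns true if calling f() throws std::runtime_error.
template <typename F>
bool throwsRuntimeError(F f) {
    try {
        f();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In this model, a second `prepare()` before any `execute()` throws (Case 1), and a second `execute()` after a single `prepare()` throws (Case 2).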

Prepare with Transaction

The core idea is to separate the tx logic of stage 1 from that of stage 2, and to enforce the invariant that at stage 2 there is always a correct tx if a tx is needed. (The case where a tx may not be needed is explained later.)

We should provide the following interfaces

public

QueryResult query(string) {
    Parse string as multiple stmts
    For each stmt
         pResult = prepareInternal(stmt)
         result = executeInternal(pResult)
}

PreparedStmt prepare(string) {
    Parse string as a single stmt
    Return prepareInternal(stmt)
}

QueryResult execute(preparedStmt) {
    Begin tx if auto
    Return executeInternal(preparedStmt)
}

private

// Stage 1
PreparedStmt prepareInternal(stmt) {
    Begin tx if auto
    Parse
    Bind
    Plan
    Commit tx if auto
}

// Stage 2
QueryResult executeInternal(preparedStmt) {
    Execute
    End tx depending on exec state
}
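As a rough illustration of the interfaces above, here is a minimal C++ sketch of `prepare`, `execute`, `prepareInternal` and `executeInternal` under auto-commit. The `TxContext`, `PreparedStmt`, `QueryResult` and `Connection` types are simplified stand-ins invented for this example, not the real classes. Parsing, binding, planning and execution are reduced to placeholders, `query` is left out, and "end tx depending on exec state" is assumed to mean commit on success and roll back on failure.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical transaction context: at most one active tx at a time.
class TxContext {
public:
    void begin() {
        if (active_) throw std::runtime_error("a tx is already active");
        active_ = true;
        ++begun_;
    }
    void commit() {
        if (!active_) throw std::runtime_error("no active tx to commit");
        active_ = false;
    }
    void rollback() { active_ = false; }
    bool isActive() const { return active_; }
    int numBegun() const { return begun_; }

private:
    bool active_ = false;
    int begun_ = 0;
};

struct PreparedStmt { std::string text; };
struct QueryResult { bool success; std::string text; };

class Connection {
public:
    PreparedStmt prepare(const std::string& stmt) { return prepareInternal(stmt); }

    QueryResult execute(const PreparedStmt& p) {
        if (autoCommit_) tx_.begin(); // "Begin tx if auto"
        return executeInternal(p);
    }

    const TxContext& tx() const { return tx_; }

private:
    // Stage 1: owns its own short-lived tx.
    PreparedStmt prepareInternal(const std::string& stmt) {
        if (autoCommit_) tx_.begin();
        PreparedStmt p{stmt}; // placeholder for parse, bind, plan
        if (autoCommit_) tx_.commit();
        return p;
    }

    // Stage 2: relies on the invariant that a tx is already active.
    QueryResult executeInternal(const PreparedStmt& p) {
        assert(tx_.isActive());
        bool ok = !p.text.empty(); // placeholder: empty statements "fail"
        if (autoCommit_) {
            if (ok) {
                tx_.commit();
            } else {
                tx_.rollback();
            }
        }
        return QueryResult{ok, p.text};
    }

    TxContext tx_;
    bool autoCommit_ = true;
};
```

With this split, preparing two statements back to back and executing one of them twice each begin and end their own tx, so no tx is left open between calls.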

Now let us revisit the previous cases.
Case 1: prepare multiple times

PREPARE stmt1;
PREPARE stmt2;
EXECUTE stmt2;

Each prepare commits its own tx, so preparing multiple stmts consecutively no longer causes a problem.

Case 2: prepare once and execute multiple times

PREPARE stmt;
EXECUTE stmt;
EXECUTE stmt;

Similarly, when executing stmt multiple times, each execute(preparedStmt) will begin its own tx.

Require valid transaction

Under auto tx, we cannot blindly start a tx for every statement, because some statements, e.g. BEGIN TX, do not require a tx at all.

So in the pseudo code above, at places like Begin tx if auto, we need to check whether the current statement requires a valid tx. This can be done by adding a common interface requireTx() to statement.
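One possible shape for this check is sketched below. Only the name `requireTx()` comes from the proposal; the `Statement`, `QueryStatement` and `TransactionStatement` classes and the `shouldBeginTx` helper are hypothetical, and defaulting `requireTx()` to true is an assumption.

```cpp
#include <cassert>

// Hypothetical statement hierarchy. By default a statement needs a tx.
class Statement {
public:
    virtual ~Statement() = default;
    virtual bool requireTx() const { return true; }
};

// Ordinary queries keep the default.
class QueryStatement : public Statement {};

// Transaction statements such as BEGIN TX manage the tx themselves.
class TransactionStatement : public Statement {
public:
    bool requireTx() const override { return false; }
};

// Guard for the "Begin tx if auto" steps: only begin a tx when
// auto-commit is on and the statement actually needs one.
inline bool shouldBeginTx(const Statement& stmt, bool autoCommit) {
    return autoCommit && stmt.requireTx();
}
```

The same guard would apply to the matching "Commit tx if auto" step, so that a tx is only ended by the code path that began it.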

Parameter Type Solving

We need a rebind mechanism for our prepared statement. Consider the case

RETURN 1 + $param;

After preparing, I can only make a best-effort guess, i.e. INT64, for the type of $param. If the user supplies an INT32, ideally we should cast it to INT64 automatically. This, however, is not doable without triggering a rebind, because I need to insert an additional expression evaluator (the cast) between + and $param.

You might argue that we can always implement a cast for literals separately. But this is redundant work given that we already have an expression evaluator for casting.
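To make the rebind idea concrete, here is a small sketch of binding `1 + $param` once the actual parameter type is known. Everything in it is hypothetical: the `Expr` node types, `typeName` and `bindOnePlusParam` are invented for illustration and do not reflect the real binder. The point is only that the cast node is inserted by the bind step, which is why the statement has to be bound again when the supplied type differs from the guess.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

enum class DataType { INT32, INT64 };

inline std::string typeName(DataType t) {
    return t == DataType::INT32 ? "INT32" : "INT64";
}

// Hypothetical bound-expression tree.
struct Expr {
    virtual ~Expr() = default;
    virtual DataType type() const = 0;
    virtual std::string toString() const = 0;
};

struct Literal : Expr {
    explicit Literal(long long v) : value(v) {}
    DataType type() const override { return DataType::INT64; }
    std::string toString() const override { return std::to_string(value); }
    long long value;
};

struct Param : Expr {
    Param(std::string n, DataType t) : name(std::move(n)), dataType(t) {}
    DataType type() const override { return dataType; }
    std::string toString() const override { return "$" + name; }
    std::string name;
    DataType dataType;
};

struct Cast : Expr {
    Cast(std::unique_ptr<Expr> c, DataType t) : child(std::move(c)), target(t) {}
    DataType type() const override { return target; }
    std::string toString() const override {
        return "CAST(" + child->toString() + " AS " + typeName(target) + ")";
    }
    std::unique_ptr<Expr> child;
    DataType target;
};

struct Add : Expr {
    Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : left(std::move(l)), right(std::move(r)) {}
    DataType type() const override { return left->type(); }
    std::string toString() const override {
        return "(" + left->toString() + " + " + right->toString() + ")";
    }
    std::unique_ptr<Expr> left;
    std::unique_ptr<Expr> right;
};

// Binds "1 + $param" for the given parameter type. If the parameter type
// differs from the literal's type, a Cast node is placed between + and $param.
std::unique_ptr<Expr> bindOnePlusParam(DataType actualParamType) {
    std::unique_ptr<Expr> lhs(new Literal(1));
    std::unique_ptr<Expr> rhs(new Param("param", actualParamType));
    if (rhs->type() != lhs->type()) {
        std::unique_ptr<Expr> casted(new Cast(std::move(rhs), lhs->type()));
        rhs = std::move(casted);
    }
    return std::unique_ptr<Expr>(new Add(std::move(lhs), std::move(rhs)));
}
```

Binding with the guessed INT64 yields `(1 + $param)`, while binding again with INT32 yields `(1 + CAST($param AS INT64))`, reusing the ordinary cast node rather than a separate literal-casting path.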

PREPARE statement

We should implement the PREPARE and EXECUTE keywords so that we don't have to test prepare in C++ code and can move those tests to our testing framework.
