Get rid of flat vs unflat distinction in the query processor #744

Open
semihsalihoglu-uw opened this issue Aug 10, 2022 · 0 comments


semihsalihoglu-uw commented Aug 10, 2022

Currently, many pieces of code have if (flat) { ... } else { // unflat ... } branches. These do not seem necessary if the flatten operator, which is the only operator that flattens, correctly sets the selectedSize of the DataChunk it is flattening to 1 and sets selectedPos correctly. That would remove these branches from many places. However, I don't expect all such branches to go away, because some code may still need to identify the flat data chunk. For example, binary comparison expressions assume that one of the two sides is flat, so that code still needs to check which side it is. So we might need to keep the isFlat() function, which would return true if selectedSize == 1.
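To make the idea concrete, here is a minimal sketch, not the engine's actual code. The DataChunk struct below is a simplified, hypothetical stand-in (a single int64_t value vector plus selectedPos and selectedSize), and sumSelected and lessThan are made-up helpers. It shows one loop that needs no flat/unflat branch, and a binary comparison that still has to ask which side is flat.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the engine's DataChunk.
struct DataChunk {
    std::vector<int64_t> values;        // a single value vector, for illustration only
    std::vector<uint64_t> selectedPos;  // indices into `values` that are currently selected
    uint64_t selectedSize = 0;          // number of valid entries in selectedPos

    // Under the proposal, "flat" just means exactly one selected position.
    bool isFlat() const { return selectedSize == 1; }
};

// No flat/unflat branch: the same loop covers both cases, because a
// flattened chunk is simply a chunk with selectedSize == 1.
int64_t sumSelected(const DataChunk& chunk) {
    assert(chunk.selectedSize <= chunk.selectedPos.size());
    int64_t sum = 0;
    for (uint64_t i = 0; i < chunk.selectedSize; ++i) {
        sum += chunk.values[chunk.selectedPos[i]];
    }
    return sum;
}

// A binary comparison still has to find out which side is flat.
// Returns one result per selected position of the other side.
std::vector<bool> lessThan(const DataChunk& left, const DataChunk& right) {
    assert(left.isFlat() || right.isFlat());
    std::vector<bool> result;
    if (left.isFlat()) {
        const int64_t l = left.values[left.selectedPos[0]];
        for (uint64_t i = 0; i < right.selectedSize; ++i) {
            result.push_back(l < right.values[right.selectedPos[i]]);
        }
    } else {
        const int64_t r = right.values[right.selectedPos[0]];
        for (uint64_t i = 0; i < left.selectedSize; ++i) {
            result.push_back(left.values[left.selectedPos[i]] < r);
        }
    }
    return result;
}
```

The only place isFlat() appears is in lessThan, which matches the expectation that some, but not all, of the branches can go away.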

The flatten operator would need to change so that, if it receives an already selected data chunk to flatten, it first copies the chunk's original selected positions. At a high level, its pseudocode should look like this:

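Flatten : PhysicalOp {
   // State kept across calls for the chunk currently being flattened.
   uint64_t[] prevSelectedPos;   // copy of the original selected positions (indices, not flags)
   uint64_t prevSelectedSize;
   uint64_t nextIdx;

   void getNextTuples() {
      if (a new dataChunkToFlatten has arrived) {
         prevSelectedPos = dataChunkToFlatten->copySelectedState();
         prevSelectedSize = dataChunkToFlatten->selectedSize;
         nextIdx = 0;
      }
      // Instead of a for loop, each call to getNextTuples() exposes the next original position.
      dataChunkToFlatten->selectedSize = 1;
      dataChunkToFlatten->selectedPos[0] = prevSelectedPos[nextIdx++];
   }
}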
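For reference, the pseudocode above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: DataChunk is again a simplified, hypothetical stand-in, reset() is a made-up hook that the caller invokes whenever the child has refilled the chunk, and getNextTuples() returns a bool here to signal exhaustion, which the pseudocode does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the engine's DataChunk.
struct DataChunk {
    std::vector<uint64_t> selectedPos;  // currently selected positions
    uint64_t selectedSize = 0;          // number of valid entries in selectedPos
};

// Hypothetical Flatten operator that flattens by rewriting the selection
// state instead of setting a separate "flat" flag.
class Flatten {
public:
    explicit Flatten(DataChunk* chunk) : dataChunkToFlatten(chunk) {}

    // Assumed hook: call once whenever the child has refilled the chunk.
    // Saves the original selection so it can be replayed one position at a time.
    void reset() {
        assert(dataChunkToFlatten->selectedSize <= dataChunkToFlatten->selectedPos.size());
        prevSelectedPos.clear();
        for (uint64_t i = 0; i < dataChunkToFlatten->selectedSize; ++i) {
            prevSelectedPos.push_back(dataChunkToFlatten->selectedPos[i]);
        }
        nextIdx = 0;
    }

    // Exposes the next originally selected position as the only selected one.
    // Returns false once every saved position has been handed out.
    bool getNextTuples() {
        if (nextIdx >= prevSelectedPos.size()) {
            return false;
        }
        dataChunkToFlatten->selectedSize = 1;
        dataChunkToFlatten->selectedPos[0] = prevSelectedPos[nextIdx++];
        return true;
    }

private:
    DataChunk* dataChunkToFlatten;
    std::vector<uint64_t> prevSelectedPos;  // copy of the original selection
    uint64_t nextIdx = 0;                   // cursor into prevSelectedPos
};
```

Because the saved copy is taken before selectedPos[0] is overwritten, a chunk that arrives with a filtered selection (say positions 4, 7, 9) is replayed in that order, one position per call.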

And then many but not all if (isFlat()) code paths would be removed.

I'm labeling this as phase 1 but this can be addressed later.

semihsalihoglu-uw added feature and phase 1 labels Aug 10, 2022
semihsalihoglu-uw added testing and removed phase 1 labels Nov 2, 2022
ray6080 added performance optimization and removed feature and testing labels Oct 9, 2023