Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: lazy boolean operations #114

Closed
pca006132 opened this issue May 13, 2022 · 16 comments · Fixed by #171
Closed

feature request: lazy boolean operations #114

pca006132 opened this issue May 13, 2022 · 16 comments · Fixed by #171
Assignees
Labels
enhancement New feature or request

Comments

@pca006132
Copy link
Collaborator

pca006132 commented May 13, 2022

I guess it might be better to open an issue for this and discuss about the possible implementation for lazy boolean operations. The opt branch of my fork only implements the optimizations for union.

Idea: Apply boolean operations lazily to allow more optimizations, e.g. operation reordering or cheap batch processing for certain operations.

Operation reordering:

  1. Permutation for union and intersection. The idea is to work on manifolds that have smaller number of vertices/faces first. For larger manifolds, the time required for boolean operations is about linear to the number of vertices/faces, so work on manifolds that have smaller number of vertices/faces can reduce the time required for the operation.
  2. De Morgan's Laws (?): a - b - c <=> a - (b + c). Again, the idea is to minimize the number of required operations. For this case, b and c might be disjoint, if so they can be composed together cheaply. If not, the LHS is perhaps the cheapest way to do this.

Batch processing:

  1. Use compose instead of union for a set of pairwise disjoint manifolds. Although checking for pairwise disjointness can take O(n^2) time with a simple implementation, it is still much cheaper in practice than doing union for up to something like 10000 manifolds.
  2. Similarly, we can do batch processing for intersection by discarding disjoint manifolds.

Cache for manifolds that only differ in transformation: We can use a shared_ptr of the underlying CSG tree, and store the transformation separately. When we need to evaluate the CSG tree, e.g. when exporting the object or need to apply some complicated operations on it, we can cache the result and then apply transform to a copy of the result. This way, we can store objects that only differ in transformation (e.g. a grid of objects) cheaply, and allows us to defer the evaluation of the CSG tree as late as possible. This way, copying the manifold is also essentially free: we only need to copy the shared_ptr and the 4x3 matrix.

@elalish
Copy link
Owner

elalish commented May 13, 2022

I already have quite a bit of related optimization: Booleans of manifolds whose bounding boxes don't overlap should be nearly as fast as Compose for instance. Checking disjointness precisely is already nlogn and parallelized rather than n^2. If you only want to check disjointness, NumOverlaps will do it.

Reordering ops for a union/intersection set may help, but I'm not sure by how much. I'm also not sure how common that kind of operation is in practice. Do you have a particular use case in mind for your 10,000 manifold set? My OpenSCAD designs have always tended to have a lot of difference ops interspersed that would defeat this.

Not to say I'm against the idea, but it seems like a lot of effort for a performance increase on a relatively narrow use case. For instance, I think building out our Polygon API would probably bring users a lot broader value with less effort.

@pca006132
Copy link
Collaborator Author

The n here is the number of manifolds rather than the number of vertices. The union optimization is to minimize the number of compose calls: if we have n pairwise disjoint manifolds, we only have to compose them once, rather than doing n union.

Regarding use cases, I'm mainly looking at the benchmarks in https://gist.github.com/ochafik/2db96400e3c1f73558fcede990b8a355 and keyboard builds which involves cutting many holes, something like a - b - b.translate(...) - b.translate(...) - ....

@pca006132
Copy link
Collaborator Author

And regarding the polygon API: any thought about integrating with the clipper library? It seems widely used and quite fast. It already supports boolean operations and offsetting.

@elalish
Copy link
Owner

elalish commented May 13, 2022

Good point regarding n, but I think my point is I don't want to over-index on benchmarks; they tend to be created for the programmer's ease, rather than most common utility. Same reason I don't pay too much attention to the speed of Sponge4; it's a good test, but not a common situation. For keyboard holes, it's easy enough to Compose all the holes first and do one difference.

@elalish
Copy link
Owner

elalish commented May 13, 2022

Yes, I haven't looked at clipperLib for quite some time, but that does look ideally suited. I prefer the "strictly positive" fill rule to even-odd or non-zero for overlapping polygons (it's also in the 3MF standard). Hopefully they can handle that.

@pca006132
Copy link
Collaborator Author

pca006132 commented May 13, 2022

Regarding compose: Personally I prefer a more generic operation and don't perform much optimization in the DSL. For example, if the design is parametric and the objects may or may not intersect depending on the parameter, users will most likely just use union. Also, it should be rather common that a grid of objects do intersect each other but can be partitioned into a few sets of pairwise disjoint objects. In that case, such optimization is non-trivial and best implemented in the library.

For complexity of the implementation, I think this should not be too complex: https://github.com/pca006132/manifold/blob/opt/manifold/src/manifold_set.cpp is just about 200 lines and can probably written more concisely. The main challenge implementing this is that the reordering triggers some bug in the library, making it hard to determine whether that is an issue with this module or other modules in the library. Also, I'm not familiar with geometry and graphics, so I can only implement simpler stuff like this rather than debugging the implementation or implement new features like polygon API or advanced features like convex hull.

@elalish
Copy link
Owner

elalish commented May 13, 2022

Well, then we make a good team! I'm happy on the geometry/graphics side, but you are vastly quicker than me at getting compilers up and running and doing optimizations. You make a very good point about parametric designs; that's the point of this library after all. Let's continue the discussion in your PR; if you can add some doxygen comments to these methods, that'll help me review them better.

@pca006132
Copy link
Collaborator Author

Great! For the PR, it will need some rework to do optimization for intersection and difference. I will write some documentation about how CoW works etc.

One question: What do you think about thread safety? Should we add a lock for the shared CSG tree? In case users want to share the objects between threads and do different operations in different threads.

@elalish
Copy link
Owner

elalish commented May 13, 2022

Funny thing about me is that I know a lot more about GPU parallelization than I do about multi threading. I'm happy to trust your judgement here.

@pca006132
Copy link
Collaborator Author

A prototype implementation here (not yet tested, but the overall structure should be fine): pca006132@18295a2

It will take some time to modify the manifold class to use this, together with fixing test breakage. Also it will break the meshID feature as the copy behavior is changed.

@elalish elalish added the enhancement New feature or request label Jun 4, 2022
@pca006132 pca006132 mentioned this issue Jun 4, 2022
@pca006132
Copy link
Collaborator Author

Also, I'm skimming over the algorithm description and found that union is not necessarily symmetric:

In page 65 equation 4.1 and the explanation below it:

The effect of the condition in the formula is that when the ξ terms are computed to be
exactly equal, the situation is treated as though the ξB term were strictly greater than ξA.
It is as though object B had been adjusted by an arbitrarily small shift in the direction
of the positive axis for ξ. It is this symbolic perturbation that makes in unnecessary to
treat special cases differently. A consequence of this rule is that the basic operation is not
symmetric: the structures representing A ∪ B and B ∪ A are not necessarily identical.

But I did not see the corresponding code that compares ξ in the boolean3.cpp file. Not sure if reordering boolean operations is appropriate in this case. (conceptually I think it should be fine to reorder, users are unlikely to expect the result to be dependent on the order of operations)

@elalish
Copy link
Owner

elalish commented Jun 19, 2022

Yeah, I was thinking about this too, but I came to the same conclusion as you. But it is important to ensure that objects are completely disjoint (>, not >=) in order to apply Compose.

@pca006132
Copy link
Collaborator Author

I tried reversing the csg_tree: disabled reordering commit after #139 is fixed and see if that fixes the test failures, but nothing changed :(.

I'm thinking if the issue with BatchUnion is caused by the overlap calculation, perhaps I should take precision into account.

@pca006132
Copy link
Collaborator Author

Allowing reordering (without using compose) will only cause Sponge4 to fail, with error message

Triangulation failed! Precision = 1e-05
Error in file: /home/pca006132/code/manifold/src/polygon/src/polygon.cpp (1019): 'std::all_of(triangles.begin(), triangles.end(), [&vertPos, precision](const glm::ivec3 &tri) { return CCW(vertPos[tri[0]], vertPos[tri[1]], vertPos[tri[2]], precision) >= 0; })' is false: triangulation is not entirely CCW!

Wondering if this is caused by precision problem, as the triangles are very small now.

Patch:

diff --git a/src/manifold/src/csg_tree.cpp b/src/manifold/src/csg_tree.cpp
index 8f11620..6de9168 100644
--- a/src/manifold/src/csg_tree.cpp
+++ b/src/manifold/src/csg_tree.cpp
@@ -267,29 +267,39 @@ std::shared_ptr<CsgLeafNode> CsgOpNode::ToLeafNode() const {
   if (children_.empty()) return nullptr;
   // turn the children into leaf nodes
   GetChildren();
-  Manifold::OpType op;
   switch (op_) {
     case CsgNodeType::UNION:
-      op = Manifold::OpType::ADD;
+      BatchUnion();
       break;
-    case CsgNodeType::DIFFERENCE:
-      op = Manifold::OpType::SUBTRACT;
-      break;
-    case CsgNodeType::INTERSECTION:
-      op = Manifold::OpType::INTERSECT;
+    case CsgNodeType::INTERSECTION: {
+      std::vector<std::shared_ptr<const Manifold::Impl>> impls;
+      for (auto &child : children_) {
+        impls.push_back(
+            std::dynamic_pointer_cast<CsgLeafNode>(child)->GetImpl());
+      }
+      BatchBoolean(Manifold::OpType::INTERSECT, impls);
+      children_.clear();
+      children_.push_back(std::make_shared<CsgLeafNode>(impls.front()));
       break;
-    default:
-      throw std::runtime_error("unreachable CSG operation");
+    };
+    case CsgNodeType::DIFFERENCE: {
+      // take the lhs out and treat the remaining nodes as the rhs, perform
+      // union optimization for them
+      auto lhs = std::dynamic_pointer_cast<CsgLeafNode>(children_.front());
+      children_.erase(children_.begin());
+      BatchUnion();
+      auto rhs = std::dynamic_pointer_cast<CsgLeafNode>(children_.front());
+      children_.clear();
+      Boolean3 boolean(*lhs->GetImpl(), *rhs->GetImpl(),
+                       Manifold::OpType::SUBTRACT);
+      children_.push_back(
+          std::make_shared<CsgLeafNode>(std::make_shared<Manifold::Impl>(
+              boolean.Result(Manifold::OpType::SUBTRACT))));
+    };
+    case CsgNodeType::LEAF:
+      // unreachable
       break;
   }
-  auto a = std::static_pointer_cast<CsgLeafNode>(children_[0])->GetImpl();
-  for (int i = 1; i < children_.size(); i++) {
-    auto b = std::static_pointer_cast<CsgLeafNode>(children_[i])->GetImpl();
-    Boolean3 boolean(*a, *b, op);
-    a = std::make_shared<Manifold::Impl>(boolean.Result(op));
-  }
-  children_.clear();
-  children_.push_back(std::make_shared<CsgLeafNode>(a));
   // children_ must contain only one CsgLeafNode now, and its Transform will
   // give CsgLeafNode as well
   cache_ = std::dynamic_pointer_cast<CsgLeafNode>(
@@ -340,62 +350,14 @@ void CsgOpNode::BatchUnion() const {
   // with O(n^2) complexity to take too long.
   // If the number of children exceeded this limit, we will operate on chunks
   // with size kMaxUnionSize.
-  constexpr int kMaxUnionSize = 1000;
-  while (children_.size() > 1) {
-    int start;
-    if (children_.size() > kMaxUnionSize) {
-      start = children_.size() - kMaxUnionSize;
-    } else {
-      start = 0;
-    }
-    VecDH<Box> boxes;
-    boxes.reserve(children_.size() - start);
-    for (int i = start; i < children_.size(); i++) {
-      boxes.push_back(std::dynamic_pointer_cast<CsgLeafNode>(children_[i])
-                          ->GetBoundingBox());
-    }
-    const Box *boxesD = boxes.cptrD();
-    // partition the children into a set of disjoint sets
-    // each set contains a set of children that are pairwise disjoint
-    std::vector<VecDH<size_t>> disjointSets;
-    for (size_t i = 0; i < boxes.size(); i++) {
-      auto lambda = [boxesD, i](const VecDH<size_t> &set) {
-        return find_if<decltype(set.end())>(
-                   autoPolicy(set.size()), set.begin(), set.end(),
-                   CheckOverlap({boxesD, i})) == set.end();
-      };
-      decltype(disjointSets.end()) it =
-          std::find_if(disjointSets.begin(), disjointSets.end(), lambda);
-      if (it == disjointSets.end()) {
-        disjointSets.push_back(std::vector<size_t>{i});
-      } else {
-        it->push_back(i);
-      }
-    }
-    // compose each set of disjoint children
-    std::vector<std::shared_ptr<const Manifold::Impl>> impls;
-    for (const auto &set : disjointSets) {
-      if (set.size() == 1) {
-        impls.push_back(
-            std::dynamic_pointer_cast<CsgLeafNode>(children_[start + set[0]])
-                ->GetImpl());
-      } else {
-        std::vector<std::shared_ptr<CsgLeafNode>> tmp;
-        for (size_t j : set) {
-          tmp.push_back(
-              std::dynamic_pointer_cast<CsgLeafNode>(children_[start + j]));
-        }
-        impls.push_back(
-            std::make_shared<const Manifold::Impl>(CsgLeafNode::Compose(tmp)));
-      }
-    }
-    BatchBoolean(Manifold::OpType::ADD, impls);
-    children_.erase(children_.begin() + start, children_.end());
-    children_.push_back(std::make_shared<CsgLeafNode>(impls.back()));
-    // move it to the front as we process from the back, and the newly added
-    // child should be quite complicated
-    std::swap(children_.front(), children_.back());
+  std::vector<std::shared_ptr<const Manifold::Impl>> impls;
+  for (auto &child : children_) {
+    impls.push_back(
+        std::dynamic_pointer_cast<CsgLeafNode>(child)->GetImpl());
   }
+  BatchBoolean(Manifold::OpType::ADD, impls);
+  children_.clear();
+  children_.push_back(std::make_shared<CsgLeafNode>(impls.front()));
 }
 
 /**
@@ -415,10 +377,19 @@ std::vector<std::shared_ptr<CsgNode>> &CsgOpNode::GetChildren(
 
   CsgNodeType op = op_;
   for (auto &child : children_) {
-    if (!finalize || child->GetNodeType() == CsgNodeType::LEAF) {
-      newChildren.push_back(child);
+    if (child->GetNodeType() == op) {
+      auto grandchildren =
+          std::dynamic_pointer_cast<CsgOpNode>(child)->GetChildren(finalize);
+      int start = children_.size();
+      for (auto &grandchild : grandchildren) {
+        newChildren.push_back(grandchild->Transform(child->GetTransform()));
+      }
     } else {
-      newChildren.push_back(child->ToLeafNode());
+      if (!finalize || child->GetNodeType() == CsgNodeType::LEAF) {
+        newChildren.push_back(child);
+      } else {
+        newChildren.push_back(child->ToLeafNode());
+      }
     }
     // special handling for difference: we treat it as first - (second + third +
     // ...) so op = UNION after the first node
diff --git a/test/mesh_test.cpp b/test/mesh_test.cpp
index 4ce83d2..5e4db2d 100644
--- a/test/mesh_test.cpp
+++ b/test/mesh_test.cpp
@@ -911,27 +911,27 @@ TEST(Boolean, Subtract) {
   first.GetMesh();
 }

@elalish
Copy link
Owner

elalish commented Aug 2, 2022

Want to open a new PR so we can discuss there? Also, can you reproduce the triangulation problem by re-ordering the operations in sponge.cpp without your BatchUnion change? If it's generally sensitive to the ordering, then that's not the fault of your change. If it only reproduces with your change, then I'd guess there's just a small bug somewhere (possibly precision).

@pca006132
Copy link
Collaborator Author

Want to open a new PR so we can discuss there?

Yes, I can do that.

Also, can you reproduce the triangulation problem by re-ordering the operations in sponge.cpp without your BatchUnion change?

The patch above changed BatchUnion to BatchBoolean which just perform some reordering before doing the boolean operation, and doesn't use compose. So I think it is probably related to precision.

pca006132 added a commit to pca006132/manifold that referenced this issue Sep 24, 2022
Last mystery in elalish#114. It turns out that the problem is due to
misunderstanding of the bounding box API: Box::Transform is *not* a safe
over-approximation of the bounding box of the manifold after the
transform. In fact it might be smaller than that if the transform is not
axis aligned. The simple fix is to apply the transform first and then
compute the bounding box, this will also make sure that the bounding box
is tight.
pca006132 added a commit to pca006132/manifold that referenced this issue Sep 24, 2022
Last mystery in elalish#114. It turns out that the problem is due to
misunderstanding of the bounding box API: Box::Transform is *not* a safe
over-approximation of the bounding box of the manifold after the
transform. In fact it might be smaller than that if the transform is not
axis aligned. The simple fix is to apply the transform first and then
compute the bounding box, this will also make sure that the bounding box
is tight.
pca006132 added a commit to pca006132/manifold that referenced this issue Sep 25, 2022
Last mystery in elalish#114. It turns out that the problem is due to
misunderstanding of the bounding box API: Box::Transform is *not* a safe
over-approximation of the bounding box of the manifold after the
transform. In fact it might be smaller than that if the transform is not
axis aligned. The simple fix is to apply the transform first and then
compute the bounding box, this will also make sure that the bounding box
is tight.
elalish added a commit that referenced this issue Sep 25, 2022
* use compose for disjoint union

Last mystery in #114. It turns out that the problem is due to
misunderstanding of the bounding box API: Box::Transform is *not* a safe
over-approximation of the bounding box of the manifold after the
transform. In fact it might be smaller than that if the transform is not
axis aligned. The simple fix is to apply the transform first and then
compute the bounding box, this will also make sure that the bounding box
is tight.

* fix batch union

Co-authored-by: Emmett Lalish <elalish@gmail.com>
cartesian-theatrics pushed a commit to SovereignShop/manifold that referenced this issue Mar 11, 2024
* use compose for disjoint union

Last mystery in elalish#114. It turns out that the problem is due to
misunderstanding of the bounding box API: Box::Transform is *not* a safe
over-approximation of the bounding box of the manifold after the
transform. In fact it might be smaller than that if the transform is not
axis aligned. The simple fix is to apply the transform first and then
compute the bounding box, this will also make sure that the bounding box
is tight.

* fix batch union

Co-authored-by: Emmett Lalish <elalish@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants