feature request: lazy boolean operations #114

pca006132 · 2022-05-13T04:00:26Z

I guess it might be better to open an issue for this and discuss about the possible implementation for lazy boolean operations. The opt branch of my fork only implements the optimizations for union.

Idea: Apply boolean operations lazily to allow more optimizations, e.g. operation reordering or cheap batch processing for certain operations.

Operation reordering:

Permutation for union and intersection. The idea is to work on manifolds that have smaller number of vertices/faces first. For larger manifolds, the time required for boolean operations is about linear to the number of vertices/faces, so work on manifolds that have smaller number of vertices/faces can reduce the time required for the operation.
De Morgan's Laws (?): a - b - c <=> a - (b + c). Again, the idea is to minimize the number of required operations. For this case, b and c might be disjoint, if so they can be composed together cheaply. If not, the LHS is perhaps the cheapest way to do this.

Batch processing:

Use compose instead of union for a set of pairwise disjoint manifolds. Although checking for pairwise disjointness can take O(n^2) time with a simple implementation, it is still much cheaper in practice than doing union for up to something like 10000 manifolds.
Similarly, we can do batch processing for intersection by discarding disjoint manifolds.

Cache for manifolds that only differ in transformation: We can use a shared_ptr of the underlying CSG tree, and store the transformation separately. When we need to evaluate the CSG tree, e.g. when exporting the object or need to apply some complicated operations on it, we can cache the result and then apply transform to a copy of the result. This way, we can store objects that only differ in transformation (e.g. a grid of objects) cheaply, and allows us to defer the evaluation of the CSG tree as late as possible. This way, copying the manifold is also essentially free: we only need to copy the shared_ptr and the 4x3 matrix.

The text was updated successfully, but these errors were encountered:

elalish · 2022-05-13T05:25:48Z

I already have quite a bit of related optimization: Booleans of manifolds whose bounding boxes don't overlap should be nearly as fast as Compose for instance. Checking disjointness precisely is already nlogn and parallelized rather than n^2. If you only want to check disjointness, NumOverlaps will do it.

Reordering ops for a union/intersection set may help, but I'm not sure by how much. I'm also not sure how common that kind of operation is in practice. Do you have a particular use case in mind for your 10,000 manifold set? My OpenSCAD designs have always tended to have a lot of difference ops interspersed that would defeat this.

Not to say I'm against the idea, but it seems like a lot of effort for a performance increase on a relatively narrow use case. For instance, I think building out our Polygon API would probably bring users a lot broader value with less effort.

pca006132 · 2022-05-13T05:32:31Z

The n here is the number of manifolds rather than the number of vertices. The union optimization is to minimize the number of compose calls: if we have n pairwise disjoint manifolds, we only have to compose them once, rather than doing n union.

Regarding use cases, I'm mainly looking at the benchmarks in https://gist.github.com/ochafik/2db96400e3c1f73558fcede990b8a355 and keyboard builds which involves cutting many holes, something like a - b - b.translate(...) - b.translate(...) - ....

pca006132 · 2022-05-13T05:45:00Z

And regarding the polygon API: any thought about integrating with the clipper library? It seems widely used and quite fast. It already supports boolean operations and offsetting.

elalish · 2022-05-13T06:05:06Z

Good point regarding n, but I think my point is I don't want to over-index on benchmarks; they tend to be created for the programmer's ease, rather than most common utility. Same reason I don't pay too much attention to the speed of Sponge4; it's a good test, but not a common situation. For keyboard holes, it's easy enough to Compose all the holes first and do one difference.

elalish · 2022-05-13T06:07:50Z

Yes, I haven't looked at clipperLib for quite some time, but that does look ideally suited. I prefer the "strictly positive" fill rule to even-odd or non-zero for overlapping polygons (it's also in the 3MF standard). Hopefully they can handle that.

pca006132 · 2022-05-13T07:07:40Z

Regarding compose: Personally I prefer a more generic operation and don't perform much optimization in the DSL. For example, if the design is parametric and the objects may or may not intersect depending on the parameter, users will most likely just use union. Also, it should be rather common that a grid of objects do intersect each other but can be partitioned into a few sets of pairwise disjoint objects. In that case, such optimization is non-trivial and best implemented in the library.

For complexity of the implementation, I think this should not be too complex: https://github.com/pca006132/manifold/blob/opt/manifold/src/manifold_set.cpp is just about 200 lines and can probably written more concisely. The main challenge implementing this is that the reordering triggers some bug in the library, making it hard to determine whether that is an issue with this module or other modules in the library. Also, I'm not familiar with geometry and graphics, so I can only implement simpler stuff like this rather than debugging the implementation or implement new features like polygon API or advanced features like convex hull.

elalish · 2022-05-13T15:48:06Z

Well, then we make a good team! I'm happy on the geometry/graphics side, but you are vastly quicker than me at getting compilers up and running and doing optimizations. You make a very good point about parametric designs; that's the point of this library after all. Let's continue the discussion in your PR; if you can add some doxygen comments to these methods, that'll help me review them better.

pca006132 · 2022-05-13T15:54:34Z

Great! For the PR, it will need some rework to do optimization for intersection and difference. I will write some documentation about how CoW works etc.

One question: What do you think about thread safety? Should we add a lock for the shared CSG tree? In case users want to share the objects between threads and do different operations in different threads.

elalish · 2022-05-13T16:24:39Z

Funny thing about me is that I know a lot more about GPU parallelization than I do about multi threading. I'm happy to trust your judgement here.

pca006132 · 2022-05-26T15:57:43Z

A prototype implementation here (not yet tested, but the overall structure should be fine): pca006132@18295a2

It will take some time to modify the manifold class to use this, together with fixing test breakage. Also it will break the meshID feature as the copy behavior is changed.

pca006132 · 2022-06-19T05:40:21Z

Also, I'm skimming over the algorithm description and found that union is not necessarily symmetric:

In page 65 equation 4.1 and the explanation below it:

The eﬀect of the condition in the formula is that when the ξ terms are computed to be
exactly equal, the situation is treated as though the ξB term were strictly greater than ξA.
It is as though object B had been adjusted by an arbitrarily small shift in the direction
of the positive axis for ξ. It is this symbolic perturbation that makes in unnecessary to
treat special cases diﬀerently. A consequence of this rule is that the basic operation is not
symmetric: the structures representing A ∪ B and B ∪ A are not necessarily identical.

But I did not see the corresponding code that compares ξ in the boolean3.cpp file. Not sure if reordering boolean operations is appropriate in this case. (conceptually I think it should be fine to reorder, users are unlikely to expect the result to be dependent on the order of operations)

elalish · 2022-06-19T17:28:25Z

Yeah, I was thinking about this too, but I came to the same conclusion as you. But it is important to ensure that objects are completely disjoint (>, not >=) in order to apply Compose.

pca006132 · 2022-07-09T09:34:59Z

I tried reversing the csg_tree: disabled reordering commit after #139 is fixed and see if that fixes the test failures, but nothing changed :(.

I'm thinking if the issue with BatchUnion is caused by the overlap calculation, perhaps I should take precision into account.

pca006132 · 2022-08-01T12:18:55Z

Allowing reordering (without using compose) will only cause Sponge4 to fail, with error message

Triangulation failed! Precision = 1e-05
Error in file: /home/pca006132/code/manifold/src/polygon/src/polygon.cpp (1019): 'std::all_of(triangles.begin(), triangles.end(), [&vertPos, precision](const glm::ivec3 &tri) { return CCW(vertPos[tri[0]], vertPos[tri[1]], vertPos[tri[2]], precision) >= 0; })' is false: triangulation is not entirely CCW!

Wondering if this is caused by precision problem, as the triangles are very small now.

Patch:

diff --git a/src/manifold/src/csg_tree.cpp b/src/manifold/src/csg_tree.cpp
index 8f11620..6de9168 100644
--- a/src/manifold/src/csg_tree.cpp
+++ b/src/manifold/src/csg_tree.cpp
@@ -267,29 +267,39 @@ std::shared_ptr<CsgLeafNode> CsgOpNode::ToLeafNode() const {
   if (children_.empty()) return nullptr;
   // turn the children into leaf nodes
   GetChildren();
-  Manifold::OpType op;
   switch (op_) {
     case CsgNodeType::UNION:
-      op = Manifold::OpType::ADD;
+      BatchUnion();
       break;
-    case CsgNodeType::DIFFERENCE:
-      op = Manifold::OpType::SUBTRACT;
-      break;
-    case CsgNodeType::INTERSECTION:
-      op = Manifold::OpType::INTERSECT;
+    case CsgNodeType::INTERSECTION: {
+      std::vector<std::shared_ptr<const Manifold::Impl>> impls;
+      for (auto &child : children_) {
+        impls.push_back(
+            std::dynamic_pointer_cast<CsgLeafNode>(child)->GetImpl());
+      }
+      BatchBoolean(Manifold::OpType::INTERSECT, impls);
+      children_.clear();
+      children_.push_back(std::make_shared<CsgLeafNode>(impls.front()));
       break;
-    default:
-      throw std::runtime_error("unreachable CSG operation");
+    };
+    case CsgNodeType::DIFFERENCE: {
+      // take the lhs out and treat the remaining nodes as the rhs, perform
+      // union optimization for them
+      auto lhs = std::dynamic_pointer_cast<CsgLeafNode>(children_.front());
+      children_.erase(children_.begin());
+      BatchUnion();
+      auto rhs = std::dynamic_pointer_cast<CsgLeafNode>(children_.front());
+      children_.clear();
+      Boolean3 boolean(*lhs->GetImpl(), *rhs->GetImpl(),
+                       Manifold::OpType::SUBTRACT);
+      children_.push_back(
+          std::make_shared<CsgLeafNode>(std::make_shared<Manifold::Impl>(
+              boolean.Result(Manifold::OpType::SUBTRACT))));
+    };
+    case CsgNodeType::LEAF:
+      // unreachable
       break;
   }
-  auto a = std::static_pointer_cast<CsgLeafNode>(children_[0])->GetImpl();
-  for (int i = 1; i < children_.size(); i++) {
-    auto b = std::static_pointer_cast<CsgLeafNode>(children_[i])->GetImpl();
-    Boolean3 boolean(*a, *b, op);
-    a = std::make_shared<Manifold::Impl>(boolean.Result(op));
-  }
-  children_.clear();
-  children_.push_back(std::make_shared<CsgLeafNode>(a));
   // children_ must contain only one CsgLeafNode now, and its Transform will
   // give CsgLeafNode as well
   cache_ = std::dynamic_pointer_cast<CsgLeafNode>(
@@ -340,62 +350,14 @@ void CsgOpNode::BatchUnion() const {
   // with O(n^2) complexity to take too long.
   // If the number of children exceeded this limit, we will operate on chunks
   // with size kMaxUnionSize.
-  constexpr int kMaxUnionSize = 1000;
-  while (children_.size() > 1) {
-    int start;
-    if (children_.size() > kMaxUnionSize) {
-      start = children_.size() - kMaxUnionSize;
-    } else {
-      start = 0;
-    }
-    VecDH<Box> boxes;
-    boxes.reserve(children_.size() - start);
-    for (int i = start; i < children_.size(); i++) {
-      boxes.push_back(std::dynamic_pointer_cast<CsgLeafNode>(children_[i])
-                          ->GetBoundingBox());
-    }
-    const Box *boxesD = boxes.cptrD();
-    // partition the children into a set of disjoint sets
-    // each set contains a set of children that are pairwise disjoint
-    std::vector<VecDH<size_t>> disjointSets;
-    for (size_t i = 0; i < boxes.size(); i++) {
-      auto lambda = [boxesD, i](const VecDH<size_t> &set) {
-        return find_if<decltype(set.end())>(
-                   autoPolicy(set.size()), set.begin(), set.end(),
-                   CheckOverlap({boxesD, i})) == set.end();
-      };
-      decltype(disjointSets.end()) it =
-          std::find_if(disjointSets.begin(), disjointSets.end(), lambda);
-      if (it == disjointSets.end()) {
-        disjointSets.push_back(std::vector<size_t>{i});
-      } else {
-        it->push_back(i);
-      }
-    }
-    // compose each set of disjoint children
-    std::vector<std::shared_ptr<const Manifold::Impl>> impls;
-    for (const auto &set : disjointSets) {
-      if (set.size() == 1) {
-        impls.push_back(
-            std::dynamic_pointer_cast<CsgLeafNode>(children_[start + set[0]])
-                ->GetImpl());
-      } else {
-        std::vector<std::shared_ptr<CsgLeafNode>> tmp;
-        for (size_t j : set) {
-          tmp.push_back(
-              std::dynamic_pointer_cast<CsgLeafNode>(children_[start + j]));
-        }
-        impls.push_back(
-            std::make_shared<const Manifold::Impl>(CsgLeafNode::Compose(tmp)));
-      }
-    }
-    BatchBoolean(Manifold::OpType::ADD, impls);
-    children_.erase(children_.begin() + start, children_.end());
-    children_.push_back(std::make_shared<CsgLeafNode>(impls.back()));
-    // move it to the front as we process from the back, and the newly added
-    // child should be quite complicated
-    std::swap(children_.front(), children_.back());
+  std::vector<std::shared_ptr<const Manifold::Impl>> impls;
+  for (auto &child : children_) {
+    impls.push_back(
+        std::dynamic_pointer_cast<CsgLeafNode>(child)->GetImpl());
   }
+  BatchBoolean(Manifold::OpType::ADD, impls);
+  children_.clear();
+  children_.push_back(std::make_shared<CsgLeafNode>(impls.front()));
 }
 
 /**
@@ -415,10 +377,19 @@ std::vector<std::shared_ptr<CsgNode>> &CsgOpNode::GetChildren(
 
   CsgNodeType op = op_;
   for (auto &child : children_) {
-    if (!finalize || child->GetNodeType() == CsgNodeType::LEAF) {
-      newChildren.push_back(child);
+    if (child->GetNodeType() == op) {
+      auto grandchildren =
+          std::dynamic_pointer_cast<CsgOpNode>(child)->GetChildren(finalize);
+      int start = children_.size();
+      for (auto &grandchild : grandchildren) {
+        newChildren.push_back(grandchild->Transform(child->GetTransform()));
+      }
     } else {
-      newChildren.push_back(child->ToLeafNode());
+      if (!finalize || child->GetNodeType() == CsgNodeType::LEAF) {
+        newChildren.push_back(child);
+      } else {
+        newChildren.push_back(child->ToLeafNode());
+      }
     }
     // special handling for difference: we treat it as first - (second + third +
     // ...) so op = UNION after the first node
diff --git a/test/mesh_test.cpp b/test/mesh_test.cpp
index 4ce83d2..5e4db2d 100644
--- a/test/mesh_test.cpp
+++ b/test/mesh_test.cpp
@@ -911,27 +911,27 @@ TEST(Boolean, Subtract) {
   first.GetMesh();
 }

elalish · 2022-08-02T15:21:27Z

Want to open a new PR so we can discuss there? Also, can you reproduce the triangulation problem by re-ordering the operations in sponge.cpp without your BatchUnion change? If it's generally sensitive to the ordering, then that's not the fault of your change. If it only reproduces with your change, then I'd guess there's just a small bug somewhere (possibly precision).

pca006132 · 2022-08-02T15:42:34Z

Want to open a new PR so we can discuss there?

Yes, I can do that.

Also, can you reproduce the triangulation problem by re-ordering the operations in sponge.cpp without your BatchUnion change?

The patch above changed BatchUnion to BatchBoolean which just perform some reordering before doing the boolean operation, and doesn't use compose. So I think it is probably related to precision.

Last mystery in elalish#114. It turns out that the problem is due to misunderstanding of the bounding box API: Box::Transform is *not* a safe over-approximation of the bounding box of the manifold after the transform. In fact it might be smaller than that if the transform is not axis aligned. The simple fix is to apply the transform first and then compute the bounding box, this will also make sure that the bounding box is tight.

* use compose for disjoint union Last mystery in #114. It turns out that the problem is due to misunderstanding of the bounding box API: Box::Transform is *not* a safe over-approximation of the bounding box of the manifold after the transform. In fact it might be smaller than that if the transform is not axis aligned. The simple fix is to apply the transform first and then compute the bounding box, this will also make sure that the bounding box is tight. * fix batch union Co-authored-by: Emmett Lalish <elalish@gmail.com>

* use compose for disjoint union Last mystery in elalish#114. It turns out that the problem is due to misunderstanding of the bounding box API: Box::Transform is *not* a safe over-approximation of the bounding box of the manifold after the transform. In fact it might be smaller than that if the transform is not axis aligned. The simple fix is to apply the transform first and then compute the bounding box, this will also make sure that the bounding box is tight. * fix batch union Co-authored-by: Emmett Lalish <elalish@gmail.com>

elalish assigned pca006132 Jun 4, 2022

elalish added the enhancement New feature or request label Jun 4, 2022

pca006132 mentioned this issue Jun 4, 2022

Lazy boolean #131

Merged

pca006132 mentioned this issue Aug 2, 2022

re-enabled reordering #171

Merged

elalish closed this as completed in #171 Aug 14, 2022

pca006132 mentioned this issue Sep 24, 2022

use compose for disjoint union #217

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feature request: lazy boolean operations #114

feature request: lazy boolean operations #114

pca006132 commented May 13, 2022 •

edited

Loading

elalish commented May 13, 2022

pca006132 commented May 13, 2022

pca006132 commented May 13, 2022

elalish commented May 13, 2022

elalish commented May 13, 2022

pca006132 commented May 13, 2022 •

edited

Loading

elalish commented May 13, 2022

pca006132 commented May 13, 2022

elalish commented May 13, 2022

pca006132 commented May 26, 2022

pca006132 commented Jun 19, 2022

elalish commented Jun 19, 2022

pca006132 commented Jul 9, 2022

pca006132 commented Aug 1, 2022

elalish commented Aug 2, 2022

pca006132 commented Aug 2, 2022

feature request: lazy boolean operations #114

feature request: lazy boolean operations #114

Comments

pca006132 commented May 13, 2022 • edited Loading

elalish commented May 13, 2022

pca006132 commented May 13, 2022

pca006132 commented May 13, 2022

elalish commented May 13, 2022

elalish commented May 13, 2022

pca006132 commented May 13, 2022 • edited Loading

elalish commented May 13, 2022

pca006132 commented May 13, 2022

elalish commented May 13, 2022

pca006132 commented May 26, 2022

pca006132 commented Jun 19, 2022

elalish commented Jun 19, 2022

pca006132 commented Jul 9, 2022

pca006132 commented Aug 1, 2022

elalish commented Aug 2, 2022

pca006132 commented Aug 2, 2022

pca006132 commented May 13, 2022 •

edited

Loading

pca006132 commented May 13, 2022 •

edited

Loading