Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations #106

Merged
merged 7 commits into from
May 11, 2022
Merged

Optimizations #106

merged 7 commits into from
May 11, 2022

Conversation

pca006132
Copy link
Collaborator

optimizations as mentioned in #105

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, thanks! Would you mind including a link to your performance spreadsheet in this PR (ideally updated once it's finished)? 30% improvement on CUDA, especially for the larger test cases is quite impressive indeed!

@@ -50,12 +53,10 @@ void Manifold::Impl::Face2Tri(const VecDH<int>& faceEdge,
ALWAYS_ASSERT(numEdge >= 3, topologyErr, "face has less than three edges.");
const glm::vec3 normal = faceNormal[face];

std::map<int, int> vertBary;
for (int j = firstEdge; j < lastEdge; ++j)
vertBary[halfedge[j].startVert] = halfedgeBary.H()[j];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I bet this could be a big part of why the triangulation step has been scaling worse than it should.

manifold/src/face_op.cpp Outdated Show resolved Hide resolved
// Stable sort is required here so that halfedges from the same face are
// paired together (the triangles were created in face order). In some
// degenerate situations the triangulator can add the same internal edge in
// two different faces, causing this edge to not be 2-manifold. We detect this
// and fix it by swapping one of the identical edges, so it is important that
// we have the edges paired according to their face.
std::stable_sort(edge.begin(), edge.end());
thrust::stable_sort(edge.beginD(), edge.endD());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this new? I vaguely recall using std:: because thrust didn't have a stable_sort.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From git blame of https://github.com/NVIDIA/thrust/blame/fa54f2c6f1217237953f27ddf67f901b6b34fbdd/testing/stable_sort.cu, it seems that stable_sort is in thrust for over 10 years. Perhaps it is related to cudatoolkit version?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, I must have remembered wrong.

utilities/include/sparse.h Outdated Show resolved Hide resolved
@pca006132 pca006132 force-pushed the optimizations branch 2 times, most recently from ad1e096 to 02ce2b4 Compare May 10, 2022 14:58
Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really love how much simpler this is! And perfTest for CUDA is indeed running ~30% faster for me:

nTri = 512, time = 0.00647049 sec
nTri = 2048, time = 0.00899614 sec
nTri = 8192, time = 0.0170416 sec
nTri = 32768, time = 0.0450551 sec
nTri = 131072, time = 0.144635 sec
nTri = 524288, time = 0.524747 sec
nTri = 2097152, time = 1.89367 sec
nTri = 8388608, time = 7.68228 sec

@@ -266,7 +266,7 @@ Collider::Collider(const VecDH<Box>& leafBB,
*/
template <typename T>
SparseIndices Collider::Collisions(const VecDH<T>& querriesIn) const {
int maxOverlaps = 1 << 20;
int maxOverlaps = querriesIn.size() * 4;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

collider/src/collider.cpp Show resolved Hide resolved
// Stable sort is required here so that halfedges from the same face are
// paired together (the triangles were created in face order). In some
// degenerate situations the triangulator can add the same internal edge in
// two different faces, causing this edge to not be 2-manifold. We detect this
// and fix it by swapping one of the identical edges, so it is important that
// we have the edges paired according to their face.
std::stable_sort(edge.begin(), edge.end());
thrust::stable_sort(edge.beginD(), edge.endD());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, I must have remembered wrong.

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup!

@@ -282,7 +282,11 @@ SparseIndices Collider::Collisions(const VecDH<T>& querriesIn) const {
break;
else { // if not enough memory was allocated, guess how much will be needed
int lastQuery = querryTri.Get(0).H().back();
maxOverlaps *= 2;
float ratio = static_cast<float>(querriesIn.size()) / lastQuery;
if (ratio > 1000) // do not trust the ratio if it is too large
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@elalish elalish merged commit 48d51b6 into elalish:master May 11, 2022
@pca006132 pca006132 deleted the optimizations branch May 12, 2022 08:22
cartesian-theatrics pushed a commit to SovereignShop/manifold that referenced this pull request Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants