-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizations #106
Optimizations #106
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great, thanks! Would you mind including a link to your performance spreadsheet in this PR (ideally updated once it's finished)? 30% improvement on CUDA, especially for the larger test cases is quite impressive indeed!
@@ -50,12 +53,10 @@ void Manifold::Impl::Face2Tri(const VecDH<int>& faceEdge, | |||
ALWAYS_ASSERT(numEdge >= 3, topologyErr, "face has less than three edges."); | |||
const glm::vec3 normal = faceNormal[face]; | |||
|
|||
std::map<int, int> vertBary; | |||
for (int j = firstEdge; j < lastEdge; ++j) | |||
vertBary[halfedge[j].startVert] = halfedgeBary.H()[j]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I bet this could be a big part of why the triangulation step has been scaling worse than it should.
// Stable sort is required here so that halfedges from the same face are | ||
// paired together (the triangles were created in face order). In some | ||
// degenerate situations the triangulator can add the same internal edge in | ||
// two different faces, causing this edge to not be 2-manifold. We detect this | ||
// and fix it by swapping one of the identical edges, so it is important that | ||
// we have the edges paired according to their face. | ||
std::stable_sort(edge.begin(), edge.end()); | ||
thrust::stable_sort(edge.beginD(), edge.endD()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this new? I vaguely recall using std::
because thrust didn't have a stable_sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From git blame of https://github.com/NVIDIA/thrust/blame/fa54f2c6f1217237953f27ddf67f901b6b34fbdd/testing/stable_sort.cu, it seems that stable_sort
is in thrust for over 10 years. Perhaps it is related to cudatoolkit version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, I must have remembered wrong.
ad1e096
to
02ce2b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really love how much simpler this is! And perfTest
for CUDA is indeed running ~30% faster for me:
nTri = 512, time = 0.00647049 sec
nTri = 2048, time = 0.00899614 sec
nTri = 8192, time = 0.0170416 sec
nTri = 32768, time = 0.0450551 sec
nTri = 131072, time = 0.144635 sec
nTri = 524288, time = 0.524747 sec
nTri = 2097152, time = 1.89367 sec
nTri = 8388608, time = 7.68228 sec
@@ -266,7 +266,7 @@ Collider::Collider(const VecDH<Box>& leafBB, | |||
*/ | |||
template <typename T> | |||
SparseIndices Collider::Collisions(const VecDH<T>& querriesIn) const { | |||
int maxOverlaps = 1 << 20; | |||
int maxOverlaps = querriesIn.size() * 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
// Stable sort is required here so that halfedges from the same face are | ||
// paired together (the triangles were created in face order). In some | ||
// degenerate situations the triangulator can add the same internal edge in | ||
// two different faces, causing this edge to not be 2-manifold. We detect this | ||
// and fix it by swapping one of the identical edges, so it is important that | ||
// we have the edges paired according to their face. | ||
std::stable_sort(edge.begin(), edge.end()); | ||
thrust::stable_sort(edge.beginD(), edge.endD()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, I must have remembered wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the cleanup!
@@ -282,7 +282,11 @@ SparseIndices Collider::Collisions(const VecDH<T>& querriesIn) const { | |||
break; | |||
else { // if not enough memory was allocated, guess how much will be needed | |||
int lastQuery = querryTri.Get(0).H().back(); | |||
maxOverlaps *= 2; | |||
float ratio = static_cast<float>(querriesIn.size()) / lastQuery; | |||
if (ratio > 1000) // do not trust the ratio if it is too large |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Optimizations
optimizations as mentioned in #105