Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace GLM with linalg #984

Merged
merged 23 commits into from
Oct 15, 2024
Merged

Replace GLM with linalg #984

merged 23 commits into from
Oct 15, 2024

Conversation

elalish
Copy link
Owner

@elalish elalish commented Oct 11, 2024

Fixes #976
Related to #962

Here I'm removing our GLM dependency in favor of including a forked version of linalg directly, which is much smaller and simpler.

A question: should we have a follow-on that converts our public API from linalg::vec to std::array so that the linalg.h header is internal instead of public? linalg already provides these implicit conversions (through a copy).

@elalish elalish self-assigned this Oct 11, 2024
struct std_isfinite {
template <class A>
auto operator()(A a) const -> decltype(std::isfinite(a)) {
return std::isfinite(a);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple of functions that we commonly use.

include/manifold/linalg.h Outdated Show resolved Hide resolved
include/manifold/linalg.h Outdated Show resolved Hide resolved
@elalish
Copy link
Owner Author

elalish commented Oct 11, 2024

I've got 80% done I think. Most of the remaining errors will be cleared by taking care of the comments above, and then I've just got a few more specialized math functions to rename and/or implement.

@pca006132
Copy link
Collaborator

A question: should we have a follow-on that converts our public API from linalg::vec to std::array so that the linalg.h header is internal instead of public? linalg already provides these implicit conversions (through a copy).

Probably not needed, considering the header is included here already and it is small. Even if we return std::array, users still have to do conversion to their own math library data type, which is the same as exposing linalg types.

@elalish
Copy link
Owner Author

elalish commented Oct 12, 2024

Okay, still some work to do to get this compiling, but it's pretty close.

using mat2x3 = la::mat<double, 2, 3>;
using mat3 = la::mat<double, 3, 3>;
using mat4x3 = la::mat<double, 4, 3>;
using mat3x4 = la::mat<double, 3, 4>;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed our matrix naming from the stupid backwards way GLM had it.

inline double degrees(double a) { return a * 180 / kPi; }

inline mat3x4 Identity3x4() { return mat3x4(mat3(la::identity), vec3(0.0)); }
inline mat2x3 Identity2x3() { return mat2x3(mat2(la::identity), vec2(0.0)); }
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linalg doesn't do non-square identities like GLM did.

src/utils.h Show resolved Hide resolved
///@}

constexpr double kPi = 3.14159265358979323846264338327950288;
constexpr double kTwoPi = 6.28318530717958647692528676655900576;
constexpr double kHalfPi = 1.57079632679489661923132169163975144;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this seem like a reasonable place to define these?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cmath should have M_PI and M_PI_2 defined, but anyway this should be fine.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, well an hour later I know understand painfully well to never use M_PI.

@elalish
Copy link
Owner Author

elalish commented Oct 13, 2024

Okay, I've got it compiling and running now (locally at least). Some tests even pass! I definitely have some math errors to track down. @pca006132 if you have a chance to look, it would be great to have another set of eyes on this. I bet the problems are mostly just typos. Also the CI is suddenly more annoyed about size_t -> int conversions.

Copy link
Collaborator

@pca006132 pca006132 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the narrowing warnings, we can just disable them for now. We do so many narrowing operations that it is not worth it to mark them explicitly each time.

using ivec3 = glm::vec<3, int>;
using ivec4 = glm::vec<4, int>;
using quat = glm::dquat;
namespace la = linalg;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we want this namespace alias in our public header. Maybe define this in our internal header?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this is fine, we are having la alias inside our manifold namespace, so this will not cause conflict elsewhere.

///@}

constexpr double kPi = 3.14159265358979323846264338327950288;
constexpr double kTwoPi = 6.28318530717958647692528676655900576;
constexpr double kHalfPi = 1.57079632679489661923132169163975144;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cmath should have M_PI and M_PI_2 defined, but anyway this should be fine.

constexpr double kTwoPi = 6.28318530717958647692528676655900576;
constexpr double kHalfPi = 1.57079632679489661923132169163975144;

inline double radians(double a) { return a * kPi / 180; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constexpr?

src/csg_tree.cpp Show resolved Hide resolved
src/csg_tree.cpp Outdated
@@ -111,29 +111,29 @@ std::shared_ptr<CsgNode> CsgNode::Boolean(
}

std::shared_ptr<CsgNode> CsgNode::Translate(const vec3 &t) const {
mat4x3 transform(1.0);
mat3x4 transform(1.0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the identity matrix, this is a matrix with all entries = 1.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - I had fixed a bunch of those, but must have missed this one.

-Wno-unused
-Wno-array-bounds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those are left from the thrust era.

inMesh.triVerts[3 * inTri + 1],
inMesh.triVerts[3 * inTri + 2]};
inTriangle *= inMesh.numProp;
ivec3 inTriangle(inMesh.triVerts[3 * inTri],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, the compiler is more picky when it performs implicit conversion when putting things inside the implicit constructor. Maybe the more implicit it gets, the more likely it is an issue...

@pca006132
Copy link
Collaborator

Windows users wanted! Can someone using Windows check if this PR is faster/slower comparing with the current master, when running extras/perfTest and test/manifold_test.

In addition, try the following patch and see if it affects anything:

diff --git a/src/polygon.cpp b/src/polygon.cpp
index 8f8fcdaa..6625288f 100644
--- a/src/polygon.cpp
+++ b/src/polygon.cpp
@@ -30,11 +30,6 @@ static ExecutionParams params;
 
 constexpr double kBest = -std::numeric_limits<double>::infinity();
 
-// it seems that MSVC cannot optimize la::determinant(mat2(a, b))
-constexpr double determinant2x2(vec2 a, vec2 b) {
-  return a.x * b.y - a.y * b.x;
-}
-
 #ifdef MANIFOLD_DEBUG
 struct PolyEdge {
   int startVert, endVert;
@@ -180,7 +175,7 @@ bool IsConvex(const PolygonsIdx &polys, double precision) {
     for (size_t v = 0; v < poly.size(); ++v) {
       const vec2 edge =
           v + 1 < poly.size() ? poly[v + 1].pos - poly[v].pos : firstEdge;
-      const double det = determinant2x2(lastEdge, edge);
+      const double det = la::determinant(mat2(lastEdge, edge));
       if (det <= 0 ||
           (std::abs(det) < precision && la::dot(lastEdge, edge) < 0))
         return false;
@@ -454,11 +449,11 @@ class EarClip {
     // goes to the outside. No need to check the other side, since all verts are
     // processed in the EarCost loop.
     double SignedDist(VertItr v, vec2 unit, double precision) const {
-      double d = determinant2x2(unit, v->pos - pos);
+      double d = la::determinant(mat2(unit, v->pos - pos));
       if (std::abs(d) < precision) {
-        double dR = determinant2x2(unit, v->right->pos - pos);
+        double dR = la::determinant(mat2(unit, v->right->pos - pos));
         if (std::abs(dR) > precision) return dR;
-        double dL = determinant2x2(unit, v->left->pos - pos);
+        double dL = la::determinant(mat2(unit, v->left->pos - pos));
         if (std::abs(dL) > precision) return dL;
       }
       return d;
@@ -470,7 +465,7 @@ class EarClip {
       double cost = std::min(SignedDist(v, rightDir, precision),
                              SignedDist(v, left->rightDir, precision));
 
-      const double openCost = determinant2x2(openSide, v->pos - right->pos);
+      const double openCost = la::determinant(mat2(openSide, v->pos - right->pos));
       return std::min(cost, openCost);
     }
 
@@ -672,7 +667,7 @@ class EarClip {
     auto AddPoint = [&](VertItr v) {
       bBox.Union(v->pos);
       const double area1 =
-          determinant2x2(v->pos - origin, v->right->pos - origin);
+          la::determinant(mat2(v->pos - origin, v->right->pos - origin));
       const double t1 = area + area1;
       areaCompensation += (area - t1) + area1;
       area = t1;

Copy link
Owner Author

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for debugging this for me, not to mention all the cleanup!

@@ -578,5 +578,3 @@ struct ExecutionParams {
};

} // namespace manifold

#undef HOST_DEVICE
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@elalish elalish changed the title [WIP] Replace GLM with linalg Replace GLM with linalg Oct 15, 2024
@elalish
Copy link
Owner Author

elalish commented Oct 15, 2024

I like the idea of simplifying the determinant above - let's do that as a follow-up, once we have assurance from Windows that it's not slower.

@elalish elalish merged commit e067653 into master Oct 15, 2024
19 checks passed
@elalish elalish deleted the linalg branch October 15, 2024 18:22
@fire
Copy link
Contributor

fire commented Oct 15, 2024

Horray!

@pca006132
Copy link
Collaborator

Windows users wanted! Can someone using Windows check if this PR is faster/slower comparing with the current master, when running extras/perfTest and test/manifold_test.

In addition, try the following patch and see if it affects anything:

diff --git a/src/polygon.cpp b/src/polygon.cpp
index 8f8fcdaa..6625288f 100644
--- a/src/polygon.cpp
+++ b/src/polygon.cpp
@@ -30,11 +30,6 @@ static ExecutionParams params;
 
 constexpr double kBest = -std::numeric_limits<double>::infinity();
 
-// it seems that MSVC cannot optimize la::determinant(mat2(a, b))
-constexpr double determinant2x2(vec2 a, vec2 b) {
-  return a.x * b.y - a.y * b.x;
-}
-
 #ifdef MANIFOLD_DEBUG
 struct PolyEdge {
   int startVert, endVert;
@@ -180,7 +175,7 @@ bool IsConvex(const PolygonsIdx &polys, double precision) {
     for (size_t v = 0; v < poly.size(); ++v) {
       const vec2 edge =
           v + 1 < poly.size() ? poly[v + 1].pos - poly[v].pos : firstEdge;
-      const double det = determinant2x2(lastEdge, edge);
+      const double det = la::determinant(mat2(lastEdge, edge));
       if (det <= 0 ||
           (std::abs(det) < precision && la::dot(lastEdge, edge) < 0))
         return false;
@@ -454,11 +449,11 @@ class EarClip {
     // goes to the outside. No need to check the other side, since all verts are
     // processed in the EarCost loop.
     double SignedDist(VertItr v, vec2 unit, double precision) const {
-      double d = determinant2x2(unit, v->pos - pos);
+      double d = la::determinant(mat2(unit, v->pos - pos));
       if (std::abs(d) < precision) {
-        double dR = determinant2x2(unit, v->right->pos - pos);
+        double dR = la::determinant(mat2(unit, v->right->pos - pos));
         if (std::abs(dR) > precision) return dR;
-        double dL = determinant2x2(unit, v->left->pos - pos);
+        double dL = la::determinant(mat2(unit, v->left->pos - pos));
         if (std::abs(dL) > precision) return dL;
       }
       return d;
@@ -470,7 +465,7 @@ class EarClip {
       double cost = std::min(SignedDist(v, rightDir, precision),
                              SignedDist(v, left->rightDir, precision));
 
-      const double openCost = determinant2x2(openSide, v->pos - right->pos);
+      const double openCost = la::determinant(mat2(openSide, v->pos - right->pos));
       return std::min(cost, openCost);
     }
 
@@ -672,7 +667,7 @@ class EarClip {
     auto AddPoint = [&](VertItr v) {
       bBox.Union(v->pos);
       const double area1 =
-          determinant2x2(v->pos - origin, v->right->pos - origin);
+          la::determinant(mat2(v->pos - origin, v->right->pos - origin));
       const double t1 = area + area1;
       areaCompensation += (area - t1) + area1;
       area = t1;

@starseeker do you have windows machines around for benchmark? If yes, can you try this and see if there is any difference in performance?

@starseeker
Copy link
Contributor

@starseeker do you have windows machines around for benchmark? If yes, can you try this and see if there is any difference in performance?

I do, but it will take some time to get set up on on it.

@starseeker
Copy link
Contributor

Without patch:

.\perfTest.exe
nTri = 512, time = 0.002399 sec
nTri = 2048, time = 0.0057338 sec
nTri = 8192, time = 0.0191217 sec
nTri = 32768, time = 0.0676792 sec
nTri = 131072, time = 0.256196 sec
nTri = 524288, time = 1.04938 sec

.\largeSceneTest.exe
n = 20
nTri = 91814, time = 24.9764 sec

With patch:

.\perfTest.exe
nTri = 512, time = 0.0020762 sec
nTri = 2048, time = 0.0060967 sec
nTri = 8192, time = 0.0194322 sec
nTri = 32768, time = 0.0685545 sec
nTri = 131072, time = 0.265982 sec
nTri = 524288, time = 1.10732 sec
nTri = 2097152, time = 4.72829 sec

.\largeSceneTest.exe
n = 20
nTri = 91814, time = 23.7445 sec

@elalish
Copy link
Owner Author

elalish commented Oct 16, 2024

Okay, looks like we should apply the patch then, thanks! 👍

@pca006132
Copy link
Collaborator

The weird thing though is that the performance difference is pretty large and inconsistent, maybe we should try multiple runs and account for deviations, and let the system rest before running another run because the CPU may be hotter for example. Alternatively, we may want to run these single threaded with turboboost disabled to avoid deviation due to CPU states.

E.g. for nTri = 524288, the patch made it slower by 4.5%, and pretty consistently slower for other sphere sizes. For largeSceneTest however, it is somehow 5% faster.

RahimovIR pushed a commit to RahimovIR/manifold that referenced this pull request Oct 18, 2024
* added linalg

* removed GLM

* fix matrix multiplication

* more matrix constructors

* fixed more functions

* fixed matrix notation

* more fixes

* fix ostream

* further fixes

* added more rotation functions

* compiles

* some fixes

* fix2

* fix build

* another warning...

* fix install

* for wasm?

* no warning for unknown options

* proper fix?

* remove legacy warning flags

* format

* misc constexpr

---------

Co-authored-by: pca006132 <john.lck40@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor public API to hide glm as an implementation detail
4 participants