
Why this crate?

Low-dimensional matrices and vectors (i.e. <= 4 dimensions) are so commonly used in multimedia applications that they deserve to be promoted to core types, and they are expected to be as fast as the hardware allows, at all times.

It is safe to assume that they are used a LOT, especially when they're processed en masse.
Positions, directions, curves, shapes, colors, texture coordinates, physics systems, animation systems, and many others... All of them need and process such types all the time. When they are fast, a lot of dependent systems greatly benefit from it.

It is quite a commitment to choose a vector/matrix library for a project - any game, engine or library that uses it is implicitly tied to its decisions regarding performance and functionality. When no existing library is suitable for the task, one ends up being written in-house, to variable quality standards.

Today, especially with Rust's power, it is possible to unconditionally have genericity, ergonomics, functionality and performance all at once.
With vek, I hope to provide these core types as generics, while generating high-quality assembly output on specific hardware, when appropriate types are used (and you're on Nightly with the repr_simd and repr_align features enabled).

Let's have our cake and eat it too!

Can I use it on Stable?

Yes! The repr_simd and repr_align features simply do nothing in this case.

How do I make my crate compile on both Stable and Nightly while using SIMD on Nightly only?

Use the excellent rustc_version crate in a custom build script.
You may want to look at how vek itself does this.
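
As a rough sketch (the nightly cfg name and the exact wiring are my assumptions, not necessarily vek's actual build script), a build.rs could look like this:

// build.rs - remember to add rustc_version to [build-dependencies].
extern crate rustc_version;
use rustc_version::{version_meta, Channel};

fn main() {
    // Emit a custom cfg flag only when building on a Nightly compiler, so the
    // crate can gate things like #![cfg_attr(nightly, feature(repr_simd))] on it.
    if let Ok(meta) = version_meta() {
        if meta.channel == Channel::Nightly {
            println!("cargo:rustc-cfg=nightly");
        }
    }
}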

Where does the name come from?

vec always sounded cool in my head. With a k at the end, it's even better!
Perhaps it's also because it reminds me of vk, the prefix used by the Vulkan API.

This crate is so slow to compile! (especially on Nightly with repr_simd!)

Yeah, it's quite a pain. There are ways to make it noticeably less slow, though:

Disable default features, then selectively enable the ones you need.
It's unlikely that you need absolutely all of vek's types, so pick the few that you'll actually use.

In particular, disabling the repr_simd feature should roughly halve build times.

Why not repr(packed)? How should I deal with this?

Borrowing members of packed structs is unsafe (see #46043).
One of the numerous implications of this is that packed structs cannot contain members that implement Drop.
So, this would mean goodbye to Vec4<BigInt>.

Arguably, this is weird, because vectors are essentially arrays, and getting a reference to an array's element is safe.
So, why aren't vectors implemented as arrays? Well, because IMO, the following is outright ridiculous:

let mut v = Vec4 { elements: [0, 1, 2, 3] };
let x = v.x();
*v.x_mut() = 42;

Compared to:

let mut v = Vec4 { x: 0, y: 1, z: 2, w: 3 };
let x = v.x;
v.x = 42;

So we're in quite a bind here. The choice I made was to remove repr(packed) but still have tests that demonstrate that members are actually packed (at least for basic types).

In any case, don't forget you may safely convert a vector to an array with into_array().
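
For instance (a minimal sketch; into_array() is the method mentioned above):

let v = Vec4 { x: 0, y: 1, z: 2, w: 3 };
let a = v.into_array(); // [0, 1, 2, 3], and no reference to a (potentially packed) field is ever taken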

How do I shuffle a vector?

With the ergonomic, idiomatic way to shuffle any struct in Rust!
You want destructuring:

let Vec4 { x, y, z, w } = xyzw;
let wzyx = Vec4 { x: w, y: z, z: y, w: x };
let xyxx = Vec4 { x, y, z: x, w: x };

But don't take my word for it - let the (release mode) assembly speak for itself!
On x86 with SSE, it lowers to shufps as wanted (provided you used repr_simd::Vec4).

If you're only interested in a single element, you can use broadcast (or even from):

let Vec4 { x, .. } = xyzw;
let xxxx = Vec4::broadcast(x);
let xxxx = Vec4::from(x);

Vec4s also provide their own shuffle API based on existing x86 intrinsics.
See Vec4::shuffled(...) for instance.

Why can't I index a matrix directly (i.e. write m[1][3])?

Because the actual meaning changes depending on the matrix's storage layout.
What you describe will probably do what you expect on row-major matrices, but how should it behave on column-major matrices? vek's mantra is to stay true to reality, so if it had to do this, it would just index the public member cols in this case.

But here's the thing - if you decided to switch the layout, no m[i][j] access would behave as expected anymore, and there's no way the compiler could catch this.

For these reasons, in vek, it is almost always required that you explicitly use the matrix's public member instead, which conveys the intent and enforces correctness.
Here's how you index matrices in vek (assuming, for instance, that i=1 and j=3):

  • Row-major, static indexing: (m.rows.y).w;
  • Row-major, dynamic indexing: m.rows[i][j];
  • Column-major, static indexing: (m.cols.w).y;
  • Column-major, dynamic indexing: m.cols[j][i];
  • Any layout, dynamic indexing: m[(i, j)].
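
Written out as a runnable sketch (assuming the row-major Mat4 with an identity() constructor, and i = 1, j = 3 as above):

let m = Mat4::<f32>::identity();
let a = (m.rows.y).w;  // static indexing
let b = m.rows[1][3];  // dynamic indexing
let c = m[(1, 3)];     // layout-agnostic dynamic indexing
assert_eq!(a, b);
assert_eq!(b, c);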

(Static indexing with x, y, z and w is not pretty-looking, but I wanted to reuse the vector types because of their representation, alignment requirements, etc.
Using tuples and/or creating tuple structs for this sole purpose isn't a good idea.)

In the same way, if you want to get a pointer to the matrix's data, e.g. for transferring it to a graphics API, get the address of the public member explicitly instead, which makes it clear whether you're sending an array of rows or an array of columns.
If you're using OpenGL, check out the gl_should_transpose() method!
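
Here's a hedged sketch of what that looks like (the cols field name follows this FAQ; the raw-pointer cast and the actual OpenGL call belong to your own bindings):

let m = Mat4::<f32>::identity(); // a column-major Mat4 here
let ptr = &m.cols as *const _ as *const f32; // explicitly an array of columns
// Pass ptr to e.g. your glUniformMatrix4fv binding, with the transpose
// argument derived from m.gl_should_transpose().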

Is there a rationale for the terseness? (a.k.a. Vec2 vs Vector2 and Mat4 vs Matrix4x4)

I believe the names match Rust's general terseness level.
Dynamic growable arrays are named Vec, not Vector, because they're so widely used that typing more than three characters becomes annoying.
It's the same reasoning behind names of shell commands such as ls.

Also, people accustomed to GLSL are familiar with names such as mat4 and vec4.

Finally, in Rust, we might tend to forget that we have renaming imports:

use vek::Vec2 as Vector2;
use vek::Mat4 as Matrix4x4;

let v = Vector2 { x: 13, y: 42 };
// ....

This crate misses one or more optimized operations for my target platform (or, some parts could be improved)!

I care about this. File an issue and let's see what we can do!

Why generics? [insert grumpy Mike Acton quote here]

TL;DR:
First, the historical problems with generics in C++ are the increase in build times and the possibly insanely long and confusing error messages, but neither of these applies to Rust (within reason, of course!).
Second, we can be generic and still generate efficient code. The compiler has enough information and Rust is backed by LLVM.
I hear that guy in the back saying "THAT doesn't guarantee anything!" with a smug face, to which I would reply "well if you're not happy with the assembly, AND if it is indeed a noticeable bottleneck, then that's where you should drop everything and use intrinsics".

The actual reason
Even though 32-bit floating-point happens to be the most common case, the algorithms are universal! There's no reason we might not someday switch to, say, fixed-point numbers or bignums.

Fixed-point numbers do provide some goodies:

  • Consistency of results across platforms and compiler options, useful for lockstep networking models;
  • They are the only option for representing real numbers on some esoteric platforms such as the Nintendo DS. So I want to be able to switch to using them painlessly.

Bignums also come up as an interesting target, even though we might wonder in which cases we need more than 64 bits' worth of integer range.

On the other hand, one thing that indeed plagues generics is that the code is written once but over-generalized, such that the compiler can't always "see" what the actual optimal code for the target hardware is.

#[repr(simd)] is good, but not a magic wand. It will lower basic vector operations into shuffles, packed arithmetic operations, etc., but it won't be able to "guess" when what you just wrote is actually the unpcklps instruction (in which case the generated assembly is an awful bunch of shufps instead).

It happens that sometimes we do want to use the intrinsics directly, but we still want to be generic!
That's why, in the future, I would like to provide specialized functions that lower to relevant intrinsics (for instance, Mat4<f32> would provide transposed_sse() on SSE-enabled x86 CPUs).

In any case, there's still the option to use intrinsics yourself.
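
For instance, here's a hedged sketch of calling the intrinsic mentioned above directly (it uses the std::arch module, which postdates this page; the wrapper function is purely illustrative):

#[cfg(target_arch = "x86_64")]
unsafe fn unpacklo(a: std::arch::x86_64::__m128, b: std::arch::x86_64::__m128) -> std::arch::x86_64::__m128 {
    // Interleaves the low halves of a and b; lowers to a single unpcklps.
    std::arch::x86_64::_mm_unpacklo_ps(a, b)
}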

(question from the future) Why aren't "specialized" implementations the default?

Because most hardware-specific intrinsics have semantics that bypass some of Rust's assumptions, among which are:

  • Alignment requirements (most load/store instructions);
  • Precision of floating-point operations (e.g. _mm_rsqrt_ps());
  • Handling of integer overflow (e.g. _mm_add_epi32());
  • Expectations from the user (e.g. _mm_cmpeq_ps() uses 0xffffffff in the output vector as the value for true).

The point is, hardware-specific intrinsics are, well, hardware-specific, which is why it's up to you to opt in explicitly.
The generic implementations may not be as efficient, but they won't backstab you either.
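
As a concrete (and hedged) illustration of the precision point above: _mm_rsqrt_ps is a fast approximation with roughly 12 bits of precision, not an exactly-rounded 1.0 / x.sqrt().

#[cfg(target_arch = "x86_64")]
unsafe fn approx_rsqrt(x: f32) -> f32 {
    use std::arch::x86_64::*;
    // The low bits of the result will generally differ from 1.0 / x.sqrt().
    _mm_cvtss_f32(_mm_rsqrt_ps(_mm_set1_ps(x)))
}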

Why don't I get pretty assembly on debug builds?

You don't actually care in this situation. It's release builds you're after.

Also keep in mind that Rust checks for integer overflow in debug builds, so e.g. your pretty Vec4<i32> addition won't be lowered to paddd on SSE2-enabled CPUs.
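
To make that concrete (a hedged sketch; Vec4::new and the + operator are assumed from the crate's vector API):

let a = Vec4::new(1i32, 2, 3, 4);
let b = Vec4::new(10i32, 20, 30, 40);
// Debug: four checked lane additions (which may panic on overflow).
// Release: a single paddd on SSE2, given repr_simd::Vec4.
let c = a + b;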

Why not [insert related crate here]?

As some have discussed, the perceived "quality" of a vector/matrix library is often a matter of personal preference.
I think it is normal and healthy that there are so many choices available! It's as healthy as having so many different game engines, operating systems, Web browsers, countries, and cultures.

So instead of trying to convince you why, for some reason, you shouldn't use one of the well-written libraries out there such as cgmath, nalgebra and vecmath, I'll try my best to explain the problems I want to solve with vek.

(Yeah, I know that XKCD, and not only because of the mint crate :) )

0. I'm a Not-Invented-Here guy

Totally.

In all seriousness, it's my first "public crate", and I think it's safe to say that it made me improve A LOT at Rust.
That wasn't really part of my goals, but it's a really handy consequence.

Also, it made me learn more than I wanted to know about the underlying maths.
I understand quaternions better, know of a way to picture the result of a cross product, and in general, have gone through confusions that I won't ever have to deal with again, because I've experienced them.
All in all, NIH is pretty great! 10/10

1. I don't want to worry anymore about my vectors and matrices being less efficient than they ought to be.

It's common to assume that the compiler can optimize everything (and often, it does), but that's a huge oversight for libraries that provide core types.

As a user, you might not realize that the "matrix * vector" products you use everywhere take twice as many instructions as they should.
Yes, you won't see the difference in "most cases", but in my (limited) experience, "most cases" means "moderately ambitious games running on x86-64 CPUs". That's why there's no noticeable slowdown (such CPUs are very forgiving compared to those of previous-generation consoles), but it shouldn't get in the way of "potentially ambitious games running on PC, consoles and mobile devices".

SSE and SSE2 have been around since 1999 and 2001 respectively. All x86-64 CPUs have them, and nearly every PC today is powered by such CPUs.
According to the Steam Hardware Survey, 100% of surveyed machines support SSE2.

So obviously, on such targets and in release builds, if my Vec4<f32> addition doesn't lower to the addps instruction, I'll get quite upset.

2. I want to be able to choose freely between row-major and column-major (and get rid of the confusion between them, while I'm at it).

Row-major matrices have their uses, just as column-major ones do. One should be allowed to pick the correct one for the job at any time and place.
It seems to be widely accepted that libraries offer only one of these two layouts (either always assumed, or selected at build time via #defines, as in GLM).

It happens that column-major matrices are good at multiplying themselves by a column vector, which is the most common case in computer graphics (because it's how most people transform vertices), but this doesn't mean this is somehow the One True Layout.

Row-major matrices are good at being the right-hand side of a product with a row vector. Also, one might just prefer them because of the indexing order.

This all boils down to giving more control to the user. Who am I to decide on your behalf?

3. Writing Extent2, Rgba and Rgb types from scratch every time gets old fast.

Back when I was using SFML, I would write stuff such as window.size.x, but something about it felt odd.
The Vulkan API was wise enough to define the VkExtent2D and VkExtent3D types for representing spatial extents, so I want these types too. What are they? Plain old vectors. But their members are named such that it is clear that we're dealing with widths, heights and depths.

The others that come back all too often are Rgb and Rgba. I need these all the time, either as Rgba<u8> or Rgba<f32>. They are used everywhere there are images and GUIs, and even more than that (that is, in pretty much every application or game I've ever known).
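
A small sketch of what that buys (the field names here - w/h and r/g/b/a - are assumptions based on the description above, so check the actual docs):

let size = Extent2 { w: 1920u32, h: 1080 };
let magenta = Rgba { r: 255u8, g: 0, b: 255, a: 255 };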

4. I want a library that is close to reality.

"Reality of the hardware" rather than "pretty pink pony mathematical reality".
I'm looking at libraries that abuse the type system to shoehorn in mathematical properties, resulting in a mess that gamedevs don't actually know or care about.

I don't need more abstraction. I want compression of information.

I don't need a pretty-looking mathematical model. I want to have access to the actual building blocks that don't actively try to hide what's actually happening.
There's no such type as OrthogonalMat4, AffineMat4 or the like. It's a damn Mat4, because that's what the hardware deals with.

5. I want to be able to have vectors/matrices of fixed-point numbers and bignums.

i.e. I don't want to be stuck with floating-point numbers.

6. I don't want to have to import a trait for every piece of functionality.

This could be mitigated by providing a prelude, but I don't want that.

Fundamentally, if I'm given an Rgba type, I don't want to have to import some ColorVector trait (or some prelude) to be able to call red() on it. The same goes for dot products, identity matrices, and whatever.

However, of course it's practical to make types implement relevant traits.

7. I want fully-fledged types with exhaustive abilities.

I don't want to pollute my mental cache by asking myself if I have to use fixed-size arrays, tuples, structs, or tuple structs for my vectors or matrices.
I want them readily available for use and never have to go back to, or question, their implementation.

I know how much I'll need these types for the foreseeable future.
