Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[raymath] Matrix determinant calculation speed up (enhancement) #4780

Closed
raylibfan opened this issue Feb 20, 2025 · 3 comments
Closed

[raymath] Matrix determinant calculation speed up (enhancement) #4780

raylibfan opened this issue Feb 20, 2025 · 3 comments

Comments

@raylibfan
Copy link

More then 2.0x faster Matrix determinant calculation can be achieved

In raymath.h line 1467 we have such realisation:

// Compute matrix determinant
RMAPI float MatrixDeterminant(Matrix mat)
{
	float result = 0.0f;

	// Cache the matrix values (speed optimization)
	float a00 = mat.m0, a01 = mat.m1, a02 = mat.m2, a03 = mat.m3;
	float a10 = mat.m4, a11 = mat.m5, a12 = mat.m6, a13 = mat.m7;
	float a20 = mat.m8, a21 = mat.m9, a22 = mat.m10, a23 = mat.m11;
	float a30 = mat.m12, a31 = mat.m13, a32 = mat.m14, a33 = mat.m15;

	result = a30 * a21 * a12 * a03 - a20 * a31 * a12 * a03 - a30 * a11 * a22 * a03 + a10 * a31 * a22 * a03 +
		 a20 * a11 * a32 * a03 - a10 * a21 * a32 * a03 - a30 * a21 * a02 * a13 + a20 * a31 * a02 * a13 +
		 a30 * a01 * a22 * a13 - a00 * a31 * a22 * a13 - a20 * a01 * a32 * a13 + a00 * a21 * a32 * a13 +
		 a30 * a11 * a02 * a23 - a10 * a31 * a02 * a23 - a30 * a01 * a12 * a23 + a00 * a31 * a12 * a23 +
		 a10 * a01 * a32 * a23 - a00 * a11 * a32 * a23 - a20 * a11 * a02 * a33 + a10 * a21 * a02 * a33 +
		 a20 * a01 * a12 * a33 - a00 * a21 * a12 * a33 - a10 * a01 * a22 * a33 + a00 * a11 * a22 * a33;

	return result;
}

It takes 72 multiplication to calculate 4x4 matrix determinant as one can see.

Meanwhile, accordinly to this Wikipedia's article https://en.wikipedia.org/wiki/Laplace_expansion , this calculation can be reduced to (4 + 4 * (3 + 3 * 2)) = 40 multiplication by decreasing matrix size from 4x4 to 2x2 using minors.

Such algorithm is used in blender 3d application source code as for example: https://github.com/blender/blender/blob/main/source/blender/blenlib/intern/math_matrix_c.cc from LINE 1837

So we can replace those existing code fragment from above by something like this (using raylib function naming convention):

RMAPI float DeterminantM2(const float a, const float b, const float c, const float d)
{
	return (a * d - b * c);
}

RMAPI float DeterminantM3(float a1, float a2, float a3, float b1, float b2, float b3, float c1, float c2, float c3)
{
	return (a1 * DeterminantM2(b2, b3, c2, c3) - b1 * DeterminantM2(a2, a3, c2, c3) + c1 * DeterminantM2(a2, a3, b2, b3));
}

// Compute matrix determinant
RMAPI float MatrixDeterminant(Matrix m)
{
	return (m.m0  *  DeterminantM3(m.m5, m.m6, m.m7, m.m9, m.m10, m.m11, m.m13, m.m14, m.m15) -
		m.m4  *  DeterminantM3(m.m1, m.m2, m.m3, m.m9, m.m10, m.m11, m.m13, m.m14, m.m15) +
		m.m8  *  DeterminantM3(m.m1, m.m2, m.m3, m.m5, m.m6,  m.m7,  m.m13, m.m14, m.m15) -
		m.m12 *  DeterminantM3(m.m1, m.m2, m.m3, m.m5, m.m6,  m.m7,  m.m9,  m.m10, m.m11));
}

I wrote some simple test application to check perfomance boost. New or "blender style" function is more then twice faster, then current "raylib style".

Issue Screenshot

Image

P.S. I am crab-handed and can not provide a PR :)

@raysan5
Copy link
Owner

raysan5 commented Feb 23, 2025

@raylibfan Thanks for the improvement, please, note that raymath follows some strict rules on its functions implementations, you can check them on the file-header description. Two of the rules are, using a result variable for return and avoid functions-calling-functions, function should be self-contained.

Please, could you review provided code to follow those rules?

@raysan5 raysan5 changed the title [rcore] Matrix determinant calculation speed up (enhancement) [raymath] Matrix determinant calculation speed up (enhancement) Feb 23, 2025
@raylibfan
Copy link
Author

raylibfan commented Feb 23, 2025

// updated version without calling extra functions

RMAPI float MatrixDeterminant(Matrix mat)
{
	float result = 0.0f;
	
	// Cache the matrix values (speed optimization)
	float m0  = mat.m0,  m1 =  mat.m1,  m2 = mat.m2,   m3 = mat.m3;
	float m4  = mat.m4,  m5 =  mat.m5,  m6 = mat.m6,   m7 = mat.m7;
	float m8  = mat.m8,  m9 =  mat.m9,  m10 = mat.m10, m11 = mat.m11;
	float m12 = mat.m12, m13 = mat.m13, m14 = mat.m14, m15 = mat.m15;

	result = (m0  * ((m5 * (m10 * m15 - m11 * m14) - m9 * (m6 * m15 - m7 * m14) + m13 * (m6 * m11 - m7 * m10))) -
			  m4  * ((m1 * (m10 * m15 - m11 * m14) - m9 * (m2 * m15 - m3 * m14) + m13 * (m2 * m11 - m3 * m10))) +
			  m8  * ((m1 * (m6 * m15 -  m7 * m14)  - m5 * (m2 * m15 - m3 * m14) + m13 * (m2 * m7  - m3 * m6))) -
			  m12 * ((m1 * (m6 * m11 -  m7 * m10)  - m5 * (m2 * m11 - m3 * m10) + m9  * (m2 * m7  - m3 * m6))));

	return result;
}

P.S. I made similar improvement to the Godot game engine. So I tested all versions of determinant calculation in a single test program:

Image

P.P.S. Today I found even more optimized version of 4x4 matrix determinant calculation with only 30 multiplication. It may be mathematicaly simplified version of my code. But I am not checked and tested it yet. But it worth looking.

https://github.com/FreeCAD/FreeCAD/blob/main/src/Base/Matrix.cpp LINE 164

(they store matrix with indexes differently but as soon as we have a structure - it's seems work fine)

raysan5 added a commit that referenced this issue Feb 23, 2025
@raysan5
Copy link
Owner

raysan5 commented Feb 23, 2025

@raylibfan thanks for the improvement, I updated the function!

@raysan5 raysan5 closed this as completed Feb 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants