Removes the preprocessor check for FMA instructions in the matrix multiplication functions.
This simplifies the code and relies on the compiler to optimize it for the
available hardware. The assumption is that modern compilers emit FMA
instructions when the compilation target supports them, and fall back to
separate multiply and add instructions otherwise.
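
As a rough illustration of what this relies on, here is a minimal sketch of a plain multiply-accumulate loop with no FMA-specific preprocessor branch. The helper name dot_accumulate is hypothetical and does not come from the library. Whether the compiler actually fuses the multiply and add depends on the target flags and floating-point contraction settings in use.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only; not part of the library.
// There is no #ifdef around the arithmetic: the compiler decides whether
// a[i] * b[i] + acc becomes a fused multiply-add for the current target.
inline float dot_accumulate(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

The same source compiles for targets with and without FMA support, which is the simplification this change aims for.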
Ensures that the AVX2 intrinsics header is included only when the
OMATH_USE_AVX2 preprocessor definition is set. This prevents
compilation errors when AVX2 support is not available or is
explicitly disabled.
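
The guard pattern described above could look roughly like the following sketch. Only the OMATH_USE_AVX2 macro comes from this change; the add_arrays helper is a hypothetical example, and the vector path assumes the build enables the matching instruction set whenever the macro is defined.

```cpp
#include <cstddef>

#ifdef OMATH_USE_AVX2
#include <immintrin.h>  // pulled in only when the definition is set
#endif

// Hypothetical helper for illustration only: element-wise sum of two arrays.
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef OMATH_USE_AVX2
    // Vector path: processes 8 floats per iteration with 256-bit registers.
    for (; i + 8 <= n; i += 8) {
        const __m256 va = _mm256_loadu_ps(a + i);
        const __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    // Scalar path: handles the remainder, or everything when the macro is unset.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Without the definition, the translation unit never sees the intrinsics header or any intrinsic call, so it compiles on toolchains that lack AVX2 support.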
Optimizes matrix multiplication by specializing the algorithm
for the matrix storage order (row-major or column-major).
This change significantly improves performance by using the
memory access pattern suited to each storage order.
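
One way such a specialization can be structured is sketched below, assuming C++17. The names StoreOrder, SquareMat, and multiply are hypothetical and are not taken from the library; the actual implementation may differ. The idea is to pick the loop order at compile time so that the innermost loop walks contiguous memory.

```cpp
#include <array>
#include <cstddef>

// All names here are hypothetical, for illustration only.
enum class StoreOrder { RowMajor, ColumnMajor };

template <std::size_t N, StoreOrder Order>
struct SquareMat {
    std::array<float, N * N> data{};

    // Maps (row, column) to a flat index according to the storage order.
    static constexpr std::size_t index(std::size_t r, std::size_t c) {
        return Order == StoreOrder::RowMajor ? r * N + c : c * N + r;
    }
    float& at(std::size_t r, std::size_t c) { return data[index(r, c)]; }
    float at(std::size_t r, std::size_t c) const { return data[index(r, c)]; }
};

template <std::size_t N, StoreOrder Order>
SquareMat<N, Order> multiply(const SquareMat<N, Order>& a, const SquareMat<N, Order>& b) {
    SquareMat<N, Order> out;  // zero-initialized
    if constexpr (Order == StoreOrder::RowMajor) {
        // i-k-j order: the inner loop walks a row of b and out contiguously.
        for (std::size_t i = 0; i < N; ++i)
            for (std::size_t k = 0; k < N; ++k) {
                const float aik = a.at(i, k);
                for (std::size_t j = 0; j < N; ++j)
                    out.at(i, j) += aik * b.at(k, j);
            }
    } else {
        // j-k-i order: the inner loop walks a column of a and out contiguously.
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k) {
                const float bkj = b.at(k, j);
                for (std::size_t i = 0; i < N; ++i)
                    out.at(i, j) += a.at(i, k) * bkj;
            }
    }
    return out;
}
```

Both branches compute the same product; only the traversal order changes, and the unused branch is discarded at compile time.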
Moves linear algebra headers to a new subdirectory to improve project structure.
Updates includes to reflect the directory change.
Adds vcpkg to the tracked repositories.