Implementation
Be sure to review Microsoft Learn: Library Internals.
Compiler conformance
For Visual C++, the projects use the default C++11/C++14 mode rather than /std:c++17 mode. The library does not use newer C++17 language and library features such as string_view or static_assert without a message, although that may change in the future. The projects build with /Wall, /permissive-, /Zc:__cplusplus, and /analyze to ensure a high level of C++ conformance.
For clang/LLVM for Windows, a CMakeLists.txt is provided to validate the code and ensure a high level of conformance. This primarily means addressing warnings generated using /Wall -Wpedantic -Wextra.
Language extensions
DirectXMath is written using standard Intel-style intrinsics, which should be portable to other compilers. The ARM and ARM64 codepaths use ARM-style intrinsics (earlier versions of the library used Visual C++ specific __n64 and __n128), so these are also portable.
The DirectXMath library makes use of two commonly implemented extensions to Standard C++:
- anonymous structs, which are widely supported and are part of the C11 standard. Note that the library also uses anonymous unions, but these are part of the C++ and C99 standard.
- #pragma once rather than old-style #define based guards; this is not standard, but is widely supported.
Because of these, DirectXMath is not compatible with Visual C++'s /Za switch, which enforces ISO C89 / C++11. It does work with /permissive-.
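To illustrate the first extension, here is a minimal sketch of a matrix-like type that nests an anonymous struct inside an anonymous union. The names `Float4x4Sketch` and `SetDiagonal` are hypothetical and exist only for this example; they are not library declarations.

```cpp
// Hypothetical type for illustration only; not a DirectXMath declaration.
struct Float4x4Sketch
{
    union // anonymous union: part of Standard C++
    {
        struct // anonymous struct: a common extension, not Standard C++
        {
            float _11, _12, _13, _14;
            float _21, _22, _23, _24;
            float _31, _32, _33, _34;
            float _41, _42, _43, _44;
        };
        float m[4][4];
    };
};

// Both views are expected to cover the same 16 floats with no padding.
static_assert(sizeof(Float4x4Sketch) == 16 * sizeof(float), "unexpected padding");

// Writes the diagonal through the named members exposed by the anonymous struct.
inline void SetDiagonal(Float4x4Sketch& M, float value) noexcept
{
    M._11 = value;
    M._22 = value;
    M._33 = value;
    M._44 = value;
}
```

Compilers in a strict conformance mode may warn about or reject the anonymous struct, which is the incompatibility described above.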
Naming conventions
- PascalCase for class names, methods, functions, and enums.
- camelCase for class member variables, struct members
- UPPERCASE for preprocessor defines (and nameless enums)
The library does not generally make use of Hungarian notation, which has been deprecated for Win32 C++ APIs for many years, with the exception of a few uses of p for pointers and sz for strings.
Type usage
The use of Standard C++ types is preferred, including the fundamental types supplied by the language (i.e. int, unsigned int, size_t, ptrdiff_t, bool, true/false, char, wchar_t) with the addition of the C99 fixed-width types (i.e. uint32_t, uint64_t, intptr_t, uintptr_t, etc.).
Avoid using Windows "portability" types except when dealing directly with Win32 APIs: VOID, UINT, INT, DWORD, FLOAT, BOOL, TRUE/FALSE, WCHAR, CONST, etc.
Error reporting
As a low-level math library, DirectXMath does not make use of C++ exception handling or HRESULT COM-style error values. Generally, parameter validation is limited to assert macros. All functions should be annotated with noexcept.
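As a rough sketch of this style, the function below validates its parameters only with assert and is marked noexcept. `SumComponents` is a hypothetical helper written for this example, not a DirectXMath function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration; not a DirectXMath function.
inline float SumComponents(const float* pSource, std::size_t count) noexcept
{
    // Debug-only validation; these checks compile out when NDEBUG is defined.
    assert(pSource != nullptr);
    assert(count > 0);

    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
    {
        total += pSource[i];
    }
    return total;
}

// The noexcept annotation can be checked at compile time.
static_assert(noexcept(SumComponents(nullptr, 0)), "SumComponents must be noexcept");
```

No error code is returned and nothing is thrown; a bad argument is caught only in builds where assert is active.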
SAL annotation
The DirectXMath library makes extensive use of SAL2 annotations (_In_, _Outptr_opt_, etc.), which greatly improve the accuracy of the Visual C++ static code analysis (also known as PREFAST). The standard Windows headers #define them all to empty strings if not building with /analyze, so they have no effect on code generation.
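The following sketch shows what annotated parameters can look like. `ScaleArray` is a hypothetical helper, and the empty fallback definitions for non-Microsoft toolchains are an assumption made so the example compiles anywhere; they are not taken from the library.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_MSC_VER)
#include <sal.h> // SAL2 annotation definitions on Visual C++
#endif

// Assumption for this sketch: where the annotations are not already
// defined, define the two used below as empty.
#ifndef _In_reads_
#define _In_reads_(count)
#endif
#ifndef _Out_writes_
#define _Out_writes_(count)
#endif

// Hypothetical helper for illustration; not a DirectXMath function.
// The annotations tell static analysis how many elements are read and written.
inline void ScaleArray(
    _Out_writes_(count) float* pDest,
    _In_reads_(count) const float* pSource,
    std::size_t count,
    float scale) noexcept
{
    assert(pDest != nullptr);
    assert(pSource != nullptr);

    for (std::size_t i = 0; i < count; ++i)
    {
        pDest[i] = pSource[i] * scale;
    }
}
```

Because the annotations expand to nothing outside of analysis builds, the generated code is the same with or without them.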
Calling-conventions
One of the more complicated aspects of DirectXMath's implementation is supporting the optimal SIMD calling convention, which differs per architecture. This is detailed on Microsoft Learn.
128-bit SIMD
    XMVECTOR XM_CALLCONV XMVectorHermite(FXMVECTOR Position0, FXMVECTOR Tangent0, FXMVECTOR Position1, GXMVECTOR Tangent1, float t) noexcept;
- XMVECTOR is the standard 128-bit SIMD register type, and it is returned by value.
- XM_CALLCONV is set to __vectorcall where supported, and to __fastcall otherwise unless the target compiler doesn't support it.
- FXMVECTOR is used for the first three SIMD parameters to support SIMD-passing behavior for __fastcall.
- GXMVECTOR is used for the fourth SIMD parameter to support __vectorcall and the ARM ABI passing of the first four SIMD registers.
- HXMVECTOR is used for the fifth and sixth SIMD parameters to support __vectorcall.
- CXMVECTOR is used for all remaining SIMD parameters, which are passed by 'const ref'.
In configurations where the platform doesn't support 6 SIMD registers, the types are equivalent to CXMVECTOR.
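The idea behind these parameter types can be sketched with a toy example: each alias resolves to pass-by-value when the calling convention can place that parameter in a register, and to a const reference otherwise. Everything here is hypothetical (`Vec4Sketch`, the `FVec4`/`GVec4`/`HVec4`/`CVec4` aliases, and the `SKETCH_SIMD_REG_ARGS` setting); the real library selects its types from architecture and compiler defines rather than a single number.

```cpp
// Hypothetical 4-float type standing in for a SIMD register type.
struct Vec4Sketch
{
    float x, y, z, w;
};

// Hypothetical setting: how many SIMD parameters the target calling
// convention passes in registers (0, 3, 4, or 6 in this sketch).
#ifndef SKETCH_SIMD_REG_ARGS
#define SKETCH_SIMD_REG_ARGS 0
#endif

#if SKETCH_SIMD_REG_ARGS >= 3
typedef const Vec4Sketch FVec4;  // parameters 1-3: by value
#else
typedef const Vec4Sketch& FVec4; // parameters 1-3: by const reference
#endif

#if SKETCH_SIMD_REG_ARGS >= 4
typedef const Vec4Sketch GVec4;  // parameter 4: by value
#else
typedef const Vec4Sketch& GVec4;
#endif

#if SKETCH_SIMD_REG_ARGS >= 6
typedef const Vec4Sketch HVec4;  // parameters 5-6: by value
#else
typedef const Vec4Sketch& HVec4;
#endif

typedef const Vec4Sketch& CVec4; // parameter 7 onward: always by const reference

// Seven vector parameters, using the aliases in positional order.
constexpr Vec4Sketch SumSeven(FVec4 a, FVec4 b, FVec4 c, GVec4 d,
                              HVec4 e, HVec4 f, CVec4 g) noexcept
{
    return Vec4Sketch{
        a.x + b.x + c.x + d.x + e.x + f.x + g.x,
        a.y + b.y + c.y + d.y + e.y + f.y + g.y,
        a.z + b.z + c.z + d.z + e.z + f.z + g.z,
        a.w + b.w + c.w + d.w + e.w + f.w + g.w };
}
```

The function signature stays the same on every platform; only the aliases change, which is why callers do not need per-architecture code.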
4x4 Matrix
    XMVECTOR XM_CALLCONV XMVector3Project(FXMVECTOR V,
        float ViewportX, float ViewportY, float ViewportWidth, float ViewportHeight,
        float ViewportMinZ, float ViewportMaxZ,
        FXMMATRIX Projection, CXMMATRIX View, CXMMATRIX World) noexcept;
Because of homogeneous vector aggregates, a matrix which consists of 4 SIMD values can be passed as if it were 4 individual SIMD values.
- FXMMATRIX is generally used if there are 0, 1, or 2 XMVECTOR parameters preceding the matrix.
- CXMMATRIX is used for all other matrix parameters, which are passed by 'const ref'.
Compiler directives
DirectXMath makes use of many preprocessor defines to target many different instruction sets and architectures.
A full table of defines can be found on Microsoft Learn.
    inline XMVECTOR XM_CALLCONV XMVectorRound(FXMVECTOR V) noexcept
    {
    #if defined(_XM_NO_INTRINSICS_)
        XMVECTORF32 Result = { { {
                MathInternal::round_to_nearest(V.vector4_f32[0]),
                MathInternal::round_to_nearest(V.vector4_f32[1]),
                MathInternal::round_to_nearest(V.vector4_f32[2]),
                MathInternal::round_to_nearest(V.vector4_f32[3])
            } } };
        return Result.v;
    #elif defined(_XM_ARM_NEON_INTRINSICS_)
    #if defined(_M_ARM64) || defined(_M_HYBRID_X86_ARM64) || defined(_M_ARM64EC) || __aarch64__
        // ARM_NEON v8 implementation
    #else
        // ARM-NEON v7 implementation
    #endif
    #elif defined(_XM_SSE4_INTRINSICS_)
        // SSE 4.1 implementation
    #elif defined(_XM_SSE_INTRINSICS_)
        // SSE/SSE2 implementation (the minimum required for x86/x64)
    #endif
    }
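The same per-architecture dispatch pattern can be tried out in a small self-contained function. This is only a sketch: `Add4` and the `SKETCH_USE_*` macros are hypothetical, and the detection relies on predefined compiler macros rather than the library's own _XM_ defines.

```cpp
#include <cassert>

// Pick one code path at compile time from predefined compiler macros.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#define SKETCH_USE_SSE 1
#include <xmmintrin.h>
#elif defined(__ARM_NEON) || defined(_M_ARM64)
#define SKETCH_USE_NEON 1
#include <arm_neon.h>
#endif

// Hypothetical helper: adds four floats from pA and pB into pDest.
inline void Add4(float* pDest, const float* pA, const float* pB) noexcept
{
    assert(pDest != nullptr);
    assert(pA != nullptr);
    assert(pB != nullptr);

#if defined(SKETCH_USE_SSE)
    // SSE path: unaligned loads, one packed add, unaligned store.
    _mm_storeu_ps(pDest, _mm_add_ps(_mm_loadu_ps(pA), _mm_loadu_ps(pB)));
#elif defined(SKETCH_USE_NEON)
    // ARM-NEON path.
    vst1q_f32(pDest, vaddq_f32(vld1q_f32(pA), vld1q_f32(pB)));
#else
    // Scalar fallback when no intrinsics are available.
    for (int i = 0; i < 4; ++i)
    {
        pDest[i] = pA[i] + pB[i];
    }
#endif
}
```

Every path produces the same result, so callers see a single function regardless of the instruction set selected at build time.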
Instruction Set Usage
See this blog series for more details on how each is applied to DirectXMath:
- DirectXMath - SSE, SSE2, and ARM-NEON
- DirectXMath - SSE3 and SSSE3
- DirectXMath - SSE4.1 and SSE4.2
- DirectXMath - AVX
- DirectXMath - F16C and FMA
- DirectXMath - ARM64
Implementation macros
- XM_ALIGNED_DATA is used to declare aligned data variables.
- XM_ALIGNED_STRUCT is used to declare an aligned struct.
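A minimal sketch of how alignment macros of this kind can be built on compiler-specific syntax follows. The `SKETCH_ALIGNED_DATA` and `SKETCH_ALIGNED_STRUCT` names, the `Float4ASketch` type, and the `g_SketchOnes` constant are hypothetical; the library's actual definitions may differ.

```cpp
// Hypothetical macros for illustration; compiler detection is an assumption.
#if defined(__GNUC__)
// GCC and clang attribute syntax.
#define SKETCH_ALIGNED_DATA(x) __attribute__((aligned(x)))
#define SKETCH_ALIGNED_STRUCT(x) struct __attribute__((aligned(x)))
#else
// Visual C++ syntax.
#define SKETCH_ALIGNED_DATA(x) __declspec(align(x))
#define SKETCH_ALIGNED_STRUCT(x) __declspec(align(x)) struct
#endif

// A 16-byte aligned struct, suitable for aligned SIMD loads and stores.
SKETCH_ALIGNED_STRUCT(16) Float4ASketch
{
    float x, y, z, w;
};

// A 16-byte aligned constant array.
SKETCH_ALIGNED_DATA(16) static const float g_SketchOnes[4] = { 1.0f, 1.0f, 1.0f, 1.0f };

static_assert(alignof(Float4ASketch) == 16, "struct should be 16-byte aligned");
```

Standard alignas could serve a similar purpose in new code; the macro form is shown here because it matches the macro-based approach described above.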
x86/x64
- XM_STREAM_PS, XM256_STREAM_PS, and XM_SFENCE, which are controlled by the _XM_NO_MOVNT_ define.
- XM_PERMUTE_PS is _mm_permute_ps when building for AVX and _mm_shuffle_ps when building for SSE/SSE2.
- XM_FMADD_PS and XM_FNMADD_PS, which are controlled by the use of FMA3 or not.
- XM_LOADU_SI16 is a fix-up for older versions of GNUC which were missing _mm_loadu_si16.
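As an illustration of the fused multiply-add case, a macro can map to the FMA3 intrinsic when it is available and to a separate multiply and add otherwise. This is a sketch under assumptions: `SKETCH_FMADD_PS`, `SKETCH_HAS_SSE`, and `MulAdd4` are hypothetical, and the feature test uses predefined compiler macros rather than the library's configuration defines.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#define SKETCH_HAS_SSE 1
#include <immintrin.h>

// Assumption: GCC and clang define __FMA__ when FMA3 is enabled, and
// Visual C++ is treated as FMA3-capable when __AVX2__ is defined.
#if defined(__FMA__) || (defined(_MSC_VER) && !defined(__clang__) && defined(__AVX2__))
#define SKETCH_FMADD_PS(a, b, c) _mm_fmadd_ps((a), (b), (c))
#else
#define SKETCH_FMADD_PS(a, b, c) _mm_add_ps(_mm_mul_ps((a), (b)), (c))
#endif
#endif

// Hypothetical helper: pDest[i] = pA[i] * pB[i] + pC[i] for four floats.
inline void MulAdd4(float* pDest, const float* pA, const float* pB, const float* pC) noexcept
{
    assert(pDest != nullptr);
    assert(pA != nullptr && pB != nullptr && pC != nullptr);

#if defined(SKETCH_HAS_SSE)
    const __m128 r = SKETCH_FMADD_PS(_mm_loadu_ps(pA), _mm_loadu_ps(pB), _mm_loadu_ps(pC));
    _mm_storeu_ps(pDest, r);
#else
    // Scalar fallback for non-x86 targets.
    for (int i = 0; i < 4; ++i)
    {
        pDest[i] = pA[i] * pB[i] + pC[i];
    }
#endif
}
```

The fused form rounds once while the fallback rounds twice, so results can differ in the last bit for some inputs.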
ARM/ARM64
- XM_PREFETCH is __prefetch or __builtin_prefetch for ARM/ARM64.