PPU/SPU LLVM: Use native ARM shuffles in recompilers instead of emulating x86 pshufb by Whatcookie · Pull Request #18056 · RPCS3/rpcs3

Finally emulates SHUFB, the PS3's most iconic instruction (according to me), efficiently on ARM machines too!
This brings SHUFB from 9 instructions down to 5, though it could be 4 if LLVM would just emit BCAX...
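
For readers unfamiliar with the instruction, here is a minimal scalar reference model of what SHUFB computes, written from the SPU ISA description. It is only a sketch for illustration, not the recompiler code in this PR: the function name `spu_shufb` is hypothetical, and bytes are indexed in the ISA's big-endian order (byte 0 is the leftmost), which is an assumption of the model rather than a statement about how the emulator lays registers out in host memory.

```python
def spu_shufb(ra: bytes, rb: bytes, rc: bytes) -> bytes:
    """Scalar model of SPU SHUFB: each byte of rc selects one result byte.

    Hypothetical helper for illustration; bytes use the ISA's big-endian order.
    """
    assert len(ra) == len(rb) == len(rc) == 16
    table = ra + rb  # 32-byte table: ra is bytes 0-15, rb is bytes 16-31
    out = bytearray(16)
    for i, sel in enumerate(rc):
        if sel & 0xC0 == 0x80:    # 10xxxxxx -> constant 0x00
            out[i] = 0x00
        elif sel & 0xE0 == 0xC0:  # 110xxxxx -> constant 0xFF
            out[i] = 0xFF
        elif sel & 0xE0 == 0xE0:  # 111xxxxx -> constant 0x80
            out[i] = 0x80
        else:                     # 0xxxxxxx -> table byte picked by the low 5 bits
            out[i] = table[sel & 0x1F]
    return bytes(out)
```

The special constants are what make a direct mapping awkward. x86 `pshufb` only indexes a 16-byte table and zeroes a lane when bit 7 of the selector is set, so the 32-byte lookup and the 0xFF/0x80 cases need extra instructions. A two-register ARM `TBL` indexes all 32 bytes at once and returns zero for out-of-range indices, which is a closer fit for the lookup part.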

This should result in a nice speedup on ARM machines. I will tackle the ROTQBY family of instructions in another pull request.