This repository has been archived by the owner on Mar 1, 2023. It is now read-only.
Tags: EhViewer-NekoInverter/libjpeg-turbo
Tags
Neon/AArch64: Explicitly unroll quant loop w/Clang The loop in jsimd_quantize_neon() is only executed twice and should be unrolled for AArch64 targets. GCC does that by default, but Clang 11 and later versions available at the time of this writing do not. This patch adds an unroll pragma when targetting AArch64 with Clang. We do not use the unroll pragma for AArch32 targets, because it causes the Clang-generated assembly code to exhaust the available Neon registers (32 x 64-bit) and spill to the stack. (DRC: Referring to the discussion in #570, this is likely due to compiler confusion that results in poor register allocation. It is possible to eliminate the spillage and reduce the instruction count by loading the data on a just-in-time basis, thus explicitly interleaving compute and I/O, but the performance implications of that are currently unknown.) The effects of unrolling the quantization loop are: 1) elimination of the loop control flow overhead and 2) enabling the use of LDP/STP instructions that work from a single base pointer, instead of using double the number of LDR/STR instructions, each requiring an address calculation. Closes #570
jmemmgr.c: Pass correct size arg to jpeg_free_*() This issue was introduced in 5557fd2 due to an oversight, so it has existed in libjpeg-turbo since the project's inception. However, the issue is effectively a non-issue. Although #325 proposes allowing programs to override jpeg_get_*() and jpeg_free_*() externally, there is currently no way to override those functions without modifying the libjpeg-turbo source code. libjpeg-turbo only includes the malloc()/free() memory manager from libjpeg, and the implementation of jpeg_free_*() in that memory manager ignores the size argument. libjpeg had several additional memory managers for legacy systems (MS-DOS, System 7, etc.), but those memory managers ignored the size argument to jpeg_free_*() as well. Thus, this issue would have only potentially affected custom memory managers in downstream libjpeg-turbo forks, and since no one has complained until now, apparently those are rare. Fixes #542
PreviousNext