forked from FFTW/fftw3
Previously, some kernels were actually faster with the old SSE2 SIMD, which made it necessary to compile with both SSE2 and AVX enabled for good performance. This commit adds 128-bit AVX kernels, which are enabled together with the standard AVX kernels. Apart from being encoded with AVX rather than SSE instructions (depending on compiler flags), the new kernels also use a couple of instructions only available with AVX that take fewer micro-ops. These instructions have also been added to the 256-bit AVX SIMD implementation. No new configure flags are needed; it is just faster.
Erik Lindahl committed Mar 25, 2015
1 parent 131027a, commit b606e31
Showing 14 changed files with 393 additions and 30 deletions.
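The commit message mentions instructions that exist only with AVX but operate on 128-bit registers. The sketch below illustrates one such instruction, `vpermilps`, through the `_mm_permute_ps` intrinsic. It is not taken from this commit: the page does not say which instructions were added, and the helper names `flip_ri` and `flip_ri_demo` are hypothetical. The AVX path is compiled only when the compiler defines `__AVX__` (for example with `-mavx`); otherwise a scalar fallback with the same result is used.

```c
#include <stdio.h>

#if defined(__AVX__)
#include <immintrin.h>   /* AVX intrinsics, used only when AVX is enabled at compile time */
#endif

/* Hypothetical helper (not FFTW's code): swap the real and imaginary parts of
   two packed single-precision complex numbers,
   [re0, im0, re1, im1] -> [im0, re0, im1, re1]. */
void flip_ri(const float in[4], float out[4])
{
#if defined(__AVX__)
    __m128 v = _mm_loadu_ps(in);     /* unaligned 128-bit load */
    /* vpermilps on a 128-bit register: selector 0xB1 picks elements 1,0,3,2 */
    v = _mm_permute_ps(v, 0xB1);
    _mm_storeu_ps(out, v);           /* unaligned 128-bit store */
#else
    /* Scalar fallback so the example also builds without AVX */
    out[0] = in[1];
    out[1] = in[0];
    out[2] = in[3];
    out[3] = in[2];
#endif
}

/* Runs flip_ri on a fixed input, prints the result, and returns 1 if the
   pairs were swapped as expected, 0 otherwise. */
int flip_ri_demo(void)
{
    const float in[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float out[4];

    flip_ri(in, out);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);

    return out[0] == 2.0f && out[1] == 1.0f &&
           out[2] == 4.0f && out[3] == 3.0f;
}
```

Both paths produce the same values, so the example can be built with or without AVX flags. Whether the AVX form is actually cheaper depends on the microarchitecture, and the sketch makes no measurement of that.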
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
-SUBDIRS = common sse2 avx altivec neon | ||
+SUBDIRS = common sse2 avx avx-128 altivec neon | ||
EXTRA_DIST = n1b.h n1f.h n2b.h n2f.h n2s.h q1b.h q1f.h t1b.h t1bu.h \ | ||
t1f.h t1fu.h t2b.h t2f.h t3b.h t3f.h ts.h codlist.mk simd.mk |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
AM_CFLAGS = $(AVX_CFLAGS) | ||
SIMD_HEADER=simd-avx-128.h | ||
|
||
include $(top_srcdir)/dft/simd/codlist.mk | ||
include $(top_srcdir)/dft/simd/simd.mk | ||
|
||
if HAVE_AVX | ||
|
||
BUILT_SOURCES = $(EXTRA_DIST) | ||
noinst_LTLIBRARIES = libdft_avx_128_codelets.la | ||
libdft_avx_128_codelets_la_SOURCES = $(BUILT_SOURCES) | ||
|
||
endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
-SUBDIRS = common sse2 avx altivec neon | ||
+SUBDIRS = common sse2 avx avx-128 altivec neon | ||
EXTRA_DIST = hc2cbv.h hc2cfv.h codlist.mk simd.mk |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
AM_CFLAGS = $(AVX_CFLAGS) | ||
SIMD_HEADER=simd-avx-128.h | ||
|
||
include $(top_srcdir)/rdft/simd/codlist.mk | ||
include $(top_srcdir)/rdft/simd/simd.mk | ||
|
||
if HAVE_AVX | ||
|
||
noinst_LTLIBRARIES = librdft_avx_128_codelets.la | ||
BUILT_SOURCES = $(EXTRA_DIST) | ||
librdft_avx_128_codelets_la_SOURCES = $(BUILT_SOURCES) | ||
|
||
endif | ||
|
||
|