intrinsics module with alternative implementations #915
Merged
54 commits
08ec0aa
intrinsics module with fast sums
jalvesz c36251e
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 2207f41
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 2bc7af9
add fast dot_product and start tests
jalvesz 4625205
Merge branch 'intrinsics' of https://github.com/jalvesz/stdlib into i…
jalvesz 243ea6f
add complex sum test
jalvesz c38dcd6
test masked sum
jalvesz bf1ce2f
add dot_product tests
jalvesz cc9df61
start specs
jalvesz 671fd61
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 75945f1
split into submodules
jalvesz d05903f
specs and examples
jalvesz c0d96e5
Merge branch 'intrinsics' of https://github.com/jalvesz/stdlib into i…
jalvesz 4abd8d3
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 7c6e8a4
fix specs
jalvesz 7cea1fd
fix test: complex initialization
jalvesz eaffa4a
fix test: complex assignment caused accuracy loss
jalvesz ad64162
Merge branch 'fortran-lang:master' into intrinsics
jalvesz a3d24e4
extend fsum support for ndarrays
jalvesz 5a1fdcb
remove unnecessary definition
jalvesz 47396ac
update specs, change name of kahan kernel
jalvesz ecb7050
small reorganization
jalvesz 87ef502
Merge branch 'intrinsics' of https://github.com/jalvesz/stdlib into i…
jalvesz 14be974
change names to stdlib_*
jalvesz aaa68bc
add comments
jalvesz cc232e1
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 6e36b6f
extend kahan sum for rank N arrays
jalvesz 65175d7
Merge branch 'intrinsics' of https://github.com/jalvesz/stdlib into i…
jalvesz 8a35f38
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 16a0e96
Merge branch 'fortran-lang:master' into intrinsics
jalvesz f0ed271
Update src/stdlib_intrinsics.fypp
jalvesz 316269b
Update test/intrinsics/test_intrinsics.fypp
jalvesz 52aab02
Update test/intrinsics/test_intrinsics.fypp
jalvesz a6be0a0
fix test allocation
jalvesz 3e171f7
nmask allocation
jalvesz 332b748
revert nmask allocation
jalvesz 537cef8
change kahan reference
jalvesz a4370c2
Merge branch 'fortran-lang:master' into intrinsics
jalvesz 9c5b2e0
Refactor stdlib module functions to unify handling of integer, real, …
jalvesz 10add87
Update src/stdlib_intrinsics.fypp
jalvesz 4b1c7e7
Update src/stdlib_intrinsics.fypp
jalvesz 23981d6
Update src/stdlib_intrinsics.fypp
jalvesz db97da9
Update src/stdlib_intrinsics.fypp
jalvesz 1b9019c
Update src/stdlib_intrinsics.fypp
jalvesz 783aabd
Update doc/specs/stdlib_intrinsics.md
jalvesz f867f5a
Update doc/specs/stdlib_intrinsics.md
jalvesz a54b962
Update doc/specs/stdlib_intrinsics.md
jalvesz 6fd728d
Update doc/specs/stdlib_intrinsics.md
jalvesz 11fb555
Merge branch 'intrinsics' of https://github.com/jalvesz/stdlib into i…
jalvesz cf76219
fix documentation
jalvesz 86b0ebc
Update doc/specs/stdlib_intrinsics.md
jalvesz 7bc190e
Update doc/specs/stdlib_intrinsics.md
jalvesz d905ed8
Update doc/specs/stdlib_intrinsics.md
jalvesz 12612bc
Update doc/specs/stdlib_intrinsics.md
jalvesz
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
--- | ||
title: intrinsics | ||
--- | ||
|
||
# The `stdlib_intrinsics` module | ||
|
||
[TOC] | ||
|
||
## Introduction | ||
|
||
The `stdlib_intrinsics` module provides replacements for some well-known Fortran intrinsic functions for which a faster and/or more accurate implementation exists and has proven to be of interest to the Fortran community. | |
|
||
|
||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||
### `stdlib_sum` function | ||
|
||
#### Description | ||
|
||
The `stdlib_sum` function can replace the intrinsic `sum` for `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces the round-off error. This procedure is recommended when summing large arrays (e.g., more than 2**10 elements); for repeated summation of smaller arrays, consider the intrinsic `sum`. | |
|
||
#### Syntax | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x [,mask] )` | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x, dim [,mask] )` | ||
|
||
#### Status | ||
|
||
Experimental | ||
|
||
#### Class | ||
|
||
Pure function. | ||
|
||
#### Argument(s) | ||
|
||
`x`: N-D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`. | ||
|
||
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`. | ||
|
||
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`. | ||
|
||
#### Output value or Result value | ||
|
||
If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped. | |
|
||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||
### `stdlib_sum_kahan` function | ||
|
||
#### Description | ||
|
||
The `stdlib_sum_kahan` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by an `elemental` kernel based on the [Kahan summation](https://doi.org/10.1145%2F363707.363723) strategy to reduce the round-off error: | |
|
||
|
||
```fortran | ||
elemental subroutine kahan_kernel_<kind>(a,s,c) | ||
type(<kind>), intent(in) :: a | ||
type(<kind>), intent(inout) :: s | ||
type(<kind>), intent(inout) :: c | ||
type(<kind>) :: t, y | ||
y = a - c | ||
t = s + y | ||
c = (t - s) - y | ||
s = t | ||
end subroutine | ||
``` | ||
|
||
#### Syntax | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x [,mask] )` | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x, dim [,mask] )` | ||
|
||
#### Status | ||
|
||
Experimental | ||
|
||
#### Class | ||
|
||
Pure function. | ||
|
||
#### Argument(s) | ||
|
||
`x`: N-D array of either `real` or `complex` type. This argument is `intent(in)`. | |
|
||
|
||
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`. | ||
|
||
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`. | ||
|
||
#### Output value or Result value | ||
|
||
If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n equals the rank of `x`, whose shape is that of `x` with dimension `dim` dropped. | |
|
||
#### Example | ||
|
||
```fortran | ||
{!example/intrinsics/example_sum.f90!} | ||
``` | ||
|
||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||
### `stdlib_dot_product` function | ||
|
||
#### Description | ||
|
||
The `stdlib_dot_product` function can replace the intrinsic `dot_product` for 1D `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces the round-off error. This procedure is recommended for large arrays; for repeated products of smaller arrays, consider the intrinsic `dot_product`. | |
|
||
#### Syntax | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product(interface)]] ` (x, y)` | ||
|
||
#### Status | ||
|
||
Experimental | ||
|
||
#### Class | ||
|
||
Pure function. | ||
|
||
#### Argument(s) | ||
|
||
`x`: 1D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`. | ||
|
||
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`. | ||
|
||
#### Output value or Result value | ||
|
||
The output is a scalar of the same type and kind as `x` and `y`. | |
|
||
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --> | ||
### `stdlib_dot_product_kahan` function | ||
|
||
#### Description | ||
|
||
The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by the same `elemental` kernel based on the [Kahan summation](https://doi.org/10.1145%2F363707.363723) strategy used for `stdlib_sum_kahan` to reduce the round-off error. | |
|
||
#### Syntax | ||
|
||
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product_kahan(interface)]] ` (x, y)` | ||
|
||
#### Status | ||
|
||
Experimental | ||
|
||
#### Class | ||
|
||
Pure function. | ||
|
||
#### Argument(s) | ||
|
||
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`. | ||
|
||
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`. | ||
|
||
#### Output value or Result value | ||
|
||
The output is a scalar of the same type and kind as `x` and `y`. | |
|
||
#### Example | |
|
||
```fortran | ||
{!example/intrinsics/example_dot_product.f90!} | ||
``` |
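
The Kahan kernel quoted in the spec can be tried in isolation. The following is a minimal sketch, not the stdlib implementation: it assumes single precision (`real32`) in place of the templated `type(<kind>)`, drives the kernel with a plain scalar loop instead of the chunked scheme, and uses a double-precision `sum` as the reference value. The names `kahan_sketch`, `kahan_kernel`, and `ref` are illustrative.

```fortran
program kahan_sketch
    use, intrinsic :: iso_fortran_env, only: sp => real32, dp => real64
    implicit none
    integer, parameter :: n = 1000000      ! illustrative array size
    real(sp), allocatable :: x(:)
    real(sp) :: s, c                       ! running sum and compensation term
    real(dp) :: ref
    integer :: i

    allocate(x(n))
    call random_number(x)

    ! Reference value accumulated in double precision
    ref = sum(real(x, dp))

    ! Compensated summation: one kernel call per element
    s = 0.0_sp
    c = 0.0_sp
    do i = 1, n
        call kahan_kernel(x(i), s, c)
    end do

    print *, 'intrinsic sum error:', abs(real(sum(x), dp) - ref)
    print *, 'kahan sum error    :', abs(real(s, dp) - ref)

contains

    ! Same arithmetic as the kernel shown in the spec, fixed to real(sp)
    elemental subroutine kahan_kernel(a, s, c)
        real(sp), intent(in)    :: a
        real(sp), intent(inout) :: s, c
        real(sp) :: t, y
        y = a - c            ! apply the correction carried from the previous step
        t = s + y            ! low-order bits of y may be lost here
        c = (t - s) - y      ! recover what was lost
        s = t
    end subroutine kahan_kernel

end program kahan_sketch
```

The compensation relies on the expression `(t - s) - y` being evaluated as written, so value-unsafe floating-point optimizations (for example `-ffast-math` with gfortran) may cancel the benefit.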
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
ADD_EXAMPLE(sum) | ||
ADD_EXAMPLE(dot_product) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
program example_dot_product | ||
use stdlib_kinds, only: sp | ||
use stdlib_intrinsics, only: stdlib_dot_product, stdlib_dot_product_kahan | ||
implicit none | ||
|
||
real(sp), allocatable :: x(:), y(:) | ||
real(sp) :: total_prod(3) | ||
|
||
allocate( x(1000), y(1000) ) | ||
call random_number(x) | ||
call random_number(y) | ||
|
||
total_prod(1) = dot_product(x,y) !> compiler intrinsic | ||
total_prod(2) = stdlib_dot_product(x,y) !> chunked summation over inner product | ||
total_prod(3) = stdlib_dot_product_kahan(x,y) !> chunked kahan summation over inner product | ||
print *, total_prod(1:3) | ||
|
||
end program example_dot_product |
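
To illustrate what "chunked summation over inner product" can mean, here is a rough sketch under stated assumptions. It is not the code from `src/stdlib_intrinsics.fypp`: the chunk size of 64, the function name `chunked_dot`, and the handling of the remainder are all hypothetical choices. The idea is to keep a small array of partial accumulators that the compiler can vectorize, and to reduce it only once at the end.

```fortran
program chunked_dot_sketch
    use, intrinsic :: iso_fortran_env, only: sp => real32
    implicit none
    integer, parameter :: chunk = 64       ! hypothetical chunk size
    real(sp), allocatable :: x(:), y(:)

    allocate(x(1000), y(1000))
    call random_number(x)
    call random_number(y)

    print *, 'intrinsic:', dot_product(x, y)
    print *, 'chunked  :', chunked_dot(x, y)

contains

    ! Assumes size(a) == size(b); no checking is done in this sketch
    pure function chunked_dot(a, b) result(p)
        real(sp), intent(in) :: a(:), b(:)
        real(sp) :: p
        real(sp) :: acc(chunk)             ! independent partial accumulators
        integer :: i, n, r

        n = size(a)
        r = mod(n, chunk)                  ! elements that do not fill a whole chunk

        acc = 0.0_sp
        acc(1:r) = a(1:r) * b(1:r)         ! leading remainder (empty when r == 0)

        ! Each pass adds one full chunk of products to the accumulators
        do i = r + 1, n, chunk
            acc = acc + a(i:i+chunk-1) * b(i:i+chunk-1)
        end do

        p = sum(acc)                       ! final reduction of the partial sums
    end function chunked_dot

end program chunked_dot_sketch
```

Because each accumulator only sees about `n/chunk` additions, the partial sums stay closer in magnitude to the terms being added, which is one plausible reason a chunked scheme can reduce round-off compared with a single running total.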
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
program example_sum | ||
use stdlib_kinds, only: sp | ||
use stdlib_intrinsics, only: stdlib_sum, stdlib_sum_kahan | ||
implicit none | ||
|
||
real(sp), allocatable :: x(:) | ||
real(sp) :: total_sum(3) | ||
|
||
allocate( x(1000) ) | ||
call random_number(x) | ||
|
||
total_sum(1) = sum(x) !> compiler intrinsic | ||
total_sum(2) = stdlib_sum(x) !> chunked summation | ||
total_sum(3) = stdlib_sum_kahan(x) !> chunked kahan summation | |
print *, total_sum(1:3) | ||
|
||
end program example_sum |
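
The example above exercises only the rank-1, unmasked form. The spec documents `dim` and `mask` with the same meaning as for the intrinsic `sum`, so the result shapes can be previewed with the intrinsic alone. The sketch below deliberately uses only the intrinsic `sum` and does not call stdlib; the array values and the program name are illustrative.

```fortran
program sum_dim_mask_sketch
    implicit none
    real :: x(2,3)

    ! Column-major fill: x(:,1) = [1,2], x(:,2) = [3,4], x(:,3) = [5,6]
    x = reshape([1., 2., 3., 4., 5., 6.], [2, 3])

    print *, sum(x)                   ! scalar: all six elements, 21
    print *, sum(x, dim=1)            ! rank 1, shape [3]: column sums 3, 7, 11
    print *, sum(x, dim=2)            ! rank 1, shape [2]: row sums 9, 12
    print *, sum(x, mask = x > 2.)    ! scalar: only elements above 2, 18

    ! With dim present, the result has the shape of x with that dimension dropped
    print *, shape(sum(x, dim=1)), shape(sum(x, dim=2))
end program sum_dim_mask_sketch
```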