8299525: RISC-V: Add backend support for half float conversion intrinsics #11828
👋 Welcome back yadongwang! A progress list of the required criteria for merging this PR into |
src/hotspot/cpu/riscv/riscv.ad
Outdated
%}

ins_encode %{
__ fmv_h_x($tmp$$FloatRegister, $src$$Register);
The indentation here is two spaces instead of four spaces.
src/hotspot/cpu/riscv/riscv.ad
Outdated
%}

ins_encode %{
__ fcvt_h_s($tmp$$FloatRegister, $src$$FloatRegister);
Same issue here.
src/hotspot/cpu/riscv/riscv.ad
Outdated
__ fcvt_s_h($dst$$FloatRegister, $tmp$$FloatRegister);
%}

ins_pipe(fp_f2d);
According to the types of src and dst, we can use fp_i2f as the ins_pipe.
src/hotspot/cpu/riscv/riscv.ad
Outdated
__ fmv_x_h($dst$$Register, $tmp$$FloatRegister);
%}

ins_pipe(fp_f2d);
Same here, fp_f2i is better.
jdk/src/hotspot/cpu/riscv/vm_version_riscv.cpp Lines 51 to 74 in 82deb5c
LGTM
Updated change looks good.
@yadongw This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 112 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@RealFYang) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type
/integrate
/sponsor
Going to push as commit 3a66737.
Your commit was automatically rebased without conflicts.
@RealFYang @yadongw Pushed as commit 3a66737. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
This patch adds RISC-V backend support for the library intrinsics that convert between half-precision and single-precision floats, using the RISC-V Zfhmin extension. Zfhmin was ratified in November 2021 (https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions) and is one of the RVA22U64 mandatory extensions (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva22-profiles).
The C2 output for PrintOptoAssembly:
0dc B10: # out( B33 B11 ) <- in( B9 ) Freq: 1.99802
0dc + flw F1, [R29, #16] # float, #@LoadF
0e0 + fcvt.h.s F0, F1 #@convF2HF_reg_reg
      fmv.x.h R8, F0 #@convF2HF_reg_reg

0dc B10: # out( B33 B11 ) <- in( B9 ) Freq: 1.99801
0dc + lh R12, [R11, #16] # short, #@loads
0e0 + fmv.h.x F0, R12 #@convHF2F_reg_reg
      fcvt.s.h F1, F0 #@convHF2F_reg_reg
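For readers unfamiliar with the format, the fmv.h.x / fcvt.s.h pair in the second listing widens a 16-bit half-precision bit pattern to a single-precision float. The following is a minimal software sketch of that widening, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). The class and method names are hypothetical, and this is not the JDK's or the hardware's implementation; in particular, NaN payload handling is simplified.

```java
public class HalfToFloatSketch {
    // Hypothetical helper: widen an IEEE 754 binary16 bit pattern to a float.
    // Every binary16 value is exactly representable as a float, so no rounding occurs.
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits & 0x8000) << 16;   // move the sign bit to bit 31
        int exp  = (bits >>> 10) & 0x1F;    // 5-bit biased exponent (bias 15)
        int mant = bits & 0x3FF;            // 10-bit fraction

        if (exp == 0x1F) {
            // Infinity or NaN: all-ones float exponent, fraction shifted into place.
            // (Simplification: real hardware may canonicalize NaNs instead.)
            return Float.intBitsToFloat(sign | 0x7F800000 | (mant << 13));
        }
        if (exp == 0) {
            // Zero or subnormal: the value is mant * 2^-24, computed exactly in float.
            float magnitude = mant * 0x1.0p-24f;
            return sign != 0 ? -magnitude : magnitude;
        }
        // Normal number: rebias the exponent from 15 to 127 (add 112).
        return Float.intBitsToFloat(sign | ((exp + 112) << 23) | (mant << 13));
    }

    public static void main(String[] args) {
        System.out.println(halfToFloat((short) 0x3C00)); // bit pattern of 1.0 in binary16
        System.out.println(halfToFloat((short) 0x7BFF)); // largest finite binary16 value
        System.out.println(halfToFloat((short) 0x0001)); // smallest positive subnormal
    }
}
```

With Zfhmin enabled, C2 replaces this kind of bit manipulation with the two instructions shown in the listing.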
We don't have any hardware that supports Zfhmin yet, so we ran the following benchmarks in QEMU. Treat the numbers as a rough, unreliable reference only:
VM options: -XX:+UnlockExperimentalVMOptions -XX:-UseZfhmin

Benchmark                                         (size)   Mode  Samples      Score  Score error  Units
o.s.Fp16ConversionBenchmark.float16ToFloat          2048  thrpt       15     44.523        0.116  ops/ms
o.s.Fp16ConversionBenchmark.float16ToFloatMemory    2048  thrpt       15   8379.835       27.309  ops/ms
o.s.Fp16ConversionBenchmark.floatToFloat16          2048  thrpt       15      7.370        0.028  ops/ms
o.s.Fp16ConversionBenchmark.floatToFloat16Memory    2048  thrpt       15  11292.278       11.962  ops/ms

VM options: -XX:+UnlockExperimentalVMOptions -XX:+UseZfhmin

Benchmark                                         (size)   Mode  Samples      Score  Score error  Units
o.s.Fp16ConversionBenchmark.float16ToFloat          2048  thrpt       15     12.357        0.153  ops/ms
o.s.Fp16ConversionBenchmark.float16ToFloatMemory    2048  thrpt       15  10213.944       69.222  ops/ms
o.s.Fp16ConversionBenchmark.floatToFloat16          2048  thrpt       15     11.728        0.067  ops/ms
o.s.Fp16ConversionBenchmark.floatToFloat16Memory    2048  thrpt       15  15008.550       13.917  ops/ms
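To compare the two runs at a glance, the throughput ratio (+UseZfhmin score divided by -UseZfhmin score) can be computed per benchmark. The sketch below does only that arithmetic on the scores reported above; the class and method names are hypothetical, and since the runs were under QEMU the ratios say little about real hardware.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ZfhminRatioSketch {
    // Scores in ops/ms copied from the two runs: {-UseZfhmin, +UseZfhmin}.
    static final Map<String, double[]> SCORES = new LinkedHashMap<>();
    static {
        SCORES.put("float16ToFloat",       new double[] {44.523, 12.357});
        SCORES.put("float16ToFloatMemory", new double[] {8379.835, 10213.944});
        SCORES.put("floatToFloat16",       new double[] {7.370, 11.728});
        SCORES.put("floatToFloat16Memory", new double[] {11292.278, 15008.550});
    }

    // Ratio above 1.0 means higher throughput with +UseZfhmin.
    static double ratio(String benchmark) {
        double[] s = SCORES.get(benchmark);
        return s[1] / s[0];
    }

    public static void main(String[] args) {
        for (String name : SCORES.keySet()) {
            System.out.println(String.format(Locale.ROOT, "%-22s %.2fx", name, ratio(name)));
        }
    }
}
```

On the reported scores this gives roughly 0.28x for float16ToFloat, 1.22x for float16ToFloatMemory, 1.59x for floatToFloat16, and 1.33x for floatToFloat16Memory, so three of the four benchmarks show higher throughput with the intrinsics enabled in this QEMU run.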
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11828/head:pull/11828
$ git checkout pull/11828
Update a local copy of the PR:
$ git checkout pull/11828
$ git pull https://git.openjdk.org/jdk pull/11828/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 11828
View PR using the GUI difftool:
$ git pr show -t 11828
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11828.diff