CRT Pow function has bad performance on Windows #10798
Comments
@fiigii, do you have any metrics for what time is spent in |
@tannergooding Here is the VTune data of CRT |
The CPI of |
@fiigii, that would be a bit surprising. I'm looking at the implementation and it is some fairly heavily optimized FMA3 code (there is also an SSE2 code path, but you shouldn't hit that). Unfortunately, that implementation is closed source, so I can't share it here. |
I'm trying to collect a trace locally as well, to see if I get the same. |
Right, I saw it also from disasm. |
@fiigii, how was CoreCLR compiled for you? I'm testing on only a |
I changed the image size from 250x250 to 2480x2480 and rendered just one image (not using RenderLoop) to avoid the collection pool's impact on the profiling. |
So, to clarify, you changed just the following?
-private const int Width = 250;
-private const int Height = 250;
-private const int Iterations = 7;
+private const int Width = 2480;
+private const int Height = 2480;
+private const int Iterations = 1; |
Similar, but I did not use |
Looks like glibc recently (07 AUG 2017) made a few changes: https://sourceware.org/git/?p=glibc.git;a=commit;h=57a72fa3502673754d14707da02c7c44e83b8d20 Namely, they still use the Additionally, it looks like, since the calling conventions map up, they generally end up calling CC. @CarolEidt, @AndyAyersMS, @jkotas |
Does CoreCLR not JIT methods with the platform calling convention? |
@roterdam, it does. However, the backing implementations for most |
@tannergooding Were you able to reproduce this issue? Is this something we can do or do we need to loop in the VC++ team? |
@AaronRobinsonMSFT. Yes, I was able to reproduce this. I believe this is already tracked by one of the C++ bugs I logged internally, but I will double-check and log a new one if not. This is also part of a bigger picture with System.Math/MathF that is being tracked internally. I can share more details offline if necessary. |
@tannergooding Not necessary. My main goal was simply to set milestones for issues tagged with VM. If this is something that is post-3.0, then feel free to tag as needed. If 3.0, do we have a plan to deliver it? |
Tagging subscribers to this area: @tannergooding |
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation. |
This issue will now be closed since it had been marked |
While benchmarking the AoS/SoA ray tracer (dotnet/coreclr#18839), we found that the Vector3 benchmark (RayTracer) is much slower on Windows than on Linux.
According to VTune analysis, this gap is caused by the CRT math library: RayTracer calls Math.Pow at https://github.com/dotnet/coreclr/blob/master/tests/src/JIT/Performance/CodeQuality/SIMD/RayTracer/Raytracer.cs#L153
[VTune screenshots: Windows, Linux]
On the left side (AoS means RayTracer), we can see that ucrtbase.dll on Windows consumes much more time and retires many more instructions than libm-2.23.so on Linux.
The data was collected on Core i9 + VS2017, but Core i7 + VS2015 shows the same performance gap.