add support in archdetect for detecting A64FX #608

Conversation

boegel
Contributor

@boegel boegel commented Jun 13, 2024

No description provided.

@boegel boegel added the aarch64 and 2023.06-software.eessi.io labels Jun 13, 2024

This comment was marked as outdated.

This comment was marked as outdated.

@boegel boegel force-pushed the 2023.06-software.eessi.io_archdetect-a64fx branch from d426d99 to e3225a8 (June 13, 2024 19:04)
@boegel boegel force-pushed the 2023.06-software.eessi.io_archdetect-a64fx branch from 688d36e to cb1672b (June 13, 2024 19:10)
@boegel boegel added the a64fx label Jul 6, 2024
@boegel boegel marked this pull request as ready for review January 10, 2025 12:30
@boegel
Contributor Author

boegel commented Jan 10, 2025

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1
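
The bot replies to this command by echoing an expanded form in which the short keys `repo` and `arch` become `repository` and `architecture`. A minimal sketch of that expansion, assuming a simple `key:value` argument syntax; the function name `expand_bot_command` and the `SHORT_TO_LONG` table are hypothetical and not taken from the bot's source:

```python
# Illustrative only: mimics the "expanded format" echoed by the bot.
SHORT_TO_LONG = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(comment_line):
    """Expand 'bot: build repo:X arch:Y' into its long-key form."""
    prefix, _, rest = comment_line.partition(":")
    if prefix.strip() != "bot":
        raise ValueError("not a bot command")
    words = rest.split()
    if not words:
        raise ValueError("empty bot command")
    expanded = [words[0]]  # the action, e.g. 'build'
    for arg in words[1:]:
        key, sep, value = arg.partition(":")
        if not sep:
            raise ValueError(f"argument without ':' separator: {arg!r}")
        # Unknown keys are passed through unchanged.
        expanded.append(f"{SHORT_TO_LONG.get(key, key)}:{value}")
    return " ".join(expanded)


print(expand_bot_command(
    "bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1"
))
# prints: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
```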

eessi-bot bot commented Jan 10, 2025

Updates by the bot instance eessi-bot-mc-aws
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

eessi-bot bot commented Jan 10, 2025

Updates by the bot instance eessi-bot-mc-azure
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

    • no jobs were submitted

eessi-bot bot commented Jan 10, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.01/pr_608/39812

date | job status | comment
Jan 10 12:32:22 UTC 2025 | submitted | job id 39812 awaits release by job manager
Jan 10 12:32:52 UTC 2025 | released | job awaits launch by Slurm scheduler
Jan 10 12:38:03 UTC 2025 | running | job 39812 is running
Jan 10 12:47:25 UTC 2025 | finished |
😁 SUCCESS
Details
✅ job output file slurm-39812.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1736512762.tar.gz
size: 0 MiB (17750 bytes)
entries: 3
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/neoverse_n1/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/neoverse_n1
2023.06/init/arch_specs/eessi_arch_arm.spec
2023.06/init/easybuild/eb_hooks.py
2023.06/init/eessi_archdetect.sh
Jan 10 12:47:25 UTC 2025 | test result |
😁 SUCCESS
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:aarch64-neoverse-n1-node+default
P: perf: 663.951 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:aarch64-neoverse-n1-node+default
P: perf: 671.001 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 3.77 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 3.61 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 5.67 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 5.48 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 0.46 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:aarch64-neoverse-n1-node+default
P: latency: 0.43 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:aarch64-neoverse-n1-node+default
P: bandwidth: 19692.91 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:aarch64-neoverse-n1-node+default
P: bandwidth: 19603.35 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-39812.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Jan 10 13:38:51 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-aarch64-neoverse_n1-1736512762.tar.gz to S3 bucket succeeded
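
The build checks listed under Details in the report above amount to one set of strings that must be absent from the job output and another that must be present. A rough sketch of such a check, assuming plain substring matching (the bot's actual matching logic is not shown in this thread); `check_build_output` and the two tuple names are hypothetical:

```python
# Patterns copied from the bot report; the checking logic is an assumption.
MUST_BE_ABSENT = ("FATAL:", "ERROR:", "FAILED:", "required modules missing:")
MUST_BE_PRESENT = ("No missing installations", ".tar.gz created!")


def check_build_output(text):
    """Return a list of problems; an empty list means all checks passed."""
    problems = [f"found message matching {p}" for p in MUST_BE_ABSENT if p in text]
    problems += [f"no message matching {p}" for p in MUST_BE_PRESENT if p not in text]
    return problems


# Hand-written stand-in for a job output file, not a real Slurm log.
sample = "No missing installations\n/tmp/example.tar.gz created!\n"
print(check_build_output(sample))  # prints: []
```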

@boegel boegel changed the title from "add support for detecting A64FX to archdetect" to "add support in archdetect for detecting A64FX" Jan 10, 2025
"aarch64/neoverse_v1" "ARM" "asimddp svei8mm"
"aarch64/neoverse_v1" "" "asimddp svei8mm" # AWS Graviton3
"aarch64/neoverse_v1" "0x41" "asimddp svei8mm" # AWS Graviton3
Member

What happens when there are multiple vendors implementing the instruction set, like with Graviton/Grace for "aarch64/neoverse_v2"?

Member

I guess this is only necessary when there is a doubt about what the match is

Contributor Author

https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html only lists Graviton 3 as CPU supporting Neoverse V1, so that makes it very unlikely in practice we'll run into something that's detected as Neoverse V1 but isn't a Graviton 3 (at least today).

For neoverse_v2 with Graviton 4 vs Grace, there's a more complex situation though, because Graviton 4 supports CPU instructions like paca, pacg, rng which Grace doesn't, and the other way around (like sm3, sm4, svesm4).
It gets even more interesting when Google Axion is taken into account, since that doesn't support ssbs, which the other two do...
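
To make the mismatch concrete, here is a small sketch that computes which flags every CPU in a group provides and which are exclusive to some of them. The flag strings are deliberately partial: they contain only the differing flags named above plus `asimddp` as an assumed shared flag, not full `/proc/cpuinfo` feature lists, and Axion is left out because its full flag set is not given here. The function names are hypothetical:

```python
def safe_baseline(flags_by_cpu):
    """Return the set of flags present on every CPU in the mapping."""
    sets = [set(flags.split()) for flags in flags_by_cpu.values()]
    return set.intersection(*sets) if sets else set()


def exclusive_flags(flags_by_cpu):
    """Map each CPU to the flags it has that are not shared by all CPUs."""
    baseline = safe_baseline(flags_by_cpu)
    return {cpu: set(flags.split()) - baseline
            for cpu, flags in flags_by_cpu.items()}


# Partial, illustrative flag lists (see the note above).
example = {
    "Graviton 4": "asimddp paca pacg rng",
    "Grace": "asimddp sm3 sm4 svesm4",
}

print(sorted(safe_baseline(example)))
for cpu, extra in exclusive_flags(example).items():
    print(cpu, sorted(extra))
```

A binary built for one of these CPUs is only portable to the others if it sticks to the baseline set, which is why a single neoverse_v2 target is harder to define than neoverse_v1.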

Member

Never mind, it is right there in the file; it just means we need to add each supported CPU explicitly. That seems like a bit of a pity: doesn't it make the archdetect selection very conservative when it encounters a CPU it hasn't seen before? It won't matter whether that CPU has all the required features.
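
The concern can be illustrated with a sketch of vendor-plus-flags matching against a spec line like the one under review. This is not the logic of `eessi_archdetect.sh`; it assumes a rule matches when its vendor field is empty or equal to the CPU's vendor and all of its flags are present, and that an unmatched CPU falls back to a generic target. The names `parse_spec` and `detect`, the fallback value, and the vendor ID `0x99` are hypothetical:

```python
import shlex

SPEC = '''
"aarch64/neoverse_v1" "0x41" "asimddp svei8mm" # AWS Graviton3
'''


def parse_spec(text):
    """Parse lines of the form: "arch" "vendor" "flag flag ..." # comment"""
    rules = []
    for line in text.splitlines():
        fields = shlex.split(line, comments=True)  # drops the trailing comment
        if len(fields) == 3:
            arch, vendor, flags = fields
            rules.append((arch, vendor, set(flags.split())))
    return rules


def detect(rules, cpu_vendor, cpu_flags, fallback="aarch64/generic"):
    """Return the first rule whose vendor and required flags match the CPU."""
    have = set(cpu_flags.split())
    for arch, vendor, needed in rules:
        if vendor in ("", cpu_vendor) and needed <= have:
            return arch
    return fallback


rules = parse_spec(SPEC)
flags = "fp asimd asimddp sve svei8mm"
print(detect(rules, "0x41", flags))  # prints: aarch64/neoverse_v1
print(detect(rules, "0x99", flags))  # prints: aarch64/generic
```

The second call shows the conservative behaviour: the CPU has every required flag, but because its vendor is not listed it ends up on the fallback target.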

Member

> For neoverse_v2 with Graviton 4 vs Grace, there's a more complex situation though, because Graviton 4 supports CPU instructions like paca, pacg, rng which Grace doesn't, and the other way around (like sm3, sm4, svesm4). It gets even more interesting when Google Axion is taken into account, since that doesn't support ssbs, which the other two do...

Isn't that quite a big problem? Can we disable some instructions? I guess compilers would be caught by surprise...

Member

This data is probably pretty useful for us: https://github.com/hrw/arm-socs-table/tree/main/cpuinfo-data

Member

I wonder if we shouldn't have a much more considered set of CPU features that reflect the CPU where the software was built for the target and then match that? This would do a much better job of future matching CPUs. Is it really the case that CPU features alone are not enough to distinguish A64FX from Neoverse_N1?

Not going to hold this PR back for that discussion, will open an issue.

Contributor Author

I'm also wondering if we can instruct the compiler to not emit particular instructions even if they're supported.
My gut says that should be possible, but I'm not sure it actually is...

Let's follow up on that in a dedicated issue though => #845

Contributor Author

For A64FX vs Neoverse N1/V1, see the /proc/cpuinfo dumps we have under tests/archdetect/.
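
The vendor value `0x41` in the spec line under review appears to correspond to the `CPU implementer` field that `/proc/cpuinfo` reports on aarch64 (0x41 is Arm, 0x46 is Fujitsu). A minimal sketch of extracting that field and the feature flags; the sample text is hand-written and heavily abbreviated, not copied from the dumps under tests/archdetect/, and `parse_cpuinfo` is a hypothetical helper:

```python
IMPLEMENTERS = {"0x41": "Arm", "0x46": "Fujitsu"}

# Abbreviated, hand-written example in the style of an aarch64 /proc/cpuinfo.
SAMPLE = """processor\t: 0
Features\t: fp asimd atomics sve
CPU implementer\t: 0x46
CPU part\t: 0x001
"""


def parse_cpuinfo(text):
    """Return (implementer, set of feature flags) for the first CPU listed."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # setdefault keeps the first occurrence, i.e. the first processor.
            info.setdefault(key.strip(), value.strip())
    return info.get("CPU implementer", ""), set(info.get("Features", "").split())


implementer, features = parse_cpuinfo(SAMPLE)
print(IMPLEMENTERS.get(implementer, "unknown"), "sve" in features)
# prints: Fujitsu True
```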

@boegel
Contributor Author

boegel commented Jan 10, 2025

eb_hooks.py is in the generated tarball, but untouched in this PR...
Did we overlook a deploy in a previous already merged PR?

@ocaisa
Member

ocaisa commented Jan 10, 2025

Sure did, #841

@bedroge
Collaborator

bedroge commented Jan 10, 2025

> eb_hooks.py is in the generated tarball, but untouched in this PR... Did we overlook a deploy in a previous already merged PR?

Yeah, I also just noticed that. I guess it's because of #841 (comment).

@bedroge bedroge added the bot:deploy label Jan 10, 2025
@bedroge
Collaborator

bedroge commented Jan 10, 2025

The tarball has been approved and ingested.

@bedroge bedroge merged commit 3af9cf6 into EESSI:2023.06-software.eessi.io Jan 10, 2025
51 checks passed

eessi-bot bot commented Jan 10, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.01/pr_608/39812'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.10

eessi-bot bot commented Jan 10, 2025

PR merged! Moved [] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.10

@boegel boegel deleted the 2023.06-software.eessi.io_archdetect-a64fx branch January 12, 2025 16:30
Labels
2023.06-software.eessi.io, a64fx, aarch64, bot:deploy
4 participants