Let 2023.06 modulefile redirect RISC-V clients to riscv.eessi.io, add symlink for 20240402 version #840

Merged
trz42 merged 3 commits into EESSI:2023.06-software.eessi.io from modulefile_riscv
Jan 14, 2025

Collaborator

@bedroge bedroge commented Dec 10, 2024

This allows RISC-V clients (in particular, the build bot) to use the same modulefile that we use for software.eessi.io. Adding a symlink 20240402.lua -> 2023.06.lua lets us reuse the exact same modulefile.

Note that I haven't added the symlink to install_scripts.sh, so it won't be deployed to any repo: it would be confusing to have it in software.eessi.io, and we don't actually need it in riscv.eessi.io either, as long as it's available in the git repo (the bot sources the file from the git repo).

The only slightly confusing thing is that the RISC-V repo will deploy the 2023.06.lua modulefile, so users of that repo will only see EESSI/2023.06. I'm not sure how to prevent that, other than by adding a bunch of specific if statements to install_scripts.sh.

Instead of this approach we could also modify the EESSI-install-software.sh script and do something like this:

if [[ "${EESSI_CVMFS_REPO}" == "/cvmfs/riscv.eessi.io" ]]; then
    # The RISC-V repository has no modulefile, so fall back to the bash init script
    source "${TOPDIR}/init/bash"
else
    # All other repositories provide the EESSI modulefile
    module use "${TOPDIR}/init/modules"
    module load "EESSI/${EESSI_VERSION}"
fi

But I liked the solution with the modulefile a bit more.
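
A minimal sketch of the symlink approach, assuming the modulefiles live under init/modules/EESSI (the path that appears in the deployed tarball listing). The temporary directory and the placeholder file content are purely illustrative and not part of this PR:

```shell
#!/bin/sh
# Sketch only: reproduces the symlink layout in a throwaway directory.
# The workdir and the placeholder modulefile are illustrative assumptions.
set -eu

workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT   # clean up the throwaway directory on exit

moddir="${workdir}/init/modules/EESSI"
mkdir -p "${moddir}"

# Placeholder standing in for the real 2023.06.lua modulefile.
echo '-- placeholder for the real EESSI modulefile' > "${moddir}/2023.06.lua"

# Relative symlink, so it stays valid wherever the git checkout lives.
ln -s 2023.06.lua "${moddir}/20240402.lua"

# Show where the new name points.
readlink "${moddir}/20240402.lua"
```

Because the link target is relative, both names resolve to the same file regardless of where the repository is cloned.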

@riscv-eessi-io-bot

Instance eessi-bot-riscv is configured to build for:

  • architectures: riscv64/generic
  • repositories: riscv.eessi.io-20240402


eessi-bot bot commented Dec 10, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software


eessi-bot bot commented Dec 10, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi.io-2023.06-software

@bedroge bedroge added the enhancement (New feature or request) and riscv labels Dec 10, 2024
@bedroge
Collaborator Author

bedroge commented Dec 10, 2024

Successful build with the RISC-V bot using this approach in #803 (comment).

That build still had a hardcoded eessi_version = "20240402"; the bot is now running another build with the more flexible eessi_version = os.getenv("EESSI_VERSION_OVERRIDE") or "20240402":
#803 (comment)
(From the logs I can already see that the build was successful; it's now running the test step.)
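
For readers more familiar with shell than Lua, the fallback in that expression can be sketched as follows. The helper name resolve_eessi_version is hypothetical; only the variable name EESSI_VERSION_OVERRIDE and the default 20240402 come from the comment above. The `-` form of parameter expansion (without a colon) is used because it falls back only when the variable is unset, which mirrors `os.getenv(...) or ...` in Lua, where an empty string is not treated as false:

```shell
#!/bin/sh
# Hypothetical helper: prints the EESSI version the modulefile would pick.
resolve_eessi_version() {
    # "${VAR-default}" (no colon) uses the default only when VAR is unset.
    printf '%s\n' "${EESSI_VERSION_OVERRIDE-20240402}"
}

unset EESSI_VERSION_OVERRIDE
resolve_eessi_version            # prints 20240402

EESSI_VERSION_OVERRIDE=2023.06
resolve_eessi_version            # prints 2023.06
unset EESSI_VERSION_OVERRIDE
```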

Collaborator

@trz42 trz42 left a comment

looks good!

@bedroge
Collaborator Author

bedroge commented Jan 14, 2025

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic


eessi-bot bot commented Jan 14, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from bedroge

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted
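
The short-to-expanded rewriting shown in this status update can be illustrated with a small sketch. The bot's actual parser is not shown in this PR; the function name expand_bot_command is hypothetical, and it handles only the two short forms visible above (repo: and arch:):

```shell
#!/bin/sh
# Hypothetical sketch: expand the short argument names of a bot command.
expand_bot_command() {
    out=""
    for word in "$@"; do
        case "${word}" in
            repo:*) word="repository:${word#repo:}" ;;
            arch:*) word="architecture:${word#arch:}" ;;
        esac
        # Join words with single spaces, without a leading space.
        out="${out:+${out} }${word}"
    done
    printf '%s\n' "${out}"
}

expand_bot_command build repo:eessi.io-2023.06-software arch:x86_64/generic
# prints: build repository:eessi.io-2023.06-software architecture:x86_64/generic
```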


eessi-bot bot commented Jan 14, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)


eessi-bot bot commented Jan 14, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.01/pr_840/40398

date | job status | comment
Jan 14 11:08:41 UTC 2025 | submitted | job id 40398 awaits release by job manager
Jan 14 11:08:54 UTC 2025 | released | job awaits launch by Slurm scheduler
Jan 14 11:14:56 UTC 2025 | running | job 40398 is running
Jan 14 11:22:04 UTC 2025 | finished | 😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-40398.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1736853348.tar.gz
size: 0 MiB (3034 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
2023.06/init/modules/EESSI/2023.06.lua
Jan 14 11:22:04 UTC 2025 | test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:x86-64-generic-node+default
P: perf: 454.372 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:x86-64-generic-node+default
P: perf: 468.161 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-generic-node+default
P: latency: 5.23 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-generic-node+default
P: latency: 5.06 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-generic-node+default
P: latency: 8.38 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-generic-node+default
P: latency: 7.76 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.74 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-generic-node+default
P: latency: 0.67 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10179.84 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-generic-node+default
P: bandwidth: 10262.59 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-40398.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Jan 14 11:26:45 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-generic-1736853348.tar.gz to S3 bucket succeeded
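
The build checks listed under "Details" above (no FATAL:, ERROR:, FAILED: or "required modules missing:" messages, plus the two expected messages) could be approximated with grep as sketched below. This is an assumption about how such checks might be done, not the bot's actual implementation; the function name check_job_output is hypothetical and patterns are matched as fixed strings:

```shell
#!/bin/sh
# Hypothetical sketch: scan a job output file for the messages listed above.
check_job_output() {
    log="$1"
    # None of these fixed strings may appear in the job output.
    for pattern in 'FATAL:' 'ERROR:' 'FAILED:' 'required modules missing:'; do
        if grep -qF -e "${pattern}" "${log}"; then
            echo "found unexpected message matching ${pattern}"
            return 1
        fi
    done
    # Each of these fixed strings must appear at least once.
    for pattern in 'No missing installations' '.tar.gz created!'; do
        if ! grep -qF -e "${pattern}" "${log}"; then
            echo "no message matching ${pattern}"
            return 1
        fi
    done
    echo "SUCCESS"
}
```

It would be called with the path to a job output file, for example `check_job_output slurm-40398.out`.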

@trz42 trz42 added the bot:deploy (Ask bot to deploy missing software installations to EESSI) label Jan 14, 2025
@trz42
Collaborator

trz42 commented Jan 14, 2025

Tarball got ingested. New module file is available via CernVM-FS.

@trz42 trz42 merged commit 641bf47 into EESSI:2023.06-software.eessi.io Jan 14, 2025
50 checks passed

eessi-bot bot commented Jan 14, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.01/pr_840/40398'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.14


eessi-bot bot commented Jan 14, 2025

PR merged! Moved [] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.01.14

@gpu-bot-ugent

gpu-bot-ugent bot commented Jan 14, 2025

PR merged! Moved [] to /scratch/gent/vo/002/gvo00211/SHARED/trash_bin/EESSI/software-layer/2025.01.14

@bedroge bedroge deleted the modulefile_riscv branch January 14, 2025 12:32
Labels
bot:deploy (Ask bot to deploy missing software installations to EESSI), enhancement (New feature or request), riscv