ARROW-16901: [R][CI] Prune R nightly builds (apache#13453)
This PR adds pruning to the nightly R upload; 14 versions are kept by default.

I have removed the `burnett01/rsync-deployments` action because using Docker for this was unnecessary and the action can only upload to a remote. The new manual version also uses host key checking, for which I created `secrets.NIGHTLIES_RSYNC_HOST_KEY`. The secret should contain the result of `ssh-keyscan -H nightlies.apache.org 2> /dev/null` and needs to be added to apache/arrow before this can run. This way we no longer depend on the action and its associated Dockerfile (`drinternet/rsync`). We might want to refactor this into a local action for use with all nightly upload jobs.

The pruning is not very efficient, as we download the whole nightly repository (on cache miss). This could be avoided for the libarrow files, which could possibly be deleted via ssh instead, but we need to download all R packages because `tools::write_PACKAGES` needs access to each archive.

Authored-by: Jacob Wujciak-Jens <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
1 parent 6c4261e, commit 804c08c
Showing 4 changed files with 285 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
<!--- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
---> | ||
|
||
# Sync Nightlies | ||
This action can be used to sync directories from/to [nightlies.apache.org] with | ||
rsync. It requires the correct secrets to be in place as described | ||
[below](#usage). | ||
Currently this action is intended to sync the *contents* of `local_path` to | ||
`remote_path` (or vice versa), so a slash will be appended to the source path. | ||
Uploading single files or dirs is not possible directly but only by wrapping | ||
them in an additional directory. | ||
|
||
## Inputs | ||
- `upload` Set to `true` to upload from `local_path` to `remote_path` | ||
- `switches` See rsync --help for available switches. | ||
- `local_path` The relative local path within $GITHUB_WORKSPACE | ||
- `remote_path` The remote path incl. sub dirs e.g. {{secrets.path}}/arrow/r. | ||
- `remote_host` The remote host. | ||
- `remote_port` The remote port. | ||
- `remote_user` The remote user. | ||
- `remote_key` The remote ssh key. | ||
- `remote_host_key` The host key for StrictHostKeyChecking. | ||
|
||
## Usage | ||
The secrets have to be set by INFRA, except `secrets.NIGHTLIES_RSYNC_HOST_KEY` | ||
which should contain the result of `ssh-keyscan -H nightlies.apache.org 2> | ||
/dev/null`. This example requires apache/arrow to be checked out in `arrow`. | ||
|
||
```yaml | ||
- name: Sync from Remote | ||
uses: ./arrow/.github/actions/sync-nightlies | ||
with: | ||
switches: -avzh --update --delete --progress | ||
local_path: repo | ||
remote_path: ${{ secrets.NIGHTLIES_RSYNC_PATH }}/arrow/r | ||
remote_host: ${{ secrets.NIGHTLIES_RSYNC_HOST }} | ||
remote_port: ${{ secrets.NIGHTLIES_RSYNC_PORT }} | ||
remote_user: ${{ secrets.NIGHTLIES_RSYNC_USER }} | ||
remote_key: ${{ secrets.NIGHTLIES_RSYNC_KEY }} | ||
remote_host_key: ${{ secrets.NIGHTLIES_RSYNC_HOST_KEY }} | ||
|
||
- name: Sync to Remote | ||
uses: ./arrow/.github/actions/sync-nightlies | ||
with: | ||
upload: true | ||
switches: -avzh --update --delete --progress | ||
local_path: repo | ||
remote_path: ${{ secrets.NIGHTLIES_RSYNC_PATH }}/arrow/r | ||
remote_host: ${{ secrets.NIGHTLIES_RSYNC_HOST }} | ||
remote_port: ${{ secrets.NIGHTLIES_RSYNC_PORT }} | ||
remote_user: ${{ secrets.NIGHTLIES_RSYNC_USER }} | ||
remote_key: ${{ secrets.NIGHTLIES_RSYNC_KEY }} | ||
remote_host_key: ${{ secrets.NIGHTLIES_RSYNC_HOST_KEY }} | ||
``` |
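The trailing-slash rule described in the README can be sketched in plain shell. This is only an illustration, not the action's code: `sync_endpoints` is a hypothetical helper, and the host and path in the calls are made-up placeholders. It prints the source and destination that would be handed to rsync for each direction.

```bash
# Hypothetical helper: print "SOURCE DEST" for one sync direction.
# The slash is always appended to the source side, so rsync copies the
# *contents* of the source directory instead of nesting the directory itself.
sync_endpoints() {
  local upload="$1" local_path="$2" remote="$3"
  if [ "$upload" = true ]; then
    printf '%s/ %s\n' "$local_path" "$remote"
  else
    printf '%s/ %s\n' "$remote" "$local_path"
  fi
}

# Placeholder remote; the real values come from the NIGHTLIES_RSYNC_* secrets.
sync_endpoints true repo "user@nightlies.example.org:/hypothetical/arrow/r"
sync_endpoints false repo "user@nightlies.example.org:/hypothetical/arrow/r"
```

The first call prints `repo/ user@nightlies.example.org:/hypothetical/arrow/r` and the second prints `user@nightlies.example.org:/hypothetical/arrow/r/ repo`, which is why a single file cannot be synced without wrapping it in a directory.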
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
name: 'Sync Nightlies' | ||
description: 'Sync files to and from nightlies.apache.org' | ||
inputs: | ||
upload: | ||
description: 'Sync from local to remote' | ||
default: false | ||
required: false | ||
switches: | ||
description: 'see rsync --help' | ||
required: true | ||
local_path: | ||
description: 'The relative local path within $GITHUB_WORKSPACE' | ||
required: true | ||
remote_path: | ||
description: 'The remote path incl. sub dirs e.g. {{secrets.path}}/arrow/r' | ||
required: true | ||
remote_host: | ||
description: 'The remote host' | ||
required: true | ||
remote_port: | ||
description: 'The remote port' | ||
required: false | ||
default: 22 | ||
remote_user: | ||
description: 'The remote user' | ||
required: true | ||
remote_key: | ||
description: 'The remote key' | ||
required: true | ||
remote_host_key: | ||
description: 'The host key for StrictHostKeyChecking' | ||
|
||
required: true | ||
|
||
runs: | ||
using: "composite" | ||
steps: | ||
- name: Sync files | ||
shell: bash | ||
env: | ||
SWITCHES: "${{ inputs.switches }}" | ||
LOCAL_PATH: "${{ github.workspace }}/${{ inputs.local_path }}" | ||
|
||
SSH_KEY: "${{ inputs.remote_key }}" | ||
PORT: "${{ inputs.remote_port }}" | ||
USER: "${{ inputs.remote_user }}" | ||
HOST: "${{ inputs.remote_host }}" | ||
HOST_KEY: "${{ inputs.remote_host_key }}" | ||
REMOTE_PATH: "${{ inputs.remote_path }}" | ||
run: | | ||
# Make SSH key available and add remote to known hosts | ||
eval "$(ssh-agent)" > /dev/null | ||
echo "$SSH_KEY" | tr -d '\r' | ssh-add - >/dev/null | ||
mkdir -p .ssh | ||
chmod go-rwx .ssh | ||
echo "$HOST_KEY" >> .ssh/known_hosts | ||
# strict errors | ||
set -eu | ||
# We have to use a custom RSH to supply the port | ||
RSH="ssh -o UserKnownHostsFile=.ssh/known_hosts -p $PORT" | ||
DSN="$USER@$HOST" | ||
# It is important to append '/' to the source path otherwise | ||
# the entire source dir will be created as a sub dir in the destination | ||
if [ "${{ inputs.upload }}" = true ] | ||
then | ||
SOURCE=$LOCAL_PATH/ | ||
DEST=$DSN:$REMOTE_PATH | ||
else | ||
SOURCE=$DSN:$REMOTE_PATH/ | ||
DEST=$LOCAL_PATH | ||
fi | ||
rsync $SWITCHES --rsh="$RSH" $SOURCE $DEST |
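The final `rsync` call in the action passes `$SWITCHES`, `$SOURCE` and `$DEST` unquoted. Word splitting is needed for the switches, but it would also split a path that contains spaces. A minimal sketch of one way to split only the switches is shown below; `rsync_args` is a hypothetical helper that prints the resulting argument vector one argument per line rather than running rsync, and the values in the call are placeholders.

```bash
# Hypothetical helper: build the argument list for rsync and print it,
# one argument per line, so the effect of quoting is visible.
rsync_args() {
  local switches="$1" rsh="$2" source="$3" dest="$4"
  # Intentionally unquoted: the switches string is split into separate words.
  # (This assumes the switches contain no glob characters.)
  set -- $switches
  # Everything else is quoted, so spaces in the remote shell command
  # or in a path do not create extra arguments.
  set -- "$@" "--rsh=$rsh" "$source" "$dest"
  printf '%s\n' "$@"
}

rsync_args "-avzh --update --delete" \
  "ssh -o UserKnownHostsFile=.ssh/known_hosts -p 22" \
  "my repo/" "user@nightlies.example.org:/hypothetical/arrow/r"
```

The call above prints six lines: three switches, one `--rsh=...` argument, and the source and destination, with `my repo/` kept as a single argument.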
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,17 +17,25 @@ | |
|
||
name: Upload R Nightly builds | ||
# This workflow downloads the (nightly) binaries created in crossbow and uploads them | ||
# to nightlies.apache.org. Due to authorization requirements, this upload can't be done | ||
# from the crossbow repository. | ||
|
||
# This removes all permissions from the token | ||
permissions: | ||
contents: none | ||
|
||
on: | ||
workflow_dispatch: | ||
inputs: | ||
prefix: | ||
description: Job prefix to use. | ||
required: false | ||
default: '' | ||
keep: | ||
description: Number of versions to keep. | ||
required: false | ||
default: 14 | ||
|
||
schedule: | ||
#Crossbow packaging runs at 0 8 * * * | ||
- cron: '0 14 * * *' | ||
|
@@ -78,10 +86,28 @@ jobs: | |
echo "No files found. Stopping upload." | ||
exit 1 | ||
fi | ||
- name: Cache Repo | ||
uses: actions/cache@v3 | ||
with: | ||
path: repo | ||
key: r-nightly-${{ github.run_id }} | ||
restore-keys: r-nightly- | ||
- name: Sync from Remote | ||
uses: ./arrow/.github/actions/sync-nightlies | ||
with: | ||
switches: -avzh --update --delete --progress | ||
local_path: repo | ||
remote_path: ${{ secrets.NIGHTLIES_RSYNC_PATH }}/arrow/r | ||
remote_host: ${{ secrets.NIGHTLIES_RSYNC_HOST }} | ||
remote_port: ${{ secrets.NIGHTLIES_RSYNC_PORT }} | ||
remote_user: ${{ secrets.NIGHTLIES_RSYNC_USER }} | ||
remote_key: ${{ secrets.NIGHTLIES_RSYNC_KEY }} | ||
remote_host_key: ${{ secrets.NIGHTLIES_RSYNC_HOST_KEY }} | ||
- run: tree repo | ||
- name: Build Repository | ||
shell: Rscript {0} | ||
run: | | ||
# folder that we rsync to nightlies.apache.org | ||
# folder that we sync to nightlies.apache.org | ||
repo_root <- "repo" | ||
# The binaries are in a nested dir | ||
# so we need to find the correct path. | ||
|
@@ -101,18 +127,37 @@ jobs: | |
# strip superfluous nested dirs | ||
new_paths <- sub(art_path, ".", new_paths) | ||
dirs <- dirname(new_paths) | ||
dir_result <- sapply(dirs, dir.create, recursive = TRUE) | ||
if (!all(dir_result)) { | ||
stop("There was an issue while creating the folders!") | ||
} | ||
sapply(dirs, dir.create, recursive = TRUE, showWarnings = FALSE) | ||
copy_result <- file.copy(current_path, new_paths) | ||
# overwrite allows us to "force push" a new version with the same name | ||
copy_result <- file.copy(current_path, new_paths, overwrite = TRUE) | ||
if (!all(copy_result)) { | ||
stop("There was an issue while copying the files!") | ||
} | ||
- name: Prune Repository | ||
shell: bash | ||
env: | ||
KEEP: ${{ github.event.inputs.keep || 14 }} | ||
run: | | ||
prune() { | ||
# list files | retain $KEEP newest files | delete everything else | ||
ls -t $1/arrow* | tail -n +$((KEEP + 1)) | xargs --no-run-if-empty rm | ||
} | ||
# find leaf sub dirs | ||
repo_dirs=$(find repo -type d -links 2) | ||
# We want to retain $KEEP (default 14) versions of each pkg/lib, so we call | ||
# prune on each leaf dir and not on repo/. | ||
for dir in ${repo_dirs[@]}; do | ||
prune $dir | ||
done | ||
- name: Update Repository Index | ||
shell: Rscript {0} | ||
run: | | ||
# folder that we sync to nightlies.apache.org | ||
repo_root <- "repo" | ||
tools::write_PACKAGES(file.path(repo_root, "src/contrib"), type = "source", verbose = TRUE) | ||
repo_dirs <- list.dirs(repo_root) | ||
|
@@ -125,14 +170,16 @@ jobs: | |
tools::write_PACKAGES(dir, type = ifelse(on_win, "win.binary", "mac.binary"), verbose = TRUE ) | ||
} | ||
- name: Show repo contents | ||
run: ls -R repo | ||
- name: Upload Files | ||
uses: burnett01/[email protected] | ||
run: tree repo | ||
- name: Sync to Remote | ||
uses: ./arrow/.github/actions/sync-nightlies | ||
with: | ||
switches: -avzr | ||
path: repo/* | ||
upload: true | ||
switches: -avzh --update --delete --progress | ||
local_path: repo | ||
remote_path: ${{ secrets.NIGHTLIES_RSYNC_PATH }}/arrow/r | ||
remote_host: ${{ secrets.NIGHTLIES_RSYNC_HOST }} | ||
remote_port: ${{ secrets.NIGHTLIES_RSYNC_PORT }} | ||
remote_user: ${{ secrets.NIGHTLIES_RSYNC_USER }} | ||
remote_key: ${{ secrets.NIGHTLIES_RSYNC_KEY }} | ||
remote_host_key: ${{ secrets.NIGHTLIES_RSYNC_HOST_KEY }} |
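The pruning step keeps the newest `$KEEP` files per directory by sorting on modification time. The sketch below reproduces that idea on a throwaway directory so the behaviour can be checked locally. `prune_dir` is a hypothetical name and the file names are invented; like the workflow, it assumes GNU `xargs` and file names without whitespace.

```bash
# Hypothetical helper: keep the "$2" newest files matching "$1"/arrow*
# and delete the rest.
prune_dir() {
  local dir="$1" keep="$2"
  # ls -t lists newest first; tail -n +N starts printing at line N,
  # so everything after the first $keep entries is passed to rm.
  ls -t "$dir"/arrow* 2>/dev/null | tail -n +"$((keep + 1))" | xargs -r rm --
}

# Demo on a temporary directory with five files dated 1 to 5 June 2022.
demo=$(mktemp -d)
for i in 1 2 3 4 5; do
  touch -t "2022060${i}0000" "$demo/arrow_$i.tar.gz"
done
prune_dir "$demo" 2
ls "$demo"
```

After pruning with a limit of 2, only `arrow_4.tar.gz` and `arrow_5.tar.gz` remain. Because the selection depends on modification time rather than on version numbers, it only matches "newest versions" as long as the sync preserves timestamps, which the `-a` switch used in the workflow requests.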