qa/tasks/ceph: resurrect log compression
Commit 9536625 ("qa/tasks/ceph: use Cluster.sh() and Remote.sh()
when appropriate") dropped run.wait(), which waits for all given
processes to exit.  This resulted in errors like

  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./objectstore_tool..log: file changed as we read it
  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./ceph-client.admin.175125.log: File removed before we read it

as the task moved on to archiving semi-corrupted and uncompressed logs,
filling up the lab cluster.

Revert that hunk, as Cluster.sh() is useless here -- we don't need
stdout or stderr, but very much need parallel execution and need to
wait for the compression to finish.

Signed-off-by: Ilya Dryomov <[email protected]>
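
The pattern the message describes (start a command on every node without blocking, then wait for all of them before moving on) can be sketched with the standard library alone. This is an illustration under assumptions, not teuthology's implementation: `start_all`, `wait_all`, and `sleeper` are hypothetical names, and local `subprocess.Popen` children stand in for remote processes.

```python
import subprocess
import sys


def start_all(commands):
    """Start every command without blocking (analogous to wait=False)."""
    return [subprocess.Popen(cmd) for cmd in commands]


def wait_all(procs):
    """Block until every process has exited; raise if any of them failed."""
    failed = []
    for proc in procs:
        returncode = proc.wait()
        if returncode != 0:
            failed.append((proc.args, returncode))
    if failed:
        raise RuntimeError(f"commands failed: {failed}")


def sleeper(seconds):
    """Hypothetical stand-in for a slow job such as compressing logs."""
    return [sys.executable, "-c", f"import time; time.sleep({seconds})"]


if __name__ == "__main__":
    procs = start_all([sleeper(0.3), sleeper(0.3)])
    # Without this call the caller would move on while the jobs still run,
    # which is the situation that produced the tar errors quoted above.
    wait_all(procs)
    print("all jobs finished")
```

Skipping the `wait_all()` step is the equivalent of the regression: the next stage starts reading files that are still being rewritten.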
idryomov committed Nov 23, 2020
1 parent c569a30 commit 0e4bc27
Showing 1 changed file with 21 additions and 4 deletions.
25 changes: 21 additions & 4 deletions qa/tasks/ceph.py
@@ -255,10 +255,27 @@ def write_rotate_conf(ctx, daemons):
                 not (ctx.config.get('archive-on-error') and ctx.summary['success']):
             # and logs
             log.info('Compressing logs...')
-            ctx.cluster.sh(
-                'sudo find /var/log/ceph -name *.log -print0 | '
-                'sudo xargs -0 --no-run-if-empty -- gzip --',
-                wait=False)
+            run.wait(
+                ctx.cluster.run(
+                    args=[
+                        'sudo',
+                        'find',
+                        '/var/log/ceph',
+                        '-name',
+                        '*.log',
+                        '-print0',
+                        run.Raw('|'),
+                        'sudo',
+                        'xargs',
+                        '-0',
+                        '--no-run-if-empty',
+                        '--',
+                        'gzip',
+                        '--',
+                    ],
+                    wait=False,
+                ),
+            )
 
             log.info('Archiving logs...')
             path = os.path.join(ctx.archive, 'remote')
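
The restored call passes the command as a list of arguments, with `run.Raw('|')` marking the pipe. A rough sketch of how such a list could be turned into one shell command line follows. The `Raw` class and `to_shell()` helper here are hypothetical stand-ins written for illustration, on the assumption that non-raw arguments are shell-quoted and raw ones are passed through verbatim. They are not teuthology's actual code.

```python
import shlex


class Raw:
    """Hypothetical marker: the wrapped token is emitted without quoting."""

    def __init__(self, value):
        self.value = value


def to_shell(args):
    """Join args into a single command line, quoting everything except Raw."""
    return " ".join(
        arg.value if isinstance(arg, Raw) else shlex.quote(arg) for arg in args
    )


# Same argument list as in the hunk above, using the stand-in Raw class.
COMPRESS_ARGS = [
    "sudo", "find", "/var/log/ceph", "-name", "*.log", "-print0",
    Raw("|"),
    "sudo", "xargs", "-0", "--no-run-if-empty", "--", "gzip", "--",
]

if __name__ == "__main__":
    print(to_shell(COMPRESS_ARGS))
```

Under that assumption the `*.log` pattern reaches `find` quoted, so the shell does not expand it, while the pipe stays a real pipe.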