
Slow performance, BTRFS timeouts #783

Open
SlavikCA opened this issue Jul 12, 2024 · 3 comments

SlavikCA commented Jul 12, 2024

Operating system

Harvester 1.3.1

Description

I ran a container in Kubernetes with two disks attached via PVCs: an SSD and an HDD.

The host node is Dell Precision T7820, 128GB RAM, 32 cores Intel Xeon Gold 5218.

vDSM has 4GB RAM, 4 cores.

I decided to copy about 400GB to the HDD. It's very slow, and in the logs I see the warnings and errors shown below.

Overall, vDSM runs very slowly.

Is there anything I can do to troubleshoot the performance? Any idea what the issue may be?

Kubernetes manifest

apiVersion: v1
kind: Pod
metadata:
  name: dsm
  labels:
    name: dsm
  namespace: dsm
  annotations:
    # https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md#launch-pod-with-json-annotation
    k8s.v1.cni.cncf.io/networks: '[ { "name": "untagged","namespace": "default","mac": "aa:fa:1b:fb:1a:54","interface": "net2" } ]'
spec:
  terminationGracePeriodSeconds: 120 # the Kubernetes default is 30 seconds, which may not be enough
  containers:
    - name: dsm
      image: vdsm/virtual-dsm
      resources:
        limits:
          devices.kubevirt.io/vhost-net: 1
          memory: "5Gi"
          cpu: "4"
        requests:
          cpu: 200m
          memory: "1Gi"
      securityContext:
        privileged: true
        capabilities:
          add: ["NET_ADMIN"]
      env:
        - name: RAM_SIZE
          value: 4G
        - name: CPU_CORES
          value: "4"
        - name: DISK_SIZE
          value: "380G"
        - name: DISK2_SIZE
          value: "2000G"
        - name: DISK_FMT
          value: "qcow2" # qcow2 does not allocate the total disk size, but grows with the data
        - name: DHCP
          value: "Y"
        - name: VM_NET_DEV
          value: "net2"
      volumeMounts:
        - mountPath: /storage
          name: dsm-ssd
        - mountPath: /storage2
          name: dsm-hdd
        - mountPath: /dev/kvm
          name: dev-kvm
  volumes:
    - name: dsm-ssd
      persistentVolumeClaim:
        claimName: dsm-ssd-pvc
    - name: dsm-hdd
      persistentVolumeClaim:
        claimName: dsm-hdd-pvc
    - name: dev-kvm
      hostPath:
        path: /dev/kvm

Docker log

[10681.689074] INFO: task btrfs-transacti:9502 blocked for more than 120 seconds.
[10681.691936]       Tainted: P           O    4.4.302+ #69057
[10681.694104] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10681.697137] btrfs-transacti D ffff880079dc3998     0  9502      2 0x00000000
[10681.699975]  ffff880079dc3998 ffff880079dc0000 ffff880179290380 ffff88007282e440
[10681.703032]  ffff880079dc4000 ffff88017fd1a4c0 7fffffffffffffff ffff88007282e440
[10681.705415]  7fffffffffffffff ffff880079dc39b0 ffffffff815479c0 7fffffffffffffff
[10681.707545] Call Trace:
[10681.708247]  [<ffffffff815479c0>] schedule+0x30/0x80
[10681.709488]  [<ffffffff81549c22>] schedule_timeout+0xb2/0x140
[10681.711293]  [<ffffffff8104397e>] ? kvm_clock_get_cycles+0x1e/0x20
[10681.713023]  [<ffffffff810b55b4>] ? ktime_get+0x34/0xa0
[10681.714389]  [<ffffffff8154715b>] io_schedule_timeout+0x9b/0x100
[10681.715498]  [<ffffffff815470c0>] ? pci_mmcfg_check_reserved+0xa0/0xa0
[10681.716634]  [<ffffffff815483b0>] __wait_for_common+0xa0/0x140
[10681.717648]  [<ffffffff8107fc70>] ? wake_up_q+0x60/0x60
[10681.718569]  [<ffffffff8154857f>] wait_for_completion_io+0x1f/0x30
[10681.719641]  [<ffffffff812c0179>] blkdev_issue_discard+0x1d9/0x220
[10681.720735]  [<ffffffffa0b907fe>] btrfs_issue_discard+0x17e/0x250 [btrfs]
[10681.722212]  [<ffffffffa0b98cb6>] btrfs_discard_extent+0xf6/0x190 [btrfs]
[10681.723604]  [<ffffffffa0b9c6ab>] btrfs_finish_extent_commit+0xfb/0x250 [btrfs]
[10681.725095]  [<ffffffffa0bbcc38>] btrfs_commit_transaction+0xc08/0xf00 [btrfs]
[10681.726335]  [<ffffffffa0bb9151>] transaction_kthread+0x211/0x290 [btrfs]
[10681.727494]  [<ffffffffa0bb8f40>] ? open_ctree+0x3500/0x3500 [btrfs]
[10681.728571]  [<ffffffff81075297>] kthread+0xc7/0xe0
[10681.729388]  [<ffffffff810751d0>] ? kthread_parkme+0x20/0x20
[10681.730335]  [<ffffffff8154ae1f>] ret_from_fork+0x3f/0x80
[10681.731217]  [<ffffffff810751d0>] ? kthread_parkme+0x20/0x20
[25974.225814] BTRFS warning (device sdc1): commit trans:
[25974.225814] total_time: 598309, meta-read[miss/total]:[371/169041], meta-write[count/size]:[29/10096 K]
[25974.225814] prepare phase: time: 1, refs[before/process/after]:[244/123/0]
[25974.225814] wait prev trans completed: time: 0
[25974.225814] pre-run delayed item phase: time: 1, inodes/items:[59/97]
[25974.225814] wait join end trans: time: 0
[25974.225814] run data refs for usrquota: time: 0, refs:[0]
[25974.225814] create snpashot: time: 0, inodes/items:[0/0], refs:[0]
[25974.225814] delayed item phase: time: 0, inodes/items:[0/0]
[25974.225814] delayed refs phase: time: 0, refs:[2]
[25974.225814] commit roots phase: time: 0
[25974.225814] writeback phase: time: 598308


@relink2013

I'm actually having the same issue and I'm not sure why.

  • Volume 1 is on a ZFS nvme mirror and I only use this for DSM, apps, databases and indexing.

  • Volume 2 is 30TB on a ZFS HDD pool. I tried mirroring, RAIDZ, and RAIDZ2, all with and without L2ARC. I also tried both lz4 and zstd compression, disabling atime, and changing the block size to 4k to better match the btrfs vdisk.

No matter what I try I end up with numerous btrfs errors in the logs. Large transfers start out at acceptable speeds, then quickly drop to about 50MB/s, and it's only a matter of time before the container locks up or the VM inside the container crashes. This often leaves me unable to kill the container process, so I have to reboot the entire host.

@SlavikCA
Contributor Author

Relevant blog post about this error:

https://vivani.net/2020/02/08/task-blocked-for-more-than-120-seconds/

To solve this problem, we have to increase the ‘dirty_background_bytes‘ kernel setting to higher values to be able to accommodate the throughput.
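The blog's suggestion can be sketched as follows for a Linux host. This is a minimal example, not a definitive fix: the byte values are illustrative assumptions to be tuned for your hardware, and the file name under /etc/sysctl.d/ is just a convention:

```shell
# Inspect the current writeback thresholds. The *_ratio knobs apply
# when the corresponding *_bytes knobs are 0.
sysctl vm.dirty_background_ratio vm.dirty_background_bytes
sysctl vm.dirty_ratio vm.dirty_bytes

# Lower the background writeback threshold to an absolute value;
# 256 MiB here is illustrative. Setting vm.dirty_background_bytes
# makes the kernel ignore vm.dirty_background_ratio.
sudo sysctl -w vm.dirty_background_bytes=$((256 * 1024 * 1024))

# Persist across reboots (adjust the path/name for your distro).
echo 'vm.dirty_background_bytes = 268435456' | sudo tee /etc/sysctl.d/99-writeback.conf
```

The byte-based knob exists because on machines with a lot of RAM (128GB here), even a small percentage ratio lets an enormous amount of dirty data accumulate before writeback starts, which can then stall I/O for long stretches when it is flushed.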

@relink2013

> Relevant blog post about this error:
>
> vivani.net/2020/02/08/task-blocked-for-more-than-120-seconds
>
> To solve this problem, we have to increase the ‘dirty_background_bytes‘ kernel setting to higher values to be able to accommodate the throughput.

Would this all be set on the host? I'm running on Unraid and those options are available in the GUI.
