
Slow performance, BTRFS timeouts #783

Open
SlavikCA opened this issue Jul 12, 2024 · 3 comments

SlavikCA commented Jul 12, 2024

Operating system

Harvester 1.3.1

Description

I ran a container in Kubernetes with two disks attached via PVCs: an SSD and an HDD.

The host node is Dell Precision T7820, 128GB RAM, 32 cores Intel Xeon Gold 5218.

vDSM has 4GB RAM, 4 cores.

I decided to copy about 400GB to the HDD. It's very slow, and in the logs I see the warnings and errors shown below.

Overall, vDSM runs very slowly.

Is there anything I can do to troubleshoot the performance? Any idea what the issue may be?

Kubernetes manifest

apiVersion: v1
kind: Pod
metadata:
  name: dsm
  labels:
    name: dsm
  namespace: dsm
  annotations:
    # https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md#launch-pod-with-json-annotation
    k8s.v1.cni.cncf.io/networks: '[ { "name": "untagged","namespace": "default","mac": "aa:fa:1b:fb:1a:54","interface": "net2" } ]'
spec:
  terminationGracePeriodSeconds: 120 # the Kubernetes default is 30 seconds, which may not be enough
  containers:
    - name: dsm
      image: vdsm/virtual-dsm
      resources:
        limits:
          devices.kubevirt.io/vhost-net: 1
          memory: "5Gi"
          cpu: "4"
        requests:
          cpu: 200m
          memory: "1Gi"
      securityContext:
        privileged: true
        capabilities:
          add: ["NET_ADMIN"]
      env:
        - name: RAM_SIZE
          value: 4G
        - name: CPU_CORES
          value: "4"
        - name: DISK_SIZE
          value: "380G"
        - name: DISK2_SIZE
          value: "2000G"
        - name: DISK_FMT
          value: "qcow2" # qcow2 does not allocate the total disk size, but grows with the data
        - name: DHCP
          value: "Y"
        - name: VM_NET_DEV
          value: "net2"
      volumeMounts:
        - mountPath: /storage
          name: dsm-ssd
        - mountPath: /storage2
          name: dsm-hdd
        - mountPath: /dev/kvm
          name: dev-kvm
  volumes:
    - name: dsm-ssd
      persistentVolumeClaim:
        claimName: dsm-ssd-pvc
    - name: dsm-hdd
      persistentVolumeClaim:
        claimName: dsm-hdd-pvc
    - name: dev-kvm
      hostPath:
        path: /dev/kvm

Docker log

[10681.689074] INFO: task btrfs-transacti:9502 blocked for more than 120 seconds.
[10681.691936]       Tainted: P           O    4.4.302+ #69057
[10681.694104] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10681.697137] btrfs-transacti D ffff880079dc3998     0  9502      2 0x00000000
[10681.699975]  ffff880079dc3998 ffff880079dc0000 ffff880179290380 ffff88007282e440
[10681.703032]  ffff880079dc4000 ffff88017fd1a4c0 7fffffffffffffff ffff88007282e440
[10681.705415]  7fffffffffffffff ffff880079dc39b0 ffffffff815479c0 7fffffffffffffff
[10681.707545] Call Trace:
[10681.708247]  [<ffffffff815479c0>] schedule+0x30/0x80
[10681.709488]  [<ffffffff81549c22>] schedule_timeout+0xb2/0x140
[10681.711293]  [<ffffffff8104397e>] ? kvm_clock_get_cycles+0x1e/0x20
[10681.713023]  [<ffffffff810b55b4>] ? ktime_get+0x34/0xa0
[10681.714389]  [<ffffffff8154715b>] io_schedule_timeout+0x9b/0x100
[10681.715498]  [<ffffffff815470c0>] ? pci_mmcfg_check_reserved+0xa0/0xa0
[10681.716634]  [<ffffffff815483b0>] __wait_for_common+0xa0/0x140
[10681.717648]  [<ffffffff8107fc70>] ? wake_up_q+0x60/0x60
[10681.718569]  [<ffffffff8154857f>] wait_for_completion_io+0x1f/0x30
[10681.719641]  [<ffffffff812c0179>] blkdev_issue_discard+0x1d9/0x220
[10681.720735]  [<ffffffffa0b907fe>] btrfs_issue_discard+0x17e/0x250 [btrfs]
[10681.722212]  [<ffffffffa0b98cb6>] btrfs_discard_extent+0xf6/0x190 [btrfs]
[10681.723604]  [<ffffffffa0b9c6ab>] btrfs_finish_extent_commit+0xfb/0x250 [btrfs]
[10681.725095]  [<ffffffffa0bbcc38>] btrfs_commit_transaction+0xc08/0xf00 [btrfs]
[10681.726335]  [<ffffffffa0bb9151>] transaction_kthread+0x211/0x290 [btrfs]
[10681.727494]  [<ffffffffa0bb8f40>] ? open_ctree+0x3500/0x3500 [btrfs]
[10681.728571]  [<ffffffff81075297>] kthread+0xc7/0xe0
[10681.729388]  [<ffffffff810751d0>] ? kthread_parkme+0x20/0x20
[10681.730335]  [<ffffffff8154ae1f>] ret_from_fork+0x3f/0x80
[10681.731217]  [<ffffffff810751d0>] ? kthread_parkme+0x20/0x20
[25974.225814] BTRFS warning (device sdc1): commit trans:
[25974.225814] total_time: 598309, meta-read[miss/total]:[371/169041], meta-write[count/size]:[29/10096 K]
[25974.225814] prepare phase: time: 1, refs[before/process/after]:[244/123/0]
[25974.225814] wait prev trans completed: time: 0
[25974.225814] pre-run delayed item phase: time: 1, inodes/items:[59/97]
[25974.225814] wait join end trans: time: 0
[25974.225814] run data refs for usrquota: time: 0, refs:[0]
[25974.225814] create snpashot: time: 0, inodes/items:[0/0], refs:[0]
[25974.225814] delayed item phase: time: 0, inodes/items:[0/0]
[25974.225814] delayed refs phase: time: 0, refs:[2]
[25974.225814] commit roots phase: time: 0
[25974.225814] writeback phase: time: 598308


@relink2013

I'm actually having the same issue and I'm not sure why.

  • Volume 1 is on a ZFS nvme mirror and I only use this for DSM, apps, databases and indexing.

  • Volume 2 is 30TB on a ZFS HDD pool. I tried mirroring, RAIDZ, and RAIDZ2, all with and without L2ARC. I also tried both lz4 and zstd compression, disabling atime, and changing the block size to 4k to better match the btrfs vdisk.

No matter what I try I end up with numerous btrfs errors in the logs. Large transfers start out at acceptable speeds, then quickly drop to about 50MB/s, and it's only a matter of time before the container locks up or the VM inside the container crashes. This often leaves me unable to kill the container process, so I have to reboot the entire host.

@SlavikCA
Contributor Author

Relevant blog post about this error:

https://vivani.net/2020/02/08/task-blocked-for-more-than-120-seconds/

To solve this problem, we have to increase the ‘dirty_background_bytes‘ kernel setting to higher values to be able to accommodate the throughput.
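The blog's suggestion can be sketched as follows for a Linux host. This is a minimal example, not a definitive fix: the byte values are illustrative assumptions to be tuned for your hardware, and the file name under /etc/sysctl.d/ is just a convention:

```shell
# Inspect the current writeback thresholds. The *_ratio knobs apply
# when the corresponding *_bytes knobs are 0.
sysctl vm.dirty_background_ratio vm.dirty_background_bytes
sysctl vm.dirty_ratio vm.dirty_bytes

# Lower the background writeback threshold to an absolute value;
# 256 MiB here is illustrative. Setting vm.dirty_background_bytes
# makes the kernel ignore vm.dirty_background_ratio.
sudo sysctl -w vm.dirty_background_bytes=$((256 * 1024 * 1024))

# Persist across reboots (adjust the path/name for your distro).
echo 'vm.dirty_background_bytes = 268435456' | sudo tee /etc/sysctl.d/99-writeback.conf
```

The byte-based knob exists because on machines with a lot of RAM (128GB here), even a small percentage ratio lets an enormous amount of dirty data accumulate before writeback starts, which can then stall I/O for long stretches when it is flushed.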

@relink2013

> Relevant blog post about this error:
>
> vivani.net/2020/02/08/task-blocked-for-more-than-120-seconds
>
> To solve this problem, we have to increase the ‘dirty_background_bytes‘ kernel setting to higher values to be able to accommodate the throughput.

Would this all be set on the host? I'm running on Unraid and those options are available in the GUI.
