From 04e484c5973ed0f9234c97685c3c5e1ebf0d6eb6 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Fri, 3 Jul 2020 15:05:50 +0800 Subject: [PATCH 1/3] btrfs: discard: add missing put when grabbing block group from unused list [BUG] The following small test script can trigger ASSERT() at unmount time: mkfs.btrfs -f $dev mount $dev $mnt mount -o remount,discard=async $mnt umount $mnt The call trace: assertion failed: atomic_read(&block_group->count) == 1, in fs/btrfs/block-group.c:3431 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.h:3204! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 10389 Comm: umount Tainted: G O 5.8.0-rc3-custom+ #68 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: btrfs_free_block_groups.cold+0x22/0x55 [btrfs] close_ctree+0x2cb/0x323 [btrfs] btrfs_put_super+0x15/0x17 [btrfs] generic_shutdown_super+0x72/0x110 kill_anon_super+0x18/0x30 btrfs_kill_super+0x17/0x30 [btrfs] deactivate_locked_super+0x3b/0xa0 deactivate_super+0x40/0x50 cleanup_mnt+0x135/0x190 __cleanup_mnt+0x12/0x20 task_work_run+0x64/0xb0 __prepare_exit_to_usermode+0x1bc/0x1c0 __syscall_return_slowpath+0x47/0x230 do_syscall_64+0x64/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The code: ASSERT(atomic_read(&block_group->count) == 1); btrfs_put_block_group(block_group); [CAUSE] Obviously it's some btrfs_get_block_group() call doesn't get its put call. The offending btrfs_get_block_group() happens here: void btrfs_mark_bg_unused(struct btrfs_block_group *bg) { if (list_empty(&bg->bg_list)) { btrfs_get_block_group(bg); list_add_tail(&bg->bg_list, &fs_info->unused_bgs); } } So every call sites removing the block group from unused_bgs list should reduce the ref count of that block group. However for async discard, it didn't follow the call convention: void btrfs_discard_punt_unused_bgs_list(struct btrfs_fs_info *fs_info) { list_for_each_entry_safe(block_group, next, &fs_info->unused_bgs, bg_list) { list_del_init(&block_group->bg_list); btrfs_discard_queue_work(&fs_info->discard_ctl, block_group); } } And in btrfs_discard_queue_work(), it doesn't call btrfs_put_block_group() either. [FIX] Fix the problem by reducing the reference count when we grab the block group from unused_bgs list. Reported-by: Marcos Paulo de Souza Fixes: 6e80d4f8c422 ("btrfs: handle empty block_group removal for async discard") CC: stable@vger.kernel.org # 5.6+ Tested-by: Marcos Paulo de Souza Reviewed-by: Anand Jain Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/discard.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index 5615320fa659f3..741c7e19c32f2c 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -619,6 +619,7 @@ void btrfs_discard_punt_unused_bgs_list(struct btrfs_fs_info *fs_info) list_for_each_entry_safe(block_group, next, &fs_info->unused_bgs, bg_list) { list_del_init(&block_group->bg_list); + btrfs_put_block_group(block_group); btrfs_discard_queue_work(&fs_info->discard_ctl, block_group); } spin_unlock(&fs_info->unused_bgs_lock); From 230ed397435e85b54f055c524fcb267ae2ce3bc4 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Mon, 6 Jul 2020 09:14:12 -0400 Subject: [PATCH 2/3] btrfs: fix double put of block group with nocow While debugging a patch that I wrote I was hitting use-after-free panics when accessing block groups on unmount. This turned out to be because in the nocow case if we bail out of doing the nocow for whatever reason we need to call btrfs_dec_nocow_writers() if we called the inc. This puts our block group, but a few error cases does if (nocow) { btrfs_dec_nocow_writers(); goto error; } unfortunately, error is error: if (nocow) btrfs_dec_nocow_writers(); so we get a double put on our block group. Fix this by dropping the error cases calling of btrfs_dec_nocow_writers(), as it's handled at the error label now. Fixes: 762bf09893b4 ("btrfs: improve error handling in run_delalloc_nocow") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index cfa863d2d97c05..11f81a1483504b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1690,12 +1690,8 @@ static noinline int run_delalloc_nocow(struct inode *inode, ret = fallback_to_cow(inode, locked_page, cow_start, found_key.offset - 1, page_started, nr_written); - if (ret) { - if (nocow) - btrfs_dec_nocow_writers(fs_info, - disk_bytenr); + if (ret) goto error; - } cow_start = (u64)-1; } @@ -1711,9 +1707,6 @@ static noinline int run_delalloc_nocow(struct inode *inode, ram_bytes, BTRFS_COMPRESS_NONE, BTRFS_ORDERED_PREALLOC); if (IS_ERR(em)) { - if (nocow) - btrfs_dec_nocow_writers(fs_info, - disk_bytenr); ret = PTR_ERR(em); goto error; } From d77765911385b65fc82d74ab71b8983cddfe0b58 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 9 Jul 2020 18:22:06 +0200 Subject: [PATCH 3/3] btrfs: wire up iter_file_splice_write btrfs implements the iter_write op and thus can use the more efficient iov_iter based splice implementation. For now falling back to the less efficient default is pretty harmless, but I have a pending series that removes the default, and thus would cause btrfs to not support splice at all. Reported-by: Andy Lavr Tested-by: Andy Lavr Signed-off-by: Christoph Hellwig Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 2520605afc256e..b0d2c976587e52 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3509,6 +3509,7 @@ const struct file_operations btrfs_file_operations = { .read_iter = generic_file_read_iter, .splice_read = generic_file_splice_read, .write_iter = btrfs_file_write_iter, + .splice_write = iter_file_splice_write, .mmap = btrfs_file_mmap, .open = btrfs_file_open, .release = btrfs_release_file,