Skip to content

Commit

Permalink
Revert "lib/dmesg.rb: ignore oops when kernel is tainted"
Browse files Browse the repository at this point in the history
This reverts commit 631129b.

WARNING rcu_torture_stats_print is shown in a Tainted kernel.

[   91.720955][    T0] ==================================================================
[   91.722870][    T0] BUG: KASAN: out-of-bounds in cpu_init_exception_handling+0xb71/0xcce
[   91.725314][    T0] Write of size 2 at addr ffff8881009bfea0 by task swapper/1/0
[   91.727658][    T0]
[   91.728191][    T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.16.0-rc2 intel#1 a24463070ea22534f6c67668cecaa669e0bcb985
[   91.766536][    T0] Disabling lock debugging due to kernel taint
...
[  120.265469][  T553] ------------[ cut here ]------------
[  120.267872][  T553] WARNING: CPU: 0 PID: 553 at kernel/rcu/rcutorture.c:1832 rcu_torture_stats_print+0xb80/0xd99 [rcutorture]
...
[  120.275075][  T553] CPU: 0 PID: 553 Comm: rcu_torture_sta Tainted: G    B             5.16.0-rc2 intel#1 a24463070ea22534f6c67668cecaa669e0bcb985
[  120.276925][  T553] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[  120.278470][  T553] RIP: 0010:rcu_torture_stats_print+0xb80/0xd99 [rcutorture]

But such WARNING can occur standalone, which means it is not the result of early oops/panic.

[  130.687780][  T578] ------------[ cut here ]------------
[  130.689263][  T578] WARNING: CPU: 0 PID: 578 at kernel/rcu/rcutorture.c:1832 rcu_torture_stats_print+0xb80/0xd99 [rcutorture]
...
[  130.693473][  T578] CPU: 0 PID: 578 Comm: rcu_torture_sta Not tainted 5.16.0-rc2-00001-gdce1ca0525bf intel#1 3c0077784af7403eda198a81ec74f23688d9ec99
...
[  130.696637][  T578] RIP: 0010:rcu_torture_stats_print+0xb80/0xd99 [rcutorture]

This means the rcu_torture_stats_print WARNING worth being recorded. So revert the patch
before a better way to know which oops is really related to early occurred issue.

Signed-off-by: Philip Li <[email protected]>
  • Loading branch information
rli9 committed Feb 14, 2023
1 parent c6fc5e8 commit 4fc650d
Show file tree
Hide file tree
Showing 7 changed files with 68 additions and 18 deletions.
19 changes: 1 addition & 18 deletions lib/dmesg.rb
Original file line number Diff line number Diff line change
Expand Up @@ -117,16 +117,8 @@ def concat_context_from_dmesg(dmesg_file, line)
line
end

ZDAY_KERNEL_STATE_NOT_TAINTED = 1
ZDAY_KERNEL_STATE_TAINTED = 2
ZDAY_KERNEL_STATE_UNKNOWN = 3

def grep_crash_head(dmesg_file)
# [ 142.621156][ T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.2.0-rc1-wt-ath-06577-g6c396023bf6e #1
# [ 142.690028][ T0] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 6.2.0-rc1-wt-ath-06577-g6c396023bf6e #1
# [ 142.665410][ T0] WARNING: suspicious RCU usage
# [ 142.665426][ T0] 6.2.0-rc1-wt-ath-06577-g6c396023bf6e #1 Not tainted
raw_oops = %x[ #{grep_cmd(dmesg_file)} -a -E -e " Not tainted" -e " Tainted:" -e \\\\+0x -f #{LKP_SRC_ETC}/oops-pattern #{dmesg_file} |
raw_oops = %x[ #{grep_cmd(dmesg_file)} -a -E -e \\\\+0x -f #{LKP_SRC_ETC}/oops-pattern #{dmesg_file} |
grep -v -E -f #{LKP_SRC_ETC}/oops-pattern-ignore ]

return {} if raw_oops.empty?
Expand All @@ -136,7 +128,6 @@ def grep_crash_head(dmesg_file)
oops_re = load_regular_expressions("#{LKP_SRC_ETC}/oops-pattern")
prev_line = nil
has_oom = false
kernel_state = ZDAY_KERNEL_STATE_UNKNOWN

add_one_calltrace = lambda do |line|
break if has_oom
Expand All @@ -154,14 +145,6 @@ def grep_crash_head(dmesg_file)
oops_map[$1] ||= line

has_oom = true if line =~ OOM_PATTERN
when / Not tainted/
kernel_state = ZDAY_KERNEL_STATE_NOT_TAINTED
when / Tainted:/
log_warn "kernel becomes tainted from #{kernel_state} directly" unless kernel_state == ZDAY_KERNEL_STATE_NOT_TAINTED

kernel_state = ZDAY_KERNEL_STATE_TAINTED
# ignore rest oops when kernel is tainted
break
when /\+0x/
next if line.index ' ? '

Expand Down
18 changes: 18 additions & 0 deletions spec/stats/dmesg/23.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,24 @@ Oops:#[##]: 1
message:Oops:#[##]: [ 21.748156] Oops: 0002 [#1]
pattern:Oops:#[##]: Oops:

# EIP is at bpf_prog_free+0x
EIP_is_at_bpf_prog_free: 1
message:EIP_is_at_bpf_prog_free: [ 21.755582] EIP is at bpf_prog_free+0x11/0x50
pattern:EIP_is_at_bpf_prog_free: EIP is at bpf_prog_free+0x

# SyS_socketcall+0x
calltrace:SyS_socketcall: 1
message:calltrace:SyS_socketcall: [ 21.827023] [<817eefcb>] SyS_socketcall+0x42b/0x600
pattern:calltrace:SyS_socketcall: SyS_socketcall+0x

# Kernel panic - not syncing: Fatal exception
Kernel_panic-not_syncing:Fatal_exception: 1
message:Kernel_panic-not_syncing:Fatal_exception: [ 21.889319] Kernel panic - not syncing: Fatal exception
pattern:Kernel_panic-not_syncing:Fatal_exception: Kernel panic - not syncing: Fatal exception

timestamp:last: 21.889319
timestamp:BUG:unable_to_handle_kernel: 21.734461
timestamp:Oops:#[##]: 21.748156
timestamp:EIP_is_at_bpf_prog_free: 21.755582
timestamp:calltrace:SyS_socketcall: 21.827023
timestamp:Kernel_panic-not_syncing:Fatal_exception: 21.889319
12 changes: 12 additions & 0 deletions spec/stats/dmesg/48.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,19 @@ general_protection_fault:#[##]: 1
message:general_protection_fault:#[##]: [ 150.375503] general protection fault: 0000 [#1] SMP KASAN
pattern:general_protection_fault:#[##]: general protection fault:

# RIP: .* init_real_cpu_mask+0x
RIP:init_real_cpu_mask: 1
message:RIP:init_real_cpu_mask: [ 150.382189] RIP: 0010:[<ffffffff83a0b6e7>] [<ffffffff83a0b6e7>] init_real_cpu_mask+0x77/0x150
pattern:RIP:init_real_cpu_mask: RIP: .* init_real_cpu_mask+0x

# Kernel panic - not syncing: Fatal exception
Kernel_panic-not_syncing:Fatal_exception: 1
message:Kernel_panic-not_syncing:Fatal_exception: [ 150.524788] Kernel panic - not syncing: Fatal exception
pattern:Kernel_panic-not_syncing:Fatal_exception: Kernel panic - not syncing: Fatal exception

timestamp:last: 150.524788
timestamp:BUG:KASAN:wild-memory-access_on_address: 150.330704
timestamp:calltrace:hfi1_mod_init: 150.361179
timestamp:general_protection_fault:#[##]: 150.375503
timestamp:RIP:init_real_cpu_mask: 150.382189
timestamp:Kernel_panic-not_syncing:Fatal_exception: 150.524788
6 changes: 6 additions & 0 deletions spec/stats/dmesg/62.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,11 @@ BUG:spinlock_bad_magic_on_CPU: 1
message:BUG:spinlock_bad_magic_on_CPU: [ 90.322140] BUG: spinlock bad magic on CPU#1, kworker/u4:1/123
pattern:BUG:spinlock_bad_magic_on_CPU: BUG: spinlock bad magic on CPU#1

# fd_timer_workfn+0x
calltrace:fd_timer_workfn: 1
message:calltrace:fd_timer_workfn: [ 90.322660] [<c173d70a>] fd_timer_workfn+0x9/0xb
pattern:calltrace:fd_timer_workfn: fd_timer_workfn+0x

timestamp:last: 90.333534
timestamp:BUG:spinlock_bad_magic_on_CPU: 90.322140
timestamp:calltrace:fd_timer_workfn: 90.322660
10 changes: 10 additions & 0 deletions spec/stats/dmesg/68.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,13 @@ related_general_protection_fault: 1
message:related_general_protection_fault: [18444175627.134970] segment-related general protection fault: 206c [#2] PTI
pattern:related_general_protection_fault: related general protection fault:

# RIP: LC16+0x
RIP:.LC16: 1
message:RIP:.LC16: [18444175627.139181] RIP: 0010:.LC16+0x30d/0x7a7
pattern:RIP:.LC16: RIP: LC16+0x

# RIP: __nr_to_section+0x
RIP:__nr_to_section: 1
message:RIP:__nr_to_section: [18444175627.157179] RIP: 0010:__nr_to_section+0x23/0x48
pattern:RIP:__nr_to_section: RIP: __nr_to_section+0x

6 changes: 6 additions & 0 deletions spec/stats/dmesg/72.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,11 @@ BUG:workqueue_leaked_lock_or_atomic: 1
message:BUG:workqueue_leaked_lock_or_atomic: [ 179.857058] BUG: workqueue leaked lock or atomic: kworker/u2:2/0x00000001/127
pattern:BUG:workqueue_leaked_lock_or_atomic: BUG: workqueue leaked lock or atomic:

# BUG: scheduling while atomic:
BUG:scheduling_while_atomic: 1
message:BUG:scheduling_while_atomic: [ 179.867914] BUG: scheduling while atomic: kworker/u2:2/127/0x00000002
pattern:BUG:scheduling_while_atomic: BUG: scheduling while atomic:

timestamp:last: 179.867914
timestamp:BUG:workqueue_leaked_lock_or_atomic: 179.857058
timestamp:BUG:scheduling_while_atomic: 179.867914
15 changes: 15 additions & 0 deletions spec/stats/dmesg/kfence-01.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,18 @@ BUG:KFENCE:use-after-free_read_in_test_use_after_free_read: 1
message:BUG:KFENCE:use-after-free_read_in_test_use_after_free_read: BUG: KFENCE: use-after-free read in test_use_after_free_read+0xb3/0x143
pattern:BUG:KFENCE:use-after-free_read_in_test_use_after_free_read: BUG: KFENCE: use-after-free read in test_use_after_free_read

# BUG: KFENCE: invalid free in test_double_free
BUG:KFENCE:invalid_free_in_test_double_free: 1
message:BUG:KFENCE:invalid_free_in_test_double_free: BUG: KFENCE: invalid free in test_double_free+0xdc/0x171
pattern:BUG:KFENCE:invalid_free_in_test_double_free: BUG: KFENCE: invalid free in test_double_free

# BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write
BUG:KFENCE:memory_corruption_in_test_kmalloc_aligned_oob_write: 1
message:BUG:KFENCE:memory_corruption_in_test_kmalloc_aligned_oob_write: BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184
pattern:BUG:KFENCE:memory_corruption_in_test_kmalloc_aligned_oob_write: BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write

# BUG: KFENCE: invalid read in test_invalid_access
BUG:KFENCE:invalid_read_in_test_invalid_access: 1
message:BUG:KFENCE:invalid_read_in_test_invalid_access: BUG: KFENCE: invalid read in test_invalid_access+0x26/0xe0
pattern:BUG:KFENCE:invalid_read_in_test_invalid_access: BUG: KFENCE: invalid read in test_invalid_access

0 comments on commit 4fc650d

Please sign in to comment.