Kernel Crash dump Analysis

귀차니즘에 안하고 있던 커널 덤프 분석 13번째 케이스 공유.

- Symptom : 특정 시간대를 기점으로 HA 로 구성된 서로 다른 세개의 시스템 세트(총 6대)가 비슷한 형태의 메시지를 보이며 리부팅.

일부 HA 및 RAC 시스템의 경우 5분간격으로 리부팅 되는 등, 심각한 서비스의 영향도를 초래하는 상황.

- Environment

Server : FUJITSU PRIMERGY RX2540 M2
OS : Oracle Linux 6 Update 8
Kernel : 4.1.12-94.3.8.el6uek.x86_64

- Analysis step

      KERNEL: 4.1.12-94.3.8.el6uek.x86_64/vmlinux
    DUMPFILE: 127.0.0.1-2017-12-17-19:03:50/vmcore [PARTIAL DUMP]
        CPUS: 4
        DATE: Sun Dec 17 19:03:14 2017
      UPTIME: 00:01:43
LOAD AVERAGE: 2.16, 0.69, 0.24
       TASKS: 747
    NODENAME: ***rac1
     RELEASE: 4.1.12-94.3.8.el6uek.x86_64
     VERSION: #2 SMP Fri Jun 30 11:00:28 PDT 2017
     MACHINE: x86_64 (3491 Mhz)
      MEMORY: 71.7 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000068"
         PID: 7023
     COMMAND: "kworker/1:3"
        TASK: ffff881225e11c00 [THREAD_INFO: ffff881228434000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

crash7latest> log

[[[ Snip ]]]

[   82.592884] st 13:0:7:0: [st5] Error 18 (driver bt 0x0, host bt 0x0).
[   82.704777] st 13:0:5:0: reservation conflict
[   82.704978] st 13:0:5:0: [st3] Error 18 (driver bt 0x0, host bt 0x0).
[   82.705780] st 13:0:6:0: reservation conflict
[   82.705787] st 13:0:6:0: [st4] Error 18 (driver bt 0x0, host bt 0x0).
[   82.706867] st 13:0:7:0: reservation conflict
[   82.706896] st 13:0:7:0: [st5] Error 18 (driver bt 0x0, host bt 0x0).
[   86.328998] mtrr: no MTRR for c5000000,400000 found
[   87.833196] mtrr: no MTRR for c5000000,7f0000 found
[   88.757892] st 13:0:5:0: reservation conflict
[   88.757921] st 13:0:5:0: [st3] Error 18 (driver bt 0x0, host bt 0x0).
[   88.758745] st 13:0:6:0: reservation conflict
[   88.758769] st 13:0:6:0: [st4] Error 18 (driver bt 0x0, host bt 0x0).
[   88.759911] st 13:0:7:0: reservation conflict
[   88.759935] st 13:0:7:0: [st5] Error 18 (driver bt 0x0, host bt 0x0).
[ 103.128959] rport-13:0-18: blocked FC remote port time out: removing target and saving binding
[ 103.129045] lpfc 0000:83:00.0: 2:(0):0203 Devloss timeout on WWPN 50:05:07:60:44:41:c6:0b NPort x010600 Data: x100 x5 x0
[ 103.129049] rport-13:0-17: blocked FC remote port time out: removing target and saving binding
[ 103.129076] lpfc 0000:83:00.0: 2:(0):0203 Devloss timeout on WWPN 50:05:07:60:44:41:c6:09 NPort x011600 Data: x40000 x1 x0
[ 103.130990] ------------[ cut here ]------------
[ 103.131003] WARNING: CPU: 3 PID: 737 at lib/kobject.c:244 kobject_add_internal+0x11d/0x200()
[ 103.131006] kobject_add_internal failed for 13:0:14:0 (error: -2 parent: target13:0:14)
[ 103.131008] Modules linked in: oracleasm autofs4 ipmi_devintf cpufreq_powersave osst st bonding ipv6 dm_round_robin scsi_dh_alua dm_multipath uinput iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 xhci_pci xhci_hcd lpc_ich mfd_core ixgbe mdio igb dca i2c_algo_bit ptp pps_core sg ipmi_ssif i2c_core ipmi_si ipmi_msghandler ext4 jbd2 mbcache2 sd_mod lpfc scsi_transport_fc ahci libahci be2net vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mxm_wmi wmi dm_mirror dm_region_hash dm_log dm_mod
[ 103.131054] CPU: 3 PID: 737 Comm: kworker/u8:9 Not tainted 4.1.12-94.3.8.el6uek.x86_64 #2
[ 103.131056] Hardware name: FUJITSU PRIMERGY RX2540 M2/D3289-B1, BIOS V5.0.0.11 R1.15.0 for D3289-B1x                    02/24/2017
[ 103.131065] Workqueue: scsi_wq_13 fc_scsi_scan_rport [scsi_transport_fc]
[ 103.131068] 0000000000000000 ffff881221db3958 ffffffff816e018f ffff881221db39a8
[ 103.131072] 00000000000000f4 ffff881221db3998 ffffffff810868c5 ffff881221155df0
[ 103.131075] ffff88121ff3f978 00000000fffffffe ffff88120a49f438 ffff88121ff3f978
[ 103.131079] Call Trace:
[ 103.131086] [<ffffffff816e018f>] dump_stack+0x63/0x84
[ 103.131093] [<ffffffff810868c5>] warn_slowpath_common+0x95/0xe0
[ 103.131097] [<ffffffff810869c6>] warn_slowpath_fmt+0x46/0x50
[ 103.131101] [<ffffffff8132354d>] kobject_add_internal+0x11d/0x200
[ 103.131105] [<ffffffff814c67c8>] ? scsi_execute+0x118/0x180
[ 103.131109] [<ffffffff81323818>] kobject_add_varg+0x38/0x60
[ 103.131112] [<ffffffff813238f4>] kobject_add+0x44/0x70
[ 103.131117] [<ffffffff8147094f>] device_add+0xdf/0x470
[ 103.131121] [<ffffffff8147f31c>] ? __pm_runtime_resume+0x6c/0x90
[ 103.131127] [<ffffffff814cce53>] scsi_sysfs_add_sdev+0xb3/0x2a0
[ 103.131130] [<ffffffff814c9580>] scsi_add_lun+0x440/0x5c0
[ 103.131133] [<ffffffff814c9e08>] scsi_probe_and_add_lun+0x208/0x530
[ 103.131137] [<ffffffff814cabd6>] __scsi_scan_target+0xe6/0x230
[ 103.131139] [<ffffffff8147f31c>] ? __pm_runtime_resume+0x6c/0x90
[ 103.131142] [<ffffffff814cb1b1>] scsi_scan_target+0xe1/0x100
[ 103.131149] [<ffffffffa00ce5cf>] fc_scsi_scan_rport+0xbf/0xd0 [scsi_transport_fc]
[ 103.131156] [<ffffffff810a14e1>] process_one_work+0x151/0x4b0
[ 103.131160] [<ffffffff810a1960>] worker_thread+0x120/0x480
[ 103.131163] [<ffffffff816e05cb>] ? __schedule+0x30b/0x890
[ 103.131167] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[ 103.131171] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[ 103.131175] [<ffffffff810a6b3e>] kthread+0xce/0xf0
[ 103.131178] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[ 103.131183] [<ffffffff816e52a2>] ret_from_fork+0x42/0x70
[ 103.131186] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[ 103.131189] ---[ end trace c5457f0f37a9c45d ]---
[ 103.131194] scsi 13:0:14:0: failed to add device: -2

[ 103.131527] ---[ end trace c5457f0f37a9c460 ]---
[ 103.131532] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
[ 103.131534] IP: [<ffffffff81288f49>] kernfs_find_ns+0x19/0x110
[ 103.131536] PGD 11d087b067 PUD 0
[ 103.131538] Oops: 0000 [#1] SMP
[ 103.131539] Modules linked in: oracleasm autofs4 ipmi_devintf cpufreq_powersave osst st bonding ipv6 dm_round_robin scsi_dh_alua dm_multipath uinput iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 xhci_pci xhci_hcd lpc_ich mfd_core ixgbe mdio igb dca i2c_algo_bit ptp pps_core sg ipmi_ssif i2c_core ipmi_si ipmi_msghandler ext4 jbd2 mbcache2 sd_mod lpfc scsi_transport_fc ahci libahci be2net vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mxm_wmi wmi dm_mirror dm_region_hash dm_log dm_mod
[ 103.131558] CPU: 1 PID: 7023 Comm: kworker/1:3 Tainted: G W 4.1.12-94.3.8.el6uek.x86_64 #2
[ 103.131559] Hardware name: FUJITSU PRIMERGY RX2540 M2/D3289-B1, BIOS V5.0.0.11 R1.15.0 for D3289-B1x 02/24/2017
[ 103.131562] Workqueue: fc_wq_13 fc_starget_delete [scsi_transport_fc]
[ 103.131563] task: ffff881225e11c00 ti: ffff881228434000 task.ti: ffff881228434000
[ 103.131564] RIP: 0010:[<ffffffff81288f49>] [<ffffffff81288f49>] kernfs_find_ns+0x19/0x110
[ 103.131567] RSP: 0000:ffff881228437ba8 EFLAGS: 00010282
[ 103.131568] RAX: ffff881225e11c00 RBX: 0000000000000000 RCX: 0000000000000006
[ 103.131569] RDX: 0000000000000000 RSI: ffffffff81771ce0 RDI: 0000000000000000
[ 103.131570] RBP: ffff881228437bd8 R08: 000000000000fffe R09: 0000000000000b4e
[ 103.131571] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81771ce0
[ 103.131572] R13: 0000000000000000 R14: ffffffff814782e0 R15: ffff88120a49f400
[ 103.131573] FS: 0000000000000000(0000) GS:ffff88127fc80000(0000) knlGS:0000000000000000
[ 103.131574] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 103.131575] CR2: 0000000000000068 CR3: 00000011d265d000 CR4: 00000000003406e0
[ 103.131576] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 103.131577] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 103.131578] Stack:
[ 103.131579] ffffffff814782e0 ffff88120a49f400 ffff881228437bd8 0000000000000000
[ 103.131581] ffffffff81771ce0 0000000000000000 ffff881228437c08 ffffffff8128934c
[ 103.131583] ffff881228437c18 ffffffff81ba8700 ffff88120a49e428 ffffffffa00d7020
[ 103.131585] Call Trace:
[ 103.131587] [<ffffffff814782e0>] ? transport_add_device+0x20/0x20
[ 103.131589] [<ffffffff8128934c>] kernfs_find_and_get_ns+0x3c/0x70
[ 103.131591] [<ffffffff8128d2dd>] sysfs_unmerge_group+0x1d/0x60
[ 103.131593] [<ffffffff8147a28c>] dpm_sysfs_remove+0x2c/0x70
[ 103.131595] [<ffffffff8146f4e8>] device_del+0x58/0x230
[ 103.131596] [<ffffffff8146f329>] ? device_remove_file+0x19/0x20
[ 103.131598] [<ffffffff814782e0>] ? transport_add_device+0x20/0x20
[ 103.131600] [<ffffffff81477bbe>] attribute_container_class_device_del+0x1e/0x30
[ 103.131602] [<ffffffff81478339>] transport_remove_classdev+0x59/0x70
[ 103.131603] [<ffffffff81477de0>] attribute_container_device_trigger+0xa0/0xe0
[ 103.131605] [<ffffffff81478295>] transport_remove_device+0x15/0x20
[ 103.131606] [<ffffffff814c9ba2>] scsi_target_reap_ref_release+0x32/0x50
[ 103.131608] [<ffffffff814c9bec>] scsi_target_reap+0x2c/0x40
[ 103.131610] [<ffffffff814cd1ab>] scsi_remove_target+0xeb/0x120
[ 103.131613] [<ffffffffa00d0116>] fc_starget_delete+0x26/0x30 [scsi_transport_fc]
[ 103.131615] [<ffffffff810a14e1>] process_one_work+0x151/0x4b0
[ 103.131617] [<ffffffff810a1960>] worker_thread+0x120/0x480
[ 103.131619] [<ffffffff816e05cb>] ? __schedule+0x30b/0x890
[ 103.131621] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[ 103.131623] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[ 103.131625] [<ffffffff810a6b3e>] kthread+0xce/0xf0
[ 103.131626] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[ 103.131629] [<ffffffff816e52a2>] ret_from_fork+0x42/0x70
[ 103.131630] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[ 103.131631] Code: 40 00 eb ef 89 c2 eb de 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 0f 1f 44 00 00 <0f> b7 47 68 48 8b 5f 48 49 89 f1 c1 e8 05 83 e0 01 48 85 d2 0f
[ 103.131652] RIP [<ffffffff81288f49>] kernfs_find_ns+0x19/0x110
[ 103.131654] RSP <ffff881228437ba8>
[ 103.131655] CR2: 0000000000000068

Crash log 확인 결과 시스템은 "kworker/1:3" Process 에서 최종적으로 Crash 를 일으켰으며,

kern_find_fs_ns 함수의 수행 중 문제를 일으킨 것으로 확인 된다.

눈여겨 볼 것은

1. FC remote port 에 대한 disconnection 등의 물리적인 연결오류가 발생했다는 것

2. Workqueue: fc_wq_13 fc_starget_delete [scsi_transport_fc]

라는 부분들로써, 우리는 무엇인가 FC 로 연결된 스토리지 관련 이슈가 발생했다는 것을 짐작할 수 있다.

Backtrace 를 확인해 보기로 한다.

crash7latest> bt -l
PID: 7023   TASK: ffff881225e11c00 CPU: 1   COMMAND: "kworker/1:3"
#0 [ffff881228437820] machine_kexec at ffffffff8105e200
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/kernel/machine_kexec_64.c: 320
#1 [ffff881228437890] crash_kexec at ffffffff81112c98
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/kernel/kexec.c: 1503
#2 [ffff881228437960] oops_end at ffffffff8101a7f8
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/kernel/dumpstack.c: 232
#3 [ffff881228437990] no_context at ffffffff8106d931
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/mm/fault.c: 737
#4 [ffff8812284379e0] __bad_area_nosemaphore at ffffffff8106db2d
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/mm/fault.c: 817
#5 [ffff881228437a30] bad_area_nosemaphore at ffffffff8106dc43
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/mm/fault.c: 825
#6 [ffff881228437a40] __do_page_fault at ffffffff8106e1c8
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/mm/fault.c: 1302
#7 [ffff881228437ab0] do_page_fault at ffffffff8106e4d7
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/mm/fault.c: 1320
#8 [ffff881228437af0] page_fault at ffffffff816e7228
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/kernel/entry_64.S: 1410
    [exception RIP: kernfs_find_ns+25]                                    <<<
    RIP: ffffffff81288f49 RSP: ffff881228437ba8 RFLAGS: 00010282
    RAX: ffff881225e11c00 RBX: 0000000000000000 RCX: 0000000000000006
    RDX: 0000000000000000 RSI: ffffffff81771ce0 RDI: 0000000000000000
    RBP: ffff881228437bd8   R8: 000000000000fffe   R9: 0000000000000b4e
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81771ce0
    R13: 0000000000000000 R14: ffffffff814782e0 R15: ffff88120a49f400
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/include/linux/kernfs.h: 265
#9 [ffff881228437be0] kernfs_find_and_get_ns at ffffffff8128934c
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/fs/kernfs/dir.c: 691
#10 [ffff881228437c10] sysfs_unmerge_group at ffffffff8128d2dd
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/fs/sysfs/group.c: 305
#11 [ffff881228437c30] dpm_sysfs_remove at ffffffff8147a28c
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/power/sysfs.c: 749
#12 [ffff881228437c50] device_del at ffffffff8146f4e8
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/core.c: 1195
#13 [ffff881228437ca0] attribute_container_class_device_del at ffffffff81477bbe
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/attribute_container.c: 412
#14 [ffff881228437cc0] transport_remove_classdev at ffffffff81478339
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/transport_class.c: 232
#15 [ffff881228437cf0] attribute_container_device_trigger at ffffffff81477de0
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/attribute_container.c: 271
#16 [ffff881228437d30] transport_remove_device at ffffffff81478295
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/base/transport_class.c: 250
#17 [ffff881228437d40] scsi_target_reap_ref_release at ffffffff814c9ba2
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/scsi/scsi_scan.c: 389
#18 [ffff881228437d60] scsi_target_reap at ffffffff814c9bec
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/scsi/scsi_scan.c: 517
#19 [ffff881228437d70] scsi_remove_target at ffffffff814cd1ab
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/drivers/scsi/scsi_sysfs.c: 1174
#20 [ffff881228437dc0] fc_starget_delete at ffffffffa00d0116 [scsi_transport_fc]
#21 [ffff881228437de0] process_one_work at ffffffff810a14e1
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/include/trace/events/workqueue.h: 116
#22 [ffff881228437e40] worker_thread at ffffffff810a1960
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/include/linux/list.h: 189
#23 [ffff881228437ec0] kthread at ffffffff810a6b3e
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/kernel/kthread.c: 207
#24 [ffff881228437f50] ret_from_fork at ffffffff816e52a2
    /usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/arch/x86/kernel/entry_64.S: 641

Kworker 는 BIOS 로부터 ACPI wakeup signal 을 처리하는 커널 하부에서 동작하는 커널스레드(프로세스)이다.

Kworker 에 대한 설명에서와 같이 역시 하드웨어 시그널에 대한 컨트롤을 관리하는 것이므로

하드웨어적인 문제와 연관된다고 볼 수 있다.

표시된 page fault 와 Exception 이 발생한 펑션을 확인해 보자.

crash7latest> whatis kernfs_find_ns

struct kernfs_node *kernfs_find_ns(struct kernfs_node *, const unsigned char *, const void *);

crash7latest> dis -rl kernfs_find_ns+25

/usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/fs/kernfs/dir.c: 644
0xffffffff81288f30 <kernfs_find_ns>:    push   %rbp
0xffffffff81288f31 <kernfs_find_ns+1>: mov    %rsp,%rbp
0xffffffff81288f34 <kernfs_find_ns+4>: sub    $0x30,%rsp
0xffffffff81288f38 <kernfs_find_ns+8>: mov    %rbx,-0x18(%rbp)
0xffffffff81288f3c <kernfs_find_ns+12>: mov    %r12,-0x10(%rbp)
0xffffffff81288f40 <kernfs_find_ns+16>: mov    %r13,-0x8(%rbp)
0xffffffff81288f44 <kernfs_find_ns+20>: nopl   0x0(%rax,%rax,1)
/usr/src/debug/kernel-4.1.12/linux-4.1.12-94.3.8.el6uek/include/linux/kernfs.h: 265
0xffffffff81288f49 <kernfs_find_ns+25>: movzwl 0x68(%rdi),%eax

직접 해당 부분의 코드를 확인해 보기로 했다 :

    632 /**
    633 * kernfs_find_ns - find kernfs_node with the given name
    635 * @name: name to look for
    636 * @ns: the namespace tag to use
    637 *
    638 * Look for kernfs_node with name @name under @parent. Returns pointer to
    639 * the found kernfs_node on success, %NULL on failure.
    640 */
    641 static struct kernfs_node *kernfs_find_ns(struct kernfs_node *parent,
    642                                           const unsigned char *name,
    643                                           const void *ns)
    644 {
    645         struct rb_node *node = parent->dir.children.rb_node;
    646         bool has_ns = kernfs_ns_enabled(parent);
    647         unsigned int hash;
    648
    649         lockdep_assert_held(&kernfs_mutex);
    650
    651         if (has_ns != (bool)ns) {
    652                 WARN(1, KERN_WARNING "kernfs: ns %s in '%s' for '%s'\n",
    653                      has_ns ? "required" : "invalid", parent->name, name);
    654                 return NULL;
    655         }
    656
    657         hash = kernfs_name_hash(name, ns);
    658         while (node) {
    659                 struct kernfs_node *kn;
    660                 int result;
    661
    662                 kn = rb_to_kn(node);
    663                 result = kernfs_name_compare(hash, name, ns, kn);
    664                 if (result < 0)
    665                         node = node->rb_left;
    666                 else if (result > 0)
    667                         node = node->rb_right;
    668                 else
    669                         return kn;
    670         }
    671         return NULL;
    672 }

대략적으로 주어진 devnode 이름을 커널 파일시스템 엔트리에서 찾아 결과를 반환해 주는 함수이다.

kernfs 는 커널서브시스템에서 사용되는 procfs 등과 같은 가상 파일시스템을 핸들링하기 위한 함수의 집합이라고 볼 수 있다.

무튼, 이 함수에서 Null point Exception 이 발생하였고,
이 말은 검색하려고 한 노드를 찾을 수 없거나 잘못된 값을 참조한다는 이야기이다.

crash7latest> x/s 0xffffffff81771ce0 <<< R12
0xffffffff81771ce0 <power_group_name>: "power"
crash7latest> x/s
0xffffffff81771ce6 <_enabled>: "enabled" <<< R15

crash7latest> dis -rl ffffffff81771ce0+10
0xffffffff81771ce6 <_enabled>: outsb %gs:(%rsi),(%dx)
0xffffffff81771ce8 <_enabled+2>:        (bad)
0xffffffff81771ce9 <_enabled+3>:        (bad)
0xffffffff81771cea <_enabled+4>:        insb   (%dx),%es:(%rdi)

웁스... Deep 하게 들어가버렸으니 어서 빠져나오자.
지금 상황에서는 이보다 더 Deep 하게 들어갈 필요는 없다.

일단 FC 로 연결된 SCSI Device 에 관련된 문제는 알겠는데 그럼 어떻게 해야 하는것일까?

또한 왜 Device 의 강제 제거를 하거나 Masking 처리를 하면 될것 같은데 패닉이 발생한 것일까?

일단 문제가 좁혀졌으니 관련하여 구글링등을 해본다.
분명 Null point 가 떨어지는것을 보아,
논리적 버그가 코드상에 있을것이고, 보고된 것이 있을 것이다.

kernfs_find_ns 와 SCSI 등을 조합으로 검색해 본다.

(사실 난 회사 Stack checker 라는 툴을 이용했다....)

찾았다!

commit: f9279c968c257ee39b0d7bd2571a4d231a67bcc1 scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state

문제의 해결은 간단했다. 단순히 Exception 용 상태처리를 추가한 것..

아래는 scsi_transport_fc.c 의 scsi_target_reap_ref_relase 주석이다 :

* scsi_target_reap_ref_release - remove target from visibility
* @kref: the reap_ref in the target being released
*
* Called on last put of reap_ref, which is the indication that no device
* under this target is visible anymore, so render the target invisible in
* sysfs. Note: we have to be in user context here because the target reaps
* should be done in places where the scsi device visibility is being removed.

리눅스 커널은 어떤 Device 가 Disconnection 되었을 때, 해당 노드를 remove 하기 위해서 일정기간 대기하고,

workqueue 를 확인한 뒤 remove 즉, Invisible 처리를 한다.

이는 나중에 다시 동일 Target 이 연결되었을 때, 동일한 Target ID 를 사용하기 위해서 이다.
(관련 주석이 있던 소스코드가 무엇인지 생각나지 않지만 코드에 주석으로 설명되어있다..)

따라서 연결에 물리적 문제가 생겼을 경우, Kernel 은 SCSI protocol 에 따라,

I/O Workqueue 를 확인하고 가능할때 재개하려고 하거나, Abort 시키려고 한다.

때, 이 Device 의 상태는 Remove 된 것으로 인식되야 하는데
이 디바이스가 이미 Created 된 상태에서 Remove 를 해야하는 것이기 때문에,
Race condition 이 발생한다는 것이였다.

이번 문제의 경우 이 세부적인 상태에 대한 Signal 은 받았으나

처리하지 못하여 결국엔 kernel panic 이 지속적으로 발생된 것이다.

아래는 위 kernel git commit 에서 소개하는 Patch 이다 :
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=f9279c968c257ee39b0d7bd2571a4d231a67bcc1

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3c44032..3832ba5 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c

@@ -385,11 +385,12 @@ static void scsi_target_reap_ref_release(struct kref *kref)

= container_of(kref, struct scsi_target, reap_ref);

- * if we get here and the target is still in the CREATED state that

+ * if we get here and the target is still in a CREATED state that

* means it was allocated but never made visible (because a scan

* turned up no LUNs), so don't call device_del() on it.

- if (starget->state != STARGET_CREATED) {

+ if ((starget->state != STARGET_CREATED) &&

+ (starget->state != STARGET_CREATED_REMOVE)) {

transport_remove_device(&starget->dev);

device_del(&starget->dev);

}

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ce470f6..d6984df 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c

@@ -1394,11 +1394,15 @@ restart:

spin_lock_irqsave(shost->host_lock, flags);

list_for_each_entry(starget, &shost->__targets, siblings) {

if (starget->state == STARGET_DEL ||

- starget->state == STARGET_REMOVE)

+ starget->state == STARGET_REMOVE ||

+ starget->state == STARGET_CREATED_REMOVE)

continue;

if (starget->dev.parent == dev || &starget->dev == dev) {

kref_get(&starget->reap_ref);

- starget->state = STARGET_REMOVE;

+ if (starget->state == STARGET_CREATED)

+ starget->state = STARGET_CREATED_REMOVE;

+ else

+ starget->state = STARGET_REMOVE;

spin_unlock_irqrestore(shost->host_lock, flags);

__scsi_remove_target(starget);

scsi_target_reap(starget);

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index d3fb98f..b41ee9d 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h

@@ -248,6 +248,7 @@ enum scsi_target_state {

STARGET_CREATED = 1,

STARGET_RUNNING,

STARGET_REMOVE,

+ STARGET_CREATED_REMOVE,

STARGET_DEL,

};

실재로 고객은 해당 서버에서 LTO 테잎 라이브러리를 이용해 백업을 받는 작업이 있었고,
이때마다 리부팅이 발생한 것으로 판단된다.

처음에는 Tape Driver 의 문제를 의심했으나,

Null point 에러와 여러대의 다른 Cluster set 에서 동일하게 발생하고 있었기 때문에

버그를 찾아보게 되었고, 픽스를 확인할 수 있었다.

물론 이 버그에 대한 패치는 완벽하게 릴리즈 되지 않았고, QA 중이며, UEK4QU6 에 출시되는 4.1.12-119 버젼에 포함될 것이다.

** 완전 새로운 버그였으면 그냥 새 버그로 오픈하거나 커널메일링에 문의를 할 수 있던 상황인데 아쉽..(?)

저작자표시 비영리 변경금지 (새창열림)

'Skills > mY Technutz' 카테고리의 다른 글

Kernel Dump Analysis #15 (0)	2018.02.19
Kernel Dump Analysis #14 (0)	2018.02.13
커널이 지원하는 기능을 확인하는 습관. (1)	2017.02.06
Kexec/Kdump 의 제약사항에 대해서 (4)	2016.01.14
Patch of Egypt timezone data - Urgent (0)	2014.06.18

::: Dragon's Tail :::

Kernel Crash dump Analysis - #13

'Skills > mY Technutz' 카테고리의 다른 글

티스토리툴바

« 2026/07 »
일	월	화	수	목	금	토
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

Kernel Crash dump Analysis - #13

'Skills > mY Technutz' 카테고리의 다른 글

'Skills/mY Technutz' 관련글

티스토리툴바