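A customer running the RHCK (Red Hat Compatible Kernel) on Oracle Linux 7 hit a crash dump and asked for an analysis.
I was going to let it pass, but the issue occurred on a server that was using eBPF,
so I got curious and decided to take a look.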
crash7latest> crashinfo
+==========================+
| *** Crashinfo v1.3.4 *** |
+==========================+
+++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM)
KERNEL: /share/linuxrpm/vmlinux_repo/64/3.10.0-957.21.3.el7.x86_64/vmlinux
DUMPFILE: tds-2020-04-02/vmcore.tar.gz_extract/vmcore [PARTIAL DUMP]
CPUS: 16
DATE: Thu Apr 2 17:59:48 2020
UPTIME: 00:06:04
LOAD AVERAGE: 0.70, 0.99, 0.56
TASKS: 2211
NODENAME: ******
RELEASE: 3.10.0-957.21.3.el7.x86_64
VERSION: #1 SMP Mon Jun 17 15:42:47 PDT 2019
MACHINE: x86_64 (3600 Mhz)
MEMORY: 382.5 GB
PANIC: "BUG: unable to handle kernel paging request at ffffffff86ab96c0"
+--------------------------+
>------------------------| Per-cpu Stacks ('bt -a') |------------------------<
+--------------------------+
-- CPU#0 --
PID=0 CPU=0 CMD=swapper/0
#0 crash_nmi_callback+0x37
#1 nmi_handle+0x8c
#2 do_nmi+0x15d
#3 end_repeat_nmi+0x1e
#-1 intel_idle+0xeb, 507 bytes of data
#4 intel_idle+0xeb
#5 cpuidle_enter_state+0x45
#6 cpuidle_idle_call+0xde
#7 arch_cpu_idle+0xe
#8 cpu_startup_entry+0x14a
#9 rest_init+0x77
#10 start_kernel+0x44b
#11 x86_64_start_reservations+0x24
#12 x86_64_start_kernel+0x154
#13 start_cpu+0x5
-- CPU#10 --
PID=0 CPU=10 CMD=swapper/10
#0 crash_nmi_callback+0x37
#1 nmi_handle+0x8c
#2 do_nmi+0x15d
#3 end_repeat_nmi+0x1e
#-1 intel_idle+0xeb, 507 bytes of data
#4 intel_idle+0xeb
#5 cpuidle_enter_state+0x45
#6 cpuidle_idle_call+0xde
#7 arch_cpu_idle+0xe
#8 cpu_startup_entry+0x14a
#9 start_secondary+0x1f7
#10 start_cpu+0x5
-- CPU#11 --
PID=132676 CPU=11 CMD=bpftool
#0 machine_kexec+0x204
#1 __crash_kexec+0x72
#2 crash_kexec+0x30
#3 oops_end+0xa8
#4 no_context+0x285
#5 __bad_area_nosemaphore+0x74
#6 bad_area_nosemaphore+0x14
#7 __do_page_fault+0x2d0
#8 do_page_fault+0x35
#9 page_fault+0x28, 519 bytes of data
#10 security_bpf+0x1c
#11 sys_bpf+0xee
#12 system_call_fastpath+0x22, 477 bytes of data
[[[ Snip ]]]
+-------------------------------+
>----------------------| Last 40 lines of dmesg buffer |----------------------<
+-------------------------------+
[ 48.029720] OKSK-00004: Module load succeeded. Build information: (LOW DEBUG) USM_12.2.0.1.0ACFSJUL2019RU_LINUX.X64_190610 2019/06/10 08:52:11
[ 48.680775] ADVMK-0001: Module load succeeded. Build information: (LOW DEBUG) - USM_12.2.0.1.0ACFSJUL2019RU_LINUX.X64_190610 built on 2019/06/10 09:17:30.
[ 48.713305] asmInit: rval=0 mode=0 offload=0
[ 49.731059] ACFSK-0037: Module load succeeded. Build information: (LOW DEBUG) USM_12.2.0.1.0ACFSJUL2019RU_LINUX.X64_190610 2019/06/10 09:50:34
[ 50.678688] SEOS Syscall Monitor - ACTIVATED
[ 106.342380] warning: `osysmond.bin' uses legacy ethtool link settings API, link modes are only partially reported
[ 152.180599] oracle (110125): Using mlock ulimits for SHM_HUGETLB is deprecated
[ 363.467610] TECH PREVIEW: eBPF syscall may not be fully supported.
Please review provided documentation for limitations.
[ 363.468841] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 363.469394] BUG: unable to handle kernel paging request at ffffffff86ab96c0
[ 363.469966] IP: [<ffffffff86ab96c0>] default_security_ops+0x0/0x6a0
[ 363.470532] PGD 3202014067 PUD 3202015063 PMD 2f7bb62063 PTE 80000032020b9063
[ 363.471092] Oops: 0011 [#1] SMP
[ 363.471644] Modules linked in: oracleacfs(PO) oracleadvm(PO) oracleoks(PO) seos(POE) twnotify(OE) oracleafd(PO) dm_round_robin dm_service_time dm_multipath sunrpc dell_smbios dell_wmi_descriptor dcdbas skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd wdat_wdt pcspkr ipmi_ssif mgag200 ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm drm_panel_orientation_quirks mei_me lpc_ich i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad b9k_742120(POE) sg cbproxy_cbp_742_20190815(POE) binfmt_misc ip_tables xfs libcrc32c raid1 sr_mod cdrom sd_mod lpfc crc32c_intel nvmet_fc nvmet crc_t10dif ixgbe ahci crct10dif_generic i40e crct10dif_pclmul nvme_fc libahci
[ 363.475152] igb nvme_fabrics megaraid_sas nvme_core libata scsi_transport_fc mdio scsi_tgt ptp crct10dif_common pps_core i2c_algo_bit dca nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod
[ 363.476331] CPU: 11 PID: 132676 Comm: bpftool Kdump: loaded Tainted: P OE ------------ T 3.10.0-957.21.3.el7.x86_64 #1
[ 363.477487] Hardware name: Dell Inc. PowerEdge R740/014X06, BIOS 2.2.11 06/13/2019
[ 363.478093] task: ffff8c9577615140 ti: ffff8c86a4f8c000 task.ti: ffff8c86a4f8c000
[ 363.478699] RIP: 0010:[<ffffffff86ab96c0>] [<ffffffff86ab96c0>] default_security_ops+0x0/0x6a0
[ 363.479324] RSP: 0018:ffff8c86a4f8fea0 EFLAGS: 00010246
[ 363.479953] RAX: ffffffff86ab96c0 RBX: ffff8c86a4f8fed0 RCX: 0000000000000000
[ 363.480591] RDX: 0000000000000048 RSI: ffff8c86a4f8fed0 RDI: 000000000000000b
[ 363.481230] RBP: ffff8c86a4f8fea8 R08: ffff8c86a4f90000 R09: 00000000ffffffff
[ 363.481870] R10: 000000000000351b R11: 0000000000000001 R12: 0000000000000048
[ 363.482507] R13: 00007ffc1e74e410 R14: 000000000000000b R15: 0000000000000048
[ 363.483140] FS: 00007fb2108eb740(0000) GS:ffff8c957bf40000(0000) knlGS:0000000000000000
[ 363.483776] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 363.484410] CR2: ffffffff86ab96c0 CR3: 00000051eaa16000 CR4: 00000000007607e0
[ 363.485045] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 363.485674] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 363.486295] PKRU: 55555554
[ 363.486907] Call Trace:
[ 363.487513] [<ffffffff860fa4fc>] ? security_bpf+0x1c/0x20
[ 363.488120] [<ffffffff85f8ca8e>] SyS_bpf+0xee/0xa60
[ 363.488729] [<ffffffff86575ddb>] system_call_fastpath+0x22/0x27
[ 363.489331] Code: ff ff ff 40 8a ab 86 ff ff ff ff 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <64> 65 66 61 75 6c 74 00 00 00 00 00 00 00 00 00 d0 6a 0f 86 ff
[ 363.490637] RIP [<ffffffff86ab96c0>] default_security_ops+0x0/0x6a0
[ 363.491255] RSP <ffff8c86a4f8fea0>
[ 363.491875] CR2: ffffffff86ab96c0
******************************************************************************
************************ A Summary Of Problems Found *************************
******************************************************************************
-------------------- A list of all +++WARNING+++ messages --------------------
PARTIAL DUMP with size(vmcore) < 25% size(RAM)
There are 2 threads running in their own namespaces
Use 'taskinfo --ns' to get more details
------------------------------------------------------------------------------
** Execution took 74.31s (real) 28.16s (CPU), Child processes: 40.17s
crash7latest>
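The `Code:` line in the oops deserves a second look. The bytes starting at the faulting RIP (the `<64>` marker) are printable ASCII, which fits the CPU having been sent to execute data rather than instructions. A minimal Python sketch, using only the bytes printed in the dmesg above:

```python
# Bytes from the oops "Code:" line, starting at the faulting RIP (<64>).
code_at_rip = "64 65 66 61 75 6c 74 00 00 00 00 00 00 00 00 00"

raw = bytes.fromhex(code_at_rip)                   # spaces are ignored
text = raw.split(b"\x00", 1)[0].decode("ascii")    # up to the first NUL
print(repr(text))
```

The decoded string is `default`. That would fit the `name` field at the start of `default_security_ops`, but this reading is an inference from the symbol name in the oops, not something crashinfo reports.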
Let's look at the rest.
crash7latest> lockup
CPU 12: 0.00 sec behind by 0xffff8c37370fd140, swapper/12 [N:120] (0 in queue)
[[[ ... ]]]
CPU 11: 0.02 sec behind by 0xffff8c9577615140, bpftool [N:120] (1 in queue)
crash7latest> meminfo
======================================================================
[ RSS usage ] [ Process name ]
======================================================================
1 GiB ( 1363068 KiB) java
232 MiB ( 237736 KiB) ocssd.bin
164 MiB ( 168748 KiB) osysmond.bin
151 MiB ( 155396 KiB) b9daemon
144 MiB ( 148096 KiB) cssdagent
141 MiB ( 145388 KiB) cssdmonitor
137 MiB ( 140988 KiB) splunkd
129 MiB ( 132496 KiB) oraagent.bin
98 MiB ( 100612 KiB) orarootagent.bi
91 MiB ( 94192 KiB) ora_mmon_ttdsp2
======================================================================
Total memory usage from user-space = 11.23 GiB
** Execution took 0.20s (real) 0.16s (CPU)
crash7latest> syscallinfo
0 ffffffff86042540 (T) sys_read fs/read_write.c: 569
1 ffffffffc04ffb00 (t) twnotify_sys_write [twnotify]
2 ffffffffc0a75ec0 (t) my_open [seos]
3 ffffffffc0500ee0 (t) twnotify_sys_close [twnotify]
4 ffffffffc0a74f30 (t) my_stat [seos]
5 ffffffffc0a75430 (t) my_fstat [seos]
6 ffffffffc0a75130 (t) my_lstat [seos]
18 ffffffffc04ffa80 (t) twnotify_sys_pwrite64 [twnotify]
20 ffffffffc04ffb80 (t) twnotify_sys_writev [twnotify]
321 ffffffff85f8c9a0 (T) sys_bpf kernel/bpf/syscall.c: 1694
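The syscallinfo output shows several syscall table entries pointing into modules (`twnotify`, `seos`) rather than kernel text. As a rough sketch, a few of the lines above can be filtered in Python to list only the hooked entries. The regular expression and the helper name `hooked_syscalls` are my own illustration and assume the line layout printed above.

```python
import re

# A few lines copied from the syscallinfo output above.
SYSCALLINFO = """\
0 ffffffff86042540 (T) sys_read fs/read_write.c: 569
1 ffffffffc04ffb00 (t) twnotify_sys_write [twnotify]
2 ffffffffc0a75ec0 (t) my_open [seos]
321 ffffffff85f8c9a0 (T) sys_bpf kernel/bpf/syscall.c: 1694
"""

# number, address, symbol type, handler name, optional trailing [module]
LINE_RE = re.compile(
    r"^(\d+)\s+([0-9a-f]{16})\s+\((\w)\)\s+(\S+)(?:\s+\[(\w+)\])?"
)

def hooked_syscalls(text):
    """Return (nr, handler, module) for entries whose handler is in a module."""
    hooks = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(5):
            hooks.append((int(m.group(1)), m.group(4), m.group(5)))
    return hooks

for nr, handler, module in hooked_syscalls(SYSCALLINFO):
    print(f"syscall {nr}: {handler} [{module}]")
```

In this sample, `sys_read` and `sys_bpf` still resolve to kernel text, so the bpf syscall entry itself was not replaced.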
That covers a general pass over the automated tools the extension provides.
Now let's get into the actual backtrace.
crash7latest> bt -l
PID: 132676 TASK: ffff8c9577615140 CPU: 11 COMMAND: "bpftool"
#0 [ffff8c86a4f8fb30] machine_kexec at ffffffff85e63934
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/kernel/machine_kexec_64.c: 333
#1 [ffff8c86a4f8fb90] __crash_kexec at ffffffff85f1d162
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/kexec_core.c: 891
#2 [ffff8c86a4f8fc60] crash_kexec at ffffffff85f1d250
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/include/asm/atomic.h: 37
#3 [ffff8c86a4f8fc78] oops_end at ffffffff8656d778
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/kernel/dumpstack.c: 201
#4 [ffff8c86a4f8fca0] no_context at ffffffff8655bdbe
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/mm/fault.c: 743
#5 [ffff8c86a4f8fcf0] __bad_area_nosemaphore at ffffffff8655be55
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/mm/fault.c: 823
#6 [ffff8c86a4f8fd40] bad_area_nosemaphore at ffffffff8655bfc6
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/mm/fault.c: 831
#7 [ffff8c86a4f8fd50] __do_page_fault at ffffffff865706d0
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/mm/fault.c: 1331
#8 [ffff8c86a4f8fdc0] do_page_fault at ffffffff86570925
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/mm/fault.c: 1341
#9 [ffff8c86a4f8fdf0] page_fault at ffffffff8656c768
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/kernel/entry_64.S: 1419
[exception RIP: default_security_ops]
RIP: ffffffff86ab96c0 RSP: ffff8c86a4f8fea0 RFLAGS: 00010246
RAX: ffffffff86ab96c0 RBX: ffff8c86a4f8fed0 RCX: 0000000000000000
RDX: 0000000000000048 RSI: ffff8c86a4f8fed0 RDI: 000000000000000b
RBP: ffff8c86a4f8fea8 R8: ffff8c86a4f90000 R9: 00000000ffffffff
R10: 000000000000351b R11: 0000000000000001 R12: 0000000000000048
R13: 00007ffc1e74e410 R14: 000000000000000b R15: 0000000000000048
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff8c86a4f8fea0] security_bpf at ffffffff860fa4fc
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1554
#11 [ffff8c86a4f8feb0] sys_bpf at ffffffff85f8ca8e
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1718
#12 [ffff8c86a4f8ff50] system_call_fastpath at ffffffff86575ddb
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/kernel/entry_64.S: 503
RIP: 00007fb20fdd3349 RSP: 00007ffc1e74e468 RFLAGS: 00010202
RAX: 0000000000000141 RBX: 00007ffc1e74e5b8 RCX: 00000000ffffffff
RDX: 0000000000000048 RSI: 00007ffc1e74e410 RDI: 000000000000000b
RBP: 00007ffc1e74e3f0 R8: 0000000000000000 R9: 00007ffc1e74e410
R10: ffffffffffffffff R11: 0000000000000206 R12: 00007ffc1e74e48c
R13: 00007ffc1e74e5a0 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: 0000000000000141 CS: 0033 SS: 002b
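The user-mode register frame at the bottom of the backtrace carries the syscall arguments. A small sketch of decoding them in Python follows. The command names are the upstream `enum bpf_cmd` values from `include/uapi/linux/bpf.h`; that the RHEL 7 backport keeps the same numbering is an assumption on my part.

```python
# Upstream enum bpf_cmd, in order starting from 0 (assumed unchanged in
# the RHEL 7 backport).
BPF_CMD = [
    "BPF_MAP_CREATE", "BPF_MAP_LOOKUP_ELEM", "BPF_MAP_UPDATE_ELEM",
    "BPF_MAP_DELETE_ELEM", "BPF_MAP_GET_NEXT_KEY", "BPF_PROG_LOAD",
    "BPF_OBJ_PIN", "BPF_OBJ_GET", "BPF_PROG_ATTACH", "BPF_PROG_DETACH",
    "BPF_PROG_TEST_RUN", "BPF_PROG_GET_NEXT_ID", "BPF_MAP_GET_NEXT_ID",
]

# Values from the user-mode frame: ORIG_RAX, RDI (cmd), RDX (size).
orig_rax, rdi, rdx = 0x141, 0xb, 0x48

syscall_nr = orig_rax
cmd_name = BPF_CMD[rdi]
attr_size = rdx

print("syscall nr:", syscall_nr)
print("cmd:", cmd_name)
print("attr size:", attr_size)
```

Syscall 321 matches the `sys_bpf` entry in the syscallinfo output, and 0x48 matches the size compared against in the `sys_bpf` disassembly. Under the stated assumption the command is `BPF_PROG_GET_NEXT_ID`, which a program-listing request would issue; the dump does not confirm the exact bpftool subcommand.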
Let's take a quick look at the task/thread information.
crash7latest> task_struct.thread ffff8c9577615140
thread = {
tls_array = {{
{
{
a = 0,
b = 0
},
{
limit0 = 0,
base0 = 0,
base1 = 0,
type = 0,
s = 0,
dpl = 0,
p = 0,
limit = 0,
avl = 0,
l = 0,
d = 0,
g = 0,
base2 = 0
}
}
}, {
{
{
a = 0,
b = 0
....
ptrace_bps = {0x0, 0x0, 0x0, 0x0},
debugreg6 = 0,
ptrace_dr7 = 0,
cr2 = 18446744071673976512,
trap_nr = 14,
error_code = 17,
fpu = {
last_cpu = 11,
has_fpu = 1,
state = 0xffff8c9573be0000
},
io_bitmap_ptr = 0x0,
iopl = 0,
io_bitmap_max = 0
}
Nothing special here. I tried to check the arguments passed to bpftool, but only the executable name was visible.
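The `cr2`, `trap_nr`, and `error_code` fields in the thread struct are printed in decimal, so it is easy to miss that they match the oops. A minimal Python sketch of decoding them, using the x86 page-fault error code bit layout (the short bit names are my own labels):

```python
# x86 page-fault error code bits.
PF_BITS = [
    (0, "PROT"),    # set: protection violation, clear: page not present
    (1, "WRITE"),   # set: write access
    (2, "USER"),    # set: fault happened in user mode
    (3, "RSVD"),    # set: reserved bit set in a paging entry
    (4, "INSTR"),   # set: instruction fetch
    (5, "PK"),      # set: protection-key violation
]

def decode_pf_error(code):
    """Return the names of the bits set in a page-fault error code."""
    return [name for bit, name in PF_BITS if code & (1 << bit)]

# Values from task_struct.thread above.
cr2, trap_nr, error_code = 18446744071673976512, 14, 17

print(hex(cr2))
print(trap_nr)
print(decode_pf_error(error_code))
```

`cr2` is `0xffffffff86ab96c0`, the same address as in the panic string; trap 14 is the page-fault vector; and error code 17 (`0x11`, the `Oops: 0011` in dmesg) is a protection violation on an instruction fetch in kernel mode, which is what the "kernel tried to execute NX-protected page" message reports.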
crash7latest> edis -rg ffffffff85f8ca8e
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1694
0xffffffff85f8c9a0 <sys_bpf>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffff85f8c9a5 <sys_bpf+5>: push %rbp ; 0x0000000000000000
||||| 0xffffffff85f8ca42 <sys_bpf+162>: nopw 0x0(%rax,%rax,1)
||||| /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1711
||+==>0xffffffff85f8ca48 <sys_bpf+168>: cmp $0x48,%r12d
|| || 0xffffffff85f8ca4c <sys_bpf+172>: mov $0x48,%r15d
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/include/linux/thread_info.h: 181
|| || 0xffffffff85f8ca52 <sys_bpf+178>: mov %rbx,%rdi
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1711
|| || 0xffffffff85f8ca55 <sys_bpf+181>: cmovbe %r12d,%r15d
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/include/linux/thread_info.h: 181
|| || 0xffffffff85f8ca59 <sys_bpf+185>: xor %edx,%edx
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1714
|| || 0xffffffff85f8ca5b <sys_bpf+187>: mov %r15d,%esi
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/include/linux/thread_info.h: 181
|| || 0xffffffff85f8ca5e <sys_bpf+190>: callq 0xffffffff8603e660 <__check_object_size>
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/arch/x86/include/asm/uaccess_64.h: 80
|| || 0xffffffff85f8ca63 <sys_bpf+195>: mov %r15d,%edx
|| || 0xffffffff85f8ca66 <sys_bpf+198>: mov %r13,%rsi
|| || 0xffffffff85f8ca69 <sys_bpf+201>: mov %rbx,%rdi
|| || 0xffffffff85f8ca6c <sys_bpf+204>: callq 0xffffffff861847c0 <_copy_from_user>
|| || 0xffffffff85f8ca71 <sys_bpf+209>: mov %rax,%rdx
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1715
|| || 0xffffffff85f8ca74 <sys_bpf+212>: mov $0xfffffffffffffff2,%rax
|| || /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1714
|| || 0xffffffff85f8ca7b <sys_bpf+219>: test %rdx,%rdx
|| |+*0xffffffff85f8ca7e <sys_bpf+222>: jne 0xffffffff85f8ca1f <sys_bpf+127>
|| | /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1717
|| | 0xffffffff85f8ca80 <sys_bpf+224>: mov %r15d,%edx
|| | 0xffffffff85f8ca83 <sys_bpf+227>: mov %rbx,%rsi
|| | 0xffffffff85f8ca86 <sys_bpf+230>: mov %r14d,%edi
|| | 0xffffffff85f8ca89 <sys_bpf+233>: callq 0xffffffff860fa4e0 <security_bpf>
|| | /usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/kernel/bpf/syscall.c: 1718
|| | 0xffffffff85f8ca8e <sys_bpf+238>: test %eax,%eax
|| |
1717 err = security_bpf(cmd, &attr, size);
1718 if (err < 0)
1719 return err;
1720
This is edis, a pykdump extension command written by Sungju; it neatly lays out the disassembly grouped by source line, with the branch links drawn in.
Let's look further.
crash7latest> edis -rg ffffffff860fa4fc
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1552
0xffffffff860fa4e0 <security_bpf>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1553
0xffffffff860fa4e5 <security_bpf+5>: mov 0xf2a414(%rip),%rax # 0xffffffff87024900
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1552
0xffffffff860fa4ec <security_bpf+12>: push %rbp ; 0xffff8c86a4f8ff48
0xffffffff860fa4ed <security_bpf+13>: mov %rsp,%rbp
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1553
0xffffffff860fa4f0 <security_bpf+16>: mov 0x650(%rax),%rax
0xffffffff860fa4f7 <security_bpf+23>: callq 0xffffffff86186ff0 <__x86_indirect_thunk_rax>
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1554
0xffffffff860fa4fc <security_bpf+28>: pop %rbp
1551 int security_bpf(int cmd, union bpf_attr *attr, unsigned int size)
1552 {
1553 return security_ops->bpf(cmd, attr, size);
1554 }
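The two `mov` instructions in `security_bpf` are the compiled form of `security_ops->bpf`: a RIP-relative load of the `security_ops` pointer, then a load from offset `0x650` inside the struct it points to. A short Python sketch of the address arithmetic, using only values from the disassembly above:

```python
MASK64 = (1 << 64) - 1

# mov 0xf2a414(%rip),%rax : RIP-relative displacements are applied to the
# address of the NEXT instruction (security_bpf+12).
next_insn = 0xffffffff860fa4ec
disp = 0xf2a414
security_ops_var = (next_insn + disp) & MASK64

# mov 0x650(%rax),%rax : offset of the bpf hook inside the struct.
BPF_HOOK_OFFSET = 0x650

def bpf_slot_address(struct_base):
    """Address of the ->bpf function pointer for a given struct base."""
    return (struct_base + BPF_HOOK_OFFSET) & MASK64

print(hex(security_ops_var))
```

The computed address agrees with the `# 0xffffffff87024900` annotation that edis prints next to the instruction. Whatever value sits at offset `0x650` of the active struct is what `__x86_indirect_thunk_rax` then jumps to.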
Let's check security_bpf.
crash7latest> security_bpf
security_bpf = $1 =
{int (int, union bpf_attr *, unsigned int)} 0xffffffff860fa4e0
crash7latest> kmem 0xffffffff860fa4e0
ffffffff860fa4e0
(T) security_bpf
/usr/src/debug/kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/security/security.c: 1552
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea1ec805be80 32016fa000 0 0 1 6fffff00000400 reserved
crash7latest> ptov 32016fa000
VIRTUAL PHYSICAL
ffff8c68016fa000 32016fa000
crash7latest> vtop ffff8c68016fa000
VIRTUAL PHYSICAL
ffff8c68016fa000 32016fa000
PGD DIRECTORY: ffffffff86a10000
PAGE DIRECTORY: 3202652067
PUD: 3202652d00 => 30775c5063
PMD: 30775c5058 => 80000032016000e1
PAGE: 3201600000 (2MB)
PTE PHYSICAL FLAGS
80000032016000e1 3201600000 (PRESENT|ACCESSED|DIRTY|PSE|NX)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea1ec805be80 32016fa000 0 0 1 6fffff00000400 reserved
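Note that the vtop above was run on the direct-map alias returned by ptov, while the PTE printed in the oops belongs to the faulting address itself. Both entries can be decoded the same way. The sketch below is a simplified decoder covering only the common x86-64 flag bits; the flag table and helper names are my own illustration.

```python
# Common x86-64 paging-entry flag bits. Bit 7 means PSE (large page) only
# in PMD/PUD entries; in a 4 KiB PTE the same bit is PAT.
ENTRY_FLAGS = {
    0: "PRESENT", 1: "RW", 2: "USER", 3: "PWT", 4: "PCD",
    5: "ACCESSED", 6: "DIRTY", 7: "PSE", 8: "GLOBAL", 63: "NX",
}
PHYS_MASK = 0x000FFFFFFFFFF000   # bits 12..51

def decode_entry(entry):
    """Return (physical base, list of flag names) for a paging entry."""
    flags = [name for bit, name in sorted(ENTRY_FLAGS.items())
             if (entry >> bit) & 1]
    return entry & PHYS_MASK, flags

pmd_phys, pmd_flags = decode_entry(0x80000032016000e1)  # from vtop above
pte_phys, pte_flags = decode_entry(0x80000032020b9063)  # from the oops

print(hex(pmd_phys), pmd_flags)
print(hex(pte_phys), pte_flags)
```

The first result reproduces the flags crash prints for the 2MB direct-map page. The second shows the page holding `default_security_ops` mapped present, writable, and NX, which is how a data page would normally be mapped.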
We need to check the value of security_ops->bpf.
According to the code, this is a struct security_operations.
crash7latest> security_operations 0xffffffffc01d24c0
struct security_operations {
name = "cbstub\000\000\000\000",
ptrace_access_check = 0xffffffffc01cf764,
ptrace_traceme = 0xffffffffc01cf82a,
capget = 0xffffffff860f6c30,
capset = 0xffffffff860f6c60,
capable = 0xffffffff860f6a30,
quotactl = 0xffffffff860fa5d0,
quota_on = 0xffffffff860fa5e0,
syslog = 0xffffffff860fa5c0,
settime = 0xffffffff860f6ab0,
vm_enough_memory = 0xffffffff860f81a0,
bprm_set_creds = 0xffffffffc01cec0d,
bprm_check_security = 0xffffffff860fa5f0,
bprm_secureexec = 0xffffffff860f7a30,
bprm_committing_creds = 0xffffffff860fa600,
bprm_committed_creds = 0xffffffff860fa610,
....
audit_rule_free = 0xffffffff860fb110,
bpf = 0xffffffff86ab96c0,
bpf_map = 0xffffffff87024900,
bpf_prog = 0x0,
bpf_map_alloc_security = 0xffffffffc01d2b28,
bpf_map_free_security = 0xffffffffc01d2b28,
bpf_prog_alloc_security = 0x0,
bpf_prog_free_security = 0xffffffffc01d2b40
}
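I went through the values but did not see anything that should obviously cause a panic.
Even if the return value drops below 0, this code has no reason to panic,
and NX protection was being triggered even though nothing here should trigger it.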
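One mechanical cross-check is possible with values already printed above. The sketch below assumes the struct dumped at `0xffffffffc01d24c0` is the one `security_ops` pointed to at crash time (the transcript does not show that step), that it has the same size as `default_security_ops` (`0x6a0`), and that the trailing bpf hooks are contiguous 8-byte pointers starting at offset `0x650`. The helper `describe` is my own illustration.

```python
BASE = 0xffffffffc01d24c0              # struct address given to crash above
STRUCT_SIZE = 0x6a0                    # from "default_security_ops+0x0/0x6a0"
BPF_OFFSET = 0x650                     # from "mov 0x650(%rax),%rax"
FAULT_RIP = 0xffffffff86ab96c0         # RIP and CR2 in the oops
SECURITY_OPS_VAR = 0xffffffff87024900  # RIP-relative load target

# Trailing hook fields exactly as printed, in order.
TRAILING_HOOKS = [
    ("bpf", 0xffffffff86ab96c0),
    ("bpf_map", 0xffffffff87024900),
    ("bpf_prog", 0x0),
    ("bpf_map_alloc_security", 0xffffffffc01d2b28),
    ("bpf_map_free_security", 0xffffffffc01d2b28),
    ("bpf_prog_alloc_security", 0x0),
    ("bpf_prog_free_security", 0xffffffffc01d2b40),
]

def describe(value):
    """Classify a hook value against addresses seen elsewhere in the dump."""
    if value == 0:
        return "NULL"
    if value == FAULT_RIP:
        return "equals faulting RIP"
    if value == SECURITY_OPS_VAR:
        return "equals address loaded in security_bpf"
    if BASE <= value < BASE + STRUCT_SIZE:
        return "points into this struct at +0x%x" % (value - BASE)
    return "other"

for i, (name, value) in enumerate(TRAILING_HOOKS):
    slot = BPF_OFFSET + 8 * i          # assumed contiguous 8-byte pointers
    print("+0x%03x %-24s %#018x  %s" % (slot, name, value, describe(value)))
```

Under these assumptions, the `bpf` slot holds exactly the address the CPU faulted on, `bpf_map` holds the address of the `security_ops` variable, and two later slots hold addresses inside the struct itself rather than in any text section. Whether this means the module's table was built against a layout without the bpf hooks is a hypothesis only and was not verified here.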
I checked whether kernels newer than this one carry any relevant patch, but the code was unchanged.
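Interestingly, the bpftool invocation above was a command run by systemtab while collecting an sosreport.
I built a test machine and tried to reproduce it, but the collection completed
normally with no problems, and the values were normal as well.
Also, if the bpftool command is not installed, sosreport does not run it.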
As seen above, the customer was using several 3rd-party modules such as seos,
and things like filesystem permissions were being managed by them,
so the security modules were the natural suspects.
Unfortunately, eBPF is provided as a Technical Preview in OL/RHEL 7,
so no deeper investigation or patch is offered.
Newer kernels (Kernel 4.x, UEK4/5) removed security_ops altogether
and handle this in a different way instead of going through this security check,
so carrying that approach over was not practical either.
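In the end, as a workaround, I advised the customer to check whether any 3rd-party solution uses bpftool
and, if it is neither used nor needed,
to remove the package so that it is no longer run.
Since the bpftool package was removed, the system appears to have been running without problems.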
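** Notes : To be honest, distinguished developers such as Herbert and Todd could not be bothered to even glance at
the core dump because this is a Technical Preview, so it took quite some effort to confirm all this, lol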