The solution which may be accepted by gdb upstream (which is unwilling to have kernel-specific code inside gdb) is to extend gdb's Python bindings so that we can write our own gdb target in Python. Then there's a new library called libkdumpfile, which should enable us to open potentially any dump format (from any architecture). And then you need the glue - the target itself, which enables you to access the data.
Currently it's split into three projects:
    export MYLOCAL=/tmp/mylocal

    git clone -b python-working-target https://github.com/jeffmahoney/gdb-python.git
    pushd gdb-python/
    ./configure --prefix=$MYLOCAL '--enable-targets=x86_64-pc-linux,s390x-linux,s390-linux,ppc64-linux'
    make
    make install
    popd

    git clone https://github.com/ptesarik/libkdumpfile.git
    pushd libkdumpfile
    autoreconf -fi
    ./configure --prefix=$MYLOCAL --with-python
    make
    make install
    popd

    export PYTHONPATH=$MYLOCAL/lib/python2.7/site-packages/:$MYLOCAL/lib64/python2.7/site-packages/
    export LD_LIBRARY_PATH=$MYLOCAL/lib64

    git clone https://github.com/jeffmahoney/crash-python.git
    pushd crash-python
    python setup.py install --prefix $MYLOCAL
    popd
OK, we have it installed; now how do we use it? At /path/to/my/ there's a debuginfo file and a vmcore:
    # export PYTHONPATH=$MYLOCAL/lib/python2.7/site-packages/:$MYLOCAL/lib64/python2.7/site-packages/
    # export LD_LIBRARY_PATH=$MYLOCAL/lib64
    # $MYLOCAL/bin/gdb /path/to/my/vmlinux-3.16.7-29-desktop.debug
    GNU gdb (GDB) 7.10.50.20151210-cvs
    Copyright (C) 2015 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    ...
    (gdb) python from crash.kdump import target
    (gdb) python target.Target("/path/to/my/vmcore")
    (gdb) info threads
      Id   Target Id         Frame
      1    pid 1 "systemd" 0xffffffff8161f172 in context_switch (next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2334
      2    pid 2 "kthreadd" 0xffffffff8161f172 in context_switch (next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2334
      3    pid 3 "ksoftirqd/0" 0xffffffff8161f172 in context_switch (next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2334
      4    pid 4 "kworker/0:0" 0xffffffff8161f172 in context_switch (next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2334
    ...
    (gdb) thread 1
    (gdb) bt f
    #0  0xffffffff8161f172 in context_switch (next=<optimized out>, prev=<optimized out>, rq=<optimized out>) at ../kernel/sched/core.c:2334
            mm = 0x0 <irq_stack_union>
            oldmm = 0xffff880439fb6b20
    #1  __schedule () at ../kernel/sched/core.c:2795
            prev = <unavailable>
            switch_count = <optimized out>
            rq = 0xffff88013a6b4010
    ...
Note that all the kernel-specific functionality is concentrated in one tiny file (in my installation, it's $MYLOCAL/lib/python2.7/site-packages/crash-0.1-py2.7.egg/crash/kdump/target.py). This is expected to grow - see Jeff's work-in-progress branch "crash-wip" of crash-python, or my tiny target accessing s390's dump.
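To make the division of labour more concrete, here is a minimal pure-Python sketch of the idea: one object that knows how to read bytes out of a dump, and a thin "target" that answers memory requests from it. It is not the API of gdb, libkdumpfile, or crash-python; the names ToyDump, ToyTarget, read_memory and read_u64 are all hypothetical and exist only for illustration.

```python
import struct


class ToyDump:
    """Hypothetical stand-in for a dump reader (the role libkdumpfile plays)."""

    def __init__(self, segments):
        # segments: list of (start_address, bytes) pairs held in memory
        self.segments = segments

    def read(self, addr, length):
        # Find the segment that fully contains the requested range.
        for start, data in self.segments:
            if start <= addr and addr + length <= start + len(data):
                offset = addr - start
                return data[offset:offset + length]
        raise LookupError("address 0x%x not present in dump" % addr)


class ToyTarget:
    """Hypothetical glue: serves the debugger's memory reads from the dump."""

    def __init__(self, dump):
        self.dump = dump

    def read_memory(self, addr, length):
        return self.dump.read(addr, length)

    def read_u64(self, addr):
        # Decode a little-endian 64-bit value, as found on x86_64.
        return struct.unpack("<Q", self.read_memory(addr, 8))[0]


# A fake "dump" with two 64-bit values stored at address 0x1000.
dump = ToyDump([(0x1000, struct.pack("<QQ", 0xdeadbeef, 42))])
target = ToyTarget(dump)
print(hex(target.read_u64(0x1000)))  # 0xdeadbeef
print(target.read_u64(0x1008))       # 42
```

The real target additionally has to tell gdb about threads and registers, which is where the kernel-specific knowledge comes in.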
The Linux kernel can be configured to reserve an area of memory for a crash kernel. Once the original kernel panics (i.e. dies), instead of just rebooting, it kexecs into this crash kernel, which has access to the original kernel's memory through /proc/vmcore. The crash kernel runs a kdump tool which saves that memory to a dump - either to a file on disk, or somewhere over the network. The makedumpfile tool can be set to save only the "interesting" pages - for example, it can omit the (usually space-consuming and otherwise unimportant) userspace pages - and to compress the saved ones, which is possible only with some file formats.
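As an illustration of the page filtering, makedumpfile selects what to leave out with a "dump level" (the -d option), which is a bitmask of page types. The sketch below decodes such a level; the bit meanings follow my reading of the makedumpfile man page, so treat them as an assumption and check them against your version. The function name is hypothetical.

```python
# Bit meanings of makedumpfile's dump level (-d), assumed from its man page.
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages without private data",
    4: "cache pages with private data",
    8: "user data pages",
    16: "free pages",
}


def excluded_page_types(dump_level):
    """Return the page types a given dump level leaves out of the dump."""
    if not 0 <= dump_level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    return [name for bit, name in sorted(DUMP_LEVEL_BITS.items())
            if dump_level & bit]


print(excluded_page_types(8))        # only userspace data is omitted
print(len(excluded_page_types(31)))  # all five page types are omitted
```

With these bits, level 0 keeps every page and level 31 produces the smallest dump.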
There are also other ways - like taking the dump from the hypervisor's side (xm dump-core), or just taking, for instance, VMware's VMSS file.
Dumps can be essential for analyzing the cause of a panic (and eventually finding the bug in the kernel), because in contrast to just the Oops message, a dump contains e.g. the contents of the failing process's stack, so we can see what the relevant functions' arguments were.
For more info about obtaining dumps, see the kernel documentation.
For inspecting the dumps, there's currently only one tool available - crash. It's very useful, has many commands for inspecting certain kernel structures/subsystems (network, files, memory, devices, runqueues, ...), and understands many kernel versions and many architectures. However, it has its downsides; above all, these concern me:
It can, but only ELF dumps - which are not useful for present-day machines. Furthermore, it doesn't understand virtual memory and, last but not least, it knows nothing about the Linux kernel.
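To show what "understanding virtual memory" involves, here is a small sketch of the simplest case on x86_64: translating a kernel virtual address to a physical one for the two linearly mapped regions. The constants are assumptions that match kernels of the 3.16 era without address randomization. The function name and the phys_base parameter are hypothetical, and any other address (vmalloc, modules, userspace) would need a real page table walk, which this sketch does not attempt.

```python
# Assumed x86_64 layout for older kernels (e.g. 3.16) without randomization.
PAGE_OFFSET = 0xffff880000000000       # start of the direct mapping of RAM
DIRECT_MAP_END = 0xffffc80000000000    # direct mapping covers 64 TB
START_KERNEL_MAP = 0xffffffff80000000  # start of the kernel text mapping
KERNEL_MAP_END = 0xffffffffa0000000    # assumes a 512 MB kernel image area


def virt_to_phys(vaddr, phys_base=0):
    """Translate a linearly mapped kernel virtual address to physical."""
    if START_KERNEL_MAP <= vaddr < KERNEL_MAP_END:
        # Kernel text and static data; phys_base is the load offset.
        return vaddr - START_KERNEL_MAP + phys_base
    if PAGE_OFFSET <= vaddr < DIRECT_MAP_END:
        # Direct mapping of all physical memory.
        return vaddr - PAGE_OFFSET
    raise ValueError("0x%x needs a page table walk" % vaddr)


# Addresses taken from the gdb session shown earlier.
print(hex(virt_to_phys(0xffff880439fb6b20)))  # 0x439fb6b20
print(hex(virt_to_phys(0xffffffff8161f172)))  # 0x161f172
```

A debugger that only sees physical addresses in the dump cannot follow a pointer such as oldmm without performing at least this kind of translation.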