Configuring and Reading Linux Core Dumps
nxdong October 26, 2022 [linux] #coredump
Linux can save a program's memory state at the moment it crashes, which makes later debugging easier.
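Core dump signals
Linux assigns a default action to every signal.
Run man 7 signal, or see the online manual page, to find out which signals trigger a core dump. The possible default actions are: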
- Term Default action is to terminate the process.
- Ign Default action is to ignore the signal.
- Core Default action is to terminate the process and dump core (see core(5)).
- Stop Default action is to stop the process.
- Cont Default action is to continue the process if it is currently stopped.
| Signal | Standard | Action | Comment |
|---|---|---|---|
| SIGABRT | P1990 | Core | Abort signal from abort(3) |
| SIGALRM | P1990 | Term | Timer signal from alarm(2) |
| SIGBUS | P2001 | Core | Bus error (bad memory access) |
| SIGCHLD | P1990 | Ign | Child stopped or terminated |
| SIGCLD | - | Ign | A synonym for SIGCHLD |
| SIGCONT | P1990 | Cont | Continue if stopped |
| SIGEMT | - | Term | Emulator trap |
| SIGFPE | P1990 | Core | Floating-point exception |
| SIGHUP | P1990 | Term | Hangup detected on controlling terminal or death of controlling process |
| SIGILL | P1990 | Core | Illegal Instruction |
| SIGINFO | - | - | A synonym for SIGPWR |
| SIGINT | P1990 | Term | Interrupt from keyboard |
| SIGIO | - | Term | I/O now possible (4.2BSD) |
| SIGIOT | - | Core | IOT trap. A synonym for SIGABRT |
| SIGKILL | P1990 | Term | Kill signal |
| SIGLOST | - | Term | File lock lost (unused) |
| SIGPIPE | P1990 | Term | Broken pipe: write to pipe with no readers; see pipe(7) |
| SIGPOLL | P2001 | Term | Pollable event (Sys V); synonym for SIGIO |
| SIGPROF | P2001 | Term | Profiling timer expired |
| SIGPWR | - | Term | Power failure (System V) |
| SIGQUIT | P1990 | Core | Quit from keyboard |
| SIGSEGV | P1990 | Core | Invalid memory reference |
| SIGSTKFLT | - | Term | Stack fault on coprocessor (unused) |
| SIGSTOP | P1990 | Stop | Stop process |
| SIGTSTP | P1990 | Stop | Stop typed at terminal |
| SIGSYS | P2001 | Core | Bad system call (SVr4); see also seccomp(2) |
| SIGTERM | P1990 | Term | Termination signal |
| SIGTRAP | P2001 | Core | Trace/breakpoint trap |
| SIGTTIN | P1990 | Stop | Terminal input for background process |
| SIGTTOU | P1990 | Stop | Terminal output for background process |
| SIGUNUSED | - | Core | Synonymous with SIGSYS |
| SIGURG | P2001 | Ign | Urgent condition on socket (4.2BSD) |
| SIGUSR1 | P1990 | Term | User-defined signal 1 |
| SIGUSR2 | P1990 | Term | User-defined signal 2 |
| SIGVTALRM | P2001 | Term | Virtual alarm clock (4.2BSD) |
| SIGXCPU | P2001 | Core | CPU time limit exceeded (4.2BSD); see setrlimit(2) |
| SIGXFSZ | P2001 | Core | File size limit exceeded (4.2BSD); see setrlimit(2) |
| SIGWINCH | - | Ign | Window resize signal (4.3BSD, Sun) |
Every signal whose Action in the table above is Core triggers a core dump by default.
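The table can be turned into a quick lookup. The sketch below uses a hypothetical helper, is_core_signal, that simply hard-codes the Core rows from the table above. It does not query the kernel, so treat it as an illustration of the table rather than an authoritative check.

```shell
#!/bin/sh
# is_core_signal NAME: succeed if NAME's default action in the table is "Core".
# The list is copied from the signal(7) table above; the helper name is made up.
is_core_signal() {
    case "$1" in
        SIGABRT|SIGBUS|SIGFPE|SIGILL|SIGIOT|SIGQUIT) return 0 ;;
        SIGSEGV|SIGSYS|SIGTRAP|SIGUNUSED|SIGXCPU|SIGXFSZ) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify a few sample signals.
for sig in SIGSEGV SIGTERM SIGFPE; do
    if is_core_signal "$sig"; then
        echo "$sig: dumps core by default"
    else
        echo "$sig: does not dump core by default"
    fi
done
```

A process that installs its own handler for one of these signals, or ignores it, changes this behavior; the table only describes the defaults.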
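Enabling core dumps
Check whether core dumps are enabled by running ulimit -c.
In my current setup it returns 0, which means no core dump file will be written.
To enable core dumps with no size limit, run ulimit -c unlimited.
This setting only affects the current terminal session and is gone after you reconnect.
Making it persistent requires additional configuration.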
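As a rough sketch of the check-then-enable sequence, the snippet below interprets the value reported by ulimit -c and then tries to raise the soft limit for the current shell. The helper name describe_core_limit is hypothetical, and raising the limit can fail when the hard limit is lower, so the failure is reported instead of hidden.

```shell
#!/bin/sh
# describe_core_limit VALUE: interpret a value printed by `ulimit -c`.
describe_core_limit() {
    case "$1" in
        0)         echo "disabled" ;;
        unlimited) echo "enabled, no size limit" ;;
        *)         echo "enabled, limited to $1 blocks" ;;
    esac
}

# Report the current soft limit of this shell.
describe_core_limit "$(ulimit -S -c)"

# Raise the soft limit for this shell session only. This cannot exceed the
# hard limit (ulimit -H -c), so handle the failure case explicitly.
if ulimit -S -c unlimited 2>/dev/null; then
    describe_core_limit "$(ulimit -S -c)"
else
    echo "could not raise the core limit; hard limit is $(ulimit -H -c)" >&2
fi
```

Programs started from this shell afterwards inherit the new limit; other sessions are unaffected.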
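Naming the core dump file
View the current core file name pattern with cat /proc/sys/kernel/core_pattern.
On my machine this prints core.
So in my experiments, the core dump file written after a crash is named core.
Run man 5 core, or see the online manual page, for the settings that are available.
Its section Naming of core dump files lists the following specifiers:
| Specifier | Description |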
|---|---|
| %% | A single % character. |
| %c | Core file size soft resource limit of crashing process (since Linux 2.6.24). |
| %d | Dump mode—same as value returned by prctl(2) PR_GET_DUMPABLE (since Linux 3.7). |
| %e | The process or thread's comm value, which typically is the same as the executable filename (without path prefix, and truncated to a maximum of 15 characters), but may have been modified to be something different; see the discussion of /proc/[pid]/comm and /proc/[pid]/task/[tid]/comm in proc(5). |
| %E | Pathname of executable, with slashes ('/') replaced by exclamation marks ('!') (since Linux 3.0). |
| %g | Numeric real GID of dumped process. |
| %h | Hostname (same as nodename returned by uname(2)). |
| %i | TID of thread that triggered core dump, as seen in the PID namespace in which the thread resides (since Linux 3.18). |
| %I | TID of thread that triggered core dump, as seen in the initial PID namespace (since Linux 3.18). |
| %p | PID of dumped process, as seen in the PID namespace in which the process resides. |
| %P | PID of dumped process, as seen in the initial PID namespace (since Linux 3.12). |
| %s | Number of signal causing dump. |
| %t | Time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC). |
| %u | Numeric real UID of dumped process. |
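The resulting file name cannot exceed 128 bytes.
Suppose we want core dump files named core.PID.SIGNAL.TIME (process ID, signal number, Unix time) to be written to /tmp/corefile/. The directory must be created beforehand. If the files should simply go to the crashing process's current working directory, omit the path.
To do this, run echo /tmp/corefile/core.%p.%s.%t > /proc/sys/kernel/core_pattern from a root shell. Prefixing the command with sudo is not enough, because the redirection is performed by the unprivileged calling shell rather than by sudo.
After that, a crash produces a file such as /tmp/corefile/core.1803.8.1666772279. The name shows that this program performed an invalid arithmetic operation, which raised SIGFPE (signal 8).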
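To make the mapping from pattern to file name concrete, here is a small sketch that expands a pattern the way the example above describes. The function expand_core_pattern is a hypothetical helper that handles only %p, %s, and %t; the kernel supports the full list of specifiers in the table, including %%, which this sketch does not treat specially.

```shell
#!/bin/sh
# expand_core_pattern PATTERN PID SIGNAL TIME
# Substitutes only %p, %s, and %t. This is an illustration, not the kernel's
# own expansion logic.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%p/$2/g" -e "s/%s/$3/g" -e "s/%t/$4/g"
}

# Reproduce the file name from the example above.
expand_core_pattern '/tmp/corefile/core.%p.%s.%t' 1803 8 1666772279
# prints /tmp/corefile/core.1803.8.1666772279

# Show the pattern the kernel is currently using, if the file is readable.
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi
```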
A core file name pattern set this way is lost on reboot.
To make it persistent, configure it in /etc/sysctl.conf.
Append the following line to the end of /etc/sysctl.conf:
kernel.core_pattern=/tmp/corefile/core.%p.%s.%t
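Save the file, then run sysctl -p to apply it.
Verify that the setting took effect with cat /proc/sys/kernel/core_pattern.
A process can also configure its own core file size limit through setrlimit with RLIMIT_CORE.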
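From the shell, the ulimit builtin is the interface to this same per-process limit. The sketch below uses a hypothetical wrapper, run_without_core, to lower the soft limit inside a subshell so that only the wrapped command is affected.

```shell
#!/bin/sh
# run_without_core CMD [ARGS...]: run CMD with the soft core limit set to 0.
# The subshell keeps the change from leaking into the calling shell.
run_without_core() {
    ( ulimit -S -c 0 && "$@" )
}

# The child shell inherits the lowered limit and reports it.
run_without_core sh -c 'ulimit -S -c'
# prints 0
```

Lowering a soft limit is always permitted, which is why this direction needs no error handling; raising it again is bounded by the hard limit.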
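Test program for core dumps
Note: to make debugging easier, the core file name pattern used here omits the directory, so core dump files are written to the process's current working directory, which is the directory containing the executable when the program is run from there.
The corresponding command: echo core.%p.%s.%t > /proc/sys/kernel/core_pattern
The contents of coredump_test.c are as follows: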
[Listing of coredump_test.c not preserved: two void functions and an int main.]
Compile: gcc coredump_test.c
Run it to generate core dump files. The steps of the session were:
- Run branch 1; a core dump file is generated.
- Run branch 2.
- Check the file sizes.
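As the sizes show, the core file also contains a dump of the program's memory at the time of the crash.
This can produce huge core dump files, so it can be worth filtering what gets dumped.
Using the core dump file
A core dump file can be loaded with gdb by passing the executable followed by the core file: gdb EXECUTABLE COREFILE.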
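For a quick look at where a program crashed, gdb can also be run non-interactively. The following is a minimal sketch; core_backtrace is a hypothetical wrapper, and the file names in the commented example are assumptions based on gcc's default output name and the core.%p.%s.%t pattern used earlier.

```shell
#!/bin/sh
# core_backtrace EXECUTABLE COREFILE: print the backtrace stored in a core file.
# -batch makes gdb exit after running the commands given with -ex.
core_backtrace() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "usage: core_backtrace EXECUTABLE COREFILE" >&2
        return 1
    fi
    gdb -batch -ex bt "$1" "$2"
}

# Example with assumed file names:
# core_backtrace ./a.out core.1803.8.1666772279
```

Compiling the program with gcc -g gives gdb the debug information it needs to show source file names and line numbers in the backtrace.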
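Filtering the dump contents
As seen above, a core dump includes the process's memory by default, which can produce huge core files.
The dump contents therefore need to be filtered.
For details, see man 5 core or the online core manual page, section Controlling which mappings are written to the core dump.
| Bit | Description |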
|---|---|
| bit 0 | Dump anonymous private mappings. |
| bit 1 | Dump anonymous shared mappings. |
| bit 2 | Dump file-backed private mappings. |
| bit 3 | Dump file-backed shared mappings. |
| bit 4 | (since Linux 2.6.24) Dump ELF headers. |
| bit 5 | (since Linux 2.6.28) Dump private huge pages. |
| bit 6 | (since Linux 2.6.28) Dump shared huge pages. |
| bit 7 | (since Linux 4.4) Dump private DAX pages. |
| bit 8 | (since Linux 4.4) Dump shared DAX pages. |
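By default, bits 0, 1, 4, and 5 are set.
You can check this with cat /proc/self/coredump_filter.
On my machine it prints 00000033. The value is shown in hexadecimal; in binary it is 110011, and counting from the right, bits 0, 1, 4, and 5 are exactly the ones set.
Likewise, writing a value to a process's coredump_filter file changes the filter for that process, and child processes inherit the setting.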
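The bit arithmetic above can be checked with a few lines of shell. This sketch uses a hypothetical helper, decode_coredump_filter, which takes the hexadecimal value as printed by the kernel (without a 0x prefix) and lists the bits that are set.

```shell
#!/bin/sh
# decode_coredump_filter HEX: print the set bits (0 through 8) of a
# coredump_filter value given in hexadecimal without a 0x prefix.
decode_coredump_filter() {
    value=$((0x$1))
    bit=0
    set_bits=""
    while [ "$bit" -le 8 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            set_bits="${set_bits:+$set_bits }$bit"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$set_bits"
}

decode_coredump_filter 00000033
# prints 0 1 4 5

# Decode the live value when the file is available.
if [ -r /proc/self/coredump_filter ]; then
    decode_coredump_filter "$(cat /proc/self/coredump_filter)"
fi

# To keep only anonymous private mappings and ELF headers (bits 0 and 4),
# write 0x11 to the shell's own filter; programs started afterwards inherit it.
# echo 0x11 > /proc/self/coredump_filter
```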