How to: Investigate a segmentation fault¶
Case 1¶
Problem¶
Segmentation fault when the IOC shell starts:
[iocuser@host:~]$ iocsh st.cmd
...
/epics/base-7.0.6.1/require/4.0.0/bin/iocsh: line 164: 10243 Segmentation fault (core dumped) ${__LOADER__}${EPICS_BASE}/bin/${EPICS_HOST_ARCH}/softIoc${__PVA__} -D ${EPICS_BASE}/dbd/softIoc${__PVA__}.dbd "${IOC_STARTUP}" 2>&1
Solution¶
Run the IOC using gdb
to try to identify the source of the problem.
Since EPICS base 7.0.6.1, require 4.0.0, all e3 environments are built by default with a debug architecture.1
Execute the following commands.
[iocuser@host:~]$ source /epics/base-7.0.6.1/require/4.0.0/bin/setE3Env.bash
[iocuser@host:~]$ export EPICS_HOST_ARCH=linux-x86_64-debug
[iocuser@host:~]$ iocsh -dg st.cmd
The IOC will still segmentation fault, and will return you to the (gdb)
prompt, meaning that you are now in the gdb
environment, and can start to
investigate the cause of the problem.
Note
The following example shows a segmentation fault that occurred in an older version of EPICS base. Your exact output will be different to this.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffcfdfe700 (LWP 18940)]
epicsMutexLock (pmutexNode=0x0) at ../../../src/libCom/osi/epicsMutex.cpp:143
143 epicsMutexOsdLock(pmutexNode->id);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libstdc++-4.8.5-36.el7.x86_64 ncurses-libs-5.9-14.20130511.el7_4.x86_64 readline-6.2-10.el7.x86_64
(gdb)
Check the backtrace ((gdb) bt
) to identify the source code file and line that
caused the problem, and the sequence of function calls that lead to it.
(gdb) bt
#0 epicsMutexLock (pmutexNode=0x0) at ../../../src/libCom/osi/epicsMutex.cpp:143
#1 0x00007ffff796c9d9 in casStatsFetch (pChanCount=0x7fffcfdfdd10, pCircuitCount=0x7fffcfdfdce0) at ../../../src/ioc/rsrv/caservertask.c:1466
#2 0x00007ffff4cf2d08 in scan_time (type=3) at ../devIocStats/devIocStatsAnalog.c:355
#3 0x00007ffff7485bae in epicsTimerForC::expire (this=<optimized out>) at ../../../src/libCom/timer/epicsTimer.cpp:61
#4 0x00007ffff7486ffe in timerQueue::process (this=this@entry=0x2524090, currentTime=...) at ../../../src/libCom/timer/timerQueue.cpp:139
#5 0x00007ffff7487663 in timerQueueActive::run (this=0x2524060) at ../../../src/libCom/timer/timerQueueActive.cpp:93
#6 0x00007ffff7476a19 in epicsThreadCallEntryPoint (pPvt=0x2524130) at ../../../src/libCom/osi/epicsThread.cpp:83
#7 0x00007ffff747c77c in start_routine (arg=0x2524580) at ../../../src/libCom/osi/os/posix/osdThread.c:403
#8 0x00007ffff6639dd5 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ffff694bead in clone () from /lib64/libc.so.6
(gdb) bt full
#0 epicsMutexLock (pmutexNode=0x0) at ../../../src/libCom/osi/epicsMutex.cpp:143
No locals.
#1 0x00007ffff796c9d9 in casStatsFetch (pChanCount=0x7fffcfdfdd10, pCircuitCount=0x7fffcfdfdce0) at ../../../src/ioc/rsrv/caservertask.c:1466
status = <optimized out>
#2 0x00007ffff4cf2d08 in scan_time (type=3) at ../devIocStats/devIocStatsAnalog.c:355
cainfo_clients_local = 0
cainfo_connex_local = 0
#3 0x00007ffff7485bae in epicsTimerForC::expire (this=<optimized out>) at ../../../src/libCom/timer/epicsTimer.cpp:61
No locals.
#4 0x00007ffff7486ffe in timerQueue::process (this=this@entry=0x2524090, currentTime=...) at ../../../src/libCom/timer/timerQueue.cpp:139
unguard = {_guard = <synthetic pointer>, _pTargetMutex = 0x25240c8}
pTmpNotify = 0x25249f8
expStat = {delay = -1.7976931348623157e+308}
guard = {_pTargetMutex = 0x0}
delay = 1.7976931348623157e+308
#5 0x00007ffff7487663 in timerQueueActive::run (this=0x2524060) at ../../../src/libCom/timer/timerQueueActive.cpp:93
delay = <optimized out>
#6 0x00007ffff7476a19 in epicsThreadCallEntryPoint (pPvt=0x2524130) at ../../../src/libCom/osi/epicsThread.cpp:83
pThread = 0x2524130
threadDestroyed = false
#7 0x00007ffff747c77c in start_routine (arg=0x2524580) at ../../../src/libCom/osi/os/posix/osdThread.c:403
pthreadInfo = 0x2524580
status = <optimized out>
blockAllSig = {__val = {18446744067267100671, 18446744073709551615 <repeats 15 times>}}
#8 0x00007ffff6639dd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#9 0x00007ffff694bead in clone () from /lib64/libc.so.6
No symbol table info available.
Check the current thread:
(gdb) info threads
Id Target Id Frame
26 Thread 0x7fffceef6700 (LWP 18948) "scan-0.1" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
25 Thread 0x7fffcf0f7700 (LWP 18947) "scan-0.2" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
24 Thread 0x7fffcf2f8700 (LWP 18946) "scan-0.5" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
23 Thread 0x7fffcf4f9700 (LWP 18945) "scan-1" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
22 Thread 0x7fffcf6fa700 (LWP 18944) "scan-2" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
21 Thread 0x7fffcf8fb700 (LWP 18943) "scan-5" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
20 Thread 0x7fffcfafc700 (LWP 18942) "scan-10" 0x00007ffff663dd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
19 Thread 0x7fffcfcfd700 (LWP 18941) "scanOnce" 0x00007ffff663d965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 18 Thread 0x7fffcfdfe700 (LWP 18940) "timerQueue" epicsMutexLock (pmutexNode=0x0) at ../../../src/libCom/osi/epicsMutex.cpp:143
Quit gdb.
(gdb) quit
A debugging session is active.
Inferior 1 [process 18904] will be killed.
Quit anyway? (y or n) y
- 1
To add the
linux-x86_64-debug
architecture to a local build of e3, add the following line toe3-base/configure/CONFIG_BASE
:E3_CROSS_COMPILER_TARGET_ARCHS+=linux-x86_64-debug