Please Say No to exit!!!

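1. Background

Recently, in the "xxx OTA master" project, I ran into a strange problem: after otamaster caught a SIGSEGV and printed the thread's stack trace, it called exit(1), yet the process did not exit.

After analyzing the symptoms and reading up on the topic, I finally understood the cause. It left me strongly disagreeing with the exit(1) approach used in the current stock version. Calling exit amounts to handing the reclamation of the process's resources over to the system and the runtime, and the outcome of that is not something we can predict.

I once read a sentence from an expert in the book "Programmer's Self-Cultivation: Linking, Loading and Libraries" (程序员的自我修养) that left a deep impression on me.

A mature and excellent programmer should be clear about every single byte while the program is running.

Calling exit clearly goes against this.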

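2. Problem Description and Analysis

2.1 Analysis of the Signal Handling Logic

During initialization, otamaster installs handlers for the exceptional signals through the adm_apc_backtrace interface. The code is as follows: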

C
static void backtrace_signal_handler(int32_t signal)
{
    log_w("signal:%d", signal);

    char buf[10 * 1024] = {0};
    uint64_t buflen = sizeof(buf) / sizeof(buf[0]);
    int32_t len = 0;

    void* array[128] = {0};
    int32_t size = backtrace(array, 128);
    char** strings = (char**)backtrace_symbols(array, size);
    if (NULL != strings) {
        for (int32_t i = 0; i < size; i++) {
            if (NULL != strings[i]) {
                len += sprintf(buf + len, "%s\n", strings[i]);
            }
        }
        log_w("backtrace:%s", buf);
        free(strings);
    }

    char file[64] = {0};
    (void)snprintf(file, sizeof(file) / sizeof(file[0]), "/proc/%d/maps", getpid());
    FILE* fp = fopen(file, "r");
    if (fp != NULL) {
        uint64_t rlen = fread(buf, 1, buflen - 1u, fp);
        if (rlen > 0u) {
            buf[rlen] = '\0';
            log_w("%s\n", buf);
        }
        (void)fclose(fp);
    }

    exit(1);
}

void adm_apc_backtrace(void)
{
    struct sigaction sa;
    (void)memset(&sa, 0, sizeof(sa));
    sa.sa_handler = backtrace_signal_handler;
    (void)sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    do {
        /* Trap SIGSEGV */
        if (sigaction(SIGSEGV, &sa, NULL) == -1) {
            log_w("sigaction(): %s\n", strerror(errno));
            break;
        }
        /* Trap SIGABRT */
        if (sigaction(SIGABRT, &sa, NULL) == -1) {
            log_w("sigaction(): %s\n", strerror(errno));
            break;
        }
        /* Trap SIGPIPE */
        if (sigaction(SIGPIPE, &sa, NULL) == -1) {
            log_w("sigaction(): %s", strerror(errno));
            break;
        }
    } while (false);
}

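Flow analysis:

  1. First, handlers are registered for three exceptional signals: SIGSEGV (segmentation fault), SIGABRT (raised by the abort function), and SIGPIPE (broken pipe).
  2. When any thread raises one of these three signals, the handler backtrace_signal_handler is triggered.
  3. It prints the call stack of the faulting thread. [This stack trace, together with the addr2line tool, can be used to locate the point of failure.]
  4. It prints the memory map of the otamaster process. [So far I have not managed to extract anything useful from it; readers who know more are welcome to add their findings.]
  5. It calls exit to terminate the process.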
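On step 4, there is one use of the maps dump that I am fairly confident about. For a position-independent executable, the absolute addresses in the backtrace only become meaningful to addr2line after subtracting the load base, which is the start address of the module's first mapping (the one with file offset 00000000). Below is a minimal sketch of that arithmetic. The helper names mapping_base and module_offset are hypothetical and are not part of the project.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Parses the start address of one /proc/<pid>/maps line, for example
// "5623eea6f000-5623eed71000 r-xp 00000000 08:01 1835790 /path/otamaster".
// Returns 0 when the line does not begin with a hexadecimal address range.
std::uint64_t mapping_base(const std::string& maps_line) {
    unsigned long long start = 0;
    unsigned long long end = 0;
    if (std::sscanf(maps_line.c_str(), "%llx-%llx", &start, &end) != 2) {
        return 0;
    }
    return static_cast<std::uint64_t>(start);
}

// Converts an absolute address from the backtrace into an offset inside the
// module, which is the form addr2line expects for a relocatable binary.
std::uint64_t module_offset(std::uint64_t absolute, std::uint64_t base) {
    return absolute - base;
}
```

With the values from the log in section 2.2, the first backtrace frame 0x5623eeba1d8f minus the otamaster base 0x5623eea6f000 gives 0x132d8f, which matches the (+0x132d8f) printed in that frame and can be passed to addr2line -e otamaster 0x132d8f.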

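Thoughts

The current way of handling exceptions is seriously flawed. otamaster runs multiple threads, and an exception in any one of them makes the whole process call exit. Handling it this way can break core business logic that happens to be executing at that moment.

Suggestions:

  1. Notify the other threads so that they can pause their work safely
  2. Reclaim resources
  3. Exit the process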
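One way to approach step 1, at least for signals that do not indicate memory corruption (SIGPIPE, or a signal sent from outside as in the kill reproduction in section 2.2), is to let the handler do nothing except record the signal number. The main loop then notices it and runs the orderly shutdown. This is only a sketch under that assumption, and install_shutdown_handler and pending_signal are hypothetical names. It does not apply to a genuine SIGSEGV caused by a bad memory access, because returning from the handler would re-execute the faulting instruction.

```cpp
#include <signal.h>

namespace {
// Written only by the handler. sig_atomic_t is the integer type that can be
// stored safely from inside a signal handler.
volatile sig_atomic_t g_pending_signal = 0;

void note_signal(int signo) {
    g_pending_signal = signo;  // no logging, no locks, no exit() in here
}
}  // namespace

// Installs the recording handler for one signal. Returns true on success.
bool install_shutdown_handler(int signo) {
    struct sigaction sa = {};
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, nullptr) == 0;
}

// Returns the last recorded signal number, or 0 if none has arrived yet.
int pending_signal() {
    return g_pending_signal;
}
```

A main loop could poll pending_signal() between sleeps and, once it is non-zero, ask the worker threads to stop, join them, release resources, and finally return from main. The design choice here is that all real work happens in normal thread context, where logging and locking are safe.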

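2.2 Reproducing the Problem

Steps to reproduce in an x86 environment:

  1. Run the program: otamaster -c ../etc/otamasterConfig.json
  2. Once the program is running stably, look up the process and its PID: ps -ef | grep otamaster
  3. Simulate a segmentation fault: kill -11 18493 [11 is the SIGSEGV signal]
  4. Observe the process state, shown below: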

C
I/COM      [04-10 23:54:45.382321 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:276 successed
V/COM      [04-10 23:54:45.392144 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(276)
I/APPCORE  [04-10 23:54:51.474197 tid:14058] (adm_apc_coreAchieve.c adm_apc_waitTask:475)Waitting task...

W/APPCORE  [04-10 23:54:54.430045 tid:14058] (adm_apc_backtrace.c backtrace_signal_handler:33)signal:11
W/APPCORE  [04-10 23:54:54.431976 tid:14058] (adm_apc_backtrace.c backtrace_signal_handler:48)backtrace:./otamaster(+0x132d8f) [0x5623eeba1d8f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7fdb6c6b1980]
/lib/x86_64-linux-gnu/libc.so.6(nanosleep+0x40) [0x7fdb6bb65680]
/lib/x86_64-linux-gnu/libc.so.6(sleep+0x3a) [0x7fdb6bb6555a]
./otamaster(adm_apc_waitTask+0xa9) [0x5623eeba534b]
./otamaster(main+0xca7) [0x5623eeba6b74]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fdb6baa2c87]
./otamaster(_start+0x2a) [0x5623eeba034a]

W/APPCORE  [04-10 23:54:54.440987 tid:14058] (adm_apc_backtrace.c backtrace_signal_handler:59)5623eea6f000-5623eed71000 r-xp 00000000 08:01 1835790                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/bin/otamaster
5623eef70000-5623eef75000 r--p 00301000 08:01 1835790                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/bin/otamaster
5623eef75000-5623eef77000 rw-p 00306000 08:01 1835790                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/bin/otamaster
5623eef77000-5623eef78000 rw-p 00000000 00:00 0
5623ef6c6000-5623f04ad000 rw-p 00000000 00:00 0                          [heap]
7fdb1c000000-7fdb1c021000 rw-p 00000000 00:00 0
7fdb1c021000-7fdb20000000 ---p 00000000 00:00 0
7fdb20000000-7fdb20021000 rw-p 00000000 00:00 0
7fdb20021000-7fdb24000000 ---p 00000000 00:00 0
7fdb24000000-7fdb24021000 rw-p 00000000 00:00 0
7fdb24021000-7fdb28000000 ---p 00000000 00:00 0
7fdb28000000-7fdb28021000 rw-p 00000000 00:00 0
7fdb28021000-7fdb2c000000 ---p 00000000 00:00 0
7fdb2c000000-7fdb2c021000 rw-p 00000000 00:00 0
7fdb2c021000-7fdb30000000 ---p 00000000 00:00 0
7fdb30000000-7fdb30184000 rw-p 00000000 00:00 0
7fdb30184000-7fdb34000000 ---p 00000000 00:00 0
7fdb35ffc000-7fdb35ffd000 ---p 00000000 00:00 0
7fdb35ffd000-7fdb367fd000 rw-p 00000000 00:00 0
7fdb367fd000-7fdb367fe000 ---p 00000000 00:00 0
7fdb367fe000-7fdb36ffe000 rw-p 00000000 00:00 0
7fdb36ffe000-7fdb36fff000 ---p 00000000 00:00 0
7fdb36fff000-7fdb377ff000 rw-p 00000000 00:00 0
7fdb377ff000-7fdb37800000 ---p 00000000 00:00 0
7fdb37800000-7fdb38000000 rw-p 00000000 00:00 0
7fdb38000000-7fdb38021000 rw-p 00000000 00:00 0
7fdb38021000-7fdb3c000000 ---p 00000000 00:00 0
7fdb3c000000-7fdb3c021000 rw-p 00000000 00:00 0
7fdb3c021000-7fdb40000000 ---p 00000000 00:00 0
7fdb40000000-7fdb40021000 rw-p 00000000 00:00 0
7fdb40021000-7fdb44000000 ---p 00000000 00:00 0
7fdb44000000-7fdb44021000 rw-p 00000000 00:00 0
7fdb44021000-7fdb48000000 ---p 00000000 00:00 0
7fdb48000000-7fdb48021000 rw-p 00000000 00:00 0
7fdb48021000-7fdb4c000000 ---p 00000000 00:00 0
7fdb4c57c000-7fdb4dc68000 rw-p 00000000 00:00 0
7fdb4dc68000-7fdb4dc69000 ---p 00000000 00:00 0
7fdb4dc69000-7fdb4e469000 rw-p 00000000 00:00 0
7fdb4e469000-7fdb4e46a000 ---p 00000000 00:00 0
7fdb4e46a000-7fdb4f623000 rw-p 00000000 00:00 0
7fdb4f623000-7fdb4f624000 ---p 00000000 00:00 0
7fdb4f624000-7fdb50000000 rw-p 00000000 00:00 0
7fdb50000000-7fdb50021000 rw-p 00000000 00:00 0
7fdb50021000-7fdb54000000 ---p 00000000 00:00 0
7fdb54000000-7fdb54021000 rw-p 00000000 00:00 0
7fdb54021000-7fdb58000000 ---p 00000000 00:00 0
7fdb58000000-7fdb58021000 rw-p 00000000 00:00 0
7fdb58021000-7fdb5c000000 ---p 00000000 00:00 0
7fdb5c000000-7fdb5c021000 rw-p 00000000 00:00 0
7fdb5c021000-7fdb60000000 ---p 00000000 00:00 0
7fdb60000000-7fdb60021000 rw-p 00000000 00:00 0
7fdb60021000-7fdb64000000 ---p 00000000 00:00 0
7fdb64018000-7fdb6403a000 rw-p 00000000 00:00 0
7fdb6403a000-7fdb6414b000 rw-s 00000000 00:05 262144                     /SYSV00401dec (deleted)
7fdb6414b000-7fdb6414c000 ---p 00000000 00:00 0
7fdb6414c000-7fdb6494c000 rw-p 00000000 00:00 0
7fdb6494c000-7fdb64a5d000 rw-s 00000000 00:05 294913                     /SYSV00401ded (deleted)
7fdb64a5d000-7fdb64a5e000 ---p 00000000 00:00 0
7fdb64a5e000-7fdb6525e000 rw-p 00000000 00:00 0
7fdb6525e000-7fdb6525f000 ---p 00000000 00:00 0
7fdb6525f000-7fdb65a5f000 rw-p 00000000 00:00 0
7fdb65a5f000-7fdb65a60000 ---p 00000000 00:00 0
7fdb65a60000-7fdb66260000 rw-p 00000000 00:00 0
7fdb66260000-7fdb66261000 ---p 00000000 00:00 0
7fdb66261000-7fdb66a61000 rw-p 00000000 00:00 0
7fdb66a61000-7fdb66a62000 ---p 00000000 00:00 0
7fdb66a62000-7fdb6754c000 rw-p 00000000 00:00 0
7fdb6754c000-7fdb6754d000 ---p 00000000 00:00 0
7fdb6754d000-7fdb67d4d000 rw-p 00000000 00:00 0
7fdb67d4d000-7fdb67d4e000 ---p 00000000 00:00 0
7fdb67d4e000-7fdb6854e000 rw-p 00000000 00:00 0
7fdb6854e000-7fdb6854f000 ---p 00000000 00:00 0
7fdb6854f000-7fdb68d4f000 rw-p 00000000 00:00 0
7fdb68d4f000-7fdb68d50000 ---p 00000000 00:00 0
7fdb68d50000-7fdb69550000 rw-p 00000000 00:00 0
7fdb69550000-7fdb696ed000 r-xp 00000000 08:01 3014692                    /lib/x86_64-linux-gnu/libm-2.27.so
7fdb696ed000-7fdb698ec000 ---p 0019d000 08:01 3014692                    /lib/x86_64-linux-gnu/libm-2.27.so
7fdb698ec000-7fdb698ed000 r--p 0019c000 08:01 3014692                    /lib/x86_64-linux-gnu/libm-2.27.so
7fdb698ed000-7fdb698ee000 rw-p 0019d000 08:01 3014692                    /lib/x86_64-linux-gnu/libm-2.27.so
7fdb698ee000-7fdb69997000 r-xp 00000000 08:01 2635866                    /home/yihua/mi-ota/foundation/packages/libs/libssl.so.1.1
7fdb69997000-7fdb69b97000 ---p 000a9000 08:01 2635866                    /home/yihua/mi-ota/foundation/packages/libs/libssl.so.1.1
7fdb69b97000-7fdb69b9f000 r--p 000a9000 08:01 2635866                    /home/yihua/mi-ota/foundation/packages/libs/libssl.so.1.1
7fdb69b9f000-7fdb69ba4000 rw-p 000b1000 08:01 2635866                    /home/yihua/mi-ota/foundation/packages/libs/libssl.so.1.1
7fdb69ba4000-7fdb69ba9000 r-xp 00000000 08:01 1837725                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_tp.so
7fdb69ba9000-7fdb69da8000 ---p 00005000 08:01 1837725                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_tp.so
7fdb69da8000-7fdb69da9000 r--p 00004000 08:01 1837725                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_tp.so
7fdb69da9000-7fdb69dab000 rw-p 00005000 08:01 1837725                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_tp.so
7fdb69dab000-7fdb69daf000 r-xp 00000000 08:01 1854551                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_sa.so
7fdb69daf000-7fdb69fae000 ---p 00004000 08:01 1854551                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_sa.so
7fdb69fae000-7fdb69faf000 r--p 00003000 08:01 1854551                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_sa.so
7fdb69faf000-7fdb69fb0000 rw-p 00004000 08:01 1854551                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_sa.so
7fdb69fb0000-7fdb69ff1000 r-xp 00000000 08:01 1835434                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_convert.so
7fdb69ff1000-7fdb6a1f0000 ---p 00041000 08:01 1835434                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_convert.so
7fdb6a1f0000-7fdb6a1f1000 r--p 00040000 08:01 1835434                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_convert.so
7fdb6a1f1000-7fdb6a1f2000 rw-p 00041000 08:01 1835434                    /home/yihua/mi-ota/application/dep/lib/x86/libvdi_convert.so
7fdb6a1f2000-7fdb6a1f5000 r-xp 00000000 08:01 3014691                    /lib/x86_64-linux-gnu/libdl-2.27.so
7fdb6a1f5000-7fdb6a3f4000 ---p 00003000 08:01 3014691                    /lib/x86_64-linux-gnu/libdl-2.27.so
7fdb6a3f4000-7fdb6a3f5000 r--p 00002000 08:01 3014691                    /lib/x86_64-linux-gnu/libdl-2.27.so
7fdb6a3f5000-7fdb6a3f6000 rw-p 00003000 08:01 3014691                    /lib/x86_64-linux-gnu/libdl-2.27.so
7fdb6a3f6000-7fdb6a425000 r-xp 00000000 08:01 1835785                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/librticonnextmsgcpp2.so
7fdb6a425000-7fdb6a624000 ---p 0002f000 08:01 1835785                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/librticonnextmsgcpp2.so
7fdb6a624000-7fdb6a626000 r--p 0002e000 08:01 1835785                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/librticonnextmsgcpp2.so
7fdb6a626000-7fdb6a627000 rw-p 00030000 08:01 1835785                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/librticonnextmsgcpp2.so
7fdb6a627000-7fdb6ace3000 r-xp 00000000 08:01 1835767                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/libnddscore.so
7fdb6ace3000-7fdb6aee2000 ---p 006bc000 08:01 1835767                    /home/yihua/mi-ota/build/output/mi_ota/otamaster_exe_deployment/lib/libnddscore.so
7fdb6aee2000-7fdb6aeec000 r--p 006bb000 08:01
I/DPM      [04-10 23:54:54.449717 tid:14058] (adm_dpm_manager.cpp ~DownloadManager:119)Exit ...
I/COM      [04-10 23:54:55.419487 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:280 successed
V/COM      [04-10 23:54:55.429416 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(280)
I/COM      [04-10 23:55:05.453521 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:28A successed
V/COM      [04-10 23:55:05.462580 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(28A)
I/COM      [04-10 23:55:15.486914 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:294 successed
V/COM      [04-10 23:55:15.496562 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(294)
I/COM      [04-10 23:55:25.520730 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:29E successed
V/COM      [04-10 23:55:25.530475 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(29E)
I/COM      [04-10 23:55:35.554644 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:2A8 successed
V/COM      [04-10 23:55:35.563613 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(2A8)
I/COM      [04-10 23:55:45.587679 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:2B2 successed
V/COM      [04-10 23:55:45.596792 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(2B2)
I/COM      [04-10 23:55:55.620800 tid:14058] (adm_com_kvstorage.c adm_com_kvstorage_set_string:273)adm_kvStorage_set_string key:[COM]_[doggedOffset], val:2BC successed
V/COM      [04-10 23:55:55.629744 tid:14058] (adm_com_doggedTick.cpp run:93)dogged offset 0x(2BC)

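The log shows that after the signal handler was triggered, the process did not exit and kept printing log lines.

2.3 Problem Analysis

From the reproduction above we can see that after a thread calls exit, the process does not exit and some business threads keep running. The process is nevertheless already in an abnormal state: even if a task can still be dispatched to it, the OTA run will fail along the way.

Analysis methods:

  1. If you reproduce the problem with VS Code, you can inspect the process's call stacks in the tool and quickly locate the point where the process is blocked.
  2. If the problem occurs in a virtual machine or in a customer environment, you can use gdb to inspect the process's call stack.
  3. Attach to the process: sudo gdb -p 18493
  4. Print the call stack: bt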

C
yihua@ubuntu:~$ sudo gdb -p 18493
[sudo] password for yihua:
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 18493
[New LWP 18494]
[New LWP 18495]
[New LWP 18496]
[New LWP 18497]
[New LWP 18498]
[New LWP 18499]
[New LWP 18500]
[New LWP 18501]
[New LWP 18502]
[New LWP 18503]
[New LWP 18504]
[New LWP 18505]
[New LWP 18506]
[New LWP 18507]
[New LWP 18508]
[New LWP 18509]
[New LWP 18510]
[New LWP 18557]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__pthread_rwlock_wrlock_full (abstime=0x0, rwlock=0x5623ef97cb20) at pthread_rwlock_common.c:915
915     pthread_rwlock_common.c: No such file or directory.
(gdb) bt

#0  __pthread_rwlock_wrlock_full (abstime=0x0, rwlock=0x5623ef97cb20) at pthread_rwlock_common.c:915
#1  __GI___pthread_rwlock_wrlock (rwlock=0x5623ef97cb20) at pthread_rwlock_wrlock.c:27
#2  0x00007fdb6d31d1da in Poco::RWLockImpl::writeLockImpl (this=0x5623ef97cb20) at /home/yihua/mi-ota/foundation/packages/Poco/RWLock_POSIX.h:71
#3  0x00007fdb6d31d3e4 in Poco::RWLock::writeLock (this=0x5623ef97cb20) at /home/yihua/mi-ota/foundation/packages/Poco/RWLock.h:142
#4  0x00007fdb6d31d439 in Poco::ScopedRWLock::ScopedRWLock (this=0x7ffd3b63a5c0, rwl=..., write=true) at /home/yihua/mi-ota/foundation/packages/Poco/RWLock.h:161
#5  0x00007fdb6d31d4fc in Poco::ScopedWriteRWLock::ScopedWriteRWLock (this=0x7ffd3b63a5c0, rwl=...) at /home/yihua/mi-ota/foundation/packages/Poco/RWLock.h:190
#6  0x00007fdb6d3297bd in abup::ota::com::MessageDispatcher::Impl::unregisterMessageHandler (this=0x5623ef97cb20, id=1)
    at /home/yihua/mi-ota/common/lib/src/messageDispatcher/adm_com_messageDispatcherImpl.hpp:106
#7  0x00007fdb6d329b7d in abup::ota::com::MessageDispatcher::unregisterMessageHandler (this=0x7fdb6d9ee268 <abup::ota::com::MessageDispatcher::instance()::msgDispatcher>,
    id=1) at /home/yihua/mi-ota/common/lib/src/messageDispatcher/adm_com_messageDispatcher.cpp:33
#8  0x00005623eecde7f1 in abup::ota::uim::Monitor::~Monitor (this=0x5623f0456300, __in_chrg=<optimized out>) at /home/yihua/mi-ota/application/uim/src/adm_uim_monitor.cpp:50
#9  0x00005623eecdeb91 in Poco::SingletonHolder<abup::ota::uim::Monitor>::~SingletonHolder (this=0x5623eef77b00 <abup::ota::uim::Monitor::instance()::monitorHolder>,
    __in_chrg=<optimized out>) at /home/yihua/mi-ota/foundation/packages/Poco/SingletonHolder.h:47
#10 0x00007fdb6bac4031 in __run_exit_handlers (status=1, listp=0x7fdb6be6c718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
    at exit.c:108
#11 0x00007fdb6bac412a in __GI_exit (status=<optimized out>) at exit.c:139
#12 0x00005623eeba20d2 in backtrace_signal_handler (signal=11) at /home/yihua/mi-ota/application/appcore/src/adm_apc_backtrace.c:64
#13 <signal handler called>
#14 0x00007fdb6bb65680 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffd3b63da50, remaining=remaining@entry=0x7ffd3b63da50)
    at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#15 0x00007fdb6bb6555a in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#16 0x00005623eeba534b in adm_apc_waitTask () at /home/yihua/mi-ota/application/appcore/src/adm_apc_coreAchieve.c:474
#17 0x00005623eeba6b74 in main (argc=3, argv=0x7ffd3b63dbd8) at /home/yihua/mi-ota/application/appcore/src/adm_apc_core.c:257
 

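From the stack trace we can see that the process is blocked on a Poco read-write lock. But the program has already called exit, so why is it acquiring a Poco read-write lock?

2.4 Going Further

To better understand the behavior above, let us first look at the following question.

2.4.1 Does a Program Start at main?

We programmers have always assumed that a program starts at the main function. Is that really the case? My answer is NO. The evidence follows.

[Exhibit 1] Here is a piece of C code: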

C
#include<stdlib.h>
#include<stdio.h>

int a = 3;
int main(int argc, char* argv[])
{
    printf("hello world\n");
    return 0;
}

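Before main is entered, the global variable a has already been initialized, and the two parameters argc and argv have been passed in correctly. By that point the setup of system I/O and of the heap and stack has also been completed quietly, which is why we can safely call printf.

[Exhibit 2] In C++ code, even more code can run before main, for example: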

C++
#include <string>
using namespace std;
string v;
double foo()
{
    return 1.0;
}

double g = foo();
int main()
{
    return 0;
}

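Before main is entered, both the constructor of the object v and the function foo, which initializes the global variable g, are called.

Does this evidence not overturn the deep-rooted idea that "a program starts at main"? In reality, a program runs roughly as follows:

  1. After creating the process, the operating system hands control to the program's entry point. [This entry point is not main but an entry function in the runtime library glibc; the backtrace in 2.2 shows _start and __libc_start_main below main.]
  2. The entry function mainly performs the following steps.
  3. It initializes the runtime library and the program's execution environment, including the heap and stack, I/O, threads, construction of global variables, and so on.
  4. It calls main, which runs the main body of the program.
  5. When main finishes, control returns to the entry function, which performs cleanup, including destructing global variables, destroying the heap, and closing I/O, and then makes the system call that ends the process.

Thoughts:

With this understanding of how a program runs, some abnormal situations can now be explained.

  1. The first line of main never executes. [The entry function is blocked while initializing the environment.]
  2. The process does not exit after main executes return. [The entry function is blocked during cleanup.]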
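The cleanup phase is not tied to returning from main. glibc's entry code effectively calls exit with main's return value, and a direct call to exit() runs the same handlers, which is what frames #10 and #11 of the backtrace in 2.3 show. The sketch below is my own illustration, not project code. It forks a child that leaves through either exit() or _exit() and reports whether the destructor of a static object ran. Probe and destructor_runs are hypothetical names.

```cpp
#include <cstdlib>     // std::exit
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, read, write, close, _exit

namespace {
int g_report_fd = -1;  // write end of the pipe, set only in the child

// Stand-in for a singleton whose destructor has a visible side effect.
struct Probe {
    Probe() {}
    ~Probe() {
        if (g_report_fd >= 0) {
            const char mark = 'D';
            const ssize_t ignored = write(g_report_fd, &mark, 1);
            (void)ignored;
        }
    }
};
}  // namespace

// Forks a child that creates a static Probe and then terminates through
// exit() or _exit(). Returns true when the Probe destructor ran in the child.
bool destructor_runs(bool use_exit) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {  // child
        close(fds[0]);
        g_report_fd = fds[1];
        static Probe probe;          // registered for destruction at exit()
        if (use_exit) std::exit(0);  // runs atexit handlers and static destructors
        _exit(0);                    // ends the process at once, skipping both
    }
    close(fds[1]);
    char mark = 0;
    const ssize_t n = read(fds[0], &mark, 1);  // 0 means EOF: nothing was written
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n == 1 && mark == 'D';
}
```

destructor_runs(true) reports that the destructor ran, while destructor_runs(false) reports that it did not. _exit() would therefore avoid a hang inside a destructor, but it also skips everything those destructors were meant to do, so it is not a substitute for an orderly shutdown.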

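2.5 Solving the Problem

With this understanding of the program's execution flow, we know that the problem occurs in the resource reclamation phase. exit runs the same cleanup as a return from main: frames #9 to #11 of the backtrace show exit calling __run_exit_handlers, which in turn runs the destructor of the Monitor singleton. Looking at the code:

While the destructor of class Monitor runs, it acquires a Poco read-write lock and blocks. At first I suspected a resource contention problem. After asking colleagues, it turned out that the destruction order was causing the failure. C++ destroys objects in the reverse order of their construction, much like a stack, so when objects depend on one another this kind of problem arises easily.

The final fix was to change the order in which the instances are created.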
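The reverse-order rule for static objects can be observed directly. The following sketch is an illustration under my own assumptions, not the project's code: Dispatcher and Monitor are hypothetical stand-ins for the two singletons, and static_destruction_order is a hypothetical helper. It creates both objects in a forked child in a chosen order, lets the child call exit(), and returns the order in which the destructors ran.

```cpp
#include <cstdlib>     // std::exit
#include <string>
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, read, write, close

namespace {
int g_order_fd = -1;  // write end of the pipe, set only in the child

void report(char tag) {
    if (g_order_fd >= 0) {
        const ssize_t ignored = write(g_order_fd, &tag, 1);
        (void)ignored;
    }
}

// Hypothetical stand-ins for the two singletons involved in the hang.
struct Dispatcher { Dispatcher() {} ~Dispatcher() { report('D'); } };
struct Monitor    { Monitor() {}    ~Monitor()    { report('M'); } };

// Function-local statics are constructed on the first call.
Dispatcher& dispatcher() { static Dispatcher d; return d; }
Monitor&    monitor()    { static Monitor m;    return m; }
}  // namespace

// Returns the destruction order observed in the child, for example "MD"
// when Monitor was destroyed first and Dispatcher second.
std::string static_destruction_order(bool dispatcher_first) {
    int fds[2];
    if (pipe(fds) != 0) return "";
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {  // child
        close(fds[0]);
        g_order_fd = fds[1];
        if (dispatcher_first) { dispatcher(); monitor(); }
        else                  { monitor(); dispatcher(); }
        std::exit(0);  // destructors run here, newest object first
    }
    close(fds[1]);
    std::string order;
    char tag = 0;
    while (read(fds[0], &tag, 1) == 1) order.push_back(tag);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return order;
}
```

Creating the dispatcher first yields "MD", so the Monitor destructor still finds a live dispatcher. Creating the monitor first yields "DM", which is the situation where a destructor touches an object that is already gone. If one object's destructor needs another object, the dependency has to be created first so that it is destroyed last.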

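3. Summary

The root cause was not that exit failed to take effect but a destructor in our own code. So why is my title still "Please Say No to exit!!!"?

First, when the program hits an exception, it simply calls exit to leave the process. I consider this an opportunistic shortcut: it lets the system do the resource reclamation for us, and that behavior is unpredictable. Why would we put our program in such an unpredictable position?

Second, taking no protective measures for the other threads is irresponsible toward the business logic and carries the risk of business failures.

I hope everyone can say NO to this shortcut-taking attitude that shows no respect for the program!!!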
