Invisible SIGSEGV on Linux that does not happen on Windows?

Question

I have a TCP/HTTP server that supports plugins in the form of shared libraries (DLL and .so). Its build system generates make and .sln files via premake. When I start my application I feed it a configuration file like this, describing which libraries the server shall use as plugins and what arguments it shall pass to them. For some time I had 2 plugins and everything worked just fine, and it still works fine if I feed my server config files like this. But now I have a new plugin that I am developing, and so a new config file.

The steps required to set up my server on Linux are few and simple:

  • Download the build script (from here, as described here).
  • Run ./cloud_server_net_setup.sh. No superuser rights are needed; it requires curl, make and g++. In the regular (non-development) case this is enough: it fetches Boost and the other libraries it needs into a local folder, builds all of them, and builds the server in release form.
  • Now you can cd into cloud_server/install-dir/
  • Call export LD_LIBRARY_PATH=./:./lib_boost
  • Run the server: ./CloudServer

But we need a debug version, so after running the script we do the following:

  • cd cloud_server/CloudServer/projects/linux-gmake/
  • make
  • cd bin/debug
  • export LD_LIBRARY_PATH=./:(where we called the script)/cloud_server/install-dir/lib_boost
  • Now, finally, we can call gdb.

So we call it, and this is what we see:

 gdb ./CloudServer

GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/ole_jak/cloud_server/CloudServer/projects/linux-gmake/bin/debug/CloudServer...done.
(gdb) r
Starting program: /home/ole_jak/cloud_server/CloudServer/projects/linux-gmake/bin/debug/CloudServer
[Thread debugging using libthread_db enabled]
Cloud Server v0.5
Copyright (c) 2011 Cloud Forever. All rights reserved.

Type 'help' to see help messages.
Config file path: config.xml
[New Thread 0x7ffff5967700 (LWP 11516)]
[New Thread 0x7ffff5166700 (LWP 11517)]
[New Thread 0x7ffff4965700 (LWP 11518)]
[New Thread 0x7ffff4164700 (LWP 11519)]
[New Thread 0x7ffff3963700 (LWP 11520)]
[New Thread 0x7ffff3162700 (LWP 11521)]
[New Thread 0x7ffff2961700 (LWP 11522)]
[New Thread 0x7ffff2160700 (LWP 11523)]
[New Thread 0x7ffff195f700 (LWP 11524)]
[New Thread 0x7ffff115e700 (LWP 11525)]
[New Thread 0x7ffff095d700 (LWP 11526)]
[New Thread 0x7fffebfff700 (LWP 11527)]
[New Thread 0x7fffeb7fe700 (LWP 11528)]
[New Thread 0x7fffeaffd700 (LWP 11529)]
[New Thread 0x7fffea7fc700 (LWP 11530)]
[New Thread 0x7fffe9ffb700 (LWP 11531)]
Library libFileService.so opened.
[New Thread 0x7fffe953c700 (LWP 11532)]
Library libUsersFilesService.so opened.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) x/i $pc
0x0:    Cannot access memory at address 0x0

I am a Linux newbie, and all I know about segmentation faults I know from Wikipedia. But I know one more thing about my server and this new service I am creating: it compiles and runs on Windows with no errors at all (VS2008 and VS2010 solutions can be created from the same premake script).

So I wonder how and where in these 2 files, .cpp and .h, I have created an error that does not show on Windows at all and shows so dramatically on Linux. Is it fixable, or visible to a fresh eye?

UPDATE: Valgrind output

ole_jak@dspproc:~/cloud_server/CloudServer/projects/linux-gmake/bin/debug$ valgrind ./CloudServer
==11682== Memcheck, a memory error detector
==11682== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==11682== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==11682== Command: ./CloudServer
==11682==
Cloud Server v0.5
Copyright (c) 2011 Cloud Forever. All rights reserved.

Type 'help' to see help messages.
Config file path: config.xml
Library libFileService.so opened.
Library libUsersFilesService.so opened.
==11682== Jump to the invalid address stated on the next line
==11682==    at 0x0: ???
==11682==    by 0x4D49BE: sqlite3_free (sqlite3.c:18155)
==11682==    by 0x102242D5: sqlite3OsInit (sqlite3.c:14162)
==11682==    by 0x1029EB28: sqlite3_initialize (sqlite3.c:107299)
==11682==    by 0x102A159F: openDatabase (sqlite3.c:108909)
==11682==    by 0x102A1B29: sqlite3_open (sqlite3.c:109156)
==11682==    by 0x1021CAB0: sqlite3pp::database::connect(char const*) (sqlite3pp.cpp:89)
==11682==    by 0x1021C6E3: sqlite3pp::database::database(char const*) (sqlite3pp.cpp:74)
==11682==    by 0x1020DDDF: users_files_service::create_files_table(std::string) (users_files_service.cpp:171)
==11682==    by 0x1020BAFC: users_files_service::apply_config(boost::shared_ptr<boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> > >) (users_files_service.cpp:38)
==11682==    by 0x4B5432: server_utils::parse_config_services(boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >) (server_utils.cpp:156)
==11682==    by 0x4B6436: server_utils::parse_config(boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >) (server_utils.cpp:208)
==11682==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==11682==
==11682==
==11682== Process terminating with default action of signal 11 (SIGSEGV)
==11682==  Bad permissions for mapped region at address 0x0
==11682==    at 0x0: ???
==11682==    by 0x4D49BE: sqlite3_free (sqlite3.c:18155)
==11682==    by 0x102242D5: sqlite3OsInit (sqlite3.c:14162)
==11682==    by 0x1029EB28: sqlite3_initialize (sqlite3.c:107299)
==11682==    by 0x102A159F: openDatabase (sqlite3.c:108909)
==11682==    by 0x102A1B29: sqlite3_open (sqlite3.c:109156)
==11682==    by 0x1021CAB0: sqlite3pp::database::connect(char const*) (sqlite3pp.cpp:89)
==11682==    by 0x1021C6E3: sqlite3pp::database::database(char const*) (sqlite3pp.cpp:74)
==11682==    by 0x1020DDDF: users_files_service::create_files_table(std::string) (users_files_service.cpp:171)
==11682==    by 0x1020BAFC: users_files_service::apply_config(boost::shared_ptr<boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> > >) (users_files_service.cpp:38)
==11682==    by 0x4B5432: server_utils::parse_config_services(boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >) (server_utils.cpp:156)
==11682==    by 0x4B6436: server_utils::parse_config(boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >) (server_utils.cpp:208)
==11682==
==11682== HEAP SUMMARY:
==11682==     in use at exit: 124,050 bytes in 1,083 blocks
==11682==   total heap usage: 1,814 allocs, 731 frees, 183,517 bytes allocated
==11682==
==11682== LEAK SUMMARY:
==11682==    definitely lost: 0 bytes in 0 blocks
==11682==    indirectly lost: 0 bytes in 0 blocks
==11682==      possibly lost: 46,248 bytes in 799 blocks
==11682==    still reachable: 77,802 bytes in 284 blocks
==11682==         suppressed: 0 bytes in 0 blocks
==11682== Rerun with --leak-check=full to see details of leaked memory
==11682==
==11682== For counts of detected and suppressed errors, rerun with: -v
==11682== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
Убито
ole_jak@dspproc:~/cloud_server/CloudServer/projects/linux-gmake/bin/debug$

Recommended answer

This is a nasty one. I am unsure about the exact root cause, but this seems to be a multi-threading related issue. The immediate cause of the problem is that the sqlite3Config.m.xSize function pointer is NULL at the place and time the error happens.
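To illustrate why a NULL function pointer produces the "0x0000000000000000 in ?? ()" frame seen in gdb, here is a minimal sketch. The struct and function names (GlobalConfig, init_config, checked_size, fixed_size) are hypothetical, heavily simplified stand-ins and not SQLite's real definitions; only the idea of a global table of callbacks that is filled in during initialization is taken from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for a library-wide config struct that holds
// callbacks (NOT SQLite's actual sqlite3Config definition).
struct MemMethods {
    int (*xSize)(void*);  // size-of-allocation callback
};

struct GlobalConfig {
    MemMethods m;
};

// Zero-initialized at program start: xSize is a null pointer until an
// init routine fills the table in.
static GlobalConfig g_config = {};

// Toy callback used only for this illustration.
static int fixed_size(void*) { return 64; }

// Fills in the callback table, similar in spirit to library initialization.
void init_config() {
    g_config.m.xSize = fixed_size;
}

// Guarded call: returns -1 when the callback has not been set.
// An unguarded g_config.m.xSize(p) at that point is undefined behaviour
// and on Linux typically shows up as SIGSEGV with the program counter at 0.
int checked_size(void* p) {
    if (g_config.m.xSize == nullptr) {
        return -1;
    }
    return g_config.m.xSize(p);
}

// Walks through the before and after states.
void demo() {
    assert(checked_size(nullptr) == -1);  // table not initialized yet
    init_config();
    assert(checked_size(nullptr) == 64);  // callback is now reachable
}
```

The guard is only there to make the two states observable. It is not a suggestion to patch SQLite itself.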

This pointer is supposed to be initialized to point to a proper function the first time that sqlite3_initialize() is called, which normally happens the first time you open an SQLite database file. By setting breakpoints and watchpoints in GDB I was able to verify that the pointer is successfully set, yet at the time of the segmentation fault its value is NULL.

That could mean one of two things:

  • The new pointer value is not properly propagated to all threads. SQLite3 is supposed to be thread-safe, but well, threads can be nasty little buggers...
  • Something resets the pointer after it has been initialized. I considered this highly unlikely since the sqlite3Config structure is not usually modified after initialization.

I performed a simple test, which incidentally can be used as a temporary workaround: I added an explicit call to sqlite3_initialize() as the first statement in main(), allowing it to be executed before any threads are launched. As a result, the segmentation fault went away and I got a shell prompt for your server, which points to the first of the two alternatives. Note that this is a workaround at best, since sqlite3_initialize() does not normally need to be called explicitly. The root cause of the issue may still be present and make itself known in other ways, or, worse, it could break things in subtle, hard-to-detect ways.
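The ordering behind this workaround can be sketched as follows. This is an illustration only, not the server's actual code: init_then_start and demo_order are hypothetical helpers, and the callables stand in for sqlite3_initialize() and for whatever routine starts the server's worker threads. A return value of 0 is treated as success, which matches SQLITE_OK.

```cpp
#include <functional>
#include <string>
#include <vector>

// Runs `init` to completion before `start_workers` is invoked, so no worker
// can observe a half-initialized library. Returns false (and starts nothing)
// if `init` reports an error. 0 means success, matching SQLITE_OK.
bool init_then_start(const std::function<int()>& init,
                     const std::function<void()>& start_workers) {
    if (init() != 0) {
        return false;
    }
    start_workers();
    return true;
}

// Records the order of events so the ordering can be inspected.
std::vector<std::string> demo_order() {
    std::vector<std::string> events;
    init_then_start(
        [&] { events.push_back("init"); return 0; },  // real code: sqlite3_initialize
        [&] { events.push_back("workers"); });        // real code: start the threads
    return events;
}
```

In the real server, the first argument would be sqlite3_initialize itself (declared in sqlite3.h as int sqlite3_initialize(void)), called at the top of main(), and the second would be whatever function launches the thread pool.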

Since SQLite3 is supposed to be thread-safe (and the source code of the sqlite3_initialize() function seems correct in that regard), I am unsure what is happening. It could be a problem with the sqlite3pp wrapper or with the way the threads are launched.
