Opened 7 years ago

Closed 7 years ago

#3025 closed defect (fixed)

Segmentation fault in SegmentObjectHolderTest.grow test

Reported by: stephen Owned by: vorner
Priority: medium Milestone: Sprint-20130820
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Configuration:

  • latest version of master at time of reporting (ed2fb666188ea2c005b36d767bb7bb2861178dcc)
  • g++ 4.6.3 on Ubuntu 12.04
  • rebuilding from scratch (i.e. deleting all object files and compiling everything anew; this included deletion of all ccache cached files before the build).

gdb trace:

Core was generated by `.libs/lt-run_unittests'.
Program terminated with signal 11, Segmentation fault.
#0  0x006bbf38 in get_pointer (this=0x0) at /usr/include/boost/interprocess/offset_ptr.hpp:81
81	   {  return (internal.m_offset == 1) ? 0 : (const_cast<char*>(reinterpret_cast<const char*>(this)) + internal.m_offset); }
(gdb) bt
#0  0x006bbf38 in get_pointer (this=0x0) at /usr/include/boost/interprocess/offset_ptr.hpp:81
#1  get (this=0x0) at /usr/include/boost/interprocess/offset_ptr.hpp:153
#2  get_bits (n=...) at /usr/include/boost/interprocess/offset_ptr.hpp:443
#3  get_color (n=...) at /usr/include/boost/intrusive/detail/rbtree_node.hpp:136
#4  boost::intrusive::rbtree_algorithms<boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void>, true> >::rebalance_after_erasure (
    header=..., x=..., x_parent=...) at /usr/include/boost/intrusive/rbtree_algorithms.hpp:789
#5  0x006bc924 in boost::intrusive::rbtree_algorithms<boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void>, true> >::erase (header=..., 
    z=...) at /usr/include/boost/intrusive/rbtree_algorithms.hpp:410
#6  0x006bcbd6 in boost::intrusive::rbtree_impl<boost::intrusive::setopt<boost::intrusive::detail::base_hook_traits<boost::interprocess::detail::intrusive_value_type_impl<boost::intrusive::detail::generic_hook<boost::intrusive::get_set_node_algo<boost::interprocess::offset_ptr<void>, true>, boost::intrusive::default_tag, (boost::intrusive::link_mode_type)1, 3>, char>, boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void>, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::default_tag, 3>, std::less<boost::interprocess::detail::intrusive_value_type_impl<boost::intrusive::detail::generic_hook<boost::intrusive::get_set_node_algo<boost::interprocess::offset_ptr<void>, true>, boost::intrusive::default_tag, (boost::intrusive::link_mode_type)1, 3>, char> >, unsigned int, true> >::erase (this=0xb7762020, i=...) at /usr/include/boost/intrusive/rbtree.hpp:807
#7  0x006c0218 in erase (i=..., this=<optimized out>) at /usr/include/boost/intrusive/set.hpp:568
#8  boost::interprocess::segment_manager<char, boost::interprocess::rbtree_best_fit<boost::interprocess::null_mutex_family, boost::interprocess::offset_ptr<void>, 0u>, boost::interprocess::iset_index>::priv_generic_named_destroy<char> (this=0xb7762004, name=0x81a143d "mark", index=..., table=..., 
    is_intrusive_index=...) at /usr/include/boost/interprocess/segment_manager.hpp:977
#9  0x006b3e4a in destroy<boost::interprocess::offset_ptr<void> > (this=<optimized out>, name=<optimized out>)
    at /usr/include/boost/interprocess/segment_manager.hpp:538
#10 destroy<boost::interprocess::offset_ptr<void> > (name=<optimized out>, this=<optimized out>)
    at /usr/include/boost/interprocess/detail/managed_memory_impl.hpp:565
#11 isc::util::MemorySegmentMapped::clearNamedAddressImpl (this=0x1, name=0xbfb148ed "") at memory_segment_mapped.cc:402
#12 0x08107027 in clearNamedAddress (name=<optimized out>, this=<optimized out>) at ../../../../../src/lib/util/memory_segment.h:299
#13 (anonymous namespace)::SegmentObjectHolderTest_grow_Test::TestBody (this=0x87ab5a0) at segment_object_holder_unittest.cc:135
#14 0x081849e0 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)
    ()
#15 0x081802c1 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
#16 0x0816f728 in testing::Test::Run() ()
#17 0x0816fddd in testing::TestInfo::Run() ()
#18 0x081702db in testing::TestCase::Run() ()
#19 0x0817478a in testing::internal::UnitTestImpl::RunAllTests() ()
#20 0x08185b46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#21 0x08181057 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#22 0x0817376d in testing::UnitTest::Run() ()
#23 0x08160741 in isc::util::unittests::run_all () at run_all.cc:87
#24 0x0805bb50 in main (argc=1, argv=0xbfb151f4) at run_unittests.cc:25

Attachments (1)

gdb.log (39.6 KB) - added by stephen 7 years ago.
Output from run_gdb script for build with CXXFLAGS='-ggdb3 -O0'


Change History (19)

comment:1 Changed 7 years ago by muks

Is this reproducible?

comment:2 Changed 7 years ago by stephen

Yes.

As indicated in the traceback (and confirmed by gdb), the line causing the error is in src/lib/util/memory_segment_mapped.cc, in the method clearNamedAddressImpl():

    return (impl_->base_sgmt_->destroy<offset_ptr<void> >(name));

(I'm not sure the value shown in frame #11 of the traceback (this=0x1) can be believed; it appears to be an artifact of how addresses are printed. I've noticed a number of times that printing a pointer to an object comes up with "1".)

My tests show that, in the test, the MemorySegmentGrown exception in MemorySegmentMapped::allocate is thrown when the "size" argument is 32768. At that point impl_->base_sgmt_->get_size() returns 65536 and impl_->base_sgmt_->get_free_memory() returns 65272.

comment:3 Changed 7 years ago by muks

  • Estimated Difficulty changed from 0 to 4

comment:4 Changed 7 years ago by muks

  • Milestone changed from New Tasks to Sprint-20130723

comment:5 Changed 7 years ago by stephen

One thing to add, which may or may not be relevant: this is occurring on a 32-bit system.

comment:6 Changed 7 years ago by vorner

  • Owner set to vorner
  • Status changed from new to accepted

comment:7 Changed 7 years ago by vorner

  • Owner changed from vorner to stephen
  • Status changed from accepted to assigned

Hello

I still haven't managed to reproduce this and I don't know what is happening. But I wrote a script that collects as much information about what is happening as I could think of; maybe it'll help.

So, could you check out the branch vorner-segfault-debug, compile it with CXXFLAGS='-O0 -ggdb3' to provide as much debug info as possible, go to src/lib/datasrc/tests/memory and run the run_gdb script there? It should leave a gdb.log file that I'd like to see.

Thank you

Changed 7 years ago by stephen

Output from run_gdb script for build with CXXFLAGS='-ggdb3 -O0'

comment:8 follow-up: Changed 7 years ago by stephen

  • Owner changed from stephen to vorner

Output from script added. Note that the test does not crash with -O0, but does when optimization is switched on.

comment:9 in reply to: ↑ 8 Changed 7 years ago by vorner

  • Owner changed from vorner to stephen

Replying to stephen:

Output from script added. Note that the test does not crash with -O0, but does when optimization is switched on.

Hmm. This little detail makes the output mostly useless :-(. If you mentioned this, I must have missed it.

Can you pull (I fixed a small issue with the script) and try again with CXXFLAGS='-ggdb3 -O2'? That's worse than -O0 for debugging, but I guess it can't be helped :-|.

Thank you

comment:10 Changed 7 years ago by vorner

  • Owner changed from stephen to vorner

Taking this back; I reproduced it with boost 1.46.

comment:11 Changed 7 years ago by vorner

  • Owner changed from vorner to UnAssigned
  • Status changed from assigned to reviewing

Hello

I did not find the exact reason for what is happening. But it seems to be a bug in either gcc or boost, which is probably fixed in newer versions. It happens only when:

  • Boost is version 1.46 (or older, it does /not/ happen with 1.48).
  • GCC 4.6 (it didn't happen on 4.5, I don't know about newer).
  • Optimisations are turned on.

My guess is that boost contained something which has undefined behaviour according to the standard, and GCC introduced some new optimisation which relied on an assumption that boost broke.

First, I tried disabling optimisation selectively. But there are two problems with that:

  • Such a thing is not really supported with autotools. I can't append -O0 after user-set flags. There are workarounds, but they seem ugly, complex and fragile (https://lists.gnu.org/archive/html/automake/2006-09/msg00038.html).
  • When I managed to make the only file that includes the shared-memory related things compile with -O0, I got a different segfault in that area. If the whole thing is compiled without optimisations, it works. I wouldn't have thought this possible, but it seems it is.

So I propose a different solution. The problem happens only in shared-memory related functionality. I added a check that asks the user to disable shared memory if the boost version is older than 1.48. That version is quite old already, and the few people who really do need shared memory support (because they have large zones and need to run auth multi-core) are big admins and can upgrade.

Do you think that is OK?

If so, I propose this changelog:

Due to various problems with older versions of boost and shared memory,
the server refuses to compile with the combination of boost < 1.48 and
shared memory enabled. Most users don't need shared memory; admins of
large servers are asked to upgrade boost.
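
For illustration only, the rule proposed above can be written down as a small C++ sketch. The ticket does not show the actual check that was committed, so the names below (kFirstGoodBoost, encodeBoostVersion, sharedMemoryAllowed) are hypothetical. The only outside fact assumed is that Boost encodes its version number as major * 100000 + minor * 100 + patch, which is the value of the BOOST_VERSION macro from <boost/version.hpp>.

```cpp
#include <cassert>

// Hypothetical constant: first Boost release the ticket reports as unaffected
// (1.48.0 in BOOST_VERSION encoding).
constexpr int kFirstGoodBoost = 104800;

// Encode a version the same way BOOST_VERSION does:
// major * 100000 + minor * 100 + patch.
constexpr int encodeBoostVersion(int major, int minor, int patch) {
    return major * 100000 + minor * 100 + patch;
}

// The proposed rule: shared memory may only be enabled with Boost >= 1.48.
// Builds that do not ask for shared memory are always accepted.
constexpr bool sharedMemoryAllowed(int boost_version, bool want_shared_memory) {
    return !want_shared_memory || boost_version >= kFirstGoodBoost;
}

// Compile-time checks of the rule itself.
static_assert(!sharedMemoryAllowed(encodeBoostVersion(1, 46, 1), true),
              "boost 1.46 with shared memory must be rejected");
static_assert(sharedMemoryAllowed(encodeBoostVersion(1, 48, 0), true),
              "boost 1.48 with shared memory is accepted");

// Runtime version of the same checks, for use in a unit test.
inline void checkSharedMemoryRule() {
    assert(sharedMemoryAllowed(encodeBoostVersion(1, 46, 1), false));
    assert(!sharedMemoryAllowed(encodeBoostVersion(1, 47, 0), true));
}
```

In a real source file the same condition could presumably be expressed with the preprocessor, by comparing BOOST_VERSION against 104800 and emitting #error when shared memory is requested.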

comment:12 Changed 7 years ago by muks

It looks like this was a bug that was fixed in 1.47. See: https://github.com/ryppl/boost-svn/commit/43c6986eefff26528f6b7b998b5b80ab2c4ccd15 or boost svn commit 70066, skip to offset_ptr.hpp changes and look at the noinline updates for GCC.
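
For readers unfamiliar with offset_ptr, here is a minimal sketch of the idea behind it. This is not Boost's implementation: TinyOffsetPtr and everything in it are hypothetical names. Like frame #0 of the traceback, it stores the distance from the pointer object itself to the target and reserves the offset value 1 to mean "null". Unlike the code in the traceback, it does the arithmetic on integers rather than on char*.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified self-relative pointer (not Boost's offset_ptr).
// It stores "target address minus my own address", so the stored value stays
// meaningful when a mapped segment is attached at a different base address.
class TinyOffsetPtr {
public:
    TinyOffsetPtr() : offset_(kNull) {}
    explicit TinyOffsetPtr(void* target) { set(target); }

    // Copying must recompute the offset, because the copy lives at a
    // different address than the original.
    TinyOffsetPtr(const TinyOffsetPtr& other) { set(other.get()); }
    TinyOffsetPtr& operator=(const TinyOffsetPtr& other) {
        set(other.get());
        return *this;
    }

    void set(void* target) {
        if (target == nullptr) {
            offset_ = kNull;
        } else {
            offset_ = reinterpret_cast<std::intptr_t>(target) -
                      reinterpret_cast<std::intptr_t>(this);
        }
    }

    void* get() const {
        if (offset_ == kNull) {
            return nullptr;
        }
        return reinterpret_cast<void*>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
    }

private:
    // Offset 1 would point inside this object, so it is free to mean "null".
    static constexpr std::intptr_t kNull = 1;
    std::intptr_t offset_;
};

// Small self-check showing the intended behaviour.
inline void tinyOffsetPtrDemo() {
    int value = 7;
    TinyOffsetPtr p(&value);
    TinyOffsetPtr copy(p);            // different address, same target
    assert(p.get() == &value);
    assert(copy.get() == &value);
    assert(TinyOffsetPtr().get() == nullptr);
}
```

In the traceback, get_pointer() is called with this=0x0, so the read of the offset member itself is what faults.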

comment:14 Changed 7 years ago by vorner

Applying the whole patch proved problematic and I did not want to hand-apply each of the changes. But compiling with -fno-inline, which turns inlining off completely, makes it not crash. So it definitely is related to this.
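
As a rough sketch of the narrower alternative to -fno-inline: GCC lets a single function opt out of inlining with __attribute__((noinline)). The macro SKETCH_NOINLINE and the helper applyOffset below are hypothetical and only show the mechanism; they are not the contents of the Boost patch mentioned in comment:12.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical macro: expands to GCC's noinline attribute where available,
// and to nothing on other compilers.
#if defined(__GNUC__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#else
#  define SKETCH_NOINLINE
#endif

// Hypothetical helper: add a signed byte offset to a base address.
// Marking it noinline keeps the optimiser from folding this arithmetic into
// the caller, which is the effect -fno-inline has on every function at once.
SKETCH_NOINLINE void* applyOffset(const void* base, std::ptrdiff_t offset) {
    const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(base);
    // Unsigned arithmetic wraps, so negative offsets work as expected.
    return reinterpret_cast<void*>(raw + static_cast<std::uintptr_t>(offset));
}

// Small self-check of the helper.
inline void applyOffsetDemo() {
    char buffer[16] = {};
    assert(applyOffset(buffer, 4) == buffer + 4);
    assert(applyOffset(buffer + 8, -8) == buffer);
}
```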

comment:15 Changed 7 years ago by muks

  • Owner changed from UnAssigned to vorner

Hi Michal

This basically looks good, but please rename the RBTREE variable names to OFFSET_PTR as the bug is in the offset_ptr implementation.

I'd also change the message to say what the exact problem is. Something like:

-This is known to cause problems under certain situations.
+Older versions of boost have a bug which causes segfaults in the offset_ptr implementation when compiled using GCC with optimization enabled. See ticket #3025.

comment:16 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

How about now?

comment:17 Changed 7 years ago by muks

  • Owner changed from muks to vorner

I've pushed a couple of minor commits to the branch. If these are ok, please go ahead and merge.

comment:18 Changed 7 years ago by vorner

  • Resolution set to fixed
  • Status changed from reviewing to closed

Thank you
