Opened 8 years ago

Closed 8 years ago

#1727 closed defect (fixed)

IntervalTimerTest.invalidArgumentToIntervalTimer failure

Reported by: jreed Owned by: jinmei
Priority: medium Milestone: Sprint-20120403
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: Core Feature Depending on Ticket:
Estimated Difficulty: 7 Add Hours to Ticket: 0
Total Hours: 10.17 Internal?: no

Description

asiolink test failed:

[----------] 5 tests from IntervalTimerTest
[ RUN      ] IntervalTimerTest.invalidArgumentToIntervalTimer
Segmentation fault (core dumped) 
FAIL: run_unittests
Loaded symbols for /usr/libexec/ld.so
#0  asio::detail::service_registry::create<asio::deadline_timer_service<boost::posix_time::ptime, asio::time_traits<boost::posix_time::ptime> > > (owner=Variable "owner" is not available.
)
    at task_io_service.ipp:325
325     task_io_service.ipp: No such file or directory.
        in task_io_service.ipp
(gdb) bt
#0  asio::detail::service_registry::create<asio::deadline_timer_service<boost::posix_time::ptime, asio::time_traits<boost::posix_time::ptime> > > (owner=Variable "owner" is not available.
)
    at task_io_service.ipp:325
#1  0x000000020aa62e73 in asio::detail::service_registry::use_service<asio::deadline_timer_service<boost::posix_time::ptime, asio::time_traits<boost::posix_time::ptime> > > (this=0x20ce0f320) at service_registry.ipp:98
#2  0x000000020aa5a75a in IntervalTimerImpl (this=0x20afe3d00, io_service=Variable "io_service" is not available.
)
    at io_service.hpp:31
#3  0x000000020aa5a8e6 in IntervalTimer (this=0x7f7ffffc5570, 
    io_service=@0x20ce0f2f0) at interval_timer.cc:129
#4  0x0000000000421073 in IntervalTimerTest_invalidArgumentToIntervalTimer_Test::TestBody (this=0x20ce0f2e0) at interval_timer_unittest.cc:148
#5  0x000000020db74df2 in testing::Test::Run ()
   from /usr/local/lib/libgtest.so.0.0
#6  0x000000020db74eff in testing::internal::TestInfoImpl::Run ()
   from /usr/local/lib/libgtest.so.0.0
#7  0x000000020db74fc5 in testing::TestCase::Run ()
   from /usr/local/lib/libgtest.so.0.0
#8  0x000000020db76f3c in testing::internal::UnitTestImpl::RunAllTests ()
   from /usr/local/lib/libgtest.so.0.0
#9  0x0000000000413254 in main (argc=1, argv=Variable "argv" is not available.
) at run_unittests.cc:24

This is on OpenBSD/amd64 5.0.

I didn't research further.

Subtickets

Change History (12)

comment:1 Changed 8 years ago by jreed

Also fallts in IntervalTimerTest?.startIntervalTimer, IntervalTimerTest?.destruct, IntervalTimerIntervalTimerTest?.cancel

But IntervalTimerTest?.overwriteIntervalTimer just hangs.

TCPSocket.processReceivedData has seg fault. TCPSocket.SequenceTest? hangs too. And UDPSocket.processReceivedData has segmentation fault. I stop there for now.

comment:2 Changed 8 years ago by shane

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:3 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 7

comment:4 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120320

comment:5 Changed 8 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:6 Changed 8 years ago by jinmei

trac1727 is ready for review.

The change itself is simple: if Boost used for building BIND 10 tries
to use threads by default, i.e., as a result of including
boost/config.hpp, make that sure by explicitly defining
BOOST_DISABLE_THREADS for the entire build environment.

Why we need this for this problem is very complicated (described
below). But first, this is the proposed changelog:

405.?	[bug]		jinmei
	Make sure disabling Boost threads if the default configuration is
	to disable it for the system.  This fixes a crash and hang up
	problem on OpenBSD, where the use of Boost thread could be
	different in different program files depending on the order of
	including various header files, and could introduce inconsistent
	states between a library and a program.  Explicitly forcing the
	original default throughout the BIND 10 build environment will
	prevent this from happening.
	(Trac #1727, git TBD)

Then here is the reason:

For the combination of BSD and g++, Boost apparently uses threads by
default due to this setting in boost/config/gcc.hpp:

#if !defined(__MINGW32__) && !defined(linux) && !defined(__linux) && !defined(__linux__)
# define BOOST_HAS_THREADS
#endif 

But the OpenBSD package version of Boost skips this configuration by
skipping importing system configurations:

#ifndef BOOST_NO_CONFIG
#  define BOOST_NO_CONFIG
#endif

(see boost/config/user.hpp)

So, normally BOOST_HAS_THREADS shouldn't be defined for OpenBSD. But
this deviant OS introduces another hack to boost/config/suffix.hpp:

 #if (defined(__MT__) || defined(_MT) || defined(_REENTRANT) \
-    || defined(_PTHREADS) || defined(__APPLE__) || defined(__DragonFly__)) \
+    || defined(_PTHREADS) || defined(_POSIX_THREADS) \
+    || defined(__APPLE__) || defined(__DragonFly__)) \
     && !defined(BOOST_HAS_THREADS)
 #  define BOOST_HAS_THREADS
 #endif

And _POSIX_THREADS is defined in /usr/includes/pthread.h. So, if
some preceding header before boost/config.hpp (which would be included
as a result of any boost header files or asio.hpp) includes pthread.h
(either directly or indirectly from other header files), that
particular translation unit (.cc) has BOOST_HAS_THREADS defined.

The latter case happens for, e.g.,
asiolink/tests/udp_socket_unittest.cc (it seems to be a result of
including <string>). With or without the existence of
BOOST_HAS_THREADS, one key class called task_io_service of ASIO will
differ:

class task_io_service
  : public asio::detail::service_base<task_io_service>
{
public:
...
  // The count of unfinished work.
  boost::detail::atomic_count outstanding_work_;
...
};

This atomic_count is long (8 bytes in many 64-bit systems) when
BOOST_HAS_THREADS isn't defined, while it's a structure holding an int
(normally 4 bytes) if BOOST_HAS_THREADS is defined.

Within libasiolink, BOOST_HAS_THREADS doesn't seem to be defined, and
task_io_service is instantiated in a function
service_registry::create():

template <typename Service>
asio::io_service::service* service_registry::create(
    asio::io_service& owner)
{
  return new Service(owner);
}

On the other hand, the test code itself includes asio.hpp and uses
some of the definitions, and some functions use task_io_service class
objects directly. But the test code does not use all of the ASIO
definitions (service_registry is one of them), so some of the
functions still only exist within libasiolink and they are used.
As a result, in a single program two different interpretations of
the size of task_io_service (or of its internal definition in general)
coexist, and caused various type of strange failures - sometimes
crash, sometimes hang up.

So, to solve this problem we need to make sure at least the status of
BOOST_HAS_THREADS is consistent throughout BIND 10. I considered
several possibilities, and concluded this one was probably least ugly
(in any case, the fix would be a hack). With this approach we could
at least be independent from specific OS, and we can also enable this
when we really need to use boost threads.

comment:7 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:8 Changed 8 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:9 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Both the explanation and the code makes sense. I can't test it directly, but I believe you did, so I think this can be merged. But I could have a nasty comment on how complicated the stuff with preprocessor and several headers might get :-P.

comment:10 Changed 8 years ago by vorner

  • Total Hours changed from 0 to 0.42

comment:11 in reply to: ↑ 9 Changed 8 years ago by jinmei

Replying to vorner:

Both the explanation and the code makes sense. I can't test it directly, but I believe you did, so I think this can be merged.

Thanks for the review, merged, and closing.

But I could have a nasty comment on how complicated the stuff with preprocessor and several headers might get :-P.

(If I understand this correctly) yeah I know, but basically it's not
the problem inside our code, so as long as we use these external
libraries we need some kind of hack. A bit more seriously, one
possible lesson we can learn from this is that it's generally better
to limit the direct reference to external definitions to a narrower
places. Especially for things like Boost or ASIO, where many things
are defined in header files can be easily accidentally instantiated,
it's safer to hide them within the places exactly where they are needed.
(This is one background reason for having the asiolink wrapper layer).

comment:12 Changed 8 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 0.42 to 10.17
Note: See TracTickets for help on using tickets.