Opened 7 years ago

Closed 7 years ago

#2024 closed defect (fixed)

Segfault in RecursiveQueryTest.v6UDPSendSpecific on Solaris (SPARC)

Reported by: muks Owned by: muks
Priority: medium Milestone: Sprint-20120612
Component: resolver Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0.5 Internal?: no

Description

RecursiveQueryTest?.v6UDPSendSpecific segfaults consistently on Solaris on SPARC. No other platforms are affected.

See: http://git.bind10.isc.org/~tester/builder//BIND10/20120606052001-Solaris10-sparc-GCC/logs/unittests.out

The issue seems to have started after the #1704 merge, but the issue is most likely unrelated (#1704 was probably a catalyst in it showing up).

jinmei made a backtrace:

(gdb) bt
#0  boost::detail::sp_counted_impl_p<asio::basic_datagram_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> > >::dispose (this=0xd)
    at /usr/sfw/lib/gcc/sparc-sun-solaris2.10/3.4.3/../../../../include/c++/3.4.3/bits/stl_list.h:134
#1  0x00074504 in ~shared_count (this=0x105728)
    at /udir/jinmei/src/boost_1_46_1/boost/smart_ptr/detail/sp_counted_base_nt.hpp:79
#2  0x7f9d71a0 in boost::detail::sp_counted_impl_p<isc::asiodns::UDPServer::Data>::dispose (this=0x154888) at udp_server.cc:67
#3  0x00074504 in ~shared_count (this=0x154888)
    at /udir/jinmei/src/boost_1_46_1/boost/smart_ptr/detail/sp_counted_base_nt.hpp:79
#4  0x7f9d2164 in ~UDPServer (this=0x113708) at udp_server.h:77
#5  0x7f9b3230 in boost::detail::sp_counted_impl_p<isc::asiodns::UDPServer>::dispose (this=0xd54c8)
    at /udir/jinmei/src/boost_1_46_1/boost/checked_delete.hpp:34
#6  0x7f9a2754 in ~DNSService (this=0x1057b8)
    at /udir/jinmei/src/boost_1_46_1/boost/smart_ptr/detail/sp_counted_base_nt.hpp:79
#7  0x0006c868 in (anonymous namespace)::RecursiveQueryTest::setDNSService (
    this=0x113c30)
    at /udir/jinmei/src/boost_1_46_1/boost/smart_ptr/scoped_ptr.hpp:113

Valgrind reports the same stack with an invalid read error (among other issues):

==11170== Invalid read of size 8
==11170==    at 0x5545A97: boost::detail::sp_counted_impl_p<asio::basic_socket_acceptor<asio::ip::tcp, asio::socket_acceptor_service<asio::ip::tcp> > >::dispose() (reactive_socket_service_base.ipp:52)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x5544A27: isc::asiodns::TCPServer::~TCPServer() (shared_ptr.hpp:164)
==11170==    by 0x5544B78: isc::asiodns::TCPServer::~TCPServer() (tcp_server.h:38)
==11170==    by 0x553F8AE: isc::asiodns::DNSService::~DNSService() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x553F948: isc::asiodns::DNSService::~DNSService() (dns_service.cc:74)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170==  Address 0x6a4dea8 is 40 bytes inside a block of size 48 free'd
==11170==    at 0x4A072BC: operator delete(void*) (vg_replace_malloc.c:387)
==11170==    by 0x530E88C: isc::asiolink::IOService::~IOService() (service_registry.ipp:76)
==11170==    by 0x427822: (anonymous namespace)::RecursiveQueryTest::setDNSService() (checked_delete.hpp:34)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170== 
==11170== Invalid read of size 1
==11170==    at 0x5545AA4: boost::detail::sp_counted_impl_p<asio::basic_socket_acceptor<asio::ip::tcp, asio::socket_acceptor_service<asio::ip::tcp> > >::dispose() (epoll_reactor.ipp:196)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x5544A27: isc::asiodns::TCPServer::~TCPServer() (shared_ptr.hpp:164)
==11170==    by 0x5544B78: isc::asiodns::TCPServer::~TCPServer() (tcp_server.h:38)
==11170==    by 0x553F8AE: isc::asiodns::DNSService::~DNSService() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x553F948: isc::asiodns::DNSService::~DNSService() (dns_service.cc:74)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170==  Address 0x6a4e038 is 56 bytes inside a block of size 80 free'd
==11170==    at 0x4A072BC: operator delete(void*) (vg_replace_malloc.c:387)
==11170==    by 0x43CA3C: asio::detail::object_pool<asio::detail::epoll_reactor::descriptor_state>::~object_pool() (object_pool.hpp:40)
==11170==    by 0x43CDDC: asio::detail::epoll_reactor::~epoll_reactor() (epoll_reactor.ipp:66)
==11170==    by 0x530E88C: isc::asiolink::IOService::~IOService() (service_registry.ipp:76)
==11170==    by 0x427822: (anonymous namespace)::RecursiveQueryTest::setDNSService() (checked_delete.hpp:34)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170== 
==11170== Invalid read of size 8
==11170==    at 0x554D8C7: boost::detail::sp_counted_impl_p<asio::basic_datagram_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> > >::dispose() (reactive_socket_service_base.ipp:52)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x554DEB0: boost::detail::sp_counted_impl_p<isc::asiodns::UDPServer::Data>::dispose() (shared_ptr.hpp:164)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x554DDEF: isc::asiodns::UDPServer::~UDPServer() (shared_ptr.hpp:164)
==11170==    by 0x553F8AE: isc::asiodns::DNSService::~DNSService() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x553F948: isc::asiodns::DNSService::~DNSService() (dns_service.cc:74)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==  Address 0x6a4f3d8 is 40 bytes inside a block of size 48 free'd
==11170==    at 0x4A072BC: operator delete(void*) (vg_replace_malloc.c:387)
==11170==    by 0x530E88C: isc::asiolink::IOService::~IOService() (service_registry.ipp:76)
==11170==    by 0x427822: (anonymous namespace)::RecursiveQueryTest::setDNSService() (checked_delete.hpp:34)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170== 
==11170== Invalid read of size 1
==11170==    at 0x554D8D4: boost::detail::sp_counted_impl_p<asio::basic_datagram_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> > >::dispose() (epoll_reactor.ipp:196)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x554DEB0: boost::detail::sp_counted_impl_p<isc::asiodns::UDPServer::Data>::dispose() (shared_ptr.hpp:164)
==11170==    by 0x414BE8: boost::detail::shared_count::~shared_count() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x554DDEF: isc::asiodns::UDPServer::~UDPServer() (shared_ptr.hpp:164)
==11170==    by 0x553F8AE: isc::asiodns::DNSService::~DNSService() (sp_counted_base_gcc_x86.hpp:145)
==11170==    by 0x553F948: isc::asiodns::DNSService::~DNSService() (dns_service.cc:74)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==  Address 0x6a4f4b8 is 56 bytes inside a block of size 80 free'd
==11170==    at 0x4A072BC: operator delete(void*) (vg_replace_malloc.c:387)
==11170==    by 0x43CA3C: asio::detail::object_pool<asio::detail::epoll_reactor::descriptor_state>::~object_pool() (object_pool.hpp:40)
==11170==    by 0x43CDDC: asio::detail::epoll_reactor::~epoll_reactor() (epoll_reactor.ipp:66)
==11170==    by 0x530E88C: isc::asiolink::IOService::~IOService() (service_registry.ipp:76)
==11170==    by 0x427822: (anonymous namespace)::RecursiveQueryTest::setDNSService() (checked_delete.hpp:34)
==11170==    by 0x42AA21: (anonymous namespace)::RecursiveQueryTest_v6UDPSendSpecific_Test::TestBody() (recursive_query_unittest.cc:307)
==11170==    by 0x4C78139: testing::Test::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78247: testing::internal::TestInfoImpl::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C78304: testing::TestCase::Run() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4C7863D: testing::internal::UnitTestImpl::RunAllTests() (in /usr/lib64/libgtest.so.0.0.0)
==11170==    by 0x4113E5: main (run_unittests.cc:26)
==11170== 

I'm assigning it to the current sprint.

Subtickets

Change History (12)

comment:1 Changed 7 years ago by muks

  • Owner set to muks
  • Status changed from new to assigned

I've got a fix for this. Running a full check.

comment:2 Changed 7 years ago by muks

  • Owner changed from muks to UnAssigned
  • Status changed from assigned to reviewing

Up for review.

comment:3 follow-up: Changed 7 years ago by jinmei

The patch itself at least doesn't seem to be harmful, but it's not
obvious to me what exactly happened in the original version and how
this change solves it. Could you elaborate please?

comment:4 Changed 7 years ago by jinmei

  • Owner changed from UnAssigned to muks

comment:5 in reply to: ↑ 3 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to jinmei

Replying to jinmei:

The patch itself at least doesn't seem to be harmful, but it's not
obvious to me what exactly happened in the original version and how
this change solves it. Could you elaborate please?

In the following code:

        io_service_.reset(new IOService());
        callback_.reset(new ASIOCallBack(this));
        dns_service_.reset(new DNSService(*io_service_, callback_.get(), NULL,
                                          NULL));

after the io_service_.reset(), the old IOService object is destroyed, but it is still referenced and used inside dns_service_ including during its destruction (two lines below).

Destroying dns_service_ first also causes this issue to go away.

+       dns_service_.reset(NULL);
        io_service_.reset(new IOService());
        callback_.reset(new ASIOCallBack(this));
        dns_service_.reset(new DNSService(*io_service_, callback_.get(), NULL,
                                          NULL));

comment:6 in reply to: ↑ 5 ; follow-up: Changed 7 years ago by jinmei

Replying to muks:

after the io_service_.reset(), the old IOService object is destroyed, but it is still referenced and used inside dns_service_ including during its destruction (two lines below).

Ah, okay, that makes sense. But in that case I guess we can make it
even safer and cleaner:

  • I guess we can now construct io_service_ in the constructor, and it probably doesn't even have to be a pointer.
  • I also guess we don't need to be able to callback_. Can't it be done in setUp()? (which should be created)

comment:7 Changed 7 years ago by jinmei

  • Owner changed from jinmei to muks

comment:8 in reply to: ↑ 6 Changed 7 years ago by muks

  • Owner changed from muks to jinmei

Replying to jinmei:

Ah, okay, that makes sense. But in that case I guess we can make it
even safer and cleaner:

  • I guess we can now construct io_service_ in the constructor, and it probably doesn't even have to be a pointer.

Done.

  • I also guess we don't need to be able to callback_. Can't it be done in setUp()? (which should be created)

Created SetUp() and moved it there. I've kept it as a scoped_ptr as it has to get deleted.

comment:9 follow-up: Changed 7 years ago by jinmei

The latest version looks almost okay, except one minor point. This
comment is not true any more.

    // We use a pointer for io_service_, because for some tests we
    // need to recreate a new one within one onstance of this class
    IOService io_service_;

Please fix it and merge. I don't think I need to review that.

comment:10 Changed 7 years ago by jinmei

  • Owner changed from jinmei to muks
  • Total Hours changed from 0 to 0.5

comment:11 in reply to: ↑ 9 Changed 7 years ago by muks

Replying to jinmei:

The latest version looks almost okay, except one minor point. This
comment is not true any more.

    // We use a pointer for io_service_, because for some tests we
    // need to recreate a new one within one onstance of this class
    IOService io_service_;

Please fix it and merge. I don't think I need to review that.

Eek! Missed that comment.

comment:12 Changed 7 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged to master:

* 6f0bcc6 [2024] Remove obsolete comment about io_service_
* 3eaafc4 [2024] Reset callback_ during SetUp() instead of in setDNSService()
* 4503850 [2024] Change io_service_ to not be a pointer anymore
* 4efe9ac [2024] Don't call setDNSService() multiple times

Resolving as fixed. Thank you for the reviews.

Note: See TracTickets for help on using tickets.