Opened 8 years ago

Closed 7 years ago

#2083 closed defect (duplicate)

Cmdctl shutdown hang, b10-resolver crash and cmdctl crash

Reported by: jreed Owned by:
Priority: medium Milestone:
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 10 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

I tried to restart Cmdctl to test a patch. Running:

$ echo Cmdctl shutdown | /home/reed/opt/bind10/bin/bindctl

Just hung. Also it hangs running it from the bindctl prompt.

Here is the logging (I removed the unrelated auth <-> stats logging):

2012-06-27 12:29:26.087 DEBUG [b10-cmdctl.cmdctl] CMDCTL_SEND_COMMAND sending command shutdown to module Cmdctl
...
2012-06-27 12:29:31.520 DEBUG [b10-boss.boss] BIND10_RECEIVED_COMMAND received command: show_processes
2012-06-27 12:29:31.528 DEBUG [b10-boss.boss] BIND10_RECEIVED_COMMAND received command: show_processes
...
2012-06-27 12:30:33.562 DEBUG [b10-boss.boss] BIND10_RECEIVED_COMMAND received command: show_processes
2012-06-27 12:30:33.569 DEBUG [b10-boss.boss] BIND10_RECEIVED_COMMAND received command: show_processes

I aborted bindctl and tested again; same hang. Then I tried again to see if shutdown worked for anything, by doing a "Resolver shutdown". That worked. But I got some weird logs:

2012-06-27 12:32:56.356 INFO  [b10-resolver.resolver] RESOLVER_SHUTDOWN resolver  shutdown complete
pure virtual method called
terminate called without an active exception
2012-06-27 12:32:56.818 INFO  [b10-boss.boss] BIND10_LOST_SOCKET_CONSUMER consumer 21 of sockets disconnected, considering all its sockets closed
2012-06-27 12:32:56.820 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 24058
 of b10-resolver ended with status 134
2012-06-27 12:32:56.821 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component 
b10-resolver (pid 24058) failed: process dumped core with exit status 134 (killed by signal 6: SIGABRT)

The "pure virtual method called" and "terminate called without an active exception" were on lines by themselves without any logging identification.

Here is the gdb backtrace for that core:

#0  0x00007f7ff1ae54aa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007f7ff1ae4e1e in abort () from /usr/lib/libc.so.12
#2  0x00007f7ff2abc4bc in __gnu_cxx::__verbose_terminate_handler ()
   from /usr/lib/libstdc++.so.7
#3  0x00007f7ff2ac05e7 in __cxxabiv1::__terminate ()
   from /usr/lib/libstdc++.so.7
#4  0x00007f7ff2ac061d in std::terminate () from /usr/lib/libstdc++.so.7
#5  0x00007f7ff2ab12ff in __cxa_pure_virtual () from /usr/lib/libstdc++.so.7
#6  0x00007f7ff540e974 in isc::log::LoggerImpl::outputRaw (
    this=0x7f7ff7b16100, severity=@0x6, message=@0x0)
    at ../../../src/lib/util/interprocess_sync.h:116
#7  0x000000000041daf1 in ~ResolverImpl (this=0x7f7ff7b09200)
    at ../../../src/lib/log/log_formatter.h:167
#8  0x000000000040fb5c in ~Resolver (this=0x7f7ff7b17680) at resolver.cc:373
#9  0x0000000000421d8f in __tcf_4 ()
    at /usr/pkg/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:145
#10 0x00007f7ff1ac645d in __cxa_finalize () from /usr/lib/libc.so.12
#11 0x00007f7ff1ac5e2a in exit () from /usr/lib/libc.so.12
#12 0x000000000040f361 in ___start ()
#13 0x00007f7ff7ffa000 in ?? ()
#14 0x0000000000000002 in ?? ()
#15 0x00007f7ffffffe2d in ?? ()
#16 0x0000000000000000 in ?? ()

bind10 restarted the resolver automatically. The next logging then was:

2012-06-27 12:32:56.833 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started
 b10-resolver (PID 6011)
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 55577)
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 55586)
----------------------------------------
----------------------------------------
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.1/socketserver.py", line 559, in process_request_th
read
    self.finish_request(request, client_address)
  File "/usr/pkg/lib/python3.1/socketserver.py", line 322, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/pkg/lib/python3.1/socketserver.py", line 615, in __init__
    self.finish_request(request, client_address)
  File "/usr/pkg/lib/python3.1/socketserver.py", line 322, in finish_request
    self.handle()
  File "/usr/pkg/lib/python3.1/http/server.py", line 367, in handle
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/pkg/lib/python3.1/socketserver.py", line 615, in __init__
    self.handle()
  File "/usr/pkg/lib/python3.1/http/server.py", line 367, in handle
    self.handle_one_request()
  File "/usr/pkg/lib/python3.1/http/server.py", line 361, in handle_one_request
    method()
  File "/home/reed/opt/bind10/libexec/bind10-devel/b10-cmdctl", line 152, in do_
POST
    self.handle_one_request()
  File "/usr/pkg/lib/python3.1/http/server.py", line 361, in handle_one_request
    self.send_response(rcode)
  File "/usr/pkg/lib/python3.1/http/server.py", line 419, in send_response
    self.send_header('Server', self.version_string())
  File "/usr/pkg/lib/python3.1/http/server.py", line 425, in send_header
    self.wfile.write(("%s: %s\r\n" % (keyword, value)).encode('ASCII', 'strict')
)
  File "/usr/pkg/lib/python3.1/socket.py", line 238, in write
    method()
  File "/home/reed/opt/bind10/libexec/bind10-devel/b10-cmdctl", line 152, in do_
POST
    self.send_response(rcode)
  File "/usr/pkg/lib/python3.1/http/server.py", line 419, in send_response
    return self._sock.send(b)
  File "/usr/pkg/lib/python3.1/ssl.py", line 221, in send
    self.send_header('Server', self.version_string())
  File "/usr/pkg/lib/python3.1/http/server.py", line 425, in send_header
    self.wfile.write(("%s: %s\r\n" % (keyword, value)).encode('ASCII', 'strict')
)
  File "/usr/pkg/lib/python3.1/socket.py", line 238, in write
    return self._sock.send(b)
  File "/usr/pkg/lib/python3.1/ssl.py", line 221, in send
    v = self._sslobj.write(data)
socket.error: [Errno 32] Broken pipe
    v = self._sslobj.write(data)
socket.error: [Errno 32] Broken pipe
2012-06-27 12:32:56.892 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 19275
 of b10-cmdctl ended with status 0
2012-06-27 12:32:56.892 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component 
b10-cmdctl (pid 19275) failed: process exited normally with exit status 0
2012-06-27 12:32:56.893 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b
10-cmdctl is starting

So finally cmdctl did stop (and restart).

If needed we can split this into multiple tickets.

Subtickets

Change History (5)

comment:1 Changed 8 years ago by jinmei

"pure virtual method called" sounds very weird. In theory that kind
of bug should have been caught at compile time. Maybe some tricky
shared-library versioning issue?

I also suspect it's quite likely to be system dependent. Please
specify the system details.

comment:2 Changed 8 years ago by shane

  • Estimated Difficulty changed from 0 to 10

comment:3 Changed 8 years ago by shane

  • Milestone New Tasks deleted

comment:4 Changed 7 years ago by jinmei

The cmdctl problem should have been solved by #2712.

The issue of b10-resolver should be completely different from cmdctl.

I'm going to close this ticket. If the resolver thing is still a
problem, please file a new report focusing on it.

comment:5 Changed 7 years ago by jinmei

  • Resolution set to duplicate
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.