Opened 8 years ago

Closed 6 years ago

#1672 closed defect (wontfix)

b10-resolver crashing with status 11 when compiled with --enable-boost-threads

Reported by: Jeffry A. Spain Owned by:
Priority: low Milestone: DNS Outstanding Tasks
Component: resolver Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: Low
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 7 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

In my most recent build of bind10-devel-20110119, built according to http://bind10.isc.org/wiki/SystemNotesUbuntuOneiric, b10-resolver is failing after being given the query 'dig @localhost jaspain.net a', or any query for that matter. Most of the installed prerequisites are newer versions than those in the standard Ubuntu packages that I have used previously with success, so I will try to figure out which one may be causing the problem. Based on the following log excerpts, if you have a sense of where the problem lies, would you please let me know. Thanks.

The following is log from bind10-users mailing list. https://lists.isc.org/pipermail/bind10-users/2012-February/000216.html
Apparently any query caused resolver to crash.

2012-02-12 23:00:20.613 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_ACCEPTED query accepted: 'jaspain.net./A/IN' from 127.0.0.1#37965
2012-02-12 23:00:20.613 DEBUG [b10-resolver.resolver] RESOLVER_NORMAL_QUERY processing normal query
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <jaspain.net. IN A> (resolve() instance 2)
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_RECQ_CACHE_NO_FIND did not find <jaspain.net. IN A> in the cache, starting RunningQuery (resolve() instance 2)
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_NSAS_LOOKUP looking up nameserver for zone . in the NSAS
2012-02-12 23:00:20.613 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone .
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <. IN NS> (resolve() instance 1)
2012-02-12 23:00:20.613 DEBUG [b10-resolver.nsas] NSAS_FIND_NS_ADDRESS asking resolver to obtain A and AAAA records for l.root-servers.net.
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN A> (resolve() instance 1)
2012-02-12 23:00:20.613 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN AAAA> (resolve() instance 1)
2012-02-12 23:00:20.676 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone net.
2012-02-12 23:00:20.676 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <net. IN NS> (resolve() instance 1)
2012-02-12 23:00:20.678 INFO  [b10-boss.boss] BIND10_LOST_SOCKET_CONSUMER consumer 20 of sockets disconnected, considering all its sockets closed
2012-02-12 23:00:20.678 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 1855 of b10-resolver ended with status 11
2012-02-12 23:00:20.678 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component b10-resolver (pid 1855) failed with 11 exit status
2012-02-12 23:00:20.678 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-resolver is starting
2012-02-12 23:00:20.678 WARN  [b10-boss.boss] BIND10_START_AS_NON_ROOT_RESOLVER starting b10-resolver as a user, not root. This might fail.
2012-02-12 23:00:20.678 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-resolver
2012-02-12 23:00:20.685 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-resolver (PID 1867)

Subtickets

Attachments (2)

core.tar.xz (321.7 KB) - added by jaspain 8 years ago.
Core dumps, compressed with "tar cJf"
b10-resolver (2.1 MB) - added by jaspain 8 years ago.
Executable binary associated with core dumps.

Change History (14)

comment:1 follow-up: Changed 8 years ago by jreed

I created a ticket for this report. http://bind10.isc.org/ticket/1672

I assume exit status 11 is EAGAIN or EWOULDBLOCK on this Ubuntu Linux system.

Note it may be related to that last log message which is documented:

http://bind10.isc.org/docs/bind10-messages.html#BIND10_START_AS_NON_ROOT_RESOLVER
The resolver is being started or restarted without root privileges.
If the module needs these privileges, it may have problems starting.
Note that this issue should be resolved by the pending 'socket-creator'
process; once that has been implemented, modules should not need root
privileges anymore. See tickets #800 and #801 for more information.

But #801 and #802 are closed.

comment:2 Changed 8 years ago by jaspain

I previously thought that building the bind10 prerequisites from source rather than installing Ubuntu packages might be part of the problem. Now I am convinced it is not. I rebuilt bind10-devel-20120119 installing prerequisites using

apt-get install libboost-all-dev libbotan1.8-dev libsqlite3-dev python3-dev

Since there is no Ubuntu package for log4cplus-1.0.4, I built that from source.
With this build I reproduced the problem, running bind10 both as root and as user bind.
I used the following configuration:

config remove Boss/components b10-xfrout
config remove Boss/components b10-xfrin
config remove Boss/components b10-auth
config add Boss/components b10-resolver
config set Boss/components/b10-resolver/special resolver
config set Boss/components/b10-resolver/kind needed
config set Boss/components/b10-resolver/priority 10
config commit

config add Logging/loggers
config set Logging/loggers[0]/name *
config set Logging/loggers[0]/severity DEBUG
config set Logging/loggers[0]/debuglevel 40
config add Logging/loggers[0]/output_options
config set Logging/loggers[0]/output_options[0]/destination file
config set Logging/loggers[0]/output_options[0]/output /var/log/bind10.log
config set Logging/loggers[0]/output_options[0]/maxsize 1048576
config set Logging/loggers[0]/output_options[0]/maxver 16
config set Logging/loggers[0]/output_options[0]/flush true
config show all Logging/loggers
config commit

The following log entries were generated running bind10 as user root. Note in particular the warning RESOLVER_NO_ROOT_ADDRESS no root addresses available and the error BIND10_COMPONENT_FAILED component b10-resolver (pid 12939) failed with 11 exit status, which is the same exit code 11 as when running b10-resolver as user bind.

2012-02-13 21:51:45.517 DEBUG [b10-boss.boss] BIND10_STARTED_CC started configuration/command session
2012-02-13 21:51:45.517 INFO  [b10-boss.boss] BIND10_READING_BOSS_CONFIGURATION reading boss configuration
2012-02-13 21:51:45.517 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_RECONFIGURE reconfiguring running components
2012-02-13 21:51:45.517 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_BUILD building plan '{'sockcreator': ({'priority': 200, 'kind': 'core', 'special': 'sockcreator'}, <isc.bind10.special_component.SockCreator object at 0x1800e50>), 'msgq': ({'priority': 199, 'kind': 'core', 'special': 'msgq'}, <isc.bind10.special_component.Msgq object at 0x1800f10>), 'cfgmgr': ({'priority': 198, 'kind': 'core', 'special': 'cfgmgr'}, <isc.bind10.special_component.CfgMgr object at 0x1800ed0>)}' -> '{'sockcreator': {'priority': 200, 'kind': 'core', 'special': 'sockcreator'}, 'msgq': {'priority': 199, 'kind': 'core', 'special': 'msgq'}, 'b10-cmdctl': {'kind': 'needed', 'special': 'cmdctl'}, 'cfgmgr': {'priority': 198, 'kind': 'core', 'special': 'cfgmgr'}, 'b10-zonemgr': {'kind': 'dispensable', 'address': 'Zonemgr'}, 'b10-stats': {'kind': 'dispensable', 'address': 'Stats'}, 'b10-resolver': {'priority': 10, 'kind': 'needed', 'special': 'resolver'}, 'b10-stats-httpd': {'kind': 'dispensable', 'address': 'StatsHttpd'}}'
2012-02-13 21:51:45.517 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_RUN running plan of 5 tasks
2012-02-13 21:51:45.517 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-resolver is starting
2012-02-13 21:51:45.517 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-resolver
2012-02-13 21:51:45.530 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-resolver (PID 12939)
2012-02-13 21:51:45.531 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-cmdctl is starting
2012-02-13 21:51:45.531 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-cmdctl
2012-02-13 21:51:45.536 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-cmdctl (PID 12940)
2012-02-13 21:51:45.536 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-zonemgr is starting
2012-02-13 21:51:45.536 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-zonemgr
2012-02-13 21:51:45.549 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-zonemgr (PID 12941)
2012-02-13 21:51:45.549 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-stats is starting
2012-02-13 21:51:45.549 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-stats
2012-02-13 21:51:45.552 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_CHANNEL configuration channel created
2012-02-13 21:51:45.552 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_UPDATED configuration updated: { "forward_addresses": [  ], "listen_on": [ { "address": "::1", "port": 53 }, { "address": "127.0.0.1", "port": 53 } ], "query_acl": [ { "action": "ACCEPT", "from": "127.0.0.1" }, { "action": "ACCEPT", "from": "::1" } ], "retries": 3, "root_addresses": [  ], "timeout_client": 4000, "timeout_lookup": 30000, "timeout_query": 2000 }
2012-02-13 21:51:45.552 INFO  [b10-resolver.resolver] RESOLVER_RECURSIVE running in recursive mode
2012-02-13 21:51:45.552 WARN  [b10-resolver.resolver] RESOLVER_NO_ROOT_ADDRESS no root addresses available
2012-02-13 21:51:45.552 DEBUG [b10-resolver.resolver] RESOLVER_SET_PARAMS query timeout: 2000, client timeout: 4000, lookup timeout: 30000, retry count: 3
2012-02-13 21:51:45.552 INFO  [b10-resolver.resolver] RESOLVER_SET_QUERY_ACL query ACL is configured
2012-02-13 21:51:45.552 DEBUG [b10-resolver.server_common] SRVCOMM_SET_LISTEN setting addresses to listen to
2012-02-13 21:51:45.555 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-stats (PID 12942)
2012-02-13 21:51:45.555 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-stats-httpd is starting
2012-02-13 21:51:45.555 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-stats-httpd
2012-02-13 21:51:45.569 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-stats-httpd (PID 12943)
2012-02-13 21:51:45.570 INFO  [b10-boss.boss] BIND10_STARTUP_COMPLETE BIND 10 started
2012-02-13 21:51:45.581 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type TCP from the creator
2012-02-13 21:51:45.602 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 18
2012-02-13 21:51:45.603 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 9
2012-02-13 21:51:45.604 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type UDP from the creator
2012-02-13 21:51:45.604 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 20
2012-02-13 21:51:45.605 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 10
2012-02-13 21:51:45.605 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type TCP from the creator
2012-02-13 21:51:45.606 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 21
2012-02-13 21:51:45.606 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 11
2012-02-13 21:51:45.607 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type UDP from the creator
2012-02-13 21:51:45.607 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 22
2012-02-13 21:51:45.608 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 12
2012-02-13 21:51:45.608 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_SETUP query setup
2012-02-13 21:51:45.608 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_LOADED configuration loaded
2012-02-13 21:51:45.608 INFO  [b10-resolver.resolver] RESOLVER_STARTED resolver started
2012-02-13 21:51:45.807 DEBUG [b10-stats.stats] STATS_RECEIVED_NEW_CONFIG received new configuration: {'version': 2}
2012-02-13 21:51:45.807 INFO  [b10-stats.stats] STATS_STARTING starting
2012-02-13 21:51:45.807 DEBUG [b10-stats.stats] STATS_SEND_REQUEST_BOSS requesting boss to send statistics
2012-02-13 21:51:45.840 DEBUG [b10-cmdctl.cmdctl] CMDCTL_SEND_COMMAND sending command get_module_spec to module ConfigManager
2012-02-13 21:51:45.842 DEBUG [b10-cmdctl.cmdctl] CMDCTL_COMMAND_SENT command 'get_module_spec' to module 'ConfigManager' was sent
2012-02-13 21:51:45.842 DEBUG [b10-cmdctl.cmdctl] CMDCTL_SEND_COMMAND sending command get_config to module ConfigManager
2012-02-13 21:51:45.843 DEBUG [b10-cmdctl.cmdctl] CMDCTL_COMMAND_SENT command 'get_config' to module 'ConfigManager' was sent
2012-02-13 21:51:45.852 DEBUG [b10-stats-httpd.stats-httpd] STATHTTPD_HANDLE_CONFIG reading configuration: {'version': 2}
2012-02-13 21:51:45.859 INFO  [b10-stats-httpd.stats-httpd] STATHTTPD_STARTED listening on 127.0.0.1#8000
2012-02-13 21:52:08.715 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_ACCEPTED query accepted: 'jaspain.net./A/IN' from 127.0.0.1#55753
2012-02-13 21:52:08.715 DEBUG [b10-resolver.resolver] RESOLVER_NORMAL_QUERY processing normal query
2012-02-13 21:52:08.715 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <jaspain.net. IN A> (resolve() instance 2)
2012-02-13 21:52:08.716 DEBUG [b10-resolver.reslib] RESLIB_RECQ_CACHE_NO_FIND did not find <jaspain.net. IN A> in the cache, starting RunningQuery (resolve() instance 2)
2012-02-13 21:52:08.716 DEBUG [b10-resolver.reslib] RESLIB_NSAS_LOOKUP looking up nameserver for zone . in the NSAS
2012-02-13 21:52:08.716 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone .
2012-02-13 21:52:08.716 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <. IN NS> (resolve() instance 1)
2012-02-13 21:52:08.716 DEBUG [b10-resolver.nsas] NSAS_FIND_NS_ADDRESS asking resolver to obtain A and AAAA records for l.root-servers.net.
2012-02-13 21:52:08.716 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN A> (resolve() instance 1)
2012-02-13 21:52:08.716 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN AAAA> (resolve() instance 1)
2012-02-13 21:52:08.779 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone net.
2012-02-13 21:52:08.779 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <net. IN NS> (resolve() instance 1)
2012-02-13 21:52:08.782 INFO  [b10-boss.boss] BIND10_LOST_SOCKET_CONSUMER consumer 19 of sockets disconnected, considering all its sockets closed
2012-02-13 21:52:08.782 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12939 of b10-resolver ended with status 11
2012-02-13 21:52:08.782 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component b10-resolver (pid 12939) failed with 11 exit status
2012-02-13 21:52:08.782 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-resolver is starting
2012-02-13 21:52:08.783 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-resolver
2012-02-13 21:52:08.788 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-resolver (PID 12950)
2012-02-13 21:52:08.817 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_CHANNEL configuration channel created
2012-02-13 21:52:08.817 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_UPDATED configuration updated: { "forward_addresses": [  ], "listen_on": [ { "address": "::1", "port": 53 }, { "address": "127.0.0.1", "port": 53 } ], "query_acl": [ { "action": "ACCEPT", "from": "127.0.0.1" }, { "action": "ACCEPT", "from": "::1" } ], "retries": 3, "root_addresses": [  ], "timeout_client": 4000, "timeout_lookup": 30000, "timeout_query": 2000 }
2012-02-13 21:52:08.817 INFO  [b10-resolver.resolver] RESOLVER_RECURSIVE running in recursive mode
2012-02-13 21:52:08.817 WARN  [b10-resolver.resolver] RESOLVER_NO_ROOT_ADDRESS no root addresses available
2012-02-13 21:52:08.817 DEBUG [b10-resolver.resolver] RESOLVER_SET_PARAMS query timeout: 2000, client timeout: 4000, lookup timeout: 30000, retry count: 3
2012-02-13 21:52:08.817 INFO  [b10-resolver.resolver] RESOLVER_SET_QUERY_ACL query ACL is configured
2012-02-13 21:52:08.817 DEBUG [b10-resolver.server_common] SRVCOMM_SET_LISTEN setting addresses to listen to
2012-02-13 21:52:08.819 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type TCP from the creator
2012-02-13 21:52:08.819 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 13
2012-02-13 21:52:08.821 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 9
2012-02-13 21:52:08.821 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type UDP from the creator
2012-02-13 21:52:08.821 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 20
2012-02-13 21:52:08.822 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 10
2012-02-13 21:52:08.823 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type TCP from the creator
2012-02-13 21:52:08.823 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 21
2012-02-13 21:52:08.823 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 11
2012-02-13 21:52:08.824 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type UDP from the creator
2012-02-13 21:52:08.824 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 22
2012-02-13 21:52:08.825 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 12
2012-02-13 21:52:08.825 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_SETUP query setup
2012-02-13 21:52:08.825 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_LOADED configuration loaded
2012-02-13 21:52:08.825 INFO  [b10-resolver.resolver] RESOLVER_STARTED resolver started
2012-02-13 21:52:13.715 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_ACCEPTED query accepted: 'jaspain.net./A/IN' from 127.0.0.1#55753
2012-02-13 21:52:13.716 DEBUG [b10-resolver.resolver] RESOLVER_NORMAL_QUERY processing normal query
2012-02-13 21:52:13.717 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <jaspain.net. IN A> (resolve() instance 2)
2012-02-13 21:52:13.717 DEBUG [b10-resolver.reslib] RESLIB_RECQ_CACHE_NO_FIND did not find <jaspain.net. IN A> in the cache, starting RunningQuery (resolve() instance 2)
2012-02-13 21:52:13.717 DEBUG [b10-resolver.reslib] RESLIB_NSAS_LOOKUP looking up nameserver for zone . in the NSAS
2012-02-13 21:52:13.717 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone .
2012-02-13 21:52:13.718 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <. IN NS> (resolve() instance 1)
2012-02-13 21:52:13.718 DEBUG [b10-resolver.nsas] NSAS_FIND_NS_ADDRESS asking resolver to obtain A and AAAA records for l.root-servers.net.
2012-02-13 21:52:13.718 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN A> (resolve() instance 1)
2012-02-13 21:52:13.718 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <l.root-servers.net. IN AAAA> (resolve() instance 1)
2012-02-13 21:52:13.780 DEBUG [b10-resolver.nsas] NSAS_SEARCH_ZONE_NS searching NSAS for nameservers for zone net.
2012-02-13 21:52:13.780 DEBUG [b10-resolver.reslib] RESLIB_RESOLVE asked to resolve <net. IN NS> (resolve() instance 1)
2012-02-13 21:52:13.782 INFO  [b10-boss.boss] BIND10_LOST_SOCKET_CONSUMER consumer 18 of sockets disconnected, considering all its sockets closed
2012-02-13 21:52:13.783 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12950 of b10-resolver ended with status 11
2012-02-13 21:52:13.783 ERROR [b10-boss.boss] BIND10_COMPONENT_FAILED component b10-resolver (pid 12950) failed with 11 exit status
2012-02-13 21:52:18.788 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-resolver is starting
2012-02-13 21:52:18.788 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-resolver
2012-02-13 21:52:18.795 DEBUG [b10-boss.boss] BIND10_STARTED_PROCESS_PID started b10-resolver (PID 12951)
2012-02-13 21:52:18.825 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_CHANNEL configuration channel created
2012-02-13 21:52:18.825 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_UPDATED configuration updated: { "forward_addresses": [  ], "listen_on": [ { "address": "::1", "port": 53 }, { "address": "127.0.0.1", "port": 53 } ], "query_acl": [ { "action": "ACCEPT", "from": "127.0.0.1" }, { "action": "ACCEPT", "from": "::1" } ], "retries": 3, "root_addresses": [  ], "timeout_client": 4000, "timeout_lookup": 30000, "timeout_query": 2000 }
2012-02-13 21:52:18.825 INFO  [b10-resolver.resolver] RESOLVER_RECURSIVE running in recursive mode
2012-02-13 21:52:18.825 WARN  [b10-resolver.resolver] RESOLVER_NO_ROOT_ADDRESS no root addresses available
2012-02-13 21:52:18.825 DEBUG [b10-resolver.resolver] RESOLVER_SET_PARAMS query timeout: 2000, client timeout: 4000, lookup timeout: 30000, retry count: 3
2012-02-13 21:52:18.825 INFO  [b10-resolver.resolver] RESOLVER_SET_QUERY_ACL query ACL is configured
2012-02-13 21:52:18.825 DEBUG [b10-resolver.server_common] SRVCOMM_SET_LISTEN setting addresses to listen to
2012-02-13 21:52:18.827 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type TCP from the creator
2012-02-13 21:52:18.827 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 13
2012-02-13 21:52:18.829 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 9
2012-02-13 21:52:18.829 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type UDP from the creator
2012-02-13 21:52:18.829 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 20
2012-02-13 21:52:18.830 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 10
2012-02-13 21:52:18.831 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type TCP from the creator
2012-02-13 21:52:18.831 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 21
2012-02-13 21:52:18.831 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_TCP adding a new TCP server by opened fd 11
2012-02-13 21:52:18.832 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [127.0.0.1]:53 of type UDP from the creator
2012-02-13 21:52:18.832 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 22
2012-02-13 21:52:18.833 DEBUG [b10-resolver.asiodns] ASIODNS_FD_ADD_UDP adding a new UDP server by opened fd 12
2012-02-13 21:52:18.833 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_SETUP query setup
2012-02-13 21:52:18.833 DEBUG [b10-resolver.resolver] RESOLVER_CONFIG_LOADED configuration loaded
2012-02-13 21:52:18.833 INFO  [b10-resolver.resolver] RESOLVER_STARTED resolver started
2012-02-13 21:52:33.313 INFO  [b10-boss.boss] BIND10_RECEIVED_SIGNAL received signal SIGINT
2012-02-13 21:52:33.313 INFO  [b10-boss.boss] BIND10_SHUTDOWN stopping the server
2012-02-13 21:52:33.313 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_STOP bind10 component configurator is shutting down
2012-02-13 21:52:33.314 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_BUILD building plan '{'sockcreator': ({'priority': 200, 'kind': 'core', 'special': 'sockcreator'}, <isc.bind10.special_component.SockCreator object at 0x1800e50>), 'msgq': ({'priority': 199, 'kind': 'core', 'special': 'msgq'}, <isc.bind10.special_component.Msgq object at 0x1800f10>), 'b10-cmdctl': ({'kind': 'needed', 'special': 'cmdctl'}, <isc.bind10.special_component.CmdCtl object at 0x1895890>), 'cfgmgr': ({'priority': 198, 'kind': 'core', 'special': 'cfgmgr'}, <isc.bind10.special_component.CfgMgr object at 0x1800ed0>), 'b10-zonemgr': ({'kind': 'dispensable', 'address': 'Zonemgr'}, <isc.bind10.component.Component object at 0x17783d0>), 'b10-stats': ({'kind': 'dispensable', 'address': 'Stats'}, <isc.bind10.component.Component object at 0x18a1d10>), 'b10-resolver': ({'priority': 10, 'kind': 'needed', 'special': 'resolver'}, <isc.bind10.special_component.Resolver object at 0x18a1dd0>), 'b10-stats-httpd': ({'kind': 'dispensable', 'address': 'StatsHttpd'}, <isc.bind10.component.Component object at 0x18a1f90>)}' -> '{}'
2012-02-13 21:52:33.314 DEBUG [b10-boss.boss] BIND10_CONFIGURATOR_RUN running plan of 8 tasks
2012-02-13 21:52:33.314 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component Socket creator is being stopped
2012-02-13 21:52:33.314 INFO  [b10-boss.boss] BIND10_SOCKCREATOR_TERMINATE terminating socket creator
2012-02-13 21:52:33.314 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component msgq is being stopped
2012-02-13 21:52:33.314 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-cmdctl is being stopped
2012-02-13 21:52:33.314 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-cmdctl to shut down
2012-02-13 21:52:33.315 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component cfgmgr is being stopped
2012-02-13 21:52:33.315 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking cfgmgr to shut down
2012-02-13 21:52:33.322 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-zonemgr is being stopped
2012-02-13 21:52:33.323 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-zonemgr to shut down
2012-02-13 21:52:33.323 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-stats is being stopped
2012-02-13 21:52:33.323 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-stats to shut down
2012-02-13 21:52:33.323 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-resolver is being stopped
2012-02-13 21:52:33.323 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-resolver to shut down
2012-02-13 21:52:33.323 DEBUG [b10-zonemgr.zonemgr] ZONEMGR_RECEIVE_SHUTDOWN received SHUTDOWN command
2012-02-13 21:52:33.324 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-stats-httpd is being stopped
2012-02-13 21:52:33.324 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-stats-httpd to shut down
2012-02-13 21:52:33.324 INFO  [b10-stats.stats] STATS_RECEIVED_SHUTDOWN_COMMAND shutdown command received
2012-02-13 21:52:33.324 DEBUG [b10-stats-httpd.stats-httpd] STATHTTPD_RECEIVED_SHUTDOWN_COMMAND shutdown command received
2012-02-13 21:52:33.325 INFO  [b10-stats-httpd.stats-httpd] STATHTTPD_SHUTDOWN shutting down
2012-02-13 21:52:33.325 INFO  [b10-stats-httpd.stats-httpd] STATHTTPD_CLOSING closing 127.0.0.1#8000
2012-02-13 21:52:33.325 DEBUG [b10-stats-httpd.stats-httpd] STATHTTPD_CLOSING_CC_SESSION stopping cc session
2012-02-13 21:52:33.328 DEBUG [b10-zonemgr.zonemgr] ZONEMGR_SHUTDOWN zone manager has shut down
2012-02-13 21:52:33.331 DEBUG [b10-resolver.cache] CACHE_MESSAGES_DEINIT deinitialized message cache
2012-02-13 21:52:33.357 INFO  [b10-resolver.resolver] RESOLVER_SHUTDOWN resolver shutdown complete
2012-02-13 21:52:33.357 DEBUG [b10-resolver.resolver] RESOLVER_QUERY_SHUTDOWN query shutdown
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12936 of Socket creator ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12938 of cfgmgr ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12940 of b10-cmdctl ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12941 of b10-zonemgr ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12942 of b10-stats ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12943 of b10-stats-httpd ended with status 0
2012-02-13 21:52:34.332 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12951 of b10-resolver ended with status 0
2012-02-13 21:52:34.333 INFO  [b10-boss.boss] BIND10_SEND_SIGTERM sending SIGTERM to msgq (PID 12937)
2012-02-13 21:52:34.433 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 12937 of msgq ended with status 0
2012-02-13 21:52:34.433 INFO  [b10-boss.boss] BIND10_SHUTDOWN_COMPLETE all processes ended, shutdown complete

comment:3 in reply to: ↑ 1 Changed 8 years ago by vorner

Hello

Replying to jreed:

I assume exit status 11 is EAGAIN or EWOULDBLOCK on this Ubuntu Linux system.

No. This is the exit code/return status/$? of the process. That has nothing to do with errno codes. I wrote a small program to decode the value, assuming that ubuntu doesn't have different status encoding from gentoo:

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>

int main(int argc, const char ** argv) {
        printf("Exited: %d, Exit status: %d, Signaled: %d, Signal: %d\n",
                        WIFEXITED(11), WEXITSTATUS(11),
                        WIFSIGNALED(11), WTERMSIG(11));
        return 0;
}

The output is:
Exited: 0, Exit status: 0, Signaled: 1, Signal: 11

So, the problem is SEGFAULT somewhere in the code. We might want to add the decoding of the exit code into the boss sometime, the number is not really helpful.

Jaspain, would it be too much to ask if you happen to have core dumps enabled? A backtrace from gdb could be very helpful (KDE has a really useful thingie, if a program crashes, it tries to get the backtrace automatically and detects if it could be useful and asks the user to send it, we probably want to think about some simple way for this as well).

Thank you

comment:4 Changed 8 years ago by jaspain

After some trial and error, I have demonstrated that the cause of this error is due to configuring the bind10 installation with the option --enable-boost-threads. Without this option, b10-resolver runs fine, whether running as root or as user bind, and whether the prerequisites are installed from Ubuntu packages or from downloaded source. I enabled core dumps and am attaching core.tar.xz to this ticket. It contains two files, core-b10-resolver-11-0-0-1774-1329267846 and core-b10-resolver-11-0-0-1787-1329267851. The naming convention is core-<executable name>-<signal>-<UID>-<GID>-<PID>-<Unix time>. There are two core dumps because b10-boss restarted b10-resolver once after the first failure, and it failed again. I have the two configurations, with and without --enable-boost-threads, saved as virtual machine snapshots should you want additional testing. Just let me know. Thanks. Jeff.

Last edited 8 years ago by jaspain (previous) (diff)

Changed 8 years ago by jaspain

Core dumps, compressed with "tar cJf"

comment:5 Changed 8 years ago by vorner

  • Defect Severity changed from High to Low
  • Priority changed from major to minor
  • Summary changed from b10-resolver crashing with status 11 to b10-resolver crashing with status 11 when compiled with --enable-boost-threads

Thank you for your info, with the flag I can reproduce the problem (even the tests crash now). Core dumps are of little use without the corresponding binaries, but we can now get our own, so this isn't a problem.

Anyway, to the flag itself, it is kind of historical relict now, I'd say (I wasn't aware of the flag until you poited it out). We wrote some code and put locks there to be ready when we start using threads and it turned out the boost library didn't provide the locks on all the systems. So we put an ifdef there and provided some dummy locks which are used instead if the flag isn't used. We don't use any threads yet (we just have the locks in place), so using the flag would only slow things down for now. It seems as nobody was using it, the code there got rot somehow.

I'd suggest you use the code without the flag, then. I'll propose to remove the flag from configure (it brings no advantage and only breaks things). We'll need to solve the problem before we actually introduce the threads, but that probably won't happen sometime really soon.

comment:6 Changed 8 years ago by jaspain

Sorry, I'm not very familiar with how core dumps are used. /usr/libexec/bind10-devel/b10-resolver now attached. Hopefully that will save you some time. Jeff.

Changed 8 years ago by jaspain

Executable binary associated with core dumps.

comment:7 Changed 8 years ago by shane

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:8 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 7

comment:9 Changed 8 years ago by jreed

Does ticket #1680 fix this (by removing the option)?

(As for the exit status confusion, that was handled by ticket #1673.)

comment:10 Changed 8 years ago by vorner

Yes, it is (the #1680 is merged, it's opened because of changelog entry only now). We probably want to close this ticket, at least for now ‒ once we introduce the threads into the resolver, we'll probably need to solve it, but I'm not sure we want threads there.

comment:11 Changed 6 years ago by stephen

  • Milestone set to DNS Outstanding Tasks

comment:12 Changed 6 years ago by tomek

  • Resolution set to wontfix
  • Status changed from new to closed

DNS and BIND10 framework is outside of scope for Kea project.
The corresponding code has been removed from Kea git repository.
If you want to follow up on DNS or former BIND10 issues, see
http://bundy-dns.de project.

Closing ticket.

Note: See TracTickets for help on using tickets.