Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#1546 closed defect (fixed)

resolver not listening

Reported by: jreed Owned by: jinmei
Priority: very high Milestone: Sprint-20120124
Component: resolver Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: High
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 1.43 Internal?: no

Description

Following the guide, I did:

> config remove Boss/components b10-xfrout
> config remove Boss/components b10-xfrin
> config remove Boss/components b10-auth
> config add Boss/components b10-resolver
> config set Boss/components/b10-resolver/special resolver
> config set Boss/components/b10-resolver/kind needed
> config set Boss/components/b10-resolver/priority 10
> config commit
>

(no feedback there)

The resolver is running:

> config show Resolver/listen_on
Resolver/listen_on[0]/address   "::1"   string  (default)
Resolver/listen_on[0]/port      53      integer (default)
Resolver/listen_on[1]/address   "127.0.0.1"     string  (default)
Resolver/listen_on[1]/port      53      integer (default)

But not listening.

The default logging:

2012-01-10 12:08:07.339 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_RECONFIGURE reconfiguring running components
2012-01-10 12:08:07.340 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-xfrin is being stopped
2012-01-10 12:08:07.340 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-xfrin to shut down
2012-01-10 12:08:07.358 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-xfrout is being stopped
2012-01-10 12:08:07.358 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-xfrout to shut down
2012-01-10 12:08:07.359 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-auth is being stopped
2012-01-10 12:08:07.359 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-autt configuration
2012-01-10 12:05:05.508 INFO  [b10-xfrout.xfrout] XFROUT_NEW_CONFIG_DONE Update xfrout configuration done
2012-01-10 12:05:05.636 INFO  [b10-stats.stats] STATS_STARTING starting
2012-01-10 12:05:05.918 INFO  [b10-stats-httpd.stats-httpd] STATHTTPD_STARTED listening on 127.0.0.1#8000
2012-01-10 12:05:08.854 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::]:53 of type TCP from the creator
2012-01-10 12:05:08.874 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 19
2012-01-10 12:05:08.894 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::]:53 of type UDP from the creator
2012-01-10 12:05:08.896 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 21
2012-01-10 12:05:08.903 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [0.0.0.0]:53 of type TCP from the creator
2012-01-10 12:05:08.903 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 22
2012-01-10 12:05:08.907 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [0.0.0.0]:53 of type UDP from the creator
2012-01-10 12:05:08.907 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 23
2012-01-10 12:05:08.917 INFO  [b10-auth.auth] AUTH_SERVER_STARTED server started
2012-01-10 12:08:07.339 INFO  [b10-boss.boss] BIND10_CONFIGURATOR_RECONFIGURE reconfiguring running components
2012-01-10 12:08:07.340 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-xfrin is being stopped
2012-01-10 12:08:07.340 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-xfrin to shut down
2012-01-10 12:08:07.358 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-xfrout is being stopped
2012-01-10 12:08:07.358 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-xfrout to shut down
2012-01-10 12:08:07.359 INFO  [b10-boss.boss] BIND10_COMPONENT_STOP component b10-auth is being stopped
2012-01-10 12:08:07.359 INFO  [b10-boss.boss] BIND10_STOP_PROCESS asking b10-auth to shut down
2012-01-10 12:08:07.360 INFO  [b10-boss.boss] BIND10_COMPONENT_START component b10-resolver is starting
2012-01-10 12:08:07.360 INFO  [b10-boss.boss] BIND10_STARTING_PROCESS starting process b10-resolver
2012-01-10 12:08:07.363 INFO  [b10-xfrout.xfrout] XFROUT_RECEIVED_SHUTDOWN_COMMAND shutdown command received
2012-01-10 12:08:07.377 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 28511 of b10-xfrin ended with status 0
2012-01-10 12:08:07.384 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 18900 of b10-auth ended with status 0
2012-01-10 12:08:07.384 INFO  [b10-boss.boss] BIND10_LOST_SOCKET_CONSUMER consumer 20 of sockets disconnected, considering all its sockets closed
2012-01-10 12:08:07.443 INFO  [b10-boss.boss] BIND10_PROCESS_ENDED process 23139 of b10-xfrout ended with status 0
2012-01-10 12:08:07.585 INFO  [b10-resolver.resolver] RESOLVER_STARTING starting resolver with command line 'b10-resolver'
2012-01-10 12:08:07.613 INFO  [b10-resolver.resolver] RESOLVER_RECURSIVE running in recursive mode
2012-01-10 12:08:07.613 WARN  [b10-resolver.resolver] RESOLVER_NO_ROOT_ADDRESS no root addresses available
2012-01-10 12:08:07.614 INFO  [b10-resolver.resolver] RESOLVER_SET_QUERY_ACL query ACL is configured
2012-01-10 12:08:07.616 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type TCP from the creator
2012-01-10 12:08:07.616 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 12
2012-01-10 12:08:11.650 ERROR [b10-resolver.cc] CC_TIMEOUT timeout reading data from command channel
2012-01-10 12:08:11.664 ERROR [b10-resolver.server_common] SRVCOMM_ADDRESS_FAIL failed to listen on addresses (Timeout while reading data from cc session)
2012-01-10 12:08:11.665 ERROR [b10-resolver.resolver] RESOLVER_CONFIG_ERROR error in configuration: Timeout while reading data from cc session
2012-01-10 12:08:11.665 INFO  [b10-resolver.resolver] RESOLVER_STARTED resolver started

I tried to restart:

> Resolver shutdown
{
    "error": "Module 'Resolver' not responding"
}

And logged:

2012-01-10 12:25:43.794 ERROR [b10-cmdctl.cmdctl] CMDCTL_COMMAND_ERROR error in command shutdown to module Resolver: Module 'Resolver' not responding

I stopped with Boss shutdown and restarted but resolver still failing:

2012-01-10 12:27:36.286 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::1]:53 of type TCP from the creator
2012-01-10 12:27:36.287 INFO  [b10-boss.boss] BIND10_SOCKET_CREATED successfully created socket 17
...
2012-01-10 12:27:40.373 ERROR [b10-resolver.cc] CC_TIMEOUT timeout reading data from command channel
2012-01-10 12:27:40.387 ERROR [b10-resolver.server_common] SRVCOMM_ADDRESS_FAIL failed to listen on addresses (Timeout while reading data from cc session)
2012-01-10 12:27:40.388 ERROR [b10-resolver.resolver] RESOLVER_CONFIG_ERROR error in configuration: Timeout while reading data from cc session

Subtickets

Change History (11)

comment:1 Changed 8 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:2 Changed 8 years ago by jinmei

I believe I found the reason. It's a regression due to #805. I'll create a patch
so that we can include the fix in the release.

comment:3 Changed 8 years ago by jinmei

The fix is available in trac1546. Please review.

What happened is that when the requestor sent an initial request
ModuleCCSession had already started listening to commands/config
updates, so the response from the boss is stolen by ModuleCCSession
and resulted in "timeout".

A similar problem happened for auth and we developed a workaround to
avoid this problem (that's why auth didn't have the problem for
listen_on). I applied the same fix to the resolver.

Unfortunately it's quite difficult to provide a unit test to reproduce
this problem, so at the moment I only added the fix (I confirmed it
by running it).

I don't plan to create a changelog entry for this fix.

comment:4 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:5 Changed 8 years ago by jinmei

  • Milestone changed from New Tasks to Sprint-20120110
  • Priority changed from major to blocker

comment:6 Changed 8 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:7 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Thank you, it works for me and the code looks OK, for one tiny detail:

// go into the main loop.   See auth/main.cc for the rationale.

This comment is located in auth/main.cc. It should either say see resolver/main.cc or be present there and be in the full here.

I don't think I need to review the comment change again, so please merge after the fix.

comment:8 in reply to: ↑ 7 Changed 8 years ago by jinmei

Replying to vorner:

Thank you, it works for me and the code looks OK, for one tiny detail:

// go into the main loop.   See auth/main.cc for the rationale.

This comment is located in auth/main.cc. It should either say see resolver/main.cc or be present there and be in the full here.

Ah, good catch. It should really have been (as it originally was) in
auth. I swapped these sets of comments, and recovered the mention to
tsig key configuration in the auth part.

I don't think I need to review the comment change again, so please merge after the fix.

Okay, thanks, merge done. Closing ticket.

comment:9 Changed 8 years ago by jinmei

  • Total Hours changed from 0 to 1.18

comment:10 Changed 8 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:11 Changed 8 years ago by vorner

  • Total Hours changed from 1.18 to 1.43
Note: See TracTickets for help on using tickets.