Opened 8 years ago

Closed 8 years ago

#1387 closed defect (fixed)

timeout on xfrout when sock path too long

Reported by: jelte Owned by: jinmei
Priority: medium Milestone: Sprint-20111220
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: none
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 3.15 Internal?: no

Description

in this case, it got triggered by the lettuce xfrin test; if the auth->xfrout domain socket path is too long, you get a timeout when you try to transfer from bind 10. Looks like auth fires off the request but doesn't do anything if it fails.

We should probably log a potential problem with the sockpath length, and return SERVFAIL instead of 'nothing'. Perhaps we need to check if there are other failure scenario's where one could end up with a timeout instead of a error response as well.

Subtickets

Change History (14)

comment:1 Changed 8 years ago by jelte

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:2 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 5

comment:3 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20111206

comment:4 Changed 8 years ago by shane

  • Feature Depending on Ticket set to none

comment:5 Changed 8 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:6 Changed 8 years ago by jinmei

trac1387 is ready for review.

I believe the diff (quite short) tells everything. One note: I
suspect what actually happened was that b10-auth was (relatively
gracefully) shut down due to the propagated exception, not just the
AXFR request was timed out. This would also mean it could be a
potential security problem (with a bad local configuration, a remote
attacker could force auth to shut down), but considering the overall
current maturity of BIND 10 and the fact that it happens only with a
bad local config, I think it's okay to treat it as a normal bug.

I added specific test cases for this problem. Although this stuff
should have had more tests, adding some specific tests would be better
than (still) nothing.

Proposed changelog entry:

338.?	[bug]		jinmei
	libxfr, used by b10-auth to share TCP sockets with b10-xfrout,
	incorrectly propagated ASIO specific exceptions to the application
	if the given file name was too long.  This could lead to
	unexpected shut down of b10-auth.
	(Trac #1387, git TBD)

comment:7 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:8 Changed 8 years ago by jelte

  • Owner changed from UnAssigned to jelte

comment:9 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei
  • Total Hours changed from 0 to 1

Just to be clear, the asio connect() call, when given the optional error variable, can still raise exceptions? If so, I wonder if this is also the case for some of the other calls then.

And yes, we do need more tests :) (but not in this ticket)

Anyway, on to the code;

xfrout_client.cc:

in XfroutClient::connect(), asio::error_code err is now unused and can be removed (wonder why neither my compiler nor cppcheck fails on that)

client_test.cc:

My compiler does not allow the uninitialized sockaddr_un structure as a const...

I noticed my system headers have a macro for this specific thing:

# define SUN_LEN(ptr) ((size_t) (((struct sockaddr_un *) 0)->sun_path)	      \
		      + strlen ((ptr)->sun_path))

Though I don't know how portable the macro is.

BTW, the variable name should also not be sun, as that is a predefined macro in sunstudio system headers

comment:10 in reply to: ↑ 9 Changed 8 years ago by jinmei

Replying to jelte:

Thanks for the review.

Just to be clear, the asio connect() call, when given the optional error variable, can still raise exceptions? If so, I wonder if this is also the case for some of the other calls then.

No, if we pass the 'error' parameter, connect() should be exception
free at the asio level. Basically, the 'with error' version is a
wrapper for the throwing version that would look like

nothrow_func(error& err) {
  try {
    throw_func();
  } catch() {
    convert exception to error code and set it in err;
  }
}

xfrout_client.cc:

in XfroutClient::connect(), asio::error_code err is now unused and can be removed (wonder why neither my compiler nor cppcheck fails on that)

Ah, good catch, removed. The reason why the compiler didn't complain
is this:

AM_CXXFLAGS += -Wno-unused-parameter # see src/lib/cc/Makefile.am

(not sure about cppcheck. it sometimes overlooks things that it's
supposed to find)

In this module, since we only use the quite low level interface (it
uses send(2) in the end) in the blocking mode, we actually don't need
the ASIO level abstraction. So we should probably simply stop using
asio here, or rather extend asiolink to support unix domain sockets.
(But that would be a subject of a separate ticket).

client_test.cc:

My compiler does not allow the uninitialized sockaddr_un structure as a const...

Okay, I think in this case we should simply give up constifying it;
the brevity would be more important in this context. I also don't
think we need to rely on SUN_LEN, which would also have a portability
issue.

BTW, the variable name should also not be sun, as that is a predefined macro in sunstudio system headers

Ah, okay, changed it.

comment:11 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:12 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Ok, all seems good :) please go ahead and merge.

And yeah, if we don't actually use asio here, we can probably remove it.

comment:13 in reply to: ↑ 12 Changed 8 years ago by jinmei

Replying to jelte:

Ok, all seems good :) please go ahead and merge.

Okay, thanks, merged, closing ticket.

And yeah, if we don't actually use asio here, we can probably remove it.

Maybe we could do this as part of #1452. I'll keep that in my mind
as I work on it.

comment:14 Changed 8 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 1 to 3.15
Note: See TracTickets for help on using tickets.