Opened 7 years ago

Closed 7 years ago

#2681 closed defect (fixed)

DHCPv6 server crashes on Solicit and Request when pool has been exhausted.

Reported by: marcin Owned by: stephen
Priority: high Milestone: Sprint-DHCP-20130214
Component: dhcp6 Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: High
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

I perform the manual test v6.general.log.exhausted-pool described in the dhcp-val repo: tests/common/.

This test limits the address pool to 5 addresses while the number of simulated clients is 10. The expected output from this test is that 5 leases are assigned and that the server logs debug messages for 5 remaining requests that the pool is exhausted. The debug message is not logged by the server.

In the course of debugging it turned out that the server crashes and gets restarted by the BIND10 Boss when new Solicit or Request comes in. This happens after making attempt to allocate a new lease 100 times. When it fails the following call in src/bin/dhcp6/dhcp6_srv.cc (line 630) throws the AllocFailed? exception:

Lease6Ptr lease = alloc_engine_->allocateAddress6(subnet, duid, ia->getIAID(), hint, fake_allocation);

This exception propagates to the main function causing process to abort.

This is undesired behaviour because it (to my understanding) violates the RFC 3315 (par 17.2.2.) which says that ''If the server will not assign any addresses to any IAs in a subsequent Request from the client, the server MUST send an Advertise message to the client that includes only a Status Code option with code NoAddrsAvail?...''. Obviously the client does not get anything.

Another obvious implication is that the server is vulnerable to DoS attacks and its performance is affected by the frequent need to restart the process.

Thus, setting the Priority and Severity to High.

The log file has been attached.

Subtickets

Attachments (2)

bind10.log (108.4 KB) - added by marcin 7 years ago.
Full log produced by the v6.general.log.exhausted-pool
single-request.log (28.2 KB) - added by marcin 7 years ago.
A log limited to only single SOLICIT followed by the server crash.

Download all attachments as: .zip

Change History (9)

Changed 7 years ago by marcin

Full log produced by the v6.general.log.exhausted-pool

Changed 7 years ago by marcin

A log limited to only single SOLICIT followed by the server crash.

comment:1 Changed 7 years ago by stephen

  • Owner set to stephen
  • Status changed from new to accepted

comment:2 Changed 7 years ago by stephen

  • Owner changed from stephen to UnAssigned
  • Status changed from accepted to reviewing

Address allocation failures are now caught and lead to a rejection packet being sent to the client. This has been tested for IPv6 traffic but not yet for V4 via a relay (awaiting resolution of #2697). It has been put for review now so as not to hold up the next release.

comment:3 Changed 7 years ago by marcin

  • Owner changed from UnAssigned to marcin

comment:4 Changed 7 years ago by marcin

  • Owner changed from marcin to stephen

src/bin/dhcp4/dhcp4_messages.mes
The formats of the DHCP4_PACKET_PROCESS_FAIL and DHCP6_PACKET_PROCESS_FAIL are inconsistent. The former takes just one argument that is used to log an exception string. The latter takes 2 more arguments that are used to log the name of the packet and remote address. The purpose of the debug message is to help root cause issues so it should provide as much information as possible. Therefore, I suggest that the debug message for DHCPv4 server provides the same level of information as the message logged by DHCPv6 server.
I realize that the error message carried by the exception being caught may already include the detailed information but we shouldn't rely on this because someone may extend the processX() function and the new code may emit an exception without the detailed data.

src/lib/dhcpsrv/alloc_engine.cc
Please update the date in the copyright header to 2013.

allocateAddress6: Line 205: The reuseExpiredLease() function may throw at least two types of exceptions derived from isc::Exception. These are: NoSuchLease and DbOperationError. If these exceptions are thrown from the line 205 it is not caught. Note that if the same function throws an exception a few lines further (line 251) the DHCPSRV_ADDRESS6_ALLOC_ERROR is logged.

This has another implication that the client will not receive the NoAddrsAvail status code because the code section that starts in dhcp6_srv.cc, line 654 will not be executed.

allocateAddress4: This function suffers from the same issue as the allocateAddress6 function. Thankfully, we don't need to reply to the client that we could not allocate the lease and the exceptions will be eventually caught by the try-catch surrounding processX() functions. Nevertheless, for this function we should choose whether we catch exceptions, log an error within the function and return NULL pointer OR that we catch exceptions externally. The current code returns a combination of exceptions and NULL pointers for similar issues.

We should add a ChangeLog entry for this bug fix.

comment:5 Changed 7 years ago by stephen

  • Owner changed from stephen to marcin

src/bin/dhcp4/dhcp4_messages.mes
The formats of the DHCP4_PACKET_PROCESS_FAIL and DHCP6_PACKET_PROCESS_FAIL are inconsistent. The former takes just one argument that is used to log an exception string. The latter takes 2 more arguments that are used to log the name of the packet and remote address. The purpose of the debug message is to help root cause issues so it should provide as much information as possible. Therefore, I suggest that the debug message for DHCPv4 server provides the same level of information as the message logged by DHCPv6 server.

Modified to include the hardware address (if present) of the source packet. (Information in the DHCP56 packet is not available in the V4 packet.)

src/lib/dhcpsrv/alloc_engine.cc
Please update the date in the copyright header to 2013.

Updated.

allocateAddress6: Line 205: The reuseExpiredLease() function may throw at least two types of exceptions derived from isc::Exception. These are: NoSuchLease and DbOperationError. If these exceptions are thrown from the line 205 it is not caught. Note that if the same function throws an exception a few lines further (line 251) the DHCPSRV_ADDRESS6_ALLOC_ERROR is logged.

This has another implication that the client will not receive the NoAddrsAvail status code because the code section that starts in dhcp6_srv.cc, line 654 will not be executed.

Good catch. I've extended the range of the "try" block to cover the entire method. This does leave the possibility of a database problem generating multiple log messages, one per failed address allocation attempt, but that this something we will need to address in the future.

allocateAddress4: This function suffers from the same issue as the allocateAddress6 function. Thankfully, we don't need to reply to the client that we could not allocate the lease and the exceptions will be eventually caught by the try-catch surrounding processX() functions. Nevertheless, for this function we should choose whether we catch exceptions, log an error within the function and return NULL pointer OR that we catch exceptions externally. The current code returns a combination of exceptions and NULL pointers for similar issues.

Same solution applied - all isc::Exceptions are caught and a null lease returned (which matches the description of the method in the .h file).

We should add a ChangeLog entry for this bug fix.

How about:

XXX.	[bug]		stephen
	Fixed problem where the DHCP server crashed if it
        ran out of addresses.  Such a condition now causes
        a packet to be returned to the client refusing the
        allocation of an address.
	(Trac #2681, git XXX)

comment:6 Changed 7 years ago by marcin

  • Owner changed from marcin to stephen

I verified the fix by executing the test which discovered the problem. Test passed.

All ok. Please merge.

comment:7 Changed 7 years ago by stephen

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged in commit 87ce14cdb121b37afb5b1931af51bed7f6323dd6

Note: See TracTickets for help on using tickets.