Opened 3 years ago

Closed 3 years ago

#5070 closed defect (fixed)

Setting next-action to skip in lease4_select() causes allocation engine to retry til MAX ATTEMPTS

Reported by: tmark Owned by: fdupont
Priority: medium Milestone: Kea1.2
Component: hooks Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 1
Total Hours: 3 Internal?: no

Description (last modified by tmark)

On 11/15/16 6:49 AM, Thomas Markwalder wrote:

On 11/14/16 5:05 PM, MRob wrote:

  1. Is using "lease.decline(0)" the best (only) way, at least in this

hook point, to turn away unknown clients? That's what I've done and
it works, though the lease is still processed and sent to the
client, but with what I think is a lease for zero seconds.

Declining a lease is intended to be a client initiated action, so I

don't really think this is direction to go. If you set the
lease4_select next step action to SKIP before returning from your
lease4_select hook:

"NEXT STEP STATUS: If any callout installed on the "lease4_select"
hook sets the next step action to SKIP, the server will not assign any
lease and the callouts become responsible for the lease assignment. If
the callouts fail to provide a lease, the packet processing will
continue, but client will not get an address."

If your callout does not then assign a lease using its own decision
making, the server will generate a NAK to your client.

This works much better (though again, would have not needed to bother
you if I had been able to find the docs you quote here). Thanks again
for all your help, it's working quite nicely now.

Caveat: when a lease is skipped, the server immediately tries again to
allocate another lease for the client, and does this many more times
in the same second, so much so that there are nearly 2000 lines of
logging at maximum verbosity before a NAK is sent. Then it keeps
trying many times per second for another few seconds.

This seems quite excessive. Shouldn't the server send the NAK and stop
retrying after a lease is denied? At a minimum, retries should be rate
limited or limited to X number of retries.

The lease4_select hook is executed with AllocEngine::createLease4(). This method sees the skip and returns an empty lease pointer to AllocEngine::allocateOrReuseLease4() which returns it to AllocEngine::allocateUnreservedLease4(). This method sees the empty lease pointer and simply tries again until max attempts is reached. The skip needs to be checked again farther up:

Something like this would do it:

diff --git a/src/lib/dhcpsrv/alloc_engine.cc b/src/lib/dhcpsrv/alloc_engine.cc
index bc37eeb..c87af16 100644
--- a/src/lib/dhcpsrv/alloc_engine.cc
+++ b/src/lib/dhcpsrv/alloc_engine.cc
@@ -2841,6 +2841,10 @@ AllocEngine::allocateUnreservedLease4(ClientContext4& ctx) {

new_lease = allocateOrReuseLease4(candidate, ctx);
if (new_lease) {

return (new_lease);

+ } else {
+ if (ctx.callout_handle_->getStatus() == CalloutHandle::NEXT_STEP_SKIP) {
+ break;
+ }

}

}

}

Subtickets

Change History (7)

comment:1 Changed 3 years ago by tmark

  • Description modified (diff)

comment:2 Changed 3 years ago by hschempf

  • Milestone changed from Kea-proposed to Kea1.2

Per Dec 1 Kea meeting, accept 1.2

comment:3 Changed 3 years ago by fdupont

  • Owner set to fdupont
  • Status changed from new to accepted

comment:4 Changed 3 years ago by fdupont

  • Add Hours to Ticket changed from 0 to 2
  • Owner changed from fdupont to UnAssigned
  • Status changed from accepted to reviewing

The same issue exists for DHCPv6. Fixed in both versions with corresponding new unit tests (master callouts are called 100 times vs once).
Ready for review.

comment:5 Changed 3 years ago by tomek

  • Owner changed from UnAssigned to tomek

comment:6 Changed 3 years ago by tomek

  • Add Hours to Ticket changed from 2 to 1
  • Owner changed from tomek to fdupont
  • Total Hours changed from 0 to 3

I have reviewed your changes and they look good, but there's one minor improvement I'd like to suggest. Currently next step can have one of three values: continue (which is the default), skip and drop. The code checked for skip equality, but it would be better to check for continue inequality.

I have pushed those changes. Please pull and review. If you agree with them, the code is ready for merge. If not, let's continue discussion.

The code builds and unit-tests pass on Ubuntu 16.04.1 x64.

comment:7 Changed 3 years ago by fdupont

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged. Closing.

Note: See TracTickets for help on using tickets.