Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#3737 closed defect (fixed)

Got DHCP4_LEASE_ALLOC_FAIL with reason "client sent yiaddr" (he did not)

Reported by: nicolas.chaigneau Owned by: marcin
Priority: medium Milestone: Kea0.9.1
Component: Unclassified Version: 0.9 beta1
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

The following use case has been detected from a replayed network capture.

What happens:

1) A client sends a Request (without option 50 "Requested IP Address").
Kea answers with an ACK with a suitable leased IP address.

2) 30 seconds later, the client reemits the Request (same xid, still no "Requested IP").
Kea answers with an ACK with the same leased IP address.

3) 15 seconds later, the clients reemits.
This time Kea answers with a NAK, and we have the following message in the logs:

2015-03-03 09:42:18.748 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_LEASE_ALLOC_FAIL failed to grant a lease for client-id 01:c4:73:1e:1c:8e:0e, hwaddr hwtype=1 c4:73:1e:1c:8e:0e, client sent yiaddr 10.176.22.90

The message is wrong: client did not send a yiaddr. (the address logged is the lease that was assigned to this client).

4) 8 seconds later, then 4, then 2, the client reemits again.
And again, Kea answers with a NAK with the same log message as in 3).

2015-03-03 09:42:25.802 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_LEASE_ALLOC_FAIL failed to grant a lease for client-id 01:c4:73:1e:1c:8e:0e, hwaddr hwtype=1 c4:73:1e:1c:8e:0e, client sent yiaddr 10.176.22.90
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_LEASE_ALLOC_FAIL failed to grant a lease for client-id 01:c4:73:1e:1c:8e:0e, hwaddr hwtype=1 c4:73:1e:1c:8e:0e, client sent yiaddr 10.176.22.90
2015-03-03 09:42:29.908 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_LEASE_ALLOC_FAIL failed to grant a lease for client-id 01:c4:73:1e:1c:8e:0e, hwaddr hwtype=1 c4:73:1e:1c:8e:0e, client sent yiaddr 10.176.22.90

Same observation, the client does not send a yiaddr.

I understand that the client is broken.
But why does Kea behave differently when handling the 4 last reemissions ?
And the message logged is definitively misleading.

Here is the DEBUG log of Kea handling a request and complaining about the yiaddr:

2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_CLASS_ASSIGNED client packet has been assigned to the following class(es): VENDOR_CLASS_udhcp 1.18.1-VD Linux VDLinux.1.2.1.x
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_PACKET_RECEIVED REQUEST (type 3) packet received on interface bond0.102
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_QUERY_DATA received packet type 3, data is <localAddr=10.163.216.168:67 remoteAddr=10.156.0.1:67, msgtype=3, transid=0xe9eb9a1d

type=53, len=1:
type=55, len=7:
type=57, len=2:
type=60, len=37: 75:64:68:63:70:20:31:2e:31:38:2e:31:2d:56:44:20:4c:69:6e:75:78:20:56:44:4c:69:6e:75:78:2e:31:2e:32:2e:31:2e:78
type=61, len=7: 01:c4:73:1e:1c:8e:0e
type=82, len=49, data fields:
type=1, len=30: 57:34:57:67:6b:72:68:68:78:70:3a:38:36:2e:32:31:37:2e:32:33:2e:33:34:3a:6f:72:61:6e:67:65 type=2, len=15: 75:6e:61:75:74:68:65:6e:74:69:63:61:74:65:64

2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_SUBNET_SELECTED the 10.156.0.0/15 subnet was selected for client assignment
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcpsrv/2720] DHCPSRV_MEMFILE_GET_SUBID_HWADDR obtaining IPv4 lease for subnet ID 1 and hardware address hwtype=1 c4:73:1e:1c:8e:0e
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_LEASE_ALLOC_FAIL failed to grant a lease for client-id 01:c4:73:1e:1c:8e:0e, hwaddr hwtype=1 c4:73:1e:1c:8e:0e, client sent yiaddr 10.176.22.90
2015-03-03 09:42:28.855 DEBUG [kea-dhcp4.dhcp4/2720] DHCP4_RESPONSE_DATA responding with packet type 6, data is <localAddr=10.163.216.168:67 remoteAddr=10.156.0.1:67, msgtype=6, transid=0xe9eb9a1d

type=53, len=1: 06
type=54, len=4: 10.163.216.168
type=61, len=7: 01:c4:73:1e:1c:8e:0e
type=82, len=49, data fields:
type=1, len=30: 57:34:57:67:6b:72:68:68:78:70:3a:38:36:2e:32:31:37:2e:32:33:2e:33:34:3a:6f:72:61:6e:67:65 type=2, len=15: 75:6e:61:75:74:68:65:6e:74:69:63:61:74:65:64

I'll attach the extract of the network capture related to this xid.

Subtickets

Attachments (2)

kea_3737.pcap (4.7 KB) - added by nicolas.chaigneau 4 years ago.
kea_3737_b.pcap (6.2 KB) - added by nicolas.chaigneau 4 years ago.
Full packet capture for this client.

Download all attachments as: .zip

Change History (12)

Changed 4 years ago by nicolas.chaigneau

comment:1 Changed 4 years ago by marcin

Thanks for reporting. I am investigating this now.

comment:2 Changed 4 years ago by marcin

The logging message indicating that the client has sent the yiaddr is clearly wrong. The client has obviously sent the ciaddr and the ciaddr value seems to be ok, given that this is the address which the server granted to the client when it sent the DHCPACK.

There is something interesting in the client behavior. The client seems to be in the permanent rebinding state because it doesn't unicast its DHCPREQUEST messages to the server. Instead the messages are forwarded by the relay agent which may indicate that the client broadcasts messages (rebind state) which are picked and forwarded by the relay agent. Is this, what is happening? It is also interesting that the client doesn't follow the rebinding time value sent by the server and sends the requests every so often.

Nevertheless, it doesn't seem to be a source of the issue. I expect that Kea should consistently send DHCPACK and refresh client's lease. I didn't spot any difference between the two DHCPREQUESTs sent by the client, where for one of them the server has sent the DHCPACK, and for the other the server sent the DHCPNAK.

The only logical explanation is that something has changed on the server side between the two DHCPREQUESTs. Some of the possibilities are:

  • There has been some new host reservations added for the address being in use by the client. Unlikely, as I know you don't use host reservations. Right?
  • The subnet configuration has changed and the address being requested (ciaddr) has been removed from the pool.
  • The lease has been hijacked by another client who requested this specific address being owned by this client. I found this bug recently and it has been fixed with one of the ticket being now under review. It is unlikely scenario in your case though, because you'd need to have a renewing client which is sending a different address than being renewed.
  • Some other reason.

To further investigate this, it would be useful to have the following information:

  • the exact server configuration
  • the lease file holding all entries for the culprit address (no lfc performed on the lease file)
  • information whether there is also some other traffic which may cause requests for this specific address, perhaps whole the traffic capture between the server sending DHCPACK and DHCPNAK to the client
  • information if this was a one time event or it is fully reproducable issue

comment:3 Changed 4 years ago by nicolas.chaigneau

I'm not using host reservation.
The server configuration does not change during the network capture replay.
I did not find any other client requesting this specific address.

I noticed something more that I missed previously: immediately after the last NAK, the client starts a new transaction (with a new xid):

  • Discover, with Requested IP Address
  • Kea replies with an Offer (but offers another address).
  • then Request, Ack.

I'll attach another capture for the full exchange with this client.

This is reproducible: I'm using a network capture that I replay (using tcpreplay), and it happens each time. But, this only happens once for the whole capture, so it's safe to say this is very rare.

Changed 4 years ago by nicolas.chaigneau

Full packet capture for this client.

comment:4 follow-up: Changed 4 years ago by marcin

I analyzed the traffic capture and this is what I think happens here:

  • The server doesn't have the record of the client
  • The server sends a DHCPREQUEST with setting the ciaddr to out-of-pool value of 10.176.22.90
  • The server is trying to find the lease for this client but since there is no lease for this client and also the ciaddr doesn't fit to any of the pools, the server will allocate the address from the dynamic pool of 10.156.0.0.
  • The client is unhappy that it has got a different address than it is using, so it will send the DHCPREQUEST again.
  • The situation is now different on the server side because the client has a lease but is requesting a different address. The server will update the existing lease with the ciaddr as requested by the client and send the DHCPACK.
  • The client now renews the granted address 10.176.22.90 and the server discovers that this address doesn't belong to the pool and sends the DHCPNAK.

The server's behavior appears to be wrong because it should send the DHCPNAK from the very first message received from the client.

There is also a thing that the client's behavior is invalid because it is requesting different out-of-pool addresses. Also, even in the case when the client receives a DHCPOFFER with the particular address, i.e. 10.156.27.139 it sends the DHCPREQUEST asking for a different address, i.e. 10.156.167.26. So, the test is not really a real life scenario. But, it well demonstrates issues with the allocation engine for DHCPv4.

We have the ticket in the review queue: #3694, which rewrites the code in the Allocation Engine in this area. Many (if not all) of the issues should be addressed there. I am proposing to wait for the merge of the #3694 and rerun the test. Would this work?

comment:5 in reply to: ↑ 4 Changed 4 years ago by nicolas.chaigneau

Replying to marcin:

We have the ticket in the review queue: #3694, which rewrites the code in the Allocation Engine in this area. Many (if not all) of the issues should be addressed there. I am proposing to wait for the merge of the #3694 and rerun the test. Would this work?

That's ok for me.
Thanks for your investigation!

comment:6 Changed 4 years ago by marcin

  • Milestone changed from Kea-proposed to Kea0.9.1

comment:7 Changed 4 years ago by marcin

  • Owner set to UnAssigned
  • Status changed from new to reviewing

While I was waiting for the confirmation that the #3694 has resolved the issues with the server's responses, as described in http://kea.isc.org/ticket/3737#comment:4, I forgot that the logging message DHCP4_LEASE_ALLOC_FAIL was also broken and should be fixed within this ticket.

I have fixed the logging message on the trac3737 branch. I think it is worth to review this little change at this point and perhaps merge it to master. If Nicolas or ourselves find that the message processing is still broken, we can open another ticket.

Proposed ChangeLog entry:

XXX.	[bug]		marcin
	DHCPv4 server: corrected the logging message issued when
	the server could not allocate or offer the lease for the
	client. The corrected message contains the client-id,
	hardware address, ciaddr and requested-ip-address.
	(Trac 3737, git abcd)

comment:8 Changed 4 years ago by sar

  • Owner changed from UnAssigned to marcin

It all looks fine to me.

comment:9 Changed 4 years ago by marcin

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged with commit 4c56e1348c5d50eaa5b3083a0a8c346966e1b603.

comment:10 Changed 4 years ago by nicolas.chaigneau

Tested on the GitHub version.

This is completely fixed.
The message is now correct.

The failed allocation is explainable:

I have this 17 times, of which:

  • 10 are for a client requesting an IP address which belongs to another client (using the same Client Id).
  • 7 are for a Renewing client which is not known from the server.

All of this is the expected behavior.

Thanks for the fixes :)

Note: See TracTickets for help on using tickets.