Opened 3 years ago

Closed 3 years ago

#5016 closed enhancement (complete)

Review and merge lw4over6 options support (PR#24)

Reported by: tomek Owned by: marcin
Priority: high Milestone: Kea1.2
Component: dhcp6 Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 2
Total Hours: 2 Internal?: no

Description

This ticket coverts reviewing, cleaning up and merging support for lightweight 4over6 options defined in RFC7598. We have received a pull request 24 on github (https://github.com/isc-projects/kea/pull/24).

See all the details on github.

Subtickets

Change History (12)

comment:1 Changed 3 years ago by marcin

  • Owner set to marcin
  • Status changed from new to accepted

comment:2 Changed 3 years ago by marcin

I have spent quite a while on adopting the code from the pull request #24. There is a new branch 'github24' where this code is available with all my modifications. I also asked the contributors to provide more up-to-date documentation and I am now waiting for their response. In case there is no response for a couple of days, I'll create/fix documentation on my own.

comment:3 Changed 3 years ago by marcin

  • Owner changed from marcin to UnAssigned
  • Status changed from accepted to reviewing

I haven't had any followup with the contributors so I updated the docs on my own. I also made some further changes to the code as a result of testing.

All changes are on the 'github24' branch!!!

Proposed ChangeLog entry:

11XX.	[func]		marcin
	Support for DHCPv6 options defined in RFC6603 and RFC7598. Thanks
	to Andrei Pavel and Cristian Secareanu of Qualitance for submitting
	initial implementation.
	(Trac #github24, git cafe)

comment:4 Changed 3 years ago by tomek

  • Owner changed from UnAssigned to tomek

comment:5 follow-up: Changed 3 years ago by tomek

The code did not compile for me. The issues encountered on Ubuntu
16.04 x64 were:

CXXLD    libdhcptest.la
ar: `u' modifier ignored since `D' is the default (see `U')
  CXX      libdhcp___unittests-option_custom_unittest.o
option_custom_unittest.cc: In member function ‘virtual void {anonymous}::OptionCustomTest_prefixDataArray_Test::TestBody()’:
option_custom_unittest.cc:863:5: error: narrowing conversion of ‘184’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
     };
     ^
option_custom_unittest.cc: In member function ‘virtual void {anonymous}::OptionCustomTest_psidDataArray_Test::TestBody()’:
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘128’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
     };
     ^
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘212’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘128’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
cc1plus: all warnings being treated as errors
Makefile:1122: polecenia dla obiektu 'libdhcp___unittests-option_custom_unittest.o' nie powiodły się
make[2]: *** [libdhcp___unittests-option_custom_unittest.o] Błąd 1

and

  CXX      libkea_dhcpsrv_la-srv_config.lo
pool.cc:191:40: error: unused parameter ‘excluded_prefix’ [-Werror=unused-parameter]
             const asiolink::IOAddress& excluded_prefix,
                                        ^
pool.cc:192:27: error: unused parameter ‘excluded_prefix_len’ [-Werror=unused-parameter]
             const uint8_t excluded_prefix_len) {
                           ^
  CXX      libkea_dhcpsrv_la-subnet.lo

I fixed both and pushed my changes. Please pull.

The first one was trivial to fix, while the second one was slightly
more complex. I added a check for excluded_prefix_len and commented
out excluded_prefix and added a @todo about it. The problem with PD
exclude is that it's really an infix, so it would be difficult to
implement and specific checks for its content. But I have a sketch of
an idea in mind. excluded_prefix is an infix that is supposed to start
at delegated_prefix_len and end at excluded_prefix_len, that is left
shifted, so there are no leading zeros. So the check could verify that
only the bits between those two are set. I did not implement this yet,
because I still have doubts. For example, it may be more convenient
for sysadmins to specify the full prefix, rather than in infix. So,
what are your thoughts on this? This is actually explained in detail
in section 4.2 of RFC6603. There's even C pseudocode that illustrates
that.

doc/examles/kea6/softwire46.json
Thanks for providing and example config. Growing number of examples
is something that won't happen overnight, so it's great that we're
taking small steps towards the right direction.

I would explicitly add lightweight 4over6 somewhere in the first
comment paragraph.

You can trim down the comment in around line 14 about May 2014
and having 3 backends supported. Maybe saying "#Let's use the
simplest backend - memfile" would do the trick here?

It would be very useful to have some comment regarding values provided
for s46-rule and s46-portparams. Especially the portparams. Not sure
if you recall, but that's something Normen complained about. If we can
make it easy to use and well documented, he will surely appreciate
that.

I think your example of excluded prefix length is incorrect. This is a
bit complex, I admit. But please take a look at the Section 4.2 of
RFC6603. This is briefly summarized in the last paragraph on page 4.

If the delegated prefix is /64 and the excluded one is /72, then the
actual option would contain only 8 bits, that's 0xca in your
example. This option was defined that way, so the same value can be
sent to all clients and it would be interpreted by each client
differently, depending on the value of IAPREFIX in IA_PD.

doc/guide/dhcp6-srv.xml
The description of S46 Rule Option should explain what those
parameters mean. How about the following text:

The S46 Rule option conveys a number of parameters:

- flags, an unsigned 8 bits integer, with currently only the most
  significant bit specified. It denotes whether the rule can be used
  for forwarding (128) or not (0). (Note your example is incorrect.
  This was caught when the draft was in RFC-Ed queue and AD made the
  decision to carry on as is. The bits are described in unfortunate
  way, so the only bit defined is 7th, so either 128 or 0 can be used,
  not 1 as you have in your example.)

- ea-len, an 8 bits long Embedded Address length. Allowed values
  range from 0 to 48.

- IPv4 prefix length, 8 bits long; expresses the prefix length of the
  Rule IPv4 prefix specified in the ipv4-prefix field.  Allowed
  values range from 0 to 32.

- IPv4 prefix, a fixed-length 32-bit field that specifies the IPv4
  prefix for the S46 rule.  The bits in the prefix after prefix4-len
  number of bits are reserved and MUST be initialized to zero by the
  sender and ignored by the receiver.

- IPv6 prefix in prefix/length notation that specifies the IPv6 domain
  prefix for the S46 rule.  The field is padded on the right with zero
  bits up to the nearest octet boundary when prefix6-len is not evenly
  divisible by 8.

Line 2508: you added a comma that is not needed.

Section "Supported DHCPv6 Standards" should mention RFC6334.

src/bin/dhcp6/dhcp6.spec
Please provide item_description for the new entries.

sarr_unittest.cc

The configuration 2 uses incorrect excluded prefix value. This is similar to my
comments for JSON example config above. The major reason why this option is
specified as infix is because it should have the same value for all delegated
prefixes. Note that the exclude is a "hole" in the specific prefix delegated to
the client, it's not a way to tell the server there's a certain prefix it should
omit. So the value you specified in the config example would in principle be
meaningful for one of the delegated prefixes (2001:db8:3::/64), but would
be meaningless for all other delegated prefixes.

This issue seems to persist in several places in the code and the
documentation. There are several ways we can solve this:

a) modify the code to allow specification of infix and do the actual
checks, i.e. verify that only the allowed bits are specified. This is
technically correct, but some people may find in difficult to
calculate the infix themselves.

b) modify the code to allow specifying the excluded prefix for the
first delegated prefix, and calculate the infix values based on it.
That would make your config examples correct, but the code would have
to be updated.

c) keep the code as is, add @todo and address it some time in the
future, when we'll be actually start caring about PD exclude. This is
easiest to do, but it will give people more opportunity to
misconfigure their DHCP server.

I don't have a strong preference here, but due to other tasks we have
planned for 1.2, I think going with c) seems reasonable.

The same applies to Dhcp6ParserTest.pdPoolPrefixExclude in
config_parser_unittest.cc and RenewTest?.renewWithExcludedPrefix in
renew_unitest.cc.

duid_factory.cc, docsis3_option_defs.h
Copyright year has been updated. Please pull.

option_int.h
The code change in this option class (and several others) set the
encapsulated space to dhcp4 or dhcp6. Why? Why does an array of
integers encapsulate anything? If you have a compelling reason
for doing this, maybe it would be better to set it in the Option
constructor once for all the objects and the only override it in
specific options (vendor and map-t/e containers)?

option6_pdexclude.cc|h
Why does the buffer parsing constructor take delegated_prefix and
delegated_prefix_length parameters? They're not documented BTW, but
rather than documenting them, I think we should remove them
completely. When we're parsing the buffer we don't know anything a
priori about the delegated prefixes at all. This must be fixed.
In particular, this option may be misplaced in the incoming message
and there could even be no delegated prefix at all. Nevertheless, the
constructor should parse the buffer and create option object for it.

I see there's delegated prefix and delegated prefix length stored in
this object. Those fields are not in the option, so they shouldn't
be there. That's actually a big flaw. When you have a PD pool with
1000s of prefixes in it and pd exclude defined, you just want to copy
a pointer to the pd-exclude option and not need to keep updating its
content for every PD you send to clients.

This is the route I went in Dibbler and it was a big mistake. I put
too much logic in the options, so they knew what parent options
they're in and tried to be smart about it. They weren't and the logic
became spread out in many places, with some options checking certain
things while other didn't. It ended up in a big mess. Options are
supposed to be simple containers what whatever comes over the wire.

In fact, I think the whole logic defined in Option6PDExclude::pack()
should be done once in constructor, not every time the option is being
sent. Remember the usage pattern. The object is created once and then
is being sent out 1000s of times.

option_data_types.h
I see that you moved the definition of OptionDefParams? to this
file. It used to be in std_option_defs.h. I think we should update
this structure to use std::string rather than const char* and use
vector<OptionDataType?> (or other suitable containers) rather than
const OptionDataType?*.

option_data_types.h

loggingan => logging an

Description of psid parameter for writePsid around line 564 says:

/// @param psid PSID value, where the lowest value is 0, and the
/// highest value is 2^(PSID length).

Shouldn't that be 2(PSID length) - 1? The 2(psid length) is too
large to fit.

std_option_defs.h
Why is pd-exclude defined as OPT_BINARY_TYPE and not
OPT_IPV6_PREFIX_TYPE? That's a honest question. I thought that type is
exactly for 8 bits length + variable size prefix.

option6_ia_unittest.cc
Copyright year has been updated. Please pull.

src/lib/dhcp/tests/option6_pdexclude_unittest.cc
The comment in line 62 does not seem correct. 0x3F is not 59, it's 63.

Several tests defined an array and then use it to initialize a vector.
Since we're moving to C++11, this:

    const uint8_t expected_data[] = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

    std::vector<uint8_t> expected_vec(expected_data,
                                      expected_data + sizeof(expected_data));

can be simplified to:

    std::vector<uint8_t> expected_vec = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

We most certainly don't want to spend time updating thousands of
existing tests, but it may be ok to add it for the new ones.

This concludes the review of changes in src/bin and src/lib/dhcp. I
still need to review changes in src/lib/dhcpsrv, which will be posted
as separate comment.

comment:6 follow-up: Changed 3 years ago by tomek

  • Owner changed from tomek to marcin

This is the second and final part of my review:

AUTHORS
I think the date should be October, not September. Some features take
many months to develop and we put the dates of the merge, not when
they were initially developed. This is now fixed on github24 branch.
Please pull.

src/lib/dhcpsrv/alloc_engine.cc
Around line 694 you removed the Pool6Ptr, so the code now effectively
uses method-wide pool object. That's fine, but that pool is used 3
lines later to get length of it. If the getPool does not succeed, we
will dereference a null pointer. There's even a todo about it. I think
this should be addressed, so I went ahead and tweaked the code a bit.

cfg_option_unittest.cc
const unsigned type uses implied integer type. We use explicit typing
everywhere, so it should be const unsigned int.

CfgOptionTest?.encapsulate is too long and should be split into several
smaller methods. This will come in handy, when we'll need more
unit-tests.

I fixed both issues.

src/lib/dhcpsrv/tests/pool_unittest.cc
handlded => handled. I fixed that. Please pull.


It seems that my compilation fix introduced in commit
79ad6e97c1be37704b4fae58d57d936497a27ada causes some regressions in
the tests. (It is ok if the excluded prefix is shorter than delegated
prefix if the former is 0, which indicates there's no exclusion at
all). I have fixed that now.


I think there should be at least one unit-test that specifies lw4over6
options and then checks that client was indeed assigned those options,
that they're nested properly and have expected values. This test
should in general be similar to RenewTest?.renewWithExcludedPrefix in
src/bin/dhcp6/tests/renew_unittest.cc. If you're ok with that, I can
develop such a test.


I tried to run all unit-tests, but experienced the following in
src/bin/d2:

2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.libdhcp-ddns/14892] DHCP_DDNS_NCR_UDP_RECV_ERROR UDP socket receive error while listening for DNS Update requests: Resource temporarily unavailable
2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.dhcp-to-d2/14892] DHCP_DDNS_QUEUE_MGR_RECV_ERROR application's queue manager was notified of a request receive error by its listener.
2016-10-24 19:01:20.578 DEBUG [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_STOPPED application's queue manager has stopped listening for requests.
2016-10-24 19:01:20.578 INFO  [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_RECOVERING application is attempting to recover from a queue manager IO error
2016-10-24 19:01:20.578 DEBUG [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_STARTED application's queue manager has begun listening for requests.
2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.libdhcp-ddns/14892] DHCP_DDNS_NCR_UDP_RECV_ERROR UDP socket receive error while listening for DNS Update requests: Resource temporarily unavailable

repeating many times. And also the following in src/lib/dhcp_ddns:

[ RUN      ] NameChangeUDPTest.roundTripTest
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:714: Failure
Value of: listener_->isIoPending()
  Actual: true
Expected: false
[  FAILED  ] NameChangeUDPTest.roundTripTest (1 ms)

The latter is addressed on master. The former I have not seen yet, but
since the code has not touched anything in DDNS area, I'll simply
assume that's an issue of the code being based on an old master.

I have pushed several smaller changes. Please pull and review.


In overall, there's a huge chunk of good code here. This ticket is way bigger
than average, so it would be very much ok to push any more significant changes
to separate ticket. In particular, the way how Option6PDExclude stores unnecessary
data (delegated prefix and prefix length) are great candidates for a new ticket.

Thank you for your thorough work on this ticket. It's much appreciated.

comment:7 Changed 3 years ago by marcin

Tomek,
Thanks for your thorough review. I realize it was not easy to go through so many changes. This is the artifact of the pull request which itself came so huge.

This work was previously conducted on the github24 branch. However, since then, there were a couple of relevant changes o the master branch, of which the most important one was about pool specific options in v6. Also, this code touches so many files that since it has been implemented a few conflicts with master branch arose. Therefore, I thought it will be good to merge/rebase the github24 branch. I did that and put it on the trac5016 branch. Some of the previous commits were squashed to easily separate previous work and my recent changes to address your review comments.

So, the most important message for you is to continue your review on the trac5016 branch starting from the following commit:

commit 0c0ca2fde709f015a891c30620ce5f8828e998b0
Author: Marcin Siodelski <marcin@isc.org>
Date:   Wed Oct 26 08:18:05 2016 +0200

    [5016] Merged pull request #24 from github24 and squashed it.
    
    This change includes the code implemented by the pull request
    submitter as well as Marcin's fixes/changes on top of it, but
    without a review.

In the following comments in this ticket I am going to respond to your individual review comments.

comment:8 in reply to: ↑ 5 ; follow-up: Changed 3 years ago by marcin

Replying to tomek:

The code did not compile for me. The issues encountered on Ubuntu
16.04 x64 were:

CXXLD    libdhcptest.la
ar: `u' modifier ignored since `D' is the default (see `U')
  CXX      libdhcp___unittests-option_custom_unittest.o
option_custom_unittest.cc: In member function ‘virtual void {anonymous}::OptionCustomTest_prefixDataArray_Test::TestBody()’:
option_custom_unittest.cc:863:5: error: narrowing conversion of ‘184’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
     };
     ^
option_custom_unittest.cc: In member function ‘virtual void {anonymous}::OptionCustomTest_psidDataArray_Test::TestBody()’:
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘128’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
     };
     ^
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘212’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
option_custom_unittest.cc:907:5: error: narrowing conversion of ‘128’ from ‘int’ to ‘char’ inside { } is ill-formed in C++11 [-Werror=narrowing]
cc1plus: all warnings being treated as errors
Makefile:1122: polecenia dla obiektu 'libdhcp___unittests-option_custom_unittest.o' nie powiodły się
make[2]: *** [libdhcp___unittests-option_custom_unittest.o] Błąd 1

and

  CXX      libkea_dhcpsrv_la-srv_config.lo
pool.cc:191:40: error: unused parameter ‘excluded_prefix’ [-Werror=unused-parameter]
             const asiolink::IOAddress& excluded_prefix,
                                        ^
pool.cc:192:27: error: unused parameter ‘excluded_prefix_len’ [-Werror=unused-parameter]
             const uint8_t excluded_prefix_len) {
                           ^
  CXX      libkea_dhcpsrv_la-subnet.lo

I fixed both and pushed my changes. Please pull.

Thanks.

The first one was trivial to fix, while the second one was slightly
more complex. I added a check for excluded_prefix_len and commented
out excluded_prefix and added a @todo about it. The problem with PD
exclude is that it's really an infix, so it would be difficult to
implement and specific checks for its content. But I have a sketch of
an idea in mind. excluded_prefix is an infix that is supposed to start
at delegated_prefix_len and end at excluded_prefix_len, that is left
shifted, so there are no leading zeros. So the check could verify that
only the bits between those two are set. I did not implement this yet,
because I still have doubts. For example, it may be more convenient
for sysadmins to specify the full prefix, rather than in infix. So,
what are your thoughts on this? This is actually explained in detail
in section 4.2 of RFC6603. There's even C pseudocode that illustrates
that.

Ok. I think I have fixed the PD exclude stuff. See below.

doc/examles/kea6/softwire46.json
Thanks for providing and example config. Growing number of examples
is something that won't happen overnight, so it's great that we're
taking small steps towards the right direction.

I would explicitly add lightweight 4over6 somewhere in the first
comment paragraph.

Added.

You can trim down the comment in around line 14 about May 2014
and having 3 backends supported. Maybe saying "#Let's use the
simplest backend - memfile" would do the trick here?

Removed the comment.

It would be very useful to have some comment regarding values provided
for s46-rule and s46-portparams. Especially the portparams. Not sure
if you recall, but that's something Normen complained about. If we can
make it easy to use and well documented, he will surely appreciate
that.

Added some comments for those options.

I think your example of excluded prefix length is incorrect. This is a
bit complex, I admit. But please take a look at the Section 4.2 of
RFC6603. This is briefly summarized in the last paragraph on page 4.

Where exactly? I couldn't find a relevant place.

If the delegated prefix is /64 and the excluded one is /72, then the
actual option would contain only 8 bits, that's 0xca in your
example. This option was defined that way, so the same value can be
sent to all clients and it would be interpreted by each client
differently, depending on the value of IAPREFIX in IA_PD.

doc/guide/dhcp6-srv.xml
The description of S46 Rule Option should explain what those
parameters mean. How about the following text:

The S46 Rule option conveys a number of parameters:

- flags, an unsigned 8 bits integer, with currently only the most
  significant bit specified. It denotes whether the rule can be used
  for forwarding (128) or not (0). (Note your example is incorrect.
  This was caught when the draft was in RFC-Ed queue and AD made the
  decision to carry on as is. The bits are described in unfortunate
  way, so the only bit defined is 7th, so either 128 or 0 can be used,
  not 1 as you have in your example.)

- ea-len, an 8 bits long Embedded Address length. Allowed values
  range from 0 to 48.

- IPv4 prefix length, 8 bits long; expresses the prefix length of the
  Rule IPv4 prefix specified in the ipv4-prefix field.  Allowed
  values range from 0 to 32.

- IPv4 prefix, a fixed-length 32-bit field that specifies the IPv4
  prefix for the S46 rule.  The bits in the prefix after prefix4-len
  number of bits are reserved and MUST be initialized to zero by the
  sender and ignored by the receiver.

- IPv6 prefix in prefix/length notation that specifies the IPv6 domain
  prefix for the S46 rule.  The field is padded on the right with zero
  bits up to the nearest octet boundary when prefix6-len is not evenly
  divisible by 8.

Ok. I have added this text. Thanks for proposing.

Line 2508: you added a comma that is not needed.

Removed.

Section "Supported DHCPv6 Standards" should mention RFC6334.

Does mention now.

src/bin/dhcp6/dhcp6.spec
Please provide item_description for the new entries.

Provided.

sarr_unittest.cc

The configuration 2 uses incorrect excluded prefix value. This is similar to my
comments for JSON example config above. The major reason why this option is
specified as infix is because it should have the same value for all delegated
prefixes. Note that the exclude is a "hole" in the specific prefix delegated to
the client, it's not a way to tell the server there's a certain prefix it should
omit. So the value you specified in the config example would in principle be
meaningful for one of the delegated prefixes (2001:db8:3::/64), but would
be meaningless for all other delegated prefixes.

This issue seems to persist in several places in the code and the
documentation. There are several ways we can solve this:

a) modify the code to allow specification of infix and do the actual
checks, i.e. verify that only the allowed bits are specified. This is
technically correct, but some people may find in difficult to
calculate the infix themselves.

b) modify the code to allow specifying the excluded prefix for the
first delegated prefix, and calculate the infix values based on it.
That would make your config examples correct, but the code would have
to be updated.

c) keep the code as is, add @todo and address it some time in the
future, when we'll be actually start caring about PD exclude. This is
easiest to do, but it will give people more opportunity to
misconfigure their DHCP server.

I don't have a strong preference here, but due to other tasks we have
planned for 1.2, I think going with c) seems reasonable.

The same applies to Dhcp6ParserTest.pdPoolPrefixExclude in
config_parser_unittest.cc and RenewTest?.renewWithExcludedPrefix in
renew_unitest.cc.

Yeah... I elected to go for option b). I think it is more common and natural to specify the entire prefix rather than infix. The code has been updated accordingly.

duid_factory.cc, docsis3_option_defs.h
Copyright year has been updated. Please pull.

Thanks.

option_int.h
The code change in this option class (and several others) set the
encapsulated space to dhcp4 or dhcp6. Why? Why does an array of
integers encapsulate anything? If you have a compelling reason
for doing this, maybe it would be better to set it in the Option
constructor once for all the objects and the only override it in
specific options (vendor and map-t/e containers)?

We have had that in the existing code. I guess, it is something to be considered in a separate ticket?

option6_pdexclude.cc|h
Why does the buffer parsing constructor take delegated_prefix and
delegated_prefix_length parameters? They're not documented BTW, but
rather than documenting them, I think we should remove them
completely. When we're parsing the buffer we don't know anything a
priori about the delegated prefixes at all. This must be fixed.
In particular, this option may be misplaced in the incoming message
and there could even be no delegated prefix at all. Nevertheless, the
constructor should parse the buffer and create option object for it.

I see there's delegated prefix and delegated prefix length stored in
this object. Those fields are not in the option, so they shouldn't
be there. That's actually a big flaw. When you have a PD pool with
1000s of prefixes in it and pd exclude defined, you just want to copy
a pointer to the pd-exclude option and not need to keep updating its
content for every PD you send to clients.

This is the route I went in Dibbler and it was a big mistake. I put
too much logic in the options, so they knew what parent options
they're in and tried to be smart about it. They weren't and the logic
became spread out in many places, with some options checking certain
things while other didn't. It ended up in a big mess. Options are
supposed to be simple containers what whatever comes over the wire.

In fact, I think the whole logic defined in Option6PDExclude::pack()
should be done once in constructor, not every time the option is being
sent. Remember the usage pattern. The object is created once and then
is being sent out 1000s of times.

Ok. I concur. I have made a revolution. The Option6PDExclude will still take the prefix but will only hold an infix. That was you can reuse the option for all pools. In addition, I removed the delegated_prefix and delegated_prefix_len from the constructor which takes wire data.

option_data_types.h
I see that you moved the definition of OptionDefParams? to this
file. It used to be in std_option_defs.h. I think we should update
this structure to use std::string rather than const char* and use
vector<OptionDataType?> (or other suitable containers) rather than
const OptionDataType?*.

Is this really so critical? Is this a step towards C++11?

option_data_types.h

loggingan => logging an

Fixed.

Description of psid parameter for writePsid around line 564 says:

/// @param psid PSID value, where the lowest value is 0, and the
/// highest value is 2^(PSID length).

Shouldn't that be 2(PSID length) - 1? The 2(psid length) is too
large to fit.

Fixed.

std_option_defs.h
Why is pd-exclude defined as OPT_BINARY_TYPE and not
OPT_IPV6_PREFIX_TYPE? That's a honest question. I thought that type is
exactly for 8 bits length + variable size prefix.

Fixed.

option6_ia_unittest.cc
Copyright year has been updated. Please pull.

Thanks.

src/lib/dhcp/tests/option6_pdexclude_unittest.cc
The comment in line 62 does not seem correct. 0x3F is not 59, it's 63.

Good catch. Fixed.

Several tests defined an array and then use it to initialize a vector.
Since we're moving to C++11, this:

    const uint8_t expected_data[] = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

    std::vector<uint8_t> expected_vec(expected_data,
                                      expected_data + sizeof(expected_data));

can be simplified to:

    std::vector<uint8_t> expected_vec = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

We most certainly don't want to spend time updating thousands of
existing tests, but it may be ok to add it for the new ones.

I didn't do this change. Though, I tried to compile Kea using std=c++11. It seems that we're not quite ready to migrate to c++11 and I don't want this ticket to be stuck until we are. What I found as a show stopper was that GTEST < 1.8 doesn't work with C++11. We don't document that Kea will now requires GTEST 1.8+. Also, is Jenkins ready to compile Kea as C++11 code? I think we should first make the switch of master and then merge the C++11 polluted code into it. For the time being I don't want it to be show stopper for merging this work which becomes conflicted with master very quickly (because it touches so many files).

This concludes the review of changes in src/bin and src/lib/dhcp. I
still need to review changes in src/lib/dhcpsrv, which will be posted
as separate comment.

Thanks. I will respond to the rest in a separate comment too.

comment:9 in reply to: ↑ 6 ; follow-up: Changed 3 years ago by marcin

  • Owner changed from marcin to tomek

Replying to tomek:

This is the second and final part of my review:

AUTHORS
I think the date should be October, not September. Some features take
many months to develop and we put the dates of the merge, not when
they were initially developed. This is now fixed on github24 branch.
Please pull.

As we speak... we almost reach November. :-) So, perhaps wait with further updating it till final merge?

src/lib/dhcpsrv/alloc_engine.cc
Around line 694 you removed the Pool6Ptr, so the code now effectively
uses method-wide pool object. That's fine, but that pool is used 3
lines later to get length of it. If the getPool does not succeed, we
will dereference a null pointer. There's even a todo about it. I think
this should be addressed, so I went ahead and tweaked the code a bit.

Ok.

cfg_option_unittest.cc
const unsigned type uses implied integer type. We use explicit typing
everywhere, so it should be const unsigned int.

"unsigned" is explicit typing because it means the same as "unsigned int", but it is shorter.

CfgOptionTest?.encapsulate is too long and should be split into several
smaller methods. This will come in handy, when we'll need more
unit-tests.

I fixed both issues.

src/lib/dhcpsrv/tests/pool_unittest.cc
handlded => handled. I fixed that. Please pull.

Thanks.


It seems that my compilation fix introduced in commit
79ad6e97c1be37704b4fae58d57d936497a27ada causes some regressions in
the tests. (It is ok if the excluded prefix is shorter than delegated
prefix if the former is 0, which indicates there's no exclusion at
all). I have fixed that now.


I think there should be at least one unit-test that specifies lw4over6
options and then checks that client was indeed assigned those options,
that they're nested properly and have expected values. This test
should in general be similar to RenewTest?.renewWithExcludedPrefix in
src/bin/dhcp6/tests/renew_unittest.cc. If you're ok with that, I can
develop such a test.

I don't have any issue with you developing this test.


I tried to run all unit-tests, but experienced the following in
src/bin/d2:

2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.libdhcp-ddns/14892] DHCP_DDNS_NCR_UDP_RECV_ERROR UDP socket receive error while listening for DNS Update requests: Resource temporarily unavailable
2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.dhcp-to-d2/14892] DHCP_DDNS_QUEUE_MGR_RECV_ERROR application's queue manager was notified of a request receive error by its listener.
2016-10-24 19:01:20.578 DEBUG [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_STOPPED application's queue manager has stopped listening for requests.
2016-10-24 19:01:20.578 INFO  [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_RECOVERING application is attempting to recover from a queue manager IO error
2016-10-24 19:01:20.578 DEBUG [kea-dhcp-ddns.dhcpddns/14892] DHCP_DDNS_QUEUE_MGR_STARTED application's queue manager has begun listening for requests.
2016-10-24 19:01:20.578 ERROR [kea-dhcp-ddns.libdhcp-ddns/14892] DHCP_DDNS_NCR_UDP_RECV_ERROR UDP socket receive error while listening for DNS Update requests: Resource temporarily unavailable

repeating many times. And also the following in src/lib/dhcp_ddns:

[ RUN      ] NameChangeUDPTest.roundTripTest
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:705: Failure
Value of: checkSendVsReceived(sent_ncrs_[i], received_ncrs_[i])
  Actual: false
Expected: true
ncr_udp_unittests.cc:714: Failure
Value of: listener_->isIoPending()
  Actual: true
Expected: false
[  FAILED  ] NameChangeUDPTest.roundTripTest (1 ms)

The latter is addressed on master. The former I have not seen yet, but
since the code has not touched anything in DDNS area, I'll simply
assume that's an issue of the code being based on an old master.

Maybe you should try now, as I rebased the code against master?

I have pushed several smaller changes. Please pull and review.


In overall, there's a huge chunk of good code here. This ticket is way bigger
than average, so it would be very much ok to push any more significant changes
to separate ticket. In particular, the way how Option6PDExclude stores unnecessary
data (delegated prefix and prefix length) are great candidates for a new ticket.

Thank you for your thorough work on this ticket. It's much appreciated.

Thanks. Hopefully I didn't omit anything. There are many changes and many review comments so whatever is left unfixed was not intentional.

comment:10 in reply to: ↑ 8 Changed 3 years ago by tomek

  • Add Hours to Ticket changed from 0 to 2
  • Total Hours changed from 0 to 2

Trimming down all the comments that are addressed already.

Replying to marcin:

Replying to tomek:
Ok. I think I have fixed the PD exclude stuff. See below.

Yes. Thanks for doing that. The code looks good now.

I think your example of excluded prefix length is incorrect. This is a
bit complex, I admit. But please take a look at the Section 4.2 of
RFC6603. This is briefly summarized in the last paragraph on page 4.

Where exactly? I couldn't find a relevant place.

It does not matter anymore. Since you chose to update the code - choice b) of my proposal - this example is now correct.

sarr_unittest.cc

The configuration 2 uses incorrect excluded prefix value. This is similar to my
comments for JSON example config above. The major reason why this option is
specified as infix is because it should have the same value for all delegated
prefixes. Note that the exclude is a "hole" in the specific prefix delegated to
the client, it's not a way to tell the server there's a certain prefix it should
omit. So the value you specified in the config example would in principle be
meaningful for one of the delegated prefixes (2001:db8:3::/64), but would
be meaningless for all other delegated prefixes.

This issue seems to persist in several places in the code and the
documentation. There are several ways we can solve this:

a) modify the code to allow specification of infix and do the actual
checks, i.e. verify that only the allowed bits are specified. This is
technically correct, but some people may find in difficult to
calculate the infix themselves.

b) modify the code to allow specifying the excluded prefix for the
first delegated prefix, and calculate the infix values based on it.
That would make your config examples correct, but the code would have
to be updated.

c) keep the code as is, add @todo and address it some time in the
future, when we'll be actually start caring about PD exclude. This is
easiest to do, but it will give people more opportunity to
misconfigure their DHCP server.

I don't have a strong preference here, but due to other tasks we have
planned for 1.2, I think going with c) seems reasonable.

The same applies to Dhcp6ParserTest.pdPoolPrefixExclude in
config_parser_unittest.cc and RenewTest?.renewWithExcludedPrefix in
renew_unitest.cc.

Yeah... I elected to go for option b). I think it is more common and natural to specify the entire prefix rather than infix. The code has been updated accordingly.

Thanks. The code is now much better.

option_int.h
The code change in this option class (and several others) set the
encapsulated space to dhcp4 or dhcp6. Why? Why does an array of
integers encapsulate anything? If you have a compelling reason
for doing this, maybe it would be better to set it in the Option
constructor once for all the objects and the only override it in
specific options (vendor and map-t/e containers)?

We have had that in the existing code. I guess, it is something to be considered in a separate ticket?

It was an honest question. If you can't remember any reasons why we did it
that way, maybe add comment that its purpose is unknown. I don't think
a ticket is needed, but if you think it would help, feel free to create
one for this.

option_data_types.h
I see that you moved the definition of OptionDefParams? to this
file. It used to be in std_option_defs.h. I think we should update
this structure to use std::string rather than const char* and use
vector<OptionDataType?> (or other suitable containers) rather than
const OptionDataType?*.

Is this really so critical? Is this a step towards C++11?

Nope, it was just a suggestion. Maybe I got over-enthusiastic with this
whole C++11 deal :) It's perfectly fine to keep the code as it is now,
especially given the issues you mentioned below.

Several tests defined an array and then use it to initialize a vector.
Since we're moving to C++11, this:

    const uint8_t expected_data[] = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

    std::vector<uint8_t> expected_vec(expected_data,
                                      expected_data + sizeof(expected_data));

can be simplified to:

    std::vector<uint8_t> expected_vec = {
        0x00, 0x43, // option code 67
        0x00, 0x02, // option length 2
        0x3F, 0x70  // excluded prefix length 63 + subnet id
    };

We most certainly don't want to spend time updating thousands of
existing tests, but it may be ok to add it for the new ones.

I didn't do this change. Though, I tried to compile Kea using std=c++11. It seems that we're not quite ready to migrate to c++11 and I don't want this ticket to be stuck until we are. What I found as a show stopper was that GTEST < 1.8 doesn't work with C++11. We don't document that Kea will now requires GTEST 1.8+. Also, is Jenkins ready to compile Kea as C++11 code? I think we should first make the switch of master and then merge the C++11 polluted code into it. For the time being I don't want it to be show stopper for merging this work which becomes conflicted with master very quickly (because it touches so many files).

Sure. It's fine to keep the code as it. We have 1000s of other code similar
to this.

comment:11 in reply to: ↑ 9 Changed 3 years ago by tomek

  • Owner changed from tomek to marcin

Replying to marcin:

Replying to tomek:

This is the second and final part of my review:

AUTHORS
I think the date should be October, not September. Some features take
many months to develop and we put the dates of the merge, not when
they were initially developed. This is now fixed on github24 branch.
Please pull.

As we speak... we almost reach November. :-) So, perhaps wait with further updating it till final merge?

We actually did. But the code is finally ready for merge. Al least from the engineering perspective. We'll discuss the "business" readiness for its merge on jabber.

I think there should be at least one unit-test that specifies lw4over6
options and then checks that client was indeed assigned those options,
that they're nested properly and have expected values. This test
should in general be similar to RenewTest?.renewWithExcludedPrefix in
src/bin/dhcp6/tests/renew_unittest.cc. If you're ok with that, I can
develop such a test.

I don't have any issue with you developing this test.

On a reflection, such unit-test will better fit into the lw4over6 hook code, where we will be actually generating those on the fly.

Maybe you should try now, as I rebased the code against master?

Yes, it works.

Thanks. Hopefully I didn't omit anything. There are many changes and many review comments so whatever is left unfixed was not intentional.

You have covered everything AFAICT.

Major thanks to you for driving this patch forward. Except the Cassandra thing, I believe this was the most complex and most extensive patch we have received.

There are 3 non-code related topics we should discuss:

  1. I think the ChangeLog? should mention you, Cristian and Andrei.
  1. Couple weeks ago Stephen asked me to write down the benefits of using github. I did it on KeaGithub? page (Benefits of github repo section) on our internal wiki. Can you read it, especially items 3 and 7? What's your opinion on squashing vs keeping the original commits?
  1. There's the business readiness argument that we should discuss on jabber.

Ok, that's it. I have no further comments.

p.s.
This ticket a total of 2 hours. I think 2 weeks are more appropriate. Can you update it something more realistic? I feel that Heidi will start looking at those fields soon and she may underappreciate the amount of work conducted here.

comment:12 Changed 3 years ago by marcin

  • Resolution set to complete
  • Status changed from reviewing to closed

Merged with commit 59b62eb3ddd9db62c04bd47cd8fbdc1af62fbc1b

Note: See TracTickets for help on using tickets.