Opened 8 years ago

Closed 8 years ago

#1224 closed task (fixed)

V4 packet library - basic framework

Reported by: stephen Owned by: tomek
Priority: medium Milestone: Sprint-DHCP-20111026
Component: dhcp Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Production of packet/option classes identified in the design in #1223

Subtickets

Change History (12)

comment:1 Changed 8 years ago by stephen

  • Milestone changed from DHCP 2011 to Sprint-DHCP-20111026

comment:2 Changed 8 years ago by tomek

  • Owner set to tomek
  • Status changed from new to accepted

comment:3 Changed 8 years ago by tomek

Pkt4 class implemented.
Imported include/dhcp.h from ISC DHCP as src/lib/dhcp/dhcp4.h (minor modifications only).
Implemented tests for Pkt4.

Note: It would be difficult to split fixed fields and pkt implementation itself. Therefore this ticket also includes work planned for #1225. Options work will be done as part of 1228.

comment:4 Changed 8 years ago by tomek

  • Owner changed from tomek to UnAssigned
  • Status changed from accepted to reviewing

Ready for review.

comment:5 Changed 8 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:6 follow-up: Changed 8 years ago by stephen

  • Owner changed from stephen to tomek

src/lib/dhcp/Makefile.am
Should add dhcp4.h to libdhcp_la_SOURCES

src/lib/dhcp/dhcp4.h
Rather than "format kept for easier merge" I think that the real reason is that DHCPv4 is still being developed and that this file may change if the code is extended.

Having said that, the DHCPv4 protocol is fairly fixed, so I think there is a very strong case for moving to a C++ paradigm. In particular, defining the different constants as enums would ensure that C++ type checking checked that constants were appropriate for the fields. At this stage of the work, it should not be too much of an overhead.

src/lib/dhcp/pkt4.h
Public constants in Pkt4 want a Doxygen-style comment before them.

Typographical error in the DHCPV4_PKT_HDR_LEN comment - "specifes".

In the Pkt4 constructor used for message transmission, does the first line require an "@brief" tag?

Same constructor, the "data" element is really a pointer to the data to be written to the message, not a pointer to received data.

Header comments for pack()/unpack(): there are no members named data_ or data_len_.

Header for unpack(): doesn't this parse a DHCPv4 packet, not a v6 one? Also, it should return true if the unpacking was successful, not if the "build" was successful.

len(): a better description would be that len() returns the size of the required buffer to build the packet with the current set of packet options.

getSname()/getFile(): I think it would be safer to return a "const vector<uint8_t>" unless there is a very strong guarantee that the Pkt4 structure will remain in existence while the the data is accessed.

setHWAddr(): there is the assumption that the macAddr argument points to an array that is long enough. Passing in a "const vector<uint8_t>&" would be better in that the length can be checked.

Description of ifindex: "Windows" is spelt with a capital "W". (You might not like it, but let's give it its proper name :-) ).

op_ description (first line): This is the only line in the file that is more than 80 characters long (and 80 characters is the default width at which my editor opens). Can we move the last word to the next line?

chaddr_/sname_/file_ fields. The declarations of the size of these arrays should use the constants defined earlier in the class.

bufferIn_: We need to look again at the hooks interface. I'm thinking that perhaps we don't need the buffer_rcvd() or buffer_end() callbacks - only let the users modify data in the form of a PktN& obejcts. (besides, the DhcpHooks document advises against modifying the input or output wire data.) In this case, we are not passing a raw data buffer to the hooks and so bufferIn_ can be an InputBuffer structure. (Not: if adopted, this requires a modification to DhcpHooks).

src/lib/dhcp/pkt6.cc
Constructors: for members that are of type IOAddress, there is no need to create an intermediate IOAddress object (i.e. ciaddr_("0.0.0.0") will work as well as ciaddr_(IOAddress("0.0.0.0")). The former is initializing via a direct constructor, the latter is creating an intermediate object and initializing ciaddr_ via a copy constructor.

Having said that, and at the risk of premature optimisation, there is a lot of parsing of "0.0.0.0" every time a Pkt4 object is created. Would it be better to declare an external variable like:

namespace {
const IOAddress DEFAULT_ADDRESS("0.0.0.0");
}

... and to initialize the address objects with a copy of DEFAULT_ADDRESS? That way the parsing is only done once.

setHWAddr: should be spaces around binary operators (i.e. "hlen > MAC_CHADDR_LEN").

setHWAddr: can the macAddr be encapsulated in a vector (in which case there is no need to pass the hlen argument)?

setFile(): comment at the end of the method shows a cut and paste :-)

src/lib/dhcp/tests/pkt4_unittest.cc
constructor: use the form "type* variable" instead of "type * variable".

constructor: surround binary operators with spaces ("i = 0" instead of "i=0").

constructor: even single-line "for" loops should have the statement enclosed in "{" and "}".

Suggest starting comments with capital letters.

generateTestPacket1(): Space between parentheses in setFlags() call.

generateTestPacket2(): I must admit I would have used a vector<> and a std::copy using a back_inserter object to copy data into it. That way there is no chance of overflowing the buffer. As it is:

  • In the memcpy calls, use the constants as the length argument instead of the numbers. (If numbers are used, then it would be worth checking their value with an EXPECT_EQ check).
  • On exit from the function, we should expect that (offset - &buf[0]) is equal to DHCPV4_PKT_HDR_LEN.

fixedFields test - would prefer "EXPECT_EQ(0, memcmp(...)) to EXPECT_FALSE().

comment:7 in reply to: ↑ 6 Changed 8 years ago by tomek

  • Owner changed from tomek to stephen

Replying to stephen:

Thanks for the review.

src/lib/dhcp/Makefile.am
Should add dhcp4.h to libdhcp_la_SOURCES

Done.

src/lib/dhcp/dhcp4.h
Rather than "format kept for easier merge" I think that the real reason is that DHCPv4 is still being developed and that this file may change if the code is extended.

Having said that, the DHCPv4 protocol is fairly fixed, so I think there is a very strong case for moving to a C++ paradigm. In particular, defining the different constants as enums would ensure that C++ type checking checked that constants were appropriate for the fields. At this stage of the work, it should not be too much of an overhead.

That is probably true that we won't be doing any merges. Reworked most of the header text. I left defines that I don't know about. I don't want to guess names for enums just yet. I'll convert them to C++ enums as soon as I start using them.

src/lib/dhcp/pkt4.h
Public constants in Pkt4 want a Doxygen-style comment before them.

Added.

Typographical error in the DHCPV4_PKT_HDR_LEN comment - "specifes".

Fixed.

In the Pkt4 constructor used for message transmission, does the first line require an "@brief" tag?

Added.

Same constructor, the "data" element is really a pointer to the data to be written to the message, not a pointer to received data.

Comment was confusing. It is hopefully more readable now. That constructor is used when receiving data. Clarified that input data will be copied to bufferIn_.

Header comments for pack()/unpack(): there are no members named data_ or data_len_.

Copy-and-paste error. Clarified that bufferIn_ and bufferOut_ will be used.

Header for unpack(): doesn't this parse a DHCPv4 packet, not a v6 one? Also, it should return true if the unpacking was successful, not if the "build" was successful.

Fixed.

len(): a better description would be that len() returns the size of the required buffer to build the packet with the current set of packet options.

Done.

getSname()/getFile(): I think it would be safer to return a "const vector<uint8_t>" unless there is a very strong guarantee that the Pkt4 structure will remain in existence while the the data is accessed.

Done are requested. Note that is made use of this method both slower and more complicated. See changes in pkt4_unittest.cc.

setHWAddr(): there is the assumption that the macAddr argument points to an array that is long enough. Passing in a "const vector<uint8_t>&" would be better in that the length can be checked.

This function was implemented under the assumption that its user is a sane developer that understands that passing a pointer to a buffer and lying about its length will cause problems. As we do not know where MAC address date will come from (note that there could possibly be different sources of that information) at this stage, using const uint8_t* seems like a reasonable choice. Expanded comment to explain that and warn about passing hlen that is larger than buffer pointed by macAddr. Also added check if macAddr is NULL.

Description of ifindex: "Windows" is spelt with a capital "W". (You might not like it, but let's give it its proper name :-) ).

How about wINDOWS? :-). Ok, fixed. It is now "MS Windows".

op_ description (first line): This is the only line in the file that is more than 80 characters long (and 80 characters is the default width at which my editor opens). Can we move the last word to the next line?

It is now less than 80 characters long. Also expanded this description a bit.
BTW What happened to the discussion about extending allowed line length to 100?

chaddr_/sname_/file_ fields. The declarations of the size of these arrays should use the constants defined earlier in the class.

Done.

bufferIn_: We need to look again at the hooks interface. I'm thinking that perhaps we don't need the buffer_rcvd() or buffer_end() callbacks - only let the users modify data in the form of a PktN& obejcts. (besides, the DhcpHooks document advises against modifying the input or output wire data.) In this case, we are not passing a raw data buffer to the hooks and so bufferIn_ can be an InputBuffer structure. (Not: if adopted, this requires a modification to DhcpHooks).

I thought that we decided that on a high-level, hook user will have the ability to inject his changes on 2 levels: binary format and later at parsed options. There are valid cases when user may want to modify wire format - e.g. to fix fields for buggy clients, to recalculate checksums etc.

It is true that user will have the ability to modify PktN& object, but we need to pass it twice. First, after we receive raw packet, we will pass it to buffer_rcvd(). Essential part is that hook can modify incoming packet. Therefore we must do parsing AFTER this hook returns. Then we parse this buffer, create options, set fields and so on. After parsing is complete, we call packet_rcvd() and user can tweak this packet further on a fields and options level.

We could simplify this under the condition that there will be NO checks or any sort of processing between receiving buffer and parsing it. I can't specify any concrete examples at this moment, but I believe that one of useful functions of hooks at this stage would be conformance fix. Let me explain. Imagine a network with old devices that cannot be upgraded or migrated. Some of those older clients may send slightly malformed packets. We have first hand experience with cases when clients abused single domain option to send list of domains. One way of solving this problem is to implement hook that will fix such packets on the fly.

src/lib/dhcp/pkt6.cc
Constructors: for members that are of type IOAddress, there is no need to create an intermediate IOAddress object (i.e. ciaddr_("0.0.0.0") will work as well as ciaddr_(IOAddress("0.0.0.0")). The former is initializing via a direct constructor, the latter is creating an intermediate object and initializing ciaddr_ via a copy constructor.

Having said that, and at the risk of premature optimisation, there is a lot of parsing of "0.0.0.0" every time a Pkt4 object is created. Would it be better to declare an external variable like:

namespace {
const IOAddress DEFAULT_ADDRESS("0.0.0.0");
}

... and to initialize the address objects with a copy of DEFAULT_ADDRESS? That way the parsing is only done once.

Done.

setHWAddr: should be spaces around binary operators (i.e. "hlen > MAC_CHADDR_LEN").

setHWAddr: can the macAddr be encapsulated in a vector (in which case there is no need to pass the hlen argument)?

It could be. But hlen is a field in DHCPv4 packet. I prefer clean method that sets those fields as they are. Adding some logic that hlen will be set based on length of a vector is not the way to go here, especially since there are corner cases ahead. For example for Infiniband hlen is 20, but hlen field must be set to 0 and actual hardware address is stored in client-identifier option. htype, hlen and hardware address really should be set together in the simplest possible way.

setFile(): comment at the end of the method shows a cut and paste :-)

Fixed.


src/lib/dhcp/tests/pkt4_unittest.cc
constructor: use the form "type* variable" instead of "type * variable".

constructor: surround binary operators with spaces ("i = 0" instead of "i=0").

Sorry for this recurring mistake. That's a notation I've been using for years and it's tricky to discard old habits.

constructor: even single-line "for" loops should have the statement enclosed in "{" and "}".

Done.

Suggest starting comments with capital letters.

Should comments form a complete sentence? I was trying to follow a logic that if a comment contains sentence, it should be formed as such (starting with capital letter and ending with a dot.). If it is just a short explanation, it is written in lower-case, without a dot.

generateTestPacket1(): Space between parentheses in setFlags() call.

generateTestPacket2(): I must admit I would have used a vector<> and a std::copy using a back_inserter object to copy data into it. That way there is no chance of overflowing the buffer.

I've never used back_inserter object. Could you commit code that does that?

fixedFields test - would prefer "EXPECT_EQ(0, memcmp(...)) to EXPECT_FALSE().

Done.

Fixes pushed to trac1224. Please re-review.

comment:8 follow-up: Changed 8 years ago by stephen

  • Owner changed from stephen to tomek

src/lib/dhcp/dhcp4.h

I left defines that I don't know about.

OK, can you add a comment to that effect in the file? Also, commenting them out (#if 0 ... #endif) would be a check that they aren't used unexpectedly.

Can't the DHCP_OPTIONS_COOKIE be defined as a "static const char*"?

Not BIND 10 convention to indent fields within a namespace. (Also, to be consistent, the trailing brace of the BOOTPTypes should be on a new line.)

I don't want to guess names for enums just yet

If I understand you, I think you mean that you don't want to change the names of members of the enums (e.g. change DHO_PAD to "Pad" etc.). If this is the case, I don't think there is any reason to; if nothing else, leaving the names the same as DHCPv4 will make it easier to read the code.

The reason for wanting enums was type safety. If you have method that takes a DHCPOptionType (say), but when calling that method incorrectly call it with the argument of DHCPDISCOVER (for example), you will get a compile-time error. If the methods were to just accept an int, this would go unnoticed unless it caused some form of run-time failure.

src/lib/dhcp/pkt4.cc
I've just noticed this and I'm a bit surprised that it has compiled. The Pkt4 class declaration in pkt4.h is in the namespace isc::dhcp, but the definitions in this file are in the namespace isc.

Also, the trailing brace of a namespace doesn't need a semi-colon after it.

load_addr_ and remote_addr_ in the Pkt4 constructors can also be initialised using DEFAULT_ADDRESS.

setHWAddr(): there is the assumption that the macAddr argument points to an array that is long enough. Passing in a "const vector<uint8_t>&" would be better in that the length can be checked.

This function was implemented under the assumption that its user is a sane developer that understands that passing a pointer to a buffer and lying about its length will cause problems.

I think that's assumed. But as Jinmei has pointed out on a related ticket, errors can happen and we all make them.

As we do not know where MAC address date will come from (note that there could possibly be different sources of that information) at this stage, using const uint8_t* seems like a reasonable choice. Expanded comment to explain that and warn about passing hlen that is larger than buffer pointed by macAddr. Also added check if macAddr is NULL.

Leave it for now, but I do think we will need to revisit this later.

bufferIn_: We need to look again at the hooks interface...

Your comments about modifying a packet for conformance are well-made. OK, we need to have some way to vary the raw packet data. As the packet is raw, why are we passing a Pkt6 object to buffer_rcvd()/buffer_sent()? Wouldn't it be better to pass a "vector<uint8_t>&"? That way the called function can quite happily modify the data and the length of the data? It also has the advantage in that the correct allocator for the vector should be used, which avoids the problems that arise on some operating systems where memory is allocated in one DLL and freed in another.

Should comments form a complete sentence?

I admit I sometimes omit trailing periods on one-line comments which I really shouldn't do. The reason I like capital letters at the start of comments (and spaces around binary operators etc.) is that I believe that the code and comments are easier to read if they conform as closely as possible to the rules of written English.

generateTestPacket2(): I must admit I would have used a vector<> and a std::copy using a back_inserter object to copy data into it. That way there is no chance of overflowing the buffer.

I've never used back_inserter object. Could you commit code that does that?

Here's a modified version of generateTestPacket2():

// Returns a vector containing a DHCPv4 packet header.
vector<uint8_t>
generateTestPacket2() {

    // That is only part of the header. It contains all "short" fields, 
    // larger fields are constructed separately.
    uint8_t hdr[] = {
        1, 6, 6, 13,            // op, htype, hlen, hops,
        0x12, 0x34, 0x56, 0x78, // transaction-id
        0, 42, 0xff, 0xff,      // 42 secs, 0xffff flags
        192, 0, 2, 1,           // ciaddr
        1, 2, 3, 4,             // yiaddr
        192, 0, 2, 255,         // siaddr
        255, 255, 255, 255,     // giaddr
    };

    // Initialize the vector with the header fields defined above.
    vector<uint8_t> buf(hdr, hdr + sizeof(hdr));

    // Append the large header fields.
    copy(dummyMacAddr, dummyMacAddr + Pkt4::MAX_CHADDR_LEN, back_inserter(buf));
    copy(dummySname, dummySname + Pkt4::MAX_SNAME_LEN, back_inserter(buf));
    copy(dummyFile, dummyFile + Pkt4::MAX_FILE_LEN, back_inserter(buf));

    // Should now have all the header, so check.  The "static_cast" is used
    // to get round an odd bug whereby the linker appears not to find the
    // definition of DHCPV4_PKT_HDR_LEN if it appears within an EXPECT_EQ().
    EXPECT_EQ(static_cast<size_t>(Pkt4::DHCPV4_PKT_HDR_LEN), buf.size());

    return (buf);
}

comment:9 in reply to: ↑ 8 Changed 8 years ago by tomek

Replying to stephen:

src/lib/dhcp/dhcp4.h

I left defines that I don't know about.

OK, can you add a comment to that effect in the file? Also, commenting them out (#if 0 ... #endif) would be a check that they aren't used unexpectedly.

Done.

Can't the DHCP_OPTIONS_COOKIE be defined as a "static const char*"?

No, it cannot. That causes "defined but not used" warning that is treated as error. Changed to static const char* as suggested, but also commented it out.

Not BIND 10 convention to indent fields within a namespace. (Also, to be consistent, the trailing brace of the BOOTPTypes should be on a new line.)

Done.

I don't want to guess names for enums just yet

If I understand you, I think you mean that you don't want to change the names of members of the enums (e.g. change DHO_PAD to "Pad" etc.). If this is the case, I don't think there is any reason to; if nothing else, leaving the names the same as DHCPv4 will make it easier to read the code.

The reason for wanting enums was type safety. If you have method that takes a DHCPOptionType (say), but when calling that method incorrectly call it with the argument of DHCPDISCOVER (for example), you will get a compile-time error. If the methods were to just accept an int, this would go unnoticed unless it caused some form of run-time failure.

Yes, I understand. The problem is with naming the enums. I'm not sure what RAI or defines starting with RAI are. I could try to invent generic name like RaiTypes?, but I don't want to do that. Message and option types are easy. I understand what they are meaning, so naming those enums DHCPMessageType or DHCPOptionType is adequate.

Anyway, both remaining define groups (RAI_* and FQDN_*) are ifdefed 0 and commented appropriately.

src/lib/dhcp/pkt4.cc
I've just noticed this and I'm a bit surprised that it has compiled. The Pkt4 class declaration in pkt4.h is in the namespace isc::dhcp, but the definitions in this file are in the namespace isc.

Also, the trailing brace of a namespace doesn't need a semi-colon after it.

Done.

load_addr_ and remote_addr_ in the Pkt4 constructors can also be initialised using DEFAULT_ADDRESS.

Done.

setHWAddr(): there is the assumption that the macAddr argument points to an array that is long enough. Passing in a "const vector<uint8_t>&" would be better in that the length can be checked.

This function was implemented under the assumption that its user is a sane developer that understands that passing a pointer to a buffer and lying about its length will cause problems.

I think that's assumed. But as Jinmei has pointed out on a related ticket, errors can happen and we all make them.

I give up. const vector<uint8_t>& it is.

As we do not know where MAC address date will come from (note that there could possibly be different sources of that information) at this stage, using const uint8_t* seems like a reasonable choice. Expanded comment to explain that and warn about passing hlen that is larger than buffer pointed by macAddr. Also added check if macAddr is NULL.

Leave it for now, but I do think we will need to revisit this later.

Changed the code as requested.

bufferIn_: We need to look again at the hooks interface...

Your comments about modifying a packet for conformance are well-made. OK, we need to have some way to vary the raw packet data. As the packet is raw, why are we passing a Pkt6 object to buffer_rcvd()/buffer_sent()? Wouldn't it be better to pass a "vector<uint8_t>&"?

Note that hook document predates the actual implementation. Just recently nobody told me "Tomek, please drop the coding work for now and update the document". :-)

That way the called function can quite happily modify the data and the length of the data? It also has the advantage in that the correct allocator for the vector should be used, which avoids the problems that arise on some operating systems where memory is allocated in one DLL and freed in another.

Valid points. Hooks document should be updated then. As a convenience to Shawn, whoever updates the hooks document, should explain him, why we his recommendation to keep parameter formats unified between calls is no longer honored.

Should comments form a complete sentence?

I admit I sometimes omit trailing periods on one-line comments which I really shouldn't do. The reason I like capital letters at the start of comments (and spaces around binary operators etc.) is that I believe that the code and comments are easier to read if they conform as closely as possible to the rules of written English.

Agree. Noting that obviously I'm not a native speaker, I try to follow the rules of written English. Please let me know if there are any comments that you want me fixed. Having said that, I can continue working on comments text to better adhere to proper English and improve my English skills in the process. Alternatively, we could consider possibility of accepting comments as they are now, and move on to working on a new code. :-)

generateTestPacket2(): I must admit I would have used a vector<> and a std::copy using a back_inserter object to copy data into it. That way there is no chance of overflowing the buffer.

I've never used back_inserter object. Could you commit code that does that?

Here's a modified version of generateTestPacket2():

// Returns a vector containing a DHCPv4 packet header.
vector<uint8_t>
generateTestPacket2() {

    // That is only part of the header. It contains all "short" fields, 
    // larger fields are constructed separately.
    uint8_t hdr[] = {
        1, 6, 6, 13,            // op, htype, hlen, hops,
        0x12, 0x34, 0x56, 0x78, // transaction-id
        0, 42, 0xff, 0xff,      // 42 secs, 0xffff flags
        192, 0, 2, 1,           // ciaddr
        1, 2, 3, 4,             // yiaddr
        192, 0, 2, 255,         // siaddr
        255, 255, 255, 255,     // giaddr
    };

    // Initialize the vector with the header fields defined above.
    vector<uint8_t> buf(hdr, hdr + sizeof(hdr));

    // Append the large header fields.
    copy(dummyMacAddr, dummyMacAddr + Pkt4::MAX_CHADDR_LEN, back_inserter(buf));
    copy(dummySname, dummySname + Pkt4::MAX_SNAME_LEN, back_inserter(buf));
    copy(dummyFile, dummyFile + Pkt4::MAX_FILE_LEN, back_inserter(buf));

    // Should now have all the header, so check.  The "static_cast" is used
    // to get round an odd bug whereby the linker appears not to find the
    // definition of DHCPV4_PKT_HDR_LEN if it appears within an EXPECT_EQ().
    EXPECT_EQ(static_cast<size_t>(Pkt4::DHCPV4_PKT_HDR_LEN), buf.size());

    return (buf);
}

Thank you. Replaced old implementation with your code.

Code changes pushed to trac1224. Please review.

comment:10 Changed 8 years ago by tomek

  • Owner changed from tomek to stephen

comment:11 Changed 8 years ago by stephen

  • Owner changed from stephen to tomek

Seems OK - please merge.

comment:12 Changed 8 years ago by tomek

  • Resolution set to fixed
  • Status changed from reviewing to closed

Code merged to master. Passes make distcheck and make check. Pushed to master. Closing ticket.

Note: See TracTickets for help on using tickets.