Opened 9 years ago

Closed 9 years ago

#998 closed task (complete)

IP based ACL check

Reported by: vorner Owned by: stephen
Priority: medium Milestone: Sprint-20110628
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 5.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

This will be based on the the #977 base class and it will check if a given IP address (provided as JSON parameter in constructor) matches the one from parameter.

It needs to define the struct that will describe the packet. It might be just a struct with the IP address and the message for now.

Subtickets

Attachments (2)

editorial.diff (4.8 KB) - added by jinmei 9 years ago.
pton.diff (1.4 KB) - added by jinmei 9 years ago.

Download all attachments as: .zip

Change History (29)

comment:1 Changed 9 years ago by stephen

  • Owner set to stephen
  • Status changed from new to assigned

comment:2 Changed 9 years ago by stephen

  • Owner changed from stephen to UnAssigned
  • Status changed from assigned to reviewing

It needs to define the struct that will describe the packet.

This has been left as a template parameter. When used, the implementation will need to provide template specialisations to extract the address being tested from the structure and pass to the internal comparison methods.

comment:3 Changed 9 years ago by stephen

  • Estimated Difficulty changed from 0.0 to 5

comment:4 Changed 9 years ago by jinmei

Not decided to pick up the review task (I'd happily leave it to someone
else:-), but noticed a couple of things:

  • cppcheck fails:
    src/lib/acl/tests/ip_check_unittest.cc:31: check_fail: Member variable 'GeneralAddress::isv4' is not initialised in the constructor. (warning,uninitVar)
    
  • it seems getInverse() can (should) be a const member function.

comment:5 Changed 9 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:6 Changed 9 years ago by stephen

  • Owner changed from stephen to UnAssigned

Corrected points raised by Jinmei (commit 67a88d3fd748cc42730e142cbfa79d0b7fb7a813).

comment:7 Changed 9 years ago by jinmei

I've taken a bit more closer look at the branch, mainly for #999.

I have some design level question/comment/suggestion.

Observations

First, I don't see much benefit in having both Ipv4Check and Ipv6Check
as derived classes of IpCheck?. IpCheck? is a concrete class that
internally creates and holds IpvXCheck. This way, applications can
(and I guess will) only depend on IpCheck?, not only when using the
class (as in the usual usage of polymorphism) but also when creating
it. So IpvXCheck doesn't seem to have to be a derived class of
IpCheck? (or even a derived class of Check).

Second, regardless of whether they are derived from IpCheck?, I'm not
so sure if it makes sense to separate classes for IPv4 and IPv6.
Essentially they do the same thing: dealing with some binary data with
a fixed maximum size and a variable-length effective size (in bits),
and matching given data of the maximum size against its internal data
(up to the effective size). So, most of the code logic should be
sharable, and it seems we can implement everything in a single class
with an identifier of the address family.

Of course, separating classes (with polymorphism) has its own
advantages: We'll be able to benefit from stronger type checks; the
size of the objects may be smaller (because the address family is
embedded in the type), which can be a big plus if/when we want to a
large number of ACL rules. In that case, however, I'd suggest
combining the main code logic as much as possible either via C++
template or some design pattern techniques like template methods.

Third: maybe this is premature at this point, but I guess Ip(vX)Check
will need to contain the information of "which address it is", such as
the source address or destination address. For example, consider the
following ACL entry:

  allow from 2001:db8:1::/64 to 2001:db8:2::/64

In this case, the "Context" would be something like a "Packet" class,
that stores the source and destination address information. Both the
"from 2001:db8:1::/64" and "to 2001:db8:2::/64" would be represented
as separate Ip(v6)Check objects, and the former/latter would have to
know it's a source/destination address; otherwise the Context cannot
determine which address it should extract to perform the match.

Finally: I suspect methods like "compare()" would need to take
something more generic, at least some opaque data and the address
family identifier. For example, consider these rules:

  allow from 2001:db8:1::/64 to 2001:db8:2::/64
  allow from 192.0.2.0/24 to 192.0.2.53

And assume that the "Packet" class is IP-version independent (just
like our IOMessage class). Since we'd like to keep things version
independently as much as possible, IpCheck?<Packet>::matches() would
like to do something like this:

    if (getAddressType() == IpCheck::SOURCE) { // see the third point
       return (compare(packet.getSourceAddr()));
   } else {
       return (compare(packet.getDestinationAddr()));
   }

Now, the address family of getSourceAddr() may be different from the
IP version for this IpCheck? instance, and in that case we'd be
expected to return "false". So, we'd like compare() to take various
types of addresses, and return false if the family doesn't match.

In my experimental attempt, I'm thinking about something like this:

struct IPAddress {
    IPAddress(const struct sockaddr& sa);
    const int family;
    const uint8_t* const data;
    const size_t length;
}

with this the compare() method would take "const IPAddress&". But
this is just an example structure. I'm not pushing it if there's a
better way to implement the concept.

Suggestion

Considering all these, I'd suggest

  • Either make IpCheck? a single "for all IP versions" check class, or to make IpvXCheck direct subclass of Check. In the latter case IpCheck? will be removed, some factory function will be introduced, and employ some code sharing technique (such as templates).
  • extend the compare() method (or something equivalent to that if addressing the first point affects this part of interface) so that it can take different types of addresses
  • (maybe optionally) add an interface to Ip(vX)Check to maintain and return the type of addresses. The "address type" may better be templated (like the "Action" type of the ACL class).

(If these suggestions are vague, I can show it via sample code)

Other comments

Here are some code level comments I happened to notice.

  • I'd say "IP" instead of "Ip" because "IP" is an acronym
  • I'd use unnamed namespace wherever possible in the test, and avoid using file-scope "static'
  • can't compare() be private?
  • "getNetmask(): const uint32_t" should be uint32_t. The const is redundant, and at least clang++ actually treats it a fatal warning.

comment:8 Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to stephen

comment:9 follow-ups: Changed 9 years ago by stephen

  • Owner changed from stephen to jinmei

First, I don't see much benefit in having both Ipv4Check and Ipv6Check as derived classes of IpCheck?...

Which version of the code did you review? IpCheck, Ipv4Check and Ipv6Check were all derived from Check.

Second, regardless of whether they are derived from IpCheck?, I'm not so sure if it makes sense to separate classes for IPv4 and IPv6. Essentially they do the same thing: dealing with some binary data with a fixed maximum size and a variable-length effective size (in bits), and matching given data of the maximum size against its internal data (up to the effective size). So, most of the code logic should be sharable, and it seems we can implement everything in a single class with an identifier of the address family ...

That's a very valid point and, on reflection, I agree with you. I have rewritten the code to put everything in one class.

Third: maybe this is premature at this point, but I guess Ip(vX)Check will need to contain the information of "which address it is", such as the source address or destination address. For example, consider the following ACL entry:

allow from 2001:db8:1::/64 to 2001:db8:2::/64

In this case, the "Context" would be something like a "Packet" class, that stores the source and destination address information. Both the "from 2001:db8:1::/64" and "to 2001:db8:2::/64" would be represented as separate Ip(v6)Check objects, and the former/latter would have to know it's a source/destination address; otherwise the Context cannot determine which address it should extract to perform the match.

The class as implemented here is only a single ACL match (or non-match - the capability to check for a non-match was trivial, which is why I added it to this class). The ACL you are suggesting in the example is really a combined ACL:

allow (source matches 2001:db8:1::/64) AND (dest matches 2001:db8:2::/64)

I would see then implementation as being a class containing two IPCheck classes, one for source and one for destination. It would be the container class that knows the structure of the packet and distinguishes between source and destination address.

However, consider the possible ACL:

allow (source matches 2001:db8:1::/64) AND (protocol is TCP)

(I don't know whether we would want to allow access based on protocol, but as it is a possibility so I think we should consider it.) In this case a more general class allowing the AND between two ACLs is called for. Applying this to the IP check problem, an alternative approach to a single class checking both source and destination addresses could be to derive SourceIPCheck and DestinationIPCheck classes from IPCheck and combine them using the AND operator. (All the derived classes would do is provide a matches() method to pass the source/destination IP address from the packet to the parent IPCheck::matches())

Finally: I suspect methods like "compare()" would need to take something more generic, at least some opaque data and the address family identifier. For example...

The entry into compare() is through matches(), which is currently unspecified until it is known what structure is going to be used as "Context". (The idea is that once we know, a template specialisation can be written.)

Now, the address family of getSourceAddr() may be different from the IP version for this IpCheck instance, and in that case we'd be expected to return "false". So, we'd like compare() to take various types of addresses, and return false if the family doesn't match.

Whoops! I missed that. Added.

Either make IpCheck a single "for all IP versions" check class, or to make IpvXCheck direct subclass of Check. In the latter case IpCheck will be removed, some factory function will be introduced, and employ some code sharing technique (such as templates).

The new IPCheck is now a single class.

extend the compare() method (or something equivalent to that if addressing the first point affects this part of interface) so that it can take different types of addresses

Let's discuss this further. As noted above, I think that this can be implemented by the planned logical operator ACLs.

(maybe optionally) add an interface to Ip(vX)Check to maintain and return the type of addresses.

Done.

I'd say "IP" instead of "Ip" because "IP" is an acronym

I used Ip to avoid a run of capitals, but I'm not dogmatic about it and have renamed the class IPCheck.

I'd use unnamed namespace wherever possible in the test, and avoid using file-scope "static'

Done.

can't compare() be private?

An oversight - it is now.

"getNetmask(): const uint32_t" should be uint32_t. The const is redundant, and at least clang++ actually treats it a fatal warning.

Treating all addresses as arrays, this now returns a vector.

Thanks for the comprehensive review.

comment:10 in reply to: ↑ 9 ; follow-ups: Changed 9 years ago by vorner

Hello

The notification caught my eye, so I had a short look.

Replying to stephen:

The class as implemented here is only a single ACL match (or non-match - the capability to check for a non-match was trivial, which is why I added it to this class). The ACL you are suggesting in the example is really a combined ACL:

The non-match, I guess, will not be used, for consistency. We probably will have a general not, so having a different syntax for not on IP seems strange.

allow (source matches 2001:db8:1::/64) AND (dest matches 2001:db8:2::/64)

I would see then implementation as being a class containing two IPCheck classes, one for source and one for destination. It would be the container class that knows the structure of the packet and distinguishes between source and destination address.

But I don't see a way to say „use this context and extract the source address“ for one version and „use the same context, but extract the destination address“. Maybe it's there, but I didn't notice it.

And, looking into the file, I didn't find the body of the match function. Do I look bad?

On the other side, is there any reason to have all the assignment and copy constructors and so on?

With regards

comment:11 in reply to: ↑ 10 Changed 9 years ago by jinmei

Replying to vorner:

The notification caught my eye, so I had a short look.

Replying to stephen:

The class as implemented here is only a single ACL match (or non-match - the capability to check for a non-match was trivial, which is why I added it to this class). The ACL you are suggesting in the example is really a combined ACL:

The non-match, I guess, will not be used, for consistency. We probably will have a general not, so having a different syntax for not on IP seems strange.

Yeah, that's another point I wondered about, too (but I didn't go into
that in my previous comment). Basically, I agree that it makes more
sense to have generic "NOT" than implementing the negation concept in
each derived class.

allow (source matches 2001:db8:1::/64) AND (dest matches 2001:db8:2::/64)

I would see then implementation as being a class containing two IPCheck classes, one for source and one for destination. It would be the container class that knows the structure of the packet and distinguishes between source and destination address.

But I don't see a way to say „use this context and extract the source address“ for one version and „use the same context, but extract the destination address“. Maybe it's there, but I didn't notice it.

I'm not sure if I catch the context correctly, but as I noted in my
previous comment message, I would envision the above to be implemented
as follows:

  • (source matches 2001:db8:1::/64) is represented as a single IPCheck object that internally stores the information that it's about source address
  • (dest matches 2001:db8:2::/64) is represented as a single IPCheck object that internally stores the information that it's about destination address
  • the entire "(source ...) AND (dest ...)" is represented as a kind of compound Check object, which consists of the two IPCheck objects, and its match() method returns true iff both of the two IPCheck::matches() true.
  • The "Context" is a some kind of "Packet" class object, which has the ability to extract the source and destination addresses of the packet.

We'd then implement IPCheck<Packet> class, whose matches() method
would look like this:

bool
IPCheck<Packet>::matches(const Packet& packet) {
    if (getAddressType() == IPCheck::SOURCE) {
       return (compare(packet.getSourceAddr()));
   } else {
       return (compare(packet.getDestinationAddr()));
   }
}

comment:12 in reply to: ↑ 10 ; follow-up: Changed 9 years ago by jinmei

Replying to vorner:

And, looking into the file, I didn't find the body of the match function. Do I look bad?

My understanding is that it's expected to be very Context dependent,
and no generic template is provided (it's the Context class author's
responsibility to give the implementation).

comment:13 in reply to: ↑ 9 Changed 9 years ago by jinmei

Replying to stephen:

Third: maybe this is premature at this point, but I guess Ip(vX)Check will need to contain the information of "which address it is", such as the source address or destination address.

[...]

The class as implemented here is only a single ACL match (or non-match - the capability to check for a non-match was trivial, which is why I added it to this class). The ACL you are suggesting in the example is really a combined ACL:

allow (source matches 2001:db8:1::/64) AND (dest matches 2001:db8:2::/64)

I would see then implementation as being a class containing two IPCheck classes, one for source and one for destination. It would be the container class that knows the structure of the packet and distinguishes between source and destination address.

However, consider the possible ACL:

allow (source matches 2001:db8:1::/64) AND (protocol is TCP)

(I don't know whether we would want to allow access based on protocol, but as it is a possibility so I think we should consider it.) In this case a more general class allowing the AND between two ACLs is called for.

Do you mean something like the one I mentioned in my response to Michal?
http://bind10.isc.org/ticket/998?replyto=9#comment:11

If so, I actually meant that type of concept in my original comment
(the above response of yours seems to indicate I was suggesting a
single IPCheck class to support the compound "(source..) AND
(dest...)" rule. If so, no that's not what I meant).

Applying this to the IP check problem, an alternative approach to a single class checking both source and destination addresses could be to derive SourceIPCheck and DestinationIPCheck classes from IPCheck and combine them using the AND operator. (All the derived classes would do is provide a matches() method to pass the source/destination IP address from the packet to the parent IPCheck::matches())

Functionality wise, these two options wouldn't be so different

  • store the source/dest information in a single IPCheck class and introduce IPCheck::getAddressType() method to return that information
  • define SourceIPCheck and DestinationIPCheck classes as you mentioned above

Design wise, I'd generally be careful about adding more
characteristics by creating more and more specialized subclasses
because that approach would easily cause an explosion of specialized
classes. This is what I was afraid about in my comment for #977:
http://bind10.isc.org/ticket/977#comment:5

For example, hypothetically assume we still separated classes for IPv4
and IPv6. Then this approach would easily result in having 4 classes:
SourceIPv4Check, SourceIPv6Check, DestinationIPv4Check,
DestinationIPv6Check. It can quite easily lead to an unmanageable
level.

Of course, the other approach could result in a monster, monolithic
class, which should also be avoided. So it's a matter of balance,
and, in this specific case, I'd keep a single class approach.

extend the compare() method (or something equivalent to that if addressing the first point affects this part of interface) so that it can take different types of addresses

Let's discuss this further. As noted above, I think that this can be implemented by the planned logical operator ACLs.

Hmm, maybe I confused you here...in this context by "different types
of address" I meant IPv4 or IPv6, not source or destination. You seem
to interpret it as meaning source vs destination?

Thanks for the comprehensive review.

You're welcome, but actually it was not "comprehensive" yet:-)

comment:14 in reply to: ↑ 12 ; follow-up: Changed 9 years ago by vorner

Replying to jinmei:

Replying to vorner:

And, looking into the file, I didn't find the body of the match function. Do I look bad?

My understanding is that it's expected to be very Context dependent,
and no generic template is provided (it's the Context class author's
responsibility to give the implementation).

Won't that be confusing, to have methods of single class scattered over multiple files? Anyway, the thing as a whole looks complicated, considering it's only simple IP address match.

comment:15 in reply to: ↑ 14 Changed 9 years ago by jinmei

Replying to vorner:

And, looking into the file, I didn't find the body of the match function. Do I look bad?

My understanding is that it's expected to be very Context dependent,
and no generic template is provided (it's the Context class author's
responsibility to give the implementation).

Won't that be confusing, to have methods of single class scattered over multiple files? Anyway, the thing as a whole looks complicated, considering it's only simple IP address match.

I can't speak for Stephen (this is not my design), but, again, my
understanding of his design is that it's a tradeoff between
flexibility and implementation overhead of specialization.

The problem is that it wouldn't be so obvious how to extract the IP
address to match from the Context. The "Context" may be of some type
representing an IP packet, or it may represent an A/AAAA RR (for some
filtering on the A/AAAA RDATA), or it may mean something else. Also,
when the "Context" means an IP packet, what we may want to test is
either the source or destination address.

One possibility is to introduce some requirement to the Context type,
e.g., that it must have a method named "getAddress()" that returns a
struct sockaddr or something, and to provide templated implementation
of match() based on the requirement. This will save the
specialization cost, but it also reduces the flexibility of the
"Context" type. Personally, I can see both points, and don't have a
strong opinion either way; I'd happily leave the dispute on this point
to you and Stephen:-)

comment:16 Changed 9 years ago by stephen

Michal:
The non-match, I guess, will not be used, for consistency. We probably will have a general not, so having a different syntax for not on IP seems strange.

Fair enough. I added the negation because I wasn't certain on this, and the code for it was trivial. It's easy enough to remove it (or leave it in - if the "inverse" argument is not specified, it does a positive match).

Michal:
But I don't see a way to say „use this context and extract the source address“ for one version and „use the same context, but extract the destination address“.

See discussion below.

Jinmei:
My understanding is that it's expected to be very Context dependent, and no generic template is provided (it's the Context class author's responsibility to give the implementation).

The matches() method has not been written because as yet, the "Context" object holding the address is not specified. I didn't want to guess at some structure then require the code do some conversion from its own representation to that structure when the data could have been directly extracted from the representation.

Michal:
Won't that be confusing, to have methods of single class scattered over multiple files?

When the Context type has been decided, then we write the matches() code (and probably place it in the same file as the rest of the class).

Jinmei
Functionality wise, these two options wouldn't be so different

  • store the source/dest information in a single IPCheck class and introduce IPCheck::getAddressType() method to return that information
  • define SourceIPCheck and DestinationIPCheck classes as you mentioned above

Design wise, I'd generally be careful about adding more characteristics by creating more and more specialized subclasses because that approach would easily cause an explosion of specialized classes. This is what I was afraid about in my comment for #977: http://bind10.isc.org/ticket/977#comment:5

I'm not sure that it's adding more characteristics. My argument for the dual-class approach is that at present the class has a single purpose - match an IP address. It will do that in the absence of any context or knowledge of how the address will be used. (So, as you suggest, it could also be used in filtering A/AAAA RDATA where the use of that data is not defined.)

To extract source/destination IP information really requires some form of adapter to process the information before it gets to the matches() method rather than to have knowledge in the class itself. This transformation is provided by the subclass.

Jinmei:
The problem is that it wouldn't be so obvious how to extract the IP address to match from the Context.

That really summarises where we are at the moment. Once we have "Context", the final elements of the code can be completed.

comment:17 Changed 9 years ago by jinmei

A comprehensive review is ongoing, but here are some intermediate results
(dumping it to possibly maximize work concurrency):

general

  • Maybe a matter of preference, but I'd use the term "prefix" instead of (address+)netmask unless we want to support non contiguous network masks. IMO it's more intuitive. I'd also note that there's even no term of "netmask" in IPv6 terminologies (there are only "prefixes"). (note: of course, we'll have to internally convert a prefix length to something like net masks for match operations. this comment is about public interface and public documentation wording)
  • maybe we want to have a keyword "any" (or perhaps "any4"/"any6")
  • on a related note, I guess we'd probably want to use 0-length prefix to indicate an "any" match.

createNetmask

  • should this be public? the intended use of it seems to be very limited (and we are not supposed to provide a generic bitmask manipulation library, are we?). hmmm, is this perhaps for testing? If so, I see the point, but in that case I'd clarify the intent and that it's not expected to be used outside of this file (e.g., compatibility won't be ensured). I'd also introduce a specific namespace like "detail" to further clarify the intent.
  • I suspect this should be "w-m < w" (at least the current expression doesn't make sense to me):
            // Final note: at this point in the logic, m is non-zero, so w-m < m.
    
  • For exception, I'd use \exception markup (that would also be helpful for my work-in-progress automatic C++ to pydoc converter script). (although this point may become moot if it's changed to non public). Same general comment applies to anywhere else in this branch, so I won't repeat.

comment:18 Changed 9 years ago by jinmei

(continued)

I've completed my review on ip_check.h. I've not fully reviewed the
tests yet, but I guess it's better to give the ticket back to Stephen
at this point.

splitIPAddress

  • This implementation accepts (e.g.) "/1" or "1/", which should be properly rejected with an exception.
  • like createNetmask, should this be public? Moreover, since this isn't templated it will result in duplicate symbols.
  • I'd revise the documentation using the term "prefix" (see the general comment), e.g.:
    /// Splits an IP prefix (given in the form of "xxxxxx/n" or "xxxxx") into a
    /// string representing its address part and the prefix length in bits.
    /// An exception...
    
    "(In the latter case...)" was removed because it's obvious from the definition of the prefix length.
  • Likewise, I'd rename the parameter "addrmask" to "ipprefix"
  • Isn't it better to use the max prefix length instead of the magic number of -1? Then we can eliminate the special case consideration for negative 'requested' in setNetmask (which I'd call setPrefixLength), and we could even make the parameter unsigned int.

IPCheck class

  • The union usage for AddressData? seems dangerous in that it breaks the basic rule of union: don't read a member that's different from a member that was last written. Especially in a bit more advanced languages like C++ we should generally be able to find a better way than doing this type of game using union. Also, the use of uint32_t version basically seems to be for performance optimization. If that's the main purpose, it seems to me premature. We should consider optimization after we are sure this is a heavy bottleneck, and after we know a specific optimization really improves the performance significantly, especially because it uses a dangerous hack like tweaking union. Actually, I'd expect this will eventually be a major optimization point considering some operators use very large ACLs, but at that point we'll probably need more aggressive reconstruction such as introducing a radix/trie, which would make this type of micro optimization moot.
  • at the risk of sounding self contradicting, I'd be a bit concerned about the redundant member variables like masksize_ (which I'd call prefixlen_) and straddr_. These can be generated on demand from other members, and, as commented, these wouldn't be needed in performance sensitive path. On the other hand, we may have a very large number of ACL entries, so the contribution of the additional redundant member might be significant in terms of memory usage.
  • Likewise, I wonder whether we might want to allocate spaces for address and mask dynamically, and only for the necessary amount depending on the address family.
  • as we already discussed, I'd suggest removing "inverse".
  • Do we really need the default constructor? It allows an IPCheck oject to have a mostly invalid state, opening up a possibility of bugs.
  • constructor from string: inet_pton() could fail with -1, too.
  • constructor from IPvX numeric addresses: I'd pass in_addr&/in6_addr& instead of the lower level primitives.
  • copy constructor: couldn't we simply substitute address_ and netmask_? i.e., address_ = other.address_, etc. Same for the assignment operator.
  • getNetmask(): I'd instead define a method returning a prefix length (int) because conceptually this class works with prefix lengths; netmasks are just internal representation (probably for faster matching)
  • getFamily: if we don't need to have the default constructor, maybe we don't have to worry about the asserts (IMO this is another reason why it would be better to not have it if we don't have a meaningful way to use it).
  • match(const uint8_t*): I'd like this method to support multiple families of addresses (and simply return false if address families don't match) either by having an additional "family" parameter or by taking an IPAddress structure as I defined in trac999preview branch (which is now in the central repo for reference). Otherwise, each specialization of matches() needs to deal with the address family mismatches (as I did in trac999review).
  • match(uint32_t): I'd remove this at least for now. If it's for optimization, it seems to me to be premature. Also, it's not really well defined when constructed for an IPv6 prefix.
  • setNetmask: to be very sure, I'd check it doesn't overrun with netmask (or we might want to use something like boost::array and use at() instead []).

comment:19 Changed 9 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:20 Changed 9 years ago by jinmei

And I have some proposed editorial corrections and fixes in comments.
See the attached patch. Most of them are the proposed change of
"IPV[46]" to "IPv[46]". Others should be trivial and non controversial fixes.

Changed 9 years ago by jinmei

comment:21 follow-up: Changed 9 years ago by stephen

  • Owner changed from stephen to jinmei

Maybe a matter of preference, but I'd use the term "prefix" instead of (address+)netmask unless we want to support non contiguous network masks.

Done.

maybe we want to have a keyword "any" (or perhaps "any4"/"any6")

Good point. I missed the "match any address from this family". The string constructor now accepts "any4" and "any6".

on a related note, I guess we'd probably want to use 0-length prefix to indicate an "any" match.

Done.

createNetmask: should this be public? ...

Renamed to "createMask" (to get rid of the "netmask" connotation) and placed in the "internal" namespace (which seemed more to convey the idea that this is an internal function than "detail" would).

I suspect this should be "w-m < w" (at least the current expression doesn't make sense to me):

It was a typo - fixed.

For exception, I'd use \exception markup (that would also be helpful for my work-in-progress...

Done.

splitIPAddress: This implementation accepts (e.g.) "/1" or "1/", which should be properly rejected with an exception.

Sigh. That's what you get when you try to use something that does not *exactly* do the thing you're after. The method now does its own parsing and this should be fixed.

like createNetmask, should this be public? Moreover, since this isn't templated it will result in duplicate symbols.

Well-noticed. Moved to ip_check.cc

I'd revise the documentation using the term "prefix" (see the general comment), e.g.:

Done.

Likewise, I'd rename the parameter "addrmask" to "ipprefix"

Done.

Isn't it better to use the max prefix length instead of the magic number of -1?

No. At the point where that is returned, all that is known is that the string passed (presumably) comprises an address and no prefix length. At that point, the maximum valid prefix length is not yet known. As a value of 0 is now being interpreted as a request to match any address of this family, -1 is the easiest way of passing back information about the absence of a prefix length. (Note that explicitly specifying a negative number as the prefix length will cause an exception to be thrown, so the only way -1 can be returned is if no prefix length is given.)

setNetmask (which I'd call setPrefixLength...

I've renamed it to setMask - as it's creating a mask used in a logical operation, this seems appropriate.

The union usage for AddressData seems dangerous in that it breaks the basic rule of union: don't read a member that's different from a member that was last written.

I've not come across that before - I guess that this allows compiler some optimisation and we may fall foul of that and you
would need to use a "volatile" keyword to avoid it. OK, rewritten using a vector<uint8_t>.

at the risk of sounding self contradicting, I'd be a bit concerned about the redundant member variables like masksize_ (which I'd call prefixlen_) and straddr_.

Removed. A toText() method can be added if we need it, and getPrefixlen() now calculates the prefix length on the fly.

Likewise, I wonder whether we might want to allocate spaces for address and mask dynamically, and only for the necessary amount depending on the address family.

vector<uint8_t> is now used.

as we already discussed, I'd suggest removing "inverse".

Done.

Do we really need the default constructor? It allows an IPCheck object to have a mostly invalid state, opening up a possibility of bugs.

Depends if it is needed. I added the default constructor to facilitate its use in vectors and the like (e.g. initializing a vector of IPCheck objects to a known size), but it's trivial to remove it. (Note that compare() for an acl constructed with the default constructor will always return false for both IPV4 and IPV6 addresses).

constructor from string: inet_pton() could fail with -1, too.

This is only with an invalid address family, but the code now checks for the explicit success status.

constructor from IPvX numeric addresses: I'd pass in_addr&/in6_addr& instead of the lower level primitives.

Unless we really need them, let's leave it for now. Its just too much effort to modify the tests.

copy constructor: couldn't we simply substitute address_ and netmask_? i.e., address_ = other.address_, etc. Same for the assignment operator.

Yes you can. I'm confusing it with assignment, where array copies are not allowed. However, I've removed the copy constructor and assignment operator as they're now not needed.

getNetmask(): I'd instead define a method returning a prefix length (int) because conceptually this class works with prefix lengths; netmasks are just internal representation (probably for faster matching)

getPrefixlen() has been provided. getNetmask() (now renamed getMask()) is retained as it allows inspection of the internal mask, which is helpful in testing.

getFamily: if we don't need to have the default constructor, maybe we don't have to worry about the asserts

If we remove the default constructor, we'll remove the asserts. They're static asserts, so have no run-time impact.

match(const uint8_t*): I'd like this method to support multiple families of addresses (and simply return false if address families don't match) either by having an additional "family" parameter or by taking an IPAddress structure as I defined in trac999preview branch (which is now in the central repo for reference). Otherwise, each specialization of matches() needs to deal with the address family mismatches (as I did in trac999review).

Added a "family" argument.

match(uint32_t): I'd remove this at least for now.

Removed.

setNetmask: to be very sure, I'd check it doesn't overrun with netmask

Modified to ensure this.

One final thing. At present the address and comparison bitmask are both equal to the size of addresses of the relevant family. A possibly premature optimization would be to truncate them to the size dictated by the prefix length (e.g. is the prefix length were specified as /8, just store one byte). However, this will take a little time to do and you did suggest that you are waiting on these changes.

Changed 9 years ago by jinmei

comment:22 in reply to: ↑ 21 Changed 9 years ago by jinmei

Replying to stephen:

First, responding to your response. (I have no other comments on the
points I'm not explicitly referring to below, although there can be
further comments on the resulting code).

Do we really need the default constructor? It allows an IPCheck object to have a mostly invalid state, opening up a possibility of bugs.

Depends if it is needed. I added the default constructor to facilitate its use in vectors and the like (e.g. initializing a vector of IPCheck objects to a known size), but it's trivial to remove it. (Note that compare() for an acl constructed with the default constructor will always return false for both IPV4 and IPV6 addresses).

Apparently the intended usage is to use shared pointers of Check objects
(see the ACL class merged in the master), so it's less likely to store
bare Check objects in STL containers in practice. In general, I'd
prefer minimalist approach: don't introduce a feature due to a guessed
(but yet non existent) usage. But if you still prefer to keeping it,
I won't oppose to it any more in this review cycle.

constructor from string: inet_pton() could fail with -1, too.

This is only with an invalid address family

I don't think it's guaranteed so (even though it may be the case for
many actual implementations), but...

but the code now checks for the explicit success status.

...then that's okay (but see comments about the actual revised
implementation below).

One final thing. At present the address and comparison bitmask are both equal to the size of addresses of the relevant family. A possibly premature optimization would be to truncate them to the size dictated by the prefix length (e.g. is the prefix length were specified as /8, just store one byte). However, this will take a little time to do and you did suggest that you are waiting on these changes.

Yeah I've noticed this optimization, too. But whether or not other
ticket is waiting on the completion of this one, I'd rather defer this
type of optimization to a separate ticket.

Next, comments on the revised code. This time I've also reviewed the
unit tests.

general

You seem to have intended to apply my editorial suggestion on IPv[46]
vs IPV[46], but the actual result seems to be different from what I
actually suggested: my suggestion is to use "IPv4/6" (lower v) in
comments. You applied it in the reverse direction. Is that
intentional?

I've also made a few minor changes (mostly editorial, plus constify
something where it's trivial) and pushed them directly.

acl/Makefile.am

Why did you disable -Werror for libacl?

splitIPAddress (revised)

  • prefix_size: I'd personally defer defining a mutable variable until we really start using it (in order to effectively limit the lifetime of the variable), and, in this sense, I'd move the definition of it immediately before this part:
        // If there is a prefixlength, attempt to convert it.
        if (!prefixlen.empty()) {
        ...
    
  • so, are we (now?) allowing beginning and ending spaces (including '\n')?
  • mod_prefix could (should) be const.
  • prefix could simply be initialized using the default constructor (which would be a bit more efficient than constructing string("") and coping it)
  • slashpos could (should) be const

ip_check.h

general

  • lexical_cast.hpp and scoped_ptr.hpp should be removed

IPCheck

  • I know I may sound inconsistent regarding when to worry about performance and when to not, but I'm afraid using vectors are suboptimal in terms of both comparison performance (because its operator[] is slower than direct memory access) and memory footprint (due to the overhead of the class structure). But this only affects the internal data structure and we can change this later, so I'd propose keeping the current implementation for now, leaving this discussion open and creating a separate ticket for it.
  • "any4" and "any6": I'd rather use all 0 address than keeping them empty because having variable length data is an easy source of bugs. (even though it would be less efficient in terms of memory footprint)
  • constructors: address_() and mask_() could be omitted, but this is probably a matter of personal taste.
  • From IPv4 constructor: I don't think it a good idea to use the host byte order. And, on further thought, I'd even wonder whether we need these "from address" constructors in the first place for now. In the sense of minimalist approach, all we need for the immediate purpose is the constructor from the string. While I'm hesitating to suggest this, maybe it'd be smoother if we backed out these constructors from this branch for now and defer them to a separate ticket.
  • From IPv4 constructor: as far as I can see we don't need static_cast here.
  • From string constructor: this pattern doesn't seem to be the most common idiom in this case:
                    std::copy(address_bytes, address_bytes + IPV6_SIZE,
                              std::back_inserter(address_));
    
    I'd rewrite this to:
                    address_.assign(address_bytes, address_bytes + IPV6_SIZE);
    
    Same comment applies to the IPv4 case.
  • From string constructor: technically this is not 100% correct:
                        isc_throw(isc::InvalidParameter, "address prefix of " <<
                                  ipprefix << " is a not valid");
    
    (oh btw, the what message looks like a typo in "is a not valid") because if inet_pton fails with -1 it means a system error, not necessarily due to the input was invalid. Likewise, this is also not entire correct:
                } else {
                    // Not IPV6, try IPV4
    
    when status == -1. To be 100% pedantic we should do something like the attached diff.
  • getPrefixlen(): even though it's legal, I'd avoid shadowing the same name of variable 'i'.
  • getPrefixlen(): this for loop can make meaningless iterations once the mask byte is 0.
  • getPrefixlen(): the condition seems to be wrong. It should be "i < 8":
                    for (int i = 0; i < 8 * sizeof(uint8_t); ++i) {
    
  • compare(): (should have noticed this before) shouldn't this be non virtual?
  • compare(): personally I'd list 'family' first as it seems to be more significant, but that's purely a matter of taste, so this is just a comment.

ip_check_unittest

  • As far as I understand it, this is not correct:
        // Check on all possible 8-bit values.  Use a signed type to generate a
        // variable with the most significant bits set (right-shifting will
        // introduce additional bits).
    
    The behavior of >> for signed integer is implementation dependent (at least it is so for plain C). To be very safe, I'd use a uint16_t initialized with 0xff00 with >>. This way we could also integrate the case of i = 0 in the for loop.
  • Same for the 32-bit version, but I now guess we need this in the first place, both for the main code and test (and whether createMask needs to be templated) because we don't use the 32-bit version.
  • for comparison tests, I'd also use prefix lengths that are not for a 4 or 8-bit boundary, e.g. /19, /45, /127, etc.
  • MixedMode? test: I don't see a point in doing EXPECT_NO_THROW before EXPECT_FALSE for the same expression. I'd either remove the former or change it to ASSERT_NO_THROW.
  • MixedMode? test: for a more meaningful test, I'd use data that would confuse the compare method without the explicit family check, e.g. an IPv4 address to be tested and IPv6 prefix in the check whose first 32 bits are identical to the IPv4 address.

Others
These are basically random thoughts on future possible extensions. I
wouldn't request them be incorporated to the current branch (actually
I'd rather defer it).

  • We might want to "normalize" the address, that is, apply the mask to the address. For example, if we're given "192.0.2.1/24", internally convert it to "192.0.2.0/24".
  • maybe we want to implement specialized toText() for IPCheck..
  • for IPv6, we might want to support link-local addresses with scope consideration, i.e, differentiate fe80::1%ether0 and fe80::1%ether1. although not very advisable, there are actually people trying to use DNS over IPv6 link-local addresses.

comment:23 Changed 9 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:24 follow-up: Changed 9 years ago by stephen

  • Owner changed from stephen to jinmei

ip_check

(about default constructor)

Apparently the intended usage is to use shared pointers of Check objects (see the ACL class merged in the master), so it's less likely to store bare Check objects in STL containers in practice. In general, I'd prefer minimalist approach: don't introduce a feature due to a guessed (but yet non existent) usage.

Unfortunately, at the time of writing, very little detail was specified about how the ACL objects would be used, hence the guesses. I've removed the default constructor.

You seem to have intended to apply my editorial suggestion on IPv[46] vs IPV[46], but the actual result seems to be different from what I actually suggested: my suggestion is to use "IPv4/6" (lower v) in comments. You applied it in the reverse direction. Is that intentional?

Misread it - changed.

acl/Makefile.am
Why did you disable -Werror for libacl?

Because I missed it when I copied and edited another Makefile.am after I introduced the .cc file. Corrected.

prefix_size: I'd personally defer defining a mutable variable until we really start using it..

Done.

so, are we (now?) allowing beginning and ending spaces (including '\n')?

Why not? Semantically, " 192.0.2.0 " means the same as "192.0.2.0". (Also, it follows Postel's dictum of being liberal in what you accept.)

mod_prefix could (should) be const.

Done.

prefix could simply be initialized using the default constructor (which would be a bit more efficient than constructing string("") and coping it)

True, but this is a minuscule overhead that I think is outweighed by the documentation effect provided by having two statements together where the initial values are clearly shown.

slashpos could (should) be const

Done.

lexical_cast.hpp and scoped_ptr.hpp should be removed

Done, with lexical_cast.hpp being moved to ip_check.cc

From IPv4 constructor: I don't think it a good idea to use the host byte order. And, on further thought, I'd even wonder whether we need these "from address" constructors in the first place for now...

As to the constructors, I don't mind. As I said, there was a certain amount of second-guessing when trying to create this class and my guess that non-string constructors were needed was obviously wrong. I've got rid of these so a number of your other comments now no longer apply.

I know I may sound inconsistent regarding when to worry about performance and when to not, but I'm afraid using vectors are suboptimal ...

Inconsistent? Never! :-) However, now that we only have a string constructor, we can fix array sizes. Both address and mask have been converted to a fixed-size uint8_t array.

constructors: address_() and mask_() could be omitted, but this is probably a matter of personal taste.

Removed.

The following no longer apply:

From IPv4 constructor: as far as I can see we don't need static_cast here.
From string constructor: this pattern doesn't seem to be the most common idiom in this case:

Continuing...

From string constructor: technically this is not 100% correct:
:

when status == -1. To be 100% pedantic we should do something like the attached diff.

Message has been changed and a different exception thrown for a negative status return.

getPrefixlen(): even though it's legal, I'd avoid shadowing the same name of variable 'i'.

Mistake - changed to j.

getPrefixlen(): this for loop can make meaningless iterations once the mask byte is 0.

I assume you mean needless iterations. Fixed.

getPrefixlen(): the condition seems to be wrong. It should be "i < 8":

Only because we know that uint8_t is one byte wide. However, the use of the multiplier was inconsistent so I have removed it.

compare(): (should have noticed this before) shouldn't this be non virtual?

It should. However, if we derive classes from IPPcheck, then we may want to access it and possibly override it, so I have made it public.

ip_check_unittest

The behavior of >> for signed integer is implementation dependent (at least it is so for plain C). To be very safe, I'd use a uint16_t initialized with 0xff00 with >>. This way we could also integrate the case of i = 0 in the for loop.
Same for the 32-bit version, but I now guess we need this in the first place, both for the main code and test (and whether createMask needs to be templated) because we don't use the 32-bit version.

I've removed the templated version - it's no longer needed.

for comparison tests, I'd also use prefix lengths that are not for a 4 or 8-bit boundary, e.g. /19, /45, /127, etc.

Added some that are not on nibble boundaries.

MixedMode test: I don't see a point in doing EXPECT_NO_THROW before EXPECT_FALSE for the same expression.

If there was an exception generated I wanted to clearly see where the problem was. However, removed the EXPECT_NO_THROW.

MixedMode test: for a more meaningful test, I'd use data that would confuse the compare method without the explicit family check, e.g. an IPv4 address to be tested and IPv6 prefix in the check whose first 32 bits are identical to the IPv4 address.

Done.

comment:25 in reply to: ↑ 24 Changed 9 years ago by jinmei

Replying to stephen:

It now mostly looks okay (and I'm happy that it's now much simplified
and more understandable).

I made some fixes and directly pushed them. Please check. I bevelive
they are trivial and should be non controversial.

From string constructor: technically this is not 100% correct:
:

when status == -1. To be 100% pedantic we should do something like the attached diff.

Message has been changed and a different exception thrown for a negative status return.

It's mostly okay, but there's still some subtle case. Consider the
case where first inet_pton() fails with -1 but the input was actually
valid. Then the second inet_pton() will also fail, this time with 0,
so we'll see a "not valid" exception what() message, which is
misleading. That's why I checked status == 0 in my suggested patch.
On the other hand, your version can rescue the case where the first
inet_pton() fails with -1, the input is a valid IPv4 address, and the
second inet_pton() succeeds.

We could make this perfect addressing both two cases, but all of these
are quite unlikely scenario anyway, so maybe it's not worth the
additional code complexity. I'd leave it to you.

I assume you mean needless iterations. Fixed.

There's still a possible needless iteration. I'd suggest revise it
to:

        for (size_t i = 0; i < IPV6_SIZE; ++i) {
            if (mask_[i] == 0xff) {
                // All bits set in this byte
                count += 8;
                continue;

            } else if (mask_[i] != 0) {
                // Only some bits set in this byte.  Count them.
                uint8_t byte = mask_[i];
                for (int j = 0; j < 8; ++j) {
                    count += byte & 0x01;   // Add one if the bit is set
                    byte >>= 1;             // Go for next bit
                }
            }
            // Unless the all bits set, there are no more remaining bits
	    // that is set, so exit.
            break;
        }

compare(): (should have noticed this before) shouldn't this be non virtual?

It should. However, if we derive classes from IPPcheck, then we may want to access it and possibly override it, so I have made it public.

Hmm, this seems to me to try catching an unseen need, but I don't
oppose it. But if you want to leave that possibility,

  • please document the intent
  • I'd suggest making it protected, not public.

When these minor things are addressed, I think it can be merged (I
don't think more explicit review is needed - unless you want to
discuss something about them).

Finally, please create tickets about open points.

comment:26 Changed 9 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:27 Changed 9 years ago by stephen

  • Resolution set to complete
  • Status changed from reviewing to closed

Minor changes made and branch merged in commit 877e89713ad2398b6637b843a22c3b12607fe5bb

Regarding creating tickets, I'm not clear what open points you are talking about.

Note: See TracTickets for help on using tickets.