Opened 8 years ago

Closed 8 years ago

#1883 closed task (complete)

define tp_hash for some basic isc.dns classes

Reported by: jinmei Owned by: jinmei
Priority: medium Milestone: Sprint-20120717
Component: libdns++ Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 2.25 Internal?: no

Description

In our Python wrappers tp_hash are generally (if not always) NULL,
so we cannot use our C++ based Python classes with python hash
naturally. It's inconvenient, and could also cause performance
issues.

I propose defining tp_hash for some basic classes like isc.dns.Name,
isc.dns.RRType, and isc.dns.RRClass. For the latter two the hash
should be trivial. For name, we can use a hash function defined in
LabelSequence.

Subtickets

Change History (14)

comment:1 Changed 8 years ago by jinmei

We've already supported this for the Name and RRClass types.
I just did it for RRType, too, which was needed to write an
experimental script for the resolver research work. I've pushed that
change in the trac1883 branch.

The change is very small and should be easy to review. So I suggest
getting it reviewed and merging quickly. I'm pushing it to the
proposed next sprint queue.

Note: while this ticket was originally given an estimation of "5",
it should now be much more lightweight.

comment:2 Changed 8 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed
  • Priority changed from medium to high
  • Status changed from new to reviewing

comment:3 Changed 8 years ago by shane

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120717

comment:4 Changed 8 years ago by jelte

  • Owner changed from UnAssigned to jelte

comment:5 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

I had a comment about Py_hash_t but I see we already have #2026 for that (do we care about the implicit cast there, btw?) :)

I have one tiny suggestion for the unit tests; add

        self.assertEqual(hash(RRType.AAAA()), hash(RRType("Type28")))

as well (or a similar one).

Other than that, this looks OK and can be merged.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 8 years ago by jinmei

Replying to jelte:

I had a comment about Py_hash_t but I see we already have #2026 for that (do we care about the implicit cast there, btw?) :)

Hmm, not sure what exactly you meant by the implicit cast, but on
looking into it more closely, I noticed we actually need to do care
about it. Py_hash_t is a signed integer, and tp_hash() is not
expected to return -1. So we need to ensure two things:

  • the conversion to Py_hash_t is safe (not result in an undefined behavior)
  • the result is never -1

In the case of the extension to RRType, it's unlikely to be a real
issue in practice because the original hash value is of uint16_t, and
it size is quite likely to be smaller than sizeof(Py_hash_t). But it
could be more realistic concern for Name because the original hash
is of size_t, and (in my environment) it has the same size as that of
Py_hash_t.

So I introduced a wrapper converter and used it for all existing
cases.

I have one tiny suggestion for the unit tests; add

        self.assertEqual(hash(RRType.AAAA()), hash(RRType("Type28")))

as well (or a similar one).

Okay, added.

comment:7 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:8 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Replying to jelte:

So I introduced a wrapper converter and used it for all existing
cases.

My compiler (gcc 4.6) is being a PITA and fails on that code;

(note, Shane reports success with gcc 4.7, and muks failure, also with 4.6, so there's a good chance this is a version-specific compiler bug; however, it is a version that is present on a number of systems)

pydnspp_common.h: In function 'Py_hash_t isc::dns::python::convertToPyHash(HashvalType) [with HashvalType = long unsigned int, Py_hash_t = long int]':
name_python.cc:527:60:   instantiated from here
pydnspp_common.h:90:59: error: left shift count >= width of type [-Werror]
cc1plus: all warnings being treated as errors

So it is able to figure out at compile time that a 64-bit left-shift (size_t) is no good, but it is not able to figure out that said statement wouldn't be reached for that type...

So far the only workaround I've come up with is to replace it by a variable;

        size_t mask_shift = (CHAR_BIT * sizeof(HashvalType));
        const Py_hash_t mask = ~(static_cast<Py_hash_t>(-1) << mask_shift);

(note that it's not const; otherwise it would once again calculate it and trip over the value)

The inverse case (else branch with uint16_t type) is also a problem;

In file included from rrclass_python.cc:23:0:
pydnspp_common.h: In function 'Py_hash_t isc::dns::python::convertToPyHash(HashvalType) [with HashvalType = short unsigned int, Py_hash_t = long int]':
rrclass_python.cc:271:62:   instantiated from here
pydnspp_common.h:96:65: error: large integer implicitly truncated to unsigned type [-Werror=overflow]

another cast of the ~ output seems to shut it up here.

But both these 'solutions' are quite ugly...

comment:9 in reply to: ↑ 8 Changed 8 years ago by jinmei

Replying to jelte:

My compiler (gcc 4.6) is being a PITA and fails on that code;

Ah, I was slightly afraid about that kind of failure. I'm not sure if
it's a compiler "bug", but in any case we need some workaround.

So far the only workaround I've come up with is to replace it by a variable;

[...]

But both these 'solutions' are quite ugly...

I've been trying to find a cleaner solution, but couldn't come up with
anything obviously better either.

One possible compromise is to use the wrapper only for the case of
Name (and simplifies the converter function focusing on that
specific case), as the implicit conversion is actually safe for the
other cases. But I guess the not-so-clear constraint of excluding -1
can really bite us if and when we support tp_hash for some other
classes, so I relatively prefer providing a generic wrapper from the
beginning even if it's less clean.

I've committed a slightly different version of your suggested change.
I don't seem to have easy access to the version of g++ that triggers
this warning, so I'm not sure if it really compiles thought. I'm also
not sure if hash_val_bits cannot be const. If you also agree on
providing a generic wrapper, could you confirm these? If you'd rather
prefer a minimalist approach and only convert the Name hash for now,
that's okay for me too.

comment:10 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:11 Changed 8 years ago by jinmei

  • Priority changed from high to medium

comment:12 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Yes, this compiles for me, and seems as good as we can get it. Perhaps in some future we can remove the workaround once (if ever) this version of g++ is moved away. But for now I think we should keep it like this.

Please go ahead and merge.

comment:13 in reply to: ↑ 12 Changed 8 years ago by jinmei

Replying to jelte:

Yes, this compiles for me, and seems as good as we can get it. Perhaps in some future we can remove the workaround once (if ever) this version of g++ is moved away. But for now I think we should keep it like this.

Please go ahead and merge.

Okay, thanks. Merge done, closing.

comment:14 Changed 8 years ago by jinmei

  • Resolution set to complete
  • Status changed from reviewing to closed
  • Total Hours changed from 0 to 2.25
Note: See TracTickets for help on using tickets.