Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#2275 closed defect (fixed)

work on valgrind backlog

Reported by: jreed Owned by: jelte
Priority: medium Milestone: Sprint-20121009
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: Core Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 2.5 Internal?: no

Description

As reminder I am creating this ticket proposed for the next sprint and setting to five points for now as discussed.
See http://bind10.isc.org/wiki/WeeklyMinutes20120911#GettingridofValgrindbacklogexistingreports

While here also consider evaluating old valgrind tickets: #644 and #645.

Subtickets

Change History (14)

comment:1 Changed 7 years ago by muks

  • Milestone changed from Next-Sprint-Proposed to Sprint-20121009

comment:2 Changed 7 years ago by jelte

  • Owner set to jelte
  • Status changed from new to assigned

comment:3 Changed 7 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from assigned to reviewing

Ready for review.

My system's valgrind does not detect all cases newer versions do, but still, these changes reduced the suppressions file I get from 8867 lines (608 suppressions) to 224 lines (8 suppressions)

Most of the changes are pretty straightforward I think.

A couple of the memsets could technically be a little more specific (for instance take into account which data we know we will write, and only blank the rest), but it seemed more prudent to initialize the entire thing first.

For most directories in src/lib the unittests *should* now pass with valgrind without any suppressions. A few notable exceptions are

  • any sets that contain death tests (valgrind goes bonkers on those, and probably rightfully so)
  • lib/resolver tests (need some serious rework of either the resolver or the test setup due to classes that are supposed to delete themselves)

comment:4 Changed 7 years ago by jinmei

  • Owner changed from UnAssigned to jinmei

comment:5 follow-up: Changed 7 years ago by jinmei

acl/dns.cc

Is the change to ensure that the loader will be destroyed
at the end of program's lifetime? If so, is this safe? We cannot
control the order of destruction of these objects, so I'm afraid this
can cause unexpected disruption due to possible reference between
these objects. For example, what if there's another static object,
which internally refers to this static loader, and its destructor
assumes the loader is valid?

...hmm, but on further thought it's not specific to this case...this
can be a potential issue wherever we use this type of static proxy
objects. I'd be still concerned about it with such a complicated
class than a mostly straightforward value-class like RRType, but
that'd be a separate issue anyway.

So I'm okay with the change with a couple of suggestions:

  • I'm afraid the intent of the use of smart pointer is not so obvious, so I'd comment why we do it.
  • I'd use boost::scope_ptr instead of auto_ptr. In this case the former should be sufficient.

sqlite3_accessor_unittest.cc

Not really related to the branch, but this comment doesn't seem to be
correct:

    // should work now that we closed it
    SQLite3Accessor accessor3(SQLITE_NEW_DBFILE, "IN");

Looks like a copy-paste error.

labelsequence_unittest.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

recursive_query_xxx

Not directly related, I'd make the 'const char*' variables 'const
char* const'. I'd also define things in an unnamed namescope.

socket_requestor_test.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

Last edited 7 years ago by jinmei (previous) (diff)

comment:6 Changed 7 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:7 in reply to: ↑ 5 ; follow-up: Changed 7 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

acl/dns.cc

...hmm, but on further thought it's not specific to this case...this
can be a potential issue wherever we use this type of static proxy
objects. I'd be still concerned about it with such a complicated
class than a mostly straightforward value-class like RRType, but
that'd be a separate issue anyway.

Yeah it is solely for end-of-program destruction. Originally I tried another approach; make a (locally namespaced) cleanup method, and use atexit(), but that seemed overkill for this purpose, and less clear than a RAII-style smart pointer encapsulation.

With careful use of atexit(), I do believe we can solve the issue, but it would be fragile; If we really do care about destruction order, we may need to create some CleanupRegister? class, that knows internal class dependencies, and register singletons there, so that it can clean up those objects in the right order (or possible even a number of them, depending on library layout).

So I'm okay with the change with a couple of suggestions:

  • I'm afraid the intent of the use of smart pointer is not so obvious, so I'd comment why we do it.

ok

  • I'd use boost::scope_ptr instead of auto_ptr. In this case the former should be sufficient.

TBH, I think either are fine, but no strong opinion so I've changed it

(btw, i just noticed the includes were not in the right order, so changed that too)

sqlite3_accessor_unittest.cc

Not really related to the branch, but this comment doesn't seem to be
correct:

    // should work now that we closed it
    SQLite3Accessor accessor3(SQLITE_NEW_DBFILE, "IN");

Looks like a copy-paste error.

and the original was not exactly right either :p Updated.

labelsequence_unittest.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

uninitialized memory; the constructor checks both offsets and label length in one loop; so even though it failed on the offset being wrong, the label data memory was uninitialized.

I've reverted it to just add the one byte to the fixed array.

Come to think of it, this test only checks one of the error conditions (wrong offset, not max_labellen). Added a second case there to hit the other one.

recursive_query_xxx

Not directly related, I'd make the 'const char*' variables 'const
char* const'. I'd also define things in an unnamed namescope.

ok done

socket_requestor_test.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

another uninitialized memory error; I think it is not a difference in valgrind, but that my system accesses more (possibly all) of the path data bytes when binding a domain socket

comment:8 in reply to: ↑ 7 ; follow-up: Changed 7 years ago by jinmei

Replying to jelte:

  • I'd use boost::scope_ptr instead of auto_ptr. In this case the former should be sufficient.

TBH, I think either are fine, but no strong opinion so I've changed it

In this particular case (where the scope is very small and it's
obvious that auth_ptr is used like scoped_ptr), it may not matter much
in practice. But, in general, I prefer using the least powerful tool
for a specific goal because it would be generally less error-prone and
often more efficient. Since in this case we don't have to transfer the
ownership of the pointer, we don't need that part of the auto_ptr
features, and in that sense auto_ptr has unnecessary feature for us.

Anyway, I saw it changed, so of course I'm fine with that.

labelsequence_unittest.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

uninitialized memory; the constructor checks both offsets and label length in one loop; so even though it failed on the offset being wrong, the label data memory was uninitialized.

Ah, right. The original code was really buggy and valgrind correctly
identified it.

Come to think of it, this test only checks one of the error conditions (wrong offset, not max_labellen). Added a second case there to hit the other one.

This is fine.

socket_requestor_test.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

another uninitialized memory error; I think it is not a difference in valgrind, but that my system accesses more (possibly all) of the path data bytes when binding a domain socket

Hmm, on a closer look, I suspect the original code was really buggy,
and the *right* fix is this:

        strncpy(socket_address.sun_path, path_,
                sizeof(socket_address.sun_path));

(without the explicit memset)

The previous version used the length (strlen) of path_, so sun_path
could actually be non-nul terminated even if path_ is short enough.
(another reason why we should generally avoid such low-level APIs).

I believe with this change valgrind stops complaining about this.

comment:9 Changed 7 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:10 in reply to: ↑ 8 ; follow-up: Changed 7 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

labelsequence_unittest.cc

My valgrind (3.8.1) doesn't complain about this. What's wrong with
the original code?

uninitialized memory; the constructor checks both offsets and label length in one loop; so even though it failed on the offset being wrong, the label data memory was uninitialized.

Ah, right. The original code was really buggy and valgrind correctly
identified it.

well, it would always hit the check and exception, but due to the initialized memory, it was essentially random which of the two parts of the if-check it would hit (so yeah, it was buggy, and certainly not doing what it intended, but it would never visibly fail)

Hmm, on a closer look, I suspect the original code was really buggy,
and the *right* fix is this:

        strncpy(socket_address.sun_path, path_,
                sizeof(socket_address.sun_path));

(without the explicit memset)

The previous version used the length (strlen) of path_, so sun_path
could actually be non-nul terminated even if path_ is short enough.
(another reason why we should generally avoid such low-level APIs).

I believe with this change valgrind stops complaining about this.

ah, doh, of course.

Changed.

comment:11 in reply to: ↑ 10 Changed 7 years ago by jinmei

Replying to jelte:

well, it would always hit the check and exception, but due to the initialized memory, it was essentially random which of the two parts of the if-check it would hit (so yeah, it was buggy, and certainly not doing what it intended, but it would never visibly fail)

Right.

The previous version used the length (strlen) of path_, so sun_path
could actually be non-nul terminated even if path_ is short enough.
(another reason why we should generally avoid such low-level APIs).

I believe with this change valgrind stops complaining about this.

ah, doh, of course.

Looks good, please merge.

comment:12 Changed 7 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:13 Changed 7 years ago by jelte

  • Resolution set to fixed
  • Status changed from reviewing to closed

thanks, merged, closing ticket

comment:14 Changed 7 years ago by jinmei

  • Total Hours changed from 0 to 2.5
Note: See TracTickets for help on using tickets.