Opened 8 years ago

Closed 7 years ago

#1668 closed task (duplicate)

redo threaded stats tests with mocksessions

Reported by: jelte Owned by:
Priority: medium Milestone:
Component: statistics Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: discuss Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

The current tests in stats/ contain a number of strangely interacting threads, and it can be very tricky to reliably add or modify tests. (for instance, for #640, we just spent an hour hunting down a race which was caused by a zero-timeout join(), but making that join() blocking causes almost all of the other tests to deadlock).

It is highly probable that most, if not all, of these tests don't even need threads, and that mock sessions and modules would do just fine.

So we need to refactor these tests and, if possible, remove the threads. If not possible, we should at least make the threading code such that it does not rely on timeouts to finish, but cleanly wind down each started thread.

Subtickets

Change History (10)

comment:1 Changed 8 years ago by shane

  • Milestone changed from New Tasks to Year 3 Task Backlog

comment:2 Changed 8 years ago by jelte

Another related issue has come up:

http://git.bind10.isc.org/~tester/builder//BIND10/20120213154001-Debian5Linux-i686/logs/unittests.out

(if you run these tests often enough on a system with some load you're bound to see this error at some point too)

comment:3 follow-up: Changed 8 years ago by naokikambe

Hello,
I'm just curious about the race condition issue found on 640#comment:21. It was seemed to be discussed only on jabber.
Regards,

comment:4 Changed 8 years ago by jelte

FYI, I temporarily disabled another part of the tests I added in #640 in b10-stats_test.py; see commit 047380ffbbf6e876282cc269ef16d8d6b168d6d2

comment:5 in reply to: ↑ 3 Changed 8 years ago by jelte

Replying to naokikambe:

Hello,
I'm just curious about the race condition issue found on 640#comment:21. It was seemed to be discussed only on jabber.

The problem is partly in ThreadingServerManager? from testutils.py; when shutdown() is called on it, it forwards that call to whatever server-type it is managing. It the join()s the thread for that server, but does so with a zero timeout (i.e. it does not wait for it to finish). Since the test I needed for 640 does want those threads to do some additional processing (it tests cleanup), this often fails.

However, a lot of the other tests would deadlock here, should one change that zero timeout to a blocking one. So the workaround there, for now, is to specifically do a blocking join() for that specific test. See commit 2da5a3de561484f902aa7fb8475551217b9535b9

You can reproduce the behaviour by changing the call self.stats_server.shutdown(True) to self.stats_server.shutdown() in b10-stats_test.py. But since it is a race condition, the problem does not show on every run (about one in 5 on my system).

comment:6 Changed 8 years ago by naokikambe

Replying to jelte:
Thank you! When I was working for newly adding test_utils.py, I sometime experienced the case that the unittest didn't work when I omitted the argument of the join() or when I set the non-zero value in the argument. e.g. The threads never stopped, or it continued waiting until it took the specified seconds. It couldn't wait for so long time if each module in each thread was instantly exited. I didn't think that the issue was caused because the threads were used in the stats unittest. Anyway I don't know exactly whether this issue is same as that issue or not.
Regards,

comment:7 Changed 8 years ago by naokikambe

As long as I just saw a bit of b10-stats_test.py, will it work if we explicitly close the already opened ModuleCCSession object before we newly set the MockModuleCCSession object? For example,

--- a/src/bin/stats/tests/b10-stats_test.py
+++ b/src/bin/stats/tests/b10-stats_test.py
@@ -207,7 +207,8 @@ class TestStats(unittest.TestCase):
         # below), are temporarily disabled
         # See ticket #1668
         # Override moduleCCSession so we can check if send_stopping is called
-        #self.stats.mccs = MockModuleCCSession()
+        self.stats.mccs.close()
+        self.stats.mccs = MockModuleCCSession()
         self.assertEqual(send_shutdown("Stats"), (0, None))
         self.assertFalse(self.stats.running)
         # Call server.shutdown with argument True so the thread.join() call

But I don't at all test the above.
Regards,

comment:8 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed

comment:9 Changed 7 years ago by jinmei

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:10 Changed 7 years ago by jinmei

  • Resolution set to duplicate
  • Status changed from new to closed

This one seems to be taken over by #2823.
Closing this as a duplicate.

Note: See TracTickets for help on using tickets.