Opened 9 years ago

Closed 8 years ago

#930 closed enhancement (fixed)

update stats modules to get the statistics list from cfgmgr, and define the list in each module's spec file

Reported by: naokikambe
Owned by: naokikambe
Priority: medium
Milestone: Sprint-20111011
Component: statistics
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 20.0
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description

After #928 and #929, stats modules can get the statistics list from cfgmgr instead of a local stats-spec file. So update the stats modules accordingly, and define the statistics items in each module's spec file.

Subtickets

Change History (20)

comment:1 Changed 9 years ago by naokikambe

  • Type changed from defect to enhancement

comment:2 Changed 9 years ago by shane

  • Milestone changed from New Tasks to Year 3 Task Backlog

comment:3 Changed 9 years ago by naokikambe

  • Estimated Difficulty changed from 0.0 to 20.0
  • Owner set to UnAssigned
  • Status changed from new to reviewing

It's ready for review.

comment:4 Changed 9 years ago by stephen

  • Milestone changed from Year 3 Task Backlog to Sprint-20110802

comment:5 Changed 9 years ago by naokikambe

  • Owner changed from UnAssigned to naokikambe
  • Status changed from reviewing to accepted

Conflicts with the latest master code need to be fixed.

comment:6 Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to UnAssigned
  • Status changed from accepted to reviewing

Conflicts are resolved. Ready for review again.

comment:7 Changed 9 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:8 Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

I hope I didn't overlook much, as the diff is quite large. Anyway, the general impression is good, but there are of course some comments:

  • The changelog could be more user oriented. It talks a lot about how data are sent through the system, which users aren't much interested in. I think the main point is that the items are specified by each module's spec file instead of one global one.
  • The name parse_spec doesn't look like the best choice, because the spec file is already parsed; this function extracts the default values of the items. So something like get_spec_defaults would look better.
  • If there are errors in command_show, an exception is raised, but the errors themselves are not preserved anywhere. It would be better to give as much information as possible. Would it be possible to either log the errors or put them into the exception?
  • A lot of tests use time.sleep. I think this might be a problem: as our buildbots showed, the timers there are highly inaccurate and the machines are sometimes loaded. It might happen that the timeout is not enough. Or, if it were large enough to work just about everywhere, the tests would be incredibly slow to run. As this looks like some kind of synchronization between threads, I think it would be better to use a tool that is designed for this (some condition or event or whatever it is called in Python). A minimal sketch of what I mean follows after this list.
  • This is not a problem, but I wonder where these come from. I know the test does invoke some 500s, but who prints the messages?
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xml HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsd HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsl HTTP/1.1" 500 -
    
  • There are many places where there is an empty line containing only spaces. This is not a huge problem, but the code looks better if there are no spaces at the end of a line or on empty lines. The file with most of them is src/bin/stats/tests/b10-stats-httpd_test.py, but some are in other files as well. Git can show a diff with them marked if you set color.diff = auto (I believe it is this one, maybe it's a different color config).
  • This test looks really strange:
        def test_get_timestamp(self):
            self.assertEqual(stats.get_timestamp(), 1308730448.965706)
    
    I think it depends on being run after another test that sets the function that gets the current time to a constant one. But depending on the order of tests is fragile; the setup should be put into setUp() or somewhere like that, and this test deserves a comment about it. There are more tests like this.
  • This is unneeded, isn't it?
    def tearDown(self):
        pass
    
  • We run the tests from source? Well, we do, but I mean with the environment variable? Where does this come from?
    self.assertTrue("B10_FROM_SOURCE" in os.environ)
    
  • Generally, tests where the whole system is simulated are hard to call "unit tests". I'm not opposed to having them (having more and more thorough tests is good for sure), but some kind of comment on what is going on there would be nice.
  • A lot of functions (e.g. update_modules) would deserve a little more documentation. For example, some of them return None in case of a bad name and are handled by a condition; this special return value should be noted (or, better, it should be raised as some kind of exception and caught at the level where the command is received from CC, automatically converting it to the error answer, but even then the exception should be noted in the docstring).
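
For illustration, here is roughly the kind of Event-based hand-off I have in mind. This is just a sketch with made-up names, not code taken from the branch:

    import threading, time

    class MockModule:
        """Toy stand-in for a module running in its own thread."""
        def __init__(self):
            self.ready = threading.Event()
            self.running = True

        def run(self):
            # ... whatever setup the test would otherwise sleep() for ...
            self.ready.set()            # announce readiness explicitly
            while self.running:
                time.sleep(0.1)         # pretend to serve requests

    module = MockModule()
    thread = threading.Thread(target=module.run)
    thread.start()
    module.ready.wait(timeout=10)       # returns as soon as set() is called
    module.running = False
    thread.join()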

Thank you

comment:9 follow-ups: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Replying to vorner:

Thank you for reviewing.

I added more changes to the trac930 branch after your first review, because we had to add more changes depending on #928 and #929 while those reviews were going on. You have probably already read up to git 5909fb3d1dd3cd806e0df965d707b45803d4b9c3. Could you look at the commits after git a0a022bf4c5150f62b8a4118e5d6a4dcaf69f669? Thank you very much for your patience.

  • The changelog could be more user oriented. It talks a lot about how data are sent through the system, which users aren't much interested in. I think the main point is that the items are specified by each module's spec file instead of one global one.

I made it simpler. How about this?

xxx.    [func]          naokikambe
	Statistics items are specified by each module's spec file.
	Stats module can read these through the config manager. Stats
	module and stats httpd report statistics data and statistics
	schema by each module via both bindctl and HTTP/XML.
	(Trac #928,#929,#930, git nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn)
  • The name parse_spec doesn't look like the best choice, because the spec file is already parsed; this function extracts the default values of the items. So something like get_spec_defaults would look better.

I renamed the function in git 24fa3987d6e1f42d645f2c0a9c26cd8bdbc9c16b.
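
For reference, the idea of the renamed helper is roughly as follows (only an illustration of the idea, not the exact code in the branch):

    def get_spec_defaults(spec):
        """Map each item name in the already-parsed spec list of a module
        to its default value (simplified; nested item types are ignored)."""
        if spec is None:
            return {}
        return dict((item['item_name'], item.get('item_default'))
                    for item in spec)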

  • If there are errors in command_show, an exception is raised, but the errors themselves are not preserved anywhere. It would be better to give as much information as possible. Would it be possible to either log the errors or put them into the exception?

I fixed that in git 065c6004799a76d45859289d2ebae0379e9dceba.

  • A lot of tests use time.sleep. I think this might be a problem: as our buildbots showed, the timers there are highly inaccurate and the machines are sometimes loaded. It might happen that the timeout is not enough. Or, if it were large enough to work just about everywhere, the tests would be incredibly slow to run. As this looks like some kind of synchronization between threads, I think it would be better to use a tool that is designed for this (some condition or event or whatever it is called in Python).

I could not find a nicer solution. :(
Could you give an example, or can this be out of scope for this ticket?

  • This is not a problem, but I wonder where these come from. I know the test does invoke some 500s, but who prints the messages?
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xml HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsd HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsl HTTP/1.1" 500 -
    

Actually I didn't implement it. I think http.server's log_message might print these messages.
Do we need to add more messages that are helpful for users?

  • There are many places where there is an empty line containing only spaces. This is not a huge problem, but the code looks better if there are no spaces at the end of a line or on empty lines. The file with most of them is src/bin/stats/tests/b10-stats-httpd_test.py, but some are in other files as well. Git can show a diff with them marked if you set color.diff = auto (I believe it is this one, maybe it's a different color config).

I might have made a mistake somewhere when I copied some code. I removed the unnecessary trailing whitespace.
Please see git f02cc6985eff417f34e4edd232b6836852de6c98.

  • This test looks really strange:
        def test_get_timestamp(self):
            self.assertEqual(stats.get_timestamp(), 1308730448.965706)
    
    I think it depends on being run after another test that sets the function that gets the current time to a constant one. But depending on the order of tests is fragile; the setup should be put into setUp() or somewhere like that, and this test deserves a comment about it. There are more tests like this.
  • This is unneeded, isn't it?
     def tearDown(self):
         pass
    

I fixed these in git 5002ef4bc64e1d58293d8891adbaf652e075a6cf. Please see that.

  • We run the tests from source? Well, we do, but I mean with the environment variable? Where does this come from?
    self.assertTrue("B10_FROM_SOURCE" in os.environ)
    

We set os.environ["B10_FROM_SOURCE"] when doing "make check", as shown below. We are sure that it is set properly in the test. However, this test is actually not so important. Do we need to remove it?

tests/Makefile.am:

        for pytest in $(PYTESTS) ; do \
...
        B10_FROM_SOURCE=$(abs_top_srcdir) \
...
        $(PYCOVERAGE_RUN) $(abs_srcdir)/$$pytest || exit ; \
        done
  • Generally, tests where the whole system is simulated are hard to call "unit tests". I'm not opposed to having them (having more and more thorough tests is good for sure), but some kind of comment on what is going on there would be nice.

I added comments to the test scripts in git fee53aeaaf8c953c856f12ae3388a61abc47d2f9, but I could not come up with nicer comments. Could you give me advice?

  • A lot of functions (e.g. update_modules) would deserve a little more documentation. For example, some of them return None in case of a bad name and are handled by a condition; this special return value should be noted (or, better, it should be raised as some kind of exception and caught at the level where the command is received from CC, automatically converting it to the error answer, but even then the exception should be noted in the docstring).

I modified the behavior of these methods. Please see git 5c9bdf1345699e47018c16254778de67cc695847.
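
The pattern is roughly the one you describe, something like this (a simplified sketch, not the actual code; error_answer() here only stands in for the real helper that builds a CC error answer):

    class StatsError(Exception):
        """Signals that a statistics command could not be handled."""

    def error_answer(message):
        # Stand-in for the real helper that builds a CC error answer.
        return {'result': [1, message]}

    def command_handler(command, args, handle):
        """Catch StatsError where the command arrives from the CC session
        and convert it into an error answer, instead of returning a bare
        None from the inner helpers."""
        try:
            return handle(command, args)
        except StatsError as err:
            return error_answer(str(err))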

Thank you.

comment:10 in reply to: ↑ 9 Changed 9 years ago by naokikambe

Replying to naokikambe:

  • A lot of tests use time.sleep. I think this might be a problem: as our buildbots showed, the timers there are highly inaccurate and the machines are sometimes loaded. It might happen that the timeout is not enough. Or, if it were large enough to work just about everywhere, the tests would be incredibly slow to run. As this looks like some kind of synchronization between threads, I think it would be better to use a tool that is designed for this (some condition or event or whatever it is called in Python).

I could not find a nicer solution. :(
Could you give an example, or can this be out of scope for this ticket?

This is not a nice solution and basically nothing is improved, but I refactored the unit tests a little. Could you look at git 7a682a158e383cd2b11f1ba49b7d80c05cb4d733?

Thanks,

comment:11 in reply to: ↑ 9 ; follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Good morning

Replying to naokikambe:

I added more changes to the trac930 branch after your first review, because we had to add more changes depending on #928 and #929 while those reviews were going on. You have probably already read up to git 5909fb3d1dd3cd806e0df965d707b45803d4b9c3. Could you look at the commits after git a0a022bf4c5150f62b8a4118e5d6a4dcaf69f669? Thank you very much for your patience.

By the added code, you mean the validation?

I'm slightly nervous about these changes. While validation is important, I think it should be done on the receiving side (in the stats module), for two reasons.

One is, if it is checked by the module which is sending the data, it does not solve much. If it is buggy, it can just as well do the validation wrong and send wrong data anyway. And if the module wants to be malicious, nothing stops it from being so. Therefore I don't think validation on the sending side solves any problem (or, at least, I don't see any problem being solved by this).

Another is, when it is in the stats module, it only needs to be implemented once for all modules, saving a lot of almost duplicate code in many modules.

  • A lot of tests use time.sleep. I think this might be a problem: as our buildbots showed, the timers there are highly inaccurate and the machines are sometimes loaded. It might happen that the timeout is not enough. Or, if it were large enough to work just about everywhere, the tests would be incredibly slow to run. As this looks like some kind of synchronization between threads, I think it would be better to use a tool that is designed for this (some condition or event or whatever it is called in Python).

I could not find a nicer solution. :(
Could you give an example, or can this be out of scope for this ticket?

As I'm looking into it, I think just placing the _started.set() at the right place would be enough, i.e. when the thing is already prepared to do whatever is needed. In most cases it looked like it is already placed where it should be anyway, but I didn't study it too thoroughly.

  • This is not a problem, but I wonder where these come from. I know the test does invoke some 500s, but who prints the messages?
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xml HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsd HTTP/1.1" 500 -
    localhost - - [14/Jul/2011 12:17:58] code 500, message Internal Server Error
    localhost - - [14/Jul/2011 12:17:58] "GET /bind10/statistics/xsl HTTP/1.1" 500 -
    

Actually I didn't implement it. I think http.server's log_message might print these messages.
Do we need to add more messages that are helpful for users?

It might be a nice bonus, but nothing really important IMO. Let's leave it for now, I was just curious because I didn't find the code producing them originally.

  • This test looks really strange:
        def test_get_timestamp(self):
            self.assertEqual(stats.get_timestamp(), 1308730448.965706)
    
    I think it depends on being run after another test that sets the function that gets the current time to a constant one. But depending on the order of tests is fragile; the setup should be put into setUp() or somewhere like that, and this test deserves a comment about it. There are more tests like this.

The test_command_show still uses the numeric "magic" constants. That is correct, as they are set and checked two lines from each other, and it is a different test fixture than the rest, so it is maybe even easier; I just wanted to point it out, as it might be less clean. But if you like it this way, it is OK.

  • We run the tests from source? Well, we do, but I mean with the environment variable? Where does this come from?
    self.assertTrue("B10_FROM_SOURCE" in os.environ)
    

We set os.environ["B10_FROM_SOURCE"] when doing "make check", as shown below. We are sure that it is set properly in the test. However, this test is actually not so important. Do we need to remove it?

No, no need to remove it; actually testing the assumption is better. It is just unusual for the variable to be set, so maybe add a comment to the assert saying that it comes from the Makefile, to clarify?
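
Something along these lines would be enough (the wording is only a suggestion):

    # B10_FROM_SOURCE is exported by tests/Makefile.am when the tests are
    # run via "make check"; if it is missing, we are being run another way.
    self.assertTrue("B10_FROM_SOURCE" in os.environ)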

  • Generally, tests where the whole system is simulated are hard to call "unit tests". I'm not opposed to having them (having more and more thorough tests is good for sure), but some kind of comment on what is going on there would be nice.

I added comments to the test scripts in git fee53aeaaf8c953c856f12ae3388a61abc47d2f9, but I could not come up with nicer comments. Could you give me advice?

Well, they help a little. But I'd appreciate something like a listing of what is being done and how. I'm not sure this is true for the code (so check after me that it doesn't lie), but maybe something like this:

In each of these tests we start several virtual components. They are not the
real components, no external processes are started. They are just simple mock
objects running each in its own thread and pretending to be bind10 modules. This
helps testing the stats module in a close to real environment.

With regards

comment:12 in reply to: ↑ 11 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Hello, thank you for giving comments again.

Replying to vorner:

By the added code, you mean the validation?

Yes.

I'm slightly nervous about these changes. While validation is important, I think it should be done on the receiving side (in the stats module), for two reasons.

One is, if it is checked by the module which is sending the data, it does not solve much. If it is buggy, it can just as well do the validation wrong and send wrong data anyway. And if the module wants to be malicious, nothing stops it from being so. Therefore I don't think validation on the sending side solves any problem (or, at least, I don't see any problem being solved by this).

I'm very sorry if I am misunderstanding. Of course the stats module validates the statistics data received from other modules. The validation here is just a spec-format validation before the statistics data is sent; it is done by the non-stats module itself (like auth, boss, etc.). The implementation of this validation is like the already implemented ModuleSpec::validateConfig or ModuleSpec::validateCommand. And Jelte-san commented in #928 that statistics validation would also be required for other modules. (6:ticket:928)
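
Roughly, the idea on the sending side is like this (just a sketch; validate_statistics() is used here as a hypothetical counterpart of validateConfig/validateCommand, not necessarily the real method name):

    def check_stats_before_send(module_spec, stats_data):
        """Spec-format check done by the producing module (auth, boss, etc.)
        before it sends its statistics data to the stats module."""
        errors = []
        if not module_spec.validate_statistics(True, stats_data, errors):
            raise ValueError("statistics data does not match the spec: "
                             + ", ".join(errors))
        return stats_data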

Another is, when it is in the stats module, it only needs to be implemented once for all modules, saving a lot of almost duplicate code in many modules.

Yes, we need to think more about the duplicate implementation, but I think that might be a little beyond the scope of #930.

As I'm looking into it, I think just placing the _started.set() at the right place would be enough, i.e. when the thing is already prepared to do whatever is needed. In most cases it looked like it is already placed where it should be anyway, but I didn't study it too thoroughly.

That's right, but it seems that _started.set()/wait() actually works properly. I confirmed it with a simple script like the following.

import time, threading
import test_utils

class MockThreadingServerManager(test_utils.ThreadingServerManager):
    def run(self):
        assert self.server._started.is_set() is False
        self.server._thread.start()
        print(str(time.time()))
        self.server._started.wait()
        assert self.server._started.is_set() is True
        self.server._started.clear()
        assert self.server._started.is_set() is False
        print(str(time.time()))

class Mock:
    def __init__(self):
        self._started = threading.Event()
        self.running = False
    def run(self):
        time.sleep(3)
        assert self._started.is_set() is False
        self._started.set()
        assert self._started.is_set() is True
        self.running = True
        while self.running: time.sleep(1)
    def shutdown(self):
        self.running = False

if __name__ == "__main__":
    mock = MockThreadingServerManager(Mock)
    mock.run()
    mock.shutdown()

However, more waiting is actually required after _started.set() is done (in other words, after _started.wait() returns). As one solution, should we move _started.set() inside the target module, for example just before the target module's while-loop starts? But it might not be good to customize the inside of the target module for tests only.

It might be a nice bonus, but nothing really important IMO. Let's leave it for now, I was just curious because I didn't find the code producing them originally.

OK. Thank you.

The test_command_show still uses the numeric "magic" constants. That is correct, as they are set and checked two lines from each other, and it is a different test fixture than the rest, so it is maybe even easier; I just wanted to point it out, as it might be less clean. But if you like it this way, it is OK.

I also moved these variables into the setUp function as constants.

No, no need to remove it; actually testing the assumption is better. It is just unusual for the variable to be set, so maybe add a comment to the assert saying that it comes from the Makefile, to clarify?

OK, but I moved it out of setUp and merged it with another OS environment test (test_osenv) into a separate unittest class.

Well, they help a little. But I'd appreciate something like a listing of what is being done and how. I'm not sure this is true for the code (so check after me that it doesn't lie), but maybe something like this:

In each of these tests we start several virtual components. They are not the
real components, no external processes are started. They are just simple mock
objects running each in its own thread and pretending to be bind10 modules. This
helps testing the stats module in a close to real environment.

Very good comment, thank you! I pasted it almost as-is into the header of each unittest script.

Those changes are pushed in git 3e3f4cad1dd4068070cac2e322d070df563565eb, and I also added trivial changes earlier in 81724ff9d7b84fffbef12d5fe29110ac648cc27f. Please see them.

Best regards,

comment:13 in reply to: ↑ 12 ; follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

Replying to naokikambe:

One is, if it is checked by the module which is sending the data, it does not solve much. If it is buggy, it can just as well do the validation wrong and send wrong data anyway. And if the module wants to be malicious, nothing stops it from being so. Therefore I don't think validation on the sending side solves any problem (or, at least, I don't see any problem being solved by this).

I'm very sorry if I am misunderstanding. Of course the stats module validates the statistics data received from other modules. The validation here is just a spec-format validation before the statistics data is sent; it is done by the non-stats module itself (like auth, boss, etc.). The implementation of this validation is like the already implemented ModuleSpec::validateConfig or ModuleSpec::validateCommand. And Jelte-san commented in #928 that statistics validation would also be required for other modules. (6:ticket:928)

Ah, I see. OK, then it makes sense.

However, more waiting is actually required after _started.set() is done (in other words, after _started.wait() returns). As one solution, should we move _started.set() inside the target module, for example just before the target module's while-loop starts? But it might not be good to customize the inside of the target module for tests only.

OK, let's leave it this way for now, because it doesn't look trivial. But could you, please, create a ticket to get rid of that sleep somehow? I still think it is not really nice to have it there, though I don't see an easy way to avoid it right now.

So, I believe the branch is ready.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Hello,

Replying to vorner:

OK, let's leave it this way for now, because it doesn't look trivial. But could you, please, create a ticket to get rid of that sleep somehow? I still think it is not really nice to have it there, though I don't see an easy way to avoid it right now.

OK. I haven't created a ticket for it yet, but I made a draft. How about this?

Title: find another solution to synchronize between threads used in
        unittests for stats

Description: As we discussed in #930, time.sleep() is used to
        synchronize between threads used in unittests for stats. This is
        not appropriate because, on our buildbots, the timers are
        inaccurate and the timeout might not be enough.

So, I believe the branch is ready.

I'm waiting for #929 to be finished. trac930 depends on trac929, and trac929 is already merged into trac930. If #929 is OK, I will merge trac930 into master.

Best

Last edited 9 years ago by naokikambe

comment:15 in reply to: ↑ 14 ; follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

Replying to naokikambe:

Replying to vorner:

OK, let's leave it this way for now, because it doesn't look trivial. But could you, please, create a ticket to get rid of that sleep somehow? I still think it is not really nice to have it there, though I don't see an easy way to avoid it right now.

OK. I haven't created a ticket for it yet, but I made a draft. How about this?

Title: find another solution to synchronize between threads used in
        unittests for stats

Description: As we discussed in #930, time.sleep() is used to
        synchronize between threads used in unittests for stats. This is
        not appropriate because, on our buildbots, the timers are
        inaccurate and the timeout might not be enough.

I'd also include another reason (the sleep slows down the tests), but yes, this one is OK.

So, I believe the branch is ready.

I'm waiting for #929 to be finished. trac930 depends on trac929, and trac929 is already merged into trac930. If #929 is OK, I will merge trac930 into master.

ACK.

With regards.

comment:16 in reply to: ↑ 15 Changed 9 years ago by naokikambe

  • Resolution set to fixed
  • Status changed from reviewing to closed

Hello,

I've just merged the branch into master and created a new ticket (#1175), and I'm closing this ticket. Thank you for the long review.

Best

comment:17 Changed 9 years ago by naokikambe

Michal-san,

Sorry, as you know, the merging is not done yet because some failures occurred on the buildbots:

http://git.bind10.isc.org/~tester/builder//BIND10/20110815074001-Debian5Linux-i686/logs/unittests.out
http://git.bind10.isc.org/~tester/builder//BIND10/20110815065000-Solaris10-sparc-Sunstudio/logs/unittests.out
http://git.bind10.isc.org/~tester/builder//BIND10-systest/20110815064501-MacOS/logs/unittests.out

I'll work on this problem in #1175. If the problem is fixed, I will merge that branch together with the trac930 part. #930 stays closed.

Best

comment:18 Changed 9 years ago by naokikambe

  • Resolution fixed deleted
  • Status changed from closed to reopened

Merging has not been completed. The ticket should be reopened.

comment:19 Changed 8 years ago by jelte

  • Milestone changed from Sprint-20110927 to Sprint-20111011

comment:20 Changed 8 years ago by naokikambe

  • Resolution set to fixed
  • Status changed from reopened to closed

Merging of #1175 and #930 is done; closing the ticket.
