Opened 10 years ago

Closed 9 years ago

#191 closed task (fixed)

Implement a initial version of stats

Reported by: naokikambe Owned by: naokikambe
Priority: medium Milestone: y2 6 month milestone
Component: statistics Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: 20.0 Add Hours to Ticket: 0
Total Hours: 5.83 Internal?: no

Description

  • Make a initial version of stats daemon which works with other components
    • It can accept the commands related statistics and can report statistics data via bindctl.
  • Fix the specification of statistics channel which is used between stats daemon and other components

Subtickets

Attachments (1)

stats.py,cover (14.7 KB) - added by naokikambe 9 years ago.
An annotate source file of stats.py

Download all attachments as: .zip

Change History (25)

comment:1 Changed 10 years ago by naokikambe

  • Type changed from defect to task

mistook to select type

comment:2 Changed 9 years ago by naokikambe

  • Status changed from new to accepted

I committed r2195 as a initial prototype of stats module.
see also svn log.

comment:3 Changed 9 years ago by naokikambe

  • billable set to 0
  • Estimated Difficulty set to 0.0
  • Internal? unset
  • Owner changed from naokikambe to UnAssigned
  • Status changed from accepted to reviewing

I've completed implementation of ticket #191. I committed r2679. So
this ticket is ready for reviewing. Please refer to the document
StatsModule for details, which is related to ticket #170.

comment:4 Changed 9 years ago by jinmei

Very preliminary feedback:

  • maybe we should rebase it. the branch point was based on very old version of trunk, which makes review difficult
  • from a quick run of unittests, the test coverage is not good. When I run b10-stats_test.py, the coverage report says only 32% of stats.py are covered. I'd say at least 80% of it should be covered before doing formal review by a third person.

comment:5 Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to naokikambe

comment:6 Changed 9 years ago by jinmei

Please also provide a proposed ChangeLog? entry for this ticket.

comment:7 follow-up: Changed 9 years ago by naokikambe

Jinmei-san,

Thank you for your comments. Fist, can I ask you a basic way to evaluate coverage?

Replying to jinmei:

  • from a quick run of unittests, the test coverage is not good. When I run b10-stats_test.py, the coverage report says only 32% of stats.py are covered.

In my environment, it reports 76% in stats.py. I actually evaluated the coverage like following. Could you tell me differences from what you did if there are? I also attached stats.py,cover.

I'll answer your other comments soon.
Thanks,

[kambe@covm tests]$ coverage help
Coverage.py, version 3.3.1
Measure, collect, and report on code coverage in Python programs.

usage: coverage <command> [options] [args]

Commands:
    annotate    Annotate source files with execution information.
    combine     Combine a number of data files.
    erase       Erase previously collected coverage data.
    help        Get help on using coverage.py.
    html        Create an HTML report.
    report      Report coverage stats on modules.
    run         Run a Python program and measure code execution.
    xml         Create an XML report of coverage results.

Use "coverage help <command>" for detailed help on any command.
Use "coverage help classic" for help on older command syntax.
For more information, see http://nedbatchelder.com/code/coverage
[kambe@covm tests]$ pwd
/home/kambe/svn/bind10.isc.org/svn/bind10/branches/trac191/src/bin/stats/tests
[kambe@covm tests]$ coverage erase
[kambe@covm tests]$ env PYTHONPATH=../../../lib/python:.. coverage run ./b10-stats_test.py
........[Stats] Error requesting configuration: just an error
....
----------------------------------------------------------------------
Ran 12 tests in 0.178s

OK
[kambe@covm tests]$ coverage report -m
Name                                                                                                           Stmts   Exec  Cover   Missing
--------------------------------------------------------------------------------------------------------------------------------------------
/home/kambe/svn/bind10.isc.org/svn/bind10/branches/trac191/src/bin/stats/stats                       230    176    76%   35, 46-47, 50-52, 68, 84-87, 120, 178-179, 209-213, 233, 242, 257, 266-272, 278, 296, 312, 331, 349, 358, 368, 370, 378, 387, 403-420, 423
<...snip...>
[kambe@covm tests]$ coverage annotate ../stats_stub.py
[kambe@covm tests]$ 

Changed 9 years ago by naokikambe

An annotate source file of stats.py

comment:8 Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to jinmei

comment:9 Changed 9 years ago by naokikambe

Replying to jinmei:

  • maybe we should rebase it. the branch point was based on very old version of trunk, which makes review difficult

I created a new branch [source:/branches/trac191-rebased trac191-rebased] as r2769, which was merged from the newest trunk and added source files of stats module. Please refer to the branch.

comment:10 in reply to: ↑ 7 Changed 9 years ago by jinmei

Replying to naokikambe:

Replying to jinmei:

  • from a quick run of unittests, the test coverage is not good. When I run b10-stats_test.py, the coverage report says only 32% of stats.py are covered.

In my environment, it reports 76% in stats.py. I actually evaluated the coverage like following. Could you tell me differences from what you did if there are? I also attached stats.py,cover.

Ah, sorry, it turns out I didn't set B10_FROM_SOURCE and tests failed.

I can now see the more decent coverage results.

comment:11 Changed 9 years ago by jinmei

  • Add Hours to Ticket changed from 0.0 to 3.08
  • Total Hours changed from 0.0 to 3.08

I've not completed my review, but am providing some initial feedback:

run_b10-stats.sh.in

  • is B10_FROM_BUILD necessary? (the same comment applies to other .sh)

stats.spec.pre.in

  • I don't think this is a good "default". (it rather looks like an "example")
            "item_default": "123abc@xxxx",
    

unittest_fakesession.py

  • this seems to be largely derived from config/tests/unittest_fakesession.py. please note it if you do something like that. it will help reduce review burden very much. also, it's even better to unify the code (with appropropriate abstraction if per-module customization is necessary). having duplicate code is generally not a good practice.
  • I'm not sure if this line of close() matches the what this method does:
            # need to pass along somehow that this function has been called,
    
    moreover, I'm not even sure about the purpose of this method. Is this while-select trick necessary? Please describe the purpose of this method and how it's expected to be used (as comments).
  • this line in close() seems to be dead code:
                    raise Exception('Unexpected return values from select():', r, w, e)
    
  • test_boss_stub: I don't this is a good test because get_datetime() may return different values in the test and in the testee. (code duplication of get_datetime() is another bad thing)

b10-stats_stub_test.py

  • I'd move tearDown() right after setUp().

stats_stub.py.in

  • BossModuleStub?.send_command(): the method name is too generic for what it's actually doing.
  • I suggest all temporary resources be cleaned up in tearDown(). from a quick look, it seems self.listener has to be stopped.

b10-stats_test.py

  • the intent of this part is not clear. please add comment:
            for i in (0, 1):
                self.session.get_message("ConfigManager", None)
    

comment:12 Changed 9 years ago by jinmei

  • Add Hours to Ticket changed from 0.0 to 0.25
  • Total Hours changed from 3.08 to 3.33

I also have some high level questions about the design of stats.py:

  • I don't see the need for the abstraction using the observer pattern because there is only type of listener (=observer). what's the background rationle of this design decision? Please describe it in more detail in the source code.
  • Likewise, I was not sure about the intent of the seeming introduction of singleton. From a very superficial look at the code the intent seems to make SessionSubject? a singleton class, but at least in the usage of stats.py we don't have to bother to have such a tricky technique, because there cannot be more than one SessionSubject? object in the first place. Besides, I suspect this implementation actually doesn't realize what is seeminly intended:
    class SessionSubject(Subject):
        """
        A concrete subject class which creates CC session object
        """
        __metaclass__ = Singleton
    
    according to a python 2-to-3 guide, this syntax has been depreciated: http://diveintopython3.org/porting-code-to-python-3-with-2to3.html#metaclass Please clarify the intent of the code in more detail.

comment:13 Changed 9 years ago by jinmei

  • Owner changed from jinmei to naokikambe

comment:14 Changed 9 years ago by naokikambe

  • Estimated Difficulty changed from 0.0 to 20.0
  • Owner changed from naokikambe to jinmei

Thank you, Jinmei-san,

Replying to jinmei:

Please also provide a proposed ChangeLog? entry for this ticket.

I committed at r2845

Replying to jinmei:

I'd say at least 80% of it should be covered before doing formal review by a third person.

I committed at r2846

Replying to jinmei:

I've not completed my review, but am providing some initial feedback:

run_b10-stats.sh.in

  • is B10_FROM_BUILD necessary? (the same comment applies to other .sh)

Yes, it's not necessary. I removed them. r2847

stats.spec.pre.in

  • I don't think this is a good "default". (it rather looks like an "example")
            "item_default": "123abc@xxxx",
    

This item is not optional. I'm sure that this value is not appropriate. I changed this item to no default value. r2848

unittest_fakesession.py

  • this seems to be largely derived from config/tests/unittest_fakesession.py. please note it if you do something like that. it will help reduce review burden very much. also, it's even better to unify the code (with appropropriate abstraction if per-module customization is necessary). having duplicate code is generally not a good practice.

Yes, I copied this whole file from here. You're right. I created a whole new class which is mock-up classes against isc.cc.session and isc.config.ccsession. r2862 And I removed unittest_fakesession.py under this directory. r2865

moreover, I'm not even sure about the purpose of this method. Is this while-select trick necessary? Please describe the purpose of this method and how it's expected to be used (as comments).

  • this line in close() seems to be dead code:
                    raise Exception('Unexpected return values from select():', r, w, e)
    

In this new mock-up classes, I don't use this tricky way.

  • test_boss_stub: I don't this is a good test because get_datetime() may return different values in the test and in the testee. (code duplication of get_datetime() is another bad thing)

I have recognized this before but I couldn't find the best way to resolve it. I rewrote the codes so that the same values are returned between test code and target code. I created a new mock-up classes fake_time.py against time.py which is standard class of python. Also I aggregated the definitions of get_datetime() and get_timestamp() in stats.py.in. r2864

b10-stats_stub_test.py

  • I'd move tearDown() right after setUp().

OK. I moved tearDown() up. r2864

stats_stub.py.in

  • BossModuleStub?.send_command(): the method name is too generic for what it's actually doing.

I changed the name of the methods to the followings according to actual doing:
Before::

  BossModuleStub.send_command()
  AuthModuleStub.send_command_udp()
  AuthModuleStub.send_command_tcp()

After::

  BossModuleStub.send_boottime()
  AuthModuleStub.send_udp_query_count()
  AuthModuleStub.send_tcp_query_count()

r2864

  • I suggest all temporary resources be cleaned up in tearDown(). from a quick look, it seems self.listener has to be stopped.

OK. I did that. I added the codes which stops the listener object and the subject object into each tearDown() method.
r2864

b10-stats_test.py

  • the intent of this part is not clear. please add comment:
            for i in (0, 1):
                self.session.get_message("ConfigManager", None)
    

This codes isn't unnecessary because I created a mock-up class against ConfigManager?. I removed this codes.

Replying to jinmei:

I also have some high level questions about the design of stats.py:

I don't have strong opinions but it might be wrong ways to use these design patterns. :(

  • I don't see the need for the abstraction using the observer pattern because there is only type of listener (=observer). what's the background rationle of this design decision? Please describe it in more detail in the source code.

I added a comment to the code like this. In the initial release, I'm also sure that observer pattern isn't definitely needed because the interface between gathering and reporting statistics data is single.
However in the future release, the interfaces may be multiple, that is, multiple listeners may be needed. For example, one interface, which stats module has, is for between config manager and stats module, another interface is for between HTTP server and stats module, and one more interface is for between SNMP server and stats module. So by considering that stats module needs multiple interfaces in the future release, I adopted the observer pattern in stats module. But I don't have concrete ideas in case of multiple listener currently.
r2864

  • Likewise, I was not sure about the intent of the seeming introduction of singleton. From a very superficial look at the code the intent seems to make SessionSubject? a singleton class, but at least in the usage of stats.py we don't have to bother to have such a tricky technique, because there cannot be more than one SessionSubject? object in the first place. Besides, I suspect this implementation actually doesn't realize what is seeminly intended:
    class SessionSubject(Subject):
        """
        A concrete subject class which creates CC session object
        """
        __metaclass__ = Singleton
    

I added a comment to the code like this. At the beginning of coding, one UNIX domain socket is needed for config manager, another socket is needed for stats module, then stats module might need two sockets. So I adopted the singleton pattern because I avoid creating multiple sockets in one stats module.
But in the initial version stats module reports only via bindctl, so just one socket is needed. To use the singleton pattern is not important now. :(
r2864

according to a python 2-to-3 guide, this syntax has been depreciated: http://diveintopython3.org/porting-code-to-python-3-with-2to3.html#metaclass
Please clarify the intent of the code in more detail.

Thank you for your giving reference. I didn't know this guide. I rewrote the code according to the guide.

Are above all my answers just enough? Please be free to ask again if you can't understand that.

Thanks,

comment:15 Changed 9 years ago by naokikambe

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Total Hours changed from 3.33 to 3.83

I added lines of sending "bind10.boot_time" to b10-stats to bind10.py.in.
r2888
Please check it too.

comment:16 follow-up: Changed 9 years ago by jelte

I committed two minor patches to the branch:

r2910 (to fix a crash in cmdctl if you give the command 'Stats show', i.e. without an argument)

r2911 (to be able to run this in a separate build directory)

comment:17 follow-up: Changed 9 years ago by jelte

  • Owner changed from jinmei to naokikambe

Some other comments:

stats.py.in:

line 172: I think the assignment of self.subject is already done by the call to the superclasses init

line 205: eek, eval(). Even though we only eval something out of files we provide, it is still scary, and looks unnecessary in this case;
callback = getattr(self, name)
should also work

More general comments:
I noticed in the spec file that the statistics data is considered configuration (it is under "config_data"). I personally think this is confusing and possible also error-prone. I propose we extend the .spec format with a "stats_data", on the same level as "config_data". Right now, it could have the same semantics, but for starters it keeps statistics data like 'boot time' away from configuration.

What exactly are those Stub classes supposed to do? Are they only for testing or is the intent that modules use them?

comment:18 in reply to: ↑ 16 Changed 9 years ago by naokikambe

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Owner changed from naokikambe to jelte
  • Total Hours changed from 3.83 to 4.33

Replying to jelte:

r2910 (to fix a crash in cmdctl if you give the command 'Stats show', i.e. without an argument)

Thanks :)

r2911 (to be able to run this in a separate build directory)

Thanks. But do we always need to use the build directory instead of source directory in python?
I also fixed other lines related this change. see r2926.

comment:19 in reply to: ↑ 17 Changed 9 years ago by naokikambe

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Total Hours changed from 4.33 to 4.83

Replying to jelte:

stats.py.in:

line 172: I think the assignment of self.subject is already done by the call to the superclasses init

I removed this at r2970.

line 205: eek, eval(). Even though we only eval something out of files we provide, it is still scary, and looks unnecessary in this case;
callback = getattr(self, name)
should also work

Yeah, I fixed this at r2973.

More general comments:
I noticed in the spec file that the statistics data is considered configuration (it is under "config_data"). I personally think this is confusing and possible also error-prone. I propose we extend the .spec format with a "stats_data", on the same level as "config_data". Right now, it could have the same semantics, but for starters it keeps statistics data like 'boot time' away from configuration.

Yes, that's right. But in an initial version, I chose to use it under "config_data". From next version, I'll try to extend spec file of stats module for it.

What exactly are those Stub classes supposed to do? Are they only for testing or is the intent that modules use them?

Stub is only for testing, not really installed. It works as real bind10 or b10-auth against b10-stats. It is never used by other modules.

comment:20 Changed 9 years ago by naokikambe

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Total Hours changed from 4.83 to 5.33
  • I removed the implement of increase command and related test codes due to the decision of #170(ticket:170:26). Please see r3046.
  • I created a new ticket #347: Implement query counters in auth module. Please discuss about the implement of statistics features on auth module in #347. It is out of scope here. Please see #347.

comment:21 follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to naokikambe

One more thing, I'd prefer it if you change

742	    cmd = { "command": [
743	             'set',
744	             { "stats_data": {
745	                 'bind10.boot_time': time.strftime('%Y-%m-%dT%H:%M:%SZ', _BASETIME)
746	               }
747	             }
748	            ]
749	          }

in bind10.py.in to

cmd = isc.config.ccsession.create_command('set', 
        { "stats_data": {
            'bind10.boot_time': time.strftime('%Y-%m-%dT%H:%M:%SZ', _BASETIME)
          }
        })

The actual data passed is still 'hand-crafted' here, but the abstract away from the actual format of command messages. (Perhaps this goes for the tests as well btw).

Oh and another thing, i see that we recreate part of the lib/python/isc tree for tests here. Not that we especially need to change this now, but i am a bit afraid that this won't really scale or that this gets stale, so we should think about ways to generalize this.

comment:22 in reply to: ↑ 21 Changed 9 years ago by naokikambe

  • Add Hours to Ticket changed from 0.0 to 0.5
  • Owner changed from naokikambe to jelte
  • Total Hours changed from 5.33 to 5.83

Thank you for reviewing again.

Replying to jelte:

One more thing, I'd prefer it if you change

I'm done in r3207.

Oh and another thing, i see that we recreate part of the lib/python/isc tree for tests here. Not that we especially need to change this now, but i am a bit afraid that this won't really scale or that this gets stale, so we should think about ways to generalize this.

I agree it. If we can define the 'interface' in the library, other modules should use it through the 'interface'. We should not usually recreate the 'interface' part but we can usually recreate the concrete part of the library. Then we don't need to recreate the modules every the library is changed. I don't know we can actually define such 'interface' in python. That's a idea.

Could you let me know whether you have any other comments?

comment:23 Changed 9 years ago by jelte

  • Owner changed from jelte to naokikambe

No, the rest looks fine.

comment:24 Changed 9 years ago by naokikambe

  • Resolution set to fixed
  • Status changed from reviewing to closed

OK, I'm closing it. And I'll merge trunk with [source:/branches/trac191-rebased@3217].

Note: See TracTickets for help on using tickets.