Opened 9 years ago

Closed 9 years ago

#547 closed enhancement (fixed)

Update of statistics daemon for HTTP/XML reporting: implement

Reported by: naokikambe Owned by: naokikambe
Priority: medium Milestone: Sprint-20110503
Component: statistics Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 40.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Implement a HTTP/XML interface for the statistics daemon according to #512

Subtickets

Change History (19)

comment:1 follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to UnAssigned
  • Status changed from new to reviewing

It's ready for reviewing at git be3ac2735e707deb384f9f3ec9f36857026bae1d.

comment:2 in reply to: ↑ 1 Changed 9 years ago by jinmei

Replying to naokikambe:

It's ready for reviewing at git be3ac2735e707deb384f9f3ec9f36857026bae1d.

The diff seems to too big (even for the essntial part, which is the
diff between a recent master and the head of the branch). Is it
possible to break down into multiple steps/pieces to reduce the load
of review? It would actually be even better if we can make it
shorter/smaller cycle of incremental development/review/merge steps
from the beginning.

Also, it would be helpful to add some implementation notes as a hint
to the reviewer (e.g., "at a high level the whole diff consists of the
following N parts", "this part may be tricky and should be reviewed
carefully)..

comment:3 Changed 9 years ago by naokikambe

  • Estimated Difficulty changed from 0.0 to 40.0
  • Owner changed from UnAssigned to jinmei

Sorry for so many and small-sized changes in trac547. I created a new
branch "trac547-summarized" at git
7a7299fb09d2a02023041d3fe7a92e22d17ab549. This branch is based on the
newest master and merged with trac547. And the summary of the changes
is described in the commit log in the new branch. I think document
Stats module is also helpful for reviewing. Is that okay
for you? If not enough, please let me know.

comment:4 Changed 9 years ago by jinmei

  • Owner changed from jinmei to UnAssigned

comment:5 Changed 9 years ago by stephen

  • Milestone A-Team-Task-Backlog deleted

Milestone A-Team-Task-Backlog deleted

comment:6 Changed 9 years ago by shane

  • Milestone set to Sprint-20110405

comment:7 Changed 9 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:8 follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

The problem with size was not about the number of commits. It was about the amount of changed code, which has to be read during the review. I did go trough it, but squashing the whole branch into one commit does not help, quite the reverse. Splitting into commits might help a lot, but that expects that the commits have good description and the history is clean. (Btw. how did you manage to have some commits 3 times in the history? Do you by any chance combine rebase with merge? It does nasty things to the history). Anyway, keep it like it is for now (I tried cleaning the history up, but it seems it would require quite a lot of time). Jinmei originally asked if you could say something like „commits from this point to this implement the startup and listening and are independent of the rest“, and so on, so one could read a part, write comments for them, forget them and start reading the next part ‒ it helps both from psychic point of view (it's less tiresome going trough 300 lines of changes, even 10 times, than 3000 lines at one time) and from practical point of view (while reviewing, I try to remember all the relations between the new code, and it gets confusing when the amount gets large).

I fixed some English in the man page, pushed into the -summarised branch.

About the code:

  • Every process/module has a separate directory in the bin/. Would the xml stats server deserve it's own directory as well?
  • Do we expect users to want to know the internal how it works? Like how it is started by boss, etc. Maybe the man page could be less detailed in that sense, and just say that it presents the statistics from stats module trough XML/HTTP?
  • The man page talks about parameters when it talks on which addresses it listens. IMO that's little bit misleading, user might want to put it into the command line after reading it. Maybe call them options or configuration points so it is clear they are not command line arguments?
  • The BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}", DATAROOTDIR).replace("${prefix}", PREFIX) is little bit suspicious. Can the BASE_LOCATION really contain strings like '${prefix}'? Where do they come from?
  • You seem to use dict(key=value) consistently trough the code. Is there any special reason about that? I think the use of {'key':value} is more common in the python world.
  • This might be out of the scope of this task, but is there a way to ask for only part of the data? Like there is for the stats module?
  • Why is this limitation in place?:
    # The size of template file should be under 100k. TODO: should be
    # considered later.
    MAX_SIZE_OF_TEMPLATE = 102400
    
    Allowing arbitrary size should be no problem, since the file is local one and the only one able to change it is the administrator. Why stop him? Reading the whole file by single call without the limitation should be no problem in python (the file object should have something like readAll() or similar).
  • There is number of assert calls trough the code. While this is generally good thing, asserts for given types of things (eg. assert isinstance(self.server, HttpServer)) is AFAIK explicitly not recommended by python guidelines. The strength of python is that you can put whatever object you like into it as long as it has the needed methods. So, such assert actually make the code reuse harder, because every user needs to inherit of the given checked base class. Example of where it might make trouble are mock classes in testing.
  • The server redirects all unknown addresses to the one address for the data. That's quite surprising behaviour, wouldn't it be better to redirect only things like / and index.html, but leave /something_that_does_not_exist to actually return 404? If the user took the work to type some address, he had some intention and we should tell him we don't know the answer for him instead of giving him something else.
  • Why does it subscribe to the stats' messages? It does nothing when it receives such message and it asks for the statistics on demand when they are needed. I guess nothing would happen if it didn't subscribe to it, simplifying the code.
  • You sort out duplicate addresses by some set. But if they are problem, shouldn't a configuration with duplicate addresses be rejected instead of silently modifying it?
  • If you want to raise the original exception:
    except select.error as err:
        if err.args[0] == errno.EINTR:
        (rfd, wfd, xfd) = ([], [], [])
    else:
        raise select.error(err)
    
    You might want to use just raise without any parameters. That raises the exact instance of exception as it caught in the except, preserving all other information (like errno, which your way doesn't) and the backtrace is more helpful (it doesn't show the note about exception being raised while handling another exception).
  • You use select around the httpd object and when a socket is readable, you call it. But, does it use non-blocking sockets? If there's a client that sends GET url and gets stuck before sending the rest of headers, will our program get stuck as well, trying to read until the end of the headers?
  • Aren't you sending answer to answer? I guess only commands should have answers:
    try:
          seq = self.cc_session.group_sendmsg(
              isc.config.ccsession.create_command('show'),
              self.stats_module_name)
          (answer, env) = self.cc_session.group_recvmsg(False, seq)
          if answer:
              (rcode, value) = isc.config.ccsession.parse_answer(answer)
              self.cc_session.group_reply(
                  env, isc.config.ccsession.create_answer(0))
    
  • You check that the resulting list of statistics is not empty (assert len(xml_list) > 0). However, it should be empty in case you get empty statistics list from the stats module. It is true that there should be some statistics from it, but is it really illegal? And should we crash for assertion in that case?
  • The process will return code 0 (success) even when it catches an exception in main and terminates. That should generate an error.
  • You seem to use regexps in the tests when you really think substrings:
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_NAMESPACE)
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_URL_PATH)
    
    The paths aren't valid regexps, they are just string. Furthermore, if they contain regexp metacharacters, the test would break.
  • Isn't there a spelling error in _clear_ques? Shouldn't it be _clear_queues (used consistently with the mock ccsession).
  • There's a class containing tests only in setUp. That's counter-intuitive and might generate wrong error messages when the test actually fail. It should be moved to some test_something method (the setUp method is optional).
  • You test only the error condition in test_open_template, but not the successful run.
  • Can this assert fail under any circumstances?
    d = dict(_UNKNOWN_KEY_=None)
    assert type(d) is dict
    
  • In test_config, you call it with various configs and check that it doesn't crash and it returns the correct answer. But shouldn't there be a check it did anything „internal“, like it sets any value inside?
  • What is tested in test_no_buildpath? How can the test fail, if there's no self.assert* call? The same with test_osenv (probably older code, but I saw it in the diff as well).
  • You modified the dequeue method in the mock ccsession. But why is there the seq parameter now? It seems to be unused.
  • The get_value seems suspicious. It returns either some spec file default value or default value of a type. But is it able to return any actual configured value? Or why isn't it needed?
  • The mock select has some hidden state which can make it quite surprising, if someone tried to reuse it in some other tests or wrote another test here that used it. Shouldn't the variable be exported and explicitly set in the test using it?

Anyway, I did read the code, but my brain is too limited to check that every function is tested, as the diff is so large. Hopefully you remember, did you write the tests for every function, or should I make the effort, find some piece of paper and writing down what is tested by what test and such?

Thanks

comment:9 in reply to: ↑ 8 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Replying to vorner:
Thank you very much for so deeply reviewing and for giving so many comments! And I'm very sorry for confusing reviewers a lot. So I would like to suggest you this. I'll close this ticket and split it into following 4 tickets.

  • TICKET1: document a manpage about XML/HTTP interface for statistics feature
  • TICKET2: implement a XML/HTTP interface for statistics feature
  • TICKET3: update the existing statistics daemon for the XML/HTTP interface
  • TICKET4: update mockup classes referred from statistics modules

And I'll delete the existing branches on the repository (trac547, trac547-summarized) and recreate new branch and complete history by each ticket. How about that? Can it make sense for you? I'll do that if it is okay for you. Please let me know.

BTW, I will answer to each your comment here.

I fixed some English in the man page, pushed into the -summarised branch.

Thank you. That's good.

  • Every process/module has a separate directory in the bin/. Would the xml stats server deserve it's own directory as well?

Because I think the stats daemon and the HTTP server are grouped in the same category according to the picture in DesignDiagrams, and they almost share the same libraries and the same stub codes, so I put them into the same directory. If it is incorrect, I will separate the directory.

  • Do we expect users to want to know the internal how it works? Like how it is started by boss, etc. Maybe the man page could be less detailed in that sense, and just say that it presents the statistics from stats module trough XML/HTTP?

Yes, the users needs to know details but I don't exactly know what we should add in the man page. What about this? Please show me more suggestion if you have.

For example, The HTTP server gets the following data from the stats daemon
in python dict format. :
{
  "auth.queries.tcp": 1,
  "auth.queries.udp": 1,
  "bind10.boot_time": "2011-04-06T10:52:10Z",
  "report_time": "2011-04-06T10:56:32Z",
  "stats.boot_time": "2011-04-06T10:52:12Z",
  "stats.last_update_time": "2011-04-06T10:56:11Z",
  "stats.lname": "4d9c45dc_c@fdvm",
  "stats.start_time": "2011-04-06T10:52:12Z",
  "stats.timestamp": 1302087392.218623
}
Then the server pushes this to the client in XML format through HTTP. :
  <stats>
  <auth.queries.tcp>0</auth.queries.tcp>
  <auth.queries.udp>0</auth.queries.udp>
  <bind10.boot_time>2011-04-06T10:52:10Z</bind10.boot_time>
  <report_time>2011-04-06T10:55:10Z</report_time>
  <stats.boot_time>2011-04-06T10:52:12Z</stats.boot_time>
  <stats.last_update_time>2011-04-06T10:54:11Z</stats.last_update_time>
  <stats.lname>4d9c45dc_c@fdvm</stats.lname>
  <stats.start_time>2011-04-06T10:52:12Z</stats.start_time>
  <stats.timestamp>1302087310.02</stats.timestamp>
  </stats>
That is, one element in python dict corresponds to one tag in XML.
  • The man page talks about parameters when it talks on which addresses it listens. IMO that's little bit misleading, user might want to put it into the command line after reading it. Maybe call them options or configuration points so it is clear they are not command line arguments?

OK. What about adding the following description into OPTIONS section in the man page?

The verbose mode is only supported in the command-line option of
b10-stats-httpd. You can set the parameters like addresses and ports in HTTP
into the specification file (stats-httpd.spec). Please see "CONFIGURATION AND
COMMANDS" section below for details.
  • The BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}", DATAROOTDIR).replace("${prefix}", PREFIX) is little bit suspicious. Can the BASE_LOCATION really contain strings like '${prefix}'? Where do they come from?

Yes. In my environment, BASE_LOCATION is "${datarootdir}/bind10-devel", DATAROOTDIR is "${prefix}/share" and PREFIX is "/home/kambe/bind10", then ${datarootdir} in BASE_LOCATION is replaced with DATAROOTDIR and ${prefix} in DATAROOTDIR is replaced with "/home/kambe/bind10". That is, BASE_LOCATION contains ${datarootdir} and ${datarootdir} contains ${prefix}. Is that correct?

  • You seem to use dict(key=value) consistently trough the code. Is there any special reason about that? I think the use of {'key':value} is more common in the python world.

I don't have so special reason. The way of dict method is more intuitive for me because we don't need to bracket the key name with quotations. If it is the bad manner, I will correct through the all codes.

  • This might be out of the scope of this task, but is there a way to ask for only part of the data? Like there is for the stats module?

Yes, there is a way, but it is not prepared for the HTTP server. It is only prepared for the stats daemon. You can invoke the show command with the argument in bindctl like this. Is the same way needed for the HTTP server?

> Stats show auth.queries.ucp
{
    "auth.queries.tcp": 1
}
  • Why is this limitation in place?:
    # The size of template file should be under 100k. TODO: should be
    # considered later.
    MAX_SIZE_OF_TEMPLATE = 102400
    
    Allowing arbitrary size should be no problem, since the file is local one and the only one able to change it is the administrator. Why stop him? Reading the whole file by single call without the limitation should be no problem in python (the file object should have something like readAll() or similar).

I don't have so specific reason for it. But I cannot expect the situation which so huge file is read-in. For the unexpected situation, I set such limitation. In almost cases an administrator dosen't need to change this limitation. Do we need to remove it?

  • There is number of assert calls trough the code. While this is generally good thing, asserts for given types of things (eg. assert isinstance(self.server, HttpServer)) is AFAIK explicitly not recommended by python guidelines. The strength of python is that you can put whatever object you like into it as long as it has the needed methods. So, such assert actually make the code reuse harder, because every user needs to inherit of the given checked base class. Example of where it might make trouble are mock classes in testing.

That's right. I will remove assertions like the following in the all codes. Is that okay?

assert isinstance(xxx, yyy)
assert type(xxx) is yyy
  • The server redirects all unknown addresses to the one address for the data. That's quite surprising behaviour, wouldn't it be better to redirect only things like / and index.html, but leave /something_that_does_not_exist to actually return 404? If the user took the work to type some address, he had some intention and we should tell him we don't know the answer for him instead of giving him something else.

That's right, I will remove the codes for redirection.

  • Why does it subscribe to the stats' messages? It does nothing when it receives such message and it asks for the statistics on demand when they are needed. I guess nothing would happen if it didn't subscribe to it, simplifying the code.

OK. I will remove the codes to subscribe/unsubscribe.

  • You sort out duplicate addresses by some set. But if they are problem, shouldn't a configuration with duplicate addresses be rejected instead of silently modifying it?

OK. I will remove the code to sort addresses and ports.

  • If you want to raise the original exception:
    except select.error as err:
        if err.args[0] == errno.EINTR:
        (rfd, wfd, xfd) = ([], [], [])
    else:
        raise select.error(err)
    
    You might want to use just raise without any parameters. That raises the exact instance of exception as it caught in the except, preserving all other information (like errno, which your way doesn't) and the backtrace is more helpful (it doesn't show the note about exception being raised while handling another exception).

OK. I will replace with another exception instead of the original exception(select.error).

  • You use select around the httpd object and when a socket is readable, you call it. But, does it use non-blocking sockets? If there's a client that sends GET url and gets stuck before sending the rest of headers, will our program get stuck as well, trying to read until the end of the headers?

The HTTP server can handle only one request at a time. If the server is trying to read, it will be timed out and the client will be reset later. I think this interface is not so busy like a DNS server, because I think this feature is for administration use only. Is that correct?

  • Aren't you sending answer to answer? I guess only commands should have answers:
    try:
          seq = self.cc_session.group_sendmsg(
              isc.config.ccsession.create_command('show'),
              self.stats_module_name)
          (answer, env) = self.cc_session.group_recvmsg(False, seq)
          if answer:
              (rcode, value) = isc.config.ccsession.parse_answer(answer)
              self.cc_session.group_reply(
                  env, isc.config.ccsession.create_answer(0))
    

That's right, this code to reply is unnecessary. I'll remove it.

  • You check that the resulting list of statistics is not empty (assert len(xml_list) > 0). However, it should be empty in case you get empty statistics list from the stats module. It is true that there should be some statistics from it, but is it really illegal? And should we crash for assertion in that case?

I think xml_list cannot be empty as long as it gets along with the stats daemon, but we consider the case that xml_list become empty as you mentioned. I will remove it.

  • The process will return code 0 (success) even when it catches an exception in main and terminates. That should generate an error.

OK, it will exit with non-zero status code when it catches any exception in the main routine.

  • You seem to use regexps in the tests when you really think substrings:
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_NAMESPACE)
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_URL_PATH)
    
    The paths aren't valid regexps, they are just string. Furthermore, if they contain regexp metacharacters, the test would break.

That's right. I'll use find method in string object for it. What about this?

self.assertTrue(handler.response.body.find(stats_httpd.XSD_NAMESPACE)>0)
self.assertTrue(handler.response.body.find(stats_httpd.XSD_URL_PATH)>0)
  • Isn't there a spelling error in _clear_ques? Shouldn't it be _clear_queues (used consistently with the mock ccsession).

OK. I'll correct it with the proper spelling(_clear_queues).

  • There's a class containing tests only in setUp. That's counter-intuitive and might generate wrong error messages when the test actually fail. It should be moved to some test_something method (the setUp method is optional).

OK. I'll use test_xxx method instead of setUp method in the unittest class which has only setUp method.

  • You test only the error condition in test_open_template, but not the successful run.

I think the successful condition of test_open_template is tested in other test cases, but I'll add the successful one into test_open_template too.

  • Can this assert fail under any circumstances?
    d = dict(_UNKNOWN_KEY_=None)
    assert type(d) is dict
    

No, but I'll remove these assertions as I mentioned above.

  • In test_config, you call it with various configs and check that it doesn't crash and it returns the correct answer. But shouldn't there be a check it did anything „internal“, like it sets any value inside?

OK. I will add tests to check internal values. What about this?

  self.assertEqual(
      self.stats_httpd.config_handler(
                  dict(listen_on=[dict(address="::2",port=8000)])),
      isc.config.ccsession.create_answer(0))
  self.assertTrue(self.stats_httpd.config in "listen_on")
  self.assertTrue(self.stats_httpd.config["listen_on"] in "address")
  self.assertTrue(self.stats_httpd.config["listen_on"] in "port")
  self.assertTrue(self.stats_httpd.config["listen_on"]["address"] == "::2")
  self.assertTrue(self.stats_httpd.config["listen_on"]["port"] == 8000)
  • What is tested in test_no_buildpath? How can the test fail, if there's no self.assert* call? The same with test_osenv (probably older code, but I saw it in the diff as well).

There aren't so specific reason. I intended to let it go through the following code in stats_http.py. It will fail if it has errors in syntax. Do we need to remove test_no_buildpath?

else:
    PREFIX = "@prefix@"
    DATAROOTDIR = "@datarootdir@"
    BASE_LOCATION = "@datadir@" + os.sep + "@PACKAGE@"
    BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}", DATAROOTDIR).replace("${prefix}", PREFIX)
SPECFILE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd.spec"
STATS_SPECFILE_LOCATION = BASE_LOCATION + os.sep + "stats.spec"
XML_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xml.tpl"
XSD_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xsd.tpl"
XSL_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xsl.tpl"
  • You modified the dequeue method in the mock ccsession. But why is there the seq parameter now? It seems to be unused.

Yes, it isn't necessary. I'll remove it.

  • The get_value seems suspicious. It returns either some spec file default value or default value of a type. But is it able to return any actual configured value? Or why isn't it needed?

No, it is not able to return any actual value. The purpose of this get_value is only for getting the default value in the spec file, because this method is called only when stats_http is loading the spec file at first. But the behavior of this method is unlike to the behavior of the original method in the original class. So what about switching the name of it to get_default_value?

  • The mock select has some hidden state which can make it quite surprising, if someone tried to reuse it in some other tests or wrote another test here that used it. Shouldn't the variable be exported and explicitly set in the test using it?

I think these mock classes (select.py, socket.py, etc) are not reusable for other tests, the codes should be more coded if they need to be, and I think these tasks are out of scope of this ticket. Should they be reusable?

comment:10 in reply to: ↑ 9 ; follow-ups: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

Replying to naokikambe:

Thank you very much for so deeply reviewing and for giving so many comments! And I'm very sorry for confusing reviewers a lot. So I would like to suggest you this. I'll close this ticket and split it into following 4 tickets.
And I'll delete the existing branches on the repository (trac547, trac547-summarized) and recreate new branch and complete history by each ticket. How about that? Can it make sense for you? I'll do that if it is okay for you. Please let me know.

No need to. I already went trough the code, so splitting it up now is no longer needed, it would only waste your time.

  • Every process/module has a separate directory in the bin/. Would the xml stats server deserve it's own directory as well?

Because I think the stats daemon and the HTTP server are grouped in the same category according to the picture in DesignDiagrams, and they almost share the same libraries and the same stub codes, so I put them into the same directory. If it is incorrect, I will separate the directory.

Well, with these things, there's no correct or incorrect, it's just how we feel about it. It surprised me, so I asked ‒ if it is on purpose, then it's OK.

  • Do we expect users to want to know the internal how it works? Like how it is started by boss, etc. Maybe the man page could be less detailed in that sense, and just say that it presents the statistics from stats module trough XML/HTTP?

Yes, the users needs to know details but I don't exactly know what we should add in the man page. What about this? Please show me more suggestion if you have.

Why do they need to know?

I mean, if I was a user, I would care about if it runs or not and how to ask it for my web page with statistics. Where or how it gets the data, it's none of my problems. So, why would any other user want to care what is the internal format that goes from one place in the BIND10 system to another? I fear that the man page is too long and confusing users about too much detail.

This information is, of course, valuable as developer documentation. But man page is for the end user.

  • The man page talks about parameters when it talks on which addresses it listens. IMO that's little bit misleading, user might want to put it into the command line after reading it. Maybe call them options or configuration points so it is clear they are not command line arguments?

OK. What about adding the following description into OPTIONS section in the man page?

The verbose mode is only supported in the command-line option of
b10-stats-httpd. You can set the parameters like addresses and ports in HTTP
into the specification file (stats-httpd.spec). Please see "CONFIGURATION AND
COMMANDS" section below for details.

But that's not true. No user edits the .spec file, it's configured trough bind control.

  • The BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}", DATAROOTDIR).replace("${prefix}", PREFIX) is little bit suspicious. Can the BASE_LOCATION really contain strings like '${prefix}'? Where do they come from?

Yes. In my environment, BASE_LOCATION is "${datarootdir}/bind10-devel", DATAROOTDIR is "${prefix}/share" and PREFIX is "/home/kambe/bind10", then ${datarootdir} in BASE_LOCATION is replaced with DATAROOTDIR and ${prefix} in DATAROOTDIR is replaced with "/home/kambe/bind10". That is, BASE_LOCATION contains ${datarootdir} and ${datarootdir} contains ${prefix}. Is that correct?

Hmm, I see, it does so even here. OK, we have a strange build system.

  • You seem to use dict(key=value) consistently trough the code. Is there any special reason about that? I think the use of {'key':value} is more common in the python world.

I don't have so special reason. The way of dict method is more intuitive for me because we don't need to bracket the key name with quotations. If it is the bad manner, I will correct through the all codes.

I don't know, it probably isn't bad. Again, I was just surprised to see it, because I wouldn't write it that way myself, but it's OK if you like it this way more.

  • Why is this limitation in place?:
    # The size of template file should be under 100k. TODO: should be
    # considered later.
    MAX_SIZE_OF_TEMPLATE = 102400
    
    Allowing arbitrary size should be no problem, since the file is local one and the only one able to change it is the administrator. Why stop him? Reading the whole file by single call without the limitation should be no problem in python (the file object should have something like readAll() or similar).

I don't have so specific reason for it. But I cannot expect the situation which so huge file is read-in. For the unexpected situation, I set such limitation. In almost cases an administrator dosen't need to change this limitation. Do we need to remove it?

Well, the limitation probably will not be hit, but there's no technical reason for this exact number. The code and hardware can surely handle much more. So it only makes the code little bit more complicated and there's no benefit for the user (it can, in theory, make him unhappy actually), so I don't see much reason to have the limitation there.

  • If you want to raise the original exception:
    except select.error as err:
        if err.args[0] == errno.EINTR:
        (rfd, wfd, xfd) = ([], [], [])
    else:
        raise select.error(err)
    
    You might want to use just raise without any parameters. That raises the exact instance of exception as it caught in the except, preserving all other information (like errno, which your way doesn't) and the backtrace is more helpful (it doesn't show the note about exception being raised while handling another exception).

OK. I will replace with another exception instead of the original exception(select.error).

I'm not sure if we understand each other here. Was the original intention to catch the sellect.error only in the case it is EINTR, but let it fall when it's different errno? If so, then you don't need to replace any exception, the correct code would be:

    except select.error as err:
        if err.args[0] == errno.EINTR:
        (rfd, wfd, xfd) = ([], [], [])
    else:
        raise

Just raise, without any parameter. It will just re-raise the original, unmodified, exception.

  • You use select around the httpd object and when a socket is readable, you call it. But, does it use non-blocking sockets? If there's a client that sends GET url and gets stuck before sending the rest of headers, will our program get stuck as well, trying to read until the end of the headers?

The HTTP server can handle only one request at a time. If the server is trying to read, it will be timed out and the client will be reset later. I think this interface is not so busy like a DNS server, because I think this feature is for administration use only. Is that correct?

If you are aware of it and think it's not so big problem to complicate the code, then OK. But it would deserve a FIXME comment in the code at last, to note that the issue is known.

  • You seem to use regexps in the tests when you really think substrings:
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_NAMESPACE)
    self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_URL_PATH)
    
    The paths aren't valid regexps, they are just string. Furthermore, if they contain regexp metacharacters, the test would break.

That's right. I'll use find method in string object for it. What about this?

self.assertTrue(handler.response.body.find(stats_httpd.XSD_NAMESPACE)>0)
self.assertTrue(handler.response.body.find(stats_httpd.XSD_URL_PATH)>0)

Yes, that seems better.

No, but I'll remove these assertions as I mentioned above.

  • In test_config, you call it with various configs and check that it doesn't crash and it returns the correct answer. But shouldn't there be a check it did anything „internal“, like it sets any value inside?

OK. I will add tests to check internal values. What about this?

  self.assertEqual(
      self.stats_httpd.config_handler(
                  dict(listen_on=[dict(address="::2",port=8000)])),
      isc.config.ccsession.create_answer(0))
  self.assertTrue(self.stats_httpd.config in "listen_on")
  self.assertTrue(self.stats_httpd.config["listen_on"] in "address")
  self.assertTrue(self.stats_httpd.config["listen_on"] in "port")
  self.assertTrue(self.stats_httpd.config["listen_on"]["address"] == "::2")
  self.assertTrue(self.stats_httpd.config["listen_on"]["port"] == 8000)

I don't really remember the exact internal layout, but yes, I mean something like this.

  • What is tested in test_no_buildpath? How can the test fail, if there's no self.assert* call? The same with test_osenv (probably older code, but I saw it in the diff as well).

There aren't so specific reason. I intended to let it go through the following code in stats_http.py. It will fail if it has errors in syntax. Do we need to remove test_no_buildpath?

No need to remove it, but a comment about it would help, it looks like part of the code might be missing or something.

  • The get_value seems suspicious. It returns either some spec file default value or default value of a type. But is it able to return any actual configured value? Or why isn't it needed?

No, it is not able to return any actual value. The purpose of this get_value is only for getting the default value in the spec file, because this method is called only when stats_http is loading the spec file at first. But the behavior of this method is unlike to the behavior of the original method in the original class. So what about switching the name of it to get_default_value?

OK, seems reasonable.

  • The mock select has some hidden state which can make it quite surprising, if someone tried to reuse it in some other tests or wrote another test here that used it. Shouldn't the variable be exported and explicitly set in the test using it?

I think these mock classes (select.py, socket.py, etc) are not reusable for other tests, the codes should be more coded if they need to be, and I think these tasks are out of scope of this ticket. Should they be reusable?

They don't really need to be reusable, but again, a comment with warning about it might prevent someone from trying to reuse it and failing.

OK to anything I left out from direct answer.

Thanks

comment:11 in reply to: ↑ 10 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Can I ask you more?
Replying to vorner:

No need to. I already went trough the code, so splitting it up now is no longer needed, it would only waste your time.

OK. Thank you!

Well, with these things, there's no correct or incorrect, it's just how we feel about it. It surprised me, so I asked ‒ if it is on purpose, then it's OK.

OK, we put it in the same directory, but we will split it if needed in future.

I mean, if I was a user, I would care about if it runs or not and how to ask it for my web page with statistics. Where or how it gets the data, it's none of my problems. So, why would any other user want to care what is the internal format that goes from one place in the BIND10 system to another? I fear that the man page is too long and confusing users about too much detail.

This information is, of course, valuable as developer documentation. But man page is for the end user.

I understood that so detail isn't needed for the most end users. I'll change nothing.

But that's not true. No user edits the .spec file, it's configured trough bind control.

Further, for avoiding such end-users' misleading about command-line options, do we need to append the following lines into the CONFIGURATION AND COMMANDS section in the manpage?

You can change the default setting through bindctl. You cannot change it with command-line
options. Please see also bindctl(1) for how to change.

Well, the limitation probably will not be hit, but there's no technical reason for this exact number. The code and hardware can surely handle much more. So it only makes the code little bit more complicated and there's no benefit for the user (it can, in theory, make him unhappy actually), so I don't see much reason to have the limitation there.

OK. I will remove that.

Just raise, without any parameter. It will just re-raise the original, unmodified, exception.

That's right. I'll change it like your suggestion.

If you are aware of it and think it's not so big problem to complicate the code, then OK. But it would deserve a FIXME comment in the code at last, to note that the issue is known.

OK. I will append a FIXME comment around the while loop in the code.

They don't really need to be reusable, but again, a comment with warning about it might prevent someone from trying to reuse it and failing.

I will write some notes on these mock-up codes.

Regards,

comment:12 in reply to: ↑ 11 ; follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

Replying to naokikambe:

Can I ask you more?

Of course.

But, is it correct that there's still no new code in the repository? I'm just making sure I don't look at the wrong place or something like that.

Replying to vorner:

But that's not true. No user edits the .spec file, it's configured trough bind control.

Further, for avoiding such end-users' misleading about command-line options, do we need to append the following lines into the CONFIGURATION AND COMMANDS section in the manpage?

You can change the default setting through bindctl. You cannot change it with command-line
options. Please see also bindctl(1) for how to change.

The „You can change the settings trough bindctl (8)“ could be added to „CONFIGURATION AND COMMAND“ section, yes. The other statement probably isn't needed.

Actually, I was originally worried about the sentence in „DESCRIPTION“ („The parameter of the server is a list of pairs…“). It would help to change „parameter“ to „option“ or just remove that sentence completely ‒ the configuration is described below anyway, so it's not missing and it's not really needed for explaining what the thing does.

With regards

comment:13 in reply to: ↑ 12 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Replying to vorner:

But, is it correct that there's still no new code in the repository? I'm just making sure I don't look at the wrong place or something like that.

It's correct but I have not pushed the codes yet. Please wait a bit longer. I wish to do until tomorrow.

The „You can change the settings trough bindctl (8)“ could be added to „CONFIGURATION AND COMMAND“ section, yes. The other statement probably isn't needed.

Thank you for suggestion. I simply add that sentence.

Actually, I was originally worried about the sentence in „DESCRIPTION“ („The parameter of the server is a list of pairs…“). It would help to change „parameter“ to „option“ or just remove that sentence completely ‒ the configuration is described below anyway, so it's not missing and it's not really needed for explaining what the thing does.

OK. I'll remove the sentence.

Regards,

comment:14 in reply to: ↑ 13 Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Replying to naokikambe:

Replying to vorner:

But, is it correct that there's still no new code in the repository? I'm just making sure I don't look at the wrong place or something like that.

It's correct but I have not pushed the codes yet. Please wait a bit longer. I wish to do until tomorrow.

OK, no hurry. Just tell me when it's there.

comment:15 in reply to: ↑ 10 ; follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Sorry for my late response. I completed pushing the all revised codes, however I added changes which are likely different from your intention. Please check these codes anyway.

Replying to vorner:

  • Why does it subscribe to the stats' messages? It does nothing when it receives such message and it asks for the statistics on demand when they are needed. I guess nothing would happen if it didn't subscribe to it, simplifying the code.

Sorry, I couldn't find the corresponding part in the codes as you mentioned. Could you point it out a bit more exactly again?

Replying to vorner:

  • The get_value seems suspicious. It returns either some spec file default value or default value of a type. But is it able to return any actual configured value? Or why isn't it needed?

No, it is not able to return any actual value. The purpose of this get_value is only for getting the default value in the spec file, because this method is called only when stats_http is loading the spec file at first. But the behavior of this method is unlike to the behavior of the original method in the original class. So what about switching the name of it to get_default_value?

OK, seems reasonable.

I couldn't rename the the method, because there isn't the corresponding method in the original class. (I went through the original class more properly, I found it didn't correspond to 'get_default_value' in the original class.) I removed some different parts from the original method, and added some notes, in stead of renaming the method.

Are these changes okay for you? Please let me know.

Regards,

comment:16 in reply to: ↑ 15 ; follow-up: Changed 9 years ago by vorner

  • Defect Severity set to N/A
  • Owner changed from vorner to naokikambe
  • Sub-Project set to DNS

Good morning

Replying to naokikambe:

Sorry for my late response. I completed pushing the all revised codes, however I added changes which are likely different from your intention. Please check these codes anyway.

They seem mostly OK, with few minor points below.

Replying to vorner:

  • Why does it subscribe to the stats' messages? It does nothing when it receives such message and it asks for the statistics on demand when they are needed. I guess nothing would happen if it didn't subscribe to it, simplifying the code.

Sorry, I couldn't find the corresponding part in the codes as you mentioned. Could you point it out a bit more exactly again?

In open_mccs and close_mccs. It loads STATS specification and subscribes to the stats module. But maybe it should read and subscribe to the stats-http? Did you test the server in reality, looking at it from browser, configuring it, etc? Or, should I do it? This looks strange to me, looking at it again, because I don't see any place where it would subscribe to it's own module (but maybe I only look at the wrong place right now).

Replying to vorner:

OK, seems reasonable.

I couldn't rename the the method, because there isn't the corresponding method in the original class. (I went through the original class more properly, I found it didn't correspond to 'get_default_value' in the original class.) I removed some different parts from the original method, and added some notes, in stead of renaming the method.

OK, I see. That's a mock class and it should have the same method as original, right?

Are these changes okay for you? Please let me know.

Only two small things I noticed now:

With the open_template test, you run them in the „successful“ way now. But you don't check the result in any way, function that would only open the file, but didn't read anything would pass the test the same way this does.

You removed all redirects. My point was, if I type http://localhost:8000/some/address , I don't expect the main page. But if I type http://localhost:8000/ , I do. Having a 404 on / is as surprising as having the index page on some strange address. Maybe having the redirect only when the address is /, but not when it's something different? Or provide the page right away in that case?

I pushed one small comment update.

With regards.

comment:17 in reply to: ↑ 16 Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to vorner

Replying to vorner:

In open_mccs and close_mccs. It loads STATS specification and subscribes to the stats module. But maybe it should read and subscribe to the stats-http? Did you test the server in reality, looking at it from browser, configuring it, etc? Or, should I do it? This looks strange to me, looking at it again, because I don't see any place where it would subscribe to it's own module (but maybe I only look at the wrong place right now).

I just missed such the part of the codes. I had removed the part as you pointed out before. Please see the new code. Of course, I did test in reality with the web browser.

That's a mock class and it should have the same method as original, right?

Yes, that's right.

With the open_template test, you run them in the „successful“ way now. But you don't check the result in any way, function that would only open the file, but didn't read anything would pass the test the same way this does.

I added tests checking internal values.

You removed all redirects. My point was, if I type http://localhost:8000/some/address , I don't expect the main page. But if I type http://localhost:8000/ , I do. Having a 404 on / is as surprising as having the index page on some strange address. Maybe having the redirect only when the address is /, but not when it's something different? Or provide the page right away in that case?

I reverted the part of the redirection in case of request with '/'.

I pushed one small comment update.

Thank you very much!

I had already pushed, please see the new codes.

Regards,

comment:18 follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to naokikambe

Hello

It looks OK now. Please merge.

With regards

comment:19 in reply to: ↑ 18 Changed 9 years ago by naokikambe

  • Resolution set to fixed
  • Status changed from reviewing to closed

Replying to vorner:

Thank you very much for long reviewing! I completed merging and I'm closing the ticket.

Regards,

Note: See TracTickets for help on using tickets.