Opened 7 years ago

Closed 7 years ago

#2298 closed defect (fixed)

Xfrout/zones and XML stats

Reported by: jreed Owned by: naokikambe
Priority: medium Milestone: Sprint-20121120
Component: statistics Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: Core Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 2 Internal?: no

Description

See http://n10.isc.org:8000/bind10/statistics/xml/Xfrout/zones

The "zones" URI doesn't match the "Zone names" item name.

The data is not in XML. If there were many zones served by Xfrout, then this list could be long and confusing. Please use XML output for all of this. Please consider allowing access to any entry, such as:

http://n10.isc.org:8000/bind10/statistics/xml/Xfrout/zones/_SERVER_/xfrreqdone to get a value.

Subtickets

Attachments (2)

bind10-statistics.xml (2.1 KB) - added by naokikambe 7 years ago.
bind10-statistics.xsl (1.3 KB) - added by naokikambe 7 years ago.

Download all attachments as: .zip

Change History (24)

comment:1 Changed 7 years ago by naokikambe

XML by the current b10-stats-httpd has to be fixed when the level of JSON dictionary which b10-stats provides via cc-session becomes deeper. This is because the current XML structure is too complicated, I think. We need to make it simpler for improving basically this issue.

Changed 7 years ago by naokikambe

Changed 7 years ago by naokikambe

comment:2 Changed 7 years ago by naokikambe

This is an example for proposal for improvement of XML display. XML and XSL can be opened by a browser after downloading to local. This XML is an example output in case opening "Xfrout/zones" item. Possible items are listed in directory format. Specific item set can be displayed by clicking a link.

Could anyone have comments? I'd like to proceed to fix this if there is no problem in this style.

comment:3 Changed 7 years ago by jinmei

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:4 Changed 7 years ago by naokikambe

  • Owner set to UnAssigned
  • Status changed from new to reviewing

The branch is ready for reviewing.
I've slightly changed an xml format from the above proposed one. Item data of statistics are included in xml attributes, not in xml elements. This is intended for reducing the amount of xml data. I've introduced xml validation check by using the lxml module into the unittest, which is a third-party python module. However that check will be skipped if the module is unavailable on the system. I've also included a fix for #2048 (configure check for pyexpat) in the branch.
A proposed ChangeLog entry is:

nnn.    [bug]*          naokikambe
        Fixed an XML format viewed from b10-stats-httpd. Regarding per-zone
        counters as zones of Xfrout, a part of the item values wasn't an exact
        XML format. A zone name can be specified in URI as
        /bind10/statistics/xml/Xfrout/zones/example.org/xfrreqdone. XSD and XSL
        formats are also changed to constant ones due to these changes.
        (Trac #2298, git tbd)

comment:5 Changed 7 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20121106

comment:6 follow-up: Changed 7 years ago by jinmei

From a quick look, the branch is pretty big, changing many things.
Can't it be divided into multiple stages?

I'm not sure about others, but I'm afraid this would look too scary
that no one will be willing to take it in its current form...

comment:7 in reply to: ↑ 6 ; follow-up: Changed 7 years ago by naokikambe

Replying to jinmei:

From a quick look, the branch is pretty big, changing many things.
Can't it be divided into multiple stages?

I had to revise the core part of the code for fixing this issue. Handling URI of XML was changed. XSD and XSL wasn't dynamically produced. The revised code depended on many other parts of the codes. This issue is that JSON is appeared in XML. So validating XML data become more strict in unittest. And some dirty codes was cleaned. That's why the change became big. There are 27 commits in the branch. Probably this ticket was too big to be fixed.

Anyway I tried to group the related changes as the following. Can we review codes by group?

  • Handling the uri requested via HTTP
    0d35b9 [2298] modify handling the accepted URI
    69f3f5 [2298] append a slash at the end of the URI
    a86eeb [2298] update HTTP status codes
    
  • Handling statistics data returned from the stats daemon and converting to an xml format
    0a41ff [2298] modify the xml_handler() method
    b1282d [2298] modify test_do_GET() and check_XML_URL_PATH()
    57e21d [2298] update test_xml_handler()
    
  • Revising an xsd format
    7a2a76 [2298] modify the xsd template file
    351a2c [2298] modify the xsd_handler() method
    d663e8 [2298] modify test_do_GET() and check_XSD_URL_PATH()
    bc4bdd [2298] modify test_xsd_handler()
    
  • Revising an xsl format
    bba782 [2298] modify the xsl template file
    0e0714 [2298] modify test_xsl_handler()
    5286dd [2298] modify test_do_GET() and check_XSL_URL_PATH()
    3c04bb [2298] modify test_xsl_handler()
    
  • adding a helper method
    e44e9c [2298] add a method item_name_list()
    68df7d [2298] add constants of XML namespaces for using in tests
    
  • misc changes in b10-stats-httpd
    3d17e6 [2298] modify consts related to XML, XSD, and XSL
    65a0f4 [2298] return a 500 status code when catching any exception
    456e69 [2298] append a Last-Modified header
    7e6faa [2298] modify the open_template() method
    
  • misc changes in unittest
    d4a9ae [2298] add CONST_BASETIME and use it in MockBoss and MyStats classes
    2f26b1 [2298] modify DUMMY_DATA
    f44c42 [2298] use assertGreater() rather than assertTrue()
    d7b7c2 [2298] add a unittest using XML validation with XSD
    753b37 [2298] skip test_do_GET() if XMLParser() is unavailable
    30cf85 [2298] add constants of XML namespaces for using in tests
    
  • updating copyrights
    d23d54 [2298] add copyright and update copyrights
    

comment:8 in reply to: ↑ 7 ; follow-up: Changed 7 years ago by jinmei

Replying to naokikambe:

From a quick look, the branch is pretty big, changing many things.
Can't it be divided into multiple stages?

I had to revise the core part of the code for fixing this issue. Handling URI of XML was changed. XSD and XSL wasn't dynamically produced. The revised code depended on many other parts of the codes. This issue is that JSON is appeared in XML. So validating XML data become more strict in unittest. And some dirty codes was cleaned. That's why the change became big. There are 27 commits in the branch. Probably this ticket was too big to be fixed.

Anyway I tried to group the related changes as the following. Can we review codes by group?

Perhaps (thanks for the grouping). If I end up reviewing this ticket,
I would like to see a rearranged branch that consist of the grouped
commits that can be followed step by step, since I'm not a guy whose
brain stack is very deep.

I experimentally created a single branch from the branch point of
trac2298 and then cherry-picked the following three commits in that
order:

  • Handling the uri requested via HTTP
    0d35b9 [2298] modify handling the accepted URI
    69f3f5 [2298] append a slash at the end of the URI
    a86eeb [2298] update HTTP status codes
    

These cleanly merged but tests didn't pass. What I would like to see
is a newly created branch like this and each group is "completed" in
that it (if any .cc/.h involved) compiles and passes all tests. Or,
even preferably, we separate this task into 2-3 subtasks and we review
each subtask one by one. If you're willing to make such changes, that
would be highly appreciated whoever ends up taking on the review.

But, perhaps someone else has a deeper brain capacity than me and may
be able to review the whole branch at once. If you think it's the
case, I'd simply leave the review to that person.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 7 years ago by naokikambe

  • Owner changed from UnAssigned to naokikambe

Thank you for the suggestion. OK, I'll try to split the branch into multiple small ones. Some new code might be needed for splitting and it might be discarded after merging. But I believe that's important work for reasonable reviewing.

comment:10 in reply to: ↑ 9 Changed 7 years ago by naokikambe

  • Owner changed from naokikambe to UnAssigned

Replying to naokikambe:

OK, I'll try to split the branch into multiple small ones.

I've split the branch into the following 7 small branches. They all are now for ready for reviewing.

I've added new code for making the unit test passed inside each small branch after splitting. As a result, total number of commits were increased to 39. But total differences are not changed before being split.

Split small branches are trac2298_1, trac2298_2 .. and trac2298_7. There is a dependency between branches. So we need to start reviewing from the small one trac2298_1. The last one trac2298_7 has all changes for this ticket. That is, there is no differences between the not split one trac2298 and trac2298_7.

The split branches are (the duplicate commits are omitted) :

trac2298_1: adding a helper method

56da3a8 [2298] update copyrights
62c09c1 [2298] add a method item_name_list()
17fb05b [2298] add tests for item_name_list()

trac2298_2: Handling statistics data returned from the stats daemon and converting to an xml format

a675312 [2298] modify the xml_handler() method
d9ea05e [2298] modify test_do_GET() and check_XML_URL_PATH()
6ae5e49 [2298] update test_xml_handler()
1601f31 [2298] add some consts (xml root element and attributes)
e5b2478 [2298] add constants of XML namespaces for using in tests

trac2298_3: Revising an xsd format

ac274af [2298] modify the xsd template file
0434072 [2298] modify the xsd_handler() method
65a6e71 [2298] modify test_do_GET() and check_XSD_URL_PATH()
b9fd968 [2298] modify test_xsd_handler()
a9c0e9a [2298] rename the parameter name for template
f680979 [2298] update HTTP status codes due to the change of xml_handler()

trac2298_4: Revising an xsl format

68be400 [2298] modify the xsl template file
5d7d2c8 [2298] modify test_xsl_handler()
f504c0e [2298] modify test_do_GET() and check_XSL_URL_PATH()
ff27bd0 [2298] modify test_xsl_handler()
8fd3d74 [2298] update HTTP status codes due to the change of xsl_handler()
4e1f5f6 [2298] rename the parameter name for template

trac2298_5: Handling the uri requested via HTTP

f1cef3c [2298] modify consts related to XML, XSD, and XSL
3dd20b5 [2298] modify handling the accepted URI
36ccf2e [2298] append a slash at the end of the URI
ca4dcdb [2298] add a copyright
3968d20 [2298] add CONST_BASETIME and use it in MockBoss and MyStats classes
3cf06ca [2298] modify DUMMY_DATA
0e2e8ca [2298] modify the xml_handler() method
ab359d1 [2298] modify the xsd_handler() method
746fd5c [2298] modify the xsl_handler() method
3c4b62a [2298] modify check_XML_URL_PATH() in test_do_GET()
f74756d [2298] revert checks of http status codes when requesting XSD URI with some path
7428368 [2298] revert checks of http status codes when requesting XSL URI with some path

trac2298_6: misc changes in b10-stats-httpd

0a2d254 [2298] return a 500 status code when catching any exception
05314a2 [2298] append a Last-Modified header
4c1b039 [2298] modify the open_template() method
86dbecd [2298] assert IOError raising instead of StatsHttpdDataError

trac2298_7: misc changes in unittest

1185fc6 [2298] use assertGreater() rather than assertTrue()
d3e6af5 [2298] add a unittest using XML validation with XSD
8309d89 [2298] skip test_do_GET() if XMLParser() is unavailable

comment:11 follow-up: Changed 7 years ago by jinmei

I'm not still sure if I can reasonably review this branch, but since
no one has picked up, I'll give it a try...

commits 56da3a8^..17fb05b

stats_httpd.py

  • it's difficult to understand what this function does just from the description. What's "list of items"? (and what are "items"?) What does traverse mean? By which it's "sorted"? Also some examples may help. Some in-code comments also help. I'd like to see description about possible exceptions, too.
  • why do we need to call str() if identifier is of string?
        if identifier and len(str(identifier)) > 0:
            ident = str(identifier)
            ret.append(ident)
    
  • if we know identifier is a string can't we skip the 'and len(...'?
        if identifier and len(str(identifier)) > 0:
    
  • further, since we replace ident with identifier if the latter is not empty, can't we simply do this?
        ident = identifier
        if ident:
            ret.append(ident)
    
  • why use len() if ident is a string?
            if len(ident) > 0:
    
  • this seems to be redundant
                ident = '%s/' % ident
    
    I'd do it:
                ident += '/'
    
  • is it okay not to consider the case where ident is empty here?
                idstr = '%s[%s]' % (ident, i)
    
    also, while minor, it seems to me more natural if we use '%d' for the second '%s' because i is an integer.

stats_httpd_test.py

It's not really clear specifically which point each assertXXX tries to
test. Is this a set of necessary and sufficient tests? (i.e., isn't
there an uncovered scenario? or isn't there a redundant case?). And,
in any case, I'd like to see brief comments on the purpose of the test
for each (major) case.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 7 years ago by naokikambe

Replying to jinmei:
Thank you for reviewing. But I'm not sure whether you are a real reviewer for this ticket. So I left it 'UnAssigned'. If this is wrong, please assign yourself.

stats_httpd.py

  • it's difficult to understand what this function does just from the description. What's "list of items"? (and what are "items"?) What does traverse mean? By which it's "sorted"? Also some examples may help. Some in-code comments also help. I'd like to see description about possible exceptions, too.

I updated the docstring at

5fdb1fa [2298] update the docstring of item_name_list()
  • why do we need to call str() if identifier is of string?
        if identifier and len(str(identifier)) > 0:
            ident = str(identifier)
            ret.append(ident)
    
  • if we know identifier is a string can't we skip the 'and len(...'?
        if identifier and len(str(identifier)) > 0:
    
  • further, since we replace ident with identifier if the latter is not empty, can't we simply do this?
        ident = identifier
        if ident:
            ret.append(ident)
    
  • why use len() if ident is a string?
            if len(ident) > 0:
    
  • this seems to be redundant
                ident = '%s/' % ident
    
    I'd do it:
                ident += '/'
    

After I considered that well, I found it was ok to assume it to be a string. For the above comments, I revised the code at

3d0e3ec [2298] It was ok to assume the argument identifier to be a string
  • is it okay not to consider the case where ident is empty here?
                idstr = '%s[%s]' % (ident, i)
    
    also, while minor, it seems to me more natural if we use '%d' for the second '%s' because i is an integer.

According to your comment, I revised the code at

aa38883 [2298] use '%d' in formatting an integer instead of '%s'

stats_httpd_test.py

It's not really clear specifically which point each assertXXX tries to
test. Is this a set of necessary and sufficient tests? (i.e., isn't
there an uncovered scenario? or isn't there a redundant case?). And,
in any case, I'd like to see brief comments on the purpose of the test
for each (major) case.

I added comments for the tests at:

c0b0bf7 [2298] add comments in test_item_name_list()

BTW, I changed the way to sort the key names in element which is in the first argument.
Before revising, returned value was sorted at the end of the method. However, idexes were also sorted as strings. For example, regarding a 10-index list, ['a[0]', 'a[1]', .. 'a[10]'] are wrongly soterd to ['a[0]', 'a[1]', 'a[10]', .. 'a[9]']. So I fixed that at

bd42224 [2298] add a test for sorting
f3c3c9b [2298] in case of dict change the way to sort by key name

Regards,

comment:13 in reply to: ↑ 12 Changed 7 years ago by jinmei

Replying to naokikambe:

Thank you for reviewing. But I'm not sure whether you are a real reviewer for this ticket. So I left it 'UnAssigned'. If this is wrong, please assign yourself.

I noticed some minor things, but this chunk seems okay.

The minor comments are:

  • This can be simplified:
            keys = list(elem.keys())
            keys.sort()
            for key in keys:
    
    to
            for key in sorted(elem.keys()):
    
  • probably not a issue of this branch, but these two cases are bit awkward:
            # for specifying empty strings in element and identifier
            self.assertEqual([],
                             stats_httpd.item_name_list('', ''))
            # for specifying wrong element, which is an non-empty string,
            # and an non-empty string in identifier
            self.assertRaises(isc.cc.data.DataTypeError,
                             stats_httpd.item_name_list, 'a', 'a')
    
    because the first parameter it not a dict but it somehow succeeds.
  • related to the previous point:
  • is it okay not to consider the case where ident is empty here?
                idstr = '%s[%s]' % (ident, i)
    

Actually it can happen as in this test:

        self.assertRaises(isc.cc.data.DataTypeError,
                         stats_httpd.item_name_list, [1,2,3], '')

and how it fails is not so trivial. It fails in the recursive call
to item_name_list() because now the 2nd parameter to find() is not
empty. I don't know if this means we need to do something in this
branch, though.

comment:14 follow-up: Changed 7 years ago by jinmei

...and I've spent some time to try to understand the second chunk of
the diff (starting from a675312), but it seems to be beyond my
capability of comprehension. The before-after diff is so big, and
tests are not very direct (meaning, not just passing some basic type
to the method and check the result using assertEqual), so I couldn't
even understand what this diff tries to do at the higher level, much
less being confident it's safe.

At this point I think I have to give up reviewing this branch. Could
you find someone else for the rest of the review?

comment:15 Changed 7 years ago by jinmei

  • Owner changed from UnAssigned to naokikambe

comment:16 Changed 7 years ago by jinmei

  • Total Hours changed from 0 to 2

comment:17 in reply to: ↑ 14 Changed 7 years ago by naokikambe

Replying to jinmei:

Replying to naokikambe:

Thank you for reviewing. But I'm not sure whether you are a real reviewer for this ticket. So I left it 'UnAssigned'. If this is wrong, please assign yourself.

I noticed some minor things, but this chunk seems okay.

The minor comments are:

  • This can be simplified:
            keys = list(elem.keys())
            keys.sort()
            for key in keys:
    
    to
            for key in sorted(elem.keys()):
    

Thank you for suggestion. I didn't know that way. I fixed at:

27cb11a [2298_1] use sorted() instead of using sort() redundantly
  • probably not a issue of this branch, but these two cases are bit awkward:
            # for specifying empty strings in element and identifier
            self.assertEqual([],
                             stats_httpd.item_name_list('', ''))
            # for specifying wrong element, which is an non-empty string,
            # and an non-empty string in identifier
            self.assertRaises(isc.cc.data.DataTypeError,
                             stats_httpd.item_name_list, 'a', 'a')
    
    because the first parameter it not a dict but it somehow succeeds.

This behavior depends on isc.cc.data.find(). As the following, this method always returns the first arg if the second arg is an empty string. However it raises DataTypeError unless the first arg is dict and unless the second arg is an empy string.

def find(element, identifier):
...
    if identifier == "":
        return element
    if type(element) != dict:
        raise DataTypeError("element in find() is not a dict")
...
  • related to the previous point:
  • is it okay not to consider the case where ident is empty here?
                idstr = '%s[%s]' % (ident, i)
    

Actually it can happen as in this test:

        self.assertRaises(isc.cc.data.DataTypeError,
                         stats_httpd.item_name_list, [1,2,3], '')

and how it fails is not so trivial. It fails in the recursive call
to item_name_list() because now the 2nd parameter to find() is not
empty. I don't know if this means we need to do something in this
branch, though.

(Sorry, I forgot to reply.) Yes, but it's no problem in stats_httpd if this case fails.
The top level data type should be dict. For example, this test is passed:

        self.assertEqual(['a', 'a[0]', 'a[1]', 'a[2]'],
                         stats_httpd.item_name_list({'a':[1,2,3]}, ''))

Replying to jinmei:

...and I've spent some time to try to understand the second chunk of
the diff (starting from a675312), but it seems to be beyond my
capability of comprehension. The before-after diff is so big, and
tests are not very direct (meaning, not just passing some basic type
to the method and check the result using assertEqual), so I couldn't
even understand what this diff tries to do at the higher level, much
less being confident it's safe.

In the second chunk, before-diff doesn't exactly match after-diff. xml_handler() is same name but the content is wholy different. Sorry for the confusion.

At this point I think I have to give up reviewing this branch. Could
you find someone else for the rest of the review?

Thank you for reviewing! I'll look for another reviewer.

Regards,

comment:18 follow-up: Changed 7 years ago by jelte

Very general, and unrelated note: b10-statshttpd prints GET requests to stderr directly. This should be done through the logging api (perhaps by making a tiny class that acts as a stream that can be printed to, if the source of these messages is the python http module)

Probably because of the above, make check prints a lot of mostly uninteresting crap (at least, when the tests pass :p) and it may be non-obvious that it is skipping some if you don't have python3-lxml installed.

Anyway, this is most likely something for a different ticket.

on to the branch itself: I have been looking at the diffs between commits a67531278a9b431fa64816274477a1e41d64f9e9 and f5caa308b4534eb70cd5996c2e809a36ac36ee6d (in trac2298_7)

In stats_httpd.py.in:

send_head() (lines 124+):

the code could use a few comments as to what it is doing here, I see a lot of path modifications, but it is not directly clear what is being modified and why (in the first case of the if at line 125)

open_template() (lines 553+):

        lines = None
        f = None
        try:
            f = open(file_name, 'r')
            lines = "".join(f.readlines())
        except IOError as err:
            raise StatsHttpdDataError(
                "%s: %s" % (err.__class__.__name__, err))
        finally:
            if f: f.close()

this can be done slightly more straightforward with a 'with'-statement:

        try:
            with open(file_name, 'r') as f:
                lines = "".join(f.readlines())
        except IOError as err:
            raise StatsHttpdDataError(
                "%s: %s" % (err.__class__.__name__, err))

b10-stats-httpd_test.py:

in test_do_GET's check_XML_URL_PATH (lines 256+):

Not entirely sure if root.find('item') is supposed to ever return anything, but in the current set of tests it is always an empty iterator, and the checks in that final loop are never reached.
(I was looking to see why the check at line 260 is an assertFalse, not an assertTrue)

comment:19 in reply to: ↑ 18 Changed 7 years ago by naokikambe

Hello,

Thank you for reviewing.

Replying to jelte:

Very general, and unrelated note: b10-statshttpd prints GET requests to stderr directly. This should be done through the logging api (perhaps by making a tiny class that acts as a stream that can be printed to, if the source of these messages is the python http module)

Probably because of the above, make check prints a lot of mostly uninteresting crap (at least, when the tests pass :p) and it may be non-obvious that it is skipping some if you don't have python3-lxml installed.

Yes, I know that, but I believe this issue would be resolved on #1897. So I'd like to separate this issue from this ticket. This ticket is already so big:(.

send_head() (lines 124+):

the code could use a few comments as to what it is doing here, I see a lot of path modifications, but it is not directly clear what is being modified and why (in the first case of the if at line 125)

I've added comments related to the path preparation at

71410ff [2298_7] add an example of the path in docstring of xml_handler()
d54f479 [2298_7] add comments about preparation of the requested path, regarding the condition that it is started with X
f0ab6da [2298_7] add description about evaluating the requested path in each if-condition

open_template() (lines 553+):

...

this can be done slightly more straightforward with a 'with'-statement:

        try:
            with open(file_name, 'r') as f:
                lines = "".join(f.readlines())
        except IOError as err:
            raise StatsHttpdDataError(
                "%s: %s" % (err.__class__.__name__, err))

Thank you, I've revised as you commented.

932dcb6 [2298_7] use with-statement instead of try-except-finally statement


b10-stats-httpd_test.py:

in test_do_GET's check_XML_URL_PATH (lines 256+):

Not entirely sure if root.find('item') is supposed to ever return anything, but in the current set of tests it is always an empty iterator, and the checks in that final loop are never reached.
(I was looking to see why the check at line 260 is an assertFalse, not an assertTrue)

It was an empty iteration. I missed. I removed find(). I think it would be run. And I've added comments about the tests in the iteration.

3e9d625 [2298_7] add comments about checking each 'item' element under the root element
3097176 [2298_7] remove find() so that each element 'item' under the root element is checked (actually this test wasn't

Regards,

comment:20 Changed 7 years ago by naokikambe

  • Owner changed from naokikambe to jelte

I forgot to reassign.

comment:21 follow-up: Changed 7 years ago by jelte

  • Owner changed from jelte to naokikambe

Ah, yes, the logging thing should be addressed in #1897.

Changes look good.

comment:22 in reply to: ↑ 21 Changed 7 years ago by naokikambe

  • Resolution set to fixed
  • Status changed from reviewing to closed

Replying to jelte:

Changes look good.

Thank you for reviewing. I've merged trac2298_1 and trac2298_7 into master. I'm closing this ticket.

Ah, yes, the logging thing should be addressed in #1897.

I attached a proposed patch on #1897.

Note: See TracTickets for help on using tickets.