Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#2062 closed task (fixed)

bind10-showtech initial version

Reported by: shane Owned by: muks
Priority: medium Milestone: Sprint-20120703
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: Core Feature Depending on Ticket: bind10-showtech
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 3.63 Internal?: no

Description

We need to build an initial bind10-showtech utility. A proposal about the contents is on the ShowTech page.

This could be a Python script, or a shell script, as appropriate. Much of the output will be unavoidably system dependent, however some can and should come from standard BIND 10 installation (for example, we need to put config.log in a place where it can be found after installation).

ISC's support was asked for the most important information in a DHCP context, and have said the most important information is:

  • OS and version
  • dhcpd.conf file (or BIND 10 configuration in our case)
  • Lease file (zone contents in our case)
  • Log information
  • List of network interfaces
  • core file (if it crashed)

Since configuration and zone contents can both be quite big, probably we can leave those out of the initial version. Perhaps these can be added later with something like "bind10-showtech --with-zone-contents" and "bind10-showtech --with-full-config".

Subtickets

Change History (17)

comment:1 Changed 7 years ago by jinmei

I suggest for the initial version we only have some small number
of items (with the base framework) so the development and review
are "bitable".

comment:2 Changed 7 years ago by shane

  • Estimated Difficulty changed from 0 to 5
  • Milestone changed from Next-Sprint-Proposed to Sprint-20120703

We agreed on the sprint planning session that this ticket would only include creating the initial bind10-showtech program, as well as a very minimal set of information that it provides. Additional information will be added later via separate tickets.

comment:3 Changed 7 years ago by muks

  • Owner set to muks
  • Status changed from new to assigned

Picking

comment:4 Changed 7 years ago by muks

  • Owner changed from muks to UnAssigned

Up for review.

Note that this only has a Linux implementation for now. I want someone to review how it is currently, and then I'll make SysInfo? abstract and add a factory and implementations for BSD, MacOS, and Solaris.

comment:5 Changed 7 years ago by muks

  • Status changed from assigned to reviewing

comment:6 Changed 7 years ago by jinmei

Some obvious points:

  • there's no test.
  • we definitely need a changelog for this ticket.

comment:7 Changed 7 years ago by jinmei

  • Owner changed from UnAssigned to muks

comment:8 Changed 7 years ago by muks

  • Owner changed from muks to jinmei

How is this for a ChangeLog? entry:

XXX.    [func]* 	muks
	b10-showtech: An initial implementation of the b10-showtech tool
        is now available. It gathers and outputs system information which
        can be used by tech support staff.
        (Trac #2062, git ...)

SysInfo? has been changed to be abstract. The abstract class is tested so it returns known values. Docs and tests have been added.

comment:9 Changed 7 years ago by jinmei

  • Owner changed from jinmei to UnAssigned

(making it unassigned as I'm not sure if I can review the entire branch quickly)

comment:10 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:11 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

I have several comments.

First, I'd modify the changelog slightly. It looks like Bind10 has a support staff. Maybe say future support staff?

Regarding functionality

I'm not sure what if anything of this is in scope of this ticket, but I want to note it down.

I'd like an option to put the output to a file. The output is really long and scrolls through the screen. Also, once we include things like config.log, our log files and our configuration, it could be handy to have them as separate files and pack them together with tar or something.

While I agree it is good to have network information, routing info, list of open connections with process names and list of processes (as suggested on the wiki page), these information may be considered sensitive by some admins and they would not like to share them. Unless we provide some kind of switch, like --no-sensitive, it could lead to them not providing the output of b10-showtech at all.

There's no --help option.

If the script is run as non-root (which is reasonable to assume it will), it silently says the network information is Unknown. It would be nice to detect the situation and warn about it (preferably at the end of the output, so it does not scroll far away and the user sees).

Technical problems

A test fails:

======================================================================
FAIL: test_calls (__main__.TestRename)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vorner/work/bind10/bind10-devel-20120405/_build/src/bin/tests/process_rename_test.py", line 62, in test_calls
    self.__scan(d, script, fun)
  File "/home/vorner/work/bind10/bind10-devel-20120405/_build/src/bin/tests/process_rename_test.py", line 32, in __scan
    "Didn't find a call to isc.util.process.rename in " + prettyname)
AssertionError: None is not true : Didn't find a call to isc.util.process.rename in src/src/bin/showtech/b10-showtech

----------------------------------------------------------------------

Should the generated man page be in git?

I don't think the current tests are good enough. They test only little bit of the code that is just setting of some default values. Everything should be tested, if only on Linux or by replacing the system calls with some mock functions.

A lot of the calls are not system specific. For example, according to python documentation, os.uname() should be available on most Unix systems. So it would be nice to put it into some try-catch block to the base class so other OSes (or the unknown ones) don't need to reimplement it again and again. Other things might be also available more generally than on linux only. The derived ones can always override the detection if it does not work for the system.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Hi vorner

Replying to vorner:

First, I'd modify the changelog slightly. It looks like Bind10 has a support staff. Maybe say future support staff?

How does this look:

XXX.    [func]* 	muks
	b10-showtech: An initial implementation of the b10-showtech tool
        is now available. It gathers and outputs system information which
        can be used by future tech support staff.
        (Trac #2062, git ...)

Regarding functionality

I'm not sure what if anything of this is in scope of this ticket, but I want to note it down.

I'd like an option to put the output to a file. The output is really long and scrolls through the screen. Also, once we include things like config.log, our log files and our configuration, it could be handy to have them as separate files and pack them together with tar or something.

This has been implemented now. -o/--output options have been added. Manpage has also been updated.

While I agree it is good to have network information, routing info, list of open connections with process names and list of processes (as suggested on the wiki page), these information may be considered sensitive by some admins and they would not like to share them. Unless we provide some kind of switch, like --no-sensitive, it could lead to them not providing the output of b10-showtech at all.

I agree with the idea, but it needs more discussion. If we cut out the entire data when --no-sensitive is specified, then the b10-showtech output won't be very useful as these are the data that support staff would want to see. We may want to filter or anonymize data. The format of the data varies from system to system, but this should not be problematic as it can be done in the implementations of SysInfo. I think we should create a separate ticket for it to be done after #2085.

There's no --help option.

This has been added now. Manpage has also been updated.

If the script is run as non-root (which is reasonable to assume it will), it silently says the network information is Unknown. It would be nice to detect the situation and warn about it (preferably at the end of the output, so it does not scroll far away and the user sees).

It now prints a message to stderr at the end of its run, if run as non-root.

Technical problems

A test fails:

======================================================================
FAIL: test_calls (__main__.TestRename)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vorner/work/bind10/bind10-devel-20120405/_build/src/bin/tests/process_rename_test.py", line 62, in test_calls
    self.__scan(d, script, fun)
  File "/home/vorner/work/bind10/bind10-devel-20120405/_build/src/bin/tests/process_rename_test.py", line 32, in __scan
    "Didn't find a call to isc.util.process.rename in " + prettyname)
AssertionError: None is not true : Didn't find a call to isc.util.process.rename in src/src/bin/showtech/b10-showtech

----------------------------------------------------------------------

Aieeeee. Fixed! :)

Should the generated man page be in git?

This follows the convention of all other bin programs whose manpages are also checked in.

I don't think the current tests are good enough. They test only little bit of the code that is just setting of some default values. Everything should be tested, if only on Linux or by replacing the system calls with some mock functions.

I strugged with how to test this properly too, as different implementations return different data. But I don't think we should do this. Unit tests are supposed to check at the interfaces, not into the implementation. While some things (such as to check if a lock is being acquired) are notable, I don't think we should check if system calls are being called. It would be similar to testing if each line of code were written in a specific way. There would also not be a benefit of doing this. Each system's output will be different, and the implementation of how we gather a particular piece of data may change too (exec free instead of reading /proc/meminfo, or exec df intead of reading inside /proc/). We'd have to rewrite the tests too then to match the code, which would defeat the purpose of a check. The tests should not test at this level.

A testcase for the factory has been added.

A lot of the calls are not system specific. For example, according to python documentation, os.uname() should be available on most Unix systems. So it would be nice to put it into some try-catch block to the base class so other OSes (or the unknown ones) don't need to reimplement it again and again. Other things might be also available more generally than on linux only. The derived ones can always override the detection if it does not work for the system.

We can have a class in the hierarchy between SysInfo and SysInfoLinux from which other UNIXes also derive, and this can implement the common parts. This is better done in #2085 when we know what is common. I'll update that bug about it.

comment:13 in reply to: ↑ 12 ; follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Replying to muks:

Hi vorner

Replying to vorner:

Hello

How does this look:

XXX.    [func]* 	muks
	b10-showtech: An initial implementation of the b10-showtech tool
        is now available. It gathers and outputs system information which
        can be used by future tech support staff.
        (Trac #2062, git ...)

Yes, this one is better. Thanks.

While I agree it is good to have network information, routing info, list of open connections with process names and list of processes (as suggested on the wiki page), these information may be considered sensitive by some admins and they would not like to share them. Unless we provide some kind of switch, like --no-sensitive, it could lead to them not providing the output of b10-showtech at all.

I agree with the idea, but it needs more discussion. If we cut out the entire data when --no-sensitive is specified, then the b10-showtech output won't be very useful as these are the data that support staff would want to see. We may want to filter or anonymize data. The format of the data varies from system to system, but this should not be problematic as it can be done in the implementations of SysInfo. I think we should create a separate ticket for it to be done after #2085.

Would you start the discussion on the mailing list or should I?

Should the generated man page be in git?

This follows the convention of all other bin programs whose manpages are also checked in.

Right, maybe I don't complain loud enough about the other programs too O:-).

I don't think the current tests are good enough. They test only little bit of the code that is just setting of some default values. Everything should be tested, if only on Linux or by replacing the system calls with some mock functions.

I strugged with how to test this properly too, as different implementations return different data. But I don't think we should do this. Unit tests are supposed to check at the interfaces, not into the implementation. While some things (such as to check if a lock is being acquired) are notable, I don't think we should check if system calls are being called. It would be similar to testing if each line of code were written in a specific way. There would also not be a benefit of doing this. Each system's output will be different, and the implementation of how we gather a particular piece of data may change too (exec free instead of reading /proc/meminfo, or exec df intead of reading inside /proc/). We'd have to rewrite the tests too then to match the code, which would defeat the purpose of a check. The tests should not test at this level.

I don't agree here. There's quite some processing (like a bunch of regular expressions for the memory information). These really need to be tested. We don't need to test if the system functions are called, we just need to fake them so they return some known value. Note that python is cool and it is easy to do so, like this:

def fake(param):
        return 42

tested_module.sys.function = fake

The fact that the test is implementation-specific is a drawback, but not a really big problem. When I tried to argue we should test the interface only and never touch the internals, I was told this is not true at least for us.

A lot of the calls are not system specific. For example, according to python documentation, os.uname() should be available on most Unix systems. So it would be nice to put it into some try-catch block to the base class so other OSes (or the unknown ones) don't need to reimplement it again and again. Other things might be also available more generally than on linux only. The derived ones can always override the detection if it does not work for the system.

We can have a class in the hierarchy between SysInfo and SysInfoLinux from which other UNIXes also derive, and this can implement the common parts. This is better done in #2085 when we know what is common. I'll update that bug about it.

OK.

comment:14 in reply to: ↑ 13 Changed 7 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

While I agree it is good to have network information, routing info, list of open connections with process names and list of processes (as suggested on the wiki page), these information may be considered sensitive by some admins and they would not like to share them. Unless we provide some kind of switch, like --no-sensitive, it could lead to them not providing the output of b10-showtech at all.

I agree with the idea, but it needs more discussion. If we cut out the entire data when --no-sensitive is specified, then the b10-showtech output won't be very useful as these are the data that support staff would want to see. We may want to filter or anonymize data. The format of the data varies from system to system, but this should not be problematic as it can be done in the implementations of SysInfo. I think we should create a separate ticket for it to be done after #2085.

Would you start the discussion on the mailing list or should I?

Please do so. I need to sort out my ISC email sending before posting there (my emails get flagged for admin approval).

Should the generated man page be in git?

This follows the convention of all other bin programs whose manpages are also checked in.

Right, maybe I don't complain loud enough about the other programs too O:-).

Hehe.. agreed that these should not be in the repo. Every time I build with --enable-man, the tool versions get into the generated manpages and git status reports diffs. No idea why these are in the repo, as they can be dist'd instead.

I don't agree here. There's quite some processing (like a bunch of regular expressions for the memory information). These really need to be tested. We don't need to test if the system functions are called, we just need to fake them so they return some known value. Note that python is cool and it is easy to do so, like this:

def fake(param):
        return 42

tested_module.sys.function = fake

The fact that the test is implementation-specific is a drawback, but not a really big problem. When I tried to argue we should test the interface only and never touch the internals, I was told this is not true at least for us.

In the pursuit of moving the bug along, I've closed my eyes and checked in such tests. :) (But it was weird writing them.)

comment:15 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

I think it is OK now. Just that the „outputs“ of various ip commands and similar are slightly surprising. Would you comment about the purpose (that it is to test there's no processing node) and merge?

Thank you

comment:16 Changed 7 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

I've added such a comment.

Merged to master in commit 144e80212746f8d55e6a59edcf689fec9f32ae95:

* a64cb60 [2062] Add some comments to the SysInfo unit test
* e26eb60 [2062] Add testcases for SysInfo Linux implementation
* 304e2b9 [2062] Add a testcase for SysInfoFromFactory
* c7017fc [2062] Call parent's __init__() in SysInfoLinux
* fdc9afc [2026] Add command line arguments for help and output-file
* 83e3964 [2062] Print a message to stderr when not run as root
* 99d2c13 [2062] Call isc.util.process.rename() in b10-showtech
* 7e93a92 [2062] Add tests for isc.sysinfo.SysInfo
* 9dfb07c [2062] Add a factory to create system-specific SysInfo instances
* f4e22ff [2062] Fix class description
* 04fd11b [2062] Show network information
* 5c6c470 [2062] Add OS distribution name
* b2a7049 [2062] Update showtech tool to start showing info
* 86b50d5 [2062] Start implementing a b10-showtech utility

Resolving as fixed. Thank you for the reviews vorner.

comment:17 Changed 7 years ago by vorner

  • Total Hours changed from 0 to 3.63
Note: See TracTickets for help on using tickets.