Opened 10 years ago

Closed 9 years ago

#180 closed enhancement (complete)

b10 start auth server drop privs asap

Reported by: shane Owned by: jelte
Priority: medium Milestone: 05. 3rd Incremental Release: Serious Secondary
Component: ~Boss of BIND (obsolete) Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: Add Hours to Ticket:
Total Hours: Internal?:

Description

The boss process needs to drop server permissions as soon as possible.

Subtickets

Attachments (1)

trac180.diff (9.2 KB) - added by shane 9 years ago.

Download all attachments as: .zip

Change History (11)

comment:1 Changed 10 years ago by shane

  • Component changed from Unclassified to Boss of BIND
  • Status changed from new to accepted

comment:2 Changed 10 years ago by shane

  • Milestone changed from 04. 2nd Incremental Release: Early Adopters to 05. 3rd Incremental Release

I thought about this last week, and decided that it didn't make much sense. If we drop permissions then we'll end up in a state where we can't restart processes and so on.

I think it is better to implement the full privilege separation documented on the PrivilegedSocketCreator page, which is beyond the scope of what we had time for in the Y2 2nd release. Moving to Y2 3rd release!

comment:3 Changed 9 years ago by jreed

For now this can drop permissions for the components that don't need extra privileges (like cmdctl, xfrin, xfrout, cfgmgr). bind10 can start them as dedicated user(s) and group(s). This also means that these tools may need permission to write to files.

comment:4 Changed 9 years ago by shane

  • Owner changed from shane to UnAssigned
  • Status changed from accepted to reviewing

Okay, this has been done and is now in branches/trac180. Tests were added for the startup option.

Note that to run the test for setuid you have to run as root, and you have to modify the args_test.py to give it the name of the user to change to. Otherwise that particular test is skipped. I am open to suggestions on how to do this in a more flexible way (perhaps we could add a "test" user to our build systems, for example).

comment:5 Changed 9 years ago by shane

Realized that maybe I should add a description of the changes.

  • I added a "uid" attribute to the ProcessInfo? class, which is changed to in the preexec_fn of the POpen instance associated with it. This means that after to fork() but before the exec() the process changes to that user if it is specified. We do this so we can setuid() for msgq and cfgmgr, which have to start before the auth process.
  • I added the uid to the constructor for the BossOfBind? class, which is what the process ends up setuid() to, if present.
  • I moved the xfrout process after the auth process in startup, because it does not need to run as root.
  • Before the auth process stars, setuid() is invoked if a uid is specified.
  • Fixed a bug in the return value of restart_processes() that I discovered during testing. During shutdown this should return 0, not None, because if None is specified the select() will wait until some data is ready - and we really just want to skip it and proceed to our cleanup.
  • Set output to line buffering, even when not going to a TTY. This is useful for testing, and should not affect performance on such a low output program.
  • Added the "-u" (or "--user") option, which accepts UID or user name
  • Check for arguments on start - this is an error and is now reported as such
  • Added output when the server is done starting, useful for testing (and admins!)
  • Explicitly exit(0) after shutdown, just to be clear what is going on

There are tests in the tests/test_args.py file which do:

  • Run bind10 without arguments to make sure it starts
  • Run bind10 with an unknown option
  • Run bind10 with an argument
  • Run bind10 with a bogus setuid user name
  • Run bind10 with a bogus setuid UID
  • Run bind10 with a valid setuid name, and make sure it changes to that user

The last test was tricky for 2 reasons. As I mentioned, I don't know a way to find out a valid user name on the system (besides root). Also, I don't know if there is a generic way to get a list of processes and users. What I used will work for Linux and FreeBSD, but probably not for Solaris.

The test also does not check the user that the various BIND 10 processes are running as. This is possibly useful to test, but I'll wait until the boss process is changed to run with configurable processes to make that check.

comment:6 Changed 9 years ago by jelte

  • Owner changed from UnAssigned to jelte

comment:7 follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to shane

bind10.py.in:

i personally think 'if foo not in bar' and 'if foo is not None' reads better than 'if not foo in bar' and 'if not foo is None'. We don't really have a style thingy about this, and both versions appear in the code. It's also quite minor :)

_setuid(self) has two write statements that don't prefix with the module. I suspect at least one of them is debug leftover, and shouldn't need to be there at all (perhaps even both of them).

So the initial few 'downsetuidable' processes get the given uid as an argument, but after that you set the uid of bob himself, and then simply don't pass it along anymore. I don't think this is a very good approach, it's inconsistent at the very least, but it also gives us a problem (restart of auth will fail if it uses privileged port, and keep on failing after that). The 'full' way would be to not setuid of bob himself, only for subprocesses (either way, this should all change when we have our holy socket creator).

Don't know if it's out of scope, but the process starting code contains nothing but replication, so a refactor would be nice ;)

Please document the return values of restart_processes (0, not-0 and None?) (more generally, i think we can do a better job of documenting arguments and return values in our python code)

If you give an existing user, but don't start as root, the error message is a bit confusing (simply says 'operation not permitted', not what operation that is)

bind10_test.in:

Do we still use the (generated) test scripts? (we have make check for that now, right?)

args_test.py:

You mention this in your own comment, but "shane" does not exist on my system so this test fails... i don't think 'nobody' will work. Perhaps take the username of the user that runs configure?

should this one also test other arguments while we're at it?

comment:8 in reply to: ↑ 7 Changed 9 years ago by shane

  • Owner changed from shane to jelte

Replying to jelte:

bind10.py.in:

i personally think 'if foo not in bar' and 'if foo is not None' reads better than 'if not foo in bar' and 'if not foo is None'. We don't really have a style thingy about this, and both versions appear in the code. It's also quite minor :)

Okay, I've updated this to your preferred style. I've post a question to the bind10-dev list to see if there is a preference.

_setuid(self) has two write statements that don't prefix with the module. I suspect at least one of them is debug leftover, and shouldn't need to be there at all (perhaps even both of them).

Whoops! Removed.

So the initial few 'downsetuidable' processes get the given uid as an argument, but after that you set the uid of bob himself, and then simply don't pass it along anymore. I don't think this is a very good approach, it's inconsistent at the very least, but it also gives us a problem (restart of auth will fail if it uses privileged port, and keep on failing after that). The 'full' way would be to not setuid of bob himself, only for subprocesses (either way, this should all change when we have our holy socket creator).

This is a design choice to minimize the amount of code running as root. The bob process should probably not run as root, which is why it calls setuid() on itself. After that point no further setuid() is possible. And yes - it means we cannot restart the auth process properly. That was the reason we decided on the PriviligedSocketCreator? for the long-term solution.

So, I'd prefer to keep this patch "as is" in this regard.

Don't know if it's out of scope, but the process starting code contains nothing but replication, so a refactor would be nice ;)

It's a separate ticket - something to be done when this is read from the cfgmgr rather than hard-coded.

Please document the return values of restart_processes (0, not-0 and None?) (more generally, i think we can do a better job of documenting arguments and return values in our python code)

Done, and agreed (although I didn't go through everything and check it).

If you give an existing user, but don't start as root, the error message is a bit confusing (simply says 'operation not permitted', not what operation that is)

Okay, I fixed this to throw a more readable exception.

bind10_test.in:

Do we still use the (generated) test scripts? (we have make check for that now, right?)

Um... in this case it seems to still be used. I'll ask Jeremy if this is bad style or what.

args_test.py:

You mention this in your own comment, but "shane" does not exist on my system so this test fails... i don't think 'nobody' will work. Perhaps take the username of the user that runs configure?

should this one also test other arguments while we're at it?

It's out of scope for this ticket, but I have created a separate one for that (ticket #258). It's non-trivial due to needing to check ports and addresses and suchlike.

I've committed the changes made, and will attach a diff here as well for review.

Changed 9 years ago by shane

comment:9 Changed 9 years ago by jelte

ok consider my protest officially submitted about the privilegedropping inconsistency (until such time as we have the portcreator stuff) ;)

lines 156 and 157 are dead

currently the args_test isn't in make check yet, but it also doesn't work because of the username in there, add both to #258 too?

For the rest it looks good.

comment:10 Changed 9 years ago by shane

  • Resolution set to complete
  • Status changed from reviewing to closed

Okay, I removed the dead lines, and added the args_test work to ticket #258.

Merged into trunk.

Note: See TracTickets for help on using tickets.