Opened 8 years ago

Closed 6 years ago

#1707 closed defect (wontfix)

Default configuration was used when a different part of the configuration contained a mistake

Reported by: jreed
Owned by:
Priority: medium
Milestone: DNS Outstanding Tasks
Component: b10-auth
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: Very High
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 4
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description

The default configuration was used when a different part of the configuration contained a mistake:

2012-02-22 13:09:04.846 INFO  [b10-auth.auth] AUTH_SERVER_CREATED server created
2012-02-22 13:09:04.855 ERROR [b10-auth.auth] AUTH_CONFIG_UPDATE_FAIL update of configuration failed: Server configuration failed: Failed to open master file: /home/jreed/dnsbenchsuite/tests/root/root.zone.file-canonical
2012-02-22 13:09:04.887 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::]:53 of type TCP from the creator
2012-02-22 13:09:04.888 ERROR [b10-boss.boss] BIND10_SOCKET_ERROR error on bind call in the creator: 13/Permission denied
2012-02-22 13:09:04.889 ERROR [b10-auth.server_common] SRVCOMM_ADDRESS_FAIL failed to listen on addresses ("Error creating socket on bind")
2012-02-22 13:09:04.889 ERROR [b10-auth.auth] AUTH_CONFIG_LOAD_FAIL load of configuration failed: Server configuration failed: "Error creating socket on bind"

The only change was a wrong file name for the zone file. The config is:

{"version": 2, "Auth": {"datasources": [{"zones": [{"origin": ".", "file": "/home/jreed/dnsbenchsuite/tests/root/root.zone-canonical"}], "type": "memory"}], "listen_on": [{"port": 5300, "address": "127.0.0.1"}]}}

So why did the second error above happen? Why did it use port 53?

Once I renamed the file, it started fine using port 5300 as expected.

I am not sure if this problem is in cfgmgr or auth.

Change History (15)

comment:1 Changed 8 years ago by jreed

Here is another example where an earlier ERROR causes the wrong config to be used:

2012-02-22 15:32:19.216 INFO  [b10-auth.auth] AUTH_SERVER_CREATED server created
2012-02-22 15:32:19.225 ERROR [b10-auth.auth] AUTH_CONFIG_UPDATE_FAIL update of configuration failed: Server configuration failed: Invalid RR text at line 19: attempt to decode a value not in base64 char set
2012-02-22 15:32:19.255 INFO  [b10-boss.boss] BIND10_SOCKET_GET requesting socket [::]:53 of type TCP from the creator
2012-02-22 15:32:19.255 ERROR [b10-boss.boss] BIND10_SOCKET_ERROR error on bind call in the creator: 13/Permission denied
2012-02-22 15:32:19.257 ERROR [b10-auth.server_common] SRVCOMM_ADDRESS_FAIL failed to listen on addresses ("Error creating socket on bind")
2012-02-22 15:32:19.257 ERROR [b10-auth.auth] AUTH_CONFIG_LOAD_FAIL load of configuration failed: Server configuration failed: "Error creating socket on bind"

comment:2 Changed 8 years ago by shane

  • Milestone New Tasks deleted

comment:3 Changed 8 years ago by jreed

  • Defect Severity changed from N/A to Very High
  • Milestone set to Next-Sprint-Proposed

comment:4 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 4

comment:5 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120403

comment:6 Changed 8 years ago by jinmei

I've looked into it, and I think I understood how this happened.

When b10-auth constructs ModuleCCSession, the latter retrieves the
config spec and its default, and passes it to b10-auth via the
callback. b10-auth first tries to install the given configuration,
and then it calls ModuleCCSession::getFullConfig() to get the merged
full configuration (defaults with overridden local config), and
installs it.

When the first attempt fails, the local configuration is considered
empty, and b10-auth is given default-only configuration and installs
it. Note that even if the first installation succeeds, it's not an
ideal scenario in that the same configuration data need to be
installed again.
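
The sequence described above can be modelled roughly as follows. This is only a sketch and not b10-auth or ModuleCCSession code: the names `DEFAULTS`, `install`, and `start` are hypothetical, the default `listen_on` value is assumed from the `[::]:53` request in the log, and the zone path is a placeholder that is assumed not to exist.

```python
import os

# Assumed defaults, inferred from the "[::]:53" socket request in the log.
DEFAULTS = {"listen_on": [{"address": "::", "port": 53}], "datasources": []}


def install(config):
    """Pretend to apply a config; reject all of it if any zone file is missing."""
    for source in config.get("datasources", []):
        for zone in source.get("zones", []):
            if not os.path.exists(zone["file"]):
                raise ValueError("Failed to open master file: " + zone["file"])
    return config


def start(local_config):
    """Mimic the two-step install: local config first, then the merged config."""
    try:
        install(local_config)
    except ValueError:
        # The failed local config is treated as empty ...
        local_config = {}
    # ... so the merged "full" config is the defaults with nothing overridden.
    full_config = {**DEFAULTS, **local_config}
    return install(full_config)


# Hypothetical local config: correct listen_on, but a zone file that is missing.
local = {
    "datasources": [{"type": "memory",
                     "zones": [{"origin": ".", "file": "/nonexistent/root.zone"}]}],
    "listen_on": [{"address": "127.0.0.1", "port": 5300}],
}
active = start(local)
print(active["listen_on"])  # prints [{'address': '::', 'port': 53}]
```

In this model the intended port 5300 is lost because the whole local config is discarded, which matches the symptom in the report.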

I can think of some minimalist hack to fix this problem, but I suspect
it suggests that it's time to solve various configuration related
issues at a higher level, maybe also revisiting APIs. We know many
small issues related to configuration and many of them were deferred
due to various reasons including time constraints. But now that we
are going to focus on usability, I think we should start solving these
issues in a higher level context.

Maybe we should discuss this at bind10-dev?

comment:7 follow-up: Changed 8 years ago by jelte

We have been discussing low-level design changes on the config API indeed, though mostly on the admin access/bindctl side; but I certainly wouldn't mind getting rid of the weirdness in auth there as well :)

As for this problem, I'm a bit confused; if a filename just got changed, and the listen ports were left as they were, it shouldn't touch those. If both are changed then it should reject both if one contains a problem. On startup, this is always the case; all existing settings are considered one change that either works or does not work (and then it would indeed fall back to 53 if another part contains an error).

There are several ways we can change that behaviour:

  • each 'setting' could be considered independent, and applied separately from the rest. We'd need a way to signal back which settings were taken over and which were not.
  • we can also (additionally, perhaps) make a separation between values that are 'valid' and values that 'actually work'; i.e. a negative port number would be invalid, a port number that happens to be in use is valid, but doesn't work (so it would report something but still accept and store it). Similar for a file that is needed but does not exist.

Either change would be non-trivial and should indeed be considered in the context of a larger discussion.
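
The second option (separating values that are 'valid' from values that 'actually work') could look something like the sketch below. It is an illustration only: `classify` and the flat `zone_files` key are hypothetical and do not correspond to any existing config API, and a missing file stands in for the "valid but does not work" case.

```python
import os


def classify(config):
    """Split problems into 'invalid' (reject) and 'not working' (store, but report)."""
    invalid, not_working = [], []
    for entry in config.get("listen_on", []):
        port = entry.get("port")
        # A negative or out-of-range port can never be right: invalid.
        if not isinstance(port, int) or not 0 < port <= 65535:
            invalid.append("listen_on port out of range: %r" % (port,))
    for path in config.get("zone_files", []):
        if not isinstance(path, str) or not path:
            invalid.append("zone file name must be a non-empty string")
        elif not os.path.isfile(path):
            # Well-formed, but does not work right now: keep it and report.
            not_working.append("zone file not found: " + path)
    return invalid, not_working


# Hypothetical input: one invalid port, one valid port, one missing zone file.
invalid, not_working = classify({
    "listen_on": [{"address": "127.0.0.1", "port": -1},
                  {"address": "127.0.0.1", "port": 5300}],
    "zone_files": ["/nonexistent/root.zone"],
})
print(invalid)
print(not_working)
```

With this split, only entries in `invalid` would cause a rejection; entries in `not_working` would be stored and reported, so an unrelated setting such as `listen_on` would not be dropped because of a missing file.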

comment:8 in reply to: ↑ 7 Changed 8 years ago by jinmei

Replying to jelte:

As for this problem, I'm a bit confused; if a filename just got changed, and the listen ports were left as they were, it shouldn't touch those. If both are changed then it should reject both if one contains a problem. On startup, this is always the case; all existing settings are considered one change that either works or does not work (and then it would indeed fall back to 53 if another part contains an error).

First off, this is about the startup case (as I understand it).

And, since there's an error in the zone file name, both the file name
and listen_on configurations are ignored, and fall back to the default
listen_on. If that's what you meant above, your understanding of
what's happening is correct.

So the confusion is about whether this is a problem? I think it is,
because the admin may have never wanted the default setting (e.g. the
intended configuration may be to restrict the access quite tightly and
the admin may not want to allow access from any address even for a very short
period until they notice the error). This is different from the case
of updating an existing config - in that case, the previous one was at
least once intended, and until the update is completed successfully it
makes sense to keep that configuration.

comment:9 follow-up: Changed 8 years ago by jelte

The confusion was whether this was at startup or at a change while running; I understand why it happens on startup, I would not understand why it would happen while running :)

Didn't we at some point simply fail to start if the initial configuration was bad? Maybe that would be a good interim solution while we discuss and plan more invasive changes. Though that would make bind10 fail to run completely, which is obviously also not ideal.

comment:10 in reply to: ↑ 9 Changed 8 years ago by jinmei

Replying to jelte:

Didn't we at some point simply fail to start if the initial configuration was bad? Maybe that would be a good interim solution while we discuss and plan more invasive changes. Though that would make bind10 fail to run completely, which is obviously also not ideal.

I guess, regardless of whether it's a short term workaround or part of
a larger change, the revised b10-auth should fail to run if it cannot
even install initial configuration because there's no other fallback
config in that case. Some errors may be safely ignored in practice,
but that depends on the operation policy so we cannot be sure.
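
A minimal sketch of that policy, together with the keep-the-previous-config behaviour for updates mentioned earlier in the thread, might look like this. `ConfigHolder`, `ConfigError`, and `demo_apply` are hypothetical names, not part of b10-auth, and the `zone_file_exists` flag is a stand-in for whatever actually makes an install fail.

```python
class ConfigError(Exception):
    """Raised by the (hypothetical) apply step when a config cannot be installed."""


class ConfigHolder:
    def __init__(self, apply_func):
        self._apply = apply_func
        self.active = None

    def load_initial(self, config):
        # No earlier config to fall back to: let the error propagate so the
        # caller can refuse to start instead of silently using defaults.
        self._apply(config)
        self.active = config

    def update(self, config):
        try:
            self._apply(config)
        except ConfigError:
            # The previous config was intended at least once: keep it.
            self._apply(self.active)
            return False
        self.active = config
        return True


def demo_apply(config):
    """Stand-in for the real install step; fails when the zone file is 'missing'."""
    if not config.get("zone_file_exists", True):
        raise ConfigError("Failed to open master file")


good = {"port": 5300, "zone_file_exists": True}
bad = {"port": 5301, "zone_file_exists": False}

holder = ConfigHolder(demo_apply)
holder.load_initial(good)
updated = holder.update(bad)  # rejected; the good config stays active

try:
    ConfigHolder(demo_apply).load_initial(bad)
    startup_failed = False
except ConfigError:
    startup_failed = True
```

Whether failing at startup is acceptable still depends on being able to fix the configuration while the server is not running.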

comment:11 Changed 8 years ago by vorner

AFAIK the change was done (not to fail) because we can't change the configuration if it's not running. Therefore we really need to do something about the offline configuration (we keep talking about it all the time, but never do it), then we can change it to fail.

comment:12 Changed 8 years ago by jelte

  • Milestone changed from Sprint-20120403 to Year 3 Task Backlog

comment:13 Changed 8 years ago by jreed

Maybe this ticket is related to #662? (default configuration used when configuration failed)

comment:14 Changed 6 years ago by stephen

  • Milestone set to DNS Outstanding Tasks

comment:15 Changed 6 years ago by tomek

  • Resolution set to wontfix
  • Status changed from new to closed

The Kea project is DHCP only. Closing stale DNS-related tickets.

If you're interested in DNS, please see the Bundy project: http://bundy-dns.de
