Opened 8 years ago

Closed 8 years ago

#1790 closed task (fixed)

update xfrin to have auth reload transfered zones

Reported by: jinmei Owned by: muks
Priority: medium Milestone: Sprint-20120515
Component: xfrin Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: xfr for in-memory
Estimated Difficulty: 3 Add Hours to Ticket: 0
Total Hours: 3.43 Internal?: no

Description

This is a subtask of #1787 and a successor of #1789.

We update b10-xfrin further: on a successful zone transfer, if
the zone is served by b10-auth in the in-memory data source
using sqlite3 as a backend, send the "loadzone" command for the zone
to auth.

This task depends on #1789.

Subtickets

Change History (20)

comment:1 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 3

comment:2 Changed 8 years ago by jelte

  • Milestone changed from Year 3 Task Backlog to Sprint-20120417

comment:3 Changed 8 years ago by muks

  • Owner set to muks
  • Status changed from new to assigned

Picking

comment:4 Changed 8 years ago by muks

  • Owner changed from muks to UnAssigned
  • Status changed from assigned to reviewing

Up for review. This branch also contains fixes for #1794 and #1900.

comment:5 Changed 8 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:6 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Hello

I must admit that reviewing 3 tickets at once can take one from the boredom of everyday life ;-).

To the code. The filetype parameter is optional, which means it might be missing from the dict. Therefore indexing it with zone["filetype"] without further check is unsafe.

The handling of origin and dots is needlessly complicated. Could I propose creating Name objects of both and comparing them? This makes sure the comparison is correct one.

The ignoring of errors when sending/receiving is suspicious at least. What is the purpose? And, if it was unable to send because the msgq terminated, then it should be FATAL error and terminate directly (the msgq must not quit). Anyway, I don't know if we want to handle this at all with try-catch, or maybe the receiving of answer.

Why do you switch to IPv4 in the lettuce tests? I believe we had both types of addresses to make sure we work with both and if we should choose one, it should be the IPv6, not 4 I think.

The following 2 files have a badly/differently than the rest indented line (is it possible you mix tabs and spaces?):

  • tests/lettuce/configurations/xfrin/inmem_slave.conf
  • tests/lettuce/features/inmemory_over_sqlite3.feature

comment:7 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

Hello

I must admit that reviewing 3 tickets at once can take one from the boredom of everyday life ;-).

:)

To the code. The filetype parameter is optional, which means it might be missing from the dict. Therefore indexing it with zone["filetype"] without further check is unsafe.

Fixed!

The handling of origin and dots is needlessly complicated. Could I propose creating Name objects of both and comparing them? This makes sure the comparison is correct one.

Creating Name objects is indeed the right way. I've updated it to also use RRClass for the zone class comparison.

The ignoring of errors when sending/receiving is suspicious at least. What is the purpose? And, if it was unable to send because the msgq terminated, then it should be FATAL error and terminate directly (the msgq must not quit). Anyway, I don't know if we want to handle this at all with try-catch, or maybe the receiving of answer.

The msgq must not quit. I coded this based on another method in xfrin which sends out "notify" to xfrout. We can abort there, but ignoring the error (basically unable to ask auth to load zones) is not bad as long as auth can keep serving. If msgq dies, it will probably get restarted if it is of kind necessary?

Why do you switch to IPv4 in the lettuce tests? I believe we had both types of addresses to make sure we work with both and if we should choose one, it should be the IPv6, not 4 I think.

The test was configured with 127.0.0.1 in some files and ::1 in other.

The following 2 files have a badly/differently than the rest indented line (is it possible you mix tabs and spaces?):

  • tests/lettuce/configurations/xfrin/inmem_slave.conf
  • tests/lettuce/features/inmemory_over_sqlite3.feature

Fixed!

Back to you. :)

comment:8 Changed 8 years ago by muks

Filed #1902 due to that exceptionally large catch tuple.

comment:9 in reply to: ↑ 7 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Sorry for the late response.

Replying to muks:

The handling of origin and dots is needlessly complicated. Could I propose creating Name objects of both and comparing them? This makes sure the comparison is correct one.

Creating Name objects is indeed the right way. I've updated it to also use RRClass for the zone class comparison.

There's one small problem in the changed code. We have a policy of using each log message identifier exactly once, so the place where it was logged can be easily identified by grep or something. But you use XFRIN_AUTH_CONFIG_ERROR at two places.

The ignoring of errors when sending/receiving is suspicious at least. What is the purpose? And, if it was unable to send because the msgq terminated, then it should be FATAL error and terminate directly (the msgq must not quit). Anyway, I don't know if we want to handle this at all with try-catch, or maybe the receiving of answer.

The msgq must not quit. I coded this based on another method in xfrin which sends out "notify" to xfrout. We can abort there, but ignoring the error (basically unable to ask auth to load zones) is not bad as long as auth can keep serving. If msgq dies, it will probably get restarted if it is of kind necessary?

Well, msgq crashing is considered fatal and can not be changed. If msgq quits, the system goes down completely. That's why I think the catch is suspicious and should be removed (it is not properly handled error, so the error should not be handled at all).

Why do you switch to IPv4 in the lettuce tests? I believe we had both types of addresses to make sure we work with both and if we should choose one, it should be the IPv6, not 4 I think.

The test was configured with 127.0.0.1 in some files and ::1 in other.

Not that it would be very important, but as I said, we want to make sure both v4 and v6 addresses work correctly, therefore we have a mix of them.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 8 years ago by muks

  • Owner changed from muks to vorner

Hi vorner

Replying to vorner:

There's one small problem in the changed code. We have a policy of using each log message identifier exactly once, so the place where it was logged can be easily identified by grep or something. But you use XFRIN_AUTH_CONFIG_ERROR at two places.

Ah.. fixed. :)

Well, msgq crashing is considered fatal and can not be changed. If msgq quits, the system goes down completely. That's why I think the catch is suspicious and should be removed (it is not properly handled error, so the error should not be handled at all).

Do you mean to say that we just don't need to catch the exception at all?

Why do you switch to IPv4 in the lettuce tests? I believe we had both types of addresses to make sure we work with both and if we should choose one, it should be the IPv6, not 4 I think.

The test was configured with 127.0.0.1 in some files and ::1 in other.

Not that it would be very important, but as I said, we want to make sure both v4 and v6 addresses work correctly, therefore we have a mix of them.

Actually, what I meant was that the _same test_ had ::1 for xfrout listen address in one config file, and 127.0.0.1 in another config file (both for where to find xfrout). So I went about changing them to 127.0.0.1 to be consistent.

I think ::1 would cause getaddrinfo() to return addrinfo structs for both ::1 and 127.0.0.1 if AF_UNSPEC is passed. But I don't know if this happens in the case of xfr* causing it to listen on both.

comment:11 in reply to: ↑ 10 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

Replying to vorner:

There's one small problem in the changed code. We have a policy of using each log message identifier exactly once, so the place where it was logged can be easily identified by grep or something. But you use XFRIN_AUTH_CONFIG_ERROR at two places.

Ah.. fixed. :)

I also noticed one thing. Does the exception raised contain the name/rrclass text that is being used? So the user knows what is wrong?

Well, msgq crashing is considered fatal and can not be changed. If msgq quits, the system goes down completely. That's why I think the catch is suspicious and should be removed (it is not properly handled error, so the error should not be handled at all).

Do you mean to say that we just don't need to catch the exception at all?

Yes, that's exactly what I want to say (or, more precisely, that we should not catch the exception at all). At least if the exception happens in the situation described.

Not that it would be very important, but as I said, we want to make sure both v4 and v6 addresses work correctly, therefore we have a mix of them.

Actually, what I meant was that the _same test_ had ::1 for xfrout listen address in one config file, and 127.0.0.1 in another config file (both for where to find xfrout). So I went about changing them to 127.0.0.1 to be consistent.

Ah, sorry, then I misunderstood.

And, sorry for bringing it so late, but I noticed there are no tests for the branch. Or there are and I don't see them? We definitely should have them.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 8 years ago by muks

  • Owner changed from muks to vorner

Hi vorner

Replying to vorner:

I also noticed one thing. Does the exception raised contain the name/rrclass text that is being used? So the user knows what is wrong?

I had updated the name parser code sometime ago to include the string being checked. I've updated the place where the InvalidRRClass exception is generated now.

Yes, that's exactly what I want to say (or, more precisely, that we should not catch the exception at all). At least if the exception happens in the situation described.

Updated. :)

And, sorry for bringing it so late, but I noticed there are no tests for the branch. Or there are and I don't see them? We definitely should have them.

This involves orchestrating 3 different components (xfrout, xfrin and auth) with the sqlite3+inmem configuration. The lettuce feature (#1794) tests this:

# xfrin
When I send bind10 the command Xfrin retransfer example.org IN 127.0.0.1 47807
# xfrin logs:
Then wait for new bind10 stderr message XFRIN_TRANSFER_SUCCESS not XFRIN_XFR_PROCESS_FAILURE
# auth logs:
Then wait for new bind10 stderr message AUTH_LOAD_ZONE

comment:13 in reply to: ↑ 12 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

And, sorry for bringing it so late, but I noticed there are no tests for the branch. Or there are and I don't see them? We definitely should have them.

This involves orchestrating 3 different components (xfrout, xfrin and auth) with the sqlite3+inmem configuration. The lettuce feature (#1794) tests this:

Yes, but I meant unit tests, where we would check reactions to various error
cases (wrong configuration, for example). Checking the command is sent with the
correct parameters is enough there (or, in case the zone is not cached in
in-memory, checking it is not sent), etc. The lettuce just tests the basic scenario.

Thanks

comment:14 in reply to: ↑ 13 ; follow-up: Changed 8 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

Yes, but I meant unit tests, where we would check reactions to various error
cases (wrong configuration, for example). Checking the command is sent with the
correct parameters is enough there (or, in case the zone is not cached in
in-memory, checking it is not sent), etc. The lettuce just tests the basic scenario.

I've added tests for various datasource configurations now (both good and bad), and more asserts to the default cases.

comment:15 in reply to: ↑ 14 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Hello

I'd have few comments on the new code:

  • The naming and description of the tests is not really helpful. Is it possible to describe what the situation is (eg. the zone is in memory, therefore it should be notified, etc).
  • You probably should use assertTrue and assertFalse instead of assertEqual(True, …).
  • Why is the _do_auth_loadzone taken out?
  • I don't see any test with broken configuration (eg. unknown filetypes, missing elements, wrong types of elements).

Thanks

comment:16 in reply to: ↑ 15 Changed 8 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

  • The naming and description of the tests is not really helpful. Is it possible to describe what the situation is (eg. the zone is in memory, therefore it should be notified, etc).

I have updated the descriptions.

  • You probably should use assertTrue and assertFalse instead of assertEqual(True, …).

Done. :)

  • Why is the _do_auth_loadzone taken out?

The publish_xfrin_news() method in Xfrin calls _do_auth_loadzone(). As part of a different bug, you had added (9a4db008) testcase for publish_xfrin_news() which uses a mock method. This bug has to be tested when publish_xfrin_news() is called in the same setting. So I had to add a call to _do_auth_loadzone() in that method, except that this is not an Xfrin derived object but a different mock object. To test it, the actual _do_auth_loadzone() implementation had to be called. That's why it had to be moved.

BTW, there are two TestXfrinProcess? classes in that file, which you may want to look at. 9a4db008 introduced the second.

  • I don't see any test with broken configuration (eg. unknown filetypes, missing elements, wrong types of elements).

I've added more tests, but these are really checked by the spec schema anyway.

comment:17 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to muks

Sorry for the late answer.

I have just one small thing. In the test_inmem_backend_type_is_missing test, if we leave out the filetype, it defaults to textual zone file. Therefore the .sqlite3 extension is slightly misleading. Could you change it?

I think it is OK to merge after this small change (I don't need to review this).

comment:18 Changed 8 years ago by vorner

  • Total Hours changed from 0 to 3.43

comment:19 in reply to: ↑ 17 Changed 8 years ago by muks

Replying to vorner:

Sorry for the late answer.

I have just one small thing. In the test_inmem_backend_type_is_missing test, if we leave out the filetype, it defaults to textual zone file. Therefore the .sqlite3 extension is slightly misleading. Could you change it?

I think it is OK to merge after this small change (I don't need to review this).

Done. :)

comment:20 Changed 8 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

Pushed to master:

* f987a99 [1790] Update backend filenames to match the filetype
* d872878 [1790] Add more testcases for missing keys
* b685e7c [1790] Use appropriate boolean assert calls
* ffdc8be [1790] Update testcase descriptions
* 6d2d11d [1790] Also handle the case where published is not OK
* 2867539 [1790] Use a simpler constructor
* 95b8522 [1790] Add unit tests using mock objects
* 631dc5b [master] Fix test class name
* d05c369 [1790] Don't catch exceptions caused by msgq failures
* 33d2123 [1790] When throwing RR parameter exceptions, include the bad string as well
* 5de8f51 [1790] Don't use the same log message id twice
* 613641b [master] Check for nonexistent db right after BIND starts (contd.)
* 9a93486 [master] Check for nonexistent db right after BIND starts
* d59c24a [1790] Untabify files
* ab86972 [1790] Address review comments (see full log)
* 84f5946 [1794] Add a lettuce test for updating in-memory after xfr feature
* 7eb4965 [master] Use IPv4 addresses in tests consistently
* c46763c [1790] Bugfix _do_auth_loadzone implementation
* a1528b4 [master] Update the xfrin_bind10.feature test
* c2d70c8 [1900] Check for CC status before starting inmemory_over_sqlite3.feature
* 9dc2c2e [1790] After successful zone transfers, send b10-auth loadzone if necessary

Resolving this bug, and #1794 and #1900 as fixed. Thank you for the reviews vorner.

Note: See TracTickets for help on using tickets.