Opened 8 years ago

Closed 7 years ago

#2281 closed task (fixed)

use new in-memory data source in the static data source

Reported by: jinmei Owned by: muks
Priority: high Milestone: Sprint-20130219
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 1.55 Internal?: no

Description

This is a deferred cleanup of #2219.

In #2219, the static data source client still internally uses the
older in-memory data source client. We should switch it to the new
one. It shouldn't be difficult, but will be multi-line changes
because we'll need to create and use a separate memory segment.

When this is done, we should be able to remove old in-memory
implementations, and I suggest doing that within this ticket. Those
include

  • datasrc/memory_datasrc.cc
  • datasrc/memory_datasrc.h
  • datasrc/memory_datasrc_link.h
  • datasrc/tests/memory_datasrc_unittest.cc (check if (reasonably) ported all tests to the tests for the new implementation).

Subtickets

Change History (19)

comment:1 Changed 8 years ago by muks

  • Estimated Difficulty changed from 0 to 4

comment:2 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed

comment:3 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed

comment:4 Changed 7 years ago by jinmei

This is causing maintenance overhead. I think it's time to complete this cleanup.

Also note that we should remove the contexttest-almost-obsolete.zone data file
under datasrc/tests/testdata when this task is completed.

Last edited 7 years ago by jinmei (previous) (diff)

comment:5 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed
  • Priority changed from medium to high

comment:6 Changed 7 years ago by jinmei

  • Milestone set to Next-Sprint-Proposed

comment:7 Changed 7 years ago by shane

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:8 Changed 7 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20130219

comment:9 Changed 7 years ago by muks

  • Owner set to muks
  • Status changed from new to assigned

Picking

comment:10 Changed 7 years ago by muks

ChangeLog for this bug:

XXX.	[func]*		muks
	The old in-memory data source code has been completely removed
	from the tree. The static datasource which serves the BIND./CH
	zone has been migrated to use the new in-memory data source. The
	removal of the old code should not affect any existing behavior as
	we switched to the new in-memory data source a while ago. A number
	of tests were updated. Some obsolete message IDs were
	removed.
	(Trac #2281, git bae3798603affdb276f370c1ac6b33b011a5ed4f)

comment:11 Changed 7 years ago by muks

  • Owner changed from muks to UnAssigned
  • Status changed from assigned to reviewing

This bug is ready for review.

Please note that the removal of the old in-memory code involves significant changes, so it may not be as trivial a review as the title of this ticket suggests. Please carefully check that the adjusted tests were correctly changed.

I have tried to separate implementation and deletion changes into different commits, but some commits may contain both. For the b10-auth query unittests changes, I've written the description in the git log. Please read it.

comment:12 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:13 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

It seems there's some problem with the makefiles, since distcheck complains:

libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../../src/lib/datasrc -I../../.. -I../../../../src/lib -I../../../src/lib -I../../../../src/lib/dns -I../../../src/lib/dns -DOS_LINUX -I../../../../ext/asio -I../../../../ext/coroutine -DASIO_DISABLE_THREADS=1 -Wall -Wextra -Wwrite-strings -Woverloaded-virtual -Wno-sign-compare -Werror -fPIC -g -O2 -MT static_datasrc_link.lo -MD -MP -MF .deps/static_datasrc_link.Tpo -c ../../../../src/lib/datasrc/static_datasrc_link.cc  -fPIC -DPIC -o .libs/static_datasrc_link.o
../../../../src/lib/datasrc/static_datasrc_link.cc:16:26: fatal error: factory_link.h: No such file or directory
compilation terminated.
make[7]: *** [static_datasrc_link.lo] Error 1
make[7]: Leaving directory `/tmp/bind10-1/bind10-20130205/_build/src/lib/datasrc'

Is it OK to pass the config of the static data source to the zone table segment? I don't think the config parameter has been described, but it seems strange that such low-level thing would know anything about configuration of a very specific data source.

If you don't link the in-memory library, how does the code get in? Is it some magic I didn't notice, or is it by accident already there? Or, who does link to the in-memory?

I don't understand why the change in the python tests is needed. Can you explain it, please? Why removing in-memory from there? I would understand adding the static one too.

Should the mock return the longest partial match if there's no exact one? It seems to pick the first one it finds, which might as well be something shorter than optimal.

If you used reverse names for the names in the map (still in the mock), it would be possible to find the longest match by some trick with map::lower_bound, I think. Not that it would be important for the performance.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Hi Michal

Replying to vorner:

Hello

It seems there's some problem with the makefiles, since distcheck complains:

libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../../src/lib/datasrc -I../../.. -I../../../../src/lib -I../../../src/lib -I../../../../src/lib/dns -I../../../src/lib/dns -DOS_LINUX -I../../../../ext/asio -I../../../../ext/coroutine -DASIO_DISABLE_THREADS=1 -Wall -Wextra -Wwrite-strings -Woverloaded-virtual -Wno-sign-compare -Werror -fPIC -g -O2 -MT static_datasrc_link.lo -MD -MP -MF .deps/static_datasrc_link.Tpo -c ../../../../src/lib/datasrc/static_datasrc_link.cc  -fPIC -DPIC -o .libs/static_datasrc_link.o
../../../../src/lib/datasrc/static_datasrc_link.cc:16:26: fatal error: factory_link.h: No such file or directory
compilation terminated.
make[7]: *** [static_datasrc_link.lo] Error 1
make[7]: Leaving directory `/tmp/bind10-1/bind10-20130205/_build/src/lib/datasrc'

This file has been renamed and added to the Makefile.am SOURCES.
Documentation for this module's interface has also been added.

Is it OK to pass the config of the static data source to the zone
table segment? I don't think the config parameter has been described,
but it seems strange that such low-level thing would know anything
about configuration of a very specific data source.

I have replaced this with a NullElement now. Unless the syntax/purpose
of the config as passed to the ZoneTableSegment is defined, we can't
be sure what to pass there. A FIXME comment has been added.

If you don't link the in-memory library, how does the code get in? Is
it some magic I didn't notice, or is it by accident already there? Or,
who does link to the in-memory?

I'm not sure too. Unless -rdynamic is specified when linking the main
executable, its symbols are not exposed to the module. But what seems to
be happening is that its symbols are indeed being exposed and linked at
dlopen time.

  • Firstly, it works properly without being linked to the memory datasource library. So it's finding its symbols at runtime.
  • The reason why this library was removed from being linked to the static data source was that b10-auth's unit tests (and some other tests) were crashing. This was because, during the tests, the factory was used to repeatedly load and unload .sos within the same process. During this, the dlclose() was finalizing the .so's global storage, and destroying a logger inside the memory datasource library, which further resulted in singletons within liblog being finalized. This wouldn't happen unless they were commonly linked.

I don't know how it's finding its symbols.

I don't understand why the change in the python tests is needed. Can
you explain it, please? Why removing in-memory from there? I would
understand adding the static one too.

Now I'm not sure if my change is correct. The old in-memory data source
_module_ has been removed, so I thought that asking the factory to
return the 'memory' source would be wrong.

Should the mock return the longest partial match if there's no exact
one? It seems to pick the first one it finds, which might as well be
something shorter than optimal.

If you used reverse names for the names in the map (still in the
mock), it would be possible to find the longest match by some trick
with map::lower_bound, I think. Not that it would be important for the
performance.

From what I observed, only 1 or 2 zones at most are added into the map
by any unittest. So any overhead is at best constant. Due to the number
of zones, and the data being used, it does pick the correct zone to
return. I didn't want to spend more time optimizing it.

But now that you have mentioned it, I have changed it to do what you
describe above. :)

comment:15 in reply to: ↑ 14 ; follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

If you don't link the in-memory library, how does the code get in? Is
it some magic I didn't notice, or is it by accident already there? Or,
who does link to the in-memory?

I'm not sure too. Unless -rdynamic is specified when linking the main
executable, its symbols are not exposed to the module. But what seems to
be happening is that its symbols are indeed being exposed and linked at
dlopen time.

[...]

I don't know how it's finding its symbols.

I wonder, is it possible the in-memory is directly linked into the data source library itself? Then the symbols would be available.

I just fear if it can happen the data source can't be loaded because the in-memory is not available under some circumstances.

Now I'm not sure if my change is correct. The old in-memory data source
_module_ has been removed, so I thought that asking the factory to
return the 'memory' source would be wrong.

That seems like a reason. Can you add a note to the test saying it uses the static data source as an abuse, because in-memory doesn't exist and that the static contains the in-memory one?

From what I observed, only 1 or 2 zones at most are added into the map
by any unittest. So any overhead is at best constant. Due to the number
of zones, and the data being used, it does pick the correct zone to
return. I didn't want to spend more time optimizing it.

Thanks. I wasn't worried about the performance that much (again, these are tests, who cares). But it could be very surprising if someone added another test where it returned the wrong partial match. It could take long time to discover that. But this version looks good.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 7 years ago by muks

  • Owner changed from muks to vorner

Replying to vorner:

I wonder, is it possible the in-memory is directly linked into the
data source library itself? Then the symbols would be available.

I just fear if it can happen the data source can't be loaded because
the in-memory is not available under some circumstances.

libb10-datasrc does link to the new in-memory datasource library. But
this forms a part of the program that links to libb10-datasrc. I'm not
sure how missing symbols inside the static DS's .so get resolved to
the ones in the main process without -rdynamic.

This is why I don't know how it's happening.

That seems like a reason. Can you add a note to the test saying it
uses the static data source as an abuse, because in-memory doesn't
exist and that the static contains the in-memory one?

Added. :)

Thanks. I wasn't worried about the performance that much (again, these
are tests, who cares). But it could be very surprising if someone
added another test where it returned the wrong partial match. It could
take long time to discover that. But this version looks good.

Nod. :)

comment:17 in reply to: ↑ 16 Changed 7 years ago by vorner

  • Owner changed from vorner to muks

Hello

Replying to muks:

libb10-datasrc does link to the new in-memory datasource library. But
this forms a part of the program that links to libb10-datasrc. I'm not
sure how missing symbols inside the static DS's .so get resolved to
the ones in the main process without -rdynamic.

This is why I don't know how it's happening.

Hmm. I don't know. But it obviously takes them from the application. Let's see if it breaks on the build bots.

I think it is OK to merge. I'm just not sure if we want to do it before the branch for release, because of the linking problem, or wait.

comment:18 Changed 7 years ago by vorner

  • Total Hours changed from 0 to 1.55

comment:19 Changed 7 years ago by muks

  • Resolution set to fixed
  • Status changed from reviewing to closed

Merged to master branch in commit dd0f73adac540774a6cb2c99faece4d6c4b90353:

* 67940bd [2281] Add comments on use of static datasrc instead of memory datasrc in tests
* 8e33f9e [2281] Use map::lower_bound() to speed up partial match searches
* d404649 [2281] Pass a null config to ZoneTableSegment::create() for now
* cfd00e9 [2281] Add documentation for the static datasource shared object's public symbols
* d7f14ea [2281] Rename factory_link.h to static_datasrc.h
* 6a06b31 [2281] Re-enable the query tests
* 11633f6 [2281] Enable factory test for static DS with bad zone file
* 899432b [2281] Don't use memory source from the factory
* c30f889 [2281] Don't link to libdatasrc_memory from the static data source
* 3a9d8e2 [2281] Remove unused message IDs and move some others to datasrc::memory
* fbaab5f [2281] Remove the old in-memory data source code
* 60a9f3a [2281] Use the new in-memory data source for the static data source

Resolving as fixed. Thank you for the reviews Michal.

Note: See TracTickets for help on using tickets.