Opened 8 years ago

Closed 8 years ago

#1206 closed task (complete)

Create datasource factory

Reported by: stephen Owned by: jelte
Priority: medium Milestone: Sprint-20111011
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Write the factory class that creates data sources.

Subtickets

Change History (15)

comment:1 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 4

comment:2 follow-up: Changed 8 years ago by jelte

Do we want the datasource backends to be dynamically loadable (with dlopen()) or can we hardcode them and provide a name->type link? (optionally; 'for now')

comment:3 in reply to: ↑ 2 Changed 8 years ago by jinmei

Replying to jelte:

Do we want the datasource backends to be dynamically loadable (with dlopen()) or can we hardcode them and provide a name->type link? (optionally; 'for now')

I'm not 100% sure if I understand the 'hardcode' correctly, so let me
check: For a given type (name) of data source, like "sqlite3", we need
to detect that which derived class of DataSourceClient?
(e.g. DatabaseClient?) should be instantiated, and if it's
DatabaseClient?, which derived class of DatabaseAccessor? (e.g.
SQLite3Accessor) should be instantiated.

Does "hardcode" mean hardcoding the mapping of
"sqlite3"->DataSourceClient? (and then SQLite3Accessor) in the code?

If so, I guess this is orthogonal to whether we want to make the
datasource backends (especially database-based ones) dynamically
loadable (although they may have related points to be considered).

Regarding the mapping, we could either hardcode it for now or
introduce some kind of dynamic registration from the beginning. I
personally incline to do the latter, but if it turns out to require a
lot of work I also think it's okay to begin with hardcoding.

As for whether we want to make the actual data source implementation
dynamically loadable, I believe we do, especially because both BIND 9
and PowerDNS provide this capability. But I suspect it will be a
subject of (near) future extension, and we'll begin with the
implementation as part of libdatasrc (although I suspect
the additional implementation cost for dynamic loading is not so
high).

comment:4 Changed 8 years ago by jelte

The way I figure dynamic loading would work is that you specify a .so filename (or a filename that is automatically mapped to that .so filename, e.g. by adding _datasrc.so), and that would then be all the mapping would need; the .so is found, called, and initialized through some predefined functions. We get a bunch of .so files that all have the exact same external interface, and no specific mapping from name to class/function should be needed.

This is indeed mostly orthogonal from the factory-API-side (the API for the factory should not need much changes, if any, for either way). It is a nontrivial task though :) (the accessor API will need a bit of tweaking, see also the notes from yesterday: http://bind10.isc.org/wiki/WeeklyMinutes20110920)

comment:5 Changed 8 years ago by jelte

  • Owner set to jelte
  • Status changed from new to assigned

comment:6 Changed 8 years ago by jelte

turns out dlopen is indeed not really that much (more) work, we just need to add a RAII container for the lib reference and destructor function. ds-type specific initializers are needed anyway, so those can go under a fixed name in the generated dynlibs.

The bigger question is how to do config, but see my mail on -dev from earlier today
https://lists.isc.org/pipermail/bind10-dev/2011-September/002621.html

comment:7 Changed 8 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from assigned to reviewing

Ok, so I've made it into a DataSourceClientContainer? class, which serves as both a factory and a container; it autoloads (and clears) a dlopened library (through a LibraryHolder? class for RAII), then initialized it through 'factory' functions with a fixed name implemented by the various libraries, and cleans up through a destructor provided by those libs as well.

See documentation i added in client.h and factory.h

Because memory is still directly used by auth and some other stuff i did have to put in a few temporary tweaks to their makefiles.

comment:8 Changed 8 years ago by jelte

  • Milestone changed from Sprint-20110927 to Sprint-20111011

comment:9 Changed 8 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:10 follow-up: Changed 8 years ago by stephen

  • Owner changed from stephen to jelte

Reviewing version 5e14c4caafaa44b92134c5df01b726f435f46845

src/lib/datasrc/Makefile.am
Just a comment on the file names - the names of the memory data source files in the separate DLL are of the form memory_datasrc.*, but those for the SQLite3 data source are sqlite3_accessor.*. On the other hand, sqlite3_datasrc - and static_datasrc are included in the main library. This is confusing.

src/lib/datasrc/client.h
Good description of DataSourceClientContainer, thanks. It really helps the understanding of the code.

The description of the return value in getSym() should indicate that NULL will be returned if the symbols does not exist.

src/lib/datasrc/factory.h
There are a few #includes commented out.

I assume that the constructor takes a std::string because this is something read from the configuration, whereas getSym takes a char* because the names of the symbols retrieved are literal character strings in the code? (If so, a note in the getSym() header to this effect would be useful.)

Perhaps want to add another exception - DataSourceLibrarySymbolError - to distinguish between a failure to load a library and a failure to access a symbol in the library?

src/lib/datasrc/factory.cc
DataSourceClientContainer constructor: Suggest using reinterpret_cast instead of the C-style casts when casting the returned symbol to the correct type.

Question: in what directory is the DLL located? The reason for this question is that I would see that there are two locations that would be in use. The first is the library directory of the BIND 10 installation (which may be a directory specified on the "configure" command line). The second is a site-specific directory in which local data sources are stored. If the location of the latter were stored in the configuration database somewhere, and that were preserved across BIND 10 upgrades, it would be possible to upgrade BIND 10 without affecting the local data source.

src/lib/datasrc/memory_datasrc.cc
addError(): by the time the code gets here, "errors" should be non-null. (Especially as this is a function in the anonymous namespace and visible only to functions declared below it - and all paths through those functions create the error list.) Under these circumstances, can't "errors->add()" be called directly from within the code?

src/lib/datasrc/sqlite3_accessor.cc
checkConfig() can also be placed in the anonymous namespace (as it is in memory_datasrc.cc) - there is no reason for it to be visible outside the module.

comment:11 in reply to: ↑ 10 Changed 8 years ago by jelte

  • Owner changed from jelte to stephen

src/lib/datasrc/Makefile.am
Just a comment on the file names - the names of the memory data source files in the separate DLL are of the form memory_datasrc.*, but those for the SQLite3 data source are sqlite3_accessor.*. On the other hand, sqlite3_datasrc - and static_datasrc are included in the main library. This is confusing.

Yes. I also noticed some weird stuff while working on 1179, and I think we
should move things around and change some names, but wasn't sure if this was
the place to do it...

I'll create a new ticket for this

src/lib/datasrc/client.h
Good description of DataSourceClientContainer, thanks. It really helps the understanding of the code.

thank you :)

The description of the return value in getSym() should indicate that NULL will be returned if the symbols does not exist.

actually, on error, or if the symbol does not exist, an exception is raised. If
getSym() returns NULL, it means the symbol *does* exist, but has the value NULL
itself. I've added that to the description.

src/lib/datasrc/factory.h
There are a few #includes commented out.

oops, removed two of them. I realized last night that the two new classes
should actually be noncopyable, so I left that one in and actually use it.

I assume that the constructor takes a std::string because this is something read from the configuration, whereas getSym takes a char* because the names of the symbols retrieved are literal character strings in the code? (If so, a note in the getSym() header to this effect would be useful.)

added

Perhaps want to add another exception - DataSourceLibrarySymbolError - to distinguish between a failure to load a library and a failure to access a symbol in the library?

good idea, done

src/lib/datasrc/factory.cc
DataSourceClientContainer constructor: Suggest using reinterpret_cast instead of the C-style casts when casting the returned symbol to the correct type.

ohyeah, meant to do that, i knew i forgot something. It does remind me that i
do not actually know if every compiler accepts this... anyway changed for now.

Question: in what directory is the DLL located? The reason for this question is that I would see that there are two locations that would be in use. The first is the library directory of the BIND 10 installation (which may be a directory specified on the "configure" command line). The second is a site-specific directory in which local data sources are stored. If the location of the latter were stored in the configuration database somewhere, and that were preserved across BIND 10 upgrades, it would be possible to upgrade BIND 10 without affecting the local data source.

hmm. Interesting idea. I do think we can do something to that effect as a
separate enhancement, and for now leave them in the 'main system' standard lib
path.

src/lib/datasrc/memory_datasrc.cc
addError(): by the time the code gets here, "errors" should be non-null. (Especially as this is a function in the anonymous namespace and visible only to functions declared below it - and all paths through those functions create the error list.) Under these circumstances, can't "errors->add()" be called directly from within the code?

see below

src/lib/datasrc/sqlite3_accessor.cc
checkConfig() can also be placed in the anonymous namespace (as it is in memory_datasrc.cc) - there is no reason for it to be visible outside the module.

The answer to both is, 'ack, but for now'. I was actually considering making
checkConfig() an externally visible option, where errors may be null (if you
don't care about the errors but only need to know whether it's good).

Since i wasn't sure we needed it (we may know once we start turning it on), it
now lives halfway between the two options.

So for now here's my proposal; I do indeed move checkConfig to anonymous
namespace, but I leave the check in addError (it's not harmful, it's not on a
critical code path, and it will probably prevent a problem of forgetfullnes
should we decide to export the symbols)

Done in separate commit, you may respectfully disagree ;)

comment:12 Changed 8 years ago by stephen

  • Owner changed from stephen to jelte

Reviewed commit fdf1c88a53f5970aa4e6d55da42303ce7d4730f7

All OK, please merge.

comment:13 Changed 8 years ago by jelte

  • Owner changed from jelte to stephen

When merging master into this branch and running the tests on some different systems, I found a few new things that needed to be done (mostly because master has changed).

Could you take a look at these?

the commits are:
8c838cf57adef3c004b910b086513d9620147692
f7bb760f4d8290d52959ea83b090d1877e4ac9ee
and
00f4c38428153bb5ad99ba1cc40e9a204266dace
(in between are commits merged from master)

first is a similar issue needed for auth; the wrappers in master currently directly initialize sqlite3 datasource (pending 1253, wrappers for this tickets), so I directly compile them in as well for now

second is a nasty segfault i found on sunstudio, it looks like the destruction of an exception raised by
a dlopened() function can cause problems unless explicitely defined (in this case the datasourceconfigerror, but we might need more if we are going to raise other exceptions) If you have any idea why, I'd be happy to learn that :)

third is that cppcheck now fails, on the *old* api.

comment:14 Changed 8 years ago by stephen

  • Owner changed from stephen to jelte

8c838cf57adef3c004b910b086513d9620147692
OK

f7bb760f4d8290d52959ea83b090d1877e4ac9ee
00f4c38428153bb5ad99ba1cc40e9a204266dace
OK, but not sure why the indentation for the methods of the DataSourceXxxxError classes has been removed.

comment:15 Changed 8 years ago by jelte

  • Resolution set to complete
  • Status changed from reviewing to closed

Ohyeah, had addressed that last comment and merged this. Closing ticket.

Note: See TracTickets for help on using tickets.