Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#3682 closed enhancement (complete)

Implement MySQL Host Data Source

Reported by: tomek Owned by: kalmus
Priority: medium Milestone: Kea1.0-beta
Component: database-all Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description (last modified by tomek)

We need a class that will be responsible for extracting host information
from MySQL. It should be derived from BaseHostDataSource? (to provide
HostDataSource? interface) and from MySQLConnection (see ticket #3681)
to reuse existing MySQL connection capabilities.

The reasonable name for this class seems to be MySQLHostDataSource.

Subtickets

Attachments (2)

0001-trac3682-MySQL-Host-Data-Source.patch (53.3 KB) - added by tomek 4 years ago.
Adam's patch
0001-3682-Implement-MySQL-Host-Data-Source-first_part.patch (99.3 KB) - added by tomek 4 years ago.
Second patch

Download all attachments as: .zip

Change History (18)

comment:1 Changed 5 years ago by tomek

  • Description modified (diff)
  • Summary changed from Implement MYSQLHostDataSource to Implement MySQL Host Data Source

comment:2 Changed 5 years ago by tomek

  • Milestone changed from Kea-proposed to Kea1.0

comment:3 Changed 5 years ago by tomek

  • Milestone changed from Kea1.0 to Kea0.9.1

comment:4 Changed 5 years ago by marcin

  • Milestone changed from Kea0.9.1 to Kea1.1

comment:5 Changed 5 years ago by kalmus

  • Owner set to kalmus
  • Status changed from new to assigned

comment:6 Changed 4 years ago by kalmus

Patch with MySqlHostDataSource? implementation was sent to Tomek for rewiev.

ChangeLog? entry proposal:

XXX. [func] kalmus
MySqlHostDataSource class implementation. Deriving from BaseHostDataSource
and MySqlConnection provides methods to extract and add Hosts objects from 
and to MySQL database. 
(Trac #3682, git XXX)
Last edited 4 years ago by kalmus (previous) (diff)

comment:7 Changed 4 years ago by kalmus

  • Owner changed from kalmus to tomek
  • Status changed from assigned to reviewing

Changed 4 years ago by tomek

Adam's patch

comment:8 Changed 4 years ago by tomek

  • Owner changed from tomek to kalmus

I did a quick check of this patch and have several basic comments:

  • Makefile is not updated, so those two new files are not built or linked.
  • There are no unit-tests. Each class need unit-tests, especially as large as this one.

comment:9 Changed 4 years ago by fdupont

I have some questions about this proposal:

  • is there a similar patch fro PostgreSQL?
  • when this alternate source will be used? I imagine it will complete the configuration at startup / reconfig but is something more dynamic in plans?
  • how conflicts between two sources are supposed to be handled? (fatal error at the first conflict, all errors reported and failure only at the end? Priority of a source over others? etc)

Last question: when this is supposed to be merged?

Changed 4 years ago by tomek

Second patch

comment:10 Changed 4 years ago by tomek

  • Owner changed from kalmus to tomek

Adam sent me a patch 2 days ago. It was impossible to find a conflict free commit, but I found one on trac3681_rebase that has relatively few conflicts. Then I rebased to latest trac3681_base, created trac3682 from it and fixed several compilation issues. The situation is as follows:

master   trac3682
  |        /
  |   trac3681_rebase
  |      /
merge-base
  |
  |

The way forward is:

  1. solve 2 outstanding comments from Marcin (see ticket:3681:comment:20) (adam)
  2. merge trac3681_rebase to master (tomek)
  3. review trac3682 (tomek)
  4. fix any issues raised in review (adam)
  5. merge 3682 (tomek)

I'll review 3682 as soon as possible.

comment:11 Changed 4 years ago by tomek

I did complete first part of the review. I reviewed all files and changes, except two main files: mysql_host_data_source.cc|h. I will review them tomorrow.


This review covers all changes on branch trac3682 up to and including the latest commit 010187c628e43a88447bceb4ffff9c54ad2744bc.


The code does not compile. The following error appears when trying
to compile code on Ubuntu 15.04 x64.

Command:

git checkout trac3682
autoreconf -i
./configure --with-gtest-source=/home/thomson/devel/gtest-1.7.0 --with-dhcp-mysql
make -j9

Error:

make[4]: Entering directory '/home/thomson/devel/kea/src/bin/dhcp4'
make  all-recursive
make[5]: Entering directory '/home/thomson/devel/kea/src/bin/dhcp4'
Making all in .
make[6]: Entering directory '/home/thomson/devel/kea/src/bin/dhcp4'
make[6]: Nothing to be done for 'all-am'.
make[6]: Leaving directory '/home/thomson/devel/kea/src/bin/dhcp4'
Making all in tests
make[6]: Entering directory '/home/thomson/devel/kea/src/bin/dhcp4/tests'
  CXX      dhcp4_unittests-dora_unittest.o
In file included from ../../../../src/lib/dhcpsrv/mysql_host_data_source.h:20:0,
                 from ../../../../src/lib/dhcpsrv/host_mgr.h:21,
                 from dora_unittest.cc:22:
../../../../src/lib/dhcpsrv/mysql_connection.h:20:19: fatal error: mysql.h: No such file or directory
 #include <mysql.h>
                   ^
compilation terminated.
Makefile:819: recipe for target 'dhcp4_unittests-dora_unittest.o' failed
make[6]: *** [dhcp4_unittests-dora_unittest.o] Error 1
make[6]: Leaving directory '/home/thomson/devel/kea/src/bin/dhcp4/tests'
Makefile:711: recipe for target 'all-recursive' failed
make[5]: *** [all-recursive] Error 1
make[5]: Leaving directory '/home/thomson/devel/kea/src/bin/dhcp4'
Makefile:495: recipe for target 'all' failed
make[4]: *** [all] Error 2

Several comments below recommend removing mysql related includes from headers.
Addressing those may fix this compilation problem.


pgslq_lease_mgr.cc
Changes in convertToDatabaseTime mention DataSource::MAX_DB_TIME, but the
class is now called DatabaseConnection::MAX_DB_TIME.

host_mgr.h
This header includes mysql_host_data_source.h. Is it necessary at all?
I think not, so this include should be removed. If it's really
necessary and it's unavoidable to include it, its inclusion must be
ifdefed. Otherwise it will not compile without --with-dhcp-mysql.

host_data_source_factory.h
HostDataSourceFactory? class must have class comment.

Please do not use tabs. Use spaces instead. My editor is set to tab=8
spaces, so the indentation is broken. To avoid such cases, please use
spaces.

create(), destroy(), instance() methods are not documented properly.
create: There's todo and dbaccess parameter lack description,
instance() should have @return clause.

Note: If a todo is expected to be visible by Doxygen, it should be
written in triple slash comments and with @ at front, e.g.:

   /// @todo Not implemented yet.

host_data_source_factory.cc
HostHostDataSourceFactory::destroy() should not use DHCPSRV_CLOSE_DB,
but use its own log entry instead. It should be clear that this is
host data source being closed and not to be confused with lease
manager.

host_mgr.cc
This file should not include mysql_host_data_source.h, it should
include host_data_source_factory.h instead.

The change in HostMgr::create() is useless. Its parameter should be
named and the code should call HostMgrFactory::create(...). The
todo comment should be removed.

host_mgr.h
This header must not include mysql_host_data_source.h header.

tests/generic_host_data_source_unittest.h
GenericHostDataSourceTest? class should have class comment.

tests/generic_host_data_source_unittest.cc
As currently written, the class is useless. As I understand it,
the plan forward is to develop tests for MySQLHostDataSource.

As MySQLHostDataSource is the first host data source and there
will be others (in particular, we'll get Postgres one day), it
makes sense to write the tests in GenericHostDataSourceTest? class and
call it from MySqlHostDataSourceTest?. That would be similar
approach to what is already done with GenericLeaseMgrTest? and
mysql_lease_mgr_unittest.cc.

mysql_host_data_source_unittest.cc
Need to clean up the destructor. Lines cannot be longer than 100
characters.

MySqlHostDataSourceTest?.checkTimeConversion test should be renamed
to MySqlConnection?.checkTimeConversion.


This concludes the first part of the review. Remaining two files
will be reviewed tomorrow.

comment:12 Changed 4 years ago by tomek

Another round of the review:

mysql_host_data_source.h
CURRENT_VERSION_VERSION, CURRENT_VERSION_MINOR should be defined in a
single place. They are now in mysql_host_data_source.h and
mysql_lease_mgr.h. If you want to version Host data source differently from lease database, please change the names.

MySqlHostDataSource class requires class description.

See Marcin's comments in #3681. He suggested that using multiple
inheritance is problematic. His comment also applies to
MySqlHostDataSource. Fortunately, it's easy to fix that. Convert
MySqlConnection to a member of the MySqlHostDataSource class.

There were many tabs in the code. I converted them to spaces. Please
pull the changes from trac3682.

get4(), get6() - the description of this method talks about throwing exception
if more then one host is found, but does not say which exception it
is.

add() - type of exception thrown is not specified in the comments.

Comment for getVersion() should be clear that it returns version
stored in the database, not hardcoded version of the software.

Comments for StatementIndex values are too long. Our coding guidelines
say that the code should be up to 80 characters if possible, but must
not exceed 100 characters.

I'll continue the review in the next couple hours.

comment:13 Changed 4 years ago by tomek

  • Owner changed from tomek to kalmus

The MySQL schema has dhcp_identifier_type, but its values
are never defined. http://kea.isc.org/wiki/HostReservationDesign
says that two values should be defined: 0 (hardware address),
1 (DUID). Depending on its value, dhcp_identifier contains
either hardware address or DUID.


mysql_host_data_source.cc
There are many lines over 100 characters long. 100 is the maximum
allowed.

DHCP_IDENTIFIER_MAX_LEN can be replaced by DUID::MAX_DUID_LEN.

I think there should be a define for HOSTNAME_MAX_LEN as well, but I
couldn't find it. Possibly it was not defined.

Several calls are incorrectly implemented. GET_HOST_HWADDR_DUID
has only one parametr dhcp_identifier. It should be dhcp_identifier and
dhcp_identifier_type.

MySqlHostDataSource::getAll() sets two parameters (duid and hw address),
but the GET_HOST_HWADDR_DUID query takes only one.

GET_HOST_PREFIX uses left join. I'm not an SQL expert by any means,
so all of the following are honest questions. What is the advantage
of using left join over doing a simple query in form
"SELECT hosts.host_id, dhcp_identifier, dhcp_identifier_type,
dhcp4_subnet_id, dhcp6_subnet_id, ipv4_address, hostname, dhcp4_client_classes,
dhcp6_client_classes FROM hosts, ipv6_reservations WHERE
hosts.host_id = ipv6_reservations.host_id AND ipv6_reservations.address = ?"

MySqlHostReservationExchange(): The identation of field initializations
is odd. It should be indented by 4 characters relative to
MySqlHostReservationExchange (so start in 8th column).

!createBindForSend() - there's commented out code in line 221. I'm on
a train right and can't compile the the code right now, but I believe
the problem here is that you're trying to take a reference of
a constant, which doesn't make sense. Try doing something like

my_bool my_false = MLM_FALSE;
bind_[1].is_null = &my_false;

Anyway, since couple lines earlier you cleared the whole bind_ array,
all fields are initialized to 0. In particular, is_null is initialized
to false, so there's no need to explicitly set it. So it would be better
to put in line 221 there the same comment as for other instances of
commented out is_null.

Error in line 438 converts dhcp identifier to char* and prints it out.
This can be an arbitrary binary data, so the exception could contain
a lot of garbage and potentially even segfault, as there's no
guarantee that dhcp identifier is null terminated.

Comment in line 444 is correct. SubnetId is currently uint32_t, so
you can initialize it as:

SubnetID ipv4_subnet_id_(0);

Conversions in lines 467 and 471 are incorrect in general case.
They will work for a single class. Note that ClientClasses? is
an extension of std::set<string>. If there are multiple classes,
they should be stored somehow in the DB, e.g. coma separated.
The opposite conversion should also be conducted.

getHostData() does not set ipv6_reservations.

Comments for all fields in MySqlHostReservationExchange? should
start with triple slashes, not double. The difference is that
triple slash is a Doxygen comment. / will appear in the
Developer's Guide,
won't.

Line 576 and following contains a comment about time conversion
methods. There are no time conversion methods here. Did you move
the code, but left the comment here?

MySqlHostDataSource::add should throw an exception if addHost
fails.

addHost method has poor interface. It either returns true
or throws an exception. Those two should not be mixed together.
It should either a) return true or false b) be void and throw
exception on failure. I don't really care which approach
the code uses, but it should be consistent with other
uses. That means that making it void and throw if necessary
will probably be the way to go.

The logic in get4() is flawed. inbind[1] will be set to hwaddr
if hwaddr is specified when calling this command. But what
if the database has host entry for duid B and the get4 is
called with get4(hwaddr A, duid B)? The code will set hwaddr
A and then fail to get the host. This flaw is repeated in
get6().

reset() in line 772 is not necessary. The result variable is
already null. The same is true for lines 803, 852, 883, 915 and
possibly others.

getDescription() could be a bit more descriptive.
Returning something like "Host data source that stores
host information in MySQL database" would do the trick here.


There are no unit-tests. We did discuss over e-mail and I was
asked to provide several examples. Here are my suggestions.

The code should be laid out similar to how tests are done
for lease mgrs. Important thing to note here is that MySQL
host data source is the first host data source, there will
be other implementations (e.g. Postgres based) in the future.

One easy way to do it would be to lay out the code like
this:

Define pointer to host data source in base_host_data_source.h

typedef boost::shared_ptr<BaseHostDataSource> HostDataSourcePtr;

generic_host_data_source_unittest.h would look like this:

/// @brief Base class for all host data source implementations
class GenericHostDataSourceTest : public ::testing:Test {

    GenericHostDataSourceTest();

	virtual ~GenericHostDataSourceTest() {
        /// nothing to do here.
	}

	/// @brief Generates example host reservations
	std::vector<HostPtr> generateHosts();

    testCase1();

    testCase2();

    ...;

    HostDataSourcePtr data_src_ptr_;
};

test*() methods implement actual tests, e.g. add a host entry,
then retrieve it and check that retrieved values are correct.
They would assume that data_src_ptr_ is initialized and
points to tested host data source.

This is how mysql_host_data_source_unittest.cc could look like:


class MySQLHostDataSourceTest : public GenericHostDataSourceTest {

    MySQLHostDataSourceTest()
	  : GenericHostDataSourceTest() {

        data_src_ptr_.reset(new MySQLHostDataSource());

		/// initialize or wipe actual MySQL tables
    }

    virtual ~MySQLHostDataSourceTest() {
        /// wipe MySQL tables
        
        /// destroy MySQLHostDataSource object.
        data_src_ptr_.reset();		
    }
};

TEST_F(MySQLHostDataSource, testCase1) {
    testCase1();
}

TEST_F(MySQLHostDataSource, testCase2) {
    testCase2();
}

/// more test cases 

Now, on to the test cases. This list is not exhaustive and should
give you an idea for more test cases. Make sure that all public method are tested. In particular, that means that there should be tests that get all hosts.

  1. basic - add host reserveration, retrieve it by address and confirm that all fields have correct value.
  1. add host reservation with hardware address, retrieve it by hardware address and make sure the values are correct.
  1. add host reservation with client-identifier, retrieve it by client-identifer and make sure the values are correct.
  1. add host reservation with hardware address X, try to retrieve host by client-identifier X, verify that the reservation is not returned.
  1. add host reservation with client identifier X, try to retrieve host by hardware address X, verify that the reservation is not returned.
  1. Add host reservation with hardware address X, try to retrieve host for hardware address X or client identifier Y, verify that the reservation is returned.
  1. Add host reservation with client identifier Y, try to retrieve host for hardware address X or client identifier Y, verify that the reservation is returned.
  1. Add host reservation with an IPv6 address, retrieve it and verify that IPv6 address is retrieved correctly.
  1. Add host reservation with an IPv6 address and IPv6 prefix, retrieve it and verify that both v6 address and prefix are retrieved correctly.
  1. Add host reservation with a multiple v4 client-classes, retrieve it and make sure that all client classes are retrieved properly.
  1. Add host reservation with a multiple v6 client-classes, retrieve it and make sure that all client classes are retrieved properly.
  1. (case of 10 + 11 together)
  1. Insert 100 host reservations, each with unique hardware address, try to retrieve each and make sure that the correct host is returned.
  1. Insert 10 host reservations for a given physical host (the same hardware address), but for different subnets (different subnet-ids). Make sure that getAll() returns them all correctly.
  1. Insert 10 host reservations for a given physical host (the same client-identifier), but for different subnets (different subnet-ids). Make sure that getAll() returns them correctly.
  1. Insert 10 host reservations for different subnets. Make sure that get4(subnet-id, ...), get6(subnet-id,...) calls return correct reservation.
  1. Test for errors, e.g. try to add multiple instances of the same host reservation and verify that the second and following attempts will throw exceptions.

During my review, I found couple of things that I couldn't figure out on my own. These are questions to Marcin, who implemented the BaseHostDataSource? interface.

Marcin: there's getAll4() method that could return multiple hosts from different subnets. Is this a preparation for multi-tenancy? Otherwise I see no reason why the address could be duplicated in different subnets.

Why do we need get6(prefix, prefix_len) call? It would be sufficient to do get6(prefix) as the prefixes can't overlap.


When addressing those questions, make sure you develop your patch against trac3682. I did spend significant time to apply your patch and resolve conflicts. I would appreciate if I didn't have to do it again. If you are not using it already, the git repo that have development branches is git://git.kea.isc.org/kea. See details here: http://kea.isc.org/wiki/GitGuidelines

comment:14 Changed 4 years ago by tomek

  • Description modified (diff)
  • Resolution set to complete
  • Status changed from reviewing to closed

Adam and I were talking about this ticket over e-mail. It's already getting out of proportion. The change set is currently in the 4k lines range. Adam sent a patch couple days ago. I did merge as c8834c62efac2470801c9633b4877a7d12f09fff. It seems to address all the issues raised, except the IPv6 functionality.

After some deliberation, I decided to push off IPv6 to a separate ticket. It's unfortunate to introduce IPv4-IPv6 disparity, but the reality is that IPv6 is more complex. There may be multiple v6 reservations, so they need to be stored in separate SQL table. This brings in a lot of extra complexity - a whole new class similar to MySQLHostExchange is required, also extended logic in add() and all get() methods. This is way beyond what's reasonable for a single ticket. So I made a decision to split this work. The part we have now in 3682 is going to be merged. The IPv6 part will be done in a separate ticket (#4212).

Having said that, I did a number of changes before the code was ready for merge:

  1. removed unnecessary includes (the code didn't compile with MySQL disabled).
  2. added constants for 0 and 1 as interface identifiers
  3. removed many instances of commented out code
  4. implemented some IPv6 unit-tests (that was before I realized how much work to be there is)
  5. fixed indentation and whitespaces

All changes were committed to trac3682 branch. This was branched from master around 4 months ago, so I branched another one (trac3682_rebase) from master and merged trac3682 to it (there were couple conflicts, but I resolved them). I did submit this new branch to our Jenkins build farm and it seems to be working fine, except the problem with distcheck on FreeBSD 10:

Making all in asiolink
--- all-recursive ---
Making all in .
--- libkea_asiolink_la-interval_timer.lo ---
--- libkea_asiolink_la-io_address.lo ---
--- libkea_asiolink_la-io_endpoint.lo ---
--- libkea_asiolink_la-io_service.lo ---
--- libkea_asiolink_la-interval_timer.lo ---
  CXX      libkea_asiolink_la-interval_timer.lo
--- libkea_asiolink_la-io_address.lo ---
  CXX      libkea_asiolink_la-io_address.lo
--- libkea_asiolink_la-io_endpoint.lo ---
CXX      libkea_asiolink_la-io_endpoint.lo
--- libkea_asiolink_la-io_service.lo ---
  CXX      libkea_asiolink_la-io_service.lo
--- libkea_asiolink_la-io_address.lo ---
In file included from ../../../../../src/lib/asiolink/io_address.cc:22:
In file included from /usr/local/include/boost/asio.hpp:21:
In file included from /usr/local/include/boost/asio/basic_datagram_socket.hpp:20:
In file included from /usr/local/include/boost/asio/basic_socket.hpp:20:
In file included from /usr/local/include/boost/asio/basic_io_object.hpp:19:
In file included from /usr/local/include/boost/asio/io_service.hpp:767:
In file included from /usr/local/include/boost/asio/impl/io_service.hpp:71:
In file included from /usr/local/include/boost/asio/detail/task_io_service.hpp:203:
/usr/local/include/boost/asio/detail/impl/task_io_service.ipp:261:42: error: unused parameter 'is_continuation' [-Werror,-Wunused-parameter]
    task_io_service::operation* op, bool is_continuation)

This error is not related to the changes in this ticket, but rather coming out from the fact that the master distcheck uses -Wno-error and the developer version uses -Werror. With the upcoming 1.0 beta freeze and the above explanation in mind, I decided to merge the code.

It is now merged to master. Woohoo! That was one of the longest tickets that managed to get its way to master. Thanks a lot for your hard work, Adam! Closing ticket.

comment:15 Changed 4 years ago by tomek

  • Milestone changed from Kea1.1 to Kea1.0

comment:16 Changed 4 years ago by tomek

  • Milestone changed from Kea1.0 to Kea1.0-beta

Milestone renamed

Note: See TracTickets for help on using tickets.