Opened 9 years ago

Closed 8 years ago

#324 closed enhancement (fixed)

improve SQL data source performance

Reported by: each Owned by: jinmei
Priority: medium Milestone: Sprint-20120417
Component: data source Version:
Keywords: performance Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: sqlite schema upgrade
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 15.5 Internal?: no

Description

Our use of indexes in the SQLite3 data source is apparently suboptimal. During the f2f meeting I met with Bill Karwin, who made some specific recommendations for improvement. This ticket is for testing them.

Subtickets

Attachments (1)

trac324.diff (15.8 KB) - added by stephen 8 years ago.
Suggested changes to src/lib/datasrc/sqlite3*.* files

Download all attachments as: .zip

Change History (31)

comment:1 Changed 9 years ago by stephen

  • Owner changed from each to UnAssigned
  • Status changed from new to assigned

comment:2 Changed 8 years ago by shane

  • Defect Severity set to N/A
  • Sub-Project set to DNS

This ticket makes #414 unnecessary.

comment:3 Changed 8 years ago by shane

  • Feature Depending on Ticket set to sqlite schema upgrade
  • Milestone set to Year 3 Task Backlog

This ticket improves performance and solves a performance regression described in #414, but has a design flaw where the schema is updated by the server code itself.

We should have a separate utility to upgrade schema. This is described in ticket #963, and that ticket should be completed first.

This ticket needs to be reworked so that it does not actually perform the upgrade if an earlier version is detected.

Since the changes are performance related, the software should be able to run with the previous schema, although with a warning.

As a final change, the database used for the tests should probably be built from a text file of SQL statements rather than shipped as a binary.

comment:4 Changed 8 years ago by shane

  • Milestone changed from Year 3 Task Backlog to Next-Sprint-Proposed

Perhaps we could include the relevant bits as part of our performance improvement work?

The schema change is:

  • src/lib/datasrc/sqlite3_datasrc.cc

    a b struct Sqlite3Parameters { 
    5454};
    5555
    5656namespace {
    5757
     58// SQL statements to create a database file. schema version 2.
    5859const char* const SCHEMA_LIST[] = {
     60    "BEGIN EXCLUSIVE TRANSACTION",
    5961    "CREATE TABLE schema_version (version INTEGER NOT NULL)",
    60     "INSERT INTO schema_version VALUES (1)",
     62    "INSERT INTO schema_version VALUES (2)",
    6163    "CREATE TABLE zones (id INTEGER PRIMARY KEY, "
    6264    "name STRING NOT NULL COLLATE NOCASE, "
  • src/lib/datasrc/sqlite3_datasrc.cc

    a b const char* const SCHEMA_LIST[] = { 
    6769    "rname STRING NOT NULL COLLATE NOCASE, ttl INTEGER NOT NULL, "
    6870    "rdtype STRING NOT NULL COLLATE NOCASE, sigtype STRING COLLATE NOCASE, "
    6971    "rdata STRING NOT NULL)",
    70     "CREATE INDEX records_byname ON records (name)",
    71     "CREATE INDEX records_byrname ON records (rname)",
     72    "CREATE INDEX records_byname ON records (zone_id, name, rdtype, "
     73    "sigtype, id, ttl, rdata)",
     74    "CREATE INDEX records_byrname ON records (zone_id, rname, rdtype, name)",
    7275    "CREATE TABLE nsec3 (id INTEGER PRIMARY KEY, zone_id INTEGER NOT NULL, "
    7376    "hash STRING NOT NULL COLLATE NOCASE, "
    7477    "owner STRING NOT NULL COLLATE NOCASE, "
    7578    "ttl INTEGER NOT NULL, rdtype STRING NOT NULL COLLATE NOCASE, "
    7679    "rdata STRING NOT NULL)",
    77     "CREATE INDEX nsec3_byhash ON nsec3 (hash)",
     80    "CREATE INDEX nsec3_byhash ON nsec3 (zone_id, hash, rdtype, "
     81    "id, ttl, rdata)",
     82    "COMMIT",
     83    NULL
     84};

The important bits the changes to the indexes.

comment:5 Changed 8 years ago by shane

Note that we would need to discuss doing this without an update utility!

comment:6 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0.0 to 5

comment:7 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120306

comment:8 Changed 8 years ago by shane

See comment on ticket #963 for a recent discussion about version numbers.

However, for this ticket we should simply change the schema to modify the indexes and add an entry in the release notes.

comment:9 Changed 8 years ago by jinmei

I've taken a closer a look at it, and if I understand it correctly
we cannot solve this problem just by tweaking indexes. In my
experiment the major bottleneck is this query:

const char* const q_count_str = "SELECT COUNT(*) FROM records "

"WHERE zone_id=?1 AND rname LIKE (?2
'%');";

I suspect due to the "LIKE" sqlite3 cannot perform efficient search
using indices.

The new version of sqlite3 datasource also uses a "LIKE" search to see
if a wildcard match is the best one.

So, I suspect that if we really want to solve this problem we should
consider more fundamental restructure of the schema, e.g., storing
each label in a separate column.

In the newer version of implementation, one possible compromise would
be to allow a zone to have an option of "no wildcard match" (and say
it's admin's responsbility to ensure the zone has no wildcard). In
the new version there's no other place that requires select with
"LIKE", and thanks to the cleaner code organiation it looks quite easy
to implement this option. (Another reason for switching to the new
version sooner:-)

For the purpose of this sprint, my suggestion is to give up handling
it. Maybe we can close it with 'wontfix' or just leave it open.

comment:10 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to shane

comment:11 Changed 8 years ago by jinmei

See my followup message at bind10-dev:
https://lists.isc.org/pipermail/bind10-dev/2012-March/003167.html

We can actually solve this issue, and I've submitted the schema and
code changes to the "trac324new" branch. If someone wants to complete
this task by addressing compatibility issue somehow, please do so
(I've already started a new task, so I'll keep working on that one).

I'll make this ticket unassigned.

comment:12 Changed 8 years ago by jinmei

  • Owner changed from shane to UnAssigned

comment:13 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei
  • Status changed from assigned to accepted

comment:14 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to assigned

releasing it for now. if someone wants to take it over and complete it, please do so.
trac324new already includes necessary schema changes. the rest of the work is some cleanup
and to provide some kind of migration tool (IMO an ad hoc one is okay, like
compatcheck/sqlite3-difftbl-check.py)

comment:15 Changed 8 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:16 Changed 8 years ago by stephen

  • Owner changed from stephen to UnAssigned

Putting this back to UnAssigned as I got sidetracked on #963 and never completed this ticket (or checked any files in). As I am now away for over a week, someone else needs to complete it.

Suggested changes for the src/lib/datasrc/sqlite3_*.* files are in the attached patch. These changes get the version from the updated schema_version table, throwing an exception if the version is not 2.0. For the changes to work, all the databases used in the unit tests need to be upgraded to the new version of the schema. It is suggested that the b10-dbutil utility in ticket #963 be used for the conversion.

Note: the schema changes introduced by this ticket - essentially changing STRING columns to TEXT columns - should be accompanied by an increase in the schema version number. The opportunity is being taken to change the schema_version table to hold two columns - "version" and "minor". "version" holds the major version number and is so named so that old code (that expects a single column named "version") will be able to access the table without generating an exception. At the same time, the version number is being updated to 2.0. This means that code that reads "version" from schema_version can do one of two things:

  • If it gets a value of 1, it assumes the database is at schema version 1.0
  • if it gets a value other than one, it can interrogate the "minor" column for the minor version number.

Changed 8 years ago by stephen

Suggested changes to src/lib/datasrc/sqlite3*.* files

comment:17 Changed 8 years ago by jelte

in ticket #1846, we temporarily hardcoded the dbutil to return '1' as the current version, and disabled a number of tests related to the upgrade checking. When this ticket is done, those changes should be reversed (grep for 'Temporarily' in src/bin/dbutil and src/bin/dbutil/tests should point out all the cases)

comment:18 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei
  • Status changed from assigned to accepted

comment:19 Changed 8 years ago by jinmei

this task is now ready for review at branch trac324new2 (it has this
strange branch name for historical reasons).

As long as we are okay with the proposed schema change, the content
of this branch should be basically straightforward. I've applied
Stephen's patch, and re-enabled the dbutil part that was temporarily
disabled. It contains some additional cleanups.

The proposed changelog entry is:

416.?	[bug]*		jinmei, stephen
	Updated the DB schema used in the SQLite3 data source so it can
	use SQL indices more effectively.  The previous schema had several
	issues in this sense and could be very slow for some queries on a
	very large zone (especially for negative answers).  This change
	requires a major version up of the schema; use b10-dbutil to
	upgrade existing database files.  Note: 'make install' will fail
	unless old DB files installed in the standard location have been
	upgraded.
	(Trac #324, git TBD)
Last edited 8 years ago by jinmei (previous) (diff)

comment:20 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:21 Changed 8 years ago by jelte

  • Owner changed from UnAssigned to jelte

comment:22 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Say, looks like this was already wrong in the original, or I may be reading this wrong, but it seems to me that the createDatabase() functions can leave an exclusive transaction active (if it was racing for the exclusive lock, and lost).

Minor (heh) naming issue; perhaps the member variables version_ and minor_ should be renamed for consistency (to either major_ and minor_ or version_major_ and version_minor_)

Also perhaps we should set the major version in testdata/newschema.sqlite3 to something we never expect to actually reach (so that it won't fail on this test the moment we change schema version again :) ). Though of course it can be updated then as well. Come to think of it, all these db's need updating then. You were saying something about creating these in the tests themselves? :)

changelog message final sentence should say 'files installed in the standard location have been' instead of 'has been' :)

For the rest this looks OK to me.

comment:23 in reply to: ↑ 22 ; follow-up: Changed 8 years ago by jinmei

Replying to jelte:

Thanks for the review.

Say, looks like this was already wrong in the original, or I may be reading this wrong, but it seems to me that the createDatabase() functions can leave an exclusive transaction active (if it was racing for the exclusive lock, and lost).

You are right, it's not safe. You're also right in that this is a
separate bug (not a regression introduced in this branch), and I
personally think we should stop creating the schema in the accessor
anyway (and the old implementation should be deprecated too), and this
error is quite unlikely to happen, so I was not sure if we really want
to fix it, especially in this ticket. Another reason I hesitated to
do it is that it would be very difficult to test because "the error is
unlike to happen".

That said, I tried to fix it in the branch anyway (4df83ec). But I
couldn't come up with a reasonable test scenario, so I didn't add any
test. So, I'm okay either with or without incorporating this
particular change to master.

Minor (heh) naming issue; perhaps the member variables version_ and minor_ should be renamed for consistency (to either major_ and minor_ or version_major_ and version_minor_)

Yeah I had the same impression. I still kept the inconsistent
variable naming in the initial implementation so they match the DB
column names. But now that you also pointed out the same thing, I
changed the variable names. (In any event they are internal
variables, so we can easily change them).

Also perhaps we should set the major version in testdata/newschema.sqlite3 to something we never expect to actually reach (so that it won't fail on this test the moment we change schema version again :) ). Though of course it can be updated then as well. Come to think of it, all these db's need updating then. You were saying something about creating these in the tests themselves? :)

As for newschema.sqlite3, yes, that's a good point. I changed the
version to 1000. Regarding the need for updating test DB files in
general, I think it's better to create test data at build/run time
dynamically (except for the intentionally unusual/broken ones). But
I believe it's beyond the scope of this task and should go to a
separate ticket (maybe one of "hardening" things).

changelog message final sentence should say 'files installed in the standard location have been' instead of 'has been' :)

Ah, good catch, thanks. I'll update the proposed text directly so I
can simply copy-paste it later.

For the rest this looks OK to me.

Okay, thanks. I'm giving the ticket back to you as I made a few non
trivial changes.

comment:24 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:25 in reply to: ↑ 23 ; follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Replying to jelte:

That said, I tried to fix it in the branch anyway (4df83ec). But I
couldn't come up with a reasonable test scenario, so I didn't add any
test. So, I'm okay either with or without incorporating this
particular change to master.

this looks fine and I propose we include it. One trivial change could be to add Exclusive to the class name so we don't accidentally use it for normal transactions in the future. But i'll leave that up to you :)

Yeah I had the same impression. I still kept the inconsistent
variable naming in the initial implementation so they match the DB
column names. But now that you also pointed out the same thing, I
changed the variable names. (In any event they are internal
variables, so we can easily change them).

ack

Also perhaps we should set the major version in testdata/newschema.sqlite3 to something we never expect to actually reach (so that it won't fail on this test the moment we change schema version again :) ). Though of course it can be updated then as well. Come to think of it, all these db's need updating then. You were saying something about creating these in the tests themselves? :)

As for newschema.sqlite3, yes, that's a good point. I changed the
version to 1000. Regarding the need for updating test DB files in
general, I think it's better to create test data at build/run time
dynamically (except for the intentionally unusual/broken ones). But
I believe it's beyond the scope of this task and should go to a
separate ticket (maybe one of "hardening" things).

ok. Speaking of which, I found one other problem; the zonemgr tests have the side-effect of doing just this, by creating a database called initdb.file, and leaving it (until you run make clean). So if you switch to and from this branch (or later, if you pull the merge of this into master) and don't run make clean, these tests will fail with a database error.

AFAICT, this file does not need to be kept, and we might as well delete it at the start of make check, i.e. something like

diff --git a/src/bin/zonemgr/tests/Makefile.am b/src/bin/zonemgr/tests/Makefile.
index 8c6b904..1903995 100644
--- a/src/bin/zonemgr/tests/Makefile.am
+++ b/src/bin/zonemgr/tests/Makefile.am
@@ -17,6 +17,7 @@ if ENABLE_PYTHON_COVERAGE
        rm -f .coverage
        ${LN_S} $(abs_top_srcdir)/.coverage .coverage
 endif
+       rm -f initdb.file
        for pytest in $(PYTESTS) ; do \
        echo Running test: $$pytest ; \
        $(LIBRARY_PATH_PLACEHOLDER) \

Okay, thanks. I'm giving the ticket back to you as I made a few non
trivial changes.

With these things I think it can be merged.

comment:26 in reply to: ↑ 25 Changed 8 years ago by jinmei

Replying to jelte:

That said, I tried to fix it in the branch anyway (4df83ec). But I
couldn't come up with a reasonable test scenario, so I didn't add any
test. So, I'm okay either with or without incorporating this
particular change to master.

this looks fine and I propose we include it. One trivial change could be to add Exclusive to the class name so we don't accidentally use it for normal transactions in the future. But i'll leave that up to you :)

Okay. I'll keep the class name simply because I believe we should
stop the DB creation here pretty soon.

As for newschema.sqlite3, yes, that's a good point. I changed the
version to 1000. Regarding the need for updating test DB files in
general, I think it's better to create test data at build/run time
dynamically (except for the intentionally unusual/broken ones). But
I believe it's beyond the scope of this task and should go to a
separate ticket (maybe one of "hardening" things).

ok. Speaking of which, I found one other problem; the zonemgr tests have the side-effect of doing just this, by creating a database called initdb.file, and leaving it (until you run make clean). So if you switch to and from this branch (or later, if you pull the merge of this into master) and don't run make clean, these tests will fail with a database error.

AFAICT, this file does not need to be kept, and we might as well delete it at the start of make check, i.e. something like

Hmm. While this would solve this particular situation, more
fundamentally I think this file must be always cleaned up after each
test case (and it probably has to be assured that it doesn't exist at
the beginning of each test case). Further, ideally I guess we should
even avoid creating such a file with help of some abstraction.

I revised the branch with the first type of cleanup. I think The
second one is beyond the scope of this ticket. The change is not
trivial, so I'll give the ticket back to you once again.

comment:27 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:28 follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

ok, looks good, please go ahead and merge :)

comment:29 in reply to: ↑ 28 Changed 8 years ago by jinmei

Replying to jelte:

ok, looks good, please go ahead and merge :)

Thanks, merge done, closing.

comment:30 Changed 8 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 0 to 15.5
Note: See TracTickets for help on using tickets.