#5651 closed enhancement (complete)

Extend lease cmds library to support fetch limits and ranges

Reported by: marcin Owned by: marcin
Priority: medium Milestone: Kea1.5
Component: hook-lease-cmds Version: git
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DHCP Feature Depending on Ticket:
Estimated Difficulty: 0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

One of the major use cases for the lease_cmds hooks library is to provide a way to synchronize leases between HA enabled servers. Currently the HA hooks library will fetch the entire lease database which requires the lease_cmds hooks library to create a JSON structure of the whole lease database. This eats the CPU and memory. In case of large number of leases in the database it may freeze the server for a long period of time.

In order to mitigate this issue the lease_cmds hooks librart must support fetching limited number of leases, e.g. 1000, 2000 leases etc. The controlling client should be able to specify last fetched leases with the limit and the server should return leases with addresses beyond this last fetched address. That way, the entire lease database may be returned in chunks with client specifying the start of the next chunk.

This ticket is about implementing this on the lease_cmds hooks library side.

Subtickets

Change History (13)

comment:1 Changed 22 months ago by marcin

  • Component changed from hooks to hook-lease-cmds

comment:2 Changed 22 months ago by marcin

  • Milestone changed from Kea-proposed to Kea1.5

comment:3 Changed 22 months ago by marcin

  • Owner set to marcin
  • Status changed from new to accepted

comment:4 Changed 21 months ago by marcin

  • Owner changed from marcin to UnAssigned
  • Status changed from accepted to reviewing

I extended all lease database backends to support paging leases. I also extended them to support fetching leases by address ranges. However, I left it unimplemented for Cassandra/IPv6 case, which is not as easy as one would think. I initially planned to add commands for both fetching leases by pages and ranges, but this ticket already grew to a size which made me think that it is better to implement commands that fetch by address ranges to another ticket (#5662).

This ticket adds lease4-get-page and lease6-get-page commands which allow for fetching all leases with multiple (perhaps many) calls. These are the commands that will be used in our HA library instead of lease4-get-all and lease6-get-all.

Proposed ChangeLog entry:

14XX.	[func]		marcin
	Implemented lease4-get-page and lease6-get-page commands
	in lease_cmds hooks library.
	(Trac #5651, git cafe)

comment:5 Changed 21 months ago by tmark

  • Owner changed from UnAssigned to tmark

comment:6 follow-up: Changed 21 months ago by tmark

It wouldn't be one of my reviews, if I didn't tell you to change the name of something:

The argument "count" is mis-named. It is a limit or maximum upon which to stop. Normally, "count"s are tallies of what you have. I would prefer "limit" to "count", it is more accurate and also a label people will recognize from SQL.


Returning the total lease count on every page is going to be pretty costly at runtime for large lease databases. I think this is performance hit with little real value to a user. Cassandra and Memfile both have to count lease stats every time you run the lease stats query. MySQL and PostgreSQL still
have to roll up the results into totals.

A client fetching all leases in pages should just fetch until EOF. If they want to know need to know how many to expect, they can explicitly issue a stats query first. In fact, they might very well do that to decide on the page size (i.e. "limit" :) to use.

For an admin who wants to see a dozen leases past address-X, the total number of leases in the DB is pretty useless.

I think we want this as efficient as possible, especially for HA, and should drop the total count from the response.


I think to state that order for leases cannot be guaranteed should be limited to apply only to Cassandra, as all the other back ends can and do guarantee it. I had to play with it again to recall that:

message="ORDER BY is only supported when the partition key is restricted by an EQ or an IN."

Heavy sigh. Did I mention I hate Cassandra? So far it's useful how?


  1. The use of INET6_ATON() has an availability issue. It was not part of MariaDB (often used in place of MySQL) until verion 10.0.12, which was released t in 2014. Yes, that's a long time ago, but for Centos 7 as an example, the yum package is MariaDB 5.5, hence:

MariaDB [test_1]> select INET6_ATON(address) from lease6;
ERROR 1305 (42000): FUNCTION test_1.INET6_ATON does not exist

You have to install 10 another way, which I haven't done yet:

https://mariadb.com/resources/blog/installing-mariadb-10-centos-7-rhel-7

It appears that getLeases6(from, to) is not used other than it is tested. If we do not use it anywhere I'd suggest getting rid of it or not use INET6_ATON for it.


MySQl an PostgreSQL GET_LEASE4_RANGE statements do not include order by address clause. It may be implicit because of the where but generally it is better to explicitly tell the back end what you want done. But again, these are not apparently used in live code and I would suggest simply dropping them.


src/hooks/dhcp/lease_cmds/lease_cmds.cc
src/hooks/dhcp/lease_cmds/lease_cmds.h

/ These commands attempt to retrieve 1 page of all IPv4 or IPv6

Acutally they do not fetch page N or ALL, they simply fetch a subset of at
most N leases beginning with those greater than the given lease.

The fact that HA will iteratively use it to fetch all of them is another matter.


Unit tests pass for PosgreSQL but fall down for MariaDB 5.5.

comment:7 Changed 21 months ago by tmark

  • Owner changed from tmark to marcin

comment:8 in reply to: ↑ 6 ; follow-up: Changed 21 months ago by marcin

  • Owner changed from marcin to tmark

Replying to tmark:

It wouldn't be one of my reviews, if I didn't tell you to change the name of something:

The argument "count" is mis-named. It is a limit or maximum upon which to stop. Normally, "count"s are tallies of what you have. I would prefer "limit" to "count", it is more accurate and also a label people will recognize from SQL.

Sure. I always leave one thing like this for you to cover real issues. lol.

One could argue that most of the time count in the request is equal to count in the response. So, why bother naming them differently. But, I changed the count to limit to make you happy.


Returning the total lease count on every page is going to be pretty costly at runtime for large lease databases. I think this is performance hit with little real value to a user. Cassandra and Memfile both have to count lease stats every time you run the lease stats query. MySQL and PostgreSQL still
have to roll up the results into totals.

I think the issue you're bringing with Memfile is not a real issue, given that everything is in memory. The overhead is relatively low.

A client fetching all leases in pages should just fetch until EOF. If they want to know need to know how many to expect, they can explicitly issue a stats query first. In fact, they might very well do that to decide on the page size (i.e. "limit" :) to use.

The size of the page, i.e. limit is rather driven by the time and intensity of preparing a single page of data, rather than fixed division of total number of leases per how many pages I want to get.

For an admin who wants to see a dozen leases past address-X, the total number of leases in the DB is pretty useless.

I guess, if someone is paging leases he usually is interested in broader scope than dozen leases. This is not the type of a query which returns a range of specific addresses. The addresses returned are random in general case. So, I'd argue that in most cases this mechanism is used to fetch the whole database or large part of it.

I think we want this as efficient as possible, especially for HA, and should drop the total count from the response.

I dropped it, but I am still thinking to make a stats query in the HA library to know how many leases are to come. This is much better than freezing the server for a minute or two without saying anything.


I think to state that order for leases cannot be guaranteed should be limited to apply only to Cassandra, as all the other back ends can and do guarantee it. I had to play with it again to recall that:

message="ORDER BY is only supported when the partition key is restricted by an EQ or an IN."

Heavy sigh. Did I mention I hate Cassandra? So far it's useful how?

I disagree with limiting the guarantee of ordering to Cassandra. I don't want people to rely on the ordering of leases. We may one day change the implementation of MySQL, PgSQL queries if we find that ORDER BY is a performance killer. The use case for leases paging is not to view ranges of addresses but to fetch stuff in chunks. The order of this stuff shouldn't matter. Otherwise, people will start building their systems on top of it and any change in the behavior would come as a big surprise.


  1. The use of INET6_ATON() has an availability issue. It was not part of MariaDB (often used in place of MySQL) until verion 10.0.12, which was released t in 2014. Yes, that's a long time ago, but for Centos 7 as an example, the yum package is MariaDB 5.5, hence:

MariaDB [test_1]> select INET6_ATON(address) from lease6;
ERROR 1305 (42000): FUNCTION test_1.INET6_ATON does not exist

You have to install 10 another way, which I haven't done yet:

https://mariadb.com/resources/blog/installing-mariadb-10-centos-7-rhel-7

It appears that getLeases6(from, to) is not used other than it is tested. If we do not use it anywhere I'd suggest getting rid of it or not use INET6_ATON for it.

I removed queries by ranges.


MySQl an PostgreSQL GET_LEASE4_RANGE statements do not include order by address clause. It may be implicit because of the where but generally it is better to explicitly tell the back end what you want done. But again, these are not apparently used in live code and I would suggest simply dropping them.

I removed this code. As for the ordering, I didn't want to get them in any particular order because it puts additional overhead on the backend. But, it doesn't matter anymore.


src/hooks/dhcp/lease_cmds/lease_cmds.cc
src/hooks/dhcp/lease_cmds/lease_cmds.h

/ These commands attempt to retrieve 1 page of all IPv4 or IPv6

Acutally they do not fetch page N or ALL, they simply fetch a subset of at
most N leases beginning with those greater than the given lease.

I slightly updated this comment.

The fact that HA will iteratively use it to fetch all of them is another matter.

This is primary use case for this feature. If not HA, something doing similar thing.


Unit tests pass for PosgreSQL but fall down for MariaDB 5.5.

comment:9 in reply to: ↑ 8 Changed 21 months ago by tmark

  • Owner changed from tmark to marcin

Replying to marcin:

Replying to tmark:

It wouldn't be one of my reviews, if I didn't tell you to change the name of something:

The argument "count" is mis-named. It is a limit or maximum upon which to stop. Normally, "count"s are tallies of what you have. I would prefer "limit" to "count", it is more accurate and also a label people will recognize from SQL.

Sure. I always leave one thing like this for you to cover real issues. lol.

You're too kind.

One could argue that most of the time count in the request is equal to count in the response. So, why bother naming them differently. But, I changed the count to limit to make you happy.


Returning the total lease count on every page is going to be pretty costly at runtime for large lease databases. I think this is performance hit with little real value to a user. Cassandra and Memfile both have to count lease stats every time you run the lease stats query. MySQL and PostgreSQL still
have to roll up the results into totals.

I think the issue you're bringing with Memfile is not a real issue, given that everything is in memory. The overhead is relatively low.

A client fetching all leases in pages should just fetch until EOF. If they want to know need to know how many to expect, they can explicitly issue a stats query first. In fact, they might very well do that to decide on the page size (i.e. "limit" :) to use.

The size of the page, i.e. limit is rather driven by the time and intensity of preparing a single page of data, rather than fixed division of total number of leases per how many pages I want to get.

For an admin who wants to see a dozen leases past address-X, the total number of leases in the DB is pretty useless.

I guess, if someone is paging leases he usually is interested in broader scope than dozen leases. This is not the type of a query which returns a range of specific addresses. The addresses returned are random in general case. So, I'd argue that in most cases this mechanism is used to fetch the whole database or large part of it.

I think we want this as efficient as possible, especially for HA, and should drop the total count from the response.

I dropped it, but I am still thinking to make a stats query in the HA library to know how many leases are to come. This is much better than freezing the server for a minute or two without saying anything.

Waiting for consensus of the group for progress bar.


I think to state that order for leases cannot be guaranteed should be limited to apply only to Cassandra, as all the other back ends can and do guarantee it. I had to play with it again to recall that:

message="ORDER BY is only supported when the partition key is restricted by an EQ or an IN."

Heavy sigh. Did I mention I hate Cassandra? So far it's useful how?

I disagree with limiting the guarantee of ordering to Cassandra. I don't want people to rely on the ordering of leases. We may one day change the implementation of MySQL, PgSQL queries if we find that ORDER BY is a performance killer. The use case for leases paging is not to view ranges of addresses but to fetch stuff in chunks. The order of this stuff shouldn't matter. Otherwise, people will start building their systems on top of it and any change in the behavior would come as a big surprise.

Ok, leave the text in.


  1. The use of INET6_ATON() has an availability issue. It was not part of MariaDB (often used in place of MySQL) until verion 10.0.12, which was released t in 2014. Yes, that's a long time ago, but for Centos 7 as an example, the yum package is MariaDB 5.5, hence:

MariaDB [test_1]> select INET6_ATON(address) from lease6;
ERROR 1305 (42000): FUNCTION test_1.INET6_ATON does not exist

You have to install 10 another way, which I haven't done yet:

https://mariadb.com/resources/blog/installing-mariadb-10-centos-7-rhel-7

It appears that getLeases6(from, to) is not used other than it is tested. If we do not use it anywhere I'd suggest getting rid of it or not use INET6_ATON for it.

I removed queries by ranges.


MySQl an PostgreSQL GET_LEASE4_RANGE statements do not include order by address clause. It may be implicit because of the where but generally it is better to explicitly tell the back end what you want done. But again, these are not apparently used in live code and I would suggest simply dropping them.

I removed this code. As for the ordering, I didn't want to get them in any particular order because it puts additional overhead on the backend. But, it doesn't matter anymore.


src/hooks/dhcp/lease_cmds/lease_cmds.cc
src/hooks/dhcp/lease_cmds/lease_cmds.h

/ These commands attempt to retrieve 1 page of all IPv4 or IPv6

Acutally they do not fetch page N or ALL, they simply fetch a subset of at
most N leases beginning with those greater than the given lease.

Less code. Cool. Except that the SQL statements are still present so MySQL stuff stillf falls down.
PostgreSQL statements are still present also.

I slightly updated this comment.

The fact that HA will iteratively use it to fetch all of them is another matter.

This is primary use case for this feature. If not HA, something doing similar thing.

Thanks


Unit tests pass for PosgreSQL but fall down for MariaDB 5.5.

Crash. Boom.

comment:10 follow-up: Changed 21 months ago by marcin

  • Owner changed from marcin to tmark

I removed the unused queries. Can you try again?

comment:11 in reply to: ↑ 10 Changed 21 months ago by tmark

  • Owner changed from tmark to marcin

Replying to marcin:

I removed the unused queries. Can you try again?

Ok, that's better. Unit tests pass now. Go ahead and merge.

comment:12 Changed 21 months ago by fdupont

I saw some comments about ordering: IMHO this comes with the idea of chunks: if you partition entries into chunks you need a notion of "next chunk" so at least an order between chunks...

comment:13 Changed 21 months ago by marcin

  • Resolution set to complete
  • Status changed from reviewing to closed

Merged with commit b056828212f7b206ff8bd07c097fd6f427d22d71.

Note: See TracTickets for help on using tickets.