Opened 2 years ago

Closed 2 years ago

#5487 closed enhancement (complete)

Cassandra must support recount leases statistics

Reported by: tomek
Owned by: tmark
Priority: high
Milestone: Kea1.4
Component: database-cassandra
Version: git
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DHCP
Feature Depending on Ticket:
Estimated Difficulty: 0
Add Hours to Ticket: 16
Total Hours: 0
Internal?: no


Kea has a mechanism for maintaining lease statistics (number of leases assigned, declined, available, etc.). There are several scenarios in which these statistics may become incorrect:

  • the statistics are kept only at runtime, so after a restart all counters are set to 0
  • when someone modifies the DB directly, bypassing Kea
  • when the configuration has changed and the new pools are bigger/smaller

For that reason, Kea supports a statistics recalculation mechanism. This mechanism is triggered during start-up and after reconfiguration, and soon it will be possible to trigger it with an administrative command.

Cassandra doesn't support this feature yet, although it should. See LeaseMgr::recountLeaseStats{4,6} and possibly CqlLeaseMgrTest.recountLeaseStats{4,6} (present, but disabled).


Change History (10)

comment:1 Changed 2 years ago by razvan.becheriu

  • Owner set to razvan.becheriu
  • Status changed from new to accepted

comment:2 Changed 2 years ago by tomek

  • Owner changed from razvan.becheriu to tmark
  • Status changed from accepted to assigned

As discussed on the 2018-01-17 call, reassigning this ticket to Thomas.

comment:3 Changed 2 years ago by tmark

  • Add Hours to Ticket changed from 0 to 16
  • Owner changed from tmark to UnAssigned
  • Status changed from assigned to reviewing

There is an underlying issue with Cassandra and the potential for performance impacts when recomputing lease stats. For MySQL and Postgres backends, we are able to offload the computation of the stats to the RDBMS using standard SQL tactics of "group by" and "order by".

Cassandra, however, restricts aggregation and ordering to the partition key (i.e. a component of the primary key). This stems from Cassandra's distributed data model. If data is spread among many nodes, aggregating on columns that are not used to partition data is problematic, as one needs the entire "table" content to do it properly.

The current solution is for Kea to query for the subnet id, lease type (v6 only), and lease state for all leases and then do the aggregation itself. Obviously, for sites that have millions of leases this could have an enormous impact. It depends upon the bandwidth between Kea and the Cassandra nodes, and the frequency with which recounting is done. Currently, recounts are done every time a new configuration is successfully committed. On systems with a lot of bandwidth that do not frequently replace their configuration it may not be that much of a concern.
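To illustrate the client-side aggregation described above, here is a minimal sketch. It is not the code merged for this ticket: the LeaseRow struct, the StatKey and StatCounts aliases, and the aggregateLeaseStats function are hypothetical names invented for this example, and the rows are assumed to have already been fetched from Cassandra. It only shows the "group by" step done in memory, with std::map providing the ordering that "order by" would give in SQL.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical row shape: only the columns the recount needs.
struct LeaseRow {
    uint32_t subnet_id;
    uint32_t lease_type;  // meaningful for v6 only; assumed 0 for v4
    uint32_t state;
};

// Key is (subnet id, lease type, lease state); value is the lease count.
using StatKey = std::tuple<uint32_t, uint32_t, uint32_t>;
using StatCounts = std::map<StatKey, uint64_t>;

// Counts leases per (subnet id, lease type, state). Every row must be
// visited once, which is why the cost grows with the number of leases.
StatCounts aggregateLeaseStats(const std::vector<LeaseRow>& rows) {
    StatCounts counts;
    for (const auto& row : rows) {
        ++counts[std::make_tuple(row.subnet_id, row.lease_type, row.state)];
    }
    return counts;
}
```

Because every lease row has to cross the network before it can be counted, the transfer rather than the loop itself is the likely bottleneck, which matches the bandwidth concern raised above.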

Toward that end, we should add some configurable behavior such that recounts can be done upon startup, after reconfig, or only on-demand. The last option would require a new command to be added to instigate statistics recounting. This work should be done under separate ticket(s).

Longer term, we may need to improve the current solution. There is talk of better support for aggregation in Cassandra. We may find a more efficient way to do it based on extensions or as we gain Cassandra experience. Or we may have to consider persistence of lease stats in Cassandra via triggers to lease4 and lease6 table updates.

Ticket is ready for review.

comment:4 Changed 2 years ago by fdupont

  • Owner changed from UnAssigned to fdupont

comment:5 Changed 2 years ago by fdupont

  • Owner changed from fdupont to tmark

Reviewed. Fixed spelling. As I use macOS, I had to apply the #5494 patch and add some missing override keywords. I added a check for support of override (we already have final, so I don't expect a problem) and constexpr (clearly more specific to C++11 and later) so we can use them outside the Cassandra code.

Code is OK and the 2 new unit tests passed (slowly), but please consider getting #5494 reviewed and merged before this ticket.

comment:6 Changed 2 years ago by fdupont

Two additional comments:

  • I have to fix src/lib/dhcpsrv/ because of the use of MD5 by the Cassandra host table (more in its ticket)
  • I fully agree with the idea of having the database itself manage the state counters: it is obviously the right solution in the long term.

comment:7 Changed 2 years ago by fdupont

  • Priority changed from medium to high

Raised the priority of this ticket because it blocks other things.

comment:8 follow-up: Changed 2 years ago by tmark

This ticket would block nothing if you had not merged unrelated changes into it. It is one of the primary reasons why WE DO NOT DO THAT.

Last edited 2 years ago by tmark

comment:9 in reply to: ↑ 8 Changed 2 years ago by fdupont

Replying to tmark:

This ticket would block nothing if you had not merged unrelated changes into it. It is one of the primary reasons why WE DO NOT DO THAT.

=> yes, I should have reopened the last ticket where the problem was introduced. I tried to go too fast so I got the opposite result...

comment:10 Changed 2 years ago by tmark

  • Resolution set to complete
  • Status changed from reviewing to closed

Changes merged with git c807388d581ee1c3e479324f3c399f27feba1c96.
Added ChangeLog entry #1361.

Ticket is complete.
