#5572 closed enhancement (complete)

Design for coherent statistics when running multiple instances

Reported by: tomek
Owned by: tmark
Priority: medium
Milestone: Kea1.4
Component: statistics
Version: git
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DHCP
Feature Depending on Ticket:
Estimated Difficulty: 0
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description

OVERVIEW

This ticket is only loosely related to the HA work Marcin is doing. The HA setup should be covered, but the primary case to cover here is running multiple Kea instances connected to the same database.

The major problem is with assigned addresses, but a similar problem may occur for other stats (e.g. the number of currently declined addresses).

Assume there are 2 servers, a and b, connected to the same db. There are some leases. The problem:

  1. Server a starts, recalculates stats and determines there are 30 addresses in use right now, so it sets its assigned-addresses stat to 30.
  2. Server b starts and does the same. It too sets its own assigned-addresses to 30.
  3. Server a processes a new packet that assigns a new lease. Server a increases its assigned-addresses to 31.

Statistics are now out of sync: server a reports 31 assigned addresses while server b still reports 30.

In particular, depending on how the client traffic is split between the servers, one server may handle more releases and lease expirations than the other. As a result, the assigned-addresses value on that server can become negative over time, as reported by one user.
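The drift described above can be reproduced with a toy model. This is only an illustrative sketch, not Kea code: the `Server` class, the shared set standing in for the lease database, and the addresses are all hypothetical.

```python
class Server:
    """Toy stand-in for one Kea instance that keeps a local counter."""

    def __init__(self, name, shared_leases):
        self.name = name
        self.leases = shared_leases         # shared lease database (a set)
        self.assigned = len(shared_leases)  # stats recount at startup

    def assign(self, addr):
        self.leases.add(addr)
        self.assigned += 1                  # only this instance's counter moves

    def release(self, addr):
        self.leases.discard(addr)
        self.assigned -= 1


db = {f"192.0.2.{i}" for i in range(1, 31)}  # 30 existing leases
a = Server("a", db)
b = Server("b", db)

a.assign("192.0.2.31")                       # step 3 from the list above
after_step3 = (a.assigned, b.assigned, len(db))

# Skewed traffic: server b happens to handle every release.
for addr in sorted(db):
    b.release(addr)
final = (a.assigned, b.assigned, len(db))

print(after_step3)  # (31, 30, 31)
print(final)        # (31, -1, 0)
```

Server b ends up at -1 because it releases a lease it never counted as assigned, while server a still reports 31 for an empty database.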

GOAL
This ticket covers coming up with a proposal for how to solve this problem. It should provide a design, perhaps with written-down requirements, but no code should be written. Depending on the complexity of the proposed solution, we'll determine whether this is something we'll be able to tackle in 1.4 or not.

Change History (9)

comment:1 Changed 21 months ago by tomek

  • Milestone changed from Kea-proposed to Kea1.4

There are quite a few different approaches to solving this problem. Here are my thoughts. Maybe they'll be useful to whoever takes this ticket:

  1. When running more than one Kea instance, we could designate one of them as the instance that returns stats. This on its own is not sufficient, as it would have to be aware of what the other instances are doing.
  2. We could change existing stats or add new stats to better handle the situation. We could have a new stat, leases-assigned-base, set during reconfig (and constant until the next reconfig), and another stat called leases-assigned-delta, which would be increased and decreased according to lease operations. It could go negative.
  3. There would be one entity that would take the single value of leases-assigned-base and sum all values of leases-assigned-delta from each instance. It would work as long as you reconfigure all instances at roughly the same time.
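The base-plus-delta arithmetic from points 2 and 3 can be sketched as follows. The stat names come from the list above; the `total_assigned` helper and the per-instance dictionary are hypothetical and only show the sum the aggregating entity would compute.

```python
def total_assigned(base, deltas):
    """Combine the single base value with every instance's delta."""
    return base + sum(deltas.values())


# leases-assigned-base: computed once at (re)configuration time.
leases_assigned_base = 30

# leases-assigned-delta: one counter per instance, starting at 0.
leases_assigned_delta = {"a": 0, "b": 0}

leases_assigned_delta["a"] += 1  # server a assigns one lease
leases_assigned_delta["b"] -= 3  # server b handles three releases

total = total_assigned(leases_assigned_base, leases_assigned_delta)
print(total)  # 28
```

An individual delta may be negative (server b is at -3 here), but the combined total stays meaningful as long as all instances took their base at roughly the same time.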

The entity handling these statistical gymnastics could be implemented as a hook for ctrl-agent that could hide the details to some degree. Of course there are people who run multiple instances without ctrl-agent, but I think that's ok. This would encourage more people to run ctrl-agent, and if we expose those new stats, we'd give them the tools to come up with the stats on their own.

Finally, there may be more complex computations related to statistics further down the road. One student participating in GSoC came up with an intriguing idea to have predictive models. It's too early to talk about specifics, but it's at least plausible that some time in the future more advanced logic related to statistics will be implemented. A hook would be a great container for that.

comment:2 Changed 21 months ago by tmark

  • Owner set to tmark
  • Status changed from new to assigned

comment:3 Changed 20 months ago by saura

I am the student mentioned above who is participating in GSoC '18 for Kea, ISC.

In my view, apart from the approaches already listed, we could also try to create global statistics variables for data that is common to both instances of Kea.

The only challenge in this implementation is that the delivery of data to the global variable must be very efficient.

Any suggestions or discussion of this approach are welcome. Also, for reference, this is a link to my GSoC proposal: https://docs.google.com/document/d/1IvAHxSJzG_U-Rn2O7Z7KxT4ul8fTCLoVVTqyOzBq2UY/edit?usp=sharing

comment:4 Changed 20 months ago by tmark

  • Owner changed from tmark to UnAssigned
  • Status changed from assigned to reviewing

The design proposal:

http://kea.isc.org/wiki/SharedLeaseStorageStats

is ready for review.

comment:5 Changed 20 months ago by tmark

  • Owner changed from UnAssigned to tmark

I am reclaiming this ticket long enough to gather some data on Kea performance with and without triggers.

comment:6 Changed 20 months ago by tmark

  • Owner changed from tmark to UnAssigned

comment:7 Changed 20 months ago by tmark

Design posted to kea-users for comment

comment:8 Changed 20 months ago by tomek

  • Owner changed from UnAssigned to tmark

Ok, people have had enough time to comment. I have discussed the design (and all the approaches mentioned) and agree that the proposal (triggers + a hook with extra stats retrieval commands) is a good way to solve this problem.

You can close this ticket and carry on with the implementation.
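For readers unfamiliar with the trigger idea, here is a minimal sketch of the general technique: the database itself maintains the counter, so every instance reads the same value. It does not reproduce the actual design on the wiki page. The `lease` and `lease_stat` tables, the trigger names, and the `assigned` helper are hypothetical, and SQLite is used only as a self-contained stand-in for the shared database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lease (address TEXT PRIMARY KEY, subnet_id INTEGER NOT NULL);
CREATE TABLE lease_stat (subnet_id INTEGER PRIMARY KEY,
                         assigned INTEGER NOT NULL);

-- Count up whenever any server inserts a lease.
CREATE TRIGGER lease_ins AFTER INSERT ON lease
BEGIN
    INSERT OR IGNORE INTO lease_stat (subnet_id, assigned)
        VALUES (NEW.subnet_id, 0);
    UPDATE lease_stat SET assigned = assigned + 1
        WHERE subnet_id = NEW.subnet_id;
END;

-- Count down whenever any server deletes a lease.
CREATE TRIGGER lease_del AFTER DELETE ON lease
BEGIN
    UPDATE lease_stat SET assigned = assigned - 1
        WHERE subnet_id = OLD.subnet_id;
END;
""")


def assigned(subnet_id):
    """Read the shared counter; this is what a stats command would return."""
    row = conn.execute(
        "SELECT assigned FROM lease_stat WHERE subnet_id = ?", (subnet_id,)
    ).fetchone()
    return row[0] if row else 0


# Two "servers" are simply two writers to the same database.
conn.execute("INSERT INTO lease VALUES ('192.0.2.1', 1)")      # server a
conn.execute("INSERT INTO lease VALUES ('192.0.2.2', 1)")      # server b
conn.execute("DELETE FROM lease WHERE address = '192.0.2.1'")  # server b
count = assigned(1)
print(count)  # 1
```

Because the counter lives next to the leases, it does not matter which server performs a given insert or delete.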

comment:9 Changed 20 months ago by tmark

  • Resolution set to complete
  • Status changed from reviewing to closed

Ticket is complete. Implementation is underway.