#opendaylight-clustering: clustering hackers
Meeting started by colindixon at 15:08:17 UTC
(full logs).
Meeting summary
- agenda bashing (colindixon, 15:08:20)
- Jan wants an update on the progress on BUG-5421
for the singleton app in clustering (colindixon,
15:08:51)
- ashutosh wants to cover the leader election
issues (colindixon,
15:11:16)
- leader elections repeatedly happening (colindixon, 15:11:52)
- ashutosh says they're seeing spurious elections
at high loads (I think 2 million txns per second) (colindixon,
15:12:27)
- the simplest idea seems to be to split out the
heartbeat actor from the RAFT actor so it doesn't get bogged
down (colindixon,
15:12:59)
- TomP says that they tried moving to one
actor per shard, but that resulted in a huge performance hit (not
sure if it was a 30% performance loss or a drop to 30% of
performance) (colindixon,
15:15:32)
- the result is that this prototype was
abandoned (colindixon,
15:16:13)
- rovarga notes that the performance issues we're
seeing really come from us not pipelining transactions
(colindixon,
15:19:38)
- rovarga says that we have an internal queue to
the CDS which tracks the transactions which have been accepted, but
not yet replicated, persisted, and committed; the only issue is that
right now the RAFT actor doesn't offer an asynchronous way to do
persist and replicate (colindixon,
15:28:39)
- rovarga thinks that the only things which need
to be synchronously persisted are internal to the RAFT actor (though
they will also force syncing the prior data to disk to maintain
order as part of Akka persistence) (colindixon,
15:29:56)
- assuming internal synchronous persistence events
are relatively rare compared to user data asks (which is real), that
is likely to help performance a lot (colindixon,
15:30:30)
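A rough illustration of the pipelining idea in the items above, not the actual CDS or RAFT actor code: transactions are accepted into a queue and handed to an asynchronous persist call, and they are committed in order as the writes complete. The names FakeJournal, persist_async, flush and PipelinedShard are hypothetical and exist only for this sketch; the sketch also assumes the journal completes writes in submission order.

```python
from collections import deque


class FakeJournal:
    """Hypothetical stand-in for an asynchronous persist-and-replicate API."""

    def __init__(self):
        self._inflight = deque()

    def persist_async(self, entry, on_done):
        # Return immediately; the callback fires later, when flush() runs.
        self._inflight.append((entry, on_done))

    def flush(self, count=None):
        # Complete up to `count` outstanding writes, in submission order.
        n = len(self._inflight) if count is None else min(count, len(self._inflight))
        for _ in range(n):
            entry, on_done = self._inflight.popleft()
            on_done(entry)


class PipelinedShard:
    """Accepts transactions without waiting for earlier ones to commit."""

    def __init__(self, journal):
        self._journal = journal
        self.pending = deque()   # accepted, not yet committed
        self.committed = []      # committed, in acceptance order

    def submit(self, txn):
        self.pending.append(txn)
        self._journal.persist_async(txn, self._on_persisted)

    def _on_persisted(self, txn):
        # Writes complete in order (an assumption of this sketch), so the
        # head of the queue is the transaction that just finished.
        assert self.pending[0] == txn
        self.committed.append(self.pending.popleft())


journal = FakeJournal()
shard = PipelinedShard(journal)
for txn in ("t1", "t2", "t3"):
    shard.submit(txn)            # none of these calls wait on the journal

journal.flush(2)                 # two of the three writes complete
print(shard.committed, list(shard.pending))
```

With a synchronous persist, each submit would have to wait for its own write before the next transaction could be accepted; here the three submits return immediately and the queue tracks what is still in flight.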
- singleton app template progress (colindixon, 15:30:47)
- it seems like with one comment from Robert
around the API, things look good (colindixon,
15:31:04)
- TomP says that there are two other patches he needs
to do to move the EOS service to the new MD-SAL APIs, which are
blocked on moving those APIs now (colindixon,
15:32:06)
- vaclav says that the new MD-SAL APIs should be
merged either now, or very shortly (colindixon,
15:32:39)
- https://git.opendaylight.org/gerrit/#/q/owner:%22Vaclav+Demcak+%253Cvaclav.demcak%2540pantheon.sk%253E%22
the relevant changes are the ones that mention Bug 5421 here (colindixon,
15:34:05)
- https://bugs.opendaylight.org/show_bug.cgi?id=5421
this is the bug (colindixon,
15:35:15)
- Jan asks when this will be done; TomP says he
hopes it will be ready for apps to start using by the end of the
week (colindixon,
15:36:05)
- Jan is looking to have an example of something
using the new APIs in the code by the Boron release so people can use
it (colindixon,
15:36:39)
- Jan asks how you advertise a singleton with
Blueprint; TomP says right now it's up to the EOS to make sure to
ignore the advertised services on the nodes where the singleton app
is running but is not the "owner"; in the future that could be baked
into Blueprint (colindixon,
15:44:10)
- Jan and TomP agree that baking the EOS
advertising of services into Blueprint makes sense to discuss at the
summit and plan for Carbon (colindixon,
15:44:14)
- there's a long discussion about the internals
of clustering; TomP says that we have two rate limiters: a txn rate
limiter and an operation rate limiter. Moiz said the second one was
still important, but less so than the first (colindixon,
16:04:11)
- TomP isn't sure if we still need the operation
rate limiter now that we have batching (colindixon,
16:04:42)
Meeting ended at 16:07:56 UTC
(full logs).
Action items
- (none)
People present (lines said)
- colindixon (28)
- odl_meetbot (4)
Generated by MeetBot 0.1.4.