15:08:17 <colindixon> #startmeeting clustering hackers
15:08:17 <odl_meetbot> Meeting started Tue Jul 12 15:08:17 2016 UTC.  The chair is colindixon. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:08:17 <odl_meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:08:17 <odl_meetbot> The meeting name has been set to 'clustering_hackers'
15:08:20 <colindixon> #topic agenda bashing
15:08:51 <colindixon> #info Jan wants an update on the progress on BUG-5421 for the singleton app in clustering
15:11:16 <colindixon> #info ashutosh wants to cover the leader election issues
15:11:52 <colindixon> #topic leader elections repeatedly happening
15:12:27 <colindixon> #info ashutosh says they're seeing spurious elections at high loads (I think 2 million txns per second)
15:12:59 <colindixon> #info the simplest idea seems to be to split out the heartbeat actor from the RAFT actor so it doesn't get bogged down
15:15:32 <colindixon> #info TomP says that they moved to doing one actor per shard, but that resulted in a huge performance hit (not sure if it was a 30% performance loss or a drop to 30% of performance)
15:16:13 <colindixon> #info the result is that this prototype was abandoned
15:19:38 <colindixon> #Info rovarga notes that the performance issues we're seeing are really an issue of us not pipelining transactions
15:28:39 <colindixon> #info rovarga says that we have an internal queue to the CDS which tracks the transactions which have been accepted, but not yet replicated, persisted, and committed; the only issue is that right now the RAFT actor doesn't offer an asynchronous way to persist and replicate
15:29:56 <colindixon> #info rovarga thinks that the only things which need to be synchronously persisted are internal to the RAFT actor (though they will also force syncing the prior data to disk to maintain order as part of Akka persistence)
15:30:30 <colindixon> #info assuming internal synchronous persistence events are relatively rare compared to user data asks (which is real) that is likely to help performance a lot
15:30:47 <colindixon> #topic singleton app template progress
15:31:04 <colindixon> #info it seems like with one comment from Robert around the API, things look good
15:32:06 <colindixon> #info TomP says that there are two other patches he needs to do to move the EOS service to the new MD-SAL APIs, which are blocked on moving those APIs now
15:32:39 <colindixon> #info vaclav says that the new MD-SAL APIs should be merged either now, or very shortly
15:34:05 <colindixon> #link https://git.opendaylight.org/gerrit/#/q/owner:%22Vaclav+Demcak+%253Cvaclav.demcak%2540pantheon.sk%253E%22 the relevant patches are the ones here that mention Bug 5421
15:35:15 <colindixon> #link https://bugs.opendaylight.org/show_bug.cgi?id=5421 this is the bug
15:36:05 <colindixon> #info Jan asks when this will be done, TomP says he hopes it will be ready for apps to start using by the end of the week
15:36:39 <colindixon> #info Jan is looking to have an example of something using the new APIs in the code by the Boron release so people can use it
15:43:52 <colindixon> #info Jan and TomP agree that baking the EOS advertising of services into Blueprint makes sense to discuss at the summit and plan for Carbon
15:44:08 <colindixon> #undo
15:44:08 <odl_meetbot> Removing item from minutes: <MeetBot.ircmeeting.items.Info object at 0x2597450>
15:44:10 <colindixon> #Info Jan asks how you advertise a singleton with Blueprint, TomP says right now it's up to the EOS to make sure to ignore the advertised services on the nodes where the singleton app is running but is not the "owner"; in the future that could be baked into Blueprint
15:44:14 <colindixon> #info Jan and TomP agree that baking the EOS advertising of services into Blueprint makes sense to discuss at the summit and plan for Carbon
16:04:11 <colindixon> #info there's a long discussion about the internals of clustering; TomP says that we have two rate limiters: a txn rate limiter and an operation rate limiter, Moiz said the second one was still important but less so than the first
16:04:42 <colindixon> #info TomP isn't sure if we still need the operation rate limiter now that we have batching
16:07:56 <colindixon> #endmeeting