15:05:54 #startmeeting clustering hackers
15:05:54 Meeting started Tue Jul 26 15:05:54 2016 UTC. The chair is colindixon. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:05:54 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:05:54 The meeting name has been set to 'clustering_hackers'
15:06:01 #topic keepalive actor
15:06:27 #info TomP says that he's been working on a separate KeepAlive timer to try to reduce spurious claims that nodes are down
15:07:18 #info I think the theory is that the keepalive actor will be scheduled more reliably than the normal actors, which can be delayed by the other work going through them
15:07:56 #info TomP is trying to use a separate dispatcher for the keepalives, so that they shouldn't wind up behind anything else
15:11:58 #info this will help with spurious timeouts caused by busy actors
15:12:16 #info this won't help with spurious timeouts caused by garbage collection pauses
15:15:59 #info TomP says it would help a lot if Muthu could look at and test the patches
15:26:14 #info Jan asks if TomP is coordinating with Robert on the changes here
15:27:31 #info TomP says that the stuff he's doing is orthogonal to what Robert is doing
15:29:17 #topic changing global RPC behavior
15:30:43 #info Jan says that his view is that global RPC delivery is wrong; it should be delivered remotely if there's a registered handler
15:31:15 #info right now it is delivered locally to the given node only
15:32:20 #info colindixon says there's a bigger problem where we have 5 different kinds of "events" (routed RPCs, global RPCs, Data Change Notifications, YANG Notifications, Clustered Data Change Notifications) and they each have different delivery disciplines
15:33:00 #info colindixon says we really need to have delivery disciplines and events as separate things, and be able to pick and choose the delivery discipline for each event as you like
15:34:53 #info Jan wants to make sure that we at least fix the things we hit as we go instead of blocking progress on making everything perfect
15:35:02 #action Jan to open a bug against global RPCs
15:35:17 #action colindixon to open a topic for the DDF
15:39:03 #info colindixon and Jan agree that singleton apps in the cluster are likely to be the most common because they're simple and you only need more performance
15:40:06 #info colindixon points out that in addition to the normal event delivery discipline issues about where events are delivered, there is also reliability, e.g., at most once vs. at least once vs. exactly once vs. zero or more times
15:41:32 #topic refactoring clustering
15:41:54 #info Robert says that he has a refactor patch that he needs in order to fix bug 5280
15:42:13 #info it will be there sometime tomorrow for TomP to review; it's about 1000 lines so far, but doesn't compile yet
15:44:33 #info TomP says he also has a lot of patches out there, which he'd love to have Robert review as well
15:49:46 #topic serialization optimization
15:50:04 #info TomP says that he has better ways to do serialization as patches
15:54:57 #info Robert will review the patches
15:55:43 #info Muthu notes (on another topic) that when you configure Akka with fsync() off it improves performance a lot (about 2x)
16:02:33 #info Robert points out that disabling fsync() gives up some data durability, and balancing that is hard
16:03:35 #endmeeting
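
Editor's note on the keepalive topic: the idea of a separate dispatcher can be illustrated with a small sketch. This is not TomP's patch and it does not use Akka; it only assumes that a dispatcher behaves roughly like a thread pool with a FIFO queue. The names `run_demo`, `shared`, and `dedicated` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_demo():
    """Show a keepalive stuck behind busy work versus one on its own pool."""
    release = threading.Event()
    # Assumption: a one-thread pool stands in for a shared dispatcher.
    shared = ThreadPoolExecutor(max_workers=1)
    # Assumption: a second pool stands in for a keepalive-only dispatcher.
    dedicated = ThreadPoolExecutor(max_workers=1)
    try:
        # A "busy actor" occupies the only shared thread until released.
        busy = shared.submit(release.wait)
        # This keepalive is queued behind the busy work on the shared pool.
        queued = shared.submit(lambda: "keepalive")
        # This keepalive runs on its own pool and is not affected.
        isolated = dedicated.submit(lambda: "keepalive")

        isolated_result = isolated.result(timeout=5)
        queued_done_while_busy = queued.done()

        # Let the busy work finish; only now can the queued keepalive run.
        release.set()
        queued_result = queued.result(timeout=5)
        busy.result(timeout=5)
    finally:
        release.set()
        shared.shutdown(wait=True)
        dedicated.shutdown(wait=True)
    return isolated_result, queued_done_while_busy, queued_result


if __name__ == "__main__":
    print(run_demo())
```

As noted in the meeting, this kind of isolation only addresses delays caused by busy actors. A garbage collection pause stops every thread, so a dedicated pool would not help there.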
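
Editor's note on the reliability point: the difference between at-most-once and at-least-once delivery can be sketched as follows. This is an illustration under stated assumptions, not code from the controller; `ScriptedLink`, `send_at_most_once`, `send_at_least_once`, and `dedupe` are hypothetical names, and the link's failures are scripted so the behavior is repeatable.

```python
class ScriptedLink:
    """Hypothetical lossy link whose per-attempt outcome is scripted.

    Outcomes: "drop" (message lost), "ack_lost" (message delivered but the
    acknowledgement is lost), "ok" (delivered and acknowledged).
    """

    def __init__(self, outcomes):
        self._outcomes = iter(outcomes)
        self.received = []  # what the receiver actually saw

    def send(self, msg):
        outcome = next(self._outcomes, "ok")
        if outcome == "drop":
            return False
        self.received.append(msg)
        return outcome == "ok"


def send_at_most_once(link, msg):
    """Fire and forget: one attempt, no retry, so the message may be lost."""
    link.send(msg)


def send_at_least_once(link, msg, max_attempts=5):
    """Retry until acknowledged, so the receiver may see duplicates."""
    for _ in range(max_attempts):
        if link.send(msg):
            return True
    return False


def dedupe(messages):
    """Receiver-side duplicate suppression, keeping first occurrences."""
    seen = set()
    unique = []
    for msg in messages:
        if msg not in seen:
            seen.add(msg)
            unique.append(msg)
    return unique


if __name__ == "__main__":
    lossy = ScriptedLink(["drop"])
    send_at_most_once(lossy, "m1")
    print(lossy.received)

    retrying = ScriptedLink(["drop", "ack_lost", "ok"])
    send_at_least_once(retrying, "m1")
    print(retrying.received, dedupe(retrying.received))
```

Combining at-least-once sending with duplicate suppression at the receiver is one common way to approximate exactly-once handling. Whether any of the five event types should work this way was left open in the meeting.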