#opendaylight-meeting: md-sal interest
Meeting started by colindixon at 17:02:06 UTC
(full logs).
Meeting summary
- agenda bashing (colindixon, 17:02:12)
- https://wiki.opendaylight.org/view/MD-SAL_Weekly_Call#Agenda
the agenda (colindixon,
17:03:34)
- Mouli will share some performance evaluation
results and solicit feedback from the community. (colindixon,
17:03:50)
- Helium MD-SAL Data Store & OF Plugin Performance Analysis (colindixon, 17:06:29)
- slides will be posted after the meeting (there
is a webex recording) (colindixon,
17:06:55)
- the goal is to characterize the performance of
the MD-SAL DOM Data Store and OF plugin (as well as scalability and
reliability) (colindixon,
17:08:41)
- used a switch simulator to measure flow rate
(flows/second) and flow scaling (max flows) with ~20
switches (colindixon,
17:09:28)
- used a Dell server with 64GB RAM, 8 cores and
NWSim (colindixon,
17:09:58)
- used modified OpenFlow Plugin test provider
(multi-threaded, batching flows, etc.) (colindixon,
17:10:52)
- tested DOM datastore alone (no OFplugin), DOM
datastore alone (no OFplugin, no notifications), OF drop test (skip
DOM datastore), DOM datastore + OF plugin, and DOM datastore (no
notifications) + OF Plugin (colindixon,
17:13:04)
- this was informed by finding bottlenecks, e.g.,
they found that notifications were expensive (colindixon,
17:13:58)
- observations so far (colindixon, 17:15:15)
- originally were seeing <100 flows/second
from the MD-SAL data store + OpenFlow plugin (colindixon,
17:15:42)
- potential bottlenecks here: (colindixon,
17:16:18)
- * data change notifications on WriteTxCommit
processing (colindixon,
17:16:56)
- * single-threaded commit processing compounds
this problem (colindixon,
17:17:10)
- * the merge operation processing overheads were
much higher than for put (colindixon,
17:17:29)
- removing the use of data change notifications
in the process resulted in ~5000 flows/second (colindixon,
17:20:01)
- notifications are two operations (first
creating listener trees and then difference computation); the
overhead seems to be in the second (colindixon,
17:20:45)
- this ~5000 flows/second was with the OFplugin,
but with no clustering (at least not yet) (colindixon,
17:21:48)
- they do note that the existing MD-SAL
microbenchmark on a laptop-grade system hit 10s of thousands of
operations per second (colindixon,
17:24:48)
- there were three key differences: (1) using
normalized nodes instead of binding aware interfaces, (2) using
notifications, (3) using a very simple model instead of the more
complex flows (colindixon,
17:25:44)
- rovarga seems to think that (1) would cause a
lot of overhead (colindixon,
17:26:08)
- possible fixes (colindixon, 17:29:06)
- tony says one issue could be that there is a
translation from MD-SAL internal data change events to the one
defined in the API (colindixon,
17:30:10)
- rovarga thinks the right approach is to migrate
the difference computation to the client, not the data store
(colindixon,
17:34:51)
- the logic is that this would allow the client
to do just the computation it needs, not all the way down to the
leaves (colindixon,
17:37:37)
- rovarga argues that getting the subscription
API implemented in a way that allows for this kind of optimization
would be difficult (colindixon,
17:40:32)
- colindixon says he thought that the triggering
scope did this, e.g., subtree vs. current node vs. current node +
children (colindixon,
17:41:05)
- rovarga says that this is *just* triggering
scope, not the scope of the changes that are provided (colindixon,
17:41:35)
- the problem is that in order to know whether
the scope has triggered, we need to perform a full comparison. there
is no way to say 'this is granular enough' (rovarga,
17:42:28)
- and once we know it triggered, we need to also
calculate all the nodes which changed, as we do not know what the
user will ask for (rovarga,
17:46:15)
- a navigable tree of what has changed would
solve this -- the app can navigate, find it out, and ask for DTOs
which it is interested in (rovarga,
17:47:12)
- Uyen says that the current apps seem to be
mostly based on data changes, and it appears (from this discussion)
that this might not scale well (colindixon,
17:51:02)
- given that, Uyen asks what the performance
guidelines for using the MD-SAL are (colindixon,
17:51:37)
- rovarga and muthu answer that “it depends on
your application” (colindixon,
17:52:59)
- colindixon restates the question “given that
you want to have complex models with high update rates, is the
answer don’t use data change notifications?” (colindixon,
17:53:38)
- rovarga says yes, but there may be APIs that
allow for better scoped notifications and thus less pain here
(colindixon,
17:54:37)
- in general: batch as much as possible, listen
to the minimal set you need, perform put() [which pushes you to
single-writer-per-subtree] (rovarga,
17:55:18)
- try to match what the producers put() and what
the consumers trigger on (rovarga,
17:57:51)
- wrap-up (colindixon, 17:57:51)
- ACTION: muthu will
send out data (profiler and performance numbers) (colindixon,
17:58:15)
- ACTION: Mouli will
send mail about possible missing/dangling flows (colindixon,
17:58:41)
- ACTION: Muthu/Mouli
to post slides to wiki (colindixon,
17:58:55)
- ACTION: the community
needs to understand the right patterns to use the MD-SAL to get
decent performance (colindixon,
18:00:36)
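The "navigable tree of what has changed" idea from the possible-fixes discussion can be illustrated with a toy sketch. This is not MD-SAL code: the nested-dict data tree and the names `cow_put`, `eager_diff`, and `ChangeNode` are all hypothetical. It also assumes that snapshots share unchanged subtrees by reference (copy-on-write), so an unchanged subtree can be skipped with an identity check. The eager version walks all the way to the leaves; the lazy version only expands the levels the client asks for.

```python
_MISSING = object()  # marks a key that is absent on one side


def cow_put(tree, path, value):
    """Return a new tree with `value` at `path`, sharing untouched branches."""
    if not path:
        return value
    head, rest = path[0], path[1:]
    new = dict(tree)  # shallow copy: sibling subtrees are shared by reference
    new[head] = cow_put(tree.get(head, {}), rest, value)
    return new


def eager_diff(before, after, path=()):
    """Compare both trees in full and return every changed path."""
    if isinstance(before, dict) and isinstance(after, dict):
        changed = []
        for key in sorted(set(before) | set(after)):
            changed += eager_diff(before.get(key, _MISSING),
                                  after.get(key, _MISSING),
                                  path + (key,))
        return changed
    return [] if before == after else [path]


class ChangeNode:
    """Lazy view of a change: children are computed only when requested."""

    def __init__(self, before, after, path=()):
        self.before = before
        self.after = after
        self.path = path

    def modified_children(self):
        # Leaves (or added/removed subtrees) have no children to navigate.
        if not (isinstance(self.before, dict) and isinstance(self.after, dict)):
            return []
        children = []
        for key in sorted(set(self.before) | set(self.after)):
            b = self.before.get(key, _MISSING)
            a = self.after.get(key, _MISSING)
            if a is b:  # shared by reference, so nothing below changed
                continue
            children.append(ChangeNode(b, a, self.path + (key,)))
        return children


before = {"nodes": {"s1": {"flows": {"f1": 10, "f2": 20}},
                    "s2": {"flows": {"f1": 30}}}}
after = cow_put(before, ("nodes", "s1", "flows", "f2"), 25)

all_changes = eager_diff(before, after)
root = ChangeNode(before, after)
changed_switches = root.modified_children()[0].modified_children()
```

A client that only cares which switch changed stops at `changed_switches` and never touches the flow entries, whereas `eager_diff` always pays for the full comparison.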
Meeting ended at 18:02:00 UTC
(full logs).
Action items
- muthu will send out data (profiler and performance numbers)
- Mouli will send mail about possible missing/dangling flows
- Muthu/Mouli to post slides to wiki
- the community needs to understand the right patterns to use the MD-SAL to get decent performance
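As a starting point for the "right patterns" item, here is a minimal sketch of two of the guidelines from the discussion: batch writes into fewer commits, and prefer put over merge. It is a toy model, not the MD-SAL API: `ToyStore`, `deep_merge`, and `chunks` are hypothetical names, and the only cost it models is the number of commits, on the assumption that each commit triggers one round of single-threaded commit processing and notification work.

```python
def deep_merge(target, source):
    """Recursively fold `source` into `target`, keeping unrelated keys."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = value


class ToyStore:
    """Toy data store that counts commits as a stand-in for commit cost."""

    def __init__(self):
        self.data = {}
        self.commits = 0

    def commit(self, ops):
        """Apply a list of (kind, path, value) operations as one transaction."""
        for kind, path, value in ops:
            parent = self.data
            for key in path[:-1]:
                parent = parent.setdefault(key, {})
            leaf = path[-1]
            existing = parent.get(leaf)
            if kind == "merge" and isinstance(existing, dict) and isinstance(value, dict):
                deep_merge(existing, value)  # must inspect existing content
            else:
                parent[leaf] = value  # put: replace the subtree outright
        self.commits += 1


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def make_ops():
    return [("put", ("s1", "flows", "f%d" % i), {"priority": i})
            for i in range(100)]


# One commit per flow: 100 rounds of commit processing.
unbatched = ToyStore()
for op in make_ops():
    unbatched.commit([op])

# Same flows, 25 per transaction: 4 rounds of commit processing.
batched = ToyStore()
for batch in chunks(make_ops(), 25):
    batched.commit(batch)

# put replaces the flow; merge keeps what was already there.
batched.commit([("merge", ("s1", "flows", "f0"), {"cookie": 1})])
batched.commit([("put", ("s1", "flows", "f1"), {"cookie": 1})])
```

Both stores end up with the same 100 flows, but the batched one needs 4 commits instead of 100. The last two operations show the semantic difference: the merge leaves `priority` in place on `f0`, while the put on `f1` replaces the whole entry, which is why put fits best when a single writer owns the subtree.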
People present (lines said)
- colindixon (46)
- odl_meetbot (5)
- rovarga (5)
- abhijitkumbhare (0)
Generated by MeetBot 0.1.4.