15:55:17 #startmeeting kernel projects
15:55:17 Meeting started Tue Sep 5 15:55:17 2017 UTC. The chair is ryangoulding. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:55:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:55:17 The meeting name has been set to 'kernel_projects'
15:55:25 #topic agenda bashing
15:55:35 #info FYI Tooling to find the real root cause culprit of memory leaks related to non-closed transactions (and tx chains)
15:55:42 #info Checkstyle more rulez
15:55:52 #info Exceptions lost in log from background threads: How to hunt down ALL usages of thread factories, and make ALL of them use the setUncaughtExceptionHandler() ? Just grep all (kernel project) code? Any volunteers? Or is there any way we could enforce it?
15:56:00 #info Make sure that nobody just ignores returned Future (or CompletionStage, and others..) - how? FindBugs, and/or errorprone, and @CheckReturnValue ? How can we get started to become more serious about this, everywhere?
15:56:08 #info errorprone! https://git.opendaylight.org/gerrit/#/c/62090/
15:56:17 #info How to determine a good poll interval in async tests given a timeout? Util to (exponentially?) increase? See discussion in https://git.opendaylight.org/gerrit/#/c/61927/
15:56:24 #info JavaDoc sites, next step? Merge https://git.opendaylight.org/gerrit/#/c/60213/
15:56:33 #info Thoughts / objections re. https://git.opendaylight.org/gerrit/#/c/50905/ ?
15:56:49 #info kernel projects status for Nitrogen
15:56:59 #info [Next week, with Stephen] Discussion: Should each thing create its own ExecutorService or should we have a single central shared one? Like an EE server's managed thread pools.
16:02:33 #link https://bugs.opendaylight.org/show_bug.cgi?id=9033
16:02:46 #link https://bugs.opendaylight.org/show_bug.cgi?id=8987
16:04:25 sorry, I'm late, just joining now
16:04:59 #link https://docs.google.com/spreadsheets/d/1MYyGLFWN2RzUkJl8XMzXQ-3zWuOrUCQpIS6ORbmf4_U/edit#gid=794930820 - nitrogen CSIT problems
16:05:48 #topic Blocker Bugs
16:05:58 #link https://bugs.opendaylight.org/show_bug.cgi?id=9033
16:06:14 #info increased status to blocker
16:09:38 Are we talking about 9033 or https://bugs.opendaylight.org/show_bug.cgi?id=9055 ?
16:10:23 Probably the same thing reported from a different point of view.
16:13:31 #link https://bugs.opendaylight.org/show_bug.cgi?id=8987
16:15:14 #link https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-1node-gate-userfeatures-only-nitrogen/7/odl1_karaf.log.gz
16:15:20 #info something putting null in payload
16:15:26 #info either something wrong with test or restconf
16:15:47 #link https://bugs.opendaylight.org/show_bug.cgi?id=8988
16:16:17 #info the problem is likely test related in the XML payload
16:16:26 #info may be related to the boot process
16:16:38 #info when ODL is fully booted this does not appear to happen
16:16:49 #info this may only occur in a transient booting state
16:17:44 #link https://docs.google.com/spreadsheets/d/1MYyGLFWN2RzUkJl8XMzXQ-3zWuOrUCQpIS6ORbmf4_U/edit#gid=794930820
16:19:51 #link https://bugs.opendaylight.org/show_bug.cgi?id=9092
16:20:10 #info is this being looked at?
16:20:44 #info rovarga notes that instead the code should use normalized node, use NormalizedNodeCodec, then output to JSON directly
16:20:51 #info skitt says he will look at this later today
16:21:10 #info tomorrow rather (Wednesday)
16:21:11 #info this is the websocket stuff for reference
16:21:41 #topic FYI Tooling to find the real root cause culprit of memory leaks related to non-closed transactions (and tx chains)
16:21:51 #link https://bugs.opendaylight.org/show_bug.cgi?id=9060
16:22:00 #link https://git.opendaylight.org/gerrit/#/q/topic:bug/9060
16:22:35 #info mdsal trace augmentations
16:22:52 #info now when you install, it adds a CLI to karaf “trace-transactions”
16:23:04 #info keeps track of the md-sal transactions using a new Broker Facade
16:23:19 #info helps find Transaction(s) that aren’t closed
16:23:32 #info includes the stack trace for unclosed transactions
16:23:42 #info this is mostly a public disclosure that it is available
16:24:28 #action rgoulding to add documentation to “trace-transactions” command
16:24:55 #topic Checkstyle more rulez
16:25:02 #link https://lists.opendaylight.org/pipermail/odlparent-dev/2017-August/001316.html
16:25:59 #info Any objections to merging https://git.opendaylight.org/gerrit/#/c/43324/ ?
16:26:09 #info https://git.opendaylight.org/gerrit/#/c/62145/ (split off from https://git.opendaylight.org/gerrit/#/c/43331/)
16:26:17 #info Discussion 20/80% re. https://git.opendaylight.org/gerrit/#/c/43331/
16:27:11 #info https://git.opendaylight.org/gerrit/#/c/43324/ is incompatible so should only be in 3.X branch
16:27:20 #info odlparent 3.X is scheduled for Oxygen
16:27:55 #info changes will be picked up there
16:28:21 #link https://git.opendaylight.org/gerrit/#/c/43331/
16:28:25 #info maybe not a good idea?
16:29:23 #info must chain exception(s) if you are reacting to a different exception
16:30:07 #info vorburger if you log and propagate, then you may have context info you want to provide
16:31:14 #info raises the point that suppressing the warnings for checkstyle is already pretty superfluous
16:31:25 #info may not be a good idea for future code
16:31:38 #info -1 vs -2 was given because it was two rules and only one was contentious
16:31:55 #info skitt would two sets of rules be useful?
16:32:19 #info skitt this rule is nice to make sure that errors are only logged in one place where you have all of the context
16:32:44 #info kernel is more prone to “rethrowing” the exceptions
16:32:56 #info either that or dump tons of data into message, which isn’t great either
16:33:06 #info other thing is to add fields to Exception(s)
16:33:40 #info sometimes this isn’t great because the fields are not extracted in consumers
16:33:47 #info the information is available but not dumped into the log
16:33:58 #info that is somewhat worrisome (also formatting becomes an issue)
16:34:58 #info vorburger will abandon https://git.opendaylight.org/gerrit/#/c/43331/
16:35:23 #info next odlparent release may require downstream consumers to reformat code
16:35:46 #info maybe revisit the abandoned idea in a future odlparent release
16:35:58 #topic Exceptions lost in log from background threads: How to hunt down ALL usages of thread factories, and make ALL of them use the setUncaughtExceptionHandler() ? Just grep all (kernel project) code? Any volunteers? Or is there any way we could enforce it?
16:36:20 #info FYI there are some IMHO neat utilities in infrautils around this now, see the ThreadFactoryProvider or directly the LoggingThreadUncaughtExceptionHandler, which we could use (but don't have to, of course)
16:36:45 #info background threads may blow up and not get logged
16:36:50 #info these are written to System.out
16:37:38 #info rgoulding these will also get pushed into data/log/karaf.out
16:37:46 #info vorburger should we push these all in karaf.log?
16:37:58 #info is there a way to set the UncaughtExceptionHandler everywhere
16:37:59 ?
16:38:19 #info rovarga the semantic question is how do you deal with exceptions in a generic way? you don't
16:38:36 #info if an exception is automatically recoverable, it is not an exception by definition
16:39:36 #info vorburger means log and re-throw, just so it is put in the proper log file
16:40:04 #info rethrowing will then propagate the exception, making the thread die
16:40:21 #info the idea is that there may be several uncaught exceptions that are never even seen
16:40:35 #info these are bad problems that get hidden under the radar unless you start watching karaf.out
16:41:13 #info rovarga says it is a JVM level error and is thus handling it by only using what's available (stdout/stderr)
16:42:15 #info this could be handled via using ThreadPools or ThreadFactories that can be preconfigured to utilize a default handler
16:42:42 #info do a checkstyle plugin to check for instantiation of “naked” Threads and forbid it?
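A minimal sketch of the idea in the 16:42:15 entry: a ThreadFactory that installs a default UncaughtExceptionHandler on every thread it creates, so that exceptions from background threads reach the logger instead of only stdout/stderr. The class name `LoggingThreadFactory` and its `handler()` accessor are hypothetical, and this is not the infrautils `ThreadFactoryProvider` API. `java.util.logging` stands in for whatever logging facade the project actually uses, purely to keep the example self-contained.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper: every thread it creates logs uncaught exceptions. */
final class LoggingThreadFactory implements ThreadFactory {
    private static final Logger LOG = Logger.getLogger(LoggingThreadFactory.class.getName());

    private final String namePrefix;
    private final AtomicLong counter = new AtomicLong();
    private final Thread.UncaughtExceptionHandler handler;

    LoggingThreadFactory(String namePrefix) {
        this.namePrefix = namePrefix;
        // Log the failure with the thread name; the thread still terminates afterwards.
        this.handler = (thread, throwable) -> LOG.log(Level.SEVERE,
                "Thread terminated due to uncaught exception: " + thread.getName(), throwable);
    }

    @Override
    public Thread newThread(Runnable runnable) {
        // Name threads "<prefix>-<n>" so that log entries can be traced back to their owner.
        Thread thread = new Thread(runnable, namePrefix + "-" + counter.getAndIncrement());
        thread.setUncaughtExceptionHandler(handler);
        return thread;
    }

    /** Exposed only so that callers and tests can inspect the installed handler. */
    Thread.UncaughtExceptionHandler handler() {
        return handler;
    }
}
```

Such a factory could be passed to `Executors.newFixedThreadPool(n, factory)`. Note that a task handed to `ExecutorService.submit()` stores its exception in the returned Future rather than invoking the handler, which is one reason the ignored-Future topic matters as well.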
16:42:57 #info it is a tradeoff of what we let people do with the platform vs. how consistent and what mistakes they can make
16:43:14 #info if we don't let them spin threads they won't make this mistake, but they also won't get to use thread instantiation
16:43:58 #info bala could we redirect karaf.out to karaf.log
16:44:19 #info rovarga then you would have to do line buffering into karaf.log since there are multiple writers to karaf.log
16:47:14 #topic Make sure that nobody just ignores returned Future (or CompletionStage, and others..) - how? FindBugs, and/or errorprone, and @CheckReturnValue ? How can we get started to become more serious about this, everywhere?
16:48:17 #info errorprone!
16:48:19 #link https://git.opendaylight.org/gerrit/#/c/62090/
16:48:23 #info have to switch compiler
16:48:28 #info not the default one
16:49:08 #info we would maybe be able to do this as a regular job vs part of the standard build process
16:49:12 #info act in a reactive way
16:49:44 #info people will automatically just do a get to get around this
16:49:53 #info there is room for abuse
16:50:22 #info checkstyle isn’t really capable
16:50:24 #info maybe findbugs
16:51:14 #topic How to determine a good poll interval in async tests given a timeout? Util to (exponentially?) increase? See discussion in https://git.opendaylight.org/gerrit/#/c/61927/
16:51:28 #link https://git.opendaylight.org/gerrit/#/c/61927/
16:52:40 #info intended audience?
16:52:47 #info are they waiting for uSec, minutes, etc?
16:52:59 #info this is the typical question you get in a utility dumping ground without users
16:53:12 #info it is probably fine for most users
16:53:17 #info this is used in Genius
16:53:37 #info is it the type of test in Genius that should eventually be rewritten to not rely upon timeouts?
16:53:51 #info testing something that required queue drainage
16:54:05 #info in int/test something similar exists
16:54:16 #info require clients to define the configurable parameters
16:54:35 #info brought up by vrpolak
16:54:50 #info rovarga what is the use of this utility? this then becomes a very thin wrapper
16:55:28 #info vorburger points out that it's just 10 lines that he doesn’t want to copy and paste all over tests
16:58:27 #info who is allocating the queues?
17:00:51 #link https://bugs.opendaylight.org/show_bug.cgi?id=8927
17:01:42 #action jmorvay to take a look after blocker bugs are resolved
17:02:35 #endmeeting
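A minimal sketch of the kind of helper discussed under the poll interval topic: wait for a condition up to a timeout, starting with a short poll interval and doubling it after each unsuccessful check. This is not the code in gerrit change 61927. The name `Poller.awaitCondition`, its parameters, the doubling factor, and the decision to return false on interruption are all assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: poll a condition with an exponentially growing interval. */
final class Poller {
    private Poller() { }

    /**
     * Returns true as soon as the condition holds, or false once the timeout
     * has elapsed or the calling thread is interrupted.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis,
            long initialIntervalMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long interval = Math.max(1, initialIntervalMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(interval, remainingMillis));
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up (an assumption of this sketch).
                Thread.currentThread().interrupt();
                return false;
            }
            // Double the interval, but never beyond the overall timeout.
            interval = Math.min(interval * 2, Math.max(1, timeoutMillis));
        }
    }
}
```

A test might call `Poller.awaitCondition(() -> queue.isEmpty(), 5000, 10)`, where `queue` is whatever the test is waiting on. The caller still has to pick the timeout and the initial interval, which is the "thin wrapper" concern raised at 16:54:50.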