======================================
#opendaylight-meeting: kernel projects
======================================

Meeting started by ryangoulding at 15:55:17 UTC. The full logs are available at http://meetings.opendaylight.org/opendaylight-meeting/2017/kernel_projects/opendaylight-meeting-kernel_projects.2017-09-05-15.55.log.html .

Meeting summary
---------------

* agenda bashing (ryangoulding, 15:55:25)
* FYI Tooling to find the real root cause culprit of memory leaks related to non-closed transactions (and tx chains) (ryangoulding, 15:55:35)
* Checkstyle more rulez (ryangoulding, 15:55:42)
* Exceptions lost in log from background threads: How to hunt down ALL usages of thread factories, and make ALL of them use the setUncaughtExceptionHandler() ? Just grep all (kernel project) code? Any volunteers? Or is there any way we could enforce it? (ryangoulding, 15:55:52)
* Make sure that nobody just ignores returned Future (or CompletionStage, and others..) - how? FindBugs, and/or errorprone, and @CheckReturnValue ? How can we get started to become more serious about this, everywhere? (ryangoulding, 15:56:00)
* errorprone! https://git.opendaylight.org/gerrit/#/c/62090/ (ryangoulding, 15:56:08)
* How to determine a good poll interval in async tests given a timeout? Util to (exponentially?) increase? See discussion in https://git.opendaylight.org/gerrit/#/c/61927/ (ryangoulding, 15:56:17)
* JavaDoc sites, next step? Merge https://git.opendaylight.org/gerrit/#/c/60213/ (ryangoulding, 15:56:24)
* Thoughts / objections re. https://git.opendaylight.org/gerrit/#/c/50905/ ? (ryangoulding, 15:56:33)
* kernel projects status for Nitrogen (ryangoulding, 15:56:49)
* [Next week, with Stephen] Discussion: Should each thing create its own ExecutorService or should we have a single central shared one? Like an EE server's managed thread pools.
(ryangoulding, 15:56:59)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=9033 (ryangoulding, 16:02:33)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=8987 (ryangoulding, 16:02:46)
* LINK: https://docs.google.com/spreadsheets/d/1MYyGLFWN2RzUkJl8XMzXQ-3zWuOrUCQpIS6ORbmf4_U/edit#gid=794930820 - nitrogen CSIT problems (klou, 16:04:59)
* Blocker Bugs (ryangoulding, 16:05:48)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=9033 (ryangoulding, 16:05:58)
* increased status to blocker (ryangoulding, 16:06:14)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=8987 (ryangoulding, 16:13:31)
* LINK: https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-1node-gate-userfeatures-only-nitrogen/7/odl1_karaf.log.gz (ryangoulding, 16:15:14)
* something is putting null in the payload (ryangoulding, 16:15:20)
* either something is wrong with the test or with restconf (ryangoulding, 16:15:26)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=8988 (ryangoulding, 16:15:47)
* the problem is likely test related, in the XML payload (ryangoulding, 16:16:17)
* may be related to the boot process (ryangoulding, 16:16:26)
* when ODL is fully booted this does not appear to happen (ryangoulding, 16:16:38)
* this may only occur in a transient booting state (ryangoulding, 16:16:49)
* LINK: https://docs.google.com/spreadsheets/d/1MYyGLFWN2RzUkJl8XMzXQ-3zWuOrUCQpIS6ORbmf4_U/edit#gid=794930820 (ryangoulding, 16:17:44)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=9092 (ryangoulding, 16:19:51)
* is this being looked at?
(ryangoulding, 16:20:10)
* rovarga notes that the code should instead use the normalized node, use NormalizedNodeCodec, then output to JSON directly (ryangoulding, 16:20:44)
* skitt says he will look at this later today (ryangoulding, 16:20:51)
* tomorrow rather (Wednesday) (skitt, 16:21:10)
* this is the websocket stuff for reference (ryangoulding, 16:21:11)
* FYI Tooling to find the real root cause culprit of memory leaks related to non-closed transactions (and tx chains) (ryangoulding, 16:21:41)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=9060 (ryangoulding, 16:21:51)
* LINK: https://git.opendaylight.org/gerrit/#/q/topic:bug/9060 (ryangoulding, 16:22:00)
* mdsal trace augmentations (ryangoulding, 16:22:35)
* when installed, it now adds a CLI command to karaf, “trace-transactions” (ryangoulding, 16:22:52)
* keeps track of the md-sal transactions using a new Broker Facade (ryangoulding, 16:23:04)
* helps find Transaction(s) that aren’t closed (ryangoulding, 16:23:19)
* includes the stack trace for unclosed transactions (ryangoulding, 16:23:32)
* this is mostly a public disclosure that it is available (ryangoulding, 16:23:42)
* ACTION: rgoulding to add documentation to the “trace-transactions” command (ryangoulding, 16:24:28)
* Checkstyle more rulez (ryangoulding, 16:24:55)
* LINK: https://lists.opendaylight.org/pipermail/odlparent-dev/2017-August/001316.html (ryangoulding, 16:25:02)
* Any objections to merging https://git.opendaylight.org/gerrit/#/c/43324/ ? (ryangoulding, 16:25:59)
* https://git.opendaylight.org/gerrit/#/c/62145/ (split off from https://git.opendaylight.org/gerrit/#/c/43331/) (ryangoulding, 16:26:09)
* Discussion 20/80% re.
https://git.opendaylight.org/gerrit/#/c/43331/ (ryangoulding, 16:26:17)
* https://git.opendaylight.org/gerrit/#/c/43324/ is incompatible so should only be in the 3.X branch (ryangoulding, 16:27:11)
* odlparent 3.X is scheduled for Oxygen (ryangoulding, 16:27:20)
* changes will be picked up there (ryangoulding, 16:27:55)
* LINK: https://git.opendaylight.org/gerrit/#/c/43331/ (ryangoulding, 16:28:21)
* maybe not a good idea? (ryangoulding, 16:28:25)
* must chain exception(s) if you are reacting to a different exception (ryangoulding, 16:29:23)
* vorburger: if you log and propagate, then you may have context info you want to provide (ryangoulding, 16:30:07)
* raises the point that suppressing the warnings for checkstyle is already pretty superfluous (ryangoulding, 16:31:14)
* may not be a good idea for future code (ryangoulding, 16:31:25)
* -1 vs -2 was given because it was two rules and only one was contentious (ryangoulding, 16:31:38)
* skitt: would two sets of rules be useful? (ryangoulding, 16:31:55)
* skitt: this rule is nice to make sure that errors are only logged in one place where you have all of the context (ryangoulding, 16:32:19)
* kernel is more prone to “rethrowing” the exceptions (ryangoulding, 16:32:44)
* either that or dump tons of data into the message, which isn’t great either (ryangoulding, 16:32:56)
* the other option is to add fields to Exception(s) (ryangoulding, 16:33:06)
* sometimes this isn’t great because the fields are not extracted in consumers (ryangoulding, 16:33:40)
* the information is available but not dumped into the log (ryangoulding, 16:33:47)
* that is somewhat worrisome (also formatting becomes an issue) (ryangoulding, 16:33:58)
* vorburger will abandon https://git.opendaylight.org/gerrit/#/c/43331/ (ryangoulding, 16:34:58)
* next odlparent release may require downstream consumers to reformat code (ryangoulding, 16:35:23)
* maybe revisit the abandoned idea in a future odlparent release (ryangoulding, 16:35:46)
* Exceptions lost in log from
background threads: How to hunt down ALL usages of thread factories, and make ALL of them use the setUncaughtExceptionHandler() ? Just grep all (kernel project) code? Any volunteers? Or is there any way we could enforce it? (ryangoulding, 16:35:58)
* FYI there are some IMHO neat utilities in infrautils around this now, see the ThreadFactoryProvider or directly the LoggingThreadUncaughtExceptionHandler, which we could use (but don't have to, of course) (ryangoulding, 16:36:20)
* background threads may blow up and not get logged (ryangoulding, 16:36:45)
* these are written to system.out (ryangoulding, 16:36:50)
* rgoulding: these will also get pushed into data/log/karaf.out (ryangoulding, 16:37:38)
* vorburger: should we push these all into karaf.log? (ryangoulding, 16:37:46)
* is there a way to set the UncaughtExceptionHandler everywhere? (ryangoulding, 16:37:58)
* rovarga: the semantic question is how do you deal with exceptions in a generic way? you don't (ryangoulding, 16:38:19)
* if an exception is automatically recoverable, it is not an exception by definition (ryangoulding, 16:38:36)
* vorburger means log and re-throw, just so it is put in the proper log file (ryangoulding, 16:39:36)
* rethrowing will then propagate the exception, making the thread die (ryangoulding, 16:40:04)
* the idea is that there may be several uncaught exceptions that are never even seen (ryangoulding, 16:40:21)
* these are bad problems that stay hidden under the radar unless you start watching karaf.out (ryangoulding, 16:40:35)
* rovarga says it is a JVM-level error and is thus handled using only what's available (stdout/stderr) (ryangoulding, 16:41:13)
* this could be handled by using ThreadPools or ThreadFactories that can be preconfigured to utilize a default handler (ryangoulding, 16:42:15)
* do a checkstyle plugin to check for instantiation of “naked” Threads and forbid it?
(ryangoulding, 16:42:42)
* it is a tradeoff between what we let people do with the platform vs. how consistent it is and what mistakes they can make (ryangoulding, 16:42:57)
* if we don't let them spin threads they won't make this mistake, but they also won't get to use thread instantiation (ryangoulding, 16:43:14)
* bala: could we redirect karaf.out to karaf.log? (ryangoulding, 16:43:58)
* rovarga: then you would have to do line buffering into karaf.log since there are multiple writers to karaf.log (ryangoulding, 16:44:19)
* Make sure that nobody just ignores returned Future (or CompletionStage, and others..) - how? FindBugs, and/or errorprone, and @CheckReturnValue ? How can we get started to become more serious about this, everywhere? (ryangoulding, 16:47:14)
* errorprone! (ryangoulding, 16:48:17)
* LINK: https://git.opendaylight.org/gerrit/#/c/62090/ (ryangoulding, 16:48:19)
* have to switch compiler (ryangoulding, 16:48:23)
* not the default one (ryangoulding, 16:48:28)
* we would maybe be able to do this as a regular job vs part of the standard build process (ryangoulding, 16:49:08)
* act in a reactive way (ryangoulding, 16:49:12)
* people will automatically just do a get() to get around this (ryangoulding, 16:49:44)
* there is room for abuse (ryangoulding, 16:49:53)
* checkstyle isn’t really capable (ryangoulding, 16:50:22)
* maybe findbugs (ryangoulding, 16:50:24)
* How to determine a good poll interval in async tests given a timeout? Util to (exponentially?) increase? See discussion in https://git.opendaylight.org/gerrit/#/c/61927/ (ryangoulding, 16:51:14)
* LINK: https://git.opendaylight.org/gerrit/#/c/61927/ (ryangoulding, 16:51:28)
* intended audience? (ryangoulding, 16:52:40)
* are they waiting for uSec, minutes, etc?
(ryangoulding, 16:52:47)
* this is the typical question you get in a utility dumping ground without users (ryangoulding, 16:52:59)
* it is probably fine for most users (ryangoulding, 16:53:12)
* this is used in Genius (ryangoulding, 16:53:17)
* is it the type of test in Genius that should eventually be rewritten to not rely upon timeouts? (ryangoulding, 16:53:37)
* testing something that required queue drainage (ryangoulding, 16:53:51)
* in int/test something similar exists (ryangoulding, 16:54:05)
* require clients to define the configurable parameters (ryangoulding, 16:54:16)
* brought up by vrpolak (ryangoulding, 16:54:35)
* rovarga: what is the use of this utility? this then becomes a very thin wrapper (ryangoulding, 16:54:50)
* vorburger points out that it's just 10 lines that he doesn’t want to copy and paste all over tests (ryangoulding, 16:55:28)
* who is allocating the queues? (ryangoulding, 16:58:27)
* LINK: https://bugs.opendaylight.org/show_bug.cgi?id=8927 (ryangoulding, 17:00:51)
* ACTION: jmorvay to take a look after blocker bugs are resolved (ryangoulding, 17:01:42)

Meeting ended at 17:02:35 UTC.

People present (lines said)
---------------------------

* ryangoulding (118)
* odl_meetbot (3)
* vrpolak (2)
* vorburger (1)
* klou (1)
* skitt (1)

Generated by `MeetBot`_ 0.1.4
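
For reference, the idea from the background-thread discussion above (thread factories preconfigured with an uncaught exception handler) could look roughly like the sketch below. The class name `HandlerThreadFactory` and its constructor are hypothetical, chosen for illustration; this is not the infrautils ThreadFactoryProvider or LoggingThreadUncaughtExceptionHandler mentioned in the minutes, and the demo collects messages in a list where real code would presumably call a logger.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper for illustration only; NOT the infrautils
 * ThreadFactoryProvider or LoggingThreadUncaughtExceptionHandler.
 */
final class HandlerThreadFactory implements ThreadFactory {
    private final String namePrefix;
    private final Thread.UncaughtExceptionHandler handler;
    private final AtomicInteger counter = new AtomicInteger();

    HandlerThreadFactory(String namePrefix, Thread.UncaughtExceptionHandler handler) {
        this.namePrefix = namePrefix;
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, namePrefix + "-" + counter.incrementAndGet());
        // Without an explicit handler, the JVM default prints the stack trace to System.err.
        thread.setUncaughtExceptionHandler(handler);
        return thread;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> reported = new CopyOnWriteArrayList<>();
        // Assumption: a real handler would write to the application log instead.
        ThreadFactory factory = new HandlerThreadFactory("demo",
            (thread, error) -> reported.add(thread.getName() + ": " + error.getMessage()));

        Thread worker = factory.newThread(() -> {
            throw new IllegalStateException("boom");
        });
        worker.start();
        worker.join(); // the handler runs on the dying thread, so it has finished by now
        System.out.println(reported);
    }
}
```

One caveat if such a factory is handed to an ExecutorService: exceptions thrown by tasks passed to `submit()` are captured in the returned Future and never reach the handler; only tasks started with `execute()` do. This ties in with the separate agenda item about ignored Futures.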
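
As a rough illustration of the poll-interval question discussed above, a test utility could derive exponentially growing sleep intervals from a timeout. This is only a sketch under assumptions: `BackoffPoller`, `pollIntervals`, `await`, and their parameters are hypothetical names, not the code under review in the linked Gerrit change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical test utility; names and parameters are illustrative assumptions. */
final class BackoffPoller {
    private BackoffPoller() { }

    /** Sleep durations that double from initialMs, are capped at maxIntervalMs, and sum to timeoutMs. */
    static List<Long> pollIntervals(long timeoutMs, long initialMs, long maxIntervalMs) {
        if (timeoutMs < 0 || initialMs <= 0 || maxIntervalMs < initialMs) {
            throw new IllegalArgumentException("need timeoutMs >= 0 and 0 < initialMs <= maxIntervalMs");
        }
        List<Long> intervals = new ArrayList<>();
        long remaining = timeoutMs;
        long next = initialMs;
        while (remaining > 0) {
            long sleep = Math.min(next, remaining); // never sleep past the deadline
            intervals.add(sleep);
            remaining -= sleep;
            // Double the interval up to the cap, written to avoid long overflow.
            next = next > maxIntervalMs / 2 ? maxIntervalMs : next * 2;
        }
        return intervals;
    }

    /** Polls the condition with backoff; returns false if it is still unmet after the last sleep. */
    static boolean await(BooleanSupplier condition, long timeoutMs, long initialMs, long maxIntervalMs)
            throws InterruptedException {
        if (condition.getAsBoolean()) {
            return true;
        }
        for (long sleep : pollIntervals(timeoutMs, initialMs, maxIntervalMs)) {
            Thread.sleep(sleep);
            if (condition.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }
}
```

A test might call `BackoffPoller.await(() -> queue.isEmpty(), 1000, 10, 200)`, where `queue` is whatever the test is draining. The total wall-clock time can slightly exceed the timeout because evaluating the condition takes time too, and, as raised in the meeting, callers still have to pick the initial and maximum intervals for their own time scale.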