15:05:01 #startmeeting neutron_northbound
15:05:01 Meeting started Mon Mar 20 15:05:01 2017 UTC. The chair is yamahata. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:05:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:05:01 The meeting name has been set to 'neutron_northbound'
15:05:07 #chair vthapar
15:05:07 Current chairs: vthapar yamahata
15:05:14 #topic agenda bashing and roll call
15:05:18 #info yamahata
15:05:21 #info vthapar
15:05:29 #info rajivk
15:05:34 #info mkolesni
15:05:52 Is there any additional topic today?
15:05:52 hi, might be a bit busy during the meeting but I'll try to attend fully
15:06:39 seems there are no special topics.
15:06:47 #topic Announcements
15:07:01 The ODL carbon M5 milestone has been reached.
15:07:16 Now it's code freeze.
15:07:31 Only bug fixes are allowed.
15:07:48 The Karaf 4 migration isn't fully done yet. It will be addressed by Dileep.
15:07:59 any other announcements?
15:08:20 #link https://wiki.opendaylight.org/view/NeutronNorthbound:Meetings meeting agenda
15:09:12 move on
15:09:23 #topic action items from last meeting
15:09:38 yamahata to send a mail for planning IRC discussion on https://review.openstack.org/#/c/407784/
15:09:44 vthapar to file an RFE bug for pre/post test hooks for ODL in tempest
15:09:59 yamahata sent out a mail on 407784. Today we have mkolesni.
15:10:23 vthapar: did you file an RFE bug?
15:10:45 yamahata: yes, just fetching the bug ID... laptop a bit slow due to ODL running in the background.
15:10:58 I see, thanks
15:11:27 #done vthapar filed https://bugs.launchpad.net/networking-odl/+bug/1672620
15:12:02 thanks for the link
15:12:09 #topic Pike planning
15:12:39 So far we have several spec proposals.
15:12:56 brb
15:13:01 #link https://review.openstack.org/#/c/443541/
15:13:41 For technical details we can discuss there.
15:13:51 #link https://blueprints.launchpad.net/networking-odl
15:14:22 Please update your blueprint status and upload a spec if necessary.
15:14:54 Although I haven't written a spec, I'm going to propose one: RPC from ODL.
15:15:15 https://git.opendaylight.org/gerrit/#/c/53242/
15:15:39 vthapar: you may be interested in it, and we'd like to discuss it at the ODL DDF.
15:15:46 I was on PTO most of last week being unwell; will get to adding blueprints/specs, if any, this week.
15:16:06 vthapar: thanks. take care of yourself first.
15:16:10 yamahata: +1
15:16:35 travel plans not finalized yet, but most likely I will be at the DDF and not in Boston.
15:16:52 any blueprint/spec to discuss specifically?
15:17:08 back
15:17:19 yes
15:17:30 yamahata: what are the plans for the RPC one?
15:17:52 as in, use cases.
15:18:05 to allow ODL to create the dhcp port.
15:18:32 #link https://review.openstack.org/#/c/443541/
15:18:37 aha, so ODL would be like an agent but with no heartbeat?
15:18:58 please review; I commented on it and updated some things due to the last review
15:19:02 vthapar: That's right. ODL is becoming more and more like an agent.
15:19:49 #action everyone review https://review.openstack.org/#/c/443541/
15:20:06 mkolesni, hi
15:20:14 rajivk_, hey, sup
15:20:24 mkolesni: do you have anything specific to discuss/explain today?
15:20:33 I think I reviewed it and found it in good shape :)
15:20:45 rajivk_, thanks :)
15:20:56 let's talk about 407784?
15:21:06 Do you think adding an example to the spec would help understanding? It is otherwise very well explained.
15:21:07 kind of an RFE really.. not a bug fix
15:21:28 rajivk_, what do you mean, an example?
15:21:36 I'm concerned that although it claims to improve CPU usage, there's no benchmark.
15:21:56 I mean, take a table and the dependencies table you are proposing
15:21:58 i.e. no plan to verify it.
15:22:06 fill in some entries with random examples
15:22:17 Otherwise it seems a good experiment for improvement.
15:22:23 yamahata, as I said, if we have the post-mortem tool I can run it before and after
15:22:30 say on the tempest logs
15:22:41 and see what it gives
15:22:53 mkolesni: I just noticed the spec. Just to make sure my quick read is correct: the aim is to do the check before adding to the journal, and record dependency information in the journal itself so we can query it, correct?
15:22:57 if that's the concern
15:23:30 vthapar, the tl;dr is to calculate the dependencies when we add the entry, based on the state of things at that point
15:23:43 vthapar, instead of trying to guess it later
15:23:43 No. I'm concerned about CPU usage etc.
15:23:58 mkolesni, do you think we can make some kind of dependency graph and keep it in memory?
15:24:00 yamahata, I'm not sure why you think CPU usage will increase
15:24:06 We need to check whether CPU usage is improved by the change with some sort of benchmark.
15:24:22 mkolesni: you're claiming so in lines 43-47
15:24:42 mkolesni, I think yamahata's concern is claiming it in the spec without a means to validate it
15:24:46 I claim it will probably improve things from a design perspective
15:25:25 surely you can't think this will be worse than the current situation where dependencies are checked >= 1 times for each entry
15:25:39 whereas I'm proposing a change that checks exactly 1 time
15:26:14 not sure how in this scenario you imagine that more CPU can ever be used when we're performing a set number of operations instead of a variable number
15:26:34 extra work is added at entry creation time.
15:26:46 no
15:27:03 work is moved from entry selection time to entry creation time
15:27:22 there will be no dependency calculation done on entry selection
15:27:55 only an additional condition on the DB query, which of course could be optimized if necessary, but I don't see a reason to look into it now
15:28:29 i.e. where not exists (select 1 from dependency_table where dependent_id = row.seqnum)
15:28:32 the number of potential dependent rows would be larger at creation time than at selection time.
15:28:35 or something of the sort
15:28:50 So we're not certain about it.
15:29:00 what do you mean?
15:29:01 The golden rule of optimisation is to measure it.
15:29:27 look, the change comes to address an architectural need; I'm sorry if I wasn't clear in the spec
15:29:30 At least we should do some sort of benchmark.
15:29:52 it can be an easy one. it doesn't have to be extensive.
15:29:54 I don't care about any possible performance improvement or impact, though I don't see how there will be one
15:30:28 I'm more interested in a way to know at any second what exactly each entry depends on
15:30:51 That's great.
15:30:55 and doing this at create time is vastly better since at that point you know the state of things you're working with
15:31:00 Why do you not want to do a benchmark?
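A minimal sketch of the NOT EXISTS selection mkolesni describes above, written in SQLAlchemy. The table and column names (journal, journal_dependencies, seqnum, state, dependent_id, depends_on) are illustrative assumptions, not the actual networking-odl schema:

    from sqlalchemy import Column, Integer, MetaData, String, Table, select

    metadata = MetaData()

    # hypothetical journal table; the real networking-odl schema differs
    journal = Table(
        "journal", metadata,
        Column("seqnum", Integer, primary_key=True),
        Column("state", String(16)),  # e.g. 'pending', 'processing'
    )

    # one row per (dependent -> prerequisite) edge, written once at entry
    # creation time, so selection never has to recalculate dependencies
    journal_dependencies = Table(
        "journal_dependencies", metadata,
        Column("dependent_id", Integer),  # the blocked entry
        Column("depends_on", Integer),    # the entry it waits for
    )

    def pick_next_entry(conn):
        """Return the oldest pending entry with no outstanding dependencies."""
        dep_exists = (
            select(journal_dependencies.c.depends_on)
            .where(journal_dependencies.c.dependent_id == journal.c.seqnum)
            .exists()
        )
        query = (
            select(journal)
            .where(journal.c.state == "pending")
            .where(~dep_exists)  # i.e. WHERE NOT EXISTS (...)
            .order_by(journal.c.seqnum)
            .limit(1)
        )
        return conn.execute(query).first()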
15:31:17 doing it on entry selection, the picture has already changed since creation time, so the data might be stale
15:31:43 the benchmark is just not important; it's not the main point of this change
15:31:58 we need to do benchmarks in general for numerous scenarios
15:32:13 but I don't see why we need one specifically for this spec
15:32:30 but I can remove any mention of performance impact if that's better
15:32:35 I agree that we need benchmarks in general.
15:32:48 I'm not asking for an extensive one. Just a simple one will be okay.
15:32:59 there is no simple benchmark
15:33:13 run rally and see its CPU usage.
15:33:17 That would be an easy one.
15:33:24 benchmarking is always complicated since it's hard to really measure what you want, or even define what you want to measure
15:34:22 I don't see what that gives you; how will you know it improved anything?
15:34:34 let's say CPU usage is 10% higher, then what?
15:34:48 The assumption is that when selecting an entry, the number of potential dependent rows is very small.
15:34:49 maybe the run time of rally decreased?
15:35:27 the possibility of retrying again would be low.
15:35:37 the idea of doing the dependency calculations on select is wrong; as I said, the state of things has changed, so you might be getting stale data
15:35:44 mkolesni, I think I can take on this benchmarking task, and yamahata can guide me.
15:35:44 If it's large, there is some anomaly.
15:35:48 Is that ok?
15:36:03 did you see the recent logs from the full sync testing by community members?
15:36:29 rajivk_, I don't have a problem with that, but I think doing it specifically here is not the main point
15:37:17 if you look at the logs you'll see there are some entries retrying more than 100 times due to dependency checks
15:37:30 apart from CPU usage, I'm quite keen on the code cleanup/improvement.
15:37:38 Let's see your outcome.
15:38:13 mkolesni: I'm saying that's an anomaly. we should change the entry selection logic.
15:38:50 You've stuck to selecting entries in a round-robin manner. that can be changed.
15:39:05 ok, I'll write your idea in the spec though I don't believe in it
15:39:29 can we talk about 407784?
15:39:51 We've spent too much time on this single one.
15:40:04 Let's move on to the other one.
15:40:18 if we're on the topic of performance, that change imposes a penalty on any running system for the sake of debugging
15:40:37 whereas a post-mortem tool can provide the same data without a penalty to an active system
15:40:47 that's why I -2'd it
15:41:06 it's basically putting some kind of profiling in a system
15:41:24 which no one would do in a production system
15:41:36 not on a fixed interval at least
15:42:11 mkolesni, maybe we can provide some kind of configurable option, which developers can use to turn it on, and turn it off in production
15:42:27 mkolesni, what do you think?
15:43:03 In production the admin will always have the option to profile from the command line.
15:43:12 we could, though again why do you need it, given that you can achieve the same outcome with a post-mortem tool?
15:43:26 which was discussed on that spec
15:44:06 I don't think so. Surely we can log entry creation/deletion, but it's hard to track the number of pending rows.
15:44:10 because of log rotation.
15:44:36 log rotation is controllable externally so I don't see how it's a problem
15:45:19 usually log rotation is set per process
15:45:33 if we don't set log rotation for the profiling use case
15:45:43 it can increase the log size too much.
15:45:57 and multiple neutron servers are running, so the creation/deletion log entries will be scattered.
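A rough sketch of the post-mortem direction mkolesni argues for above: an offline analyzer that counts how often each journal entry appears in a neutron server log, to surface the >100-retry cases. The regex is a placeholder assumption; the real networking-odl log format would need to be plugged in:

    import collections
    import re
    import sys

    # placeholder pattern -- adjust to the actual journal log lines
    ENTRY_RE = re.compile(r"journal entry (?P<seqnum>\d+)", re.IGNORECASE)

    def retry_counts(log_path):
        """Count how many times each journal entry appears in the log."""
        counts = collections.Counter()
        with open(log_path) as log:
            for line in log:
                match = ENTRY_RE.search(line)
                if match:
                    counts[match.group("seqnum")] += 1
        return counts

    if __name__ == "__main__":
        # entries seen many times were likely retried due to dependency checks
        for seqnum, n in retry_counts(sys.argv[1]).most_common(10):
            print("entry %s: seen %d times" % (seqnum, n))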
15:46:17 yamahata, you mean in an HA env?
15:46:22 mkolesni: yes.
15:46:39 you can use logstash, or probably simple rsyslog, to funnel them to a single place
15:47:08 That would make these tools necessary in the production system.
15:47:15 In that case, I wouldn't say it's easily realisable.
15:47:32 However, some deployments may have some other mechanism for logging.
15:47:50 I agree with mkolesni's idea that it should not be in production
15:48:03 and with yamahata about log rotation.
15:48:07 the whole idea was for development, no?
15:48:24 I just think this might be useful for debugging real production systems
15:48:33 but usually there you won't have access to the DB
15:48:37 The HA environment also has to be tested somehow.
15:48:43 you'll usually have just the logs
15:49:55 yes, but the admin might have access.
15:50:06 However, I don't know whether people would use this tool or not.
15:50:21 if I have a bug in LP, usually I don't have access to the actual prod env
15:50:51 also it might be a few days or even a month before someone looks at it
15:51:00 that's why analyzing logs is a better idea
15:51:18 we can have both tools.
15:51:42 why would you have that?
15:51:53 two tools that do the same thing that we need to maintain?
15:52:06 My original intention is to check the number of backlogged entries in the journal DB at runtime.
15:52:28 log rotation, and collecting logs in one place, make it difficult to analyze them.
15:52:31 ok, great, so run your log analyzer periodically then
15:52:34 Once we're satisfied with the log-analyzing tools, we can remove the tool that accesses the DB.
15:52:51 Can't we think about some other way of achieving the same results?
15:53:42 Some better way, where everyone's points get resolved.
15:53:50 how would the tool access the DB? doesn't it need the DB user/pass?
15:54:16 Yes, it needs them.
15:54:20 Maybe the conf file is available.
15:54:34 there's 5 min left.
15:55:02 can we continue the discussion on gerrit and move on to other topics/patches/bugs?
15:55:09 sure
15:55:14 +1
15:55:24 #topic carbon/nitrogen planning
15:55:45 nothing big. Carbon is approaching its end. Nitrogen planning will be discussed at the ODL DDF in May.
15:55:49 #topic patches/bugs
15:55:51 unfortunately it's a holiday here during the DDF so I won't be able to make it there
15:55:54 any other patches/bugs?
15:56:07 mkolesni: Ohhh that's unfortunate.
15:56:19 At least there are tempest failures related to SSH.
15:56:21 yamahata, sorry for asking a lot, but Launchpad does not seem to be updated.
15:56:36 also since the summit talks weren't accepted I probably won't be going
15:56:37 rajivk: which one?
15:56:57 there are some critical bugs.
15:57:08 rajivk: I see.
15:57:27 #action yamahata check/scrub launchpad bugs
15:57:38 #link https://review.openstack.org/#/c/445251/ SSH failures
15:57:58 rajivk has checked the log
15:58:21 yes, actually the machines were not getting IPs as far as I could understand the logs.
15:58:25 It's a critical issue and is blocking the ocata and newton backports.
15:59:14 so do you know why it's happening?
15:59:28 it seemed like a change in ODL caused this
15:59:29 no, it has to be investigated.
15:59:43 I have to check the scenarios etc.
15:59:52 they're not getting internal or floating IPs?
16:00:04 But due to my lack of n-odl knowledge, it might be difficult for me.
16:00:08 internal IPs.
16:00:16 sometimes there are failures due to the FIP not being set in ODL correctly, but internal IPs work fine
16:00:38 then DHCP needs to be investigated
16:00:45 The log I checked did not have internal IPs.
16:00:52 sorry folks, I need to head out. will check the meeting logs later.
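A minimal sketch of the runtime check yamahata describes at 15:52-15:54: count backlogged journal rows straight from the DB, pulling credentials from neutron's config. The config path, the plain INI-style parsing, and the 'opendaylightjournal' table and state column names are assumptions about a typical deployment:

    import configparser

    from sqlalchemy import create_engine, text

    NEUTRON_CONF = "/etc/neutron/neutron.conf"  # deployment-specific

    def journal_backlog(conf_path=NEUTRON_CONF):
        """Return a mapping of journal state -> row count."""
        cfg = configparser.ConfigParser()
        cfg.read(conf_path)  # assumes a simple INI-style config file
        engine = create_engine(cfg["database"]["connection"])
        with engine.connect() as conn:
            rows = conn.execute(text(
                "SELECT state, COUNT(*) FROM opendaylightjournal "
                "GROUP BY state"))
            return dict(rows.fetchall())

    if __name__ == "__main__":
        print(journal_backlog())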
16:00:53 Yeah, I suspected the floating IP. internal IPs sound a bit surprising.
16:01:05 I have shared the link to the logs
16:01:06 bye vthapar
16:01:15 vthapar: bye
16:01:23 you can verify from the logs
16:01:34 the ifconfig command does not show any IPs.
16:01:48 because of that there are no routes added.
16:01:54 Let me paste the link here.
16:02:10 rajivk_, did they get an IP at boot? it usually shows in the log from the cirros guest
16:02:17 #link http://logs.openstack.org/51/445251/1/check/gate-tempest-dsvm-networking-odl-boron-snapshot-v2driver/4293447/console.html#_2017-03-18_07_24_23_683290
16:02:29 I checked only those logs
16:03:22 The failure of all the test cases seems to be just this.
16:03:28 2017-03-18 07:24:23.610149 | 2017-03-18 07:24:23.609 | udhcpc (v1.20.1) started
16:03:28 2017-03-18 07:24:23.611594 | 2017-03-18 07:24:23.611 | Sending discover...
16:03:29 2017-03-18 07:24:23.613118 | 2017-03-18 07:24:23.612 | Sending discover...
16:03:29 2017-03-18 07:24:23.614824 | 2017-03-18 07:24:23.614 | Sending discover...
16:03:29 2017-03-18 07:24:23.616433 | 2017-03-18 07:24:23.616 | Usage: /sbin/cirros-dhcpc
16:03:30 2017-03-18 07:24:23.618135 | 2017-03-18 07:24:23.617 | No lease, failing
16:03:31 None of the instances got IPs.
16:03:38 yes, not getting an IP from DHCP
16:03:48 oh.
16:04:35 it should have IP 10.1.0.6 in this case
16:04:56 so ODL somehow failed to provide it via DHCP
16:05:06 but you need to look at the ODL log perhaps
16:05:18 i.e. the karaf log
16:05:45 How does ODL provide the IP? does it get it from the DHCP namespace or some other mechanism?
16:06:09 with OpenStack CI, q-dhcp is running.
16:06:21 ODL sets up only the OF rules.
16:06:39 there's also an option for ODL to act as the DHCP server
16:07:04 I read that somewhere; we have ODL experts here.
16:07:11 as you can see it doesn't fail in beryllium
16:07:33 so it's something that got broken in carbon and backported to boron
16:07:51 sorry, gtg
16:08:00 thanks guys, have a nice day
16:08:08 bye, mkolesni
16:08:13 have a good day :)
16:08:21 bye, thanks :)
16:08:25 okay, thanks everyone
16:08:31 #topic open mike
16:08:42 #topic cookies
16:08:47 #endmeeting
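For the DHCP failure discussed above, a small triage helper that scans a gate console log for the cirros udhcpc pattern pasted in the meeting. The search strings come from that excerpt; the log URL is whichever run is being debugged:

    import re
    import sys
    import urllib.request

    def dhcp_failure_summary(console_log_url):
        """Count DHCP discover attempts and hard failures in a console log."""
        with urllib.request.urlopen(console_log_url) as resp:
            text = resp.read().decode("utf-8", errors="replace")
        discovers = len(re.findall(r"Sending discover\.\.\.", text))
        failures = len(re.findall(r"No lease, failing", text))
        return discovers, failures

    if __name__ == "__main__":
        sent, failed = dhcp_failure_summary(sys.argv[1])
        print("discover attempts: %d, DHCP failures: %d" % (sent, failed))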