15:01:01 #startmeeting neutron_northbound
15:01:01 Meeting started Mon Jun 19 15:01:01 2017 UTC. The chair is yamahata_. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:01:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:01 The meeting name has been set to 'neutron_northbound'
15:01:08 #chair rajivk
15:01:08 Current chairs: rajivk yamahata_
15:01:16 #topic agenda bashing and roll call
15:01:25 #info yamahata
15:01:34 #info rajivk
15:01:40 any topics in addition to usual ones?
15:01:59 not
15:02:00 right now, openstack CI is heavily broken. so I'd like to share it later.
15:02:28 #topic Announcements
15:02:49 ODL karaf4 migration is on-going.
15:03:34 #link https://wiki.opendaylight.org/view/Karaf_4_migration
15:04:13 Within 2 weeks, nitrogen branch may be broken heavily. Especially all the projects are kicked out from karaf distribution.
15:04:20 And then will be added.
15:05:00 also karaf feature start up would be issue as it's known.
15:05:12 hi
15:05:15 With karaf4 migration, bootFeature may be dropped.
15:05:18 mkolesni: hi.
15:05:18 #info mkolesni
15:05:27 sorry had some traffic on the way home
15:06:45 The next thing is, right now networking-odl is heavily broken.
15:06:58 The issue is already tracked down.
15:07:08 devstack change and neutron security group ovo.
15:07:36 several patches are on the way. So please help to make them merged
15:08:08 #link https://bugs.launchpad.net/networking-midonet/+bug/1698129 devstack fails trying to create systemd user unit file for q-agt
15:08:26 This is the devstack issue
15:08:56 the another one is, neutron security group ovo broke several decomposed modules.
15:09:26 The patch for neutron, https://review.openstack.org/#/c/448420/ was already merged.
15:09:32 hi
15:10:06 So the patches for networking-odl can be found at https://review.openstack.org/#/q/topic:security-group-ovo
15:10:25 Finally, they need to be squashed into single patch to make CI pass.
15:10:28 manjeets: hi
15:10:51 That's the current situation and what I'm aware of.
15:11:00 any thing else to announce?
15:12:12 seems nothing, move on.
15:12:15 #topic action items from last meeting
15:12:38 961735
15:12:40 no action items except patch review.
15:12:45 sry ignore
15:13:09 #topic patches/bugs
15:13:22 any patches/bugs?
15:13:25 mkolesni: you're on stage
15:13:27 yes
15:14:06 wrong number?
15:14:15 ive been running some perf testing on https://review.openstack.org/#/q/status:open+project:openstack/networking-odl+branch:master+topic:bp/dep-validations-on-create
15:14:23 rally mostly
15:15:05 i didnt see any improvement or any significant negative impact on performance
15:15:25 the most improvement i saw was when ODL was down
15:15:59 now i have pretty basic h/w so not sure if cranking up the thread count might lead to different results
15:16:14 so basically we need to decide if we want this or not
15:16:51 the main thing about the change of dependency calculation is that it prevents coders from introducing certain bugs such as cyclic dependency
15:17:01 Are those performance improvement, right?
15:17:18 or for example a bug where older entry depends on newer entry
15:17:39 the main goal was always the code improvement not performance improvement
15:17:42 we can do a review if it make code simpler, I guess it be worth keeping it to prevent future bugs
15:18:29 sure feel free
15:18:40 467060 looks like breaking the core logic.
15:19:02 If we'd like to avoid lock for update, for example, we can do update where limit 1 etc.
15:19:27 no thats one is needed because otherwise theres deadlocks because of cross dependency not being handled well by mysql
15:19:35 Then we can pick journal entry with single sql statement.
15:19:59 its not breaking anything, how do you arrive at that conclusion?
15:20:09 cross dependency? can you please elaborate it?
15:21:00 the change 453581 changes so that not any entry is selected but just latest that doesnt have unprocessed dependencies
15:21:41 so upon insert into dependency table it will hold a lock already over the journal table
15:22:00 obviously since the journal entry is first created and then the dependency table rows are created
15:22:32 but the select for update lock in the opposite direction, first the dependency table and then the journal table is locked
15:22:44 so obviously opposite order here results in deadlock
15:22:56 i know of no way to fix the order of locking in the select
15:23:10 and in the inserting transaction the journal table has to be created beforehand
15:23:10 453851? I suppose typo. it's for openstack-infra/zuul
15:23:42 the change 453581 ...
15:23:47 no typo
15:23:55 https://review.openstack.org/453581
15:23:59 full link
15:24:23 ho sorry. my typo locally
15:24:27 so the only way to avoid tons of deadlocks that i know is to change to optimistic locking
15:24:37 unless someone has a better idea
15:25:03 i noticed that on a rally gate run the number of deadlocks can reach even 1000
15:25:48 problem is mysql usually will restart the transaction with the insert and not the other one, even thought the inserting one is doing more work
15:25:59 otherwise we could leave with a deadlock
15:26:48 but when a port operation takes ~10 seconds end to end, a deadlock causes retry which will take this same time so it might at works case take 5x time to retry (until max retries is eached)
15:27:54 hence the optimistic locking patch
15:28:34 Is your rally run with the patch of https://review.openstack.org/#/c/453581/ ?
15:28:50 If so, I suppose I understand your description.
15:29:27 yes
15:29:39 its build on top of each other
15:30:34 gotcha
15:32:13 so we need to decide on this change as a whole
15:32:46 My concern is 453581.
15:33:10 What about if two resources with dependency are updated simultaneously?
15:33:11 i think since theres no performance impact then its better than existing code for dependency calculations which is more bug prone
15:33:35 Doesn't it miss depenency? or don't we need to calculate dependency atomically somehow?
15:33:42 this was discussed previously and we arrived at the conclusion its impossible
15:33:52 see YAMAMOTOs comment
15:34:00 on this issue
15:34:07 It
15:34:12 It's with same resource.
15:34:18 yes
15:34:23 same resource
15:34:25 How about different type. e.g. security group and security group rule?
15:34:37 or whatever two kind of resources?
15:35:05 so whats the problem there?
15:35:27 if you have a concrete concern please explain whats the possible bug there
15:35:50 Suppose network update and port update.
15:36:05 ok what will exactly cause a bug? what payload the update carries?
15:36:06 network and port are only example. generally two type of resource with dependency.
15:36:35 dependency calculation for network and port is done without other update.
15:37:06 so what? whats the exact payload of update that causes a problem?
15:37:08 then the dependency table will be inserted at the same time.
15:37:19 so the late commer will miss the dependency.
15:38:09 again not sure what the exact problem with payload there?
15:38:18 you don't care? then why do we have depdenency validation?
15:38:22 what is the network update?
15:38:39 the scenario youre describing could happen today as well
15:38:53 so im not sure whats exactly your concern
15:39:28 the patch doesnt aim to solve every single problem, just make the dependency calculations mechanism more resillient
15:39:56 it cant solve any problem in networking odl dependencies since thats not the aim
15:40:09 The patch introduces new issue.
15:40:17 again no it doesnt
15:40:23 If we don't care inter-resource dependency,
15:40:23 this same issue can happen today
15:40:32 this can happen with todays code
15:40:39 okay, we can drop whole dependency validation logic.
we can have only same resource check.
15:40:43 the winner of the race will be independent
15:40:55 im not sure what you want
15:41:01 Is it the direction you'd like to go for?
15:41:08 this problem exists today and im not trying to solve it
15:41:27 no the direction is to improve the code without introducing new issues
15:41:40 this is not a new issue youre describing
15:41:48 and also not a very interesting one
15:41:56 but thats besides the point
15:42:29 but by your logic maybe we should drop the entire V2 driver and go back to V1
15:42:43 since V2 driver obviously has issues not present in V1 driver
15:42:51 is that your direction?
15:43:06 No. I'm guessing direction you'd like to go for?
15:43:27 i feel this project has too much bike shedding and not enough real work done
15:43:52 for instance you approved a patch with numerous issues
15:44:09 so now we had to revert and re-work it to iron out the issues
15:44:44 I'm very glad that you jumped in to address that patch.
15:44:50 so please lets try to avoid bike shedding and get some work done, we dont have much until pike-3 which is feature freeze
15:45:47 Right now, we would need more reviewer to cover as a whole.
15:45:58 not only the area interested to contributor.
15:46:09 yes that would also help with project health
15:47:34 but less bike shedding and focusing on unreal stuff and instead making real iterative progress would also be very beneficial
15:48:48 I agree with less bike shedding.
15:49:51 great so lets try to make some real progress
15:50:10 pike 3 is quickly approaching and everyone has other work to take care of
15:50:52 ok please if you find real issues post it on review
15:51:03 So far CI isn't in good shape. especially ssh via floating ip doesn't work with tempest.
15:51:06 but please dont post issues that already exist in current system
15:51:33 i can try to take a look at it but this week im swamped and also i have to review rajivk's patch
15:51:34 So we've disabled related tests.
They needs fixed and also several combination isn't working.
15:51:57 also next week ill be at a company summit thing so i will be of limited availability
15:52:11 ok
15:52:26 Oh i see. Enjoy the event!
15:52:52 thanks
15:52:58 :)
15:53:12 except the recent breakage, the situation is't good. we need to address them for neutron stadium.
15:53:21 okay, we're running out of time.
15:53:26 anything else to discuss?
15:53:31 nothing from me
15:53:33 #topic open mike
15:53:43 #link http://grafana.openstack.org/dashboard/db/networking-odl-failure-rate
15:53:52 yamahata_, a suggestion
15:54:17 manjeets: sure. what's suggestion?
15:54:23 if everyone agrees we can have slot for discussing new features within this meeting
15:54:34 like dhcp-port for odl is coming
15:55:06 we can do that but please keep in mind that after pike-3 we should be stabilizing the code and fixing bugs
15:55:19 like we discuss bugs so we'll have a overview in this meeting what new things are coming will help on prioritizing reviews
15:55:32 especially since we switched to release with milestones work mode
15:55:50 mkolesni, sure !! I agree
15:56:34 ok so we can discuss it in the bugs slot or before that
15:57:13 i want to discuss about db locking patch.
15:57:14 we can give it a try to experiment.
15:57:36 rajivk: please go ahead.
15:57:44 mkolesni: do you have time?
15:57:51 i have to go in a couple of minutes
15:58:00 Currently i have made periodic class singelton
15:58:15 we will not be able to configure all the tasks time interval
15:58:21 rajivk, can we do it earlier tomorrow or do you need yamahata_ for this?
15:58:38 we can discuss it tomorrow as well
15:58:41 that's fine
15:58:48 ok cool
15:59:01 i have an idea but lets discuss tomorrow i really gtg
15:59:14 anything else last minutes?
15:59:19 yes
15:59:38 yamahata, i would like db locking and threadpool patch to into pike release if possible
15:59:55 gtg bye guys
15:59:57 rajivk: I see. okay.
16:00:14 did you check failure of CI?
16:00:30 Yes, now I'm addressing CI breakage for now.
16:00:41 and manjeets is looking at floating ip issue.
16:00:52 We're also talking is netvirt people.
16:01:07 Off course it's very glad for you to jump in.
16:01:22 we run out of time.
16:01:27 i went through discussion, it seems now floating ip issue is resolved. as per last comment from manjeet ODL bug.
16:01:28 thank you everyone
16:01:48 rajivk: yeah. after fixing CI breakage, we'll see the result.
16:02:05 i tried on the same day to verify it
16:02:24 but i was not able to verify it.
16:02:33 Ohh,
16:02:37 I checked you last patch for bug fix
16:03:10 rajivk: please put your comment on the review. then we can continue discussion and investigate the issues.
16:03:22 yamahata_, ok
16:03:33 manjeets: are you still there?
16:03:36 yes
16:03:43 please follow it up.
16:03:58 anything else?
16:04:11 no
16:04:19 okay thank you!
16:04:25 #topic cookies
16:04:27 yamahata_, what the status of ci after neutron sg patch merge ?
16:04:33 #endmeeting
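
Note on the optimistic locking idea raised under patches/bugs (15:24:27): the log does not show how the patch implements it, so the following is only a minimal sketch of the general technique. The `journal` table, its `seqnum`, `state` and `owner` columns, and the `claim_oldest_pending` helper are all hypothetical and are not taken from networking-odl. It uses SQLite in memory so it runs on its own, and it leaves out the dependency check entirely. A worker reads a candidate row without taking a lock, then claims it with a conditional UPDATE and checks the affected row count to see whether it won.

```python
import sqlite3


def claim_oldest_pending(conn, worker):
    """Claim the oldest pending row without SELECT ... FOR UPDATE.

    Hypothetical helper: returns the claimed seqnum, or None if no
    pending row is left.
    """
    while True:
        # Plain read, no row lock is taken here.
        row = conn.execute(
            "SELECT seqnum FROM journal WHERE state = 'pending' "
            "ORDER BY seqnum LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Conditional write: it only matches if the row is still pending,
        # so at most one worker can flip a given row.
        cur = conn.execute(
            "UPDATE journal SET state = 'processing', owner = ? "
            "WHERE seqnum = ? AND state = 'pending'",
            (worker, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # rowcount == 0: another worker claimed the row first; try again.


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal ("
    "seqnum INTEGER PRIMARY KEY, state TEXT NOT NULL, owner TEXT)"
)
conn.executemany(
    "INSERT INTO journal (state) VALUES (?)",
    [("pending",), ("pending",)],
)
conn.commit()

first = claim_oldest_pending(conn, "worker-a")
second = claim_oldest_pending(conn, "worker-b")
third = claim_oldest_pending(conn, "worker-c")
owners = conn.execute(
    "SELECT seqnum, owner FROM journal ORDER BY seqnum"
).fetchall()
```

In this sketch a lost race costs one extra read and write rather than a rolled-back transaction. Whether that holds for the real journal and dependency tables under MySQL is not something this sketch demonstrates.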