15:08:52 <yamahata> #startmeeting neutron_northbound
15:08:52 <odl_meetbot> Meeting started Mon Nov 13 15:08:52 2017 UTC. The chair is yamahata. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:08:52 <odl_meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:08:52 <odl_meetbot> The meeting name has been set to 'neutron_northbound'
15:08:59 <yamahata> #chair mkolesni rajivk_ mkolesni
15:08:59 <odl_meetbot> Current chairs: mkolesni rajivk_ yamahata
15:09:03 <yamahata> #chair mpeterson
15:09:03 <odl_meetbot> Current chairs: mkolesni mpeterson rajivk_ yamahata
15:09:10 <yamahata> #topic agenda bashing and roll call
15:09:14 <mkolesni> #info mkolesni
15:09:16 <yamahata> #info yamahata
15:09:17 <rajivk_> #info rajivk
15:09:30 <mpeterson> #info mpeterson
15:09:33 <yamahata> today any additional topics?
15:09:44 <mkolesni> theres a ton of stuff on the stable/pike branch waiting for review
15:09:50 <mkolesni> most of it +1 by us
15:09:55 <mpeterson> yes, I have two topics
15:10:08 <mkolesni> how would we approach this wrt stable maint team?
15:10:11 <mpeterson> #info ODL cleanup between tests
15:10:33 <mpeterson> #info Neutron/Infra/etc updates during their meetings
15:11:04 <yamahata> this week we have three extra topics
15:11:14 <yamahata> any other topics?
15:11:42 <mpeterson> no
15:12:07 <mkolesni> none from me
15:12:16 <rajivk_> none from me
15:12:19 <yamahata> okay, move on
15:12:21 <yamahata> #topic Announcements
15:12:36 <yamahata> Last week there was openstack summit. there was forum.
15:12:50 <yamahata> #link https://wiki.openstack.org/wiki/Forum/Sydney2017 openstack forum
15:12:56 <yamahata> #link https://etherpad.openstack.org/p/Neutron-pain-points-Sydney
15:13:18 <yamahata> We'd like to check what was discussed there.
15:13:51 <yamahata> #link https://www.openstack.org/ptg/
15:14:06 <yamahata> openstack PTG is planned on the week of Feb 26, 2018 in Dublin.
15:14:15 <yamahata> For now the details aren't announced yet.
15:14:16 <mkolesni> im still waiting on an answer from you guys :)
15:14:54 <mpeterson> #info openstack PTG is planned on the week of Feb 26, 2018 in Dublin. No more details at the moment.
15:14:59 <yamahata> Without concrete plan, it's difficult for me to get travel budget.
15:15:04 <mkolesni> if youre going to attend or not
15:15:19 <mkolesni> tell them rest of n-odl team will be there :)
15:15:22 <yamahata> I need to wait for more details to be announced.
15:16:16 <yamahata> any other announcement?
15:17:07 <yamahata> seems nothing. let's move on
15:17:11 <yamahata> #topic action items from last meeting
15:17:27 <yamahata> There is only timeslot and patch review stuff.
15:17:41 <yamahata> okay, now move on to proposed topics
15:17:48 <yamahata> #topic approach for stable branch
15:18:04 <yamahata> mkolesni: please go ahead.
15:18:29 <yamahata> basically they don't monitor backport patches. we have to add them as reviewers explicitly.
15:18:34 <mkolesni> yeah i just said theres a few things we backported there
15:18:36 <mkolesni> ok
15:18:46 <mkolesni> i thought they have some weekly review or something?
15:19:02 <mkolesni> we need to explicitly add the individuals to the patches?
15:19:27 <yamahata> the latter.
15:19:45 <yamahata> We need to add the maintenance team explicitly to the patch review.
15:19:45 <mkolesni> well thats unfortunate..
15:20:00 <yamahata> They have many patches to review, they don't actively poll our patches.
15:20:09 <mkolesni> ok will add them
15:20:22 <yamahata> Once they are added, usually they will review in several days.
15:20:34 <mkolesni> i think maybe we can raise this on neutron level
15:20:40 <mpeterson> yamahata: mmm and who would they be? I usually see they are added however they just linger without +2 +W the patches
15:20:48 <mkolesni> like why dont we moderate stable ourselves?
15:21:19 <yamahata> For now that's the neutron team consensus.
15:21:34 <yamahata> So far do you have any issues?
15:21:44 <yamahata> except patch review without adding reviewers?
15:22:11 <yamahata> #link https://review.openstack.org/#/admin/groups/539,members neutron stable maint team
15:22:36 <mkolesni> im not sure if its even relevant to have them moderating it
15:22:38 <mpeterson> #info Regarding Stable patches in order for them to be reviewed the neutron stable maint team has to be manually added to the patch. Link above.
15:23:23 <mkolesni> maybe we can have a "stable liaison" that will moderate so that they do less checks on each patch
15:23:59 <mkolesni> also probably some of the checks can be done by CI, i.e. don't backport db migration scripts, that stuff
15:25:19 <yamahata> If so, we need to raise it to change the neutron stadium governance.
15:26:11 <yamahata> So far do we have big issue to change the neutron governance?
15:26:20 <mkolesni> no
15:27:31 <yamahata> If we have issue on reviewing of backport patches with adding them as reviewers, let's discuss it again.
15:28:00 <yamahata> Is it okay for you? mkolesni
15:28:27 <mkolesni> ok
15:28:46 <yamahata> thanks. then next topic
15:28:50 <yamahata> #topic ODL cleanup between tests
15:28:56 <yamahata> mpeterson: you're on stage
15:29:00 <mpeterson> cool
15:29:04 <mpeterson> so the thing is the following
15:29:09 <mkolesni> sorry have to go
15:29:12 <mkolesni> see you guys next week
15:29:25 <mpeterson> neutron has a guideline of not cleaning up after tests and instead it deletes the db and creates it again, right?
15:29:26 <yamahata> mkolesni: see you next week. It's 17:00UTC.
15:29:58 <mpeterson> well, because of that for example bgpvpn doesn't cleanup on failures
15:30:20 <yamahata> You mean test=unit tests.
15:30:22 <mpeterson> which means, since we are interacting with ODL, that ODL ends in a dirty status
15:30:31 <mpeterson> unit and functional are affected
15:30:46 <mpeterson> #info https://bugs.launchpad.net/bgpvpn/+bug/1723725 reference on the issue
15:30:54 <mpeterson> #link https://bugs.launchpad.net/bgpvpn/+bug/1723725 reference on the issue
15:31:06 <yamahata> Usually yes because it causes non-determinism depending on order of running test cases
15:31:49 <mpeterson> so basically we need to trigger a cleanup of ODL between tests
15:32:31 <mpeterson> in a similar way as neutron does it by dropping the DB
15:32:40 <yamahata> I see. So that's the reason why we're seeing intermittent errors with bgpvpn.
15:32:58 <mpeterson> yamahata: very possibly
15:33:48 <mpeterson> #info Because of a neutron design decision tests don't cleanup the DB between runs. As a result ODL gets to a dirty state. We need to cleanup ODL manually between tests.
15:34:27 <mpeterson> #action mpeterson to create a task or bug to cleanup ODL between tests
15:34:51 <mpeterson> yamahata: could also be part of the reason why tempest has intermittent errors too
15:35:11 <yamahata> mpeterson: that sounds very plausible.
15:35:25 <mpeterson> now that we have grafana you can see that there is around 50% failure ratio
15:35:52 * yamahata opening grafana page
15:36:00 <mpeterson> and I've found this problem by chance :)
15:36:39 <yamahata> I'm very glad to see grafana back.
15:37:23 <yamahata> great finding.
15:37:25 <mpeterson> just a clarification: no datapoints means there were no failures or there were no executions
15:38:23 <mpeterson> that's the conclusion of this topic, if you want to continue
15:38:34 <mpeterson> yamahata: ^^
15:38:56 <yamahata> anything else to add?
15:39:40 <mpeterson> yamahata: nope, I think it's pretty explanatory, unless someone has questions
15:39:48 <yamahata> okay, next topic
15:39:59 <yamahata> #topic Neutron/Infra/etc updates during their meetings
15:40:07 <mpeterson> my stage again
15:40:09 <yamahata> mpeterson: you're still on stage. :-)
15:40:40 <mpeterson> basically there are updates happening in neutron and infra IRC meetings that we don't have an idea of what's going on...
15:40:57 <mpeterson> ie: the effort to freely receive patches of the neutron-lib rehoming
15:41:16 <mpeterson> ie: incompatible changes to Zuul v3 that will be introduced
15:42:29 <mpeterson> I've read for example in regards of the first one, that there is a ML thread where they recommend a representative of each team to attend their meetings
15:42:37 <mpeterson> we should consider this, right?
15:43:03 <yamahata> Basically neutron stuff is discussed at neutron meeting
15:43:13 <yamahata> http://eavesdrop.openstack.org/#Neutron_Team_Meeting
15:44:07 <yamahata> Also there are additional specific neutron meetings.
15:44:21 <yamahata> e.g. drivers/CI/L3/QoS/upgrades.
15:44:35 <yamahata> I think neutron-lib is discussed at neutron team meeting.
15:45:08 <yamahata> For zuul I suppose it's discussed at the zuul meeting, but I'm not very sure about this.
15:45:22 <yamahata> #link http://eavesdrop.openstack.org/#Zuul_Meeting
15:45:50 <mpeterson> yamahata: okey, but do we have a vested interest to participate and can we?
15:46:01 <yamahata> regarding neutron we should attend it.
15:46:38 <mpeterson> #info there are updates that only happen on IRC that could have an impact on this project. Unless a representative of this project participates we might find ourselves in a bad position.
15:47:13 <yamahata> Personally I sometimes attend neutron meeting. But not very persistently recently.
15:47:54 <mpeterson> okey, personally I can't attend on their timeslot :/ (I don't work all days of the week)
15:48:32 <yamahata> #link https://wiki.openstack.org/wiki/Network/Meetings neutron meeting agenda
15:48:40 <yamahata> We can see neutron-lib updates
15:48:46 <mpeterson> #action to decide if we should and who should participate in the different meetings
15:48:59 <mpeterson> okey, that's it for the topic
15:49:09 <yamahata> Timeslot is rotated biweekly.
15:49:44 <mpeterson> yes, I can't in either
15:49:50 <yamahata> I see.
15:50:00 <rajivk_> I need to check time, maybe i can
15:50:36 <mpeterson> the action has been created, we can follow up next week on this perhaps and continue since we only have 10'?
15:50:46 <yamahata> At worst we can check meeting minutes/logs.
15:51:12 <yamahata> Sure.
15:51:21 <mpeterson> great
15:51:25 <mpeterson> what's the next topic?
15:51:37 <yamahata> Okay, now we can move on to the usual patches/bugs
15:51:43 <yamahata> #topic patches/bugs
15:52:15 <rajivk_> I would like to discuss https://review.openstack.org/#/c/516857/
15:53:08 <rajivk_> how should we proceed on this one?
15:53:42 <rajivk_> Should we target the scenarios, which lead us to this situation?
15:54:16 <rajivk_> yamahata, what do you mean by covering systematically?
15:54:58 <yamahata> My guess is that it's due to bug, maybe in journaling.
15:55:05 <yamahata> So it's bug workaround.
15:55:25 <yamahata> If we get HTTP error from ODL, there are several possibilities.
15:55:36 <mpeterson> rajivk_: I'm reluctant to accept this patch, as currently there is a bug which we haven't found which is causing those 404 in several situations. This is a workaround that would complicate things.
15:55:50 <yamahata> i.e. operation=create/update/delete, and http error=404, etc.
15:56:17 <yamahata> If it makes sense, we should address reasonable combination.
15:56:20 <mpeterson> yamahata, rajivk_: currently at redhat we are running a scale test and the 404s and Read Timeouts to REST are all over the place
15:56:33 <yamahata> But it's arguable if we should address it or not.
15:56:47 <mpeterson> yamahata, rajivk_: so it seems there is an underlying cause
15:57:04 <rajivk_> yes, it is temporary fix but it will cover original issue.
15:57:05 <yamahata> Do you observe other error case?
15:57:26 <mpeterson> yamahata, rajivk_: we haven't identified the root cause yet
15:57:36 <rajivk_> yamahata, no, i saw this one only. mpeterson, what about you?
15:58:07 <rajivk_> mpeterson, yamahata, i saw them in the logs of gate jobs
15:58:18 <mpeterson> just to give a magnitude about this... we have seen cases where we get more than 3000 Read Timeouts per minute
15:58:19 <rajivk_> i did not encounter them on my env
15:58:49 <rajivk_> read timeout from ODL?
15:58:53 <mpeterson> yes
15:58:58 <mpeterson> from the REST interface
15:59:22 <rajivk_> does odl get stuck?
15:59:26 <yamahata> with 10 sec timeout?
15:59:53 <mpeterson> yes
16:00:07 <yamahata> With Openstack CI, I increased the timeout from default 10 sec to 60secs in the past.
16:00:18 <mpeterson> yamahata: but that's not the solution
16:00:34 <mpeterson> yamahata: there is an underlying cause that needs to be found
16:00:38 <yamahata> mpeterson: right.
16:01:07 <mpeterson> yamahata: we are working our way through the logs, if we find anything you'll be updated on it
16:01:17 <yamahata> Even with single ODL deployment, ODL MD-SAL transactions sometimes abort and are retried internally.
16:01:38 <rajivk_> hey, i noticed some failure on odl side
16:01:44 <mpeterson> yamahata: correct
16:01:59 <rajivk_> it was something like optimistic locking. and it is continuous
16:02:10 <mpeterson> yamahata: in these tests it's even more complex because there are 3 controllers and it includes HA
16:02:26 <yamahata> You can see it in ODL log. something like Got OptimisticLockFailedException
16:02:34 <mpeterson> rajivk_: yes, that's what mkolesni found today and he'll discuss with Josh
16:02:38 <yamahata> Yeah. with ODL HA, thing is more complex.
16:02:57 <yamahata> With that I don't know what timeout is appropriate.
16:02:59 <mpeterson> rajivk_: good to see you found the same, it could be a solid clue
16:03:39 <yamahata> the default value 10 sec is just randomly picked. It's not based on measurement.
16:03:58 <mpeterson> yamahata: anyways 10 sec is a huge timeout frame... it should be way way smaller
16:04:04 <rajivk_> yamahata, this is the timeout our rest client waits for ODL to respond?
16:04:25 <mpeterson> rajivk_: correct
16:04:33 <rajivk_> 10s is too much
16:04:38 <yamahata> rajivk_: yes.
16:05:00 <rajivk_> I heard ODL operations are async
16:05:23 <rajivk_> there is something wrong with odl. mpeterson, are you seeing these logs with all releases?
16:05:31 <rajivk_> or carbon or neutron specific?
16:05:37 <rajivk_> netron -> newton
16:06:17 <mpeterson> rajivk_: this is in carbon IIRC
16:07:32 <rajivk_> i saw those error logs in latest, AFAIK
16:07:51 <yamahata> In my past experience, 10sec timeout causes error and with 60sec, tests became much stabler.
16:08:10 <yamahata> Maybe we can experiment by decreasing timeout to 10 sec again.
16:09:27 <yamahata> Now we're over 9mins.
16:09:34 <rajivk_> I want to know, what happens if we disable router, does connectivity stay or is it lost among different subnets?
16:09:38 <yamahata> Do we have any other urgent patches/bugs?
16:09:58 <rajivk_> yamahata, mpeterson ^^^
16:10:18 <mpeterson> yamahata: https://review.openstack.org/#/c/519384/1
16:10:56 <yamahata> Wow. why didn't we notice it...
16:11:23 <mpeterson> yamahata: because there were no UT
16:12:09 <yamahata> any other patches?
16:12:29 <mpeterson> yamahata: not for now
16:12:34 <yamahata> I think recovery patch is near for merge.
16:13:06 <yamahata> https://review.openstack.org/#/c/500366/
16:13:08 <mpeterson> yamahata: yes, but I've posted a big comment section last time that rajivk_ hasn't addressed yet :)
16:13:29 <yamahata> Okay.
16:13:30 <rajivk_> mpeterson, now i want to introduce changes by small patches
16:13:34 <mpeterson> yamahata, rajivk_: nothing too big though
16:13:50 <rajivk_> mpeterson, ok then
16:13:56 <rajivk_> it is db one right?
16:14:14 <rajivk_> i will finish it tomorrow
16:14:53 <mpeterson> yamahata: another thing before you close the meeting... mkolesni just called me and says we can leave the meeting at the time it was today
16:15:25 <yamahata> you mean 15:00UTC?
16:15:48 <mpeterson> yamahata: if we started 15 UTC today, then yes
16:15:51 <yamahata> how about rajivk_, mpeterson ? 15:00UTC works for you two?
16:16:02 <mpeterson> mpeterson: it's preferable for me
16:16:09 <yamahata> Right today we've started at 15:00UTC.
16:16:10 <rajivk_> i am ok
16:16:18 <yamahata> Okay, then let's continue 15:00UTC
16:16:25 <mpeterson> #agree we will continue this meetings 15:00 UTC
16:16:33 <yamahata> #action yamahata update timeslot to 15:00UTC on wiki.
16:16:48 <yamahata> anything else?
16:16:54 <yamahata> #topic open mike
16:17:13 <yamahata> okay, thank you everyone.
16:17:20 <mpeterson> thanks!
16:17:22 <mpeterson> see you next week
16:17:24 <mpeterson> take care
16:17:40 <yamahata> #topic cookies
16:17:44 <yamahata> #endmeeting