15:08:52 <yamahata> #startmeeting neutron_northbound
15:08:52 <odl_meetbot> Meeting started Mon Nov 13 15:08:52 2017 UTC.  The chair is yamahata. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:08:52 <odl_meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:08:52 <odl_meetbot> The meeting name has been set to 'neutron_northbound'
15:08:59 <yamahata> #chair mkolesni rajivk_  mkolesni
15:08:59 <odl_meetbot> Current chairs: mkolesni rajivk_ yamahata
15:09:03 <yamahata> #chair mpeterson
15:09:03 <odl_meetbot> Current chairs: mkolesni mpeterson rajivk_ yamahata
15:09:10 <yamahata> #topic agenda bashing and roll call
15:09:14 <mkolesni> #info mkolesni
15:09:16 <yamahata> #info yamahata
15:09:17 <rajivk_> #info rajivk
15:09:30 <mpeterson> #info mpeterson
15:09:33 <yamahata> today any additional topics?
15:09:44 <mkolesni> there's a ton of stuff on the stable/pike branch waiting for review
15:09:50 <mkolesni> most of it +1 by us
15:09:55 <mpeterson> yes, I have two topics
15:10:08 <mkolesni> how would we approach this wrt stable main team?
15:10:11 <mpeterson> #info ODL cleanup between tests
15:10:33 <mpeterson> #info Neutron/Infra/etc updates during their meetings
15:11:04 <yamahata> this week we have three extra topics
15:11:14 <yamahata> any other topics?
15:11:42 <mpeterson> no
15:12:07 <mkolesni> none from me
15:12:16 <rajivk_> none from me
15:12:19 <yamahata> okay, move on
15:12:21 <yamahata> #topic Announcements
15:12:36 <yamahata> Last week there was the OpenStack Summit, which included the Forum.
15:12:50 <yamahata> #link https://wiki.openstack.org/wiki/Forum/Sydney2017 openstack forum
15:12:56 <yamahata> #link https://etherpad.openstack.org/p/Neutron-pain-points-Sydney
15:13:18 <yamahata> We'd like to check what was discussed there.
15:13:51 <yamahata> #link https://www.openstack.org/ptg/
15:14:06 <yamahata> The OpenStack PTG is planned for the week of Feb 26, 2018 in Dublin.
15:14:15 <yamahata> For now the details aren't announced yet.
15:14:16 <mkolesni> I'm still waiting on an answer from you guys :)
15:14:54 <mpeterson> #info openstack PTG is planned on the week of Feb 26, 2018 in Dublin. No more details at the moment.
15:14:59 <yamahata> Without concrete plan, it's difficult for me to get travel budget.
15:15:04 <mkolesni> if youre going to attend or not
15:15:19 <mkolesni> tell them rest of n-odl team will be there :)
15:15:22 <yamahata> I need to wait for more details to be announced.
15:16:16 <yamahata> any other announcement?
15:17:07 <yamahata> seems nothing. let's move on
15:17:11 <yamahata> #topic action items from last meeting
15:17:27 <yamahata> There is only timeslot and patch review stuff.
15:17:41 <yamahata> okay, now move on proposed topics
15:17:48 <yamahata> #topic approach for stable branch
15:18:04 <yamahata> mkolesni: please go ahead.
15:18:29 <yamahata> basically they don't monitor backport patches. We have to add them as reviewers explicitly.
15:18:34 <mkolesni> yeah i just said theres a few things we backported there
15:18:36 <mkolesni> ok
15:18:46 <mkolesni> i thought they have some weekly review or something?
15:19:02 <mkolesni> we need to explicitly add the individuals to the patches?
15:19:27 <yamahata> the latter.
15:19:45 <yamahata> We need to add the maintenance team explicitly to the patch review.
15:19:45 <mkolesni> well thats unfortunate..
15:20:00 <yamahata> They have many patches to review, they don't actively poll our patches.
15:20:09 <mkolesni> ok will add them
15:20:22 <yamahata> Once they are added, usually they will review in several days.
15:20:34 <mkolesni> i think maybe we can raise this on neutron level
15:20:40 <mpeterson> yamahata: mmm and who would they be? I usually see they are added however they just linger without +2 +W the patches
15:20:48 <mkolesni> like why dont we moderate stable ourselves?
15:21:19 <yamahata> For now that's the neutron team consensus.
15:21:34 <yamahata> So far do you have any issues?
15:21:44 <yamahata> except patch review without adding reviewers?
15:22:11 <yamahata> #link https://review.openstack.org/#/admin/groups/539,members neutron stable maint team
15:22:36 <mkolesni> I'm not sure if it's even relevant to have them moderating it
15:22:38 <mpeterson> #info Regarding Stable patches in order for them to be reviewed the neutron stable maint team has to be manually added to the patch. Link above.
15:23:23 <mkolesni> maybe we can have a "stable liaison" that will moderate so that they do fewer checks on each patch
15:23:59 <mkolesni> also probably some of the checks can be done by CI, i.e. don't backport db migration scripts, that stuff
15:25:19 <yamahata> If so, we need to raise it to change the neutron stadium governance.
15:26:11 <yamahata> So far, do we have an issue big enough to change the neutron governance?
15:26:20 <mkolesni> no
15:27:31 <yamahata> If we still have issues getting backport patches reviewed after adding them as reviewers, let's discuss it again.
15:28:00 <yamahata> Is it okay for you? mkolesni
15:28:27 <mkolesni> ok
15:28:46 <yamahata> thanks. then next topic
15:28:50 <yamahata> #topic ODL cleanup between tests
15:28:56 <yamahata> mpeterson: you're on stage
15:29:00 <mpeterson> cool
15:29:04 <mpeterson> so the thing is the following
15:29:09 <mkolesni> sorry have to go
15:29:12 <mkolesni> see you guys next week
15:29:25 <mpeterson> neutron has a guideline of not cleaning up after tests and instead it deletes the db and creates it again, right?
15:29:26 <yamahata> mkolesni: see you next week. It's 17:00UTC.
15:29:58 <mpeterson> well, because of that for example bgpvpn doesn't cleanup on failures
15:30:20 <yamahata> You mean test=unit tests.
15:30:22 <mpeterson> which means, since we are interacting with ODL, that ODL ends in a dirty status
15:30:31 <mpeterson> unit and functional are affected
15:30:46 <mpeterson> #info https://bugs.launchpad.net/bgpvpn/+bug/1723725 reference on the issue
15:30:54 <mpeterson> #link https://bugs.launchpad.net/bgpvpn/+bug/1723725 reference on the issue
15:31:06 <yamahata> Usually yes because it causes non-determinism depending on order of running test cases
15:31:49 <mpeterson> so basically we need to trigger a cleanup of ODL between tests
15:32:31 <mpeterson> in a similar way as neutron does it by dropping the DB
15:32:40 <yamahata> I see. So that's the reason why we're seeing intermittent errors with bgpvpn.
15:32:58 <mpeterson> yamahata: very possibly
15:33:48 <mpeterson> #info Because of a neutron design decision tests don't cleanup the DB between runs. As a result ODL gets to a dirty state. We need to cleanup ODL manually between tests.
15:34:27 <mpeterson> #action mpeterson to create a task or bug to cleanup ODL between tests
15:34:51 <mpeterson> yamahata: could also be part of the reason why tempest has intermittent errors too
15:35:11 <yamahata> mpeterson: that sounds very plausible.
15:35:25 <mpeterson> now that we have grafana you can see that there is around 50% failure ratio
15:35:52 * yamahata opening grafana page
15:36:00 <mpeterson> and I've found this problem by chance :)
15:36:39 <yamahata> I'm very glad to see grafana back.
15:37:23 <yamahata> great finding.
15:37:25 <mpeterson> just a clarification: no datapoints means there were no failures or there were no executions
15:38:23 <mpeterson> that's the conclusion of this topic, if you want to continue
15:38:34 <mpeterson> yamahata: ^^
15:38:56 <yamahata> anything else to add?
15:39:40 <mpeterson> yamahata: nope, I think it's pretty explanatory, unless someone has questions
15:39:48 <yamahata> okay, next topic
15:39:59 <yamahata> #topic Neutron/Infra/etc updates during their meetings
15:40:07 <mpeterson> my stage again
15:40:09 <yamahata> mpeterson: you're still on stage. :-)
15:40:40 <mpeterson> basically there are updates happening in neutron and infra IRC meetings that we don't have an idea of what's going on...
15:40:57 <mpeterson> ie: the effort to freely receive patches of the neutron-lib rehoming
15:41:16 <mpeterson> ie: incompatible changes to Zuul v3 that will be introduced
15:42:29 <mpeterson> I've read, for example, regarding the first one, that there is a ML thread where they recommend that a representative of each team attend their meetings
15:42:37 <mpeterson> we should consider this, right?
15:43:03 <yamahata> Basically neutron stuff is discussed at neutron meeting
15:43:13 <yamahata> http://eavesdrop.openstack.org/#Neutron_Team_Meeting
15:44:07 <yamahata> Also there are additional specific neutron meetings.
15:44:21 <yamahata> e.g. drivers/CI/L3/QoS/upgrades.
15:44:35 <yamahata> I think neutron-lib is discussed at neutron team meeting.
15:45:08 <yamahata> For zuul I suppose it's discussed at the zuul meeting, but I'm not very sure about this.
15:45:22 <yamahata> #link http://eavesdrop.openstack.org/#Zuul_Meeting
15:45:50 <mpeterson> yamahata: okey, but do we have a vested interest to participate and can we?
15:46:01 <yamahata> Regarding neutron, we should attend it.
15:46:38 <mpeterson> #info there are updates that only happen on IRC that could have an impact on this project. Unless a representative of this project participates, we might find ourselves in a bad position.
15:47:13 <yamahata> Personally I sometimes attend neutron meeting. But not very persistently recently.
15:47:54 <mpeterson> okey, personally I can't attend on their timeslot :/ (I don't work all days of the week)
15:48:32 <yamahata> #link https://wiki.openstack.org/wiki/Network/Meetings neutron meeting agenda
15:48:40 <yamahata> We can see neutron-lib updates
15:48:46 <mpeterson> #action to decide if we should and who should participate in the different meetings
15:48:59 <mpeterson> okey, that's it for the topic
15:49:09 <yamahata> Timeslot is rotated biweekly.
15:49:44 <mpeterson> yes, I can't in either
15:49:50 <yamahata> I see.
15:50:00 <rajivk_> I need to check time, may be i can
15:50:36 <mpeterson> the action has been created, we can follow up next week on this perhaps and continue since we only have 10'?
15:50:46 <yamahata> At worst we can check meeting minutes/logs.
15:51:12 <yamahata> Sure.
15:51:21 <mpeterson> great
15:51:25 <mpeterson> what's the next topic?
15:51:37 <yamahata> Okay, now we can move on to the usual patches/bugs
15:51:43 <yamahata> #topic patches/bugs
15:52:15 <rajivk_> I would like to discuss https://review.openstack.org/#/c/516857/
15:53:08 <rajivk_> how should we proceed on this one?
15:53:42 <rajivk_> Should we target the scenarios, which lead us to this situation?
15:54:16 <rajivk_> yamahata, what do you mean by covering systematically?
15:54:58 <yamahata> My guess is that it's due to a bug, maybe in journaling.
15:55:05 <yamahata> So it's a bug workaround.
15:55:25 <yamahata> If we get an HTTP error from ODL, there are several possibilities.
15:55:36 <mpeterson> rajivk_: I'm reluctant to accept this patch, as currently there is a bug which we haven't found which is causing those 404 in several situations. This is a workaround that would complicate things.
15:55:50 <yamahata> i.e. operation=create/update/delete, and http error=404, etc.
15:56:17 <yamahata> If it makes sense, we should address the reasonable combinations.
15:56:20 <mpeterson> yamahata, rajivk_: currently at redhat we are running a scale test and the 404s and Read Timeouts to REST are all over the place
15:56:33 <yamahata> But it's arguable if we should address it or not.
15:56:47 <mpeterson> yamahata, rajivk_: so it seems there is an underlying cause
15:57:04 <rajivk_> yes, it is temporary fix but it will cover original issue.
15:57:05 <yamahata> Do you observe other error cases?
15:57:26 <mpeterson> yamahata, rajivk_: we haven't identified the root cause yet
15:57:36 <rajivk_> yamahata, no, i saw this one only. mpeterson, what about you?
15:58:07 <rajivk_> mpeterson, yamahata, i saw them in the logs of gate jobs
15:58:18 <mpeterson> just to give a magnitude about this... we have seen cases where we get more than 3000 Read Timeouts per minute
15:58:19 <rajivk_> i did not encounter them on my env
15:58:49 <rajivk_> read timeout from ODL?
15:58:53 <mpeterson> yes
15:58:58 <mpeterson> from the REST interface
15:59:22 <rajivk_> does ODL get stuck?
15:59:26 <yamahata> with 10 sec timeout?
15:59:53 <mpeterson> yes
16:00:07 <yamahata> With Openstack CI, I increased the timeout from default 10 sec to 60secs in the past.
16:00:18 <mpeterson> yamahata: but that's not the solution
16:00:34 <mpeterson> yamahata: there is an underlying cause that needs to be found
16:00:38 <yamahata> mpeterson: right.
16:01:07 <mpeterson> yamahata: we are working our way through the logs, if we find anything you'll be updated on it
16:01:17 <yamahata> Even with a single ODL deployment, ODL MD-SAL transactions sometimes abort and are retried internally.
16:01:38 <rajivk_> hey, i noticed some failures on the odl side
16:01:44 <mpeterson> yamahata: correct
16:01:59 <rajivk_> it was something like optimistic locking. and it is continuous
16:02:10 <mpeterson> yamahata: in these tests it's even more complex because there are 3 controllers and it includes HA
16:02:26 <yamahata> You can see it in ODL log. something like Got OptimisticLockFailedException
16:02:34 <mpeterson> rajivk_: yes, that's what mkolesni found today and he'll discuss with Josh
16:02:38 <yamahata> Yeah. with ODL HA, things are more complex.
16:02:57 <yamahata> With that I don't know what timeout is appropriate.
16:02:59 <mpeterson> rajivk_: good to see you found the same, it could be a solid clue
16:03:39 <yamahata> the default value 10 sec is just randomly picked. It's not based on measurement.
16:03:58 <mpeterson> yamahata: anyways 10 sec is a huge timeout frame... it should be way way smaller
16:04:04 <rajivk_> yamahata, this is the time our REST client waits for ODL to respond?
16:04:25 <mpeterson> rajivk_: correct
16:04:33 <rajivk_> 10s is too much
16:04:38 <yamahata> rajivk_: yes.
16:05:00 <rajivk_> I heard ODL operations are async
16:05:23 <rajivk_> there is something wrong with odl. mpeterson, are you seeing these logs with all releases?
16:05:31 <rajivk_> or carbon or neutron specific?
16:05:37 <rajivk_> netron -> newton
16:06:17 <mpeterson> rajivk_: this is in carbon IIRC
16:07:32 <rajivk_> i saw those error logs in latest, AFAIK
16:07:51 <yamahata> In my past experience, the 10 sec timeout caused errors, and with 60 sec, tests became much more stable.
16:08:10 <yamahata> Maybe we can experiment by decreasing timeout to 10 sec again.
16:09:27 <yamahata> Now we're over 9mins.
16:09:34 <rajivk_> I want to know, what happens if we disable the router: does connectivity stay or is it lost between different subnets?
16:09:38 <yamahata> Do we have any other urgent patches/bugs?
16:09:58 <rajivk_> yamahata, mpeterson ^^^
16:10:18 <mpeterson> yamahata: https://review.openstack.org/#/c/519384/1
16:10:56 <yamahata> Wow. why didn't we notice it...
16:11:23 <mpeterson> yamahata: because there were no UT
16:12:09 <yamahata> any other patches?
16:12:29 <mpeterson> yamahata: not for now
16:12:34 <yamahata> I think the recovery patch is close to merge.
16:13:06 <yamahata> https://review.openstack.org/#/c/500366/
16:13:08 <mpeterson> yamahata: yes, but I've posted a big comment section last time that rajivk_ hasn't addressed yet :)
16:13:29 <yamahata> Okay.
16:13:30 <rajivk_> mpeterson, now i want to introduce changes by small patches
16:13:34 <mpeterson> yamahata, rajivk_: nothing too big though
16:13:50 <rajivk_> mpeterson, ok then
16:13:56 <rajivk_> it is db one right?
16:14:14 <rajivk_> i will finish it tomorrow
16:14:53 <mpeterson> yamahata: another thing before you close the meeting... mkolesni just called me and says we can leave the meeting at the time it was today
16:15:25 <yamahata> you mean 15:00UTC?
16:15:48 <mpeterson> yamahata: if we started 15 UTC today, then yes
16:15:51 <yamahata> how about rajivk_, mpeterson ? 15:00UTC works for you two?
16:16:02 <mpeterson> mpeterson: it's preferable for me
16:16:09 <yamahata> Right today we've started at 15:00UTC.
16:16:10 <rajivk_> i am ok
16:16:18 <yamahata> Okay, then let's continue at 15:00UTC
16:16:25 <mpeterson> #agree we will continue these meetings at 15:00 UTC
16:16:33 <yamahata> #action yamahata update timeslot to 15:00UTC on wiki.
16:16:48 <yamahata> anything else?
16:16:54 <yamahata> #topic open mike
16:17:13 <yamahata> okay, thank you everyone.
16:17:20 <mpeterson> thanks!
16:17:22 <mpeterson> see you next week
16:17:24 <mpeterson> take care
16:17:40 <yamahata> #topic cookies
16:17:44 <yamahata> #endmeeting