16:00:13 <frankbrockners> #startmeeting FDS synch
16:00:13 <collabot`> Meeting started Thu Nov  3 16:00:13 2016 UTC.  The chair is frankbrockners. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:13 <collabot`> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:13 <collabot`> The meeting name has been set to 'fds_synch'
16:00:18 <frankbrockners> #info Frank Brockners
16:00:25 <jlinkes> #info Juraj Linkes
16:00:31 <michal-cmarada> #info Michal Cmarada
16:00:42 <frankbrockners> hmm - looks like the meetbot has an issue...
16:01:10 <michal-cmarada> it didn't start the meeting
16:01:26 <frankbrockners> ah now it did
16:01:33 <frankbrockners> looks like meetbot is a bit slow today
16:01:41 <mmarsale> #info Maros Marsalek
16:01:41 <tomas_c> #info Tomas Cechvala
16:01:43 <michal-cmarada> #info Michal Cmarada
16:01:58 <mmarsale> #info Maros Marsalek
16:02:16 <tomas_c> #info Tomas Cechvala
16:03:23 <frankbrockners> agenda for today - (a) VXLAN tunnels don't pass traffic for periods of time (b) deploy status of HA scenario on Cisco FDS POD (c) deploy status of HA scenario on CENGN supermicro lab (d) hostconfig implementation
16:03:33 <frankbrockners> #info agenda for today - (a) VXLAN tunnels don't pass traffic for periods of time (b) deploy status of HA scenario on Cisco FDS POD (c) deploy status of HA scenario on CENGN supermicro lab (d) hostconfig implementation
16:04:29 <frankbrockners> let's get rolling
16:05:04 <frankbrockners> jlinkes - any updates on (a) FDS-142? It might be that the issues seanatcisco sees are similar to that.
16:06:27 <jlinkes> when debugging (a) I found that a similar qemu issue happens when I delete a security group - could be that multiple recreations of the port cause this. The qemu logs are a little bit different - https://paste.fedoraproject.org/469680/18538514/
16:07:27 <frankbrockners> jlinkes - do we have a Jira ticket in FD.io for this already?
16:07:56 <jlinkes> frankbrockners: not yet
16:08:59 <frankbrockners> jlinkes: Could you create one? Once we have a ticket there, it would be good to send an email to Emran and Damjan asking for guidance on how to get this resolved. Please cc Maciek as well
16:09:17 <frankbrockners> seanatcisco - are you around?
16:09:24 <seanatcisco> frankbrockners: here
16:09:32 <seanatcisco> #info Sean Chandler
16:10:50 <frankbrockners> seanatcisco - could the issue you see with regards to packets not being forwarded for some amount of time be related to FDS-142 and what jlinkes stated above?
16:11:02 <seanatcisco> frankbrockners: checking now
16:11:12 <jlinkes> frankbrockners: https://jira.opnfv.org/browse/FDS-144
16:11:43 <frankbrockners> thanks jlinkes
16:12:32 <seanatcisco> frankbrockners:  in my case i have to restart vpp and the agent in order to get it to work
16:12:34 <frankbrockners> this sounds similar to a speculation of seanatcisco (that sean sent to me over email): "My guess is that the number of deletions/additions are causing this problem but … that is just a guess.  We aren’t going to be able to make progress unless this sort of issue is resolved."
16:13:16 <frankbrockners> seanatcisco - should we groom the behavior into the above VPP ticket?
16:13:25 <seanatcisco> frankbrockners: well yes that is still true. there is a patch to master on port reuse/cleanup that I'm going to test today in a 16.09 build that I'm making myself
16:13:50 <frankbrockners> seanatcisco - is this also in master?
16:13:59 <frankbrockners> because jlinkes uses master and not 16.09
16:14:04 <seanatcisco> frankbrockners:  the patch is in master but i was seeing it in FDS in 16.09
16:14:31 <seanatcisco> frankbrockners: similar symptoms have been seen in both, but it's really just speculation that relates the two at this point
16:14:40 <seanatcisco> frankbrockners: will test to be certain
16:14:51 <frankbrockners> ok - just wondering how much more we wait/try ourselves before roping in Emran
16:15:13 <jlinkes> all of this is very speculative, agreed
16:15:19 <frankbrockners> seanatcisco: could you create a jira ticket for the issue - like jlinkes did?
16:15:28 <seanatcisco> frankbrockners: im having Shriram build the RPMs now to test in FDS (16.09)
16:15:36 <seanatcisco> frankbrockners: will do
16:15:45 <frankbrockners> I'll bring the two issues to Emran's attention post this meeting.
16:15:53 <seanatcisco> frankbrockners: should we be testing with master still or move up?
16:16:15 <frankbrockners> seanatcisco - could you do this right now? It would be ok to start with what you put into the email to me
16:16:26 <frankbrockners> seanatcisco - I'd test with master
16:16:35 <seanatcisco> frankbrockners: sure thing
16:16:39 <frankbrockners> we use master for now, given that all the security stuff is only in master
16:16:46 <frankbrockners> thanks seanatcisco
16:16:58 <frankbrockners> if you could test with master that would be great
16:17:08 <frankbrockners> let's move to the HA scenarios
16:17:38 <jlinkes> frankbrockners: does that mean that I don't have to write the e-mail to Emran?
16:17:42 <frankbrockners> jlinkes - any updates on how functest performed on the Cisco FDS POD or the CENGN supermicro POD?
16:18:20 <frankbrockners> jlinkes - I'll take care of the email.
16:18:51 <jlinkes> supermicro pod - I applied the galera config along with the secgroup workaround, healthcheck passed and vping_ssh didn't
16:19:33 <jlinkes> fds pod - it doesn't seem like the galera config does anything there, the services are flapping regardless
16:19:48 <jlinkes> fds pod - when I ran functest, I got Service Unavailable (HTTP 503)
16:19:50 <frankbrockners> jlinkes - did you run any other tests apart from vping (i.e. tempest etc)
16:20:03 <jlinkes> I didn't
16:20:29 <frankbrockners> jlinkes - do we see the flapping services on the supermicro lab as well?
16:20:30 <jlinkes> the values that the galera configuration is based on fluctuate
16:20:37 <jlinkes> no, supermicro was fine
16:20:52 <jlinkes> so I tried a few different values
16:20:58 <frankbrockners> jlinkes do you know why vping-ssh failed
16:20:59 <frankbrockners> ?
16:21:00 <jlinkes> but that didn't solve the problem
16:21:33 <jlinkes> couldn't ssh into the vm, but I don't know the reason behind that
16:22:05 <frankbrockners> trozet - are you around?
16:22:36 <frankbrockners> trozet - in case you are and have a few spare cycles - could you check the CENGN deployment?
16:22:57 <jlinkes> I redeployed fds pod in case the deployment was somehow corrupt - I did run functest on it before configuring galera, so that might've caused some problems
16:23:00 <frankbrockners> looks like Tim is not here
16:23:18 <jlinkes> but that's where I ended
16:24:11 <frankbrockners> jlinkes - do we have a jira ticket for the two issues you described? IMHO it would be good if we had one. I could include this into my status email to Emran.
16:25:09 <frankbrockners> let's move to the last agenda topic - hostconfig implementation
16:25:22 <frankbrockners> tomas_c - quick update?
16:25:47 <tomas_c> frankbrockners: sure, I made progress and was finally able to bind a nova instance to a generic vhostuser socket
16:26:49 <frankbrockners> tomas_c - great news - congrats!
16:27:08 <tomas_c> since my env was not a full stack
16:27:09 <tomas_c> but made of partial components
16:27:09 <tomas_c> we probably need to test it with full deploy
16:27:09 <tomas_c> frankbrockners: thank you!
16:27:10 <tomas_c> if it works as it should. I was talking with Juraj and Michal about my observations
16:27:18 <tomas_c> they expect it to work, so we will see
16:27:31 <frankbrockners> tomas_c - are there any open issues or are we ready to switch to proper hostconfig and deprecate the workaround that Woj did a while back?
16:27:58 <tomas_c> frankbrockners: no open issues that i'm aware of
16:28:01 <tomas_c> just
16:28:03 <frankbrockners> tomas_c - thanks. you were quicker than I :-)
16:28:16 <tomas_c> implementation in gbp, but that's a small bit
16:28:25 <jlinkes> frankbrockners: I just checked and there is a difference between cengn and fds pod - fds pod has https://jira.opnfv.org/browse/APEX-337 and cengn doesn't
16:29:09 <frankbrockners> jlinkes - could you try to redeploy without APEX-337?
16:29:12 <jlinkes> frankbrockners: the very thing that should've fixed things could very well have made things worse
16:29:28 <jlinkes> frankbrockners: sure
16:29:56 <frankbrockners> thanks jlinkes
16:30:53 <frankbrockners> looks like we're done
16:32:07 <frankbrockners> I'll just wait for seanatcisco to create a jira ticket for the issue and then send an update to Emran. Will also include the status of HA deploy.
16:32:23 <seanatcisco> frankbrockners: https://jira.opnfv.org/browse/FDS-145
16:32:41 <frankbrockners> jlinkes - if you could try to ping trozet on the HA deploy issues that would be nice - unfortunately trozet does not seem to be around
16:32:49 <frankbrockners> thanks seanatcisco!
16:33:02 <frankbrockners> thanks everyone!
16:33:09 <seanatcisco> frankbrockners: it's under the name alaskanson... don't judge me :)
16:33:09 <frankbrockners> #endmeeting