16:00:13 #startmeeting FDS synch
16:00:13 Meeting started Thu Nov 3 16:00:13 2016 UTC. The chair is frankbrockners. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:13 The meeting name has been set to 'fds_synch'
16:00:18 #info Frank Brockners
16:00:25 #info Juraj Linkes
16:00:31 #info Michal Cmarada
16:00:42 hmm - looks like the meetbot has an issue...
16:01:10 it didnt start the meeting
16:01:26 ah now it did
16:01:33 looks like meetbot is a bit slow today
16:01:41 #info Maros Marsalek
16:01:41 #info Tomas Cechvala
16:01:43 #info Michal Cmarada
16:01:58 #info Maros Marsalek
16:02:16 #info Tomas Cechvala
16:03:23 agenda for today - (a) VXLAN tunnels don't pass traffic for periods of time (b) deploy status of HA scenario on Cisco FDS POD (c) deploy status of HA scenario on CENGN supermicro lab (d) hostconfig implementation
16:03:33 #info agenda for today - (a) VXLAN tunnels don't pass traffic for periods of time (b) deploy status of HA scenario on Cisco FDS POD (c) deploy status of HA scenario on CENGN supermicro lab (d) hostconfig implementation
16:04:29 let's get rolling
16:05:04 jlinkes - any updates on (a) FDS-142? It might be that the issues seanatcisco sees are similar to that.
16:06:27 when debugging (a) I found that a similar qemu issue happens when I delete a security group - could be that multiple recreations of the port cause this. The qemu logs are a little bit different - https://paste.fedoraproject.org/469680/18538514/
16:07:27 jlinkes - do we have a Jira ticket in FD.io for this already?
16:07:56 frankbrockners: not yet
16:08:59 jlinkes: Could you create one? Once we have a ticket there - it would be good to send email to Emran and Damjan asking for guidance of how to get this resolved. Please cc Maciek as well
16:09:17 seanatcisco - are you around?
16:09:24 frankbrockners: here
16:09:32 #info Sean Chandler
16:10:50 seanatcisco - could the issue you see with regards to packets not being forwarded for some amount of time be related to FDS-142 and what jlinkes stated above?
16:11:02 frankbrockners: checking now
16:11:12 frankbrockners: https://jira.opnfv.org/browse/FDS-144
16:11:43 thanks jlinkes
16:12:32 frankbrockners: in my case i have to restart vpp and the agent in order to get it to work
16:12:34 this sounds similar to a speculation of seanatcisco (that sean sent to me over email): "My guess is that the number of deletions/additions are causing this problem but … that is just a guess. We aren’t going to be able to make progress unless this sort of issue is resolved."
16:13:16 seanatcisco - should we groom the behavior into the above VPP ticket?
16:13:25 frankbrockners: well yes that is still true. there is a patch to master on port reuse/cleanup that im going to test today in 16.09 that im building myself
16:13:50 seanatcisco - is this also in master?
16:13:59 because jlinkes uses master and not 16.09
16:14:04 frankbrockners: the patch is in master but i was seeing it in FDS in 16.09
16:14:31 frankbrockners: similar symptoms have been seen in both but its really just speculation that is relating the two at this point
16:14:40 frankbrockners: will test to be certain
16:14:51 ok - just wondering how much more we wait/try ourselves before roping in Emran
16:15:13 all of this is very speculative, agreed
16:15:19 seanatcisco: could you create a jira ticket for the issue - like jlinkes did?
16:15:28 frankbrockners: im having Shriram build the RPMs now to test in FDS (16.09)
16:15:36 frankbrockners: will do
16:15:45 I'll bring the two issues to Emran's attention post this meeting.
16:15:53 frankbrockners: should we be testing with master still or move up?
16:16:15 seanatcisco - could you do this right now? It would be ok to start with what you put into the email to me
16:16:26 seanatcisco - I'd test with master
16:16:35 frankbrockners: sure thing
16:16:39 we use master for now, given that all the security stuff is only in master
16:16:46 thanks seanatcisco
16:16:58 if you could test with master that would be great
16:17:08 let's move to the HA scenarios
16:17:38 frankbrockners: does that mean that I don't have to write the e-mail to Emran?
16:17:42 jlinkes - any updates on how functest performed on the Cisco FDS POD or the CENGN supermicro POD?
16:18:20 jlinkes - I'll take care of the email.
16:18:51 supermicro pod - I applied the galera config along with the secgroup workaround, healthcheck passed and vping_ssh didn't
16:19:33 fds pod - it doesn't seem like the galera config does anything there, the services are flapping regardless
16:19:48 fds pod - when I ran functest, I got Service Unavailable (HTTP 503)
16:19:50 jlinkes - did you run any other tests apart from vping (i.e. tempest etc)
16:20:03 I didn't
16:20:29 jlinkes - do we see the flapping services on the supermicro lab as well?
16:20:30 the values that the galera configuration is based on fluctuate
16:20:37 no, supermicro was fine
16:20:52 so I tried a few different values
16:20:58 jlinkes do you know why vping-ssh failed
16:20:59 ?
16:21:00 but that didn't solve the problem
16:21:33 couldn't ssh into the vm, but I don't know the reason behind that
16:22:05 trozet - are you around?
16:22:36 trozet - in case you are and have a few spare cycles - could you check the CENGN deployment?
16:22:57 I redeployed fds pod in case the deployment was somehow corrupt - I did run functest on it before configuring galera, so that might've caused some problems
16:23:00 looks like Tim is not here
16:23:18 but that's where I ended
16:24:11 jlinkes - do we have a jira ticket for the two issues you described? IMHO it would be good if we had one. I could include this into my status email to Emran.
16:25:09 let's move to the last agenda topic - hostconfig implementation
16:25:22 tomas_c - quick update?
16:25:47 frankbrockners: sure, i made progress, finally was able to bind nova instance into a generic vhostuser socket
16:26:49 tomas_c - great news - congrats!
16:27:08 since my env was not a full stack
16:27:09 but made of partial components
16:27:09 we probably need to test it with full deploy
16:27:09 frankbrockners: thank you!
16:27:10 if it works as it should. I was talking with Juraj and Michal about my observations
16:27:18 they expect it to work, so we will see
16:27:31 tomas_c - are there any open issues or are we ready to switch to proper hostconfig and deprecate the workaround that Woj did a while back?
16:27:58 frankbrockners: no open issues that i'm aware of
16:28:01 just
16:28:03 tomas_c - thanks. you were quicker than I :-)
16:28:16 implementation in gbp, but that's a small bit
16:28:25 frankbrockners: I just checked and there is a difference between cengn and fds pod - fds pod has https://jira.opnfv.org/browse/APEX-337 and cengn doesn't
16:29:09 jlinkes - could you try to redeploy without APEX-337?
16:29:12 frankbrockners: the very thing that should've fixed things might very well have made things worse
16:29:28 frankbrockners: sure
16:29:56 thanks jlinkes
16:30:53 looks like we're done
16:32:07 I'll just wait for seanatcisco to create a jira ticket for the issue and then send an update to Emran. Will also include the status of HA deploy.
16:32:23 frankbrockners: https://jira.opnfv.org/browse/FDS-145
16:32:41 jlinkes - if you could try to ping trozet on the HA deploy issues that would be nice - unfortunately trozet does not seem to be around
16:32:49 thanks seanatcisco!
16:33:02 thanks everyone!
16:33:09 frankbrockners: its under the name alaskanson... dont judge me :)
16:33:09 #endmeeting