16:00:16 <frankbrockners> #startmeeting FDS synch
16:00:16 <collabot> Meeting started Thu Sep 1 16:00:16 2016 UTC. The chair is frankbrockners. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:16 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:16 <collabot> The meeting name has been set to 'fds_synch'
16:00:20 <jlinkes> #info Juraj Linkes
16:00:41 <frankbrockners> #info Frank Brockners
16:01:09 <frankbrockners> #info draft agenda - https://wiki.opnfv.org/display/meetings/FastDataStacks#FastDataStacks-Thursday,September1,2016
16:01:15 <michal-cmarada|2> #info Michal Cmarada
16:02:02 <frankbrockners> let's get rolling... - let's touch on the open issues first. Michal made good progress.
16:03:14 <frankbrockners> michal-cmarada|2 - quick update on the qr tap to BD?
16:04:14 <michal-cmarada|2> seems that it is working properly. Port is added to the BD and VMs can reach it.
16:04:20 <frankbrockners> #info Michal submitted https://git.opendaylight.org/gerrit/#/c/44935/; https://git.opendaylight.org/gerrit/#/c/45000/ (for boron and carbon) which are expected to resolve the qrouter tap to BD issue
16:04:39 <frankbrockners> #info Michal validated things locally: Port is added to the BD and VMs can reach it.
16:05:18 <frankbrockners> michal-cmarada|2 - can we get the patch merged?
16:05:48 <michal-cmarada|2> patch from boron is failing in jenkins https://git.opendaylight.org/gerrit/#/c/45000/. we did a recheck but many jenkins checks are failing today.
16:06:01 <frankbrockners> :-(
16:06:22 <michal-cmarada|2> Qrouter tap port is also working on UCS-B side 1
16:07:39 <frankbrockners> edwarnicke - are you there?
16:08:01 <frankbrockners> edwarnicke - do you know of issues with verify jobs on Boron right now?
16:08:02 <edwarnicke> frankbrockners: Yes
16:08:14 <edwarnicke> I did not
16:08:15 <edwarnicke> Catching up
16:08:27 <frankbrockners> edwarnicke - https://git.opendaylight.org/gerrit/#/c/44935 verifies fine
16:08:39 <frankbrockners> but the cherry pick to boron fails verify
16:09:02 <frankbrockners> https://git.opendaylight.org/gerrit/#/c/45000/ is critical to us - it fixes the last technical issue that FDS has
16:09:28 <edwarnicke> frankbrockners: I just poked the right folks on #opendaylight-releng
16:09:33 <edwarnicke> Should hear something shortly
16:09:38 <frankbrockners> thanks edwarnicke
16:10:07 <frankbrockners> let's see what else is on the laundry list... (a) functest (b) hugepages
16:10:26 <michal-cmarada|2> Just tested my patch on UCS-B side 1 once again. setup is correct, going to test the pings.
16:10:27 <frankbrockners> trozet mentioned that deploy works - but functest fails
16:11:44 <frankbrockners> #info jlinkes told me earlier in the day that functest fails for 2 reasons: (a) qrouter-to-BD patch not there yet (b) VMs need to be started with the hugepages option
16:12:19 <frankbrockners> #info morgan explained in detail how to amend the functest config yaml to enable the hugepages option
16:12:26 <frankbrockners> jlinkes - anything to add?
16:12:53 <jlinkes> nothing to add
16:12:57 <frankbrockners> thanks jlinkes
16:13:18 <trozet> frankbrockners: good point
16:13:23 <trozet> frankbrockners: about hugepages I mean
16:13:31 <frankbrockners> let's move to hugepages...
16:14:02 <frankbrockners> fact is that we need hugepages configured
16:14:10 <frankbrockners> the real question is where and how...
16:14:56 <frankbrockners> by default this is done in sysctl.d/80-vpp.conf for vpp
16:15:05 <frankbrockners> see also damian's email
16:15:17 <frankbrockners> trozet - you didn't like that approach too much - correct?
16:15:39 <trozet> frankbrockners: no
16:16:20 <trozet> frankbrockners: although I'm willing to go with it
16:16:31 <frankbrockners> trozet: any rationale / background?
16:16:31 <trozet> frankbrockners: we would need to update the puppet module to configure that conf file
16:17:00 <trozet> frankbrockners: I don't think it's a good idea for VPP to override sysctl settings when you install its RPM
16:17:18 <trozet> frankbrockners: i think vpp documentation should say hey, you need to set hugepages, here is how to do it
16:17:32 <trozet> frankbrockners: got to remember this is a linux host, and vpp is one application on it
16:17:38 <frankbrockners> trozet - but from what I understand, without the 80-vpp.conf, hugepages aren't configured correctly
16:18:00 <trozet> frankbrockners: no, we configure hugepages for the host, the 80-vpp.conf is overriding our grub settings
16:18:29 <jlinkes> frankbrockners: damian
16:18:44 <frankbrockners> hmmm... jlinkes - did we see hugepages configured properly without things being set in 80-vpp.conf?
16:18:52 <michal-cmarada|2> i did
16:18:53 <jlinkes> frankbrockners: damian's e-mail didn't contain much in terms of information
16:19:04 <frankbrockners> jlinkes - agreed
16:19:15 <michal-cmarada|2> sysctl -w vm.nr_hugepages=10000
16:19:17 <michal-cmarada|2> sysctl -w vm.max_map_count=25000
16:19:18 <trozet> frankbrockners: in my email you can see messages from the kernel, saying it is set to 2048 (the value apex set it to). After that, vpp.conf changes it
16:19:19 <michal-cmarada|2> sysctl -w vm.hugetlb_shm_group=0
16:19:21 <michal-cmarada|2> sysctl -w kernel.shmmax=20971520000
16:19:22 <michal-cmarada|2> i have set this directly in
16:19:33 <jlinkes> frankbrockners: yes, we just set it using sysctl -w vm.nr_hugepages=10000
16:19:34 <jlinkes> sysctl -w vm.max_map_count=25000
16:19:34 <jlinkes> sysctl -w vm.hugetlb_shm_group=0
16:19:34 <jlinkes> sysctl -w kernel.shmmax=20971520000
16:19:40 <jlinkes> frankbrockners: and that worked fine
16:19:46 <michal-cmarada|2> sorry wrong order
16:19:47 <frankbrockners> but what if we don't put anything into vpp.conf?
16:20:03 <jlinkes> frankbrockners: and we left 80-vpp.conf alone
16:20:14 <trozet> frankbrockners: vpp.conf will only be read when you reboot
16:20:22 <trozet> frankbrockners: so if jlinkes reboots his machine, it will fall back to 1024
16:20:29 <jlinkes> right
16:20:55 <trozet> or you can do sysctl --system
16:20:58 <trozet> and it will reload the values from the conf file
16:21:04 <jlinkes> frankbrockners: that seems like it could work, deleting everything from 80-vpp.conf and leaving it empty
16:21:23 <frankbrockners> jlinkes - let's try that
16:21:37 <frankbrockners> "you should not reboot" isn't too much of an option...
16:21:50 <trozet> frankbrockners: what I'm saying is
16:22:01 <trozet> frankbrockners: the whole 80-vpp.conf should be removed from the VPP RPM install
16:22:14 <trozet> frankbrockners: not deleted post install, or anything
16:22:41 <jlinkes> frankbrockners: noted, will try that tomorrow
16:22:42 <trozet> frankbrockners: it is much better to document the settings that need to be changed, and let the user decide how he wants to do it
16:23:13 <jlinkes> trozet: that's a debate to have with vpp folks
16:23:19 <frankbrockners> trozet - understand - but this would mean that we need an RPM specific for OPNFV
16:23:43 <trozet> frankbrockners: no i think it means modifying the VPP RPM for all users :)
16:24:02 <frankbrockners> vpp folks would like to keep vpp.conf in the rpm to make sure things work.. - edwarnicke might have a view from a VPP perspective
16:24:03 <jlinkes> the problem with damian's e-mail is that even though we got "answers" we didn't learn anything and we're in the same position as if he didn't respond at all
16:24:08 <trozet> frankbrockners: but like jlinkes said, that will be a bigger debate/take longer
16:24:41 <frankbrockners> trozet - so what would you suggest as interim solution?
16:24:42 <jlinkes> he just stated their position and provided no explanation for anything
16:24:45 <edwarnicke> trozet: vpp rpms need to work out of the box
16:24:59 <edwarnicke> trozet: Without that, they don't
16:25:13 <trozet> edwarnicke: they dont work without a sysctl reload anyway
16:25:27 <edwarnicke> trozet: Package install does the sysctl reload :)
16:25:32 <trozet> edwarnicke: so you might as well provide instructions to the user on how to set his hugepage settings, then let him do it
16:25:49 <trozet> frankbrockners: the interim solution is, Apex can either A) remove that file before deployment
16:25:59 <trozet> or B) we can add more puppet conf to puppet-fdio and configure it properly
16:26:12 <trozet> B will take longer than A
16:26:15 <edwarnicke> trozet: It's done by a post install script
16:26:34 <edwarnicke> frankbrockners: The ODL releng folks are aware of the issue that is blocking our verify and are fixing it
16:26:53 <frankbrockners> trozet - how about we do A) for now to get the deployment going
16:27:03 <frankbrockners> edwarnicke - many thanks
16:27:03 <jlinkes> edwarnicke: what does it mean that they need to work out of the box and how is it tied to that 80-vpp.conf file?
16:27:12 <trozet> edwarnicke: sure. I still don't think it's the best approach
16:27:33 <edwarnicke> trozet: I'm open to alternatives that allow vpp to actually work out of the box on package install :)
16:28:19 <trozet> edwarnicke: why not just put in your install docs, to set these settings before using VPP, and let the user decide how many hugepages, etc
16:28:30 <trozet> edwarnicke: then a user decides for his host, what is appropriate
16:28:32 <edwarnicke> trozet: Install docs != works out of the box
16:28:41 <edwarnicke> trozet: When someone installs a package, it should run not crash
16:29:44 <frankbrockners> edwarnicke, trozet - unlikely that we solve things here.. - how about we do trozet's option A) for now?
16:30:46 <trozet> frankbrockners: we will need to also add some of those other options that vpp.conf is setting, if they are required
16:31:14 <trozet> edwarnicke: if they read the instructions first, it wont crash and work out of the box :)
16:31:36 <edwarnicke> trozet: Who do you know who reads the instructions when they type 'yum install foo'
16:31:40 <edwarnicke> Nobody I know does that
16:31:51 <trozet> edwarnicke: i need my lawn mower to work out of the box, but i didnt read the instructions about filling it up with gas..
16:31:56 <trozet> :)
16:32:14 <edwarnicke> LOL
16:32:26 <trozet> hehe
16:32:56 <trozet> frankbrockners: I'll work on removing the file and adding the arguments to our deploy settings file
16:33:12 <trozet> frankbrockners: but the end result should be, to tweak these parameters for a deployment, do it in the apex deploy settings file
16:33:25 <frankbrockners> thanks trozet - so basically move the contents of vpp.conf to your own config
16:33:25 <jlinkes> edwarnicke: how does this actually work? you install vpp, it then sets the stuff in 80-vpp and then vpp starts?
16:33:47 <edwarnicke> trozet: Just to make sure I understand, the issue for you with 80-vpp.conf is that you need to add a different file there with different hugepages requirements, correct?
16:34:08 <edwarnicke> jlinkes: Yes. You install vpp and then service vpp start just works
16:34:11 <trozet> edwarnicke: no. The problem is 80-vpp.conf is there, so it overrides the hugepages that the kernel is booted with
16:34:30 <trozet> edwarnicke: the vpp virus attacks our host and changes its hugepages from 2048 to 1024
16:34:51 <frankbrockners> edwarnicke .. and 1024 is too low...
16:34:56 <edwarnicke> trozet: I see that as a variation of the same theme, you are just putting the hugepages in the kernel arguments instead of the sysctl directory.. but net net, it's overriding a different choice you've made
16:35:06 <trozet> edwarnicke: right
16:35:17 <trozet> edwarnicke: we want to give the option to the user in Apex to declare how many hugepages before he deploys
16:35:25 <trozet> edwarnicke: so it's a setting, and we set those as kernel args
16:35:30 <jlinkes> edwarnicke: my main question was whether vpp configured hugepages as part of installation
16:35:35 <jlinkes> configures
16:35:38 <edwarnicke> trozet: Totally valid
16:35:44 <edwarnicke> jlinkes: It does
16:36:04 <edwarnicke> jlinkes: It runs sysctl --system in its post install script
16:36:21 <trozet> frankbrockners, edwarnicke: so if you see here https://gerrit.opnfv.org/gerrit/gitweb?p=apex.git;a=blob;f=config/deploy/os-odl_l2-fdio-noha.yaml;h=ad54fbdc830ecf9790c8d0f2b4104192e41214d2;hb=HEAD
16:36:21 <edwarnicke> trozet: Let me think about this a bit
16:36:28 <edwarnicke> trozet: Because there may be a compromise
16:36:35 <trozet> frankbrockners, edwarnicke: you see kernel arguments a user can change there for an apex deployment
16:36:36 <jlinkes> edwarnicke: so if there are other applications using hugepages vpp installation could totally screw hugepages for them?
16:36:59 <edwarnicke> trozet: How would you feel about this: iff the vpp package discovers 1024 or more hugepages, it leaves well enough alone, otherwise, it sets them
16:37:14 <trozet> edwarnicke: that makes sense
16:37:29 <edwarnicke> trozet: Which brings me to my question... how do I find out that hugepages is being set via kernel params?
16:37:29 * frankbrockners likes that solution
16:37:48 <trozet> edwarnicke: but what about the other settings, I thought I saw some of the comments in that vpp.conf file relate to the number of hugepages set
16:37:59 <jlinkes> trozet: here
16:37:59 <edwarnicke> trozet: The proposed solution is basically to go from: make sure hugepages == 1024, to make sure hugepages >= 1024
16:38:12 <trozet> edwarnicke: you check the bootloader or /proc/meminfo
16:38:15 <edwarnicke> trozet: That I'd have to look at
16:38:16 <jlinkes> sysctl -w vm.nr_hugepages=10000
16:38:16 <jlinkes> sysctl -w vm.max_map_count=22000
16:38:16 <jlinkes> sysctl -w vm.hugetlb_shm_group=0
16:38:16 <jlinkes> sysctl -w kernel.shmmax=20971520000
16:38:24 <jlinkes> trozet: this is what we use
16:38:52 <edwarnicke> /proc/meminfo doesn't help me, because it only tells me current runtime, not next boottime
16:39:09 <jlinkes> trozet: the formulas are vm.max_map_count = 2 * vm.nr_hugepages + 10% for vpp
16:39:13 <trozet> edwarnicke: oh hmm
16:39:53 <trozet> edwarnicke: yeah I'm not sure how to do that
16:40:10 <trozet> edwarnicke: we may even install VPP before we set the hugepages for next boot
16:40:27 <edwarnicke> trozet: Yeah, that I can't help you with :(
16:40:28 <jlinkes> trozet: and kernel.shmmax = 2 * vm.nr_hugepages * 1024 * 1024 for 2MB hugepages
16:40:42 <trozet> jlinkes: ok let me see about setting those in apex kernel args
16:40:44 <edwarnicke> trozet: I do potentially have an idea though
16:40:56 <edwarnicke> trozet: Is there a standard for naming a config file to set hugepages?
16:41:03 <edwarnicke> If there is...
16:41:08 <edwarnicke> We may have a good approach
16:41:10 <edwarnicke> Hmm...
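
The sizing rules jlinkes quotes can be written down as a small helper. This is only a sketch: the function name `vpp_sysctl_settings` is hypothetical, rounding the "+10%" up to a whole number is an assumption, and the factor 2 in the shmmax rule is read here as the 2 MB hugepage size in megabytes. Only the formulas and the fixed `vm.hugetlb_shm_group=0` come from the log.

```python
def vpp_sysctl_settings(nr_hugepages: int) -> dict:
    """Derive the sysctl values quoted in the meeting for 2 MB hugepages.

    Hypothetical helper, not part of VPP or Apex.
    """
    if nr_hugepages <= 0:
        raise ValueError("nr_hugepages must be positive")
    # vm.max_map_count = 2 * vm.nr_hugepages + 10%
    # (integer arithmetic, rounded up; the rounding direction is an assumption)
    max_map_count = (2 * nr_hugepages * 110 + 99) // 100
    # kernel.shmmax = 2 * vm.nr_hugepages * 1024 * 1024 (2 MB hugepages)
    shmmax = 2 * nr_hugepages * 1024 * 1024
    return {
        "vm.nr_hugepages": nr_hugepages,
        "vm.max_map_count": max_map_count,
        "vm.hugetlb_shm_group": 0,  # fixed value used in the log
        "kernel.shmmax": shmmax,
    }


if __name__ == "__main__":
    # Print the equivalent "sysctl -w" commands for the 10000-page setup.
    for key, value in vpp_sysctl_settings(10000).items():
        print(f"sysctl -w {key}={value}")
```

For 10000 hugepages this gives the 22000 and 20971520000 values that jlinkes pasted; the 25000 used earlier in the meeting is simply a more generous max_map_count.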
16:41:18 <frankbrockners> but can't we start with don't touch hugepages unless current-hugepages < 1024
16:41:42 <trozet> i still like the idea of documenting in requirements :)
16:41:56 <trozet> if a user starts VPP without hugepages
16:42:03 <trozet> print an error in journalctl
16:42:08 <trozet> and say go configure huge pages
16:42:08 <frankbrockners> we'll add this to the docs for sure
16:43:29 <trozet> frankbrockners: for the interim, i'll remove the file before we deploy and add the config to the kernel args
16:45:18 <jlinkes> trozet: there's also a requirement from woj
16:45:28 <trozet> jlinkes: what's that?
16:45:45 <jlinkes> trozet: it would be best if apex could figure out the maximum possible number of hugepages and configure that number
16:46:07 <jlinkes> trozet: well, not really a requirement
16:46:31 <frankbrockners> trozet - thanks - makes sense
16:47:03 <jlinkes> trozet: a suggestion for how to do this
16:47:15 <trozet> jlinkes: yeah that would be good
16:47:18 <jlinkes> trozet: I think he mentioned it in one of the e-mails
16:47:38 <trozet> jlinkes: TripleO is capable of doing "introspection" which means find out all the info about the hardware you are going to deploy to - before you deploy
16:47:54 <trozet> jlinkes: we currently have that disabled, but with that info, we could modify the hugepages on the fly before we deploy
16:48:08 <trozet> jlinkes: sounds like a good improvement for colorado2.0
16:48:19 <trozet> jlinkes: can you file a JIRA for that?
16:48:32 <jlinkes> trozet: okay
16:50:12 <frankbrockners> are we done for today?
16:50:21 <jlinkes> I think so
16:51:18 <michal-cmarada|2> frankbrockners: what's with the jenkins
16:52:04 <michal-cmarada|2> frankbrockners: did I miss something? :)
16:52:20 <frankbrockners> michal-cmarada|2 - what exactly do you refer to?
16:53:08 <frankbrockners> trozet has jenkins jobs up
16:53:19 <trozet> frankbrockners: let me link
16:53:31 <trozet> https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-odl_l2-fdio-noha-colorado/
16:53:35 <frankbrockners> what we're missing is functest - but this is what we discussed earlier
16:53:41 <trozet> https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-nosdn-fdio-noha-colorado/
16:53:51 <michal-cmarada|2> frankbrockners: I mean why it is failing. If you have found out something.
16:53:59 <trozet> frankbrockners: nosdn fdio also passes, but I don't think fpan has everything done yet for that
16:54:17 <trozet> michal-cmarada|2: i think frankbrockners said it is because functest doesnt create VMs with hugepages
16:54:35 <trozet> frankbrockners: should i talk to jose about getting this fixed when it detects FDIO as a scenario?
16:55:03 <jlinkes> trozet: I think we'll have to do this
16:55:17 <jlinkes> trozet: and by we I really mean me :-)
16:55:42 <trozet> jlinkes: so you will follow up with jose/morgan on that?
16:56:05 <frankbrockners> thanks jlinkes... and michal-cmarada|2 - deployment already succeeds: https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-odl_l2-fdio-noha-colorado/6/console
16:56:09 <jlinkes> trozet: we already talked to morgan today
16:56:14 <jlinkes> trozet: so yes
16:56:18 <trozet> jlinkes: cool
16:58:08 <frankbrockners> ... looks like we're done for today. Thanks everyone!
16:58:11 <frankbrockners> #endmeeting
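
The compromise floated in the meeting (leave hugepages alone when 1024 or more are already configured, otherwise set them) could look roughly like the sketch below. It is illustrative only and is not the VPP post-install script: the function names and the `VPP_MIN_HUGEPAGES` constant are hypothetical, and the check reads the `HugePages_Total` line of /proc/meminfo, which, as edwarnicke points out, reflects only the current runtime and not what the next boot will configure.

```python
import re
from typing import Optional

# Minimum number of hugepages mentioned in the meeting.
VPP_MIN_HUGEPAGES = 1024


def parse_hugepages_total(meminfo_text: str) -> int:
    """Return HugePages_Total from /proc/meminfo-style text, or 0 if absent."""
    match = re.search(r"^HugePages_Total:\s+(\d+)\s*$", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def hugepages_to_set(current: int, minimum: int = VPP_MIN_HUGEPAGES) -> Optional[int]:
    """Return the value to write to vm.nr_hugepages, or None to leave it alone.

    Implements "make sure hugepages >= minimum" rather than
    "make sure hugepages == minimum".
    """
    if current >= minimum:
        return None  # the host (e.g. Apex kernel args) already chose enough
    return minimum


# Example with a host booted with 2048 hugepages, as in the Apex deployment:
sample_meminfo = "MemTotal:  65805224 kB\nHugePages_Total:    2048\nHugePages_Free:     2048\n"
decision = hugepages_to_set(parse_hugepages_total(sample_meminfo))
```

With this policy the 2048 pages set by Apex would be left untouched, while a host with fewer than 1024 pages would be raised to 1024. On a live system the text would come from reading /proc/meminfo.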