16:00:16 <frankbrockners> #startmeeting FDS synch
16:00:16 <collabot> Meeting started Thu Sep  1 16:00:16 2016 UTC.  The chair is frankbrockners. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:16 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:16 <collabot> The meeting name has been set to 'fds_synch'
16:00:20 <jlinkes> #info Juraj Linkes
16:00:41 <frankbrockners> #info Frank Brockners
16:01:09 <frankbrockners> #info draft agenda - https://wiki.opnfv.org/display/meetings/FastDataStacks#FastDataStacks-Thursday,September1,2016
16:01:15 <michal-cmarada|2> #info Michal Cmarada
16:02:02 <frankbrockners> let's get rolling... - let's touch on the open issues first. Michal made good progress.
16:03:14 <frankbrockners> michal-cmarada|2 - quick update on the qr tap to BD?
16:04:14 <michal-cmarada|2> seems that it is working properly. Port is added to the BD and VMs can reach it.
16:04:20 <frankbrockners> #info Michal submitted https://git.opendaylight.org/gerrit/#/c/44935/; https://git.opendaylight.org/gerrit/#/c/45000/ (for boron and carbon) which are expected to resolve the qrouter tap to BD issue
16:04:39 <frankbrockners> #info Michal validated things locally:  Port is added to the BD and VMs can reach it.
16:05:18 <frankbrockners> michal-cmarada|2 - can we get the patch merged?
16:05:48 <michal-cmarada|2> patch from boron is failing in jenkins https://git.opendaylight.org/gerrit/#/c/45000/. we did a recheck but many jenkins checks are failing today.
16:06:01 <frankbrockners> :-(
16:06:22 <michal-cmarada|2> Qrouter tap port is also working on UCS-B side 1
16:07:39 <frankbrockners> edwarnicke - are you there?
16:08:01 <frankbrockners> edwarnicke - do you know of issues with verify jobs on Boron right now?
16:08:02 <edwarnicke> frankbrockners: Yes
16:08:14 <edwarnicke> I did not
16:08:15 <edwarnicke> Catching up
16:08:27 <frankbrockners> edwarnicke -  https://git.opendaylight.org/gerrit/#/c/44935 verifies fine
16:08:39 <frankbrockners> but the cherry pick to boron fails verify
16:09:02 <frankbrockners> https://git.opendaylight.org/gerrit/#/c/45000/ is critical to us - it fixes the last technical issue that FDS has
16:09:28 <edwarnicke> frankbrockners: I just poked the right folks on #opendaylight-releng
16:09:33 <edwarnicke> Should hear something shortly
16:09:38 <frankbrockners> thanks edwarnicke
16:10:07 <frankbrockners> let's see what else is on the laundry list... (a) functest (b) hugepages
16:10:26 <michal-cmarada|2> Just tested my patch on UCS-B side 1 once again. setup is correct, going to test the pings.
16:10:27 <frankbrockners> trozet mentioned that deploy works - but functest fails
16:11:44 <frankbrockners> #info jlinkes told me earlier today that functest fails for 2 reasons (a) qrouter - to - BD patch not there yet (b) VMs need to be started with hugepages option
16:12:19 <frankbrockners> #info morgan explained details how to amend functest config yaml to enable hugepages option
16:12:26 <frankbrockners> jlinkes - anything to add?
16:12:53 <jlinkes> nothing to add
16:12:57 <frankbrockners> thanks jlinkes
16:13:18 <trozet> frankbrockners: good point
16:13:23 <trozet> frankbrockners: about hugepages I mean
16:13:31 <frankbrockners> let's move to hugepages...
16:14:02 <frankbrockners> fact is that we need hugepages configured
16:14:10 <frankbrockners> the real question is where and how...
16:14:56 <frankbrockners> by default this is done in sysctl.d/80-vpp.conf for vpp
16:15:05 <frankbrockners> see also damian's email
16:15:17 <frankbrockners> trozet - you didn't like that approach too much - correct?
16:15:39 <trozet> frankbrockners: no
16:16:20 <trozet> frankbrockners: although I'm willing to go with it
16:16:31 <frankbrockners> trozet: any rationale / background?
16:16:31 <trozet> frankbrockners: we would need to update the puppet module to configure that conf file
16:17:00 <trozet> frankbrockners: I don't think it's a good idea for VPP to override sysctl settings when you install its RPM
16:17:18 <trozet> frankbrockners: i think vpp documentation should say hey, you need to set hugepages, here is how to do it
16:17:32 <trozet> frankbrockners: got to remember this is a linux host, and vpp is one application on it
16:17:38 <frankbrockners> trozet - but from what I understand, without the 80-vpp.conf, hugepages aren't configured correctly
16:18:00 <trozet> frankbrockners: no, we configure hugepages for the host, the 80-vpp.conf is overriding our grub settings
16:18:29 <jlinkes> frankbrockners: damian
16:18:44 <frankbrockners> hmmm... jlinkes - did we see hugepages configured properly without things being set in 80-vpp.conf?
16:18:52 <michal-cmarada|2> i did
16:18:53 <jlinkes> frankbrockners: damian's e-mail didn't contain much in terms of information
16:19:04 <frankbrockners> jlinkes - agreed
16:19:15 <michal-cmarada|2> sysctl -w vm.nr_hugepages=10000
16:19:17 <michal-cmarada|2> sysctl -w vm.max_map_count=25000
16:19:18 <trozet> frankbrockners: in my email you can see messages from the kernel, saying it is set to 2048 (the value apex set it to).  After that, vpp.conf changes it
16:19:19 <michal-cmarada|2> sysctl -w vm.hugetlb_shm_group=0
16:19:21 <michal-cmarada|2> sysctl -w kernel.shmmax=20971520000
16:19:22 <michal-cmarada|2> i have set this directly in
16:19:33 <jlinkes> frankbrockners: yes, we just set it using sysctl -w vm.nr_hugepages=10000
16:19:34 <jlinkes> sysctl -w vm.max_map_count=25000
16:19:34 <jlinkes> sysctl -w vm.hugetlb_shm_group=0
16:19:34 <jlinkes> sysctl -w kernel.shmmax=20971520000
16:19:40 <jlinkes> frankbrockners: and that worked fine
16:19:46 <michal-cmarada|2> sorry wrong order
16:19:47 <frankbrockners> but what if we don't put anything into vpp.conf?
16:20:03 <jlinkes> frankbrockners: and we left 80-vpp.conf alone
16:20:14 <trozet> frankbrockners: vpp.conf will only be read when you reboot
16:20:22 <trozet> frankbrockners: so if jlinkes reboots his machine, it will fall back to 1024
16:20:29 <jlinkes> right
16:20:55 <trozet> or you can do sysctl --system
16:20:58 <trozet> and it will reload the values from the conf file
16:21:04 <jlinkes> frankbrockners: that seems like it could work, deleting everything from 80-vpp.conf and leaving it empty
16:21:23 <frankbrockners> jlinkes - let's try that
16:21:37 <frankbrockners> "you should not reboot" isn't too much of an option...
16:21:50 <trozet> frankbrockners: what I'm saying is
16:22:01 <trozet> frankbrockners: the whole 80-vpp.conf, should be removed from VPP RPM install
16:22:14 <trozet> frankbrockners: not deleted post install, or anything
16:22:41 <jlinkes> frankbrockners: noted, will try that tomorrow
16:22:42 <trozet> frankbrockners: it is much better to document the settings that need to be changed, and let the user decide how he wants to do it
16:23:13 <jlinkes> trozet: that's a debate to have with vpp folks
16:23:19 <frankbrockners> trozet - understand - but this would mean that we need an RPM specific for OPNFV
16:23:43 <trozet> frankbrockners: no I think it means modifying the VPP RPM for all users :)
16:24:02 <frankbrockners> vpp folks would like to keep vpp.conf in the rpm to make sure things work.. - edwarnicke might have a view from a VPP perspective
16:24:03 <jlinkes> the problem with damian's e-mail is that even though we got "answers" we didn't learn anything and we're in the same position as if he didn't respond at all
16:24:08 <trozet> frankbrockners: but like jlinkes said, that will be a bigger debate/take longer
16:24:41 <frankbrockners> trozet - so what would you suggest as interim solution?
16:24:42 <jlinkes> he just stated their position and provided no explanation for anything
16:24:45 <edwarnicke> trozet: vpp rpms need to work out of the box
16:24:59 <edwarnicke> trozet: Without that, they don't
16:25:13 <trozet> edwarnicke: they don't work without a sysctl reload anyway
16:25:27 <edwarnicke> trozet: Package install does the sysctl reload :)
16:25:32 <trozet> edwarnicke: so you might as well provide instructions to the user on how to set his hugepage settings, then let him do it
16:25:49 <trozet> frankbrockners: the interim solution is, Apex can either A) remove that file before deployment
16:25:59 <trozet> or B) we can add more puppet conf to puppet-fdio and configure it properly
16:26:12 <trozet> B will take longer than A
16:26:15 <edwarnicke> trozet: Its done by a post install script
16:26:34 <edwarnicke> frankbrockners: The ODL releng folks are aware of the issue that is blocking our verify and are fixing it
16:26:53 <frankbrockners> trozet  - how about we do A) for now to get the deployment going
16:27:03 <frankbrockners> edwarnicke - many thanks
16:27:03 <jlinkes> edwarnicke: what does it mean that they need to work out of the box and how is it tied to that 80-vpp.conf file?
16:27:12 <trozet> edwarnicke: sure.  I still don't think it's the best approach
16:27:33 <edwarnicke> trozet: I'm open to alternatives that allow vpp to actually work out of the box on package install :)
16:28:19 <trozet> edwarnicke: why not just put in your install docs, to set these settings before using VPP, and let the user decide how many hugepages, etc
16:28:30 <trozet> edwarnicke: then a user decides for his host, what is appropriate
16:28:32 <edwarnicke> trozet: Install docs != works out of the box
16:28:41 <edwarnicke> trozet: When someone installs a package, it should run not crash
16:29:44 <frankbrockners> edwarnicke, trozet - unlikely that we solve things here.. - how about we do trozet's option A) for now?
16:30:46 <trozet> frankbrockners: we will need to also add some of those other options that vpp.conf is setting, if they are required
16:31:14 <trozet> edwarnicke: if they read the instructions first, it won't crash and will work out of the box :)
16:31:36 <edwarnicke> trozet: Who do you know who reads the instructions when they type 'yum install foo'
16:31:40 <edwarnicke> Nobody I know does that
16:31:51 <trozet> edwarnicke: I need my lawn mower to work out of the box, but I didn't read the instructions about filling it up with gas..
16:31:56 <trozet> :)
16:32:14 <edwarnicke> LOL
16:32:26 <trozet> hehe
16:32:56 <trozet> frankbrockners: I'll work on removing the file and adding the arguments to our deploy settings file
16:33:12 <trozet> frankbrockners: but the end result should be, to tweak these parameters for a deployment, do it in the apex deploy settings file
16:33:25 <frankbrockners> thanks trozet - so basically move the contents of vpp.conf to your own config
16:33:25 <jlinkes> edwarnicke: how does this actually work? you install vpp, it then sets the stuff in 80-vpp and then vpp starts?
16:33:47 <edwarnicke> trozet: Just to make sure I understand, the issue for you with 80-vpp.conf is that you need to add a different file there with different hugepages requirements, correct?
16:34:08 <edwarnicke> jlinkes: Yes.  You install vpp and then service vpp start just works
16:34:11 <trozet> edwarnicke: no.  The problem is 80-vpp.conf is there, so it overrides the hugepages that the kernel is booted with
16:34:30 <trozet> edwarnicke: the vpp virus attacks our host and changes its hugepages from 2048 to 1024
16:34:51 <frankbrockners> edwarnicke .. and 1024 is too low...
16:34:56 <edwarnicke> trozet: I see that as a variation of the same theme, you are just putting the hugepages in the kernel arguments instead of the sysctl directory.. but net net, it's overriding a different choice you've made
16:35:06 <trozet> edwarnicke: right
16:35:17 <trozet> edwarnicke: we want to give the option to the user in Apex to declare how many hugepages before he deploys
16:35:25 <trozet> edwarnicke: so it's a setting, and we set those as kernel args
16:35:30 <jlinkes> edwarnicke: my main question was whether vpp configured hugepages as part of installation
16:35:35 <jlinkes> configures
16:35:38 <edwarnicke> trozet: Totally valid
16:35:44 <edwarnicke> jlinkes: It does
16:36:04 <edwarnicke> jlinkes: It runs sysctl --system in its post install script
16:36:21 <trozet> frankbrockners, edwarnicke: so if you see here https://gerrit.opnfv.org/gerrit/gitweb?p=apex.git;a=blob;f=config/deploy/os-odl_l2-fdio-noha.yaml;h=ad54fbdc830ecf9790c8d0f2b4104192e41214d2;hb=HEAD
16:36:21 <edwarnicke> trozet: Let me think about this a bit
16:36:28 <edwarnicke> trozet: Because there may be a compromise
16:36:35 <trozet> frankbrockners, edwarnicke: you see kernel arguments a user can change there for an apex deployment
16:36:36 <jlinkes> edwarnicke: so if there are other applications using hugepages vpp installation could totally screw hugepages for them?
16:36:59 <edwarnicke> trozet: How would you feel about this: iff the vpp package discovers 1024 or more hugepages, it leaves well enough alone, otherwise, it sets them
16:37:14 <trozet> edwarnicke: that makes sense
16:37:29 <edwarnicke> trozet: Which brings me to my question... how do I find out that hugepages is being set via kernel params?
16:37:29 * frankbrockners likes that solution
16:37:48 <trozet> edwarnicke: but what about the other settings, I thought I saw some of the comments in that vpp.conf file relate to the number of hugepages set
16:37:59 <jlinkes> trozet: here
16:37:59 <edwarnicke> trozet: The proposed solution is basically to go from: make sure hugepages == 1024, to make sure hugepages >= 1024
16:38:12 <trozet> edwarnicke: you check the bootloader or /proc/meminfo
16:38:15 <edwarnicke> trozet: That I'd have to look at
16:38:16 <jlinkes> sysctl -w vm.nr_hugepages=10000
16:38:16 <jlinkes> sysctl -w vm.max_map_count=22000
16:38:16 <jlinkes> sysctl -w vm.hugetlb_shm_group=0
16:38:16 <jlinkes> sysctl -w kernel.shmmax=20971520000
16:38:24 <jlinkes> trozet: this is what we use
16:38:52 <edwarnicke> /proc/meminfo doesn't help me, because it only tells me current runtime, not next boottime
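[Editor's note] A minimal sketch of the /proc/meminfo check discussed here. As edwarnicke points out, it only reports what the running kernel currently has, not what the next boot will use. The function name `parse_hugepages` and the sample text are illustrative assumptions and are not part of VPP or Apex; the field names (`HugePages_Total`, `HugePages_Free`, `Hugepagesize`) are the ones the Linux kernel exposes.

```python
def parse_hugepages(meminfo_text):
    """Return the hugepage counters found in /proc/meminfo-style text."""
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    result = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() in wanted:
            # The first token after the colon is the number;
            # Hugepagesize additionally carries a "kB" unit, which is ignored.
            result[key.strip()] = int(rest.split()[0])
    return result


# Illustrative sample only, not captured from a real host.
SAMPLE = """\
MemTotal:       65536000 kB
HugePages_Total:    2048
HugePages_Free:     2048
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    print(parse_hugepages(SAMPLE))
```

On a live Linux host the same function could be fed the contents of /proc/meminfo, for example `parse_hugepages(open("/proc/meminfo").read())`.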
16:39:09 <jlinkes> trozet: the formulas are vm.max_map_count = 2 * vm.nr_hugepages + 10% for vpp
16:39:13 <trozet> edwarnicke: oh hmm
16:39:53 <trozet> edwarnicke: yeah I'm not sure how to do that
16:40:10 <trozet> edwarnicke: we may even install VPP before we set the hugepages for next boot
16:40:27 <edwarnicke> trozet: Yeah, that I can't help you with :(
16:40:28 <jlinkes> trozet: and kernel.shmmax = 2 * vm.nr_hugepages * 1024 * 1024 for 2MB hugepages
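[Editor's note] A small worked example of the two formulas jlinkes quotes (vm.max_map_count = 2 * vm.nr_hugepages + 10%, and kernel.shmmax = 2 * vm.nr_hugepages * 1024 * 1024 for 2MB hugepages). The function name `vpp_sysctl_values` is hypothetical, and integer rounding of the 10% term is an assumption; this is a sketch of the arithmetic, not code taken from VPP or Apex.

```python
def vpp_sysctl_values(nr_hugepages):
    """Derive related sysctl values from the hugepage count (2MB pages assumed)."""
    base = 2 * nr_hugepages
    # "2 * nr_hugepages + 10%": add one tenth of the doubled count.
    max_map_count = base + base // 10
    # For 2MB hugepages: total hugepage memory in bytes.
    shmmax = 2 * nr_hugepages * 1024 * 1024
    return {
        "vm.nr_hugepages": nr_hugepages,
        "vm.max_map_count": max_map_count,
        "kernel.shmmax": shmmax,
    }


if __name__ == "__main__":
    for key, value in vpp_sysctl_values(10000).items():
        print(f"sysctl -w {key}={value}")
```

For 10000 hugepages this gives vm.max_map_count=22000 and kernel.shmmax=20971520000, which matches the values jlinkes pasted at 16:38.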
16:40:42 <trozet> jlinkes: ok let me see about setting those in apex kernel args
16:40:44 <edwarnicke> trozet: I do potentially have an idea though
16:40:56 <edwarnicke> trozet: Is there a standard for naming a config file to set hugepages?
16:41:03 <edwarnicke> If there is...
16:41:08 <edwarnicke> We may have a good approach
16:41:10 <edwarnicke> Hmm...
16:41:18 <frankbrockners> but can't we start with don't touch hugepages unless current-hugepages < 1024
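[Editor's note] The policy proposed here (leave hugepages alone unless the current count is below 1024) can be sketched as follows. This only illustrates the decision being discussed; it is not how the VPP package's post install script actually behaves, and the names `MIN_HUGEPAGES` and `hugepages_to_set` are hypothetical.

```python
MIN_HUGEPAGES = 1024  # threshold mentioned in the discussion


def hugepages_to_set(current, minimum=MIN_HUGEPAGES):
    """Return the value to write, or None to leave the host's setting untouched."""
    if current >= minimum:
        # The host already has enough hugepages (e.g. set via kernel args).
        return None
    return minimum


if __name__ == "__main__":
    for current in (0, 1024, 2048):
        print(current, "->", hugepages_to_set(current))
```

With this rule a host booted with 2048 hugepages keeps its value, while a host with none is raised to 1024. It still only sees the current runtime count, so the open question about settings that take effect on the next boot remains.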
16:41:42 <trozet> i still like the idea of documenting in requirements :)
16:41:56 <trozet> if a user starts VPP without hugepages
16:42:03 <trozet> print an error in journalctl
16:42:08 <trozet> and say go configure huge pages
16:42:08 <frankbrockners> we'll add this to the docs for sure
16:43:29 <trozet> frankbrockners: for the interim, I'll remove the file before we deploy and add the config to the kernel args
16:45:18 <jlinkes> trozet: there's also a requirement from woj
16:45:28 <trozet> jlinkes: what's that?
16:45:45 <jlinkes> trozet: it would be best if apex could figure out the maximum possible number of hugepages and configure that number
16:46:07 <jlinkes> trozet: well, not really a requirement
16:46:31 <frankbrockners> trozet - thanks - makes sense
16:47:03 <jlinkes> trozet: a suggestion for how to do this
16:47:15 <trozet> jlinkes: yeah that would be good
16:47:18 <jlinkes> trozet: I think he mentioned it in one of the e-mails
16:47:38 <trozet> jlinkes: TripleO is capable of doing "introspection" which means find out all the info about the hardware you are going to deploy to - before you deploy
16:47:54 <trozet> jlinkes: we currently have that disabled, but with that info, we could modify the hugepages on the fly before we deploy
16:48:08 <trozet> jlinkes: sounds like a good improvement for colorado2.0
16:48:19 <trozet> jlinkes: can you file a JIRA for that?
16:48:32 <jlinkes> trozet: okay
16:50:12 <frankbrockners> are we done for today?
16:50:21 <jlinkes> I think so
16:51:18 <michal-cmarada|2> frankbrockners: what's with the jenkins?
16:52:04 <michal-cmarada|2> frankbrockners: did I miss something? :)
16:52:20 <frankbrockners> michal-cmarada|2 - what exactly do you refer to?
16:53:08 <frankbrockners> trozet has jenkins jobs up
16:53:19 <trozet> frankbrockners: let me link
16:53:31 <trozet> https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-odl_l2-fdio-noha-colorado/
16:53:35 <frankbrockners> what we're missing is functest - but this is what we discussed earlier
16:53:41 <trozet> https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-nosdn-fdio-noha-colorado/
16:53:51 <michal-cmarada|2> frankbrockners: I mean why it is failing. If you have found out something.
16:53:59 <trozet> frankbrockners: nosdn fdio also passes, but I don't think fpan has everything done yet for that
16:54:17 <trozet> michal-cmarada|2: I think frankbrockners said it is because functest doesn't create VMs with hugepages
16:54:35 <trozet> frankbrockners: should i talk to jose about getting this fixed when it detects FDIO as a scenario?
16:55:03 <jlinkes> trozet: I think we'll have to do this
16:55:17 <jlinkes> trozet: and by we I really mean me :-)
16:55:42 <trozet> jlinkes: so you will follow up with jose/morgan on that?
16:56:05 <frankbrockners> thanks jlinkes... and michal-cmarada|2 - deployment already succeeds: https://build.opnfv.org/ci/job/apex-deploy-baremetal-os-odl_l2-fdio-noha-colorado/6/console
16:56:09 <jlinkes> trozet: we already talked to morgan today
16:56:14 <jlinkes> trozet: so yes
16:56:18 <trozet> jlinkes: cool
16:58:08 <frankbrockners> ... looks like we're done for today. Thanks everyone!
16:58:11 <frankbrockners> #endmeeting