18:15:03 <dfarrell07> #startmeeting cperf
18:15:03 <collabot> Meeting started Thu Jun 15 18:15:03 2017 UTC.  The chair is dfarrell07. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:15:03 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
18:15:03 <collabot> The meeting name has been set to 'cperf'
18:18:33 <dfarrell07> #info mattw4 reports they have reached some nice stability with setup and take down of scale deployment
18:18:54 <dfarrell07> #info very exciting that they have this deployment working
18:19:00 <dfarrell07> #info have scaled up to 100 compute nodes so far
18:19:17 <dfarrell07> #info creating lots of instances at the same time results in half of them failing
18:19:38 <dfarrell07> #info this is similar to what Nikos saw with nstat, they had to add in batches
18:20:08 <dfarrell07> #info many test dimensions in this matrix, need to make prios
18:20:29 <dfarrell07> #info next prio is getting this pushed to the upstream stuff they have started
18:20:48 <dfarrell07> #info Jamo suggests some initial tests, single host on each compute node, make sure everyone can talk to everyone
18:21:13 <dfarrell07> #info jamo talks about bugs they see in regular netvirt with some flows not getting installed, some instances not being able to talk to each other
18:21:37 <dfarrell07> #info Raghu confirms they are looking at that test early
18:23:33 <dfarrell07> #info mattw4 talks about timing between instance creation being important, waiting some number of seconds, or better yet waiting for a ping smoke test to pass before creating more instances
18:23:52 <dfarrell07> #info jamo talks about how openstack may have some issues like this, has heard something like 13 at a time
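The batching idea discussed above (create a few instances, wait for a smoke test to pass, then create more) could be sketched roughly as follows. This is only an illustration, not the tooling mattw4 is using: `create_instance` and `smoke_test` are hypothetical callables the caller would supply (for example wrappers around the OpenStack API and a ping check), and the default batch size of 13 is just the number jamo mentioned hearing, not a verified limit.

```python
import time
from typing import Callable, Iterable, List


def create_in_batches(
    names: Iterable[str],
    create_instance: Callable[[str], None],   # hypothetical: boots one instance
    smoke_test: Callable[[List[str]], bool],  # hypothetical: e.g. ping check on a batch
    batch_size: int = 13,                     # assumption taken from the discussion
    retries: int = 5,
    delay_s: float = 2.0,
) -> List[str]:
    """Create instances batch by batch, gating each batch on a smoke test.

    Returns the names of all instances whose batch passed the smoke test.
    Raises RuntimeError if a batch never passes within the retry budget.
    """
    pending = list(names)
    ready: List[str] = []
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for name in batch:
            create_instance(name)
        # Poll the smoke test instead of sleeping for a fixed time.
        for _ in range(retries):
            if smoke_test(batch):
                break
            time.sleep(delay_s)
        else:
            raise RuntimeError(f"smoke test failed for batch starting at {batch[0]}")
        ready.extend(batch)
    return ready
```

Stopping at the first batch that never passes keeps a failure small and easy to reproduce, rather than discovering missing flows after hundreds of instances already exist.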
18:24:22 <dfarrell07> #info when spinning up lots of instances, some rules don't get installed, no tunnels, openstack thinks things are good but they aren't because of missing rules
18:24:34 <dfarrell07> #info jamo says make sure you're using latest carbon, bugs recently
18:24:44 <dfarrell07> #info mattw4 is on boron sr2, which is def a problem
18:24:59 <dfarrell07> #info jamoluhrsen says throw away boron asap, maybe sr4 but that's still otw
18:25:20 <dfarrell07> #info mattw4 is using andre's netvirt scale doc as test bible for now, following that closely
18:26:14 <dfarrell07> #info Raghu asks if there are custom configs we do for ODL that they should be doing, jamoluhrsen says no, nothing fancy
18:28:45 <dfarrell07> #info mattw4 talks about deleting instances not working so well, after removing instances ODL still maintaining state, mattw4 saw some crazy stuff with like 54GB of RAM used *wow faces all around*
18:29:43 <dfarrell07> #info LuisGomez says we should dump this ram into a profiler and see what's going on
18:30:07 <dfarrell07> #info dfarrell07 restates that this is def stuff we want to bring to odl devs, they will ask for tests to reproduce so need that as well
18:31:05 <dfarrell07> #info mattw4 is using openstack-ansible modules to work with openstack cluster, poke into and config things, have looked at rally but maybe not what they need, looking for feedback about tools to work with such things
18:32:07 <dfarrell07> #info LuisGomez and mattw4 talk about deployment tooling
18:32:28 <dfarrell07> #info mattw4 talks about using ansible, probing openstack api
18:32:45 <dfarrell07> #info next week they will have test plan to share with us, see where they are going
18:33:09 <dfarrell07> #info LuisGomez talks about this just running in their internal lab, everyone agrees will be awesome to get running in cperf etc
18:33:42 <dfarrell07> #info jamoluhrsen gives updates about migration of pod
18:33:56 <dfarrell07> #info we have new ip addresses, but they have not been changed on the boxes for us
18:34:06 <dfarrell07> #info we have console access, but need to get into boxes and change them
18:34:21 <dfarrell07> #info jamoluhrsen doesn't have cycles to do this atm, someone else could help, he would show how
18:36:56 <dfarrell07> #info jamoluhrsen talks about how cperf tools container has been useful in downstream testing, maybe we should advertise it to ppl as an easy way to get robot running
18:37:09 <dfarrell07> #info discussion about docker support in ODL CI, we have jobs that use it jamoluhrsen says
18:37:25 <dfarrell07> #info LuisGomez wants more things running in containers (of course)
18:38:04 <dfarrell07> #info mattw4 talks about their attempt to use docker networks as underlay, they were not so great, ended up using linux bridge and veth pairs via scripts, pure linux
18:39:17 <dfarrell07> #info LuisGomez reports he hopes to have some time for switch scale tests in openflow cluster next week
18:39:25 <dfarrell07> #info LuisGomez has some perf tests already, but not scale in cluster
18:40:23 <dfarrell07> #info there was a bug in the scale test before, controller was getting switch connections but was not pushing table miss flows, LuisGomez has worked on test very recently, seeing more stable results
18:40:38 <dfarrell07> #info LuisGomez is seeing 400 switches in these non-cluster tests
18:40:59 <dfarrell07> #info LuisGomez says this bug has maybe been fixed very recently, new patch, need to revert test changes and check new odl patch
18:42:35 <dfarrell07> #info dfarrell07 highlights issues from opnfv vsperf folks we raised to ODL CI and openflow devs around switches being lost, heartbeat messages not being prioritized causing more problems
18:43:41 <dfarrell07> #info there was some recent patch to carbon+ about fixing reconnects, but deeper problem is prio of connections
18:47:45 <dfarrell07> #info discussion about running odl tests on opnfv infra, that we're not sure of status of that effort, dfarrell07 guesses that as we do lf collab stuff this will become more easy/focus
18:48:01 <dfarrell07> #info LuisGomez talks about sanity tests, dashboard, what is relevant/important
18:48:18 <dfarrell07> #info LuisGomez talks about dashboard work interns are doing, that we have some cool gui stuff he will demo next week
18:49:32 <dfarrell07> #info LuisGomez talks about elasticsearch, can just push json with any body, then have logic to parse that and populate db, then graphing tools on top of that to make graphs
18:50:43 <dfarrell07> #info can use this for other things later, infra, jenkins, whatever we can measure can go into dashboard
18:51:10 <dfarrell07> #endmeeting