13:01:04 <fdegir> #startmeeting Cross Community CI
13:01:04 <collabot> Meeting started Wed Aug 8 13:01:04 2018 UTC. The chair is fdegir. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:04 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:01:04 <collabot> The meeting name has been set to 'cross_community_ci'
13:01:29 <fdegir> #topic Rollcall
13:01:33 <mbuil> hwoarang: yes! Would it be possible to wait 2-3 weeks?
13:01:39 <hwoarang> #info Markos Chandras
13:01:40 <hwoarang> mbuil: ok
13:01:44 <jmorgan1> #info Jack Morgan
13:01:46 <mbuil> or you want to try new images for the fdegir problem
13:01:54 <mbuil> #info Manuel Buil
13:01:56 <fdegir> here is the agenda in its usual place: https://etherpad.opnfv.org/p/xci-meetings
13:01:58 <hwoarang> i can wait
13:02:24 <fdegir> and the first topic is the issue with the slaves
13:02:32 <fdegir> #topic Issues with Jenkins Slaves
13:02:56 <fdegir> so I went against the motto "if it's working, don't touch it" and broke things
13:03:08 <fdegir> the opnfv vm can't boot
13:03:11 <mbuil> fdegir: I am attending another meeting in parallel, so I might be slow :P
13:03:45 <fdegir> i tried 3 different kernel versions: 4.4.0-112, 4.4.0-122, 4.4.0-131
13:03:50 <electrocucaracha> #info Víctor Morales
13:04:25 <fdegir> one of 112 and 122 is probably the kernel we used until this morning's update
13:04:38 <fdegir> what else should we look at?
13:05:00 <jmorgan1> fdegir: i just deployed yesterday without problem, opnfv host booted
13:05:01 <fdegir> or if one of you wants to take a look, the problematic deployment is available on intel-pod16-node3
13:05:12 <fdegir> jmorgan1: is it ubuntu and what's the kernel?
13:05:28 <fdegir> apart from that, when did you do apt update && apt upgrade?
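[Editor's note: a quick way to compare the kernel versions being discussed above is to list the running kernel and the installed kernel packages. A minimal sketch assuming a Debian/Ubuntu host; purely illustrative, not a command from the meeting.]

```shell
# Show the kernel currently running and the kernel packages still installed
# (illustrative troubleshooting for the 4.4.0-112/122/131 comparison above).
uname -r
dpkg --list 'linux-image-*' | awk '/^ii/ {print $2, $3}'
```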
13:05:47 <jmorgan1> by opnfv host, do we mean ubuntu_xci_vm or the opnfv vm inside ubuntu_xci_vm?
13:05:55 <fdegir> opnfv vm inside ubuntu_xci_vm
13:06:01 <jmorgan1> fdegir: yes, it's ubuntu
13:06:12 <jmorgan1> it's up and running for me
13:06:19 <jmorgan1> kernel 128 i believe
13:06:29 <fdegir> ok
13:06:46 <fdegir> can you do apt update && apt upgrade, recreate ubuntu_xci_vm and run xci-deploy.sh again?
13:06:47 <hwoarang> fdegir: does an opensuse vm work?
13:06:58 <hwoarang> to rule out probs with the VM itself?
13:07:09 <jmorgan1> i had other issues related to the k8-calico-onap scenario, but the VM was running
13:07:14 <fdegir> hwoarang: nope
13:07:20 <hwoarang> i mean, if this was triggered by a job, then one of the nodes will have opensuse running
13:07:32 <hwoarang> on the same ubuntu xenial host configuration
13:07:47 <fdegir> hwoarang: https://build.opnfv.org/ci/job/xci-verify-opensuse-deploy-virtual-master/1855/console
13:07:59 <fdegir> hwoarang: https://build.opnfv.org/ci/job/xci-verify-ubuntu-deploy-virtual-master/1854/console
13:08:05 <hwoarang> ok so it's a host thing
13:08:07 <fdegir> same error but it fails faster on opensuse ;)
13:08:17 <hwoarang> no point in updating the images then
13:08:23 <fdegir> i don't think so
13:08:33 <hwoarang> good
13:08:41 <fdegir> is there a way to roll back the complete apt upgrade somehow?
13:09:14 <fdegir> or find what kernel version the machine was booted with earlier?
13:09:23 <jmorgan1> where are the jenkins slaves?
13:09:24 <hwoarang> journalctl --list-boots ?
13:09:31 <fdegir> cause i did an apt autoremove to get rid of the old kernels as well
13:09:41 <hwoarang> and then look at the logs for the boot you want with journalctl -b<boot id>
13:09:59 <fdegir> there is only 1 boot entry
13:10:02 <hwoarang> :(
13:11:25 <mbuil> fdegir: can you access the VM console?
13:13:38 <fdegir> no
13:14:19 <jmorgan1> does virsh say the VM is paused?
13:14:24 <fdegir> it's running
13:15:02 <jmorgan1> then what is the problem?
the vm fails during the boot process?
13:15:17 <hwoarang> but does it matter? it's panicked
13:15:17 <fdegir> http://paste.ubuntu.com/p/5BW7mP3SF5/
13:15:48 <fdegir> let's move on to the next topics
13:16:09 <fdegir> #topic OSA SHA Bump
13:16:16 <fdegir> this is problematic as well
13:16:26 <hwoarang> i am looking into that
13:16:31 <hwoarang> we need newer bifrost
13:16:31 <fdegir> ok
13:16:41 <fdegir> i bumped to the latest bifrost
13:16:45 <fdegir> to Manuel's patch
13:16:56 <hwoarang> there is one more commit we are missing
13:17:17 <OPNFV-Gerrit-Bot> Merged pharos: [idf.fuel] Add jumpserver.trunks for mgmt https://gerrit.opnfv.org/gerrit/60743
13:17:47 <mbuil> fdegir: the baremetal patch?
13:18:20 <fdegir> mbuil: i don't remember exactly what it was, but after seeing you committed something to bifrost, i tried that one
13:18:29 <fdegir> hwoarang: ok
13:18:33 <hwoarang> no we really need the HEAD of bifrost
13:19:01 <hwoarang> but we will see
13:19:18 <fdegir> ok
13:19:40 <fdegir> the reason the bump is urgent is that I expect the projects will start arriving soon
13:19:45 <fdegir> for masakari and blazar
13:19:58 <fdegir> that's all about the sha bump
13:20:09 <fdegir> #topic Functest Issues
13:20:15 <fdegir> this is another tricky topic
13:20:28 <fdegir> singlevm test case times out
13:20:32 <hwoarang> that is more urgent i think
13:20:53 <fdegir> i agree
13:21:08 <hwoarang> i had a talk with cedric in the past few weeks and he basically said that nested virt is not proper CI and functest is not designed for such scenarios
13:21:15 <fdegir> well
13:21:20 <hwoarang> so to me the question might be whether we want to keep functest at this level
13:21:23 <fdegir> i probably shouldn't comment on it
13:21:35 <fdegir> what if we were using an openstack cloud for patch gating
13:21:38 <hwoarang> maybe we should only use it for smoke + baremetal
13:21:40 <fdegir> getting slaves as vms
13:22:09 <fdegir> and if upstream openstack is doing tempest using virtual machines then what
we are doing is not that strange either
13:22:15 <fdegir> but again
13:22:21 <hwoarang> true but we have one more level of virtualization
13:22:33 <fdegir> i think it is the same
13:22:50 <fdegir> doesn't openstack ci get slaves from an openstack cloud using nodepool?
13:23:01 <jmorgan1> i think so
13:23:03 <hwoarang> yes but we also have a clean vm
13:23:09 <fdegir> ok
13:23:18 <fdegir> so openstack gets installed on those nodepool vms directly
13:23:36 <fdegir> then right
13:24:02 <fdegir> i'm wondering how it would perform if we ran functest against an xci deployment done directly on the host
13:25:15 <fdegir> anyway
13:25:19 <hwoarang> the prob is that if our CI scenario is not considered supported by functest, then we have 0 help from them
13:25:37 <hwoarang> so, not sure if we can keep up with that
13:26:29 <fdegir> i really shouldn't say anything
13:26:36 <fdegir> i failed to explain what we are doing and why
13:26:50 <hwoarang> either way
13:26:56 <jmorgan1> we would need one of us to join functest to gain the knowledge
13:27:03 <fdegir> it is not about knowledge
13:27:10 <fdegir> ok
13:27:11 <fdegir> moving on
13:27:43 <fdegir> when we meet next time over a beer, I'll give you the details
13:27:49 <hwoarang> we have the knowledge to fix stuff.
i am submitting patches to fix our cases but i get resistance because of our unsupported case and that's the prob
13:28:03 <hwoarang> wait, because we haven't decided anything and this will block things on the SHA bump
13:28:12 <fdegir> nope
13:28:26 <fdegir> we haven't switched back to the latest version of functest yet
13:28:32 <fdegir> we are still using the pinned version
13:28:37 <hwoarang> ok then
13:28:38 <hwoarang> :/
13:28:41 <fdegir> we bump against that
13:28:49 <fdegir> and then come back to uplifting functest
13:28:57 <fdegir> because
13:29:03 <fdegir> anyway
13:29:07 <fdegir> #topic Baremetal Status
13:29:16 <fdegir> mbuil: so, you got it working
13:29:55 <fdegir> mbuil: when do you think we can review the change?
13:31:06 <fdegir> mbuil seems to be focused on his other meeting
13:31:10 <mbuil> fdegir: back
13:31:17 <mbuil> I had to talk in the other one
13:31:26 <fdegir> no multitasking ;(
13:32:11 <mbuil> so, the baremetal patch works in ubuntu! I deployed the mini flavor several times with the SFC scenario in Ericsson POD2
13:32:36 <mbuil> However, the patch has several hardcoded things and I am currently fixing those
13:32:57 <mbuil> After that, I'll split the patch into several patches so that it is easily consumable for reviewers
13:33:04 <fdegir> +1
13:33:20 <mbuil> it would be nice to get a POD where the jumphost uses opensuse
13:33:26 <mbuil> because currently I cannot test it
13:33:51 <fdegir> what do you mean?
13:34:00 <fdegir> why should it matter?
13:34:48 <mbuil> LFPOD4 and Ericsson POD2 have Ubuntu on the jumphost. The code currently uses that information to decide what distro to install on OPNFV and on the blades
13:35:02 <fdegir> mbuil: export DISTRO=opensuse before xci-deploy.sh
13:35:02 <jmorgan1> the jump server is running the opnfv host vm which is deploying to BM nodes
13:35:11 <mbuil> Are you suggesting to define the distro with e.g. an env variable?
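[Editor's note: the `export DISTRO=opensuse` approach fdegir mentions above can be sketched as follows. The fallback to the host OS via /etc/os-release is an assumption about how the default is detected, not a quote of the actual xci code.]

```shell
# Honour an explicit DISTRO; otherwise fall back to the host OS, which is
# the default behaviour described in the discussion above.
# The os-release detection is illustrative, not the real xci implementation.
DISTRO="${DISTRO:-$(. /etc/os-release && echo "${ID}")}"
echo "deploying nodes with DISTRO=${DISTRO}"
# ./xci-deploy.sh   # would then use DISTRO for the opnfv vm and target nodes
```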
13:35:14 <fdegir> yes
13:35:18 <fdegir> that's what we do in ci
13:35:28 <mbuil> oh, that is new to me
13:35:34 <mbuil> then ignore my request
13:35:35 <fdegir> all slaves are ubuntu but we decide what distro to bring the nodes up with
13:35:57 <fdegir> hwoarang: correct me if I'm wrong
13:36:31 <fdegir> mbuil: we are looking forward to the split-up patches
13:36:34 <jmorgan1> i'm not sure what we are saying here
13:36:39 <fdegir> thanks for the great work
13:36:55 <jmorgan1> jump server with any OS running the opnfv host VM which deploys to actual BM nodes
13:37:05 <fdegir> jmorgan1: you are not bound by the os of the jumphost
13:37:19 <fdegir> jmorgan1: by default, the host os is chosen as the os for the target nodes
13:37:20 <jmorgan1> fdegir: right, that is what I'm saying
13:37:23 <mbuil> jmorgan1: by jumphost I meant the host which hosts the opnfv vm
13:37:37 <fdegir> jmorgan1: but you can change the target node os
13:37:45 <jmorgan1> mbuil: yes, this is what OPNFV defines as the jumphost
13:38:20 <mbuil> jmorgan1: the default behaviour is: if the jumphost has distro A, the opnfv VM gets distro A and the compute and controller get distro A
13:38:25 <jmorgan1> agreed, otherwise, how are you going to know what OS to deploy to the nodes
13:38:34 <fdegir> by setting the DISTRO var
13:38:50 <fdegir> which is what xci actually does behind the scenes if DISTRO is not set
13:39:07 <jmorgan1> ok, i think we are all on the same page
13:39:07 <fdegir> and this was mbuil's question
13:39:27 <jmorgan1> we don't care which OS is on the jump host
13:39:39 <jmorgan1> we support three OSes on the opnfv host vm
13:39:40 <fdegir> yes, that's it
13:39:51 <jmorgan1> which deploys three OSes to the BM nodes
13:40:16 <fdegir> anything else mbuil?
13:40:27 <jmorgan1> so mbuil should be able to use an opensuse based opnfv VM to test
13:40:50 <mbuil> nothing else.
I am pretty busy right now but I hope to give 50% of my time to this
13:41:04 <fdegir> thanks mbuil
13:41:12 <fdegir> #topic k8-calico-onap
13:41:32 <fdegir> jmorgan1: electrocucaracha: how is it going?
13:42:11 <electrocucaracha> fdegir: well, jmorgan1 has been facing the nested virtualization issues
13:42:24 <electrocucaracha> fdegir: trying to test the new scenario
13:42:51 <electrocucaracha> fdegir: we have doubts about the proper syntax for adding a new scenario using gerrit as a source
13:43:32 <electrocucaracha> fdegir: jmorgan1 has a draft for that, we'll need some help on it
13:43:44 <fdegir> electrocucaracha: jmorgan1: https://gerrit.opnfv.org/gerrit/#/c/58945/17/xci/opnfv-scenario-requirements.yml
13:43:59 <fdegir> the version matches the sha of your commit
13:44:14 <fdegir> and the refspec matches the refspec of your change on gerrit
13:44:18 <jmorgan1> we also ran into an issue where we needed to update the role (create-vm-nodes) to remove python-libvirt / install virtualbmc
13:45:01 <jmorgan1> i think this issue is resolved
13:45:26 <electrocucaracha> but we didn't find any patch solving that issue
13:45:26 <jmorgan1> fdegir: progress is coming along
13:45:58 <jmorgan1> create-vm-nodes was the 1st problem we had to solve
13:46:23 <fdegir> but now that issue is not there anymore?
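[Editor's note: the version/refspec pair fdegir describes above resolves to plain git operations roughly like this. The repository URL and the change refspec are illustrative placeholders, not the actual scenario entry.]

```shell
# Fetch a scenario straight from a Gerrit change, the way an
# opnfv-scenario-requirements.yml entry is resolved: "refspec" is the
# Gerrit change refspec and "version" is the commit SHA to check out.
# URL and refspec below are placeholders.
git clone https://gerrit.opnfv.org/gerrit/releng-xci scenario
cd scenario
git fetch origin refs/changes/45/58945/17   # the change's refspec
git checkout FETCH_HEAD                     # or the exact SHA as "version"
```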
13:46:28 <jmorgan1> next we are getting opnfv-scenario-requirements.yml working
13:46:34 <jmorgan1> right
13:46:36 <fdegir> ok
13:46:45 <jmorgan1> might have been fixed upstream
13:47:03 <jmorgan1> when installing the virtualbmc playbook, package names changed
13:47:18 <jmorgan1> from libvirt-python to python-libvirt
13:47:49 <electrocucaracha> we tested on the ubuntu distro
13:47:51 <jmorgan1> we created a patch to the playbook to remove the old package first, then install the new one with the different name
13:48:20 <jmorgan1> currently working on opnfv-scenario-requirements.yml
13:49:03 <jmorgan1> then we ran into the nested virt issues
13:49:18 <jmorgan1> that's all
13:49:28 <fdegir> thanks jmorgan1, electrocucaracha
13:49:40 <fdegir> #topic os-nosdn-osm
13:49:59 <fdegir> the scenario is merged but the patch integrating the scenario hit the slave issues
13:50:12 <fdegir> should be done soon once we get the slaves back
13:50:24 <fdegir> #topic os-odl-osm_sfc
13:50:42 <fdegir> i'll try to get this started and then pass it to mbuil
13:51:00 <fdegir> we will also meet with the osm guys regarding the next steps
13:51:15 <fdegir> like what we should test and progress with opensuse support
13:51:21 <mbuil> fdegir: this will need to happen in the H release probably.
No cycles unless somebody helps :P
13:51:25 <fdegir> if anyone is interested in joining, ping me
13:51:31 <fdegir> mbuil: that's fine
13:51:47 <fdegir> cause we really don't know how we can test the basic osm yet
13:52:00 <fdegir> so if we can get those pieces for os-nosdn-osm during the G release
13:52:03 <fdegir> i'd be happy
13:52:33 <fdegir> #topic k8-odl-coe
13:52:41 <fdegir> i don't see Peri online so moving on
13:52:53 <fdegir> #topic k8-nosdn-istio
13:53:14 <fdegir> hw_wutianwei_ might be away as well
13:53:46 <fdegir> #topic Testing Objectives
13:53:57 <fdegir> jmorgan1: let's talk about this
13:54:17 <fdegir> for patch verification, we run things in VMs
13:54:36 <fdegir> we do a deployment using the mini flavor and run an old version of functest healthcheck
13:55:12 <fdegir> the reason for running things in VMs (nested virt) is to ensure we always start with a clean VM, to prevent environment related issues causing unnecessary failures
13:55:20 <fdegir> and to utilize the slaves more
13:55:36 <fdegir> the next testing is during post merge
13:55:49 <fdegir> the idea is to run functest smoke for merged patches
13:55:55 <jmorgan1> ok, let me explain
13:56:00 <fdegir> and then we take the scenario to baremetal with full functest and yardstick
13:56:06 <fdegir> jmorgan1: ok
13:56:21 <jmorgan1> i think i got my answer with our discussion the last two days
13:56:42 <jmorgan1> we have nested virtualization as you pointed out, and the baremetal that mbuil is working on
13:56:59 <jmorgan1> no one is doing non-nested VM testing
13:57:25 <jmorgan1> i was curious about what others were doing when we first started getting the nested virt errors
13:57:33 <fdegir> others meaning?
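[Editor's note: the three test levels fdegir lays out above (per-patch verify in nested VMs, post-merge, baremetal) can be summarised as a small dispatcher. Stage and variable names here are illustrative, not the actual CI job parameters.]

```shell
#!/bin/sh
# Map a pipeline stage to the test suite described above.
# Names are illustrative only.
stage="${1:-verify}"
case "$stage" in
  verify)     suite="healthcheck" ;;      # per patch: mini flavor, nested VMs
  post-merge) suite="smoke" ;;            # for merged patches
  baremetal)  suite="full+yardstick" ;;   # full functest plus yardstick
  *) echo "unknown stage: $stage" >&2; exit 1 ;;
esac
echo "stage=$stage suite=$suite"
```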
13:57:40 <jmorgan1> the xci team
13:57:58 <jmorgan1> i think it might be a good use case in the future to have non-nested vm support
13:58:00 <fdegir> i do the same
13:58:07 <fdegir> use ubuntu_xci_vm
13:58:15 <fdegir> but this doesn't mean you have to
13:58:24 <fdegir> you can directly execute xci-deploy.sh
13:58:36 <fdegir> it should be better now given that all the ironic/bifrost stuff is pushed into the opnfv vm
13:59:47 <jmorgan1> yup, i'll look into it when i have time, thanks
13:59:58 <fdegir> ok
14:00:06 <fdegir> #topic Multi-distro Support for New Scenarios
14:00:23 <fdegir> so we want to support all the distros we have
14:00:27 <fdegir> but it is not always possible
14:00:35 <jmorgan1> what is the expectation here? just curious as a new scenario owner
14:00:50 <fdegir> i say start with one distro but develop the scenario in a way that new distro support can easily be introduced
14:01:10 <fdegir> and if you find time, implement support for the other distros as well
14:01:41 <jmorgan1> how do we know which distro is supported? been tested?
14:02:04 <fdegir> ubuntu and opensuse are supported for most of the scenarios
14:02:13 <fdegir> centos is kind of tricky since it is broken in upstream osa
14:02:26 <fdegir> so you can start with opensuse or ubuntu
14:03:35 <fdegir> if you are basing your scenario on an existing one, opnfv-scenario-requirements.yml tells you which distros are supported for that scenario
14:04:04 <fdegir> we are over time
14:04:15 <fdegir> #topic XCI nodes in LF Portland lab
14:04:26 <fdegir> jmorgan1: what about these nodes?
14:04:39 <jmorgan1> i added three new systems for xci
14:04:55 <fdegir> it says "LF Portland Lab"
14:04:55 <jmorgan1> i was thinking of using them as development systems
14:05:03 <fdegir> I suppose it should be Intel Portland Lab, or?
14:05:17 <jmorgan1> but we might want to use them for CI only
14:05:25 <jmorgan1> no, not correct
14:05:40 <jmorgan1> they are in the LF lab, with LF3,4,5
14:05:47 <fdegir> oh
14:05:52 <fdegir> ok
14:05:55 <jmorgan1> thus, LF Portland lab ;)
14:06:18 <fdegir> which pod are these nodes added to?
14:06:32 <jmorgan1> no pod, they are on their own
14:06:38 <fdegir> ok
14:06:52 <jmorgan1> each system has one of the supported OSes
14:07:11 <jmorgan1> anyway, let's think about how to use them
14:07:18 <fdegir> yes
14:07:24 <jmorgan1> either development hosts or for CI
14:07:31 <fdegir> we can talk about that separately
14:07:32 <jmorgan1> CI only i mean
14:07:41 <fdegir> we need to go through the resource situation soon anyway
14:07:44 <jmorgan1> i wanted to share that we have them
14:08:33 <fdegir> thanks for that
14:08:47 <fdegir> let's move to the last topic
14:09:03 <fdegir> #topic XCI Developer Guide
14:09:11 <fdegir> jmorgan1: thanks for looking into it
14:09:39 <fdegir> jmorgan1: anything else you want to say?
14:09:48 <jmorgan1> it's empty ;)
14:10:04 <jmorgan1> i don't mind starting to work on it, but I'll need some help
14:10:16 <jmorgan1> as the least knowledgeable person
14:10:17 <fdegir> jmorgan1: I think Tapio started working on it
14:10:22 <fdegir> can you take a look at it
14:10:24 <jmorgan1> good to know
14:10:28 <fdegir> and perhaps incorporate the comments I posted
14:10:40 <jmorgan1> do you have a link?
14:10:51 <fdegir> https://gerrit.opnfv.org/gerrit/#/c/51445/
14:10:59 <fdegir> please feel free to amend
14:11:04 <jmorgan1> thanks
14:11:27 <fdegir> np
14:11:43 <fdegir> then we end the meeting here if there is nothing more
14:11:50 <fdegir> thank you all for joining
14:11:53 <hwoarang> thank you too
14:11:58 <jmorgan1> thanks
14:11:59 <fdegir> and welcome back to the ones who were on holiday
14:12:10 <fdegir> and have a nice holiday, to the ones that are going soon
14:12:14 <fdegir> talk to you in 2 weeks
14:12:16 <fdegir> #endmeeting