14:00:31 <fdegir> #startmeeting OpenStack 3rd Party CI
14:00:31 <collabot> Meeting started Wed Sep 28 14:00:31 2016 UTC. The chair is fdegir. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:31 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:31 <collabot> The meeting name has been set to 'openstack_3rd_party_ci'
14:00:46 <fdegir> #topic Roll Call
14:00:56 <fdegir> anyone around?
14:01:00 <yolanda> hi
14:01:00 <Julien-zte> #info Julien
14:01:03 <Julien-zte> hi
14:01:06 <fdegir> hi
14:01:10 <fdegir> #info Fatih Degirmenci
14:01:15 <qiliang> #info qiliang
14:01:19 <jmorgan1> #info Jack Morgan
14:01:25 <yolanda> #info Yolanda
14:01:54 <fdegir> #topic Status Update
14:02:00 <fdegir> I think yolanda has some news for us
14:02:12 <fdegir> yolanda: stage is yours
14:02:22 <yolanda> so I got baremetal provisioning working with bifrost, on POD5
14:02:43 <yolanda> i have not been able to deploy a full cloud, but at least the bifrost part works fine
14:02:45 <fdegir> #info yolanda got baremetal provisioning working with bifrost, on LF POD5
14:02:50 <hwoarang> #info Markos
14:03:01 <Julien-zte> good
14:03:08 <yolanda> also i went to the Infra Mid-cycle last week, and exposed some of our needs
14:03:32 <fdegir> #info yolanda attended the OpenStack Infra Mid-cycle meetup last week and exposed some of OPNFV's needs
14:03:32 <yolanda> Infra is fine with us implementing HA and different network configs in the puppet-infracloud modules, as long as we put in the effort from the OPNFV side
14:03:48 <yolanda> and don't affect the current functionality of the infracloud modules, and don't make them too complex
14:03:56 <fdegir> #info OpenStack Infra is fine with us implementing HA and different network configs in the puppet-infracloud modules, as long as we put in the effort from the OPNFV side
14:04:08 <fdegir> #info without affecting the current functionality of the infracloud modules and without making them too complex
14:04:23 <fdegir> yolanda: any luck with getting help from them to make stuff more flexible?
14:04:42 <fdegir> yolanda: as you personally found out, the things in upstream are pretty tied to their needs
14:05:00 <Julien-zte> yolanda, is the OpenStack Infra Mid-cycle meetup a F2F meeting?
14:05:03 <yolanda> fdegir, they will review things, but there are no people in Infra right now who can put effort into it
14:05:11 <fdegir> yolanda: I mean, do you think they will make stuff a bit more configurable going forward?
14:05:16 <yolanda> apart from myself
14:05:41 <yolanda> fdegir, i am in the process of making these things more flexible. Mostly Ricardo Carrillo and myself are the ones working on that effort
14:06:20 <qiliang> so we can push HA related code to the openstack community when we implement it?
14:06:20 <fdegir> yep
14:06:36 <fdegir> qiliang: that 'yep' wasn't for your question
14:07:02 <fdegir> qiliang: but I think they would be happy if the ha things go there
14:07:08 <fdegir> in fact yolanda has a patch there
14:07:24 <qiliang> i think so :)
14:07:30 <yolanda> yep, well, i just added the pacemaker module
14:07:35 <yolanda> but no patch to have ha
14:07:51 <fdegir> yolanda is core there so whatever she lets in goes in :)
14:08:05 <qiliang> great
14:08:15 <yolanda> yep, as long as it is sane and doesn't add extra complexity, it can go in
14:08:30 <qiliang> i see
14:08:34 <yolanda> the same will apply for OVS instead of linuxbridge, i think it's an important feature to have
14:08:38 <fdegir> anything else you want to mention, yolanda?
14:09:04 <yolanda> i think it's all from my side, i'm trying to find time to test a full deploy but this week is difficult
14:09:15 <fdegir> one last question
14:09:29 <fdegir> sorry for asking this but how do you see our chances of having something for summit?
14:09:51 <fdegir> on baremetal I mean
14:10:08 <yolanda> so i have hopes that we can have something working, but i'm not totally sure if we can automate 100%
14:10:26 <yolanda> because there are still some problems with the bifrost playbook, and also the enroll/deploy steps are not automated
14:10:40 <fdegir> that should be fine as long as we can say we are able to provision/deploy
14:10:44 <fdegir> automation can be fixed later on
14:10:59 <fdegir> hopefully by that time we are all up to speed, helping out more
14:10:59 <yolanda> then i hope it can be done because the hardest part, that is the provisioning, is working
14:11:05 <yolanda> now it's a matter of ssh to the nodes and run puppet there
14:11:58 <fdegir> thanks yolanda
14:12:06 <fdegir> hwoarang: your turn
14:12:22 <hwoarang> right. so suse host support is there for bifrost
14:12:42 <fdegir> hwoarang: just to make sure I record it right
14:12:56 <hwoarang> sure. so you can run bifrost on suse hosts
14:12:56 <fdegir> we can run provisioning of trusty on a suse host?
14:13:02 <fdegir> or centos
14:13:23 <hwoarang> yeah, you need some tweaks to make diskimage-builder build such images on foreign hosts
14:13:27 <hwoarang> but it should be possible
14:13:30 <fdegir> #info SUSE host support is now available for bifrost
14:13:51 <hwoarang> i haven't tried that because i'm having trouble with suse vms at the moment
14:13:58 <hwoarang> tl;dr is that diskimage-builder does not support minimal opensuse images
14:14:22 <Julien-zte> fdegir, do we have any definition for the VMs' operating system?
14:14:25 <hwoarang> so this causes 2 problems. 1) the generated images are huge so it takes forever to flush them to the vdisk when you pxe boot
14:14:44 <hwoarang> 2) cloud-init and glean run in parallel on such images so you get funny results
14:15:03 <fdegir> #info diskimage-builder does not support minimal opensuse images
14:15:09 <fdegir> #info This causes 2 problems
14:15:15 <hwoarang> and glean has broken suse support, so network is broken and provisioning never completes. i have submitted a patch for that
14:15:17 <fdegir> #info 1. the generated images are huge so it takes forever to flush them to the vdisk when you pxe boot
14:15:31 <fdegir> #info 2. cloud-init and glean run in parallel on such images so you get funny results
14:15:44 <hwoarang> so vm support is nearly there but not quite
14:15:53 <fdegir> #info glean has broken suse support, so network is broken and provisioning never completes. hwoarang has submitted a patch for that
14:16:11 <hwoarang> i could perhaps try and use suse host + centos vm. if that's good enough then we can use this job for the time being if needed
14:16:12 <fdegir> hwoarang: so we'll wait until it is totally done to enable our unstable jenkins jobs
14:16:19 <hwoarang> but i'd rather keep the host to finish the work.
14:16:21 <fdegir> no point spamming
14:16:31 <fdegir> and taking your machine away :)
14:16:34 <hwoarang> indeed
14:16:48 <fdegir> Julien-zte: we plan to have centos on centos, trusty on trusty and suse on suse
14:17:05 <Julien-zte> good
14:17:19 <fdegir> hwoarang: you were planning to ask some stuff as well
14:17:26 <qiliang> fdegir: what do we have now?
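[Supplementary note on the diskimage-builder discussion above: a minimal sketch of the kind of image build being talked about. It assumes diskimage-builder is installed on the build host; the element list and output name are illustrative only, not the exact configuration of the OPNFV jobs.]

    #!/usr/bin/env python3
    """Sketch: build a small deployment image with diskimage-builder.

    Assumes the `disk-image-create` tool from diskimage-builder is on PATH.
    The "-minimal" base elements keep the resulting image small, which is
    what makes PXE booting the VMs tolerable, as noted in the discussion.
    """
    import subprocess

    elements = [
        "ubuntu-minimal",  # minimal base OS element (no minimal opensuse element at the time)
        "vm",              # partitioning/bootloader bits for a bootable disk image
        "simple-init",     # installs glean rather than cloud-init for boot-time network config
    ]

    # Build the image; raises CalledProcessError if the build fails.
    subprocess.run(
        ["disk-image-create", "-o", "deployment_image"] + elements,
        check=True,
    )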
14:17:44 <qiliang> seems suse/centos/trusty hosts are all supported
14:17:50 <fdegir> qiliang: we have trusty on trusty, centos on centos and you can run bifrost on suse
14:17:56 <hwoarang> fdegir: my questions were answered when i read the new job deployment. i wanted to ask you what kind of images you deploy but then i saw you use the -minimal elements
14:18:13 <qiliang> fdegir: got it, thx
14:18:19 <fdegir> finally I did something useful :)
14:18:31 <fdegir> before moving on to others
14:18:46 <fdegir> I want to bring up this bifrost instability issue
14:19:04 <fdegir> as you noticed in your junk mail folders, our jenkins jobs fail from time to time
14:19:13 <hwoarang> ^_^
14:19:27 <fdegir> I couldn't figure out what the problem could be after checking things manually
14:19:37 <fdegir> yolanda's permission fix patch is still in the works
14:19:51 <fdegir> but even without that patch and with our chmod 755 /httpboot
14:19:51 <yolanda> yep, i was hitting some errors
14:19:56 <fdegir> that issue shouldn't be there
14:20:04 <hwoarang> latest logs suggest that 1 out of 3 vms fails. so i don't think it's the same perm issue
14:20:16 <hwoarang> if one of them manages to grab the files, so will the rest
14:20:24 <Julien-zte> yah, currently the bifrost CI is not in a stable status either
14:20:45 <fdegir> and the other thing I noticed
14:20:53 <fdegir> when I logged in to one of the machines
14:21:02 <fdegir> the /httpboot and /tftpboot folders were empty
14:21:14 <fdegir> even though bifrost said build-dib-images completed
14:21:59 <Julien-zte> is anyone still here? or am I lost?
14:22:00 <fdegir> I'm not sure how to tackle all these different types of issues
14:22:20 <fdegir> Julien-zte: we are here
14:22:37 <fdegir> anyway, let's move on
14:22:46 <fdegir> Julien-zte: your turn
14:23:32 <yolanda> and no failures in the log for that creation?
14:23:43 <fdegir> yolanda: nope
14:23:57 <hwoarang> the last log suggests a git problem too
14:23:59 <fdegir> VMs fail pxe booting
14:24:06 <fdegir> cause the stuff is not there
14:24:25 <hwoarang> so are we sure the hardware is ok? :)
14:24:43 <fdegir> jmorgan1 has been quiet for 2 weeks
14:24:49 <jmorgan1> what would be wrong with the hardware?
14:25:04 <hwoarang> let's say the git failures could be related to upstream
14:25:05 <jmorgan1> like a bad disk?
14:25:36 <hwoarang> don't know. do you track their health somehow?
14:25:50 <Julien-zte> there have been several git failures in recent days in the ZTE pods too
14:26:03 <jmorgan1> LF-POD5 is in the Linux Foundation lab so i don't know what they do
14:26:26 <fdegir> jmorgan1: the failures we're talking about occurred on the lf pod4 jumphost and pod1
14:26:28 <jmorgan1> git failures mean a timeout?
14:26:38 <fdegir> intel pod4 sorry
14:26:41 <yolanda> those tests from fdegir are not on pod5
14:26:50 <hwoarang> missing ref in the upstream git repo
14:27:02 <yolanda> that looks like an upstream failure, right
14:27:08 <hwoarang> perhaps some caching, load balancing etc on upstream i'd say
14:27:13 <hwoarang> so i wouldn't worry much
14:27:24 <hwoarang> but the other failures sound more suspicious
14:27:32 <fdegir> agree
14:27:38 <jmorgan1> what other failures?
14:27:56 <fdegir> provisioning failures
14:28:10 <fdegir> we have more failed runs than successful runs
14:28:13 <jmorgan1> those might be hardware related?
14:28:25 <fdegir> that's what hwoarang suspects :)
14:28:43 <jmorgan1> but specifically what?
14:28:57 <jmorgan1> filesystem issue? hard disk? network related?
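[Supplementary note on the empty /httpboot issue above: a hedged sketch of the kind of sanity check a job could run right after the image-build step, before the VMs try to PXE boot. The paths are bifrost's defaults as discussed in the log; everything else is illustrative and not verified against the failing slaves.]

    #!/usr/bin/env python3
    """Sketch: verify that the image-build step actually left artifacts behind.

    /httpboot and /tftpboot are the default bifrost locations; adjust if the
    deployment uses different paths.
    """
    import sys
    from pathlib import Path

    def count_entries(path: str) -> int:
        """Return the number of entries in a directory (0 if missing/empty)."""
        p = Path(path)
        entries = list(p.iterdir()) if p.is_dir() else []
        print(f"{path}: {len(entries)} entries")
        return len(entries)

    # Check both directories so the log shows the state of each one.
    results = [count_entries(d) for d in ("/httpboot", "/tftpboot")]
    if not all(results):
        # Fail fast with a clear message instead of letting PXE boot time out later.
        sys.exit("image artifacts missing; aborting before enroll/deploy")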
14:29:35 <hwoarang> i would suspect an FS issue but i can't say much from the log. there is no obvious reason there
14:29:55 <jmorgan1> what FS are we using?
14:30:33 <hwoarang> btw i think it's only the trusty hosts that fail so badly
14:30:42 <hwoarang> the centos one just failed because of git errors
14:31:31 <hwoarang> i don't know anything about the trusty slave so no idea about the FS
14:31:39 <hwoarang> did you try to turn it off and on again? :)
14:32:11 <fdegir> I can try that
14:32:41 <hwoarang> it might trigger an fsck during boot
14:32:43 <hwoarang> anyway
14:32:51 <fdegir> we can continue talking about this later on after the reboot
14:32:56 <hwoarang> yes
14:32:59 <fdegir> Julien-zte: are you here?
14:33:05 <Julien-zte> yes
14:33:11 <fdegir> anything you want to say?
14:33:16 <Julien-zte> just got lost
14:33:41 <Julien-zte> the last issue has been resolved, and bifrost CI runs correctly in our inner lab
14:34:12 <Julien-zte> busy with other things and not enough time this week.
14:34:32 <Julien-zte> good news is that the Daisy team is interested in bifrost
14:34:42 <fdegir> Julien-zte: I noticed that
14:34:46 <Julien-zte> and they will invest some resources in Daisy
14:34:53 <Julien-zte> for bifrost
14:35:02 <fdegir> and then we get stuff for free
14:35:03 <fdegir> cool
14:35:08 <Julien-zte> yes!
14:35:12 <jmorgan1> no you get more work
14:35:21 <fdegir> that's what you think
14:35:24 <Julien-zte> I'm persuading them to be quick with the decision
14:35:51 <fdegir> #info Bifrost CI runs correctly in ZTE inner lab
14:35:59 <fdegir> #info Daisy is also looking at bifrost
14:36:15 <fdegir> Julien-zte: persuade harder
14:36:38 <fdegir> qiliang: how is it going?
14:36:44 <Julien-zte> OK, I will do it, fdegir
14:36:50 <Julien-zte> and bifrost will be used in the installer project :)
14:36:56 <fdegir> qiliang: have you been able to look at what we have been doing?
14:37:01 <fdegir> Julien-zte: :)
14:37:16 <yolanda> it's great to see more bifrost usage
14:37:23 <Julien-zte> hi qiliang, sorry for not troubleshooting your issue
14:37:27 <Julien-zte> is it resolved?
14:38:11 <fdegir> qiliang: ^
14:38:32 <qiliang> yes, i've successfully run bifrost and puppet-infracloud in the huawei lab.
14:38:33 <fdegir> moving to jmorgan1 then
14:38:44 <Julien-zte> yes, yolanda, is centos on centos currently stable?
14:38:51 <fdegir> qiliang: by following the readme files?
14:39:12 <jmorgan1> i've not done anything yet, my pod is being used by others atm
14:39:16 <fdegir> cause it is good to hear it works in other places
14:39:27 <jmorgan1> i will look to get it back next week
14:39:35 <yolanda> Julien-zte, i have not tested in the last 2 weeks but when i left, centos-on-centos was working
14:39:43 <qiliang> and taken a look at some of the bifrost and puppet related code.
14:40:04 <Julien-zte> yolanda, OK, I will assign some work for testing
14:40:24 <fdegir> thx jmorgan1
14:40:37 <fdegir> updates from me
14:40:49 <qiliang> sorry i just got lost
14:40:55 <fdegir> #info we now have jobs to verify opnfv/bifrost and openstack/bifrost patches
14:41:02 <fdegir> even though they fail, it feels good
14:41:11 <fdegir> #link https://build.opnfv.org/ci/view/3rd%20Party%20CI/
14:41:33 <fdegir> #info Once the SUSE work is done, those 2 jobs will also be enabled
14:41:35 <qiliang> fdegir: i've successfully run bifrost and puppet-infracloud in the huawei lab, and taken a look at some of the bifrost and puppet related code.
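[Supplementary note on "ssh to the nodes and run puppet there" and on qiliang's bifrost + puppet-infracloud run: a rough sketch of what that step could look like once bifrost has provisioned the hosts. The node names, manifest path, and module path are placeholders, not the actual puppet-infracloud layout used in the labs.]

    #!/usr/bin/env python3
    """Sketch: apply a puppet manifest on freshly provisioned nodes over ssh.

    Assumes key-based ssh access to the nodes and puppet already installed
    on them; "controller00"/"compute00" and the paths are illustrative only.
    """
    import subprocess

    NODES = ["controller00", "compute00"]  # placeholder hostnames

    for node in NODES:
        # Run puppet apply remotely; check=True aborts on the first failing node.
        subprocess.run(
            [
                "ssh", f"root@{node}",
                "puppet", "apply",
                "--modulepath=/etc/puppet/modules",  # assumed module location
                "/etc/puppet/manifests/site.pp",     # assumed manifest
            ],
            check=True,
        )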
14:41:38 <fdegir> having more failures
14:41:41 <hwoarang> yeah thanks for cleaning the job up
14:42:09 <fdegir> #info qiliang has successfully run bifrost and puppet-infracloud in the huawei lab, and taken a look at some of the bifrost and puppet related code.
14:42:21 <fdegir> I just got rid of the daily jobs
14:42:39 <fdegir> since we lack machines and I wasn't so sure about the use of them for the time being
14:43:02 <fdegir> we can recreate them if anyone feels it is good to have them
14:43:18 <hwoarang> the triggered ones are good for now i believe
14:43:59 <fdegir> good
14:44:07 <fdegir> I think that's all of us
14:44:09 <Julien-zte> agree
14:44:11 <Julien-zte> fdegir, I just got lost when you mentioned "finally I did something useful :)" - what is the useful thing? or I will check the meeting records after the meeting.
14:44:13 <fdegir> anyone want to bring anything up?
14:44:17 <jmorgan1> what about summit?
14:44:29 <fdegir> Julien-zte: https://build.opnfv.org/ci/view/3rd%20Party%20CI/
14:44:39 <jmorgan1> any details on the meetup?
14:44:45 <fdegir> Julien-zte: we verify both openstack/bifrost and opnfv/bifrost patches
14:44:57 <fdegir> jmorgan1: good point
14:45:03 <Julien-zte> fdegir that's great
14:45:10 <fdegir> jmorgan1: I will create an etherpad and send the link so people can sign up and add topics
14:45:20 <Julien-zte> I will study the jenkins job
14:45:33 <jmorgan1> we have the opnfv board F2F on Friday
14:45:54 <jmorgan1> so we might have our meetup early
14:46:08 <fdegir> I can add some random days/times there
14:46:18 <fdegir> and see which one is more popular
14:46:48 <hwoarang> so we don't have a room or something right? sounds like an ad-hoc f2f meeting :)
14:47:01 <fdegir> we can have it on the beach
14:47:02 <jmorgan1> find a table or cafe?
14:47:12 <hwoarang> sounds good :)
14:47:24 <fdegir> don't know how the weather would be
14:47:27 <Julien-zte> ok for me. I will be there till Saturday
14:47:42 <fdegir> before we end the meeting
14:48:03 <fdegir> there will be some discussions regarding 3rd party ci and a similar topic during the infra wg meeting in 12 minutes
14:48:03 <jmorgan1> qiliang: will you be at openstack summit?
14:48:10 <fdegir> please call in if you are interested
14:48:15 <fdegir> https://wiki.opnfv.org/display/INF/Infra+Working+Group
14:48:31 <hwoarang> ok
14:48:32 <fdegir> that's all for the day
14:48:39 <qiliang> jmorgan1: sorry, today my boss told me NO. :(
14:48:43 <fdegir> thank you for joining
14:48:46 <fdegir> #endmeeting