14:00:31 #startmeeting OpenStack 3rd Party CI
14:00:31 Meeting started Wed Sep 28 14:00:31 2016 UTC. The chair is fdegir. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:31 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:31 The meeting name has been set to 'openstack_3rd_party_ci'
14:00:46 #topic Roll Call
14:00:56 anyone around?
14:01:00 hi
14:01:00 #info Julien
14:01:03 hi
14:01:06 hi
14:01:10 #info Fatih Degirmenci
14:01:15 #info qiliang
14:01:19 #info Jack Morgan
14:01:25 #info Yolanda
14:01:54 #topic Status Update
14:02:00 I think yolanda has some news for us
14:02:12 yolanda: the stage is yours
14:02:22 so I got baremetal provisioning working with bifrost, on POD5
14:02:43 i have not been able to deploy a full cloud, but at least the bifrost part works fine
14:02:45 #info yolanda got baremetal provisioning working with bifrost, on LF POD5
14:02:50 #info Markos
14:03:01 good
14:03:08 also i went to the Infra Mid-cycle last week, and raised some of our needs
14:03:32 #info yolanda attended the OpenStack Infra Mid-cycle meetup last week and raised some of OPNFV's needs
14:03:32 Infra is fine with us implementing HA and different network configs in the puppet-infracloud modules, as long as we put in the effort from the OPNFV side
14:03:48 and don't affect the current functionality of the infracloud modules, and don't make them too complex
14:03:56 #info OpenStack Infra is fine with us implementing HA and different network configs in the puppet-infracloud modules, as long as we put in the effort from the OPNFV side
14:04:08 #info without affecting the current functionality of the infracloud modules, and without making them too complex
14:04:23 yolanda: any luck getting help from them to make stuff more flexible?
14:04:42 yolanda: as you personally found out, the things upstream are pretty tied to their needs
14:05:00 yolanda, is the OpenStack Infra Mid-cycle meetup a F2F meeting?
14:05:03 fdegir, they will review things, but there are no people in Infra right now who can put effort into it
14:05:11 yolanda: I mean, do you think they will make stuff a bit more configurable going forward?
14:05:16 apart from myself
14:05:41 fdegir, i am in the process of making these things more flexible. Mostly Ricardo Carrillo and myself are the ones working on that effort
14:06:20 so we can push HA related code to the openstack community when we implement it?
14:06:20 yep
14:06:36 qiliang: yes, that was for your question
14:07:02 qiliang: but I think they would be happy if the ha things go there
14:07:08 in fact yolanda has a patch there
14:07:24 i think so :)
14:07:30 yep, well, i just added the pacemaker module
14:07:35 but no patch to have ha
14:07:51 yolanda is core there so whatever she lets in goes in :)
14:08:05 great
14:08:15 yep, as long as it is sane and doesn't add extra complexity, it can go in
14:08:30 i see
14:08:34 the same will apply for OVS instead of linuxbridge, i think it's an important feature to have
14:08:38 anything else you want to mention yolanda?
14:09:04 i think that's all from my side, i'm trying to find time to test a full deploy but this week is difficult
14:09:15 one last question
14:09:29 sorry for asking this but how do you see our chances of having something for the summit?
14:09:51 on baremetal I mean
14:10:08 so i have hopes that we can have something working, but i'm not totally sure if we can automate 100%
14:10:26 because there are still some problems with the bifrost playbook, and also the enroll/deploy steps are not automated
14:10:40 that should be fine as long as we can say we are able to provision/deploy
14:10:44 automation can be fixed later on
14:10:59 hopefully by that time we are all up to speed, helping out more
14:10:59 then i hope it can be done because the hardest part, that is the provisioning, is working
14:11:05 now it's a matter of ssh'ing to the nodes and running puppet there
14:11:58 thanks yolanda
14:12:06 hwoarang: your turn
14:12:22 right.
so suse host support is there for bifrost
14:12:42 hwoarang: just to make sure I record it right
14:12:56 sure. so you can run bifrost on suse hosts
14:12:56 can we run provisioning of trusty on a suse host?
14:13:02 or centos
14:13:23 yeah, you need some tweaks to make diskimage-builder build such images on foreign hosts
14:13:27 but it should be possible
14:13:30 #info SUSE host support is now available for bifrost
14:13:51 i haven't tried that because i have troubles with suse vms at the moment
14:13:58 tl;dr: diskimage-builder does not support minimal opensuse images
14:14:22 fdegir, do we have any definition for the VMs' operating system?
14:14:25 so this causes 2 problems. 1) the generated images are huge so it takes forever to flush them to the vdisk when you pxe boot
14:14:44 2) cloud-init and glean run in parallel on such images so you get funny results
14:15:03 #info diskimage-builder does not support minimal opensuse images
14:15:09 #info This causes 2 problems
14:15:15 and glean has broken suse support, so networking is broken and provisioning never completes. i have submitted a patch for that
14:15:17 #info 1. the generated images are huge so it takes forever to flush them to the vdisk when you pxe boot
14:15:31 #info 2. cloud-init and glean run in parallel on such images so you get funny results
14:15:44 so vm support is nearly there but not quite
14:15:53 #info glean has broken suse support, so networking is broken and provisioning never completes. hwoarang has submitted a patch for that
14:16:11 i could perhaps try and use a suse host + centos vm. if that's good enough then we can use this job for the time being if needed
14:16:12 hwoarang: so we wait until it is totally done before enabling our unstable jenkins jobs
14:16:19 but i'd rather keep the host to finish the work.
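[Editor's note: for context on the image builds discussed above, a minimal sketch of a diskimage-builder invocation for a minimal trusty image. The element list and output name are assumptions (the log only says "-minimal elements"); nothing is actually built here, since a real build needs root, network access, and diskimage-builder installed — the sketch only assembles and prints the command line.]

```shell
# Sketch only: assemble the disk-image-create command bifrost's
# build-dib-images step is assumed to use for a minimal trusty image.
# Nothing is built; the command line is just printed for inspection.
DIB_RELEASE=${DIB_RELEASE:-trusty}          # target release (assumption)
ELEMENTS="ubuntu-minimal vm serial-console" # "-minimal" elements (assumption)
CMD="disk-image-create -o deployment_image $ELEMENTS"
echo "would run: DIB_RELEASE=$DIB_RELEASE $CMD"
```

The "-minimal" elements matter for the size problem mentioned above: non-minimal images pull in a full distro install, which is what makes flushing them to the vdisk during PXE boot so slow.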
14:16:21 no point spamming
14:16:31 and taking your machine away :)
14:16:34 indeed
14:16:48 Julien-zte: we plan to have centos on centos, trusty on trusty and suse on suse
14:17:05 good
14:17:19 hwoarang: you were planning to ask some stuff as well
14:17:26 fdegir: what do we have now?
14:17:44 seems suse/centos/trusty hosts are all supported
14:17:50 qiliang: we have trusty on trusty, centos on centos and you can run bifrost on suse
14:17:56 fdegir: my questions were answered when i read the new job deployment. i wanted to ask you what kind of images you deploy but then i saw you use the -minimal elements
14:18:13 fdegir: got it, thx
14:18:19 finally I did something useful :)
14:18:31 before moving on to others
14:18:46 I want to bring up this bifrost instability issue
14:19:04 as you noticed in your junk mail folders, our jenkins jobs fail from time to time
14:19:13 ^_^
14:19:27 I couldn't figure out what the problem could be after checking things manually
14:19:37 yolanda's permission fix patch is still in the works
14:19:51 but even without that patch and with our chmod 755 /httpboot
14:19:51 yep, i was hitting some errors
14:19:56 that issue shouldn't be there
14:20:04 the latest logs suggest that 1 out of 3 vms fails. so i don't think it's the same perm issue
14:20:16 if one of them manages to grab the files, so will the rest
14:20:24 yeah, currently the bifrost CI is not in a stable state either
14:20:45 and the other thing I noticed
14:20:53 when I logged in to one of the machines
14:21:02 the /httpboot and /tftpboot folders were empty
14:21:14 even though bifrost said build-dib-images had completed
14:21:59 is anyone still here? or am I lost?
14:22:00 I'm not sure how to tackle all these different types of issues
14:22:20 Julien-zte: we are here
14:22:37 anyway, let's move on
14:22:46 Julien-zte: your turn
14:23:32 and no failures in the log for that creation?
14:23:43 yolanda: nope
14:23:57 the last log suggests a git problem too
14:23:59 VMs fail pxe booting
14:24:06 because the stuff is not there
14:24:25 so are we sure the hardware is ok? :)
14:24:43 jmorgan1 has been quiet for 2 weeks
14:24:49 what would be wrong with the hardware?
14:25:04 let's say the git failures could be related to upstream
14:25:05 like a bad disk?
14:25:36 don't know. do you track their health somehow?
14:25:50 there have been several git failures in recent days in the ZTE pods
14:26:03 LF-POD5 is in the Linux Foundation lab so i don't know what they do
14:26:26 jmorgan1: the failures we're talking about occurred on the lf pod4 jumphost and pod1
14:26:28 do git failures mean a timeout?
14:26:38 intel pod4 sorry
14:26:41 the tests from fdegir are not on pod5
14:26:50 missing ref in the upstream git repo
14:27:02 that looks like an upstream failure, right
14:27:08 perhaps some caching, load balancing etc on upstream i'd say
14:27:13 so i wouldn't worry much
14:27:24 but the other failures sound more suspicious
14:27:32 agree
14:27:38 what other failures?
14:27:56 provisioning failures
14:28:10 we have more failed runs than successful runs
14:28:13 those might be hardware related?
14:28:25 that's what hwoarang suspects :)
14:28:43 but specifically what?
14:28:57 a filesystem issue? hard disk? network related?
14:29:35 i would suspect an FS issue but i can't say much from the log. there is no obvious reason there
14:29:55 what FS are we using?
14:30:33 btw i think it's only the trusty host that fails so badly
14:30:42 the centos one just failed because of git errors
14:31:31 i don't know anything about the trusty slave so no idea about the FS
14:31:39 did you try turning it off and on again? :)
14:32:11 I can try that
14:32:41 it might trigger an fsck during boot
14:32:43 anyway
14:32:51 we can continue talking about this later on after the reboot
14:32:56 yes
14:32:59 Julien-zte: are you here?
14:33:05 yes
14:33:11 anything you want to say?
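[Editor's note: the /httpboot symptoms above can be checked mechanically. A minimal sketch, using a temporary directory as a stand-in for /httpboot (the real path needs root on the slave): bifrost serves deploy images from /httpboot over HTTP, so if the directory is not world-readable (the `chmod 755` workaround mentioned above) or is empty after build-dib-images, PXE-booted nodes cannot fetch their images.]

```shell
# Sketch: check the two things discussed above -- directory mode (755)
# and whether the images actually landed there. A temp dir stands in
# for /httpboot so the check is runnable anywhere on Linux.
dir=$(mktemp -d)
chmod 755 "$dir"                  # the workaround applied in the jobs
mode=$(stat -c '%a' "$dir")       # numeric mode, expect 755
count=$(ls -A "$dir" | wc -l)     # 0 here mirrors the empty-dir symptom
echo "mode=$mode entries=$count"
rmdir "$dir"
```

A zero entry count despite build-dib-images reporting success would point at the build/copy step rather than permissions, which matches the "1 out of 3 VMs fails" observation not being the same perm issue.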
14:33:16 just got lost
14:33:41 the last issue has been resolved, and bifrost CI runs correctly in our internal lab
14:34:12 busy with other things and not enough time this week.
14:34:32 good news is that the Daisy team is interested in bifrost
14:34:42 Julien-zte: I noticed that
14:34:46 and they will invest some resources in Daisy
14:34:53 for bifrost
14:35:02 and then we get stuff for free
14:35:03 cool
14:35:08 yes!
14:35:12 no, you get more work
14:35:21 that's what you think
14:35:24 I'm persuading them to be quick with the decision
14:35:51 #info Bifrost CI runs correctly in the ZTE internal lab
14:35:59 #info Daisy is also looking at bifrost
14:36:15 Julien-zte: persuade harder
14:36:38 qiliang: how is it going?
14:36:44 OK, I will do it, fdegir
14:36:50 and bifrost will be used in the installer project :)
14:36:56 qiliang: have you been able to look at what we have been doing?
14:37:01 Julien-zte: :)
14:37:16 it's great to see more bifrost usage
14:37:23 hi qiliang, sorry for not troubleshooting your issue
14:37:27 is it resolved?
14:38:11 qiliang: ^
14:38:32 yes, i've successfully run bifrost and puppet-infracloud in the huawei lab.
14:38:33 moving to jmorgan1 then
14:38:44 yes, yolanda, is centos on centos currently stable?
14:38:51 qiliang: by following the readme files?
14:39:12 i've not done anything yet, my pod is being used by others atm
14:39:16 because it is good to hear it works in other places
14:39:27 i will look to get it back next week
14:39:35 Julien-zte, i have not tested in the last 2 weeks but when i left, centos-on-centos was working
14:39:43 and taken a look at some of the bifrost and puppet related code.
14:40:04 yolanda, OK, I will assign some work for testing
14:40:24 thx jmorgan1
14:40:37 updates from me
14:40:49 sorry i just got lost
14:40:55 #info we now have jobs to verify opnfv/bifrost and openstack/bifrost patches
14:41:02 even though they fail, it feels good
14:41:11 #link https://build.opnfv.org/ci/view/3rd%20Party%20CI/
14:41:33 #info Once the SUSE work is done, those 2 jobs will also be enabled
14:41:35 fdegir: i've successfully run bifrost and puppet-infracloud in the huawei lab, and taken a look at some of the bifrost and puppet related code.
14:41:38 having more failures
14:41:41 yeah thanks for cleaning the job up
14:42:09 #info qiliang has successfully run bifrost and puppet-infracloud in the huawei lab, and taken a look at some of the bifrost and puppet related code.
14:42:21 I just got rid of the daily jobs
14:42:39 since we lack machines and I wasn't so sure about their use for the time being
14:43:02 we can recreate them if anyone feels it is good to have them
14:43:18 the triggered ones are good for now i believe
14:43:59 good
14:44:07 I think that's all of us
14:44:09 agree
14:44:11 fdegir, I got lost when you mentioned "finally I did something useful :)" — what is the useful thing? or I will check the meeting records after the meeting.
14:44:13 anyone want to bring anything up?
14:44:17 what about the summit?
14:44:29 Julien-zte: https://build.opnfv.org/ci/view/3rd%20Party%20CI/
14:44:39 any details on the meetup?
14:44:45 Julien-zte: we verify both openstack/bifrost and opnfv/bifrost patches
14:44:57 jmorgan1: good point
14:45:03 fdegir that's great
14:45:10 jmorgan1: I will create an etherpad and send the link so people can sign up and add topics
14:45:20 I will study the jenkins job
14:45:33 we have the opnfv board F2F on Friday
14:45:54 so we might have our meetup early
14:46:08 I can add some random days/times there
14:46:18 and see which one is more popular
14:46:48 so we don't have a room or something, right?
sounds like an ad-hoc f2f meeting :)
14:47:01 we can have it on the beach
14:47:02 find a table or cafe?
14:47:12 sounds good :)
14:47:24 don't know how the weather will be
14:47:27 ok for me. I will be there till Saturday
14:47:42 before we end the meeting
14:48:03 there will be some discussions regarding 3rd party ci and a similar topic during the infra wg meeting in 12 minutes
14:48:03 qiliang: will you be at the openstack summit?
14:48:10 please call in if you are interested
14:48:15 https://wiki.opnfv.org/display/INF/Infra+Working+Group
14:48:31 ok
14:48:32 that's all for the day
14:48:39 jmorgan1: sorry, today my boss told me NO. :(
14:48:43 thank you for joining
14:48:46 #endmeeting