15:01:09 #startmeeting neutron_northbound
15:01:09 Meeting started Fri May 29 15:01:09 2015 UTC. The chair is regXboi. Information about MeetBot at http://ci.openstack.org/meetbot.html.
15:01:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 The meeting name has been set to 'neutron_northbound'
15:01:21 #topic roll call and agenda bashing
15:01:26 #info regXboi
15:01:53 #link https://wiki.opendaylight.org/view/NeutronNorthbound:Meetings#Agenda_for_Next_Meeting_.285.2F29.29 agenda (in its usual place)
15:03:14 * regXboi listens to crickets
15:05:38 #info edwarnicke
15:05:48 #chair edwarnicke
15:05:48 Current chairs: edwarnicke regXboi
15:06:05 I don't see flavio
15:06:22 anybody else in the channel attending - if so, please #info in
15:07:01 anybody have changes for the agenda?
15:07:28 #info grmontpetit
15:08:11 folks are welcome to #info in as we go, but let's go through the action items (somewhat reordered)
15:08:25 #topic action items from last meeting
15:09:25 #info regXboi struggling to move the E2E patch forward - test claims it is running but really isn't
15:09:43 #info we'll talk more about this later in the meeting if we have time - if not, regXboi will send email to list
15:10:01 #info flaviof not here so carrying item 2 forward
15:10:12 #action flaviof to update OS issues and move from https://trello.com/b/ddIvDQE0/ovs-openstack to https://trello.com/b/LhIIQ8Z0/odl-neutronnorthbound
15:10:43 have we made any progress on neutron-daily-openstack-master and the gate?
15:11:12 regXboi: Do we have flaviof around for that?
15:11:50 edwarnicke: I don't see him in the channel or on the server
15:11:58 so I don't think so
15:12:29 regXboi: neither do I :(
15:12:46 so I don't think we can make a whole lot of progress on those items
15:13:55 so in the meantime, let's talk about the E2E patch
15:14:51 Ryan Moats proposed a change to neutron: WIP, DO NOT MERGE: E2E testing https://git.opendaylight.org/gerrit/18356
15:14:56 I'm pushing the latest draft
15:15:14 the problem with this is that PaxExam claims to be running the class, but nothing happens
15:15:24 * edwarnicke prefers things to happen
15:16:54 * regXboi trying to get a pastebin and arguing with web
15:17:32 * edwarnicke has wasted many years of his life arguing with the web
15:17:56 * flaviof stumbles in (line stolen from regXboi)
15:18:42 #link https://pastebin.com/h1WGyzUz suspicious pax exam output from E2E test
15:18:57 flaviof: feel free to #info in
15:19:03 #info flaviof
15:19:04 and we'll reset the agenda in a few :)
15:19:10 ack
15:19:20 * flaviof sorry for irc disconnect
15:19:33 no worries
15:19:47 regXboi: That does look suspiciously like nothing
15:19:58 edwarnicke: and a whole lot of it
15:20:45 interesting - jenkins is failing
15:21:05 on java 8
15:21:54 but not in the added filed
15:21:58 er file
15:22:23 regXboi: That is strange, link?
15:22:38 #link https://jenkins.opendaylight.org/releng/job/neutron-verify-master/jdk=openjdk8,nodes=dynamic_verify/196/console java8 console
15:23:29 regXboi: Looks like a spurious fail unrelated to our code, try remerge
15:24:00 well, we can also check the J7 output to see if the same no-test signature is there
15:24:16 and yes it is
15:24:27 #link https://jenkins.opendaylight.org/releng/job/neutron-verify-master/jdk=openjdk7,nodes=dynamic_verify/196/consoleFull java7 console
15:24:38 in that link, look for NeutronE2ETest
15:24:46 and you'll see the full signature
15:25:21 but let's table this for a few minutes and give flaviof the floor
15:25:40 flaviof: any update on the gate?
15:26:00 because I believe that is why daily-openstack-master continues to be hosed
15:26:32 regXboi: mediocre to none. mestery has taken the work of getting....
15:26:48 check-tempest-dsvm-networking-odl
15:27:13 working; and I was hoping the fruits of that labor could shed some light
15:27:33 i also have not heard of any updates on nodePool implementation
15:27:43 have any news on that, edwarnicke ?
15:27:59 flaviof: I have not, shall I fetch tykeal and zxiiro ?
15:28:14 edwarnicke: +1
15:28:48 flaviof: I have cast a level 1 summon tykeal spell, pray it is powerful enough
15:28:59 tykeal: Welcome :)
15:29:02 heh
15:29:23 tykeal: the question of nodePool update has arisen here
15:29:24 I begin to wonder if devstack gate is causing more harm than help. at least with that we had more info on why the jenkins job failed.
15:29:29 edwarnicke: apparently you had the right material and somatic components for your spell
15:29:52 * edwarnicke notes that a level 2 summon tykeal spell involves petitioning his cats ;)
15:29:54 the question being... when?
15:30:12 the question being ... status?
15:30:19 * zxiiro is here
15:30:42 it's still being worked on. I'm in the testing phase on the management for it
15:31:00 tykeal: yeah. an alternative would be to try to manually run the job(s), to hopefully get visibility into why stack is failing.
15:31:13 is the stack failing again?
15:31:25 tykeal: it has been for a couple of weeks :(
15:31:27 * tykeal notes that nodepool _won't_ solve the issues related to the slave setup
15:31:47 tykeal: it won't?
15:31:48 all nodepool really does is shift our setup from JClouds to nodepool, we still have to manage the setup
15:31:50 it won't
15:32:03 all nodepool does is bring slaves online and run setup scripts
15:32:15 we still have to make sure that the setup scripts do the right thing in our environment
15:32:16 ack. but with that, we will be able to use the exact vm image the openstack folks use, yes?
15:32:22 no
15:32:27 we can't use their images
15:32:28 okay....
15:32:32 their images are private to their cloud
15:32:53 but we know how they put their images together, right?
15:33:44 the scripts are available, but their devstack stuff is all ubuntu from what I've seen and while ubuntu is available in our environment I've had issues getting it to _work_ correctly in our environment
15:34:56 the reason being we do isolated network which requires stomping on network configuration that rackspace / cloud-init setup, and ubuntu's static routing is harder to work with than EL / Fedora is
15:36:45 that makes it sound like this is not a simply solved issue
15:37:16 tykeal: i thought they supported these flavors: https://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/nodepool.py#n1113
15:38:23 flaviof: their nodepool does, yes, but most all of the primary testing for OS is done on ubuntu. I believe all the basic gate testing is done on ubuntu. I don't know for certain as I honestly haven't delved too deeply into their gates
15:38:40 but most of the smoke testing I've seen was all related to ubuntu tests
15:40:37 might we have to rethink how/if the gate is integrated into jenkins?
15:41:37 tykeal: Putting aside for the moment the Ubuntu networking issues (not because they aren't important, but rather because I don't understand them yet ;) )
15:41:45 tykeal: the only script we care about atm is tempest, and that works with centos, so I'm hoping there is no need to worry about the distro; just nodepool.
15:41:50 tykeal: It sounds like we *think* we could get nodePool going on Ubuntu
15:42:02 tykeal: mestery's repo uses that, btw: https://github.com/mestery/odl-openstack-ci-1
15:42:10 tykeal: Is that a fair statement, or did I misunderstand
15:42:45 as I said, switching to nodepool doesn't resolve our getting the slave working, all it does is shift how the slaves come online
15:43:43 what we can do is see what's different between mestery's vagrant repo there and our vagrant definitions
15:43:57 tykeal: Ah... OK... so what do you see as the root issue (given that I apparently have completely misunderstood ;) )
15:44:15 of course, the base boxes that mestery is using aren't the same as what we've got since I would bet they're all designed against virtualbox and we can't use those base boxes
15:45:15 tykeal: ack. mestery has 2 flavors, with devstack-gate and without. I'm thinking that by not using devstack-gate we can get better control/visibility when things break. right now all we get is a 'failed' #sadpanda
15:45:53 * regXboi isn't seeing anything actionable :(
15:46:26 the root issue is that our image is built from the following 3 vagrants (yes, it takes 3 different vagrants to make one image) https://git.opendaylight.org/gerrit/gitweb?p=releng/builder.git;a=tree;f=vagrant/rackspace-convert-base;h=68de8ce5caac0a82df888dd272f73bcecb0af9de;hb=HEAD snapshotted and then used by https://git.opendaylight.org/gerrit/gitweb?p=releng/builder.git;a=tree;f=vagrant/ovsdb-devstack;h=3bf3469bad2570c138ba836f
15:46:41 that last one takes care of our static routing so it's rather immaterial
15:47:18 what I really need is someone to make sure that what is happening in the first and second vagrant does what is needed compared to mestery's definitions
15:47:34 to me the issue is that when the test fails, I have no clue why it failed. How can we make that better?
15:48:00 once I get visibility on why it failed, then I can fix the issue.
15:48:37 one other thing to note, OS's nodepool rebuilds their base images on a regular basis, our images are currently built when we need to make an update so they aren't always the freshest system
15:49:10 tykeal: right, and that is what nodePool adds value, right?
15:49:24 s/what/where/
15:49:24 sort of, it doesn't use vagrant to build images.
15:50:01 the scripts are very much big bash scripts. What OS has done is build out a series of ansible playbooks and puppet manifests that are applied to systems by said scripts
15:50:13 * flaviof a little sad to hear that the images built by openstack are not visible/useable by odl's infra
15:50:17 we _could_ do the same with our vagrants, we just haven't
15:50:44 would we be better served by only going part of the way at this point?
15:51:05 define part of the way
15:51:08 i.e. looking for a volunteer to host the 3P CI and then have jenkins trigger it for the daily runs?
15:52:04 regXboi: what is 3P ?
15:52:05 then the hosted 3P CI wouldn't have to meet the detached network
15:52:10 3P = 3rd party
15:52:14 ack
15:53:03 * tbachman pokes head in room
15:53:10 * regXboi is trying to see if there is a way to make some progress on this issue
15:53:24 regXboi: flaviof: I think dfarrell07 was looking into this, in coordination with OPNFV
15:53:38 I popped onto yesterday's integration con-call
15:53:39 tykeal: if we poked our openstack friends ( mestery et al ), do you think we could better leverage the images they built?
15:54:06 * tykeal notes that there are 2 primary reasons we run an isolated network 1) rackspace by default grants all cloud instances a public IP which means rogue packets coming into our systems which are running with no firewall enabled (on purpose) 2) we're doing network device testing and having said rogue packets hitting systems can affect tests
15:54:13 tbachman: yes, I know Luiz has interest in this. In fact I gotta believe a lot of ODL folks do.
15:54:20 flaviof: :)
15:54:32 flaviof: I don't know of a way to use their images
15:54:37 https://lists.opendaylight.org/pipermail/integration-dev/2015-March/002496.html <== email from dfarrell07 describing this
15:54:55 dfarrell07 said that the issue was the OPNFV effort wasn't quite ready, but was pretty close
15:55:12 yes, he mentioned that during yesterday's integration call
15:55:25 tbachman: ack. I don't know where/how far dfarrell07 has come on this. But yes, the plan exists
15:55:57 fwiw, GBP is interested in this, if only b/c we don't want to have to stand up a duplicate CI for another neutron provider (and maintain it)
15:56:24 well right now
15:56:48 the daily openstack master is broken during setup
15:56:57 we don't even *get* to the test stage
15:57:05 :(
15:57:25 just thinking/hoping if we can get a common infra, then there's more of us to maintain it
15:57:27 which is why I'm wondering if having a semi-permanent 3P CI is the right step
15:57:30 (or at least that's the idea)
15:58:01 I mean, this is pretty serious:
15:58:03 if someone (not me, I'm honestly too busy right now to do this) could look over the differences between what mestery has in his vagrants and what we've got and make patches to ours we should be able to get an image working again
15:58:03 + timeout -s 9 115m /opt/stack/new/devstack-gate/devstack-vm-gate.sh /opt/stack/new/devstack-gate/devstack-vm-gate.sh: line 457: cd: /opt/stack/new/devstack: No such file or directory
15:59:07 that error looks like a problem with the devstack pieces we are using
15:59:49 that started happening between 5/11 (build 19) and 5/12 (build 20)
16:00:01 before that we got to the test phase and had a different failure
16:01:09 ok... we've expired our time, so if folks want to wander away they can
16:01:35 I'd like to see if we can get this problem and the E2E problem fixed so I'm going to hang around
16:01:48 but I'm not sure we should keep the meeting open
16:01:48 * flaviof gtg... sorry for the not-so-good news
16:01:57 oh
16:02:11 I almost forgot - I'm not around next friday - can somebody else run the meeting?
16:02:14 regXboi: I have the fix for your E2E test. simple one. But then the test fails anyway
16:02:28 shague - the test *should* fail anysay
16:02:31 er anyway
16:02:35 I just want it to *run*
16:02:43 BTW I don't mean to be all doom and gloom when it comes to OS images vs ours, just trying to shed light on reality
16:02:53 OK, one line change gets it to run
16:02:58 that being?
16:03:10 * regXboi suspected he'd done something brain dead
16:03:42 the Configuration import is wrong: it is currently: import org.ops4j.pax.exam.junit.Configuration; but it should be org.ops4j.pax.exam.Configuration
16:04:15 ah ok
16:04:18 I also removed the features() line in the config since you already have that feature in the karaf bundle
16:04:19 that's easy enough to change
16:04:31 what was happening is your config was never used since it was not the right one
16:05:17 this will get the test running. The next problem you have is you actually have two tests - the singleFeatureTest and then your intended test
16:05:32 that's because of the use of features-parent
16:05:54 you can use this mvn cli to run just the test: mvn -Dit.test=NeutronE2ETest#test verify
16:06:22 also forgot, comment out debugConfiguration() since it will hang waiting for the debugger to connect
16:06:46 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.544 sec - in org.opendaylight.odlparent.featuretest.SingleFeatureTest
16:06:48 Results :
16:06:49 Failed tests:
16:06:51 NeutronE2ETest.test:86->singleton_network_create_test:117 Singleton Network Post Failed NB expected:<201> but was:<404>
16:06:58 Cool
16:07:03 that's a useful result!
16:07:17 I can work with that
16:07:32 now - can I play a trick to override singleFeatureTest?
16:08:27 because I don't really want to pass that -Dit.test into jenkins
16:09:14 I don't know about that
16:09:49 if you don't add the -D part then you will just get two tests run. the singleFeatureTest doesn't do anything in this case so it is benign
16:10:29 ok, let me try it the way I think jenkins will run it and verify that I get the 404
16:10:49 because if I do, then I *think* I can work with that
16:11:18 the question will be is the 404 coming from the NB code or from the NB code not being registered :)
16:11:33 so I may also run an initial get to verify that :)
16:11:49 shague: thanks
16:11:54 and ...
16:12:03 #topic persians
16:12:05 #endmeeting
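
Editor's note: for readers following the Pax Exam discussion above, here is a minimal sketch of the fix shague describes (correct Configuration import, no features() option, debugConfiguration() left out). The class and test method names match the log, but the Karaf artifact coordinates, the NB endpoint URL, and the request body are assumptions for illustration only, not taken from patch 18356.

```java
import static org.junit.Assert.assertEquals;
import static org.ops4j.pax.exam.CoreOptions.maven;
import static org.ops4j.pax.exam.karaf.options.KarafDistributionOption.karafDistributionConfiguration;

import java.io.File;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.ops4j.pax.exam.Configuration;   // correct import; org.ops4j.pax.exam.junit.Configuration is silently ignored
import org.ops4j.pax.exam.Option;
import org.ops4j.pax.exam.junit.PaxExam;

@RunWith(PaxExam.class)
public class NeutronE2ETest {

    @Configuration
    public Option[] config() {
        return new Option[] {
            // Distribution under test; these Maven coordinates are placeholders.
            karafDistributionConfiguration()
                .frameworkUrl(maven("org.opendaylight.neutron", "neutron-karaf")
                    .type("zip").versionAsInProject())
                .unpackDirectory(new File("target/exam")),
            // No features(...) option: the feature is already in the Karaf bundle.
            // debugConfiguration() is omitted; it hangs waiting for a debugger to attach.
        };
    }

    @Test
    public void test() throws Exception {
        // Hypothetical NB endpoint; per the log, this POST currently returns 404 instead of 201.
        URL url = new URL("http://127.0.0.1:8080/controller/nb/v2/neutron/networks");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write("{\"network\": {\"name\": \"net1\"}}".getBytes(StandardCharsets.UTF_8));
        }
        assertEquals("Singleton Network Post Failed NB", 201, conn.getResponseCode());
    }
}
```

As noted in the log, running just this test (skipping the features-parent SingleFeatureTest) can be done with: mvn -Dit.test=NeutronE2ETest#test verify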