16:00:57 #startmeeting OPNFV BGS daily release readiness synch up
16:00:57 Meeting started Thu May 21 16:00:57 2015 UTC. The chair is frankbrockners. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:57 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:57 The meeting name has been set to 'opnfv_bgs_daily_release_readiness_synch_up'
16:01:03 #info Frank Brockners
16:01:08 #info Tim Rozet
16:01:14 <[1]JonasB> #info Jonas Bjurel
16:01:47 #info Morgan Richomme
16:02:41 hey folks - good morning / evening
16:02:55 #info Peter Bandzi
16:03:59 let's get started
16:04:33 #info Updates on functest, autodeployments on POD1 & 2, etc.
16:04:44 can we start with functest?
16:04:47 ok
16:04:55 #info updates on functest
16:05:23 #info never been so close to a first full CI run...3 of testcases run last night on POD1
16:05:43 #info problem due to jenkins job
16:05:59 #info ODL suite successfully tested manually on POD2
16:06:31 #info analysis of the error: 2 main causes: quotas and multiple networks
16:06:56 #info 100% pass on ODL suite with pbandzi's changes
16:07:14 #info quotas: checking the neutron and nova quotas, we saw that some quotas are set to 10 => to be increased
16:07:27 +---------------------+-------+
16:07:27 | Field               | Value |
16:07:27 +---------------------+-------+
16:07:27 | floatingip          | 50    |
16:07:27 | network             | 10    |
16:07:27 | port                | 50    |
16:07:27 | router              | 10    |
16:07:28 | security_group      | 10    |
16:07:28 | security_group_rule | 100   |
16:07:29 | subnet              | 10    |
16:07:29 +---------------------+-------+
16:07:37 #info current quotas: http://pastebin.com/T8PmkJ1B
16:07:53 trozet: better this way :)
16:07:59 :)
16:08:07 <[1]JonasB> morgan: Do you expect us to reconfigure the quotas?
16:08:11 yes
16:08:19 I would like to push some of them
16:08:41 on neutron router, network, sec group could be set to 50
16:08:43 <[1]JonasB> morgan: To what, indefinite 0?
16:08:49 on nova sec group to 50
16:09:18 it seems that some cleaning is not always immediate, leading to these errors due to quotas (a beautiful 403)
16:09:28 when do you plan to do the change?
16:09:37 I think the proposal is: security_groups 10 -> 50, floating ip 50 -> 100, network 10 -> 50, router 10 -> 50, subnet 10 -> 50
16:09:48 let's info that
16:10:00 #info proposal is: security_groups 10 -> 50, floating ip 50 -> 100, network 10 -> 50, router 10 -> 50, subnet 10 -> 50
16:10:09 <[1]JonasB> Let's have a Jira case on that if it doesn't already exist?
16:10:18 yeah we should
16:10:38 <[1]JonasB> Fuel ODL integration only supports L2
16:10:42 if the issue is transient (cleanup not fast enough) - have we tried to add some delay between tests?
16:10:51 <[1]JonasB> So no floating IP TCs will pass
16:10:53 I think the problem is even if you try to cleanup neutron, ODL does not always cleanup correctly, so some of the neutron clean fails between tests
16:11:10 ah ok
16:11:19 it is a known bug in helium
16:11:26 we might also info this.. should go into release notes IMHO
16:11:33 and even if we consider only Tempest (that is run before ODL)
16:12:29 #info when deleting objects in neutron while using ODL, some objects may fail to delete (known ODL Helium issue)
16:12:38 #info quota increase partially required because of lack of proper clean up between tests (known bug in ODL helium: ODL does not always cleanup correctly, so some of the neutron clean fails between tests)
16:12:49 the suite is fully managed by rally, it is an upstream suite already commonly used..it will be strange to add sleep at the end of the tests...
16:13:09 morgan_orange: got it - thanks
16:13:16 let's see the consequence of the extension of the quotas
16:13:26 #info there are other errors in the neutron logs, showing illegal values during network creation. I think there might be some bugs with some test cases
16:13:29 when do you plan to change things?
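A minimal sketch of the quota proposal from the 16:10:00 #info line, applied to the neutron quota table pasted at 16:07:27. It only models the change as plain dictionaries and does not call any OpenStack API. The helper names are hypothetical, and mapping "security_groups" and "floating ip" from the proposal onto the `security_group` and `floatingip` fields is an assumption. The nova security group quota mentioned at 16:08:49 is not modeled.

```python
# Neutron quotas as pasted in the meeting at 16:07:27.
CURRENT_QUOTAS = {
    "floatingip": 50,
    "network": 10,
    "port": 50,
    "router": 10,
    "security_group": 10,
    "security_group_rule": 100,
    "subnet": 10,
}

# Proposed values from the 16:10:00 #info line.
# Assumption: "security_groups" -> security_group, "floating ip" -> floatingip.
PROPOSED_QUOTAS = {
    "security_group": 50,
    "floatingip": 100,
    "network": 50,
    "router": 50,
    "subnet": 50,
}


def apply_proposal(current, proposed):
    """Return a new quota dict with the proposed values applied (hypothetical helper)."""
    unknown = set(proposed) - set(current)
    if unknown:
        raise KeyError(f"unknown quota fields: {sorted(unknown)}")
    updated = dict(current)
    updated.update(proposed)
    return updated


def quota_diff(current, updated):
    """Map each changed field to its (old, new) pair."""
    return {
        field: (current[field], updated[field])
        for field in current
        if current[field] != updated[field]
    }


if __name__ == "__main__":
    new_quotas = apply_proposal(CURRENT_QUOTAS, PROPOSED_QUOTAS)
    for field, (old, new) in sorted(quota_diff(CURRENT_QUOTAS, new_quotas).items()):
        print(f"{field}: {old} -> {new}")
```

Fields the proposal does not mention (`port`, `security_group_rule`) keep their current values.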
16:13:36 #info it's hard to say which test case right now because they were all run by the time I checked the logs
16:14:26 #info second cause: Multiple Networks
16:14:41 <[1]JonasB> Will start on Monday then, likely half to one day worth of work
16:14:42 at the end of a fresh install foreman used to have no network preconfigured (now there is one)
16:14:52 fuel 2
16:15:14 with jose we aligned pre conditions on fuel 2 networks in the script preparing tests
16:15:34 but it has consequences...some testcases are a bit "light"
16:15:45 (not possible to pass the network id as parameter)
16:15:58 if 1 network the test will be OK, if 0 or more than 1 the test will fail..
16:16:03 #info for Foreman, quickstack already supports modifying those default values, so it will take me a few minutes to make the change
16:16:15 the proper way will be to correct the testcases
16:16:43 morgan_orange: is there a way to figure out where these failures are coming from (which testcases):
16:16:52 INFO neutron.api.v2.resource [req-79f9e37d-907e-4446-9002-d1299d971d78 None] create failed (client error): Invalid value for port -16
16:16:53 we could try to run the suite on a "1 networked tenant" to see if it is fixed, otherwise I suggest to document this
16:16:59 INFO neutron.api.v2.resource [req-68d5b68b-1610-4344-8e56-ffb2862589ca None] create failed (client error): Invalid value for port 65536
16:17:33 trozet: probably yes with the id 68d5b68b-1610-4344-8e56-ffb2862589
16:18:00 not anymore because you relaunched the installation of POD2
16:18:07 morgan_orange: ok can you please look into that. I think those are easy test case fixes that will give us some more passes
16:18:09 ok
16:18:12 <[1]JonasB> Should we troubleshoot here?
16:18:16 if we see it again
16:18:37 ok we will tomorrow morning (for me) after the fresh install
16:18:45 ok thanks!
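The two values rejected in the neutron log lines above (-16 and 65536) both fall outside the 16-bit port range. A small sketch of that range check, assuming the usual 0 to 65535 bounds; Neutron's own validation may differ in detail, and `is_valid_port` is a hypothetical helper, not a Neutron function. Out-of-range values like these are also what a negative test would send on purpose, so the log entries alone do not show whether a test case is at fault.

```python
PORT_MIN = 0
PORT_MAX = 65535  # largest value that fits in a 16-bit port field


def is_valid_port(value):
    """Return True if value is an integer within the 16-bit port range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return PORT_MIN <= value <= PORT_MAX


if __name__ == "__main__":
    # The first and last values are the ones seen in the neutron log.
    for candidate in (-16, 80, 65535, 65536):
        verdict = "valid" if is_valid_port(candidate) else "invalid"
        print(f"port {candidate}: {verdict}")
```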
16:18:46 but it shall be possible to troubleshoot
16:19:01 so back to my testcase requiring 1 network
16:19:04 what do we do
16:19:23 adapt the env to the test cases or say that these tests fail because they want a very precise env
16:19:46 can you expand on the problem. Are you saying that the test case requires a network already setup?
16:20:01 no
16:20:14 they require only that there is only one network
16:20:18 whatever the network
16:20:28 oh ok
16:20:35 using neutron client when listing the network if 0 or more than 1 is found => error
16:20:57 even though the test case doesn't create a network?
16:21:11 it represents 10 failures on the 27 on POD1..
16:21:43 so does the test case create the network, then check to see if only 1 is on the bed?
16:22:22 I have to check if the test creates the network, not sure
16:22:35 i would think it should, otherwise why would it check to see if 1 is there
16:22:56 #info I suggest you change the test case to create a network if it does not already, then change the pass/fail to see if that network exists by ID/name
16:23:17 the issue is where you have already 2 networks
16:23:26 shall we destroy them to please the test..
16:24:10 no
16:24:20 i just spent all this effort making the provider network, don't destroy it :)
16:25:08 maybe stop on the topic, but if we consider POD1 the 2 errors represent 25 of the 27 errors..
16:25:17 the test case shouldn't pass or fail based on number of networks
16:25:22 agree
16:25:29 it should do it based on what network it created and if that network exists
16:25:38 that is why I suggest to document that these tests will fail
16:25:49 we could later discuss with the author of the tests to propose evolution
16:25:51 sounds good to me
16:25:55 * frankbrockners sounds like a case for changing the test...
16:26:00 or don't run those tests
16:26:08 are we talking about the tempest smoke test suite here?
16:27:10 yes but I will run them .. because otherwise I need to create my own json file (to be maintained..we got the list automatically from tempest repo) ...I will just accept that they are failed and say why
16:27:24 ah ok
16:27:26 #link https://wiki.opnfv.org/r1_tempest
16:27:43 test cases id 1
16:28:30 all for me
16:28:59 morgan_orange: would it be a big deal to have your own json file for tests? At some point you likely have OPNFV native tests
16:29:15 that way we could exclude the "wrong" tests for now
16:29:35 and time permitting - we could change them to behave properly (and then drive a change upstream)
16:29:41 not a big deal (already have it for the bench) but if new tests are added I have to recreate one, here we use the Tempest reference, if it evolves, it automatically evolves
16:29:53 failed tests will not stop CI
16:30:09 understand
16:30:26 let's go with "document failure" for now - because it is the simplest
16:30:39 but we should plan for OPNFV native tests at some point
16:31:03 sure it was discussed in test perf weekly meeting
16:31:10 but it should be original tests
16:31:31 ok
16:31:31 not subset of upstream where we ignore the one that failed
16:31:41 it is a good occasion to discuss with Tempest community
16:31:57 agreed - it shows OPNFV's value
16:32:11 thanks morgan_orange for the updates
16:32:26 let's move to POD1 and POD2 autodeploys / updates
16:32:30 <[1]JonasB> #info We have created a non HA deployment profile which is now autodeployed in LF
16:33:13 <[1]JonasB> #info We will see if we can get started with ODL tests tomorrow, not sure seems people are traveling from VC
16:33:22 <[1]JonasB> #inf all from me
16:33:28 <[1]JonasB> All from me
16:33:34 <[1]JonasB> Need to jump out
16:33:38 thanks Jonas
16:33:56 trozet: Quick update on POD2
16:34:18 #info external network patch is done. I manually redeployed LF POD 2 last night with the external net created. The network name is "provider_network" and subnet is "provider_subnet"
16:34:55 #info I started a redeploy in Jenkins: https://build.opnfv.org/ci/job/genesis-foreman-deploy/lastBuild/console once it is complete (hopefully success) we will run smoke test tempest and see what the results are
16:35:30 #info test suite may need slight modification to actually use the "provider_network" morgan_orange looking into that
16:35:53 that's it from me
16:36:18 shall I go ahead and make the quota changes?
16:36:25 do we have a consensus on those values?
16:36:49 +1 for me
16:36:55 let's give it a try
16:37:21 question
16:37:33 does that mean we have to change this on FUEL side then to match?
16:37:39 let's hope that both https://build.opnfv.org/ci/job/genesis-foreman-deploy/lastBuild/console and https://build.opnfv.org/ci/job/genesis-fuel-deploy/lastBuild/console will succeed
16:37:52 cause those aren't values we use currently =
16:38:41 lmcdasm: that was the question, do we have a consensus, I imagine you use the default value (http://pastebin.com/T8PmkJ1B)
16:38:44 <[1]JonasB> But issue a Jira case please
16:39:27 Correct Morgan - we use the default values - also, if you are hard coding in network names to be checked against a test case then we need consensus
16:39:49 since in the case of ODL/OVSDB, we use the private_ZYX as outlined in the integration document for the "new external" network you bring up during ODL spin up
16:39:51 #info quota defaults: https://jira.opnfv.org/browse/BGS-51
16:39:52 not provider
16:40:13 the question was for quota
16:40:32 ok.
16:40:35 but you are right, we may also have a question on the naming of the default network created during installation
16:40:58 i have talked with Jose about this before
16:41:14 and in his test VMs stuff, i suggested not to use the "networks" that are there / at deploy time
16:41:21 cause you may or may not be able to trust them
16:41:25 and it's not installer agnostic
16:41:35 so you can just query external networks
16:41:38 and get the name that way
16:41:39 for each "test", the script should create its own networks.
16:41:58 (yes, you could - but it's an extra step) - it's much better to have the test specify the creation of what it needs
16:42:07 then it runs away from assuming anything is there in advance.
16:42:08 agree but lots of tempest tests have not been written by us
16:42:19 no i think an installer should create the external network because only it is aware of the public network
16:42:20 ok.. and all of them use "provider_?"
16:42:22 the test cases have no idea
16:42:27 well. in the case of OVSDB to ODL ,
16:42:31 you dont have that External Network anymore a
16:42:46 so you can build it - but it's encapsulated and you don't have two endpoints, so i'm not sure what you will "use" for that
16:43:03 since there is no ML3, so you can't route / NAT on that new "external network"
16:43:04 lmcdasm: OVSDB is layer 2, network is layer 3 = handled by neutron l3 agent
16:43:45 I have to jump out
16:44:09 as for the test cases
16:44:19 when you have a test case that creates a VM and attaches it to a network
16:44:29 then you are assuming that there is a network there in advance
16:44:48 is provider_net, provider_subnet - cause that will be passed in the nova boot part when they bring up the VM
16:45:07 i would say that you don't know (from one installer to the next - sure we can agree for our two, but that will rapidly get out of date)
16:45:16 if those "named elements" will be called the same thing consistently
16:45:37 so better to have the test cases bring along the commands to do their tests that include the necessary neutron setups as well
16:45:40 (my thinking).
16:46:40 so an instance doesn't attach to the external network
16:46:45 it attaches to its tenant network
16:46:58 then there is a tenant router that has a gateway set to the external network
16:47:04 right
16:47:11 and the name used when doing router-interface-add?
16:47:15 from what morgan_orange told me: the test case tries to do that attach to external network
16:47:16 to the external network is?
16:47:28 i think that's a tempest config thing
16:47:35 not a hardcoded test case thing
16:47:40 better to have the test case create its own net-create --external=True
16:47:40 but we need to confirm
16:47:45 it can't
16:47:48 talking with Jose - he ran into this issue
16:47:49 why not?
16:47:59 when trying vPing
16:48:06 because it needs to know DNS, gateway, and floating ip range it can use
16:48:08 which it doesn't know
16:48:08 cause when you drop OVSDB and the networks created (by FUEL).
16:48:26 ok.. well..i'm not sure i agree
16:48:43 since if it's a test case where a VM needs an external address, then the tenant can have rights to create that stuff
16:48:55 if it's a VM that is attaching to a tenant network only, then they can do that.
16:49:09 my question is simply to say, you will never know from one install to the next what "external_network" name is
16:49:22 and when you bring up ODL, in the case of FUEL, it gets wiped out anyway
16:49:42 right so i think there are several questions
16:49:42 and you the "tenant" (cause now there is only 1 really - 1 VLAN in use) will have to create their own router
16:49:43 subnet
16:49:51 and such - as outlined in OVSDB/ODL integration page
16:49:53 #info does tempest test case try to create an external network, or can it use one provided
16:49:58 so depending on the context.
16:50:31 #info if can be provided, do tempest cases take the name of a provided network as a parameter? or is it a hardcoded name?
16:50:34 since once you are running under ODL control, there is no difference (from a neutron point of view) for the "external" gateway -
16:50:41 and internal ones
16:51:05 (ODL has to manage the flows in/out).. so it's a tough one (again - as Tim points out - depends on the case).
16:51:34 morgan_orange: can you get us answers to those questions?
16:51:46 please
16:52:20 morgan already left
16:52:26 lmcdasm: can you add Fuel defaults to https://jira.opnfv.org/browse/BGS-51 so that we can have them documented in the Jira case?
16:54:02 let's bring the topic up tomorrow again when Morgan will be around
16:54:21 i think tim has a good idea
16:54:32 maybe we need to have in the JIRA case all the values (names) in use
16:54:38 or alternatively, lmcdasm could you write a quick email to Morgan asking for details on how his tests are setup wrt/ external connectivity
16:54:41 and context (before ODL spin up and after)
16:55:02 agree - documenting the names would be good
16:55:49 can i ask the context here though - just to make sure ;) - we are talking about names you are setting up after you have spun up ODL and dont OVSDB integration right (i.e. after you wipe OVS and bring it up in VXLAN tunnel fashion?).
16:55:59 done*
16:56:44 folks - I have to run
16:56:47 we just want to know how the test cases use external network
16:57:02 does tempest try to create one, or can we provide one already?
16:57:07 and how do we pass the name of the one created
16:57:13 let's revisit the topic tomorrow - or try to catch morgan
16:57:28 will endmeeting now... you can continue chatting of course.
16:57:31 #endmeeting
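Two ideas from the discussion can be sketched over plain dictionaries: finding the external network by its attribute instead of a hardcoded name, and deciding pass/fail on the network the test itself created rather than on how many networks exist. This is a sketch only. It assumes network records shaped like the Neutron API's, with `id`, `name` and `router:external` keys. The IDs, the tenant network name and both helper names are hypothetical; only "provider_network" comes from the log.

```python
# Hypothetical network listing. Only "provider_network" is a name from the
# meeting; the IDs and the tenant network are made up for illustration.
NETWORKS = [
    {"id": "net-ext-1", "name": "provider_network", "router:external": True},
    {"id": "net-tenant-1", "name": "tenant_net", "router:external": False},
]


def external_network_names(networks):
    """Names of all networks flagged as external, whatever the installer called them."""
    return [net["name"] for net in networks if net.get("router:external")]


def created_network_exists(networks, created_id):
    """Pass/fail check keyed on the ID of the network the test created."""
    return any(net["id"] == created_id for net in networks)


if __name__ == "__main__":
    print(external_network_names(NETWORKS))
    # Two networks are present, yet the check still succeeds because it does
    # not depend on the network count.
    print(created_network_exists(NETWORKS, "net-tenant-1"))
```

Whether Tempest takes the external network as a configuration parameter or expects a fixed name is the open question recorded in the 16:49:53 and 16:50:31 #info lines; this sketch does not answer it.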