08:00:07 #startmeeting Functest weekly meeting January 26th
08:00:07 Meeting started Tue Jan 26 08:00:07 2016 UTC. The chair is morgan_orange. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:00:07 Useful Commands: #action #agreed #help #info #idea #link #topic.
08:00:07 The meeting name has been set to 'functest_weekly_meeting_january_26th'
08:00:17 #info Morgan Richomme
08:00:29 #info Juha Kosonen
08:00:37 #info Viktor Tikkanen
08:01:27 #link https://wiki.opnfv.org/functest_meeting
08:02:07 #topic review of action points
08:02:41 #info AP1: creation of ODL page => done by me
08:02:43 #link https://wiki.opnfv.org/odl_brahmaputra_page
08:03:03 #info AP2: meimei1 add pexpect lib => done
08:03:36 #info AP3: run rally-cert on the target labs => done on fuel and apex
08:03:46 #info we will discuss that on topic rally
08:04:03 #info AP4: lixiaoguang update onos section in doc => done
08:04:35 #info AP5: investigation on userdata => creation of 2 vPing scenarios, but still some issues on some scenarios
08:05:01 #topic B-Release follow-up
08:05:07 #topic vPings
08:05:10 #undo
08:05:10 Removing item from minutes:
08:05:19 #topic vPings
08:05:40 #link https://wiki.opnfv.org/vping_brahmaputra_page
08:06:01 Hi all
08:06:22 #info creation of a second vPing scenario (connection to the VM to perform the ping instead of using the userdata/cloud-init mechanism)
08:06:35 #info troubleshooting in progress
08:06:40 Hi RAGHAVENDRACHARI
08:07:02 if Jose joins he could give more details
08:07:46 #info if vPing fails => vIMS will fail
08:07:53 #topic onos
08:08:11 #info 11/11 on all the labs (to be checked if tests run on joid to update the table)
08:08:24 #link https://wiki.opnfv.org/onos_brahmaputra_page
08:08:27 #info Jose Lausuch
08:08:30 #info doc updated
08:08:44 Hi jose_lausuch, any additional info on work on vPing versus installers?
08:08:57 yes
08:09:01 well, additional?
08:09:13 vPing with floating IPs didn't work on fuel yesterday
08:10:19 according to https://wiki.opnfv.org/vping_brahmaputra_page (which is not updated in real time...), it works only on joid/odl_l2...
08:10:35 maybe it is OK on apex now
08:10:45 I have to check
08:11:26 but the userdata vPing should work on all, otherwise vIMS will fail too
08:11:26 same for me, we may use the DB for real-time info
08:11:26 http://testresults.opnfv.org/testapi/results?case=vPing&installer=apex
08:11:27 right?
08:11:36 seems ok
08:11:55 KO on compass http://testresults.opnfv.org/testapi/results?case=vPing&installer=compass
08:12:10 no result pushed by fuel
08:12:23 I think it's because both of them time out
08:12:26 OK with joid http://testresults.opnfv.org/testapi/results?case=vPing&installer=joid
08:12:37 if timeout, shall we push results?
08:12:45 no
08:12:56 if no results... no success...
08:13:05 #topic tempest
08:13:05 ok
08:13:20 #link https://wiki.opnfv.org/tempest_brahmaputra_page
08:13:53 Compass scenario looks pretty stable
08:14:10 last run on onos => -50% on Apex and Fuel...
08:14:31 and an interesting case: joid/odl_l2 run on 3 labs giving... 3 different results
08:14:52 Intel POD5 => 14% of success, Intel POD6 => 34% and Orange POD2 => 91%...
08:15:06 so different
08:15:13 then we have to analyze the errors
08:15:16 I received a question by mail this morning on why we are focusing on bare metal and not including virtual dev
08:15:20 maybe some external connectivity or something
08:16:08 I think there are two major issues with tempest cases
08:16:59 1. ODL instability. As long as network creation will not succeed reliably, many test cases will fail because network creation is part of their test setup.
08:17:24 2. Suspected problem with worker threads. It is under investigation.
08:17:38 It seems to affect the number of executed test cases.
08:17:49 viktor_nokia: if the problem is in rally... would it make sense to call tempest with testr?
08:18:22 Basically the number of workers seems to be the same as the number of CPU cores (virtual or physical) in the test server
08:18:29 #info 2 major issues with tempest cases
08:18:35 #info 1. ODL instability. As long as network creation will not succeed reliably, many test cases will fail because network creation is part of their test setup.
08:18:43 #info 2. Suspected problem with worker threads. It is under investigation.
08:19:15 jose_lausuch: I'm not sure; we should handle tempest.conf somehow.
08:19:43 could be interesting to reference the jumphost config to see the influence on the workers?
08:20:53 let's see the answer of the thread on workers, and get more runs to get feedback
08:20:55 yes, that would be an initial point to investigate
08:21:05 16 CPUs in LF POD
08:21:13 Results on Huawei US look stable whatever the scenario
08:21:31 May-meimei: could you check the number of CPUs on the Huawei-US jumphost?
08:21:59 maybe with threading, 32 vCPUs...
08:22:01 #action jose_lausuch morgan_orange May-meimei check jumphost HW conf (see influence on Rally workers)
08:22:08 Is it 32 or something else?
08:22:34 cat /proc/cpuinfo
08:22:39 that shows me 16
08:22:56 ok let's do it offline
08:23:09 #topic rally
08:23:33 ready to move to rally-cert? test done on my side seems OK
08:23:59 I think the error faced by May-meimei was the same I got (missing test volume) but as it is created in the script it should work
08:24:06 so I would suggest to move to this config
08:24:19 and make the possible adaptations if needed
08:24:27 are you ok?
08:24:33 what config?
08:24:36 rally-cert
08:24:40 yes
08:24:43 what is the exact error with volume?
08:25:28 I'll move volume type creation to run_rally-cert.py, ok?
08:25:42 ah, the volume type thingy
08:25:44 juhak: ok
08:26:07 juhak: yes it would be easier
08:26:09 cinder type-create volume-test #provisional
08:26:20 instead of doing it in the CI
08:26:49 just move the creation into run_rally-cert.py, then in run_test.sh call run_rally-cert.py instead of run_rally.py
08:27:10 juhak: that is already created in run_test.sh
08:27:11 yep, will do
08:27:15 we don't need to do it
08:27:29 it's before the call to rally or rally-cert
08:27:37 it isn't done in run_rally.py...
08:27:41 so no need to do that
08:28:21 jose_lausuch: yes, the question was to do it when it is necessary; would it not make more sense to do it in run_rally-cert.py instead of run_test.sh...
08:28:42 that is another thing
08:28:47 but not the cause of the problem
08:28:59 with rally-cert, the volume type would be there anyway
08:29:17 yes
08:29:26 but sure, let's put it inside the rally python script
08:29:53 #action juhak short refactoring to manage volume for rally-cert only
08:30:03 #info move to rally-cert scenario in CI
08:30:45 for Rally, as far as I can read the jenkins logs, tests look OK; I saw some errors on apex, but vIMS also faced these errors as far as I know (quota, cinder)
08:30:56 but globally the results look fine
08:31:09 we do not have a KPI like the global % of tests for tempest
08:31:23 not sure it would be easy to create...
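[Editor's note: for reference, a minimal sketch of what the refactoring agreed above (creating the provisional "volume-test" volume type inside run_rally-cert.py rather than in run_test.sh) could look like. This is illustrative only and not the actual Functest change; the function name and the environment-variable based authentication are assumptions.]

```python
# Illustrative sketch only (not the actual run_rally-cert.py patch):
# create the provisional "volume-test" volume type from the Rally script
# itself instead of from run_test.sh, keeping the call idempotent.
import os
from cinderclient import client as cinder_client

VOLUME_TYPE_NAME = "volume-test"  # provisional name quoted in the meeting

def create_test_volume_type():
    cinder = cinder_client.Client(
        "2",
        os.environ["OS_USERNAME"],
        os.environ["OS_PASSWORD"],
        os.environ["OS_TENANT_NAME"],
        os.environ["OS_AUTH_URL"],
    )
    # Skip creation when the type already exists so repeated CI runs do not fail.
    if VOLUME_TYPE_NAME not in [vt.name for vt in cinder.volume_types.list()]:
        cinder.volume_types.create(VOLUME_TYPE_NAME)
```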
08:32:06 unlike tempest, we pushed the json in the DB and would not be able to graph something from Rally results
08:32:16 would it make sense to do an average?
08:32:28 good point
08:32:39 why not?
08:32:43 then we have some numbers at least
08:32:47 morgan_orange: we have 24 CPUs in the jumpserver
08:33:00 thanks May-meimei
08:33:00 May-meimei: with/without HT?
08:33:48 juhak: what is your view on Rally KPI? a global one, 1 per module
08:34:04 HT is hard thread?
08:34:14 Hyper-Threading
08:34:14 May-meimei: you can check with cat /proc/cpuinfo
08:34:19 if they have the HT flag in there
08:34:23 maybe both?
08:34:26 if they have, then it's double
08:34:37 42 vCPUs
08:34:41 sorry, 48
08:35:16 viktor_nokia: however, why does the number of CPUs in the server you run the tests on matter?
08:35:35 juhak: yes, not critical (regarding the roadmap) but could be interesting to exchange with the Rally community
08:35:38 I don't know :(
08:35:42 we are running against a SUT... why does the jumphost have an effect?
08:35:44 ok
08:35:59 jose_lausuch: no HT flag in our /proc/cpuinfo
08:36:27 #topic ONOS
08:36:31 morgan_orange: yes, I agree
08:36:31 May-meimei: grep ht /proc/cpuinfo|head -1
08:36:40 May-meimei: does that command return anything?
08:36:42 #info tests are running fine, doc updated...
08:36:45 because workers are started in the server/VM where tempest/Rally is running
08:36:50 congratulations to lixiaoguang
08:37:05 #topic ODL
08:37:21 #info tests run on apex/compass/fuel
08:37:47 #info problem on joid, probably due to the fact that the ODL test scenario uses the same IP for the Neutron and Keystone APIs, which is not the case on joid
08:37:59 #info case to be updated to take this specificity into account
08:38:19 #info feedback from Red Hat on the fact that the tests may be deprecated
08:38:31 #info discussions in progress with the ODL community
08:38:50 but so far we have 18/18 or 15/18 and we shall be able to adapt the suite to joid
08:38:58 #action morgan_orange adapt ODL suite to joid
08:39:30 #action jose_lausuch check if tests are deprecated and if we should point to new ODL Robot scenarios
08:39:40 jose_lausuch: OK for ODL?
08:39:46 morgan_orange: yes
08:39:51 #topic vIMS
08:40:44 #info scenario used to work with nosdn but no scenario up & running with a controller scenario
08:41:26 vIMS is complex, needs external connectivity, so for me it is relevant to keep it in CI... as it shows if the solution out of the box is usable for deploying a complex VNF
08:41:41 however we can see that the physical config of the lab may lead to issues
08:42:08 but as it works with pure neutron and as the VNF does not need anything special from an SDN point of view...
08:42:43 #info troubleshooting in progress, currently no odl or onos scenario is OK and we may have some restrictions
08:42:51 #info prerequisite for vIMS = vPing...
08:43:06 so if vPing is not working... vIMS will not work...
08:43:16 #topic Ovno
08:43:23 morgan_orange: can we set the restriction for vIMS only on nosdn scenarios?
08:44:00 jose_lausuch: I would say no... as the controller is not supposed to have an influence
08:44:19 but if we cannot deploy once, we will have to document it
08:44:29 and say that vIMS works only on nosdn scenarios
08:44:37 fuel/odl_l2 worked once...
08:44:42 ok
08:44:54 let's keep some time for troubleshooting...
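[Editor's note: on the joid problem noted above (the ODL suite assuming Neutron and Keystone share the same IP), one possible direction for the planned adaptation is to resolve the Neutron endpoint from the Keystone service catalog instead of reusing the Keystone host. The sketch below is an assumption about how this could be done with python-keystoneclient v2.0 and environment credentials, not the actual patch.]

```python
# Illustrative sketch only: look up the Neutron endpoint in the Keystone
# service catalog instead of assuming it shares the Keystone IP (the
# assumption that breaks on joid).
import os
from keystoneclient.v2_0 import client as ks_client

def get_neutron_endpoint():
    keystone = ks_client.Client(
        username=os.environ["OS_USERNAME"],
        password=os.environ["OS_PASSWORD"],
        tenant_name=os.environ["OS_TENANT_NAME"],
        auth_url=os.environ["OS_AUTH_URL"],
    )
    # Returns e.g. "http://<neutron-host>:9696", which on joid may differ
    # from the Keystone host, so the suite should not reuse the Keystone IP.
    return keystone.service_catalog.url_for(
        service_type="network", endpoint_type="publicURL"
    )
```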
08:45:17 officially the release is still planned for next week but... (keep this for the last topic)
08:45:36 #info no OCL lab ready to test and waiting for patch correction on ovno side
08:45:40 #topic promise
08:45:47 #info integration tests started
08:45:59 #info promise in the CI loop
08:46:24 #info so far an issue due to dependencies, node version seems to be old on Ubuntu 14.04...
08:46:30 #info investigation on promise side
08:46:46 #topic SDNVPN
08:46:52 good summary :)
08:47:04 I saw that it was merged yesterday
08:47:05 #info patch submitted yesterday to integrate tests on CI
08:47:19 and fdegir has just enabled that scenario for the fuel jobs
08:47:24 #info patch fixed by May-meimei, an else was missing in the if condition of run_test.sh
08:47:44 #info scenario available in CI fuel/bgpvpn... wait and see... and debug
08:47:57 any other test case missing?
08:48:26 Doctor scenario soon ready on apex I think, so it could be added in CI if needed
08:48:43 ovno?
08:48:53 done above :)
08:48:59 ah yep
08:49:16 #topic doc
08:49:20 waiting for patch correction, yes, I had to give a -1 :D
08:50:11 #info update in progress but my wish to please doc8 led to a regression (thanks viktor_nokia for reporting)
08:50:22 so I will try to amend to please everybody...
08:50:36 difficult to get something stable until the tests are stable...
08:50:55 #info request to new contributors to review the current draft doc
08:51:15 priority is to make the tests run...
08:51:43 jose_lausuch: we should keep somewhere what we do with feature projects... and document it in the dev guide
08:52:18 yes
08:52:18 not easy when we are just a proxy for the feature test guys...
08:52:47 the access to labs (but we have the same issue internally as we were not able to grant access to viktor_nokia or juhak on the LF POD) is a problem
08:53:07 we have to keep that in mind for the C release...
08:53:24 Valentin_Orange: welcome, I gave a status on vIMS, maybe you want to add something?
08:53:35 #topic release date
08:53:53 I sent a mail to the TSC last week
08:53:58 for anyone without that access, just contact us and we will try to run what you need
08:54:00 we have access
08:54:02 as you know the release date is the 4th of February
08:54:30 it is always possible to release something...
08:54:43 you have the Ubuntu/ODL way or the Debian way...
08:55:11 first way: release something and provide a second release the day after because it is not working... second way: wait until you have consistent stuff in the release
08:55:12 :)
08:55:37 as we are more industry than community driven, the Debian way is not easy for the board
08:55:41 I'd go for the Debian way then
08:56:06 however as we are releasing only our second release we could be pragmatic and do not have to stick to the initial date
08:56:39 I suggest adding 3 weeks, so we stay in February and do not impact the plugfest
08:56:56 the goal is to finalize the integration of feature projects and get more stability
08:57:46 what is your view on this topic?
08:57:54 morgan_orange: I completely agree
08:58:01 3 weeks would give that stability
08:58:12 and time to finish things like promise/bgpvpn
08:58:39 not sure :) but at least it will give a chance to the feature projects... that were totally unable to do it as they got access to the target labs late
09:00:08 some contributors may be a bit upset (if they worked hard over Xmas time to reach the goal) but as said during the board meeting, anyway the integration env was not ready before mid-January; it would be unfair to exclude the feature projects simply because it was not possible to do better
09:00:23 +1
09:00:42 I did some work on Xmas as well, same as Valentin_Orange... and we are not upset :)
09:01:06 there was a meeting for release C yesterday (but I cannot attend all the meetings...) I think it is important to better understand the integration phase...
09:01:21 of course we will be better in the future as we will have this experience
09:01:21 Ok
09:01:24 so it is 10
09:01:35 any other stuff to share?
09:02:10 nope
09:02:39 ok thanks for attending, enjoy this new testing week... and see you next week for the release (or not)
09:02:46 I think we can close
09:02:48 #endmeeting