08:00:07 <morgan_orange> #startmeeting Functest weekly meeting January 26th
08:00:07 <collabot> Meeting started Tue Jan 26 08:00:07 2016 UTC.  The chair is morgan_orange. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:00:07 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
08:00:07 <collabot> The meeting name has been set to 'functest_weekly_meeting_january_26th'
08:00:17 <morgan_orange> #info Morgan Richomme
08:00:29 <juhak> #info Juha Kosonen
08:00:37 <viktor_nokia> #info Viktor Tikkanen
08:01:27 <morgan_orange> #link https://wiki.opnfv.org/functest_meeting
08:02:07 <morgan_orange> #topic review of action points
08:02:41 <morgan_orange> #info AP1: creation of odl page => done by me
08:02:43 <morgan_orange> #link https://wiki.opnfv.org/odl_brahmaputra_page
08:03:03 <morgan_orange> #info AP2 meimei1 add pexpect lib => done
08:03:36 <morgan_orange> #info AP3 run rally-cert on the target labs => done on fuel and apex
08:03:46 <morgan_orange> #info we will discuss that on topic rally
08:04:03 <morgan_orange> #info AP4 lixiaoguang update onos section in doc => done
08:04:35 <morgan_orange> #info AP5 investigation on userdata => creation of 2 vPing scenarios but still some issues on some scenarios
08:05:01 <morgan_orange> #topic B-Release follow-up
08:05:07 <morgan_orange> #topic vPings
08:05:10 <morgan_orange> #undo
08:05:10 <collabot> Removing item from minutes: <MeetBot.ircmeeting.items.Topic object at 0x1ac81d0>
08:05:19 <morgan_orange> #topic vPings
08:05:40 <morgan_orange> #link https://wiki.opnfv.org/vping_brahmaputra_page
08:06:01 <RAGHAVENDRACHARI> Hi all
08:06:22 <morgan_orange> #info creation of a second vPing scenario (connection to the VM to perform the ping instead of using the userdata/cloudinit mechanism)
08:06:35 <morgan_orange> #info troubleshooting in progress
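A minimal sketch, assuming paramiko, of what this second scenario does: SSH into the first VM through its floating IP and run the ping from there, instead of relying on userdata/cloud-init. The function name, user and password below are placeholders, not the actual Functest code:

    import paramiko

    def ping_from_vm(vm_ip, target_ip, user="cirros", password="cubswin:)"):
        """SSH into the first VM and ping the second one, instead of relying
        on the userdata/cloud-init mechanism."""
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(vm_ip, username=user, password=password, timeout=10)
        _, stdout, _ = client.exec_command("ping -c 1 %s" % target_ip)
        exit_status = stdout.channel.recv_exit_status()  # 0 means the ping succeeded
        client.close()
        return exit_status == 0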
08:06:40 <morgan_orange> Hi RAGHAVENDRACHARI
08:07:02 <morgan_orange> if Jose joins he could give more details
08:07:46 <morgan_orange> #info if vPings fail => vIMS will fail
08:07:53 <morgan_orange> #topic onos
08:08:11 <morgan_orange> #info 11/11 on all the labs (to be checked whether tests run on joid, to update the table)
08:08:24 <morgan_orange> #link https://wiki.opnfv.org/onos_brahmaputra_page
08:08:27 <jose_lausuch> #info Jose Lausuch
08:08:30 <morgan_orange> #info doc updated
08:08:44 <morgan_orange> Hi jose_lausuch additional info on work on vPings versus installers?
08:08:57 <jose_lausuch> yes
08:09:01 <jose_lausuch> well, additional ?
08:09:13 <jose_lausuch> vPing with floating IPs didn't work on fuel yesterday
08:10:19 <morgan_orange> according to https://wiki.opnfv.org/vping_brahmaputra_page (which is not updated in real time...), it works only on joid/odl_l2...
08:10:35 <morgan_orange> maybe it is ok on apex now
08:10:45 <jose_lausuch> I have to check
08:11:26 <jose_lausuch> but the userdata vPing should work on all, otherwise vIMS will fail too
08:11:26 <morgan_orange> same for me; we may use the DB for real-time info
08:11:26 <morgan_orange> http://testresults.opnfv.org/testapi/results?case=vPing&installer=apex
08:11:27 <jose_lausuch> right?
08:11:36 <morgan_orange> seems ok
08:11:55 <morgan_orange> ko on compass http://testresults.opnfv.org/testapi/results?case=vPing&installer=compass
08:12:10 <morgan_orange> no result pushed by fuel
08:12:23 <jose_lausuch> I think it's because both of them time out
08:12:26 <morgan_orange> ok with joid http://testresults.opnfv.org/testapi/results?case=vPing&installer=joid
08:12:37 <jose_lausuch> if timeout, shall we push results?
08:12:45 <morgan_orange> no
08:12:56 <morgan_orange> if no results...no success...
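For real-time checks, a quick sketch of querying the results DB with python-requests; the exact JSON schema returned by the testapi is an assumption here, so the helper just returns the parsed payload for inspection:

    import requests

    def get_results(case="vPing", installer="apex"):
        url = "http://testresults.opnfv.org/testapi/results"
        resp = requests.get(url, params={"case": case, "installer": installer})
        resp.raise_for_status()
        # the schema of the payload is an assumption; inspect it before graphing
        return resp.json()

    print(get_results(installer="compass"))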
08:13:05 <morgan_orange> #topic tempest
08:13:05 <jose_lausuch> ok
08:13:20 <morgan_orange> #link https://wiki.opnfv.org/tempest_brahmaputra_page
08:13:53 <morgan_orange> Compass scenario looks pretty stable
08:14:10 <morgan_orange> last run on onos => -50% on Apex and Fuel...
08:14:31 <morgan_orange> and an interesting test case: joid/odl_l2 run on 3 labs giving... 3 different results
08:14:52 <morgan_orange> Intel POD5 => 14% success, Intel POD6 => 34% and Orange POD2 => 91%...
08:15:06 <jose_lausuch> so different
08:15:13 <jose_lausuch> then we have to analyze the errors
08:15:16 <morgan_orange> I received a question by mail this morning on why we are focusing on bare metal and not including virtual dev
08:15:20 <jose_lausuch> maybe some external connectivity or something
08:16:08 <viktor_nokia> I think there are two major issues with tempest cases
08:16:59 <viktor_nokia> 1. ODL instability. As long as network creation will not succeed reliably, many test cases will fail because network creation is part of their test setup.
08:17:24 <viktor_nokia> 2. Suspected problem with worker threads. It is under investigation.
08:17:38 <viktor_nokia> It seems to affect number of executed test cases.
08:17:49 <jose_lausuch> viktor_nokia: if the problem is in rally...would it make sense to call tempest with testr?
08:18:22 <viktor_nokia> Basically the number of workers seems to be the same as the number of CPU cores (virtual or physical) in the test server
08:18:29 <morgan_orange> #info 2 major issues with tempest cases
08:18:35 <morgan_orange> #info 1. ODL instability. As long as network creation will not succeed reliably, many test cases will fail because network creation is part of their test setup.
08:18:43 <morgan_orange> #info 2. Suspected problem with worker threads. It is under investigation.
08:19:15 <viktor_nokia> jose_lausuch: I'm not sure; we should handle tempest.conf somehow.
08:19:43 <morgan_orange> could it be interesting to reference the jumphost config to see the influence on the workers?
08:20:53 <morgan_orange> let's see the answer on the workers thread, and get more runs to get feedback
08:20:55 <jose_lausuch> yes, that would be an initial point to investigate
08:21:05 <jose_lausuch> 16 CPUs in LF POD
08:21:13 <morgan_orange> Results on Huawei US look stable whatever the scenario
08:21:31 <morgan_orange> May-meimei: could you check the number of CPUs on the Huawei-US jumphost?
08:21:59 <jose_lausuch> maybe with threading, 32 vCPUs...
08:22:01 <morgan_orange> #action jose_lausuch morgan_orange May-meimei check Jumphost HW conf (see influence on Rally workers)
08:22:08 <viktor_nokia> Is it 32 or something else?
08:22:34 <jose_lausuch> cat /proc/cpuinfo
08:22:39 <jose_lausuch> that shows me 16
08:22:56 <morgan_orange> ok let's do it offline
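For reference, a trivial sketch of the suspicion above (assumption: the default number of tempest workers simply follows the logical CPU count of the host running the tests):

    import multiprocessing

    # Logical CPUs visible to the process; the default worker count is
    # suspected to follow this value.
    print("logical CPUs: %d" % multiprocessing.cpu_count())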
08:23:09 <morgan_orange> #topic rally
08:23:33 <morgan_orange> ready to move to rally-cert?  test done on my side seems OK,
08:23:59 <morgan_orange> I think the error faced by May-meimei was the same as the one I got (missing test volume) but as it is created in the script it should work
08:24:06 <morgan_orange> so I would suggest to move to this config
08:24:19 <morgan_orange> and make the possible adaptations if needed
08:24:27 <morgan_orange> are you ok?
08:24:33 <jose_lausuch> what config?
08:24:36 <jose_lausuch> rally-cert
08:24:40 <morgan_orange> yes
08:24:43 <jose_lausuch> what is the exact error with volume?
08:25:28 <juhak> I'll move volume type creation to run_rally-cert.py, ok?
08:25:42 <jose_lausuch> ah, the volume type thingy
08:25:44 <jose_lausuch> juhak: ok
08:26:07 <morgan_orange> juhak: yes it would be easier
08:26:09 <morgan_orange> cinder type-create volume-test #provisional
08:26:20 <morgan_orange> instead of doing in the CI
08:26:49 <morgan_orange> just move the creation in the run_rally-cert.py, then in run_test.sh call run_rally-cert.py instead of run_rally.py
08:27:10 <jose_lausuch> juhak: that is already created in run_Tests.sh
08:27:11 <juhak> yep, will do
08:27:15 <jose_lausuch> we dont need to do it
08:27:29 <jose_lausuch> its before the call to rally or rally-cert
08:27:37 <jose_lausuch> it isn't done in run_rally.py..
08:27:41 <jose_lausuch> so no need to do that
08:28:21 <morgan_orange> jose_lausuch: yes, the question was to do it only when it is necessary; would it not make more sense to do it in run_rally-cert.py instead of run_test.sh...
08:28:42 <jose_lausuch> that is another thing
08:28:47 <jose_lausuch> but not the cause of the problem
08:28:59 <jose_lausuch> with rally-cert, the volume type would be there anyway
08:29:17 <morgan_orange> yes
08:29:26 <jose_lausuch> but sure, lets put it inside the rally python script
08:29:53 <morgan_orange> #action juhak short refactoring to manage the volume type creation for rally-cert only
08:30:03 <morgan_orange> #info move to rally-cert scenario in CI
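A rough sketch of what moving the volume type creation into run_rally-cert.py could look like, using python-cinderclient; the credential handling and API version below are assumptions, not the actual patch:

    import os
    from cinderclient import client as cinder_client

    def create_volume_type_if_missing(name="volume-test"):
        cinder = cinder_client.Client(
            "2",
            os.environ["OS_USERNAME"],
            os.environ["OS_PASSWORD"],
            os.environ["OS_TENANT_NAME"],
            os.environ["OS_AUTH_URL"])
        # equivalent of "cinder type-create volume-test", but only when missing
        if not any(t.name == name for t in cinder.volume_types.list()):
            cinder.volume_types.create(name)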
08:30:45 <morgan_orange> for Rally, as far as I can read the jenkins logs, tests look OK; I saw some errors on apex, but vIMS also faced these errors as far as I know (quota, cinder)
08:30:56 <morgan_orange> but globally the results look fine
08:31:09 <morgan_orange> we do not have a KPI like the global % of tests we have for tempest
08:31:23 <morgan_orange> not sure it would be easy to create...
08:32:06 <morgan_orange> unlike tempest, we pushed the raw JSON into the DB and would not be able to graph anything from the Rally results
08:32:16 <morgan_orange> would it make sense to do an average?
08:32:28 <jose_lausuch> good point
08:32:39 <jose_lausuch> why not?
08:32:43 <jose_lausuch> then we have some numbers at least
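If we go for an average, a possible sketch over the JSON produced by `rally task results` (the structure assumed here is a list of scenario entries, each with a "result" list of iterations whose "error" field is empty on success):

    import json
    import sys

    def rally_success_rate(results_file):
        """Average success percentage across all scenarios of a rally task."""
        with open(results_file) as f:
            scenarios = json.load(f)
        rates = []
        for scenario in scenarios:
            iterations = scenario.get("result", [])
            if not iterations:
                continue
            ok = sum(1 for it in iterations if not it.get("error"))
            rates.append(100.0 * ok / len(iterations))
        return sum(rates) / len(rates) if rates else 0.0

    if __name__ == "__main__":
        print("%.1f%%" % rally_success_rate(sys.argv[1]))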
08:32:47 <May-meimei> morgan_orange: we have 24 CPUs in the jumpserver
08:33:00 <morgan_orange> thanks May-meimei
08:33:00 <jose_lausuch> May-meimei: with/without HT?
08:33:48 <morgan_orange> juhak: what is your view on Rally KPI? a global one, or 1 per module?
08:34:04 <May-meimei> HT is hard thread?
08:34:14 <morgan_orange> Hyper Threading
08:34:14 <jose_lausuch> May-meimei: you can check with cat /proc/cpuinfo
08:34:19 <jose_lausuch> if they have the HT flag in there
08:34:23 <juhak> maybe both?
08:34:26 <jose_lausuch> if they have, then its the double
08:34:37 <jose_lausuch> 42 vCPUs
08:34:41 <jose_lausuch> sorry, 48
08:35:16 <jose_lausuch> viktor_nokia: however, why does the number of CPUs in the server you run the tests on matter?
08:35:35 <morgan_orange> juhak: yes, not critical (regarding the roadmap) but it could be interesting to exchange with the Rally community
08:35:38 <viktor_nokia> I don't know :(
08:35:42 <jose_lausuch> we are running against a SUT... why does the jumphost have an effect?
08:35:44 <jose_lausuch> ok
08:35:59 <May-meimei> jose_lausuch: no HT flag in our /proc/cpuinfo
08:36:27 <morgan_orange> #topic ONOS
08:36:31 <juhak> morgan_orange: yes, I agree
08:36:31 <jose_lausuch> May-meimei: grep ht /proc/cpuinfo|head -1
08:36:40 <jose_lausuch> May-meimei: does that command return anything?
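To settle the HT question offline, a small, purely illustrative sketch that compares logical processors with unique physical cores in /proc/cpuinfo (fields may differ on VMs):

    def count_cpus(path="/proc/cpuinfo"):
        logical, cores, phys_id = 0, set(), None
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(":")
                key = key.strip()
                if key == "processor":
                    logical += 1
                elif key == "physical id":
                    phys_id = value.strip()
                elif key == "core id":
                    cores.add((phys_id, value.strip()))
        return logical, len(cores)

    logical, physical = count_cpus()
    print("logical: %d, physical cores: %d, HT likely: %s"
          % (logical, physical, logical > physical))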
08:36:42 <morgan_orange> #info tests are running fine, doc updated...
08:36:45 <viktor_nokia> because workers are started in the server/VM where tempest/Rally is running
08:36:50 <morgan_orange> congratulations to lixiaoguang
08:37:05 <morgan_orange> #topic ODL
08:37:21 <morgan_orange> #info tests run on apex/compass/fuel
08:37:47 <morgan_orange> #info problem on joid, probably due to the fact that the ODL test scenario uses the same IP for the Neutron and Keystone APIs, which is not the case on joid
08:37:59 <morgan_orange> #info case to be updated to take this specificity into account
08:38:19 <morgan_orange> #info feedback from redhat on the fact that the tests may be deprecated
08:38:31 <morgan_orange> #info discussions in progress with ODL community
08:38:50 <morgan_orange> but so far we have 18/18 or 15/18 and we shall be able to adapt the suite to joid
08:38:58 <morgan_orange> #action morgan_orange adapt ODL suite to joid
08:39:30 <morgan_orange> #action jose_lausuch check if tests are deprecated and if we should point to new ODL Robot scenario
08:39:40 <morgan_orange> jose_lausuch: Ok for ODL?
08:39:46 <jose_lausuch> morgan_orange: yes
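A possible way to adapt the suite, sketched with python-keystoneclient: derive the Neutron endpoint from the Keystone service catalog instead of reusing the Keystone IP. This is an illustration of the idea, not the actual Functest change:

    import os
    from keystoneclient.v2_0 import client as ks_client

    keystone = ks_client.Client(
        username=os.environ["OS_USERNAME"],
        password=os.environ["OS_PASSWORD"],
        tenant_name=os.environ["OS_TENANT_NAME"],
        auth_url=os.environ["OS_AUTH_URL"])

    # On joid, Neutron and Keystone do not share the same IP, so look the
    # Neutron endpoint up in the catalog rather than reusing the auth host.
    neutron_url = keystone.service_catalog.url_for(
        service_type="network", endpoint_type="publicURL")
    print(neutron_url)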
08:39:51 <morgan_orange> #topic vIMS
08:40:44 <morgan_orange> #info the scenario used to work with nosdn but no scenario is up & running with a controller
08:41:26 <morgan_orange> vIMS is complex and needs external connectivity, so for me it is relevant to keep it in CI... as it shows whether the solution out of the box is usable for deploying a complex VNF
08:41:41 <morgan_orange> however we can see that physical config of the lab may lead to issues
08:42:08 <morgan_orange> but as it works with pure Neutron and as the VNF does not need anything special from an SDN point of view...
08:42:43 <morgan_orange> #info troubleshooting in progress, currently no ODL or ONOS scenario is OK and we may have some restrictions
08:42:51 <morgan_orange> #info prerequisites for vIMS = vPing...
08:43:06 <morgan_orange> so if vPing is not working...vIMS will not work...
08:43:16 <morgan_orange> #topic Ovno
08:43:23 <jose_lausuch> morgan_orange: can we set the restriction that vIMS runs only on nosdn scenarios?
08:44:00 <morgan_orange> jose_lausuch: I would say no..as the controller is not supposed to have an influence
08:44:19 <morgan_orange> but if we cannot deploy once, we will have to document it
08:44:29 <morgan_orange> and say that vIMS works only on nosdn scenario
08:44:37 <morgan_orange> fuel/odl_l2 worked once..
08:44:42 <jose_lausuch> ok
08:44:54 <morgan_orange> let's keep some time for troubleshooting...
08:45:17 <morgan_orange> officially release is still planned next week but...(keep this topic for last topic)
08:45:36 <morgan_orange> #info no OCL lab ready to test; waiting for patch correction on the ovno side
08:45:40 <morgan_orange> #topic promise
08:45:47 <morgan_orange> #info integration tests started
08:45:59 <morgan_orange> #info promise in the CI loop
08:46:24 <morgan_orange> #info so far issues due to dependencies; the node version seems to be old on Ubuntu 14.04...
08:46:30 <morgan_orange> #info investigation on promise side
08:46:46 <morgan_orange> #topic SDNVPN
08:46:52 <jose_lausuch> good summary :)
08:47:04 <jose_lausuch> I saw that it was merged yesterday
08:47:05 <morgan_orange> #info patch submitted yesterday to integrate tests on CI
08:47:19 <jose_lausuch> and fdegir has just enabled that scenario for fuel jjobs
08:47:24 <morgan_orange> #info patch fixed by May-meimei: an else was missing in the if condition of run_test.sh
08:47:44 <morgan_orange> #info scenario available in CI fuel/bgpvpn... wait and see... and debug
08:47:57 <morgan_orange> any other test case missing?
08:48:26 <morgan_orange> Doctor scenario soon ready on apex I think, so could be added in CI if needed
08:48:43 <jose_lausuch> ovno?
08:48:53 <morgan_orange> done above :)
08:48:59 <jose_lausuch> ah yep
08:49:16 <morgan_orange> #topic doc
08:49:20 <jose_lausuch> waiting for patch correction, yes, I had to give a -1 :D
08:50:11 <morgan_orange> #info update in progress but my wish to please doc8 led to a regression (thanks viktor_nokia for reporting)
08:50:22 <morgan_orange> so I will try to amend to please everybody...
08:50:36 <morgan_orange> difficult to get something stable until the tests are stable...
08:50:55 <morgan_orange> #info request to new contributors to review the current draft doc
08:51:15 <morgan_orange> priority is to make the tests run...
08:51:43 <morgan_orange> jose_lausuch: we should keep a record somewhere of what we do with feature projects... and document it in the dev guide
08:52:18 <jose_lausuch> yes
08:52:18 <morgan_orange> not easy when we are just a proxy for the feature test guys...
08:52:47 <morgan_orange> access to labs is a problem (we have the same issue internally, as we were not able to grant viktor_nokia or juhak access to the LF POD)
08:53:07 <morgan_orange> we have to keep that in mind for C release...
08:53:24 <morgan_orange> Valentin_Orange: welcome I gave a status on vIMS, maybe you want to add something?
08:53:35 <morgan_orange> #topic release date
08:53:53 <morgan_orange> I sent a mail to TSC last week
08:53:58 <jose_lausuch> for anyone without that access, just contact us and we will try to run what you need
08:54:00 <jose_lausuch> we have access
08:54:02 <morgan_orange> as you know release date is 4th of February
08:54:30 <morgan_orange> it is always possible to release something...
08:54:43 <morgan_orange> you have the Ubuntu/ODL way or the Debian way...
08:55:11 <morgan_orange> first way: release something and provide a second release the day after because it is not working... second way: wait until you have consistent stuff in the release
08:55:12 <jose_lausuch> :)
08:55:37 <morgan_orange> as we are more industry- than community-driven, the Debian way is not easy for the board
08:55:41 <jose_lausuch> I'd go for debian way then
08:56:06 <morgan_orange> however, as we are releasing only our second release, we could be pragmatic and not stick to the initial date
08:56:39 <morgan_orange> I suggest adding 3 weeks, so we stay in February and do not impact the plugfest
08:56:56 <morgan_orange> the goal is to finalize the integration of feature projects and get more stability
08:57:46 <morgan_orange> what is your view on this topic?
08:57:54 <jose_lausuch> morgan_orange: I completely agree
08:58:01 <jose_lausuch> 3 weeks would give that stability
08:58:12 <jose_lausuch> and time to finish things like promise/bgpvpn
08:58:39 <morgan_orange> not sure :) but at least it will give a chance to feature projects... that were totally unable to do it as they got access to the target labs late
09:00:08 <morgan_orange> some contributors may be a bit upset (if they worked hard over Xmas time to reach the goal) but as said during the board meeting, the integration env was not ready before mid-January anyway; it would be unfair to exclude the feature projects simply because it was not possible to do better
09:00:23 <jose_lausuch> +1
09:00:42 <jose_lausuch> I did some work on Xmas as well, same as Valentin_Orange... and we are not upset :)
09:01:06 <morgan_orange> there was a meeting for release C yesterday (but I cannot attend all the meetings...); I think it is important to better understand the integration phase...
09:01:21 <morgan_orange> of course we will be better in the future as we will have this experience
09:01:21 <morgan_orange> Ok
09:01:24 <morgan_orange> so it is 10
09:01:35 <morgan_orange> any other stuff to share?
09:02:10 <jose_lausuch> nope
09:02:39 <morgan_orange> ok thanks for attending, enjoy this new testing week...and see you next week for the release (or not)
09:02:46 <jose_lausuch> I think we can close
09:02:48 <morgan_orange> #endmeeting