#opnfv-release log

15:00:45 <dmcbride> #startmeeting OPNFV Colorado Release daily
15:00:45 <collabot> Meeting started Thu Sep 15 15:00:45 2016 UTC.  The chair is dmcbride. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:45 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:00:45 <collabot> The meeting name has been set to 'opnfv_colorado_release_daily'
15:00:50 <dmcbride> #topic roll call
15:00:53 <ulik> #info Uli Kleber
15:00:57 <dmcbride> #info David McBride
15:01:10 <dali`> #info Dan Lilliehorn (ARMBand/Enea)
15:01:34 <bryan_att> #info Bryan Sullivan
15:02:55 <jmorgan1> #info Jack Morgan (Pharos)
15:03:15 <dmcbride> can someone post the link for yardstick?
15:04:40 <dmcbride> nevermind, I have it
15:05:19 <dmcbride> #topic functest and yardstick
15:05:46 <dmcbride> do we have anyone from the test team available?
15:06:53 <dmcbride> haha - I see that Jose has been playing with javascript
15:07:09 <dmcbride> we now have fancy dials instead of red and green balls
15:07:38 <dmcbride> http://testresults.opnfv.org/reporting/functest/release/colorado/index-status-apex.html
15:08:16 <dmcbride> #info functest looks good, but we still seem to be having problems with yardstick
15:08:51 <dmcbride> dali`:  thanks for joining
15:08:56 <rprakash> #info rprakash
15:09:20 <dmcbride> dali`:  emails I've seen from you and Bob are encouraging
15:09:29 <dmcbride> dali`: any issues you'd like to raise?
15:10:02 <dali`> Not really. Things are good. We had successful runs of both functest and yardstick in the last hour, 100% on functest and complete suite running on yardstick.
15:10:24 <dmcbride> dali`: that's great.  Good news.
15:10:41 <dali`> We see spurious issues with CI from time to time (network issues, problems with floating IPs, tempest_smoke_serial issues) but all of these happen on other PODs as well.
15:12:35 <dali`> Our plan is to send you and Bob a more detailed status summary tonight, with details on scenarios etc.
15:12:48 <dmcbride> I believe that Jose will be joining us
15:13:13 <jose_lausuch> sorry, I thought it was cancelled today
15:13:41 <jmorgan1> jose_lausuch: no problem, we were just talking about you ;)
15:13:52 <jose_lausuch> I hope not bad things :)
15:14:10 <jmorgan1> jose_lausuch: David likes the work you've done on javascript
15:14:28 <jose_lausuch> which work? the gauges?
15:14:33 <dmcbride> I like the dials, that's cool
15:14:56 <jose_lausuch> :)
15:15:03 <dmcbride> jose_lausuch: we still seem to be having issues with yardstick, though
15:15:11 <jose_lausuch> I think it helps to avoid confusion, before, the "red" icon was too red...
15:16:03 <dmcbride> console log indicates soft errors (SLA exceeded)
15:16:03 <dmcbride> Not clear on what yardstick is doing or if their SLAs are reasonable given our set up.
15:16:19 <dmcbride> I received this message from Greg E this morning:
15:16:30 <jose_lausuch> aha
15:16:36 <jose_lausuch> is kubi1 here?
15:16:45 <dmcbride> kubi1: ping
15:16:47 <jose_lausuch> We removed the Doctor testcase in yardstick btw
15:16:53 <jose_lausuch> so, a lot of scenarios will turn blue
15:16:55 <jose_lausuch> or green
15:17:00 <jose_lausuch> blue in jenkins
15:17:10 <jose_lausuch> can you point to the yardstick failures?
15:17:37 <dmcbride> unfortunately, no
15:18:00 <dmcbride> I was hoping Greg could join us to clarify, but I don't think he's online
15:18:19 <ulik> kubi is not here due to holiday in China
15:18:33 <jose_lausuch> ulik: ah ok, thanks
15:19:34 <jose_lausuch> I'm looking at http://testresults.opnfv.org/reporting/yardstick/release/colorado/index-status-fuel.html   and https://build.opnfv.org/ci/view/yardstick/job/yardstick-fuel-baremetal-daily-colorado/
15:19:52 <jose_lausuch> mixture of blue/red
15:21:09 <jose_lausuch> I can talk only about the sfc scenario with odl_l2
15:21:18 <bryan_att> dmcbride: when you get to an open spot, I have a question about https://build.opnfv.org/ci/ vs http://testresults.opnfv.org/reporting/functest/release/colorado/index-status-apex.html
15:21:18 <jose_lausuch> I think it will turn green soon
15:21:24 <jose_lausuch> but can't tell much about the others
15:21:52 <dmcbride> jose_lausuch:  Greg E just joined us
15:22:23 <dmcbride> Greg_E_: jose_lausuch was asking for a pointer to the yardstick failures you mentioned in your email today
15:23:16 <dmcbride> in the meantime, bryan_att had something he wanted to bring up
15:23:20 <dmcbride> bryan_att: go ahead
15:23:35 <bryan_att> Per jenkins, the last successful Apex build was 14 days ago, yet the testresults page shows everything for functest is running fine. what is the testresults page based on?
15:23:36 <Greg_E_> https://build.opnfv.org/ci/view/fuel/job/yardstick-fuel-baremetal-daily-colorado/
15:24:09 <bryan_att> How can functest be running fine if Apex is not working at all? Or am I misinterpreting the jenkins page?
15:24:12 <dmcbride> #link  https://build.opnfv.org/ci/view/fuel/job/yardstick-fuel-baremetal-daily-colorado/
15:24:21 <Greg_E_> nosdn-ovs, nosdn-kvm, l2-sfc
15:24:29 <Greg_E_> etc are failing
15:24:33 <jose_lausuch> Greg_E_: most of the failures I see are due to floating ips issue (not reachable)
15:24:39 <Greg_E_> issue seems to be timing
15:24:47 <jose_lausuch> odl_l2_sfc will turn blue soon, since we use Boron RC3.5
15:24:53 <dmcbride> #info ^ pointer to yardstick failures noticed by Greg_E_
15:25:08 <Greg_E_> I thought that floats are not really supported in most scenarios
15:25:15 <bryan_att> Also, the testresults page shows sunny for copper in Apex HA scenarios - but that has been disabled and didn't work before it was. So where is the sunny indicator coming from?
15:25:33 <jose_lausuch> then the scenario owners should take responsibility and decide what test cases to run
15:25:52 <jose_lausuch> if floating ips are not supported, yardstick will fail, since it's based on ssh-ing into a VM through a floatip
15:26:12 <jose_lausuch> bryan_att: I have an explanation for you :)
15:26:27 <jose_lausuch> but I'd like to touch 1 topic at a time
15:26:41 <bryan_att> OK, I thought it was my turn - let me know when
15:26:56 <dmcbride> sorry - that was my fault
15:27:01 <jose_lausuch> I think there are no turns
15:27:09 <bryan_att> we need a queue
15:27:16 <dmcbride> ok - let's start with Greg_E_'s issue
15:27:30 <bryan_att> pretty standard IRC meeting logistics - someone has to manage the speaker/topic queue
15:27:48 <jose_lausuch> all the yardstick 'reds' that I see are showing the same error:ssh.py:256 DEBUG Ssh is still unavailable: SSHError("Exception <class 'paramiko.ssh_exception.NoValidConnectionsError'> was raised during connect. Exception value is: NoValidConnectionsError(None, 'Unable to connect to port 22 on  or 10.118.101.195'
15:27:48 <dmcbride> bryan_att: yes - that would be me and I messed it up
15:27:52 <jose_lausuch> floating ip issues
15:28:08 <jose_lausuch> dmcbride: can we info that?
15:28:18 <dmcbride> everyone can info
15:28:20 <jose_lausuch> Greg_E_: do you agree?
15:28:39 <Greg_E_> if these are floating ip tests
15:28:55 <Greg_E_> what can scenario owners do to blacklist them from running
15:28:58 <dmcbride> #info jose_lausuch says:  all the yardstick 'reds' that I see are showing the same error:ssh.py:256 DEBUG Ssh is still unavailable: SSHError("Exception <class 'paramiko.ssh_exception.NoValidConnectionsError'> was raised during connect. Exception value is: NoValidConnectionsError(None, 'Unable to connect to port 22 on  or 10.118.101.195'
15:29:19 <jose_lausuch> Greg_E_: we did that for bgpvpn test case, where floating ips are not supported
15:29:35 <Greg_E_> do you need to do it
15:29:37 <jose_lausuch> but maybe no one is taking a look at the jenkins logs
15:29:46 <Greg_E_> or is it something that we can do ourselves
15:29:49 <dali`> As I mentioned before, we also see these issues which cause 2/3 of all functest/yardstick runs to fail for ARMBand.
15:29:53 <jose_lausuch> Greg_E_: yes, you need to specify which test cases to run for your scenario
15:29:57 <jose_lausuch> and change a yaml file in yardstick
15:30:03 <Greg_E_> ok
15:30:08 <bryan_att> be aware that the SSH failure may be a timing issue; ssh does not go active immediately; I have had to put delays or loops in my tests to avoid false negatives
15:30:10 <Greg_E_> I’ll let the team know
15:30:14 <dmcbride> #info Greg_E_ asks:  if these are floating ip tests, what can scenario owners do to blacklist them from running
15:30:45 <bryan_att> but floating IPs work fine in Apex and JOID
15:30:59 <jose_lausuch> bryan_att: not in all the scenarios
15:31:01 <bryan_att> with SSH or any other open port
15:31:09 <jose_lausuch> I think odl_l3 also had some issues
15:31:14 <jose_lausuch> or even onos sometimes, I dont remember
15:31:35 <jose_lausuch> but I think that is the responsibility of the scenario owners
15:31:41 <jose_lausuch> to have a look at and take action
15:31:47 <jose_lausuch> not the fuel/yardstick team
15:31:56 <jose_lausuch> we can just report
15:32:19 <dmcbride> jose_lausuch: so, this is a matter of scenario owners selecting the correct set of tests?
15:32:55 <jose_lausuch> dmcbride: yes and no, for the scenarios where floatip is not supported, the default yardstick test cases will fail
15:33:09 <jose_lausuch> but for those where floatips is supported, then it might be another thing
15:33:32 <dmcbride> Greg_E_:  does that give you the information you need?
15:34:24 <jose_lausuch> I see for kvm scenario another type of error: error: failed to deploy stack: 'ERROR: Authentication failed: Authentication required'
15:34:29 <dmcbride> dali`:  is this helpful? Have you customized the test selection in yardstick, or are you just using the defaults?
15:34:31 <jose_lausuch> so its a different issue
15:34:35 <Greg_E_> can we get somebody from yardstick team to look at the current failures and let us know the reason
15:34:44 <jose_lausuch> Greg_E_: that would be ideal
15:34:50 <Greg_E_> we are using defaults
15:35:05 <Greg_E_> nobody knows enough about yardstick to customize
15:35:11 <dali`> We use the default test selections for the scenarios. So this is kind of helpful, we won't chase that issue anymore. We just have to run many test runs, sometimes they pass.
15:35:24 <jose_lausuch> Greg_E_: I had to ask them directly to customize bgpvpn scenario tests
15:35:33 <dmcbride> dali`: that doesn't sound good
15:35:51 <Greg_E_> who can I put in touch with fuel eng team to handle this further
15:36:04 <jose_lausuch> kubi
15:36:06 <jose_lausuch> the PTL
15:36:12 <jose_lausuch> but he is OoO
15:36:31 <Greg_E_> anybody else who is working can help?
15:36:55 <Greg_E_> did he appoint a deputy while he is OOO
15:36:55 <jose_lausuch> I can help to guide you customizing your test cases
15:37:00 <Greg_E_> ok
15:37:03 <Greg_E_> that is great
15:37:10 <Greg_E_> can I have your email address
15:37:21 <jose_lausuch> jose.lausuch@ericsson.com
15:37:27 <Greg_E_> thanks
15:37:31 <jose_lausuch> np
15:37:54 <dmcbride> jorgen.w.karlsson@ericsson.com
15:37:54 <dmcbride> houjingwen@huawei.com
15:37:54 <dmcbride> wenjing_chu@dell.com
15:37:56 <dmcbride> liangqi1@huawei.com
15:37:58 <jose_lausuch> in functest we are excluding the tests with floatingips for bgpvpn scenarios and odl_l3
15:37:58 <dmcbride> jean.gaoliang@huawei.com
15:38:00 <dmcbride> vincenzo.m.riccobene@intel.com
15:38:02 <dmcbride> here's the list of committers for yardstick:
15:38:11 <jose_lausuch> jorgen is no longer working on yardstick
15:39:11 <dmcbride> ok - we are past time, but I'm happy to continue if participants are willing
15:39:22 <dmcbride> bryan_att: let's switch to your issue
15:39:36 <Greg_E_> is yardstick mandatory for the C1 release?
15:39:49 <bryan_att> ok, just looking for clarifications as noted earlier in the channel
15:40:06 <jose_lausuch> Kubi (PTL) = jean.gaoliang@huawei.com
15:40:16 <jose_lausuch> limingjiang <limingjiang@huawei.com> is also very helpful
15:40:43 <dmcbride> Greg_E_: the philosophy for all tests for Colorado is that the test results are guidelines, but scenario owners determine whether they will release, or not
15:40:54 <Greg_E_> ok
15:41:11 <dmcbride> Greg_E_: however, scenario owners need to provide justification and explanation for failures in the release notes
15:41:24 <Greg_E_> understood
15:41:28 <dmcbride> Greg_E_: see Sofia's recent email on release notes
15:41:35 <Greg_E_> k
15:41:39 <jose_lausuch> bryan_att: answering your first question, it does not show when the scenario is broken if we do not report errors. e.g. error in healthcheck => stop CI => no error pushed to DB, so reporting is based on the last successful results
15:41:58 <jose_lausuch> that is something to be improved for sure
15:42:27 <jose_lausuch> so if a given scenario is not run during the past days, it will still count the previous runs
15:42:46 <jose_lausuch> if they are successful, then you'll have positive numbers there
15:42:48 <bryan_att> even if they were months ago?
15:42:57 <jose_lausuch> we need to show the "current" reality
15:43:08 <bryan_att> I guess you are talking about the testresults page, right?
15:43:08 <jose_lausuch> i think there is some delay or something, I'm not sure
15:43:16 <dmcbride> jose_lausuch: doesn't it use a 50-day window?
15:43:22 <bryan_att> why it shows sunny when it's not really
15:43:26 <jose_lausuch> I can ask Morgan and Serena, who implemented the dashboard
15:43:29 <dmcbride> only uses results from the past 50 days?
15:43:30 <jose_lausuch> yes
15:43:36 <jose_lausuch> I think its something like that
15:43:42 <jose_lausuch> dont remember if its 50 or whatever days
15:43:50 <bryan_att> but nearing release, the last *week* is the important timeframe
15:44:01 <jose_lausuch> bryan_att: we need to remove copper from apex, that has to be manual and we forgot
15:44:28 <bryan_att> Apex HA only
15:44:34 <jose_lausuch> but during 1 week, the scenarios are not run 4 times, there is no time to run all of them 4 times
15:44:35 <bryan_att> Non-HA is fine.
15:44:39 <jose_lausuch> ok
15:44:43 <jose_lausuch> I'll take a note
15:44:52 <bryan_att> ok, two weeks. but not 50 days.
15:45:04 <jose_lausuch> ok, I will propose it
15:45:12 <bryan_att> The scenarios that are planned are shown on the release wiki page
15:45:24 <bryan_att> which outlines the scenarios per project
15:45:39 <bryan_att> I took HA off that table when the issues were found
15:45:59 <bryan_att> Note also that in some HA scenarios, the Congress service is not being installed in HA mode and there, it works
15:46:10 <bryan_att> it just does not work behind HA Proxy
15:46:29 <bryan_att> So for example, in JOID, it is working in the HA scenario
15:46:45 <bryan_att> But I took it off the expected list just so there is no confusion (or less)
15:46:50 <bryan_att> that's all for me
15:47:15 <bryan_att> https://wiki.opnfv.org/display/SWREL/Colorado+scenario+inventory+and+dependencies
15:47:35 <dmcbride> ok - good discussion
15:47:47 <dmcbride> we're about 15 minutes over
15:48:08 <dmcbride> does anyone have anything else that's urgent that they would like to bring up?
15:48:31 <jose_lausuch> ok
15:48:39 <jose_lausuch> bryan_att: noted down, will talk to the team
15:48:53 <jose_lausuch> nope, I need to leave now
15:49:00 <dmcbride> ok - same time, same place on Friday
15:49:26 <dmcbride> #endmeeting