15:00:55 #startmeeting Testing working group weekly meeting 7/9
15:00:55 Meeting started Thu Sep 7 15:00:55 2017 UTC. The chair is morgan_orange. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:55 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:00:55 The meeting name has been set to 'testing_working_group_weekly_meeting_7_9'
15:01:12 #topic roll call
15:01:15 has the bridge opened?
15:01:25 #info Maryam Tahhan
15:01:43 gabriel_yuyang: can you open it, still some rights issue
15:01:49 #info Morgan Richomme
15:01:55 #info Alec Hothan
15:02:30 morgan_orange: ok
15:02:39 #info Trevor Cooper
15:02:56 I will open it... it's the same bridge as barometer
15:03:30 #topic action point follow-up
15:03:36 #info AP1: gabriel_yuyang collect names for intel18 access
15:03:42 #info done, JIRA to be created soon
15:03:49 #info AP2 mbeierl create wiki page on docker creation for arm
15:04:05 #chair trevor_intel gabriel_yuyang
15:04:05 Current chairs: gabriel_yuyang morgan_orange trevor_intel
15:04:12 #info Mark Beierl
15:04:14 #info AP3 testing PTL: provide feedback to Alec
15:04:21 morgan_orange: shoot. I need to get working on that
15:04:22 #info mail discussion initiated
15:04:32 #info AP4 review https://etherpad.openstack.org/p/etsi-nfv-openstack-gathering-denver & https://etherpad.openstack.org/p/qa-queens-ptg
15:04:36 #info AP5 morgan_orange plan a topic on Testing group contribution to PTG next week
15:04:41 #info done see next section
15:04:47 #info AP6 mbeierl share the mail new testing features for Euphrates
15:04:55 #info done
15:05:27 #topic Barometer
15:06:03 #link https://wiki.opnfv.org/display/fastpath
15:06:42 #info Barometer = OPNFV telemetry project
15:07:03 #info Maryam Tahhan presents overview of Barometer
15:07:51 #info scope NFVI + Hypervisor
15:08:30 #info Email from Amar on Euphrates release
15:08:31 #link https://lists.opnfv.org/pipermail/opnfv-tech-discuss/2017-August/017698.html
15:10:43 #info collectd = system stats collection daemon
15:10:55 #info barometer based on collectd (10 years, stable, widely adopted by industry, modular, ..)
15:14:10 #link https://wiki.opnfv.org/display/fastpath/Collectd+Metrics+and+Events
15:19:36 #info More than 90 plugins cover many interfaces
15:26:22 #info Barometer only tested with Apex today ... due to resourcing for Euphrates
15:27:38 #info Morgan proposes to use Barometer for long duration tests
15:28:14 morgan_orange: where should I put the page on multiarch docker? Under the test working group, or ...?
15:28:39 mbeierl: I would suggest a page under testing/Euphrates ?
15:28:49 morgan_orange: ok, thanks.
15:30:36 #info Barometer dockerization is WIP (collectd daemon, InfluxDB with Grafana)
15:32:27 #info question on prometheus => integration with collectd relatively easy
15:33:53 #info Prometheus uses pull model ... collectd typically uses push
15:35:31 #info no clustering support in prometheus (local storage)
15:36:54 mtahhan: I'd like to touch base later about having Barometer run and collect host metrics while StorPerf is running to show how Ceph is behaving on the host :)
15:37:20 mtahhan: @mentioned you in the StorPerf F release planning page so I remember to do that :)
15:38:51 bryan_att: ONAP's telemetry project is Barometer?
15:39:04 mbeierl: sure thing... no stress :D
15:41:01 I need to drop now, thanks everyone!
15:44:23 #topic Euphrates Documentation
15:54:26 #topic OpenStack PTG review: OPNFV testing group proposal: https://etherpad.openstack.org/p/qa-queens-ptg https://etherpad.openstack.org/p/etsi-nfv-openstack-gathering-denver
15:54:57 #action morgan_orange create wiki to organize doc cross review + testing group doc
15:55:13 #info Gabriel to summarise what we are planning for long duration
15:55:37 #action alec-cisco jose sync with infra group to create the docker best practice solution in the documentation
15:55:49 #topic OpenStack PTG review
15:55:57 #link https://etherpad.openstack.org/p/qa-queens-ptg
15:56:12 #info Bryan suggests to prioritize failure modes that are known to occur
15:56:15 #agree gabriel_yuyang to summarize Testing group activity in the etherpad for the OpenStack group
15:57:56 #info Morgan suggests asking EUAG for input on failure mode priorities
15:58:03 #topic AoB
15:59:13 “In each cluster's (of 1,800 servers) first year, it's typical that 1,000 individual
15:59:13 machine failures will occur; thousands of hard drive failures will occur; one
15:59:13 power distribution unit will fail, bringing down 500 to 1,000 machines for
15:59:13 about 6 hours; 20 racks will fail, each time causing 40 to 80 machines to vanish
15:59:13 from the network; 5 racks will "go wonky," with half their network packets
15:59:14 missing in action; and there's about a 50 percent chance that the cluster will
15:59:16 overheat, taking down most of the servers in less than 5 minutes and taking 1
15:59:18 to 2 days to recover.” – Jeff Dean 2008
15:59:21 https://www.cnet.com/news/google-spotlights-data-center-inner-workings/
15:59:28 and some interesting reading here:
15:59:36 https://blog.thousandeyes.com/top-internet-outages-2016/
16:00:13 first quote was from google launching a cluster
16:04:55 #info TSC election in progress: would be good to have a Testing group candidature
16:05:18 #info but as testing group is informal => no nomination on behalf, just use standard way
16:05:21 #endmeeting