#fdio-csit: FD.io CSIT project meeting

Meeting started by mackonstan at 14:02:22 UTC (full logs).

Meeting summary

    1. Tibor Frank (tifrank, 14:02:30)
    2. Vratko Polak (vrpolak, 14:02:37)
    3. Jan Gelety (jgelety, 14:02:46)

  1. Agenda bashing (mackonstan, 14:04:06)
  2. FD.io CSIT lab infrastructure (mackonstan, 14:04:37)
    1. Juraj: wants to add two ThunderX2 servers for vpp-device tests, and to use the existing ThunderX2 for 2-node tests. (mackonstan, 14:06:48)
    2. Juraj: has a question re power for the new servers - will contact the vexxhost team. (mackonstan, 14:09:15)
    3. CLX servers: Peter - CLX perf servers are running and are used in all perf jobs, including trending. Presented in trending pages as 2n-clx. (mackonstan, 14:10:55)
    4. ACTION: Peter: implement PBF functionality (Prioritized Base Frequency) based on Intel recommendation. Draft tech proposal of PBF usage to be run by csit-dev. (mackonstan, 14:12:32)

  3. Inputs from LFN and FD.io projects (mackonstan, 14:13:00)
    1. VPP: check vpp meeting notes re vpp v19.08.2 status (mackonstan, 14:14:29)
    2. TSC: no update (mackonstan, 14:14:43)

  4. Releases (mackonstan, 14:14:55)
    1. CSIT-1908 report updates: reconf test graphs replaced by a link to the CSIT-1908_1 report, which has corrected test results after the reconf test fixes were applied. (mackonstan, 14:16:03)
    2. CSIT-1908.1 report: Tibor - generated this morning CEST. Some runs are missing for 3n-skx, 3n-hsw, 3n-tsh. Report data has been validated for presentation, but the data still needs to be validated before announcing. (mackonstan, 14:23:58)
    3. CSIT-2001: plan to be captured on wiki (mackonstan, 14:25:30)
    4. https://wiki.fd.io/view/CSIT/csit2001_plan (mackonstan, 14:25:38)

  5. Operational status (mackonstan, 14:26:05)
    1. Jan: issue with vpp-device environment, fixed by Peter. Peter to do a TOI for similar situations in the future. (mackonstan, 14:26:40)
    2. Jan: trending shows failing multi-core tests, mostly 4c, some 2c. Suspecting Ole's patch touching stats handling by workers. Already interacting on Slack channel #csit-dev. (mackonstan, 14:28:26)
    3. Info from Vanessa: The Nexus maintenance was completed successfully. While performance appears to have improved, the hung jobs issue is not resolved. We modified the cronjobs on Monday. The next step is to move to the global-jjb lf-publish macro for logs. I'm working with Ed Kern to make sure the macro works within the nomad containers. (tifrank, 14:34:25)
    4. Tibor: still observing some data upload and download failures - 19 Sep (1 failure), 22 Sep (1), 24 Sep (2), 26 Sep (1) - so failures are more sporadic despite lots of activity (lots of uploads/downloads, many generations of the report). Could be a different problem causing connectivity errors. Will paste this update into the helpdesk ticket. (mackonstan, 14:34:37)
    5. Tibor: no issues observed since Sunday. (mackonstan, 14:35:18)
    6. Ed: heads-up - another Jenkins maintenance is coming to apply the changes listed by Vanessa to address the ongoing problem. (mackonstan, 14:36:40)
    7. Maciek: re CentOS CI vpp-device - Thomas F. Herbert confirmed that he will have limited bandwidth to support this. Will send a note to the list. (mackonstan, 14:39:32)
    8. Peter: CentOS environment is missing keys. Ed to take it on, and see if it can be addressed in a similar fashion as was done for the VPP project. (mackonstan, 14:42:31)
    9. HoneyComb tests, Maciek: close the action re removing HoneyComb tests from CSIT repo, as HC project is dormant and lost sync with VPP. (mackonstan, 14:44:22)

  6. VPP code performance (mackonstan, 14:44:55)
    1. trending - no new issues, apart from the multi-core test failures reported by Jan earlier. (mackonstan, 14:45:45)
    2. Tibor: planning to add code to generate emails with anomaly information, in a format similar to that used for failures, based on trending analytics. (mackonstan, 14:46:49)

  7. Developments (mackonstan, 14:47:03)
    1. VAT to PAPI: Jan - completed. One open item: scale tests. (mackonstan, 14:48:05)
    2. move to Python3: Jan - no update. (mackonstan, 14:48:20)
    3. vpp-api crc checks - Vratko: code updated to address vpp patch race condition. Improved process description merged. Awaiting another vpp patch situation to have it fully verified in production. (mackonstan, 14:51:04)
    4. Vratko: next proposal for aligning csit and vpp master branches, driven by experience with vpp api crc checks. Removes need for vpp_stable and csit oper branches. See patch: (mackonstan, 14:52:34)
    5. Current LF plan of record on the timeout issue is to switch Jenkins to the new volume type, as was already done for Nexus. That outage should be scheduled before EOW. If that doesn't get around the issue, they want to take a look at changing the log pulling to either console log OR console timestamp log, but not both. (snergster, 14:53:05)
    6. Vratko: working on a script for automated git bisecting of performance regressions. (mackonstan, 14:53:30)
    7. https://gerrit.fd.io/r/c/csit/+/22354 <-- Draft document for future improvements of API process. (vrpolak, 14:54:21)
    8. Peter: vpp in unprivileged containers - making good progress. Issues with hugepages… vfio-pci handling is wip. (mackonstan, 14:55:43)
    9. https://gerrit.fd.io/r/c/csit/+/22261 <-- Work in progress for script that bisects to locate a regression. (vrpolak, 14:55:44)

  8. Test environments (mackonstan, 14:55:54)
    1. re VIRL shutdown - Jan/Peter: only one test left running - dot1ad (already covered by performance test). (mackonstan, 14:57:22)
    2. re VIRL shutdown - Maciek: proposed course of action - review any other dependencies relying on the VIRL env (nsh_sfc, hc, …), address those by contacting the projects one final time, and upon resolution, remove the tests that are no longer needed (VIRL mainly), then shut down the VIRL environment. (mackonstan, 14:59:11)
    3. re VIRL shutdown - proposal(s) for repurposing the VIRL servers. (mackonstan, 15:00:04)
    4. Peter: consider leaving one VIRL server for any remaining tasks that need VIRL. (mackonstan, 15:00:40)


Meeting ended at 15:01:33 UTC (full logs).

Action items

  1. Peter: implement PBF functionality (Prioritized Base Frequency) based on Intel recommendation. Draft tech proposal of PBF usage to be run by csit-dev.


People present (lines said)

  1. mackonstan (41)
  2. collabot` (4)
  3. vrpolak (3)
  4. tifrank (2)
  5. jgelety (1)
  6. snergster (1)


Generated by MeetBot 0.1.4.