===================
#fdio-vpp: fdio-vpp
===================

Meeting started by DaveBarach at 15:00:28 UTC. The full logs are available at
http://ircbot.wl.linuxfoundation.org/meetings/fdio-vpp/2021/fdio_vpp/fdio-vpp-fdio_vpp.2021-05-11-15.00.log.html .

Meeting summary
---------------
* mackonstan (mackonstan, 15:02:59)
* CSIT (maciek reporting) (DaveBarach, 15:04:34)
* Physical and virtual infrastructure updates (mackonstan, 15:04:47)
* Vexxhost DC move almost done; the last four servers will be moved from MTL1 to YUL1 tomorrow, which completes the move of the physical machines. (mackonstan, 15:05:20)
* Mgmt/IPMI IPv4 address renumbering to happen shortly, to put all hosts in the same subnet(s). (mackonstan, 15:05:58)
* Open item1: OpenStack vRouter still used for accessing LF IT VM applications left behind in MTL1 (jenkins master, gerrit, etc) (mackonstan, 15:06:48)
* Resolution1: LF IT VM apps will move to YUL1 in the next few weeks, and then all problems should go away. (mackonstan, 15:07:05)
* Open item2: intermittent git fetch failures and jenkins connection resets (much less frequent now, after we went into full daily escalation calls with the involved parties). (mackonstan, 15:07:18)
* LINK: https://secure.vexxhost.com/billing/viewticket.php?tid=NOB-607778&c=4Dp0GdHT (mackonstan, 15:07:26)
* Resolution2: Continue daily 15min calls for situation review with involved parties, until all parties are satisfied and at least 2 days of uninterrupted operation are evident. (mackonstan, 15:07:38)
* Test breakages: (mackonstan, 15:09:40)
* NAT44ed multi-worker tests keep failing intermittently, less frequently after a recent patch, but VPP is still crashing. (mackonstan, 15:09:51)
* Sporadic VPP crashes in get statistics. (mackonstan, 15:10:15)
* A few others under investigation. (mackonstan, 15:10:58)
* Work highlights: (mackonstan, 15:11:18)
* CSIT in AWS - 2-node and 3-node tests running smoothly, ENA DPDK driver making VPP packets drop on tx.
  Moving ahead with Jenkins onboarding, will be publishing results for a subset of CSIT tests in CSIT-2106 report. (mackonstan, 15:11:25)
* Merging VPP & Linux telemetry - VPP perfmon bundles, Linux bcc/bpf tracing tools, using OpenMetrics format for storage and post-processing. (mackonstan, 15:12:06)
* Moving to json models for test oper data and results storage, querying and post processing. It would be good to hear from the vpp-dev community what queries people would like to execute against CSIT test result data, e.g. over a specific time period or for a specific git patch period, say to verify the impact of a specific patch(set). (mackonstan, 15:13:19)
* Ongoing work to make TRex behave as a deterministic and reliable traffic generator at high 100GbE rates. (mackonstan, 15:13:31)
* Revamp of ipsec tests, as CSIT is suffering from test suite overload (269 tests at last count). See Maciek's recent patches for tests being axed, under review. (mackonstan, 15:13:43)
* Generic effort to reduce the number of tests and remove redundant packet path testing. See Maciek's recent patches, under review. (mackonstan, 15:13:56)
* Other CSIT-2106 work, see link (mackonstan, 15:14:10)
* LINK: https://wiki.fd.io/view/CSIT/csit2106_plan (mackonstan, 15:14:17)
* Host Stack(Florin) (DaveBarach, 15:14:41)
* lots of patches in the last month (DaveBarach, 15:14:59)
* improvements in session layer for connect/listen APIs - lots more config knobs (DaveBarach, 15:15:25)
* working to improve active-open performance (DaveBarach, 15:15:44)
* moving active-opens to the first worker since the main thread tends to sleep a lot (DaveBarach, 15:16:03)
* improve half-open connection tracking (DaveBarach, 15:16:15)
* bunch of TCP cleanup, bulk buffer translation (DaveBarach, 15:16:57)
* improvements in vcl test code, server (DaveBarach, 15:17:12)
* now have a DTLS vcl test (DaveBarach, 15:17:49)
* Documentation (Ole reporting) (DaveBarach, 15:18:19)
* need to find a home for documentation, e.g.
  to auto-update main website docs (DaveBarach, 15:19:12)
* dwallace: LFN has a license for readthedocs (DaveBarach, 15:19:50)
* any community volunteers for maintaining / writing docs more than welcome (DaveBarach, 15:20:35)
* dwallace: need to help e.g. Google find up-to-date docs (DaveBarach, 15:21:12)
* Release Mgmt (Andrew) (DaveBarach, 15:21:28)
* 21.06 RC1 in a few weeks (DaveBarach, 15:21:58)
* 5/25 (Weds) will pull the release throttle (DaveBarach, 15:22:13)
* Coverity (DaveBarach, 15:23:02)
* look at list on github, broken out by owner/maintainer (DaveBarach, 15:23:27)
* LINK: https://github.com/vpp-dev/vpp-coverity-report (DaveBarach, 15:26:20)
* vppapigen "training wheels" to be removed in this release (DaveBarach, 15:27:47)
* vppapigen added message status (experimental, production, etc) to JSON (DaveBarach, 15:28:27)
* Infra Status(DaveW) (DaveBarach, 15:29:05)
* three intermittent false failures: punt tests fixed (DaveBarach, 15:29:32)
* vpp device job fails when 2 jobs run / both try to reconfigure the i40e at the same time (DaveBarach, 15:29:59)
* intermittent vcl / ldp make test failure on the arm platform (DaveBarach, 15:30:18)
* "that one is driving me crazy..." (DaveBarach, 15:30:38)
* reenabled Naginator to (temporarily) address Jenkins comms reset problems (DaveBarach, 15:31:14)
* trying to avoid Vexxhost virtual-router baling-wire / bubble-gum to improve network reliability (DaveBarach, 15:32:06)
* DW spending hours/day updating vexxhost ticket w/ data (DaveBarach, 15:32:40)
* proposal to use vpp instead of current virtual router technology, early stage discussions (DaveBarach, 15:33:24)
* make test (cont'd from last meeting) (DaveBarach, 15:38:41)
* short-term, move tests back to centralized location (DaveBarach, 15:39:34)
* node enqueue improvements (DaveBarach, 15:39:52)
* currently: enqueues very fast when all pkts go to same destination (DaveBarach, 15:40:19)
* rewrote vlib_node_enqueue_to_next(...)
  to use SIMD instrs (DaveBarach, 15:40:51)
* significant change, but reduces 20 clocks to 2 or 3 clocks in the general case (DaveBarach, 15:41:46)
* handoff code in progress (DaveBarach, 15:43:16)
* multiple tx queue support in progress (DaveBarach, 15:43:33)
* not clear whether the two in-progress items will end up in 21.06 (DaveBarach, 15:44:27)
* will try to combine handoff frames (DaveBarach, 15:45:14)
* should improve high worker count scenarios where the number of tx queues is lower than the number of workers (DaveBarach, 15:48:27)
* multiple places hash packets to queues. Want to create infra to handle the problem in a general way (DaveBarach, 15:51:09)

Meeting ended at 15:54:22 UTC.

People present (lines said)
---------------------------
* DaveBarach (47)
* mackonstan (22)
* collab-meetbot (5)
* dmarion (0)

Generated by `MeetBot`_ 0.1.4