15:00:28 <DaveBarach> #startmeeting fdio-vpp
15:00:28 <collab-meetbot> Meeting started Tue May 11 15:00:28 2021 UTC and is due to finish in 60 minutes.  The chair is DaveBarach. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:28 <collab-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:28 <collab-meetbot> The meeting name has been set to 'fdio_vpp'
15:02:59 <mackonstan> #info mackonstan
15:04:12 <DaveBarach> #chair dmarion
15:04:12 <collab-meetbot> Warning: Nick not in channel: dmarion
15:04:12 <collab-meetbot> Current chairs: DaveBarach dmarion
15:04:34 <DaveBarach> #topic CSIT (maciek reporting)
15:04:47 <mackonstan> #info Physical and virtual infrastructure updates
15:05:20 <mackonstan> #info Vexxhost DC move almost done, last four servers will be moved from MTL1 to YUL1 tomorrow, and we are done with the phy machines move.
15:05:58 <mackonstan> #info Mgmt/IPMI IPv4 addr renumbering to happen shortly, to put all hosts in the same subnet(s).
15:06:48 <mackonstan> #info Open item1: OpenStack vRouter still used for accessing LF IT VM applications left behind in MTL1 (jenkins master, gerrit, etc)
15:07:05 <mackonstan> #info Resolution1: LF IT VM apps will move to YUL1 in the next few weeks, and then all problems should go away.
15:07:18 <mackonstan> #info Open item2: intermittent git fetch failures and jenkins connection resets (much less frequent now that we have moved to daily escalation calls with involved parties).
15:07:26 <mackonstan> #link https://secure.vexxhost.com/billing/viewticket.php?tid=NOB-607778&c=4Dp0GdHT
15:07:38 <mackonstan> #info Resolution2: Continue daily 15min calls for situation review with involved parties, until all parties are satisfied and at least 2 days of uninterrupted operation are evident.
15:09:40 <mackonstan> #info Test breakages:
15:09:51 <mackonstan> #info NAT44ed multi-worker tests keep failing intermittently, less frequently after recent patch, but vpp is still crashing.
15:10:15 <mackonstan> #info Sporadic VPP crashes in get statistics.
15:10:58 <mackonstan> #info A few others under investigation.
15:11:18 <mackonstan> #info Work highlights:
15:11:25 <mackonstan> #info CSIT in AWS - 2-node and 3-node tests running smoothly, ENA DPDK driver making VPP packets drop on tx. Moving ahead with Jenkins onboarding, will be publishing results for a subset of CSIT tests in CSIT-2106 report.
15:12:06 <mackonstan> #info Merging VPP & Linux telemetry - VPP perfmon bundles, Linux bcc/bpf tracing tools, using OpenMetrics format for storage and post-processing.
15:13:19 <mackonstan> #info Moving to JSON models for test oper data and results storage, querying and post-processing. Would be good to hear from the vpp-dev community what queries people would like to execute against CSIT test result data, e.g. over a specific time period or for a specific git patch period, to, say, verify the impact of a specific patch(set).
15:13:31 <mackonstan> #info Ongoing work to make TRex behave as a deterministic and reliable traffic generator at high 100GbE rates.
15:13:43 <mackonstan> #info Revamp of ipsec tests, as CSIT is suffering from test suite overload (269 tests at last count). See Maciek's recent patches for tests being axed, under review.
15:13:56 <mackonstan> #info Generic effort to reduce number of tests, remove redundant packet path testing. See Maciek's recent patches, under review.
15:14:10 <mackonstan> #info Other CSIT-2106 work, see link
15:14:17 <mackonstan> #link https://wiki.fd.io/view/CSIT/csit2106_plan
15:14:41 <DaveBarach> #topic Host Stack (Florin)
15:14:59 <DaveBarach> #info lots of patches in the last month
15:15:25 <DaveBarach> #info improvements in session layer for connect/listen APIs - Lots more config knobs
15:15:44 <DaveBarach> #info working to improve active-open performance
15:16:03 <DaveBarach> #info moving active-opens to the first worker since the main thread tends to sleep a lot
15:16:15 <DaveBarach> #info improve half-open connection tracking
15:16:57 <DaveBarach> #info bunch of TCP cleanup, bulk buffer translation
15:17:12 <DaveBarach> #info improvements in vcl test code, server
15:17:49 <DaveBarach> #info now have a DTLS vcl test
15:18:19 <DaveBarach> #topic Documentation (Ole reporting)
15:19:12 <DaveBarach> #info need to find a home for documentation, e.g. to auto-update main website docs
15:19:50 <DaveBarach> #info dwallace: LFN has a license for readthedocs
15:20:35 <DaveBarach> #info any community volunteers for maintaining / writing docs more than welcome
15:21:12 <DaveBarach> #info dwallace: need to help e.g. Google find up-to-date docs
15:21:28 <DaveBarach> #topic Release Mgmt (Andrew)
15:21:58 <DaveBarach> #info 21.06 RC1 in a few weeks
15:22:13 <DaveBarach> #info 5/25 (Weds) will pull the release throttle
15:23:02 <DaveBarach> #topic Coverity
15:23:27 <DaveBarach> #info look at list on github, broken out by owner/maintainer
15:26:20 <DaveBarach> #link https://github.com/vpp-dev/vpp-coverity-report
15:27:47 <DaveBarach> #info vppapigen "training wheels" to be removed in this release
15:28:27 <DaveBarach> #info vppapigen added message status (experimental, production, etc) to JSON
15:29:05 <DaveBarach> #topic Infra Status (DaveW)
15:29:32 <DaveBarach> #info three intermittent false failures: punt tests fixed
15:29:59 <DaveBarach> #info vpp device job fails when 2 jobs run / both try to reconfigure the i40e at the same time
15:30:18 <DaveBarach> #info intermittent vcl / ldp make test failure on the arm platform
15:30:38 <DaveBarach> #info "that one is driving me crazy..."
15:31:14 <DaveBarach> #info reenabled Naginator to (temporarily) address Jenkins comms reset problems
15:32:06 <DaveBarach> #info trying to avoid Vexxhost virtual-router baling-wire / bubble-gum to improve network reliability
15:32:40 <DaveBarach> #info DW spending hours/day updating vexxhost ticket w/ data
15:33:24 <DaveBarach> #info proposal to use vpp instead of current virtual router technology, early stage discussions
15:38:41 <DaveBarach> #topic make test (cont'd from last meeting)
15:39:34 <DaveBarach> #info short-term, move tests back to centralized location
15:39:52 <DaveBarach> #topic node enqueue improvements
15:40:19 <DaveBarach> #info currently: enqueues very fast when all pkts go to same destination
15:40:51 <DaveBarach> #info rewrote vlib_node_enqueue_to_next(...) to use SIMD instrs
15:41:46 <DaveBarach> #info significant change, but reduces 20 clocks to 2 or 3 clocks in the general case
15:43:16 <DaveBarach> #info handoff code in progress
15:43:33 <DaveBarach> #info multiple tx queue support in progress
15:44:27 <DaveBarach> #info not clear whether the two in-progress items will end up in 21.06
15:45:14 <DaveBarach> #info will try to combine handoff frames
15:48:27 <DaveBarach> #info should improve high worker count scenarios where the number of tx queues is lower than the number of workers
15:51:09 <DaveBarach> #info multiple places hash packets to queues. Want to create infra to handle the problem in a general way
15:54:22 <DaveBarach> #endmeeting