13:04:30 #startmeeting CIP IRC weekly meeting
13:04:30 Meeting started Thu Mar 14 13:04:30 2024 UTC and is due to finish in 60 minutes. The chair is jki. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:04:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:04:30 The meeting name has been set to 'cip_irc_weekly_meeting'
13:04:38 #topic AI review
13:04:48 - prepare blog entry on SLTS kernel state and challenges [Jan]
13:05:03 no relevant progress yet :(
13:05:10 - migrate kernelci bot reports away from cip-dev [Chris]
13:05:26 looks to me like they are really off now
13:05:48 hi.
13:05:56 maybe chris will join later
13:06:02 other AIs?
13:06:17 5
13:06:19 4
13:06:21 3
13:06:22 2
13:06:24 1
13:06:27 #topic Kernel maintenance updates
13:06:48 i released 4.4-cip85 and reviewed 6.1.80
13:06:52 This week reported 5 new CVEs and 8 updated CVEs.
13:06:54 I reviewed 6.1.80, and reviewing 6.1.81.
13:07:02 I'm doing reviews -- 6.1.81 and .82.
13:07:39 so few CVEs - something broken in the new process?? :)
13:08:23 not sure :(
13:08:31 And useful CVEs, wow.
13:08:44 Anyway, I believe we should not get too used to it.
13:09:08 we need to observe closely, yes
13:09:26 we can't draw final conclusions yet, that is for sure
13:10:48 fyi: https://lore.kernel.org/all/20240311150054.2945210-2-vegard.nossum@oracle.com/
13:11:30 need to read, looks interesting
13:11:37 masami: I seen that one, but I am not really sure what it means.
13:13:24 it will describe what is vulnerability so I hope it improves kernel cve process.
13:14:16 can we actively (and visibly) contribute to it?
13:15:31 This might be good place to someone from security team to take a look.
13:16:05 worth at least a try, but I'm already afraid they won't have the bandwidth
13:16:10 I saw an interesting CVE this week...
13:16:12 https://lore.kernel.org/linux-cve-announce/2024022840-CVE-2021-47050-5ba5@gregkh/T/
13:16:22 It was released e/Feb
13:16:32 But the original issue was fixed a couple of years ago
13:16:48 Are they now going through every bug in history?
13:16:57 Yes, basically.
13:17:03 seems like - would explain the initial peak
13:17:21 But surely there would be thousands and thousands? Not just hundreds?
13:17:31 Notice that:
13:17:42 They don't describe the vulnerability.
13:17:54 Sentence in description is incomplete.
13:18:04 And this is not security vulnerability by any means.
13:19:38 Interestingly as well, this issue actually affects CIP 4.19
13:19:55 So there was some value in this - we wouldn't have realised otherwise
13:20:20 What value do you see?
13:20:44 This is robustness against bad device tree. Not any kind of security hole.
13:20:58 Well, we have a NULL pointer dereference we can fix
13:21:27 The point is, we're not picking up bug fixes in the CIP kernels that are fixed in LTS etc. Isn't that the point of CIP SLTS?
13:22:00 1) this does not fix anything
13:22:20 2) it may even be buggy. How does it fix the null deref?
13:22:32 3) we can't really backport all the ... that gets put into the stable tree.
13:22:50 4) now we'll have to explain people (as I'm now explaining to you) that some CVEs are jun
13:22:52 junk
13:22:53 :-(.
13:24:01 So are we not monitoring upstream or LTS for (potential) bug fixes that affect the code we have in SLTS? Or are you saying that this patch has already been evaluated and deemed not valid for CIP?
13:24:31 We are not evaluating patches to backport between 5.10 and 4.19.
13:24:43 (And besides, after short evaluation, the patch is junk).
13:25:09 not scanning for LTS-only patches may actually be a current gap
13:25:22 will become smaller when LTS gets shorted, but we should think about that
13:25:36 We rely on -stable team to do the evaluations, and they clearly did not do good job there.
13:25:46 This patch should not have been in -stable in the first place.
13:26:02 yes, don't disagree on the concrete patch
13:26:08 it's more about the general process
13:26:13 indeed
13:26:44 But about 30% of patches in stable are "this does not fix anything"
13:26:52 are we sure we will most likely catch all LTS patches for our own kernels, even if they enter later in LTS?
13:27:09 and Greg does not really listen to feedback unless patch is obviously broken.
13:27:16 Feel free to get him to revert this one :-).
13:27:32 If we don't trust LTS at all, shouldn't we be monitoring all of the bug fixes etc. going into mainline to determine what's "correct" for CIP?
13:28:27 That's what SUSE does. They get much less patches to work with. But the CVE numbers will make their job harder.
13:28:38 jki: ?
13:28:56 jki: We rely on stable team to maintain 6.1, 5.10 and 4.19
13:29:11 jki: Between 4.19 and 4.4, we apply "whatever applies" without much review
13:29:20 jki: And then take manual look at the rest.
13:29:26 ok
13:29:46 But this patch is a) bad, and b) much less severity than we would bother manually backporting it.
13:30:28 Okay.
13:30:38 It's actually not okay,
13:30:56 But then it circles back to the original issue - there are now unresolved CVEs for SLTS that users will want to know why aren't fixed...
13:31:05 because now we'll have much more questions "why do you have CVE-x-y unfixed".
13:31:11 heh
13:31:24 yes, our (CIP in general) effort will likely shift a bit (or more) from doing to explaining/documenting
13:31:32 patersonc: Yes. And I guess someone should do a blog post
13:31:59 "Greg is spamming CVE numbers with junk, that's why you see so many unfixed CVEs".
13:32:28 Even better would be to contest junk CVEs such as this one.
13:33:51 It might be worth using our LF contacts to stop junk coming into CVE database in the first place.
13:34:18 Because I somehow suspect contesting the CVEs will not be easy.
13:34:22 we still need to wait a bit to let the system settle
13:34:41 yes, these discussions at least cost a lot of time
13:35:20 Regardless of the CVEs and LTS, pave1 are you suggesting that that commit shouldn't have been accepted upstream to start with? Wouldn't that solve the whole loop (going forward..)
13:35:49 jki: I guess it is more diplomatic to wait. But we'll have few thousand CVEs people will ask about.
13:36:17 patersonc: Yes, upstream should have rejected.
13:36:25 But I don't see how that helps us.
13:36:43 Plus of course -stable team should have rejected it.
13:36:49 collect data, persistent examples (the original ones are not rejected), and then we may act
13:37:10 And then they should realized this should not have CVE...
13:37:44 But I'm under impression that -stable team would not revert the patch even after feedback -- because it is already in and does not hurt.
13:38:39 And it would really break the CVEs to revert the "fixes" to the CVEs :P
13:38:58 we can't argue based on impressions, we will need facts
13:38:59 Not sure. I don't have experience with that part.
13:39:09 jki: Sure
13:39:56 jki: I guess we should take someone experienced in security to do the analysis, so that it can't be easily brushed out with 'but Greg believes otherwise'.
13:40:26 folks, I'm getting an urgent call - give me 5 min.
13:40:28 jki: This is going to be huge problem for the security team in the long run...
13:42:30 (or with "but it is better to err on side of caution").
13:45:38 back
13:46:34 yes, I'm still scouting for comments internally, but also those folks will need more datapoints than what we have after only 3 weeks
13:47:22 Ok. So should I spend say 10 hours
13:47:36 trying to randomly pick some "new" CVEs and analyze them?
13:48:03 yes, I think we should invest this (after the next pending -rt release ;-) )
13:48:12 Or would it be more valuable to pick the "worst" CVEs?
13:48:31 the more data, the better
13:48:39 Ok ;-).
13:48:45 bad examples help, but their share in the overall number as well
13:49:22 ok, clock is ticking for today - should we move on?
13:49:55 5
13:49:57 4
13:49:58 3
13:49:59 2
13:50:01 1
13:50:03 #topic Kernel release status
13:50:10 4.4-rt is late, rest on time
13:50:12 I need to do 4.4-rt.
13:50:19 any blockers?
13:50:38 Not really. I was waiting for 4.4.x.
13:50:43 good
13:50:47 then move on
13:50:52 #topic Kernel testing
13:51:37 still investigating in the kernelci core api but no updates yet
13:51:49 6.8 is out. Would be good to start testing 6.8-stable.
13:52:04 kernelci is also in a bit of a change phase
13:52:26 pave1: I'll set it up when the first -rc is out (it wasn't when I last checked)
13:52:42 -rc1 is now out.
13:52:51 pave1: Okay
13:53:05 Thank you!
13:54:09 In the last "test report" email I sent I linked to Sietze's work on test reporting, did anyone see it?
13:54:15 https://cip-playground.gitlab.io/squad-hacking/cip-release-reports/
13:54:33 Each build done gets a pretty/easy html page with a summary
13:54:46 Still WIP, but a good starting point
13:55:15 e.g. https://cip-playground.gitlab.io/squad-hacking/cip-release-reports/linux-cip/linux-4.19.y-cip/4.19.306-cip107_800dfc28d.html
13:55:51 We currently have "There are failed tests" and "there are no failed tests".
13:55:58 More different messages would be now.
13:56:02 be nice.
13:56:25 And 4.19 is failing a lot -- is that expected? https://cip-playground.gitlab.io/squad-hacking/cip-release-reports/linux-cip/linux-4.19.y-cip/4.19.309-cip107_bae57856d.html
13:56:33 Do the red cross and green tick not help?
13:57:19 There are lots of errors for LTP, always have been. We still need to work if they are meant to be failing or not. At least with squad we can now work out if there are any changes in the lists
13:58:04 Dunno. Negative sentences can be confusing. Green tick makes it better.
13:58:09 "All ok" would be even better :-).
13:58:54 Please add feedback to the gitlab issue: https://gitlab.com/cip-project/cip-testing/testing/-/issues/211
13:59:09 And "incomplete tests" is something for you to investigate, too?
14:00:46 Sure
14:01:54 okay - need to leave soon - more testing topics?
14:02:31 5
14:02:33 4
14:02:34 3
14:02:36 2
14:02:37 1
14:02:43 #topic AOB
14:03:07 6.1.81 was kind of strange, having series of nfs and x86 changes.
14:03:21 It looks like this was due to CVEs... and it seems we have another x86 CPU bug.
14:03:34 This time it affects Atom, so it may be more relevant to us.
14:04:01 I've seen Intel information, still if someone had better explanation what is going on there, it would be nice.
14:04:02 yeah, another HW vul
14:04:28 register content leakage on context switch, but I didn't look into all details
14:04:55 but it can become an interesting topic if the mitigation is hard to backport
14:04:59 Yes, they seem to have some kind of "quick store forwarding" circuit, or something.
14:05:38 we will need to check correlation of affected cores, their age, and CIP kernel usage for them (to the degree that's guessable)
14:06:04 -stable has backports of the fix for 6.1, but not 5.10
14:06:09 Greg is not spamming anything, he is doing exactly what the cve board told him to do, pave1 please stop spraying lies
14:06:26 Oh, welcome.
14:07:14 gregkh: Obviously I don't know what they told you, but the end result looks suspiciously like a spam.
14:07:17 I've always been here :)
14:07:43 gregkh: So... how long have you been analysing CVE-2021-47050 ?
14:08:04 pave1: if you are upset at what we are required to do, please take it up with the cve group, as we are doing what they required us to do
14:08:38 gregkh: Are you required to submit non-bugs as a CVEs?
14:08:46 gregkh: how long have you been analysing CVE-2021-47050 ?
14:09:20 That came from the gsd import into cave, still ongoing will take a few months
14:09:39 *cve, autocorrect...
14:09:58 Anyway again if people have questions email us please
14:10:32 gregkh: Yes, and it has your name on it, so I assume you have analysed it. Did you?
14:12:10 I must have, along with others, but I'm on vacation now, and can't remember specifics, if you have questions, again, email
14:13:05 Thanks Greg
14:13:35 good that we are talking :)
14:13:39 thanks!
14:13:43 gregkh: I don't have questions. I'm complaining about changelogs being copy pasted into CVEs without analysis. I believe "spam" is quite suitable word for that.
14:13:54 And in reading that record, looks real to me, if anyone doesn't think so, again, email
14:14:18 We take the log message directly, that's how we create cves
14:14:36 That's how we do for all kernel CVE records
14:14:47 gregkh: Yes, and that's a part of the problem. Because CVEs created that way don't make sense.
14:15:25 In the Linux kernel, the following vulnerability has been resolved:
14:15:29 memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
14:16:00 Vulnerability would be "Malicious dts could lead to system crash".
14:16:08 Patch title is not vulnerability description.
14:18:25 folks, don't need to stop discussions, but I would close the irc call for this week now
14:18:44 Yep, let's do that.
14:18:51 maybe best to actually use email for this particular CVE and discuss that way further
14:19:00 #endmeeting