12:03:40 #startmeeting CIP IRC weekly meeting
12:03:40 Meeting started Thu Oct 20 12:03:40 2022 UTC and is due to finish in 60 minutes. The chair is jki. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:03:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:03:40 The meeting name has been set to 'cip_irc_weekly_meeting'
12:03:46 #topic AI review
12:03:56 1. Add qemu-riscv to cip-kernel-config - patersonc
12:04:30 No updates
12:04:54 2. Ask Florian to support with 4.4 kernel-ci reports - jki
12:05:06 done, Florian will look into that soon
12:05:24 he said only one issue was remaining if he was recalling correctly
12:05:35 any other AIs?
12:06:01 If 4.4 reports are useful or close to being useful... should I take another look?
12:06:21 I asked Florian to approach you on that
12:06:52 but you may also give feedback earlier, if you like
12:07:17 Ok :-).
12:08:22 ok, moving on in...
12:08:33 3
12:08:36 2
12:08:38 1
12:08:46 #topic Kernel maintenance updates
12:09:21 I did reviews on 5.10.149 and .150.
12:09:27 23 new CVEs were reported this week; three of them are remote code execution vulnerabilities.
12:09:37 These vulnerabilities have been fixed in stable kernels.
12:09:40 (.150 is huge, still a lot to do).
12:10:33 i did 5.10.150
12:13:59 anything else for this topic?
12:15:50 3
12:15:51 2
12:15:53 1
12:15:56 #topic Kernel testing
12:16:21 replying to Florian's issue
12:16:38 It appears that a lot of our RT tests aren't completing properly. The tests run, but the Python script that collates the results at the end doesn't run.
12:16:38 Need to investigate further.
12:17:31 Have you noticed this before, Pavel? Do you just check the latency results and not worry about the script at the end?
12:17:58 I just watch for green ticks :-). Have not noticed that before, sorry.
12:18:45 Does anyone check test results? Or just that the LAVA job ran until the end?
12:19:16 patersonc: I suspect no one does.
12:20:30 So how are we confirming that we don't see regressions? If we don't look at the results it's just a load of boot tests
12:20:47 most of the issues that Florian pointed out about 4.4 look solved upstream, by the way
12:20:59 https://github.com/kernelci/kernelci-core/issues/1053#issuecomment-1285433617
12:21:55 Yeah, and a bunch of compile tests :-). A lot of kernel issues manifest as the kernel not booting.
12:22:00 we have no more errors on 4.4, only some warnings
12:22:11 alicef: Ok, thanks.
12:22:37 pavel: do we need/want to resolve those as well?
12:22:39 pave1: sorry, I couldn't find a non-booting kernel on the latest 4.4
12:22:55 alicef: So I can just ignore the warnings, and only care if errors appear?
12:22:59 I'm currently looking at a strange smc regression on 4.4 qemu
12:23:56 jki: Ignoring warnings should be simple enough. Getting rid of warnings would be nice, but is not high priority.
12:24:43 pave1: I also don't know about that. It depends on the warnings and errors. I will take a quick look to see if any warning is interesting
12:25:50 patersonc: ?
12:26:01 alicef: Thanks.
12:26:57 pave1: actually you can just take a look at the end of this page and see if there is something interesting, but I don't think there is: https://linux.kernelci.org/build/cip/branch/linux-4.4.y-cip/kernel/v4.4.302-cip70-98-g7f7838c92740f/
12:27:29 I will check whether something has not been inserted there
12:27:47 like I cannot find the smc regression on that list
12:27:56 alicef: Ok, let me do that over the week.
12:28:38 there are really few warnings. they are mostly things like "suggest parentheses" and similar. thanks
12:29:00 Great
12:29:27 cool!
12:29:33 patersonc: Could we get three results in the gitlab-ci?
12:29:49 patersonc: Green -- nothing to see, no one needs to look here.
12:30:16 Red -- something is wrong in the kernel, either it failed to boot or some test failed.
12:30:38 Yellow or something -- something is wrong in the labs.
Power failed, docker stuff is acting funny, ...
12:31:06 GitLab CI doesn't support this, sorry
12:31:32 It's important that we don't get green when there's some problem hidden in the logs.
12:31:55 pave1: the following is the strange smc regression
12:31:59 Ok, next best thing: can we get the last line of the log saying what it is?
12:32:02 why should gitlab not support this?
12:32:25 Lava runs can be translated into pipeline states - if they return clear results
12:32:59 pave1: We could trawl the test case results and make the whole thing a red cross if there is a single error?
12:33:02 on v4.4.302-cip70 all cve pass https://storage.kernelci.org/cip/linux-4.4.y-cip/v4.4.302-cip70/x86_64/x86_64_defconfig/gcc-10/lab-collabora/smc-qemu_x86_64.html
12:33:09 or did you mean the yellow state?
12:33:49 on v4.4.302-cip70-98-g7f7838c92740f we have a CVE-2020-0543: fail https://storage.kernelci.org/cip/linux-4.4.y-cip/v4.4.302-cip70-98-g7f7838c92740f/x86_64/x86_64_defconfig/gcc-10/lab-collabora/smc-qemu_x86_64.html
12:34:42 CVE-2020-0543: VULN (Your CPU microcode may need to be updated to mitigate the vulnerability) but it is the same board
12:34:46 Let me look into it. But last time I looked it wasn't something we can control, other than pass (green tick) or fail (red cross)
12:35:43 patersonc: Can we have two phases? We currently have build, test.
12:35:57 Have build, tests finish, tests report passing result?
12:36:01 looks like same board, same toolchain, but different smc test results O_o
12:37:05 pave1: Yea we could do something like this. And then just have to look into the collated test results in the last job
12:38:43 alicef: If it is spectre/meltdown checker on Qemu.. then I suggest we don't have to care much.
12:39:25 pave1: ok thanks. then we have only the warnings I pointed you to
12:40:05 alicef: Ok, let me look into that and report via email.
12:40:20 ok
12:41:27 It looks like we can set some exit codes now in GitLab CI: https://docs.gitlab.com/ee/ci/yaml/#allow_failureexit_codes
12:42:04 great to see this progress!
12:43:06 Although actually, that looks like it's used to determine if a job fails or passes based on return codes from scripts. It's not controlling whether we can set a yellow warning in the pipelines view.
12:43:08 So not so useful
12:44:43 yeah, a yellow state is not known to me either
12:44:51 fail or pass, that's it
12:46:55 Bit annoying.
12:47:12 well, open an issue with gitlab.com ;)
12:47:34 We can create some scripts to parse the results better, but at some point we may as well just use KernelCI as it is better for this kind of thing
12:47:58 Is it possible to not finish, or finish without returning the result?
12:47:59 This was the main drive for using KernelCI - much more advanced than our gitlab CI setup
12:48:41 pave1: Not finishing would eventually lead to a timeout, which would be a red cross
12:48:51 if you don't finish a job, you will empty our pockets with AWS bills ;)
12:49:12 The only option really is to fail the whole job if a single test case didn't pass
12:49:17 jki: That too :D
12:50:13 patersonc: That is a good option. And add a line at the end of the log explaining "test returned failure" so that we know it is different from "lab does not have power".
12:52:06 We pretty much have this already
12:52:06 e.g. https://gitlab.com/cip-project/cip-kernel/linux-cip/-/jobs/3184699360
12:52:25 Test case results are all there
12:52:43 We just don't have an overall fail if a single test case fails - only if the entire lava job failed
12:53:27 Example with some fails:
12:53:28 https://gitlab.com/cip-project/cip-kernel/linux-cip/-/jobs/3141003858
12:54:10 But then, in the case of SMC, some of those test cases are expected to fail.
12:54:24 So we'd need to have a way of knowing when failures are expected...
12:54:31 Or whether they are regressions
12:55:11 Drop SMC for now :-).
12:56:02 Or blacklist SMC from qemu targets.
12:56:13 I'm sure that not all the LTP test cases pass..
12:57:26 e.g. https://lava.ciplatform.org/results/763461
13:01:27 we've reached the top of the hour - anything else on testing?
13:02:12 3
13:02:14 2
13:02:16 1
13:02:19 #topic AOB
13:03:17 anyone anything?
13:04:05 3
13:04:07 2
13:04:09 1
13:04:11 #endmeeting
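
Editor's sketch (not part of the meeting log): the testing discussion converged on failing the whole job when a single test case fails, tolerating failures that are expected (such as SMC on qemu), and ending the log with a line that tells a test failure apart from a lab problem. A minimal Python illustration of that idea could look like the following. The `CaseResult` record, the result strings, the exit codes 0/1/2 and the `classify` helper are all assumptions made for illustration; the log does not show how LAVA results are exported or how the CIP CI scripts are structured.

```python
from dataclasses import dataclass

# Result labels are assumed; adapt them to whatever the real export uses.
PASS = "pass"
FAIL = "fail"

# Assumed exit-code convention for the final CI job.
EXIT_OK = 0         # nothing to look at
EXIT_TEST_FAIL = 1  # kernel problem: an unexpected test case failure
EXIT_LAB_FAIL = 2   # lab problem: job incomplete or results never collated


@dataclass(frozen=True)
class CaseResult:
    suite: str
    name: str
    result: str


def classify(job_completed, cases, expected_failures):
    """Return (exit_code, summary_line) for one test job.

    expected_failures is a set of (suite, name) pairs that are known to
    fail and must not turn the job red.
    """
    # No results at all is treated as a lab problem, which also covers a
    # collation script that never ran.
    if not job_completed or not cases:
        return EXIT_LAB_FAIL, "RESULT: lab problem (job incomplete or no results collated)"

    unexpected = [
        c for c in cases
        if c.result == FAIL and (c.suite, c.name) not in expected_failures
    ]
    if unexpected:
        names = ", ".join(f"{c.suite}/{c.name}" for c in unexpected)
        return EXIT_TEST_FAIL, f"RESULT: test returned failure ({names})"

    return EXIT_OK, "RESULT: no unexpected failures"


# Illustrative data only; "ltp/example-case" is a made-up name.
cases = [
    CaseResult("smc", "CVE-2020-0543", FAIL),
    CaseResult("ltp", "example-case", PASS),
]
expected = {("smc", "CVE-2020-0543")}

code, line = classify(True, cases, expected)
print(line)  # the last line of the log states what happened
# A CI wrapper would then call sys.exit(code) so GitLab shows pass or fail.
```

Keeping the expected failures in an explicit set means a newly failing case is flagged as a regression, while known failures stay visible in the results without turning the job red.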