12:03:40 <jki> #startmeeting CIP IRC weekly meeting
12:03:40 <collab-meetbot> Meeting started Thu Oct 20 12:03:40 2022 UTC and is due to finish in 60 minutes. The chair is jki. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:03:40 <collab-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:03:40 <collab-meetbot> The meeting name has been set to 'cip_irc_weekly_meeting'
12:03:46 <jki> #topic AI review
12:03:56 <jki> 1. Add qemu-riscv to cip-kernel-config - patersonc
12:04:30 <patersonc[m]> No updates
12:04:54 <jki> 2. Ask Florian to support with 4.4 kernel-ci reports - jki
12:05:06 <jki> done, Florian will look into that soon
12:05:24 <jki> he said only one issue was remaining, if he was recalling correctly
12:05:35 <jki> any other AIs?
12:06:01 <pave1> If 4.4 reports are useful or close to being useful... should I take another look?
12:06:21 <jki> I asked Florian to approach you on that
12:06:52 <jki> but you may also give feedback earlier, if you like
12:07:17 <pave1> Ok :-).
12:08:22 <jki> ok, moving on in...
12:08:33 <jki> 3
12:08:36 <jki> 2
12:08:38 <jki> 1
12:08:46 <jki> #topic Kernel maintenance updates
12:09:21 <pave1> I did reviews on 5.10.149 and .150.
12:09:27 <masami> This week 23 new CVEs were reported; three of them are remote code execution vulnerabilities.
12:09:37 <masami> These vulnerabilities have been fixed in stable kernels.
12:09:40 <pave1> (.150 is huge, still a lot to do).
12:10:33 <uli> I did 5.10.150
12:13:59 <jki> anything else for this topic?
12:15:50 <jki> 3
12:15:51 <jki> 2
12:15:53 <jki> 1
12:15:56 <jki> #topic Kernel testing
12:16:21 <alicef> replying to Florian's issue
12:16:38 <patersonc[m]> It appears that a lot of our RT tests aren't completing properly. The tests run, but the Python script that collates the results at the end doesn't run.
12:16:38 <patersonc[m]> Need to investigate further.
12:17:31 <patersonc[m]> Have you noticed this before, Pavel? Do you just check the latency results and not worry about the script at the end?
12:17:58 <pave1> I just watch for green ticks :-). Have not noticed that before, sorry.
12:18:45 <patersonc[m]> Does anyone check test results? Or just that the LAVA job ran until the end?
12:19:16 <pave1> patersonc: I suspect no one does.
12:20:30 <patersonc[m]> So how are we confirming that we don't see regressions? If we don't look at the results, it's just a load of boot tests
12:20:47 <alicef> most of the issues that Florian pointed out about 4.4 look solved upstream, by the way
12:20:59 <alicef> https://github.com/kernelci/kernelci-core/issues/1053#issuecomment-1285433617
12:21:55 <pave1> Yeah, and a bunch of compile tests :-). A lot of kernel issues manifest by the kernel not booting.
12:22:00 <alicef> we have no more errors on 4.4, only some warnings
12:22:11 <pave1> alicef: Ok, thanks.
12:22:37 <jki> pavel: do we need/want to resolve those as well?
12:22:39 <alicef> pave1: sorry, I couldn't find any non-booting kernels on the last 4.4
12:22:55 <pave1> alicef: So I can just ignore the warnings, and only care if errors appear?
12:22:59 <alicef> I'm currently looking at some strange SMC regression on 4.4 qemu
12:23:56 <pave1> jki: Ignoring warnings should be simple enough. Getting rid of warnings would be nice, but is not high priority.
12:24:43 <alicef> pave1: I also don't know about that. Depends on the warnings and errors. I will give a shallow look to see if any of the warnings are interesting
12:25:50 <pave1> patersonc: ?
12:26:01 <pave1> alicef: Thanks.
12:26:57 <alicef> pave1: actually, you can just take a look at the end of this page and see if there is something interesting, but I don't think there is: https://linux.kernelci.org/build/cip/branch/linux-4.4.y-cip/kernel/v4.4.302-cip70-98-g7f7838c92740f/
12:27:29 <alicef> I will check whether something has not been inserted there
12:27:47 <alicef> for example, I cannot find the SMC regression on that list
12:27:56 <pave1> alicef: Ok, let me do that over the week.
12:28:38 <alicef> there are really few warnings. They are mostly things like "suggest parentheses" and similar. thanks
12:29:00 <patersonc[m]> Great
12:29:27 <jki> cool!
12:29:33 <pave1> patersonc: Could we get three results in the gitlab-ci?
12:29:49 <pave1> patersonc: Green -- nothing to see, no one needs to look here.
12:30:16 <pave1> Red -- something is wrong in the kernel, either it failed to boot or some test failed.
12:30:38 <pave1> Yellow or something -- something is wrong in the labs. Power failed, docker stuff is acting funny, ...
12:31:06 <patersonc[m]> GitLab CI doesn't support this, sorry
12:31:32 <pave1> It's important that we don't get green when there's some problem hidden in the logs.
12:31:55 <alicef> pave1: following is the strange SMC regression
12:31:59 <pave1> Ok, next best thing: can we get the last line of the log saying what it is?
12:32:02 <jki> why should gitlab not support this?
12:32:25 <jki> LAVA runs can be translated into pipeline states - if they return clear results
12:32:59 <patersonc[m]> pave1: We could trawl the test case results and make the whole thing a red cross if there is a single error?
12:33:02 <alicef> on v4.4.302-cip70 all CVE tests pass: https://storage.kernelci.org/cip/linux-4.4.y-cip/v4.4.302-cip70/x86_64/x86_64_defconfig/gcc-10/lab-collabora/smc-qemu_x86_64.html
12:33:09 <jki> or did you mean the yellow state?
12:33:49 <alicef> on v4.4.302-cip70-98-g7f7838c92740f we have a CVE-2020-0543 fail: https://storage.kernelci.org/cip/linux-4.4.y-cip/v4.4.302-cip70-98-g7f7838c92740f/x86_64/x86_64_defconfig/gcc-10/lab-collabora/smc-qemu_x86_64.html
12:34:42 <alicef> CVE-2020-0543: VULN (Your CPU microcode may need to be updated to mitigate the vulnerability) - but it is the same board
12:34:46 <patersonc[m]> Let me look into it. But last time I looked, it wasn't something we can control, other than pass (green tick) or fail (red cross)
12:35:43 <pave1> patersonc: Can we have two phases? We currently have build, test.
12:35:57 <pave1> Have build, tests finish, tests report passing result?
12:36:01 <alicef> looks like same board, same toolchain, but different SMC test results O_o
12:37:05 <patersonc[m]> pave1: Yeah, we could do something like this. And then just have to look into the collated test results in the last job
12:38:43 <pave1> alicef: If it is the spectre/meltdown checker on QEMU... then I suggest we don't have to care much.
12:39:25 <alicef> pave1: ok, thanks. Then we have only the warnings I pointed you to
12:40:05 <pave1> alicef: Ok, let me look into that and report via email.
12:40:20 <alicef> ok
12:41:27 <patersonc[m]> It looks like we can set some exit codes now in GitLab CI: https://docs.gitlab.com/ee/ci/yaml/#allow_failureexit_codes
12:42:04 <jki> great to see this progress!
12:43:06 <patersonc[m]> Although actually, that looks like it's used to determine whether a job fails or passes based on return codes from scripts. It doesn't control whether we can set a yellow warning in the pipelines view.
12:43:08 <patersonc[m]> So not so useful
12:44:43 <jki> yeah, a yellow state is not known to me either
12:44:51 <jki> fail or pass, that's it
12:46:55 <patersonc[m]> Bit annoying.
12:47:12 <jki> well, open an issue with gitlab.com ;)
12:47:34 <patersonc[m]> We can create some scripts to parse the results better, but at some point we may as well just use KernelCI, as it is better for this kind of thing
12:47:58 <pave1> Is it possible to not finish, or finish without returning the result?
12:47:59 <patersonc[m]> This was the main drive for using KernelCI - much more advanced than our GitLab CI setup
12:48:41 <patersonc[m]> pave1: Not finishing would eventually lead to a timeout, which would be a red cross
12:48:51 <jki> if you don't finish a job, you will empty your pockets with AWS bills ;)
12:49:12 <patersonc[m]> The only option really is to fail the whole job if a single test case didn't pass
12:49:17 <patersonc[m]> jki: That too :D
12:50:13 <pave1> patersonc: That is a good option. And add a line at the end of the log explaining "test returned failure" so that we know it is different from "lab does not have power".
12:52:06 <patersonc[m]> We pretty much have this already
12:52:06 <patersonc[m]> e.g. https://gitlab.com/cip-project/cip-kernel/linux-cip/-/jobs/3184699360
12:52:25 <patersonc[m]> Test case results are all there
12:52:43 <patersonc[m]> We just don't have an overall fail if a single test case fails - only if the entire LAVA job failed
12:53:27 <patersonc[m]> Example with some fails:
12:53:28 <patersonc[m]> https://gitlab.com/cip-project/cip-kernel/linux-cip/-/jobs/3141003858
12:54:10 <patersonc[m]> But then, in the case of SMC, some of those test cases are expected to fail.
12:54:24 <patersonc[m]> So we'd need to have a way of knowing when failures are expected...
12:54:31 <patersonc[m]> Or whether they are regressions
12:55:11 <pave1> Drop SMC for now :-).
12:56:02 <pave1> Or blacklist SMC from qemu targets.
12:56:13 <patersonc[m]> I'm sure that not all the LTP test cases pass..
12:57:26 <patersonc[m]> e.g. https://lava.ciplatform.org/results/763461
13:01:27 <jki> we've reached the top of the hour - anything else on testing?
13:02:12 <jki> 3
13:02:14 <jki> 2
13:02:16 <jki> 1
13:02:19 <jki> #topic AOB
13:03:17 <jki> anyone anything?
13:04:05 <jki> 3
13:04:07 <jki> 2
13:04:09 <jki> 1
13:04:11 <jki> #endmeeting
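
Below is a minimal sketch of the result-collation approach discussed above: fail the whole GitLab job when a single LAVA test case fails, and print a distinguishing last log line plus a distinct exit code for lab/infrastructure problems so they can be told apart from kernel regressions. The results file name, its JSON layout, and the exit-code convention are assumptions for illustration, not an existing CIP script.

#!/usr/bin/env python3
# Sketch only: collate LAVA test-case results and map them to distinct exit codes.
# The results file name and JSON layout below are assumptions.
import json
import sys

RESULTS_FILE = "lava-results.json"  # assumed export of the LAVA test-case results


def main() -> int:
    try:
        with open(RESULTS_FILE) as fh:
            # Assumed layout: [{"name": "...", "result": "pass" | "fail" | "skip"}, ...]
            results = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        # Results could not be read at all: treat as a lab/infrastructure problem.
        print(f"lab problem, no test results available: {exc}")
        return 2

    failed = [t["name"] for t in results if t.get("result") == "fail"]
    if failed:
        # At least one test case failed: make the whole job a red cross.
        print("test returned failure: " + ", ".join(failed))
        return 1

    print("all test cases passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())

Exit code 2 could then be listed under allow_failure:exit_codes in .gitlab-ci.yml (https://docs.gitlab.com/ee/ci/yaml/#allow_failureexit_codes) so lab problems do not surface as kernel regressions, and known-expected failures (e.g. certain SMC cases on QEMU) could be filtered out with an expected-failure list before the check.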