Corrected the NVDLA driver to work with 5.13 #28

cybergaszcz · 2021-06-20T15:36:08Z

Adapted the NVDLA driver to API changes so that it works with the upstream kernel. Test results are below:

./nvdla_runtime --loadable fast-math.nvdla --image 0_3.jpg --rawdump
creating new runtime context...
Emulator starting
dlaimg height: 28 x 28 x 1: LS: 128 SS: 0 Size: 3584
Enter:dla_read_network_config
(DLA_TEST) Error 0x00000004: Mismatched channel: 1 != 4 (in TestExit:dla_read_network_config status=0
Utils.cpp, function createImageCopy(), line 160)
submitting tasEnter: dla_initiate_processors
ks...
...
reset engine done
Work Found!
Work Done
execution time = 1797976.000000 s
Shutdown signal received, exiting
Test pass
[riscv@fedora-starfive ~]$ ./nvdla_runtime --loadable fast-math.nvdla 
[riscv@fedora-starfive ~]$ cat output.dimg 
0 0 0 117 0 2 0 0 0 0

./nvdla_runtime --loadable fast-math.nvdla --image 0_3.jpg --rawdump
creating new runtime context...
Emulator starting
dlaimg height: 28 x 28 x 1: LS: 128 SS: 0 Size: 3584Enter:dla_read_network_config

(DLA_TEST) Error 0x00000004: Mismatched channel: 1 != 4 (in TeExit:dla_read_network_config status=0
stUtils.cpp, function createImageCopy(), line 160)
submitting tEnter: dla_initiate_processors
asks
...
reset engine done
Work Found!
Work Done
execution time = 1794760.000000 s
Shutdown signal received, exiting
Test pass
[riscv@fedora-starfive ~]$ ls
0_3.jpg  fast-math.nvdla        libjpeg.a      output.dimg
0_7.jpg  fast-math-small.nvdla  nvdla_runtime
[riscv@fedora-starfive ~]$ cat output.dimg 
0 0 0 0 0 0 0 119 0 0
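
The two `output.dimg` dumps above can be read as one score per digit class. The following is only a sketch under that assumption: the thread does not document the file format, and the interpretation (ten whitespace-separated scores for digits 0 to 9) is inferred from the 28 x 28 x 1 input and the image names `0_3.jpg` and `0_7.jpg`. The helper name `predicted_class` is hypothetical.

```python
def predicted_class(dimg_text):
    """Return (index, score) of the highest score in an output.dimg dump.

    Assumption: the dump is a list of whitespace-separated integer scores,
    one per class, as seen in the `cat output.dimg` output in this thread.
    """
    scores = [int(token) for token in dimg_text.split()]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # The two dumps pasted in the pull request description.
    for dump in ("0 0 0 117 0 2 0 0 0 0", "0 0 0 0 0 0 0 119 0 0"):
        print(predicted_class(dump))
```

Under this reading, the first dump peaks at index 3 and the second at index 7.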

Message-id: <20180227002123.21608-1-ahs3@redhat.com> Patchwork-id: 206052 O-Subject: [RHEL8 BZ1518076 PATCH] ACPI: APEI: arm64: Ignore broken HPE moonshot APEI support Bugzilla: 1518076 RH-Acked-by: Mark Salter <msalter@redhat.com> RH-Acked-by: Jeremy McNicoll <jmcnicol@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1518076 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=15417197 Tested: compile-only; several other patches are required for full booting QE has tested limited boot (see comment#12 of BZ) This is a re-post of a RHEL-ALT-7.5 patch specific to aarch64 moonshots that we use in beaker. It is required for these machines to boot. commit 8a663a264863efedf8bb4a9d76ac603920fdd739 Author: Robert Richter <rrichter@redhat.com> Date: Wed Aug 16 19:49:30 2017 -0400 [acpi] APEI: arm64: Ignore broken HPE moonshot APEI support From: Mark Salter <msalter@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1344237 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13768971 Tested: Booted on moonshot with patched 4.11.0-20 kernel Upstream: RHEL-only The aarch64 HP moonshot platforms we have in beaker and elsewhere have a firmware bug which causes a spurious fatal memory error via APEI at boot time. This platform is no longer supported and no further firmware updates are expected. This is a downstream-only hack to avoid the problem by bailing out of HEST table probing if we detect a moonshot HEST table. Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Robert Richter <rrichter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Upstream Status: RHEL only Signed-off-by: Al Stone <ahs3@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Message-id: <20180510173844.29580-3-msalter@redhat.com> Patchwork-id: 214383 O-Subject: [RHEL-8 BZ1519554 2/3] ACPI / irq: Workaround firmware issue on X-Gene based m400 Bugzilla: 1519554 RH-Acked-by: Al Stone <astone@redhat.com> RH-Acked-by: Tony Camuso <tcamuso@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1519554 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16144520 The ACPI firmware on the xgene-based m400 platforms erroneously describes its UART interrupt as ACPI_PRODUCER rather than ACPI_CONSUMER. This leads to the UART driver being unable to find its interrupt and the kernel being unable to find a console. Work around this by avoiding the producer/consumer check for X-Gene UARTs. Upstream Status: RHEL only Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Message-id: <20180510173844.29580-4-msalter@redhat.com> Patchwork-id: 214381 O-Subject: [RHEL-8 BZ1519554 3/3] aarch64: acpi scan: Fix regression related to X-Gene UARTs Bugzilla: 1519554 RH-Acked-by: Al Stone <astone@redhat.com> RH-Acked-by: Tony Camuso <tcamuso@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1519554 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16144520 Commit e361d1f ("ACPI / scan: Fix enumeration for special UART devices") caused a regression with some X-Gene based platforms (Mustang and M400) with invalid DSDT. The DSDT makes it appear that the UART device is also a slave device attached to itself. With the above commit the UART won't be enumerated by ACPI scan (slave serial devices shouldn't be). So check for X-Gene UART device and skip the slave device check on it. Upstream Status: RHEL only Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

…tion Message-id: <20180604013831.523644967@redhat.com> Patchwork-id: 8165 O-Subject: [kernel team] [PATCH RHEL8.0 V2 1/2] kdump: round up the total memory size to 128M for crashkernel reservation Bugzilla: 1507353 RH-Acked-by: Don Zickus <dzickus@redhat.com> RH-Acked-by: Baoquan He <bhe@redhat.com> RH-Acked-by: Pingfan Liu <piliu@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353 Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16534135 Tested: ppc64le, x86_64 with several memory sizes. The total memory size we get in kernel is usually slightly less than 2G with 2G memory module machine. The main reason is bios/firmware reserve some area it will not export all memory as usable to Linux. 2G memory X86 kvm guest test result of the total_mem value: UEFI boot with ovmf: 0x7ef10000 Legacy boot kvm guest: 0x7ff7cc00 This is also a problem on arm64 UEFI booted system according to my test. Thus for example crashkernel=1G-2G:128M, if we have a 1G memory machine, we get total size 1023M from firmware then it will not fall into 1G-2G thus no memory reserved. User will never know that, it is hard to let user to know the exact total value we get in kernel An option is to use dmi/smbios to get physical memory size, but it's not reliable as well. According to Prarit hardware vendors sometimes screw this up. Thus round up total size to 128M to workaround this problem. Posted below patch in upstream, but no response yet: http://lists.infradead.org/pipermail/kexec/2018-April/020568.html Upstream Status: RHEL only Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
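
The round-up described in this commit message can be illustrated with a small sketch. It is not the kernel's implementation: the helper names `parse_size` and `reserved_size` are hypothetical, and the range check (start inclusive, end exclusive, empty end meaning "no upper limit") is an assumption about how `crashkernel=range:size` entries are matched.

```python
MIB = 1 << 20
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '128M' or '1G' to bytes; '' means no limit."""
    if not text:
        return None
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def reserved_size(spec, total_ram, round_to=None):
    """Pick the reservation for total_ram from 'start-end:size,...' entries.

    If round_to is given, total_ram is first rounded up to a multiple of it,
    which is the workaround the commit message describes.
    """
    if round_to:
        total_ram = -(-total_ram // round_to) * round_to  # ceiling division
    for entry in spec.split(","):
        ram_range, size = entry.split(":")
        start, end = ram_range.split("-")
        low, high = parse_size(start), parse_size(end)
        if total_ram >= low and (high is None or total_ram < high):
            return parse_size(size)
    return 0  # no range matched, nothing is reserved


if __name__ == "__main__":
    # The 1G machine from the commit message that reports 1023M of usable RAM.
    print(reserved_size("1G-2G:128M", 1023 * MIB) // MIB)
    print(reserved_size("1G-2G:128M", 1023 * MIB, round_to=128 * MIB) // MIB)
```

Without the round-up, 1023M falls outside `1G-2G` and nothing is reserved; with it, the total becomes 1024M and the 128M entry matches.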

@offset

Rebased for v5.3-rc1 because the documentation has moved. Message-id: <20180604013831.574215750@redhat.com> Patchwork-id: 8166 O-Subject: [kernel team] [PATCH RHEL8.0 V2 2/2] kdump: add support for crashkernel=auto Bugzilla: 1507353 RH-Acked-by: Don Zickus <dzickus@redhat.com> RH-Acked-by: Baoquan He <bhe@redhat.com> RH-Acked-by: Pingfan Liu <piliu@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353 Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16534135 Tested: ppc64le, x86_64 with several memory sizes. kdump qe tested 160M on various x86 machines in lab. We continue to provide crashkernel=auto like we did in RHEL6 and RHEL7, this will simplify the kdump deployment for common use cases that kdump just works with the auto reserved values. But this is still a best effort estimation, we can not know the exact memory requirement because it depends on a lot of different factors. The implementation of crashkernel=auto is simplified as a wrapper to use below kernel cmdline: x86_64: crashkernel=1G-64G:160M,64G-1T:256M,1T-:512M s390x: crashkernel=4G-64G:160M,64G-1T:256M,1T-:512M arm64: crashkernel=2G-:512M ppc64: crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G The difference between this way and the old implementation in RHEL6/7 is we do not scale the crash reserved memory size according to system memory size anymore. Latest effort to move upstream is below thread: https://lkml.org/lkml/2018/5/20/262 But unfortunately it is still unlikely to be accepted, thus we will still use a RHEL only patch in RHEL8. Copied old patch description about the history reason see below: ''' Non-upstream explanations: Besides "crashkenrel=X@Y" format, upstream also has advanced "crashkernel=range1:size1[,range2:size2,...][@offset]", and "crashkernel=X,high{low}" formats, but they need more careful manual configuration, and have different values for different architectures. 
Most of the distributions use the standard "crashkernel=X@Y" upstream format, and use crashkernel range format for advanced scenarios, heavily relying on the user's involvement. While "crashkernel=auto" is redhat's special feature, it exists and has been used as the default boot cmdline since 2008 rhel6. It does not require users to figure out how many crash memory size for their systems, also has been proved to be able to work pretty well for common scenarios. "crashkernel=auto" was tested/based on rhel-related products, as we have stable kernel configurations which means more or less stable memory consumption. In 2014 we tried to post them again to upstream but NACKed by people because they think it's not general and unnecessary, users can specify their own values or do that by scripts. However our customers insist on having it added to rhel. Also see one previous discussion related to this backport to Pegas: On 10/17/2016 at 10:15 PM, Don Zickus wrote: > On Fri, Oct 14, 2016 at 10:57:41AM +0800, Dave Young wrote: >> Don, agree with you we should evaluate them instead of just inherit >> them blindly. Below is what I think about kdump auto memory: >> There are two issues for crashkernel=auto in upstream: >> 1) It will be seen as a policy which should not go to kernel >> 2) It is hard to get a good number for the crash reserved size, >> considering various different kernel config options one can setups. >> In RHEL we are easier because our supported Kconfig is limited. >> I digged the upstream mail archive, but I'm not sure I got all the >> information, at least Michael Ellerman was objecting the series for >> 1). > Yes, I know. Vivek and I have argued about this for years. :-) > > I had hoped all the changes internally to the makedumpfile would allow > the memory configuration to stabilize at a number like 192M or 128M and > only in the rare cases extend beyond that. > > So I always treated that as a temporary hack until things were better. 
> With the hope of every new RHEL release we get smarter and better. :-) > Ideally it would be great if we could get the number down to 64M for most > cases and just turn it on in Fedora. Maybe someday.... ;-) > > We can have this conversation when the patch gets reposted/refreshed > for upstream on rhkl? > > Cheers, > Don We had proposed to drop the historic crashkernel=auto code and move to use crashkernel=range:size format and pass them in anaconda. The initial reason is crashkernel=range:size works just fine because we do not need complex algorithm to scale crashkernel reserved size any more. The old linear scaling is mainly for old makedumpfile requirements, now it is not necessary. But With the new approach, backward compatibility is potentially at risk. For e.g. let's consider the following cases: 1) When we upgrade from an older distribution like rhel-alt-7.4(which uses crashkernel=auto) to rhel-alt-7.5 (which uses the crashkernel=xY format) In this case we can use anaconda scripts for checking 'crashkernel=auto' in kernel spec and update to the new 'crashkernel=range:size' format. 2) When we upgrade from rhel-alt-7.5(which uses crashkernel=xY format) to rhel-alt-7.6(which uses crashkernel=xY format), but the x and/or Y values are changed in rhel-alt-7.6. For example from crashkernel=2G-:160M to crashkernel=2G-:192M, then we have no way to determine if the X and/or Y values were distribution provided or user specified ones. Since it is recommended to give precedence to user-specified values, so we cannot do an upgrade in such a case." Thus turn back to resolve it in kernel, and add a simpler version which just hacks to use the range:size style in code, and make rhel-only code easily to maintain. ''' Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Upstream Status: RHEL only Signed-off-by: Jeremy Cline <jcline@redhat.com>

Message-id: <20180612005422.GA2568@dhcp-128-65.nay.redhat.com> Patchwork-id: 8201 O-Subject: [kernel team] [RHEL8.0 PATCH V2] kdump: fix a grammar issue in a kernel message Bugzilla: 1507353 RH-Acked-by: Myron Stowe <mstowe@redhat.com> RH-Acked-by: Laszlo Ersek <lersek@redhat.com> RH-Acked-by: Jiri Benc <jbenc@redhat.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353 Upstream Status: RHEL-only as crashkernel=auto is not accepted in upstream Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16661950 Test: verified on a kvm guest s/choosed/chosen Upstream Status: RHEL only Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Message-id: <1528412373-19128-2-git-send-email-rrichter@redhat.com> Patchwork-id: 220950 O-Subject: [RHEL-8.0 BZ 1563590 v2 1/2] PCI: Vulcan: AHCI PCI bar fix for Broadcom Vulcan early silicon Bugzilla: 1563590 RH-Acked-by: Dean Nelson <dnelson@redhat.com> RH-Acked-by: Mark Langsdorf <mlangsdo@redhat.com> RH-Acked-by: Mark Salter <msalter@redhat.com> From: Ashok Kumar Sekar <asekar@redhat.com> PCI BAR 5 is not setup correctly for the on-board AHCI controller on Broadcom's Vulcan processor. Added a quirk to fix BAR 5 by using BAR 4's resources which are populated correctly but NOT used by the AHCI controller actually. RHEL-only: Both patches are in RHEL-7.6 also. Inclusion of the patches into RHEL-8 was discussed. Since there are partners with Ax system configurations it was decided to carry them in RHEL8 too. See: https://bugzilla.redhat.com/show_bug.cgi?id=1563590#c1 Upstream Status: RHEL only Signed-off-by: Ashok Kumar Sekar <asekar@redhat.com> Signed-off-by: Jayachandran C <jchandra@broadcom.com> Signed-off-by: Robert Richter <rrichter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Message-id: <1528412373-19128-3-git-send-email-rrichter@redhat.com> Patchwork-id: 220952 O-Subject: [RHEL-8.0 BZ 1563590 v2 2/2] ahci: thunderx2: Fix for errata that affects stop engine Bugzilla: 1563590 RH-Acked-by: Dean Nelson <dnelson@redhat.com> RH-Acked-by: Mark Langsdorf <mlangsdo@redhat.com> RH-Acked-by: Mark Salter <msalter@redhat.com> From: Jayachandran C <jnair@caviumnetworks.com> Apply workaround for this errata: Synopsis: Resetting PxCMD.ST may hang the SATA device Description: An internal ping-pong buffer state is not reset correctly for an PxCMD.ST=0 command for a SATA channel. This may cause the SATA interface to hang when a PxCMD.ST=0 command is received. Workaround: A SATA_BIU_CORE_ENABLE.sw_init_bsi must be asserted by the driver whenever the PxCMD.ST needs to be de-asserted. This will reset both the ports. So, it may not always work in a 2 channel SATA system. Resolution: Fix in B0. Add the code to ahci_stop_engine() to do this. It is not easy to stop the other "port" since it is associated with a different AHCI interface. Please note that with this fix, SATA reset does not hang any more, but it can cause failures on the other interface if that is in active use. Unfortunately, we have nothing other the the CPU ID to check if the SATA block has this issue. RHEL-only: Both patches are in RHEL-7.6 also. Inclusion of the patches into RHEL-8 was discussed. Since there are partners with Ax system configurations it was decided to carry them in RHEL8 too. See: https://bugzilla.redhat.com/show_bug.cgi?id=1563590#c1 [v3 with new delays] Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Upstream Status: RHEL only Signed-off-by: Robert Richter <rrichter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Message-id: <1531768843-2544-4-git-send-email-dbrace@redhat.com> Patchwork-id: 224988 O-Subject: [RHEL 8.0 e-stor V2 PATCH 3/5] scsi: smartpqi: add inspur advantech ids Bugzilla: 1503736 RH-Acked-by: Ewan Milne <emilne@redhat.com> RH-Acked-by: Tomas Henzl <thenzl@redhat.com> From: Kevin Barnett <kevin.barnett@microsemi.com> Add support for these new device IDs: Advantech MIC-8312BridgeB INSPUR PM8204-2GB INSPUR PM8204-4GB INSPUR PM8222-SHBA Upstream Status: RHEL only Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 9f8d05f) Signed-off-by: Don Brace <dbrace@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1670017 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=20147017 Commit 913a89f ("ipmi: Don't initialize anything in the core until something uses it") added new locking which broke context. Message-id: <20180713142210.15700-1-tcamuso@redhat.com> Patchwork-id: 224899 O-Subject: [RHEL8 BZ 1583537 1/1] ipmi: do not configure ipmi for HPE m400 Bugzilla: 1583537 RH-Acked-by: Dean Nelson <dnelson@redhat.com> RH-Acked-by: Al Stone <ahs3@redhat.com> RH-Acked-by: Mark Salter <msalter@redhat.com> bugzilla:https://bugzilla.redhat.com/show_bug.cgi?id=1583537 brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17150528 RHEL-only The ARM-based HPE m400 reports host-side ipmi as residing in intel port-io space, which does not exist in ARM processors. Therefore, when running on an m400, host-side ipmi configuration code must simply return zero without trying to configure the host-side ipmi. This patch prevents panic on boot by averting attempts to configure host-side ipmi on this platform. Though HPE m400 is not certified with RHEL, and HPE has relegated it to EOL status, the platform is still used extensively in ARM development and test for RHEL. Testing: Boot without blacklisting ipmi and check to see that no ipmi modules are loaded. Signed-off-by: Tony Camuso <tcamuso@redhat.com> cc: Prarit Bhargava <prarit@redhat.com> cc: Brendan Conoboy <blc@redhat.com> cc: Jeff Bastian <jbastian@redhat.com> cc: Scott Herold <sherold@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Upstream Status: RHEL only Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Tony Camuso <tcamuso@redhat.com> Acked-by: Dean Nelson <dnelson@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Acked-by: Mark Salter <msalter@redhat.com>

Message-id: <20190520222102.19488-1-labbott@redhat.com> Patchwork-id: 259215 O-Subject: [ARK INTERNAL PATCH] iommu/arm-smmu: workaround DMA mode issues Bugzilla: RH-Acked-by: Mark Langsdorf <mlangsdo@redhat.com> RH-Acked-by: Mark Salter <msalter@redhat.com> From: Mark Salter <msalter@redhat.com> Rebased for v5.2-rc1 Bugzilla: 1652259 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19244562 Upstream status: RHEL only. rhel8 commit 65feb1ed0ec9a088a63a90d46c0f7563ac96ad0f Author: Mark Salter <msalter@redhat.com> Date: Wed Nov 21 17:15:59 2018 +0100 [iommu] iommu/arm-smmu: workaround DMA mode issues Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1624077 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=18112820 Testing: Verified iommu.passthrough=1 no longer needed on gigabyte platforms. Upstream Status: RHEL-only In RHEL_ALT 7.5 we carried a RHEL-only patch which forced the arm smmuv2 into bypass mode due to performance issues on CN88xx. This was intended to be a temporary hack until the issues were resolved. Another vendor had issues with the iommu in bypass mode so we reverted the RHEL-only patch so that iommu is in DMA mode by default (upstream default). It turns on that there are remaining SMMU DMA mode issues on Gigabyte platformws with CN88xx cpus. The problem manifests itself by pcie card drivers failing to initialize the cards when SMMU is in DMA mode. The root cause has not been determined yet, but looks likely to be a hw or firmware issue. This patch forces bypass mode for Gigabyte platforms. CN88xx isn't officially supported in RHEL but we have a lot of them being used internally for testing, so I think we want this to support that use case in RHEL8. Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Acked-by: Mark Salter <msalter@redhat.com> Acked-by: Donald Dutile <ddutile@redhat.com> Upstream Status: RHEL only Signed-off-by: Laura Abbott <labbott@redhat.com>

Message-id: <20191001181256.22935-1-jcline@redhat.com> Patchwork-id: 275498 O-Subject: [ARK INTERNAL PATCH] [ARK INTERNAL PATCH] [redhat] Add patch to drop the EXPERT setting from ARM64_FORCE_52BIT Bugzilla: RH-Acked-by: Laura Abbott <labbott@redhat.com> We don't turn on EXPERT as there are few settings we actually want to mess with. Remove the dependency for ARM64_FORCE_52BIT as we do want that on in debug builds to help find 52-bit bugs. Upstream Status: RHEL only Signed-off-by: Jeremy Cline <jcline@redhat.com>

This adds efi_status_to_str() for use when printing efi_status_t messages, and reworks efi_status_to_err() so that the two use a common list of errors. Upstream Status: RHEL only Signed-off-by: Peter Jones <pjones@redhat.com>

Upstream Status: RHEL only Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

In order to automatically lock down kernels running on UEFI machines booted in Secure Boot mode, expose the lock_kernel_down() hook. Upstream Status: RHEL only Signed-off-by: Jeremy Cline <jcline@redhat.com>

UEFI machines can be booted in Secure Boot mode. Add an EFI_SECURE_BOOT flag that can be passed to efi_enabled() to find out whether secure boot is enabled. Move the switch-statement in x86's setup_arch() that interprets the secure_boot boot parameter to generic code and set the bit there. Upstream Status: RHEL only Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> cc: linux-efi@vger.kernel.org [Rebased for context; efi_is_table_address was moved to arch/x86] Signed-off-by: Jeremy Cline <jcline@redhat.com>

UEFI Secure Boot provides a mechanism for ensuring that the firmware will only load signed bootloaders and kernels. Certain use cases may also require that all kernel modules also be signed. Add a configuration option to lock down the kernel - which includes requiring validly signed modules - if the kernel is secure-booted. Upstream Status: RHEL only Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

Automatically lock down the kernel to LOCKDOWN_CONFIDENTIALITY_MAX if the IPL secure flag is set. Upstream Status: RHEL only Suggested-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

This is a hack, but it's what the other distros currently use for aarch64 with 4K pages so we'll do the same while upstream decides what the best outcome is (which isn't this). Upstream Status: RHEL only Signed-off-by: Peter Robinson <pbrobinson@gmail.com> [Add a dependency on RHEL_DIFFERENCES] Signed-off-by: Jeremy Cline <jcline@redhat.com>

We will use this to force CONFIG_HIGHPTE off on LPAE for now Signed-off-by: Jon Masters <jcm@redhat.com>

Patch for disconnect issues with storage attached to a tegra-ehci controller

The IRQ from rmi4 may interfere with the one we currently use on i2c-hid. Given that there is already a need for an external API from rmi4 to forward the attention data, we can, in this particular case rely on a separate workqueue to prevent cursor jumps. Reported-by: Cameron Gutman <aicommander@gmail.com> Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Reported-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Andrew Duggan <aduggan@synaptics.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Lyude <lyude@redhat.com>

This patch completes commit 278311e ("kexec, KEYS: Make use of platform keyring for signature verify") which, while adding the platform keyring for bzImage verification, neglected to also add this keyring for module verification. As such, kernel modules signed with keys from the MokList variable were not successfully verified. Signed-off-by: Robert Holmes <robeholmes@gmail.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

Now if DEFAULT_OFF set to y, kmemleak_init will start the cleanup_work workqueue. Then late_init call will set kmemleak_initialized to 1, the cleaup workqueue will try to do cleanup, triggering: [24.738773] ================================================================== [24.742784] BUG: KASAN: global-out-of-bounds in __kmemleak_do_cleanup+0x166/0x180 [24.744144] Key type ._fscrypt registered [24.745680] Read of size 8 at addr ffffffff88746c90 by task kworker/3:1/171 [24.745687] [24.745697] CPU: 3 PID: 171 Comm: kworker/3:1 Not tainted 5.3.0-v5.3-12475-gcbafe18 #1 [24.745701] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [24.745710] Workqueue: events kmemleak_do_cleanup [24.745717] Call Trace: [24.745736] dump_stack+0x7c/0xc0 [24.745755] print_address_description.constprop.4+0x1f/0x300 [24.751562] Key type .fscrypt registered [24.754370] __kasan_report.cold.8+0x76/0xb2 [24.754388] ? __kmemleak_do_cleanup+0x166/0x180 [24.754407] kasan_report+0xe/0x20 [24.778543] __kmemleak_do_cleanup+0x166/0x180 [24.780795] process_one_work+0x919/0x17d0 [24.782929] ? pwq_dec_nr_in_flight+0x320/0x320 [24.785092] worker_thread+0x87/0xb40 [24.786948] ? __kthread_parkme+0xc3/0x190 [24.789217] ? process_one_work+0x17d0/0x17d0 [24.791414] kthread+0x333/0x3f0 [24.793031] ? 
kthread_create_worker_on_cpu+0xc0/0xc0 [24.795473] ret_from_fork+0x3a/0x50 [24.797303] [24.798091] The buggy address belongs to the variable: [24.800634] mem_pool_free_count+0x10/0x40 [24.802656] [24.803434] Memory state around the buggy address: [24.805793] ffffffff88746b80: 04 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 [24.809177] ffffffff88746c00: 00 fa fa fa fa fa fa fa 00 00 fa fa fa fa fa fa [24.812407] >ffffffff88746c80: 04 fa fa fa fa fa fa fa 00 00 fa fa fa fa fa fa [24.815638] ^ [24.817372] ffffffff88746d00: 00 00 fa fa fa fa fa fa 00 00 00 00 00 00 00 00 [24.820740] ffffffff88746d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [24.824021] ================================================================== Fixes: c566586 ("mm: kmemleak: use the memory pool for early allocations") Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>

KernelCI reports that bcm2835_defconfig is no longer booting since commit ac7c3e4 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"): https://lkml.org/lkml/2019/9/26/825 I also received a regression report from Nicolas Saenz Julienne: https://lkml.org/lkml/2019/9/27/263 This problem has cropped up on arch/arm/config/bcm2835_defconfig because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining functions with -Os. I was able to reproduce it with other boards and defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE. The __get_user_check() specifically uses r0, r1, r2 registers. So, uaccess_save_and_enable() and uaccess_restore() must be inlined in order to avoid those registers being overwritten in the callees. Prior to commit 9012d01 ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for inlining functions, except on x86. Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING. So, __always_inline is now the only guaranteed way of forcible inlining. I want to keep as much compiler's freedom as possible about the inlining decision. So, I changed the function call order instead of adding __always_inline around. Call uaccess_save_and_enable() before assigning the __p ("r0"), and uaccess_restore() after evacuating the __e ("r0"). Fixes: 9012d01 ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING") Reported-by: "kernelci.org bot" <bot@kernelci.org> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Tested-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

The soc node should only have "simple-bus" in compatible. This matches what is done for: microchip/microchip-mpfs.dtsi canaan/k210.dtsi sifive/fu740-c000.dtsi This allows starfive.yaml to pass dt_binding_check without adding: - items: - const: starfive,jh7100 - const: simple-bus Signed-off-by: Drew Fustini <drew@beagleboard.org>

…a_driver

esmil · 2021-06-20T23:17:09Z

Nice! I can't merge this as-is, though, because it contains merges of a lot of unrelated commits, and that won't work when rebasing the beaglev branch on the next rc. Could you try if this works for you:
https://github.com/esmil/linux/tree/beaglev-nvdla

cybergaszcz · 2021-06-21T06:10:40Z

> Could you try if this works for you:
> https://github.com/esmil/linux/tree/beaglev-nvdla

OK, this branch contains all the changes related to the NVDLA. I am closing this pull request.

esmil · 2021-06-21T12:38:46Z

@cybergaszcz So did you try building that branch and running it?

cybergaszcz · 2021-06-21T14:10:40Z

Yes, I have tested it and it works properly. Thank you for your support.

[riscv@fedora-starfive ~]$ uname -a
Linux fedora-starfive 5.13.0-rc7-beaglev-308898-g56b4d6160381 #1 SMP Mon Jun 21 15:58:06 CEST 2021 riscv64 riscv64 riscv64 GNU/Linux
[riscv@fedora-starfive ~]$ dmesg | grep nvdla
[    1.444778] NVDLA 11940000.nvdla: coherent device 0 dev->dma_coherent 0
[    1.453576] Probe NVDLA config nvidia,nvdla_os_initial
[    1.471860] [drm] Initialized nvdla 0.0.0 20171017 for 11940000.nvdla on minor 0
[    1.480906] NVDLA 11940000.nvdla: Get mem from memory-region

./nvdla_runtime --loadable fast-math.nvdla --image 0_7.jpg --rawdump
creating new runtime context...
Emulator starting
dlaimg height: 28 x 28 x 1: LS: 128 SS: 0 Size: 3584Enter:dla_read_network_config

(DLA_TEST) Error 0x00000004: Mismatched channel: 1 != 4 (in TeExit:dla_read_network_config status=0
stUtils.cpp, function createImageCopy(), line 160)
submitting tEnter: dla_initiate_processors
asks...
...
reset engine done
Work Found!
Work Done
execution time = 1799008.000000 s
Shutdown signal received, exiting
Test pass
[riscv@fedora-starfive ~]$ cat output.dimg 
0 0 0 0 0 0 0 119 0 0

[riscv@fedora-starfive ~]$ cat /proc/interrupts 
           CPU0       CPU1       
  5:      11941       9474  RISC-V INTC   5  riscv-timer
  7:        850          0  SiFive PLIC  73  ttyS0
  8:          0          0  SiFive PLIC   2  dw_axi_dmac_platform
 10:          0          0  SiFive PLIC   1  dw_axi_dmac_platform
 11:         85          0  SiFive PLIC  44  xhci-hcd:usb1
 13:          0          0  SiFive PLIC  43  104c0000.usb
 14:          0          0  SiFive PLIC  32  11910000.gpio
 15:        253          0  SiFive PLIC  96  118b0000.i2c
 16:          0          0  SiFive PLIC  97  118c0000.i2c
 17:          0          0  SiFive PLIC  74  12450000.i2c
 18:          7          0  SiFive PLIC  98  118d0000.trng
 20:          0          0  SiFive PLIC   6  eth0
 21:          0          0  SiFive PLIC   7  eth0
 22:         12          0  SiFive PLIC  22  11940000.nvdla
 26:          0          0  SiFive PLIC  70  12410000.spi
 29:      24024          0  SiFive PLIC   4  dw-mci
 30:       7234          0  SiFive PLIC   5  dw-mci
 31:       3698          0  SiFive PLIC 101  sf_lcdc
 32:          0          0  SiFive PLIC 103  sf_vpp1
 35:         10          0  SiFive PLIC 122  124a0000.tmon
IPI0:        92        105  Rescheduling interrupts
IPI1:      2528      12933  Function call interrupts
IPI2:         0          0  CPU stop interrupts
IPI3:         0          0  IRQ work interrupts
IPI4:         0          0  Timer broadcast interrupts
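The `11940000.nvdla` line in the listing above (IRQ 22, 12 interrupts on CPU0) suggests the engine did raise interrupts during the test runs. A minimal sketch of pulling that line out programmatically, assuming the column layout shown above; the helper name `irq_counts` and the trimmed `SAMPLE` text are illustrative and not part of the driver or the runtime:

```python
def irq_counts(text, name):
    """Return the per-CPU counts for the first line whose last field is `name`.

    Assumes the /proc/interrupts layout shown above: a header row with one
    CPUn column per CPU, then one row per interrupt source.
    """
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())              # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[-1] == name:
            # fields[0] is the IRQ label ("22:"), then one count per CPU
            return [int(x) for x in fields[1:1 + ncpus]]
    return None                                # no matching source found


# Trimmed copy of the listing above, used only for illustration.
SAMPLE = """\
           CPU0       CPU1
  5:      11941       9474  RISC-V INTC   5  riscv-timer
 22:         12          0  SiFive PLIC  22  11940000.nvdla
 29:      24024          0  SiFive PLIC   4  dw-mci
"""

print(irq_counts(SAMPLE, "11940000.nvdla"))    # [12, 0]
```

On a live system the same function could be fed the contents of `/proc/interrupts` before and after a run to see whether the count changed.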

esmil · 2021-06-21T14:59:28Z

Thanks. I'll push this to the beaglev branch then.

esmil · 2021-06-23T11:15:42Z

@cybergaszcz Hey, I just noticed this driver includes code to flush caches. This should be made obsolete by Atish's generic non-coherent DMA solution, which is on the beaglev branch. Could you check whether something like this still works for you on the beaglev branch: http://sprunge.us/rBNdHk

cybergaszcz · 2021-06-24T05:14:49Z

Hey,
Thanks. Yes, I have checked and everything works fine:

[riscv@fedora-starfive ~]$ uname -a
Linux fedora-starfive 5.13.0-rc7-beaglev-308899-gc7f42bb10e72 #2 SMP Thu Jun 24 07:06:52 CEST 2021 riscv64 riscv64 riscv64 GNU/Linux
[riscv@fedora-starfive ~]$ dmesg | grep nvdla
[    1.448746] NVDLA 11940000.nvdla: coherent device 0 dev->dma_coherent 0
[    1.457482] Probe NVDLA config nvidia,nvdla_os_initial
[    1.475764] [drm] Initialized nvdla 0.0.0 20171017 for 11940000.nvdla on minor 0
[    1.484817] NVDLA 11940000.nvdla: Get mem from memory-region
Work Found!
Work Done
execution time = 1794150.000000 s
Shutdown signal received, exiting
Test pass
[riscv@fedora-starfive ~]$ ls
0_3.jpg  fast-math.nvdla        libjpeg.a      output.dimg
0_7.jpg  fast-math-small.nvdla  nvdla_runtime
[riscv@fedora-starfive ~]$ cat output.dimg 
0 0 0 0 0 0 0 119 0 0
[riscv@fedora-starfive ~]$ cat /proc/interrupts
           CPU0       CPU1       
  5:      13756      10262  RISC-V INTC   5  riscv-timer
  7:        928          0  SiFive PLIC  73  ttyS0
  8:          0          0  SiFive PLIC   2  dw_axi_dmac_platform
 10:          0          0  SiFive PLIC   1  dw_axi_dmac_platform
 11:         86          0  SiFive PLIC  44  xhci-hcd:usb1
 13:          0          0  SiFive PLIC  43  104c0000.usb
 14:          0          0  SiFive PLIC  32  11910000.gpio
 15:        254          0  SiFive PLIC  96  118b0000.i2c
 16:          0          0  SiFive PLIC  97  118c0000.i2c
 17:          0          0  SiFive PLIC  74  12450000.i2c
 18:          7          0  SiFive PLIC  98  118d0000.trng
 20:          0          0  SiFive PLIC   6  eth0
 21:          0          0  SiFive PLIC   7  eth0
 22:         12          0  SiFive PLIC  22  11940000.nvdla
 26:          0          0  SiFive PLIC  70  12410000.spi
 29:      31016          0  SiFive PLIC   4  dw-mci
 30:       7309          0  SiFive PLIC   5  dw-mci
 31:       5731          0  SiFive PLIC 101  sf_lcdc
 32:          0          0  SiFive PLIC 103  sf_vpp1
 35:         16          0  SiFive PLIC 122  124a0000.tmon
IPI0:        90        113  Rescheduling interrupts
IPI1:      3104      17331  Function call interrupts
IPI2:         0          0  CPU stop interrupts
IPI3:         0          0  IRQ work interrupts
IPI4:         0          0  Timer broadcast interrupts

esmil · 2021-06-24T08:15:52Z

Thanks! I'll push that patch to beaglev then.

geertu · 2021-06-29T07:48:07Z

Note that this driver is using dma_declare_coherent_memory(), which is planned to be removed by Christoph.
https://lore.kernel.org/linux-sh/20210623133205.GA28589@lst.de/

cybergaszcz · 2021-06-29T07:53:43Z

> Note that this driver is using dma_declare_coherent_memory(), which is planned to be removed by Christoph.
> https://lore.kernel.org/linux-sh/20210623133205.GA28589@lst.de/

In the final version of the BeagleV there will be no NVDLA. I adapted the driver only to test the hardware present on the beta version of the BeagleV.

pdp7 · 2021-06-29T17:30:11Z

@cybergaszcz We have not communicated this to the public yet, but more JH7100 SoCs will be produced, in a run of about 3,000. It won't be a full mass-production wafer run like the JH7110 will be, but it does mean your efforts on NVDLA for the JH7100 will benefit more than just those in the beta developer program.

[ Upstream commit b5befe8 ] An srcu_struct structure that is initialized before rcu_init_geometry() will have its srcu_node hierarchy based on CONFIG_NR_CPUS. Once rcu_init_geometry() is called, this hierarchy is compressed as needed for the actual maximum number of CPUs for this system. Later on, that srcu_struct structure is confused, sometimes referring to its initial CONFIG_NR_CPUS-based hierarchy, and sometimes instead to the new num_possible_cpus() hierarchy. For example, each of its ->mynode fields continues to reference the original leaf rcu_node structures, some of which might no longer exist. On the other hand, srcu_for_each_node_breadth_first() traverses to the new node hierarchy. There are at least two bad possible outcomes to this: 1) a) A callback enqueued early on an srcu_data structure (call it *sdp) is recorded pending on sdp->mynode->srcu_data_have_cbs in srcu_funnel_gp_start() with sdp->mynode pointing to a deep leaf (say 3 levels). b) The grace period ends after rcu_init_geometry() shrinks the nodes level to a single one. srcu_gp_end() walks through the new srcu_node hierarchy without ever reaching the old leaves so the callback is never executed. This is easily reproduced on an 8 CPUs machine with CONFIG_NR_CPUS >= 32 and "rcupdate.rcu_self_test=1". The srcu_barrier() after early tests verification never completes and the boot hangs: [ 5413.141029] INFO: task swapper/0:1 blocked for more than 4915 seconds. [ 5413.147564] Not tainted 5.12.0-rc4+ #28 [ 5413.151927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 5413.159753] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000 [ 5413.168099] Call Trace: [ 5413.170555] __schedule+0x36c/0x930 [ 5413.174057] ? wait_for_completion+0x88/0x110 [ 5413.178423] schedule+0x46/0xf0 [ 5413.181575] schedule_timeout+0x284/0x380 [ 5413.185591] ? wait_for_completion+0x88/0x110 [ 5413.189957] ? mark_held_locks+0x61/0x80 [ 5413.193882] ? 
mark_held_locks+0x61/0x80 [ 5413.197809] ? _raw_spin_unlock_irq+0x24/0x50 [ 5413.202173] ? wait_for_completion+0x88/0x110 [ 5413.206535] wait_for_completion+0xb4/0x110 [ 5413.210724] ? srcu_torture_stats_print+0x110/0x110 [ 5413.215610] srcu_barrier+0x187/0x200 [ 5413.219277] ? rcu_tasks_verify_self_tests+0x50/0x50 [ 5413.224244] ? rdinit_setup+0x2b/0x2b [ 5413.227907] rcu_verify_early_boot_tests+0x2d/0x40 [ 5413.232700] do_one_initcall+0x63/0x310 [ 5413.236541] ? rdinit_setup+0x2b/0x2b [ 5413.240207] ? rcu_read_lock_sched_held+0x52/0x80 [ 5413.244912] kernel_init_freeable+0x253/0x28f [ 5413.249273] ? rest_init+0x250/0x250 [ 5413.252846] kernel_init+0xa/0x110 [ 5413.256257] ret_from_fork+0x22/0x30 2) An srcu_struct structure that is initialized before rcu_init_geometry() and used afterward will always have stale rdp->mynode references, resulting in callbacks to be missed in srcu_gp_end(), just like in the previous scenario. This commit therefore causes init_srcu_struct_nodes to initialize the geometry, if needed. This ensures that the srcu_node hierarchy is properly built and distributed from the get-go. Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit d412137 ]

The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  test_perf_buffer:PASS:nr_cpus 0 nsec
  test_perf_buffer:PASS:nr_on_cpus 0 nsec
  test_perf_buffer:PASS:skel_load 0 nsec
  test_perf_buffer:PASS:attach_kprobe 0 nsec
  test_perf_buffer:PASS:perf_buf__new 0 nsec
  test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #24
  skipping offline CPU #25
  skipping offline CPU #26
  skipping offline CPU #27
  skipping offline CPU #28
  skipping offline CPU #29
  skipping offline CPU #30
  skipping offline CPU #31
  test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  test_perf_buffer:FAIL:buf_cnt got 24, expected 32
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Changing the test to check online cpus instead of possible.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
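The gap between "got 24" and "expected 32" in the perf_buffer log is the gap between online and possible CPUs. A small sketch of that distinction, assuming CPU lists in the range format the kernel exposes under `/sys/devices/system/cpu/online` and `/sys/devices/system/cpu/possible`; the two strings below are made up to match the log (CPUs 24 to 31 offline), and `parse_cpu_list` is an illustrative helper, not the selftest's code:

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8,10-11" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue                      # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Assumed values matching the log above, not read from a real machine.
possible = parse_cpu_list("0-31")
online = parse_cpu_list("0-23")

print(len(possible), len(online))         # 32 24
print(sorted(possible - online))          # the eight skipped CPUs, 24..31
```

A check that counts buffers against `len(online)` would pass on such a system, while one that counts against `len(possible)` would fail as in the log.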

kasan detects access beyond the end of the xibm->bitmap allocation:

  BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140
  Read of size 8 at addr c00000001d1d0118 by task swapper/0/1

  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28
  Call Trace:
  [c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable)
  [c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710
  [c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354
  [c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0
  [c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140
  [c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260
  [c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450
  [c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118
  [c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac
  [c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640
  [c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0
  [c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64

  Allocated by task 0:
   kasan_save_stack+0x34/0x70
   __kasan_kmalloc+0xb4/0xf0
   __kmalloc+0x268/0x540
   xive_spapr_init+0x4d0/0x77c
   pseries_init_irq+0x40/0x27c
   init_IRQ+0x44/0x84
   start_kernel+0x2a4/0x538
   start_here_common+0x1c/0x20

  The buggy address belongs to the object at c00000001d1d0118
   which belongs to the cache kmalloc-8 of size 8
  The buggy address is located 0 bytes inside of
   8-byte region [c00000001d1d0118, c00000001d1d0120)

  The buggy address belongs to the physical page:
  page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d
  flags: 0x7ffff000000200(slab|node=0|zone=0|lastcpupid=0x7ffff)
  raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480
  raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
  >c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc
                              ^
   c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc
   c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc

This happens because the allocation uses the wrong unit (bits) when it should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small numbers of bits, the allocated object can be smaller than sizeof(long), which results in invalid accesses.

Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with bitmap_free() for consistency.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com
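The unit mix-up described in the xibm->bitmap message can be shown as plain arithmetic. A sketch under stated assumptions: a 64-bit kernel (so `sizeof(long)` is 8), and a buggy path that passes the bit count directly as the byte size, which is how the message describes it; the Python helper names mirror the kernel macro but are only illustrative:

```python
BITS_PER_LONG = 64      # assumption: 64-bit kernel
SIZEOF_LONG = 8         # bytes, follows from the assumption above


def bits_to_longs(nbits):
    """Number of longs needed to hold nbits (rounding up)."""
    return (nbits + BITS_PER_LONG - 1) // BITS_PER_LONG


def bitmap_bytes(nbits):
    """Byte size a bitmap of nbits needs: BITS_TO_LONGS(n) * sizeof(long)."""
    return bits_to_longs(nbits) * SIZEOF_LONG


for count in (2, 8, 64, 65):
    wrong = count                   # bit count mistakenly used as byte size
    right = bitmap_bytes(count)
    print(f"count={count:3d}  wrong={wrong:3d} bytes  right={right:3d} bytes")
```

For a count of 2 the mistaken size is 2 bytes while bitmap helpers read a whole 8-byte long, which is the kind of out-of-bounds read the report shows. For larger counts the mistaken size over-allocates instead, which is why the bug only appears with small bitmaps.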

This fixes the following trace which is caused by hci_rx_work starting up *after* the final channel reference has been put() during sock_close() but *before* the references to the channel have been destroyed, so instead the code now relies on kref_get_unless_zero/l2cap_chan_hold_unless_zero to prevent referencing a channel that is about to be destroyed.

  refcount_t: increment on 0; use-after-free.
  BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
  Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705

  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S W 4.14.234-00003-g1fb6d0bd49a4-dirty #28
  Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
  Workqueue: hci0 hci_rx_work
  Call trace:
   dump_backtrace+0x0/0x378
   show_stack+0x20/0x2c
   dump_stack+0x124/0x148
   print_address_description+0x80/0x2e8
   __kasan_report+0x168/0x188
   kasan_report+0x10/0x18
   __asan_load4+0x84/0x8c
   refcount_dec_and_test+0x20/0xd0
   l2cap_chan_put+0x48/0x12c
   l2cap_recv_frame+0x4770/0x6550
   l2cap_recv_acldata+0x44c/0x7a4
   hci_acldata_packet+0x100/0x188
   hci_rx_work+0x178/0x23c
   process_one_work+0x35c/0x95c
   worker_thread+0x4cc/0x960
   kthread+0x1a8/0x1c4
   ret_from_fork+0x10/0x18

Cc: stable@kernel.org
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
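The "hold unless zero" idea in the l2cap message can be modelled in a few lines. This is a single-threaded toy, assuming nothing about the real l2cap_chan structure; the kernel helpers do the check and increment atomically, which this sketch does not attempt, and the `Channel` class is purely illustrative:

```python
class Channel:
    """Toy refcounted object; not the kernel's l2cap_chan."""

    def __init__(self):
        self.refs = 1               # the creator holds the initial reference
        self.destroyed = False

    def hold_unless_zero(self):
        """Take a reference only if the object is still alive."""
        if self.refs == 0:
            return False            # teardown already started: do not touch it
        self.refs += 1
        return True

    def put(self):
        """Drop a reference; the last put destroys the object."""
        if self.refs == 0:
            raise RuntimeError("put on zero refcount")
        self.refs -= 1
        if self.refs == 0:
            self.destroyed = True


chan = Channel()
chan.put()                          # final reference dropped, e.g. on close
print(chan.hold_unless_zero())      # False: a late user must back off
```

A plain unconditional increment at the last step would have revived a dead object, which is the "increment on 0; use-after-free" condition the trace reports.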

commit 9c80e79 upstream. The assumption in __disable_kprobe() is wrong, and it could try to disarm an already disarmed kprobe and fire the WARN_ONCE() below. [0] We can easily reproduce this issue. 1. Write 0 to /sys/kernel/debug/kprobes/enabled. # echo 0 > /sys/kernel/debug/kprobes/enabled 2. Run execsnoop. At this time, one kprobe is disabled. # /usr/share/bcc/tools/execsnoop & [1] 2460 PCOMM PID PPID RET ARGS # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes kprobes_all_disarmed to false but does not arm the disabled kprobe. # echo 1 > /sys/kernel/debug/kprobes/enabled # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace(). # fg /usr/share/bcc/tools/execsnoop ^C Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses some cleanups and leaves the aggregated kprobe in the hash table. Then, __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an infinite loop like this. aggregated kprobe.list -> kprobe.list -. ^ | '.__.' In this situation, these commands fall into the infinite loop and result in RCU stall or soft lockup. cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the infinite loop with RCU. /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex, and __get_valid_kprobe() is stuck in the loop. To avoid the issue, make sure we don't call disarm_kprobe() for disabled kprobes. 
[0] Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2) WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Modules linked in: ena CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28 Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94 RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40 R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000 FS: 00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __disable_kprobe (kernel/kprobes.c:1716) disable_kprobe (kernel/kprobes.c:2392) __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340) disable_trace_kprobe (kernel/trace/trace_kprobe.c:429) perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168) perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295) _free_event (kernel/events/core.c:4971) perf_event_release_kernel (kernel/events/core.c:5176) perf_release (kernel/events/core.c:5186) __fput (fs/file_table.c:321) task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1)) exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 
kernel/entry/common.c:169 kernel/entry/common.c:201) syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7fe7ff210654 Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654 RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008 RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30 R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600 R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560 </TASK> Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com Fixes: 69d54b9 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reported-by: Ayushman Dutta <ayudutta@amazon.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: Ayushman Dutta <ayudutta@amazon.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Syzkaller produced the below call trace:

  BUG: KASAN: null-ptr-deref in io_msg_ring+0x3cb/0x9f0
  Write of size 8 at addr 0000000000000070 by task repro/16399

  CPU: 0 PID: 16399 Comm: repro Not tainted 6.1.0-rc1 #28
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7
  Call Trace:
   <TASK>
   dump_stack_lvl+0xcd/0x134
   ? io_msg_ring+0x3cb/0x9f0
   kasan_report+0xbc/0xf0
   ? io_msg_ring+0x3cb/0x9f0
   kasan_check_range+0x140/0x190
   io_msg_ring+0x3cb/0x9f0
   ? io_msg_ring_prep+0x300/0x300
   io_issue_sqe+0x698/0xca0
   io_submit_sqes+0x92f/0x1c30
   __do_sys_io_uring_enter+0xae4/0x24b0
  ....
  RIP: 0033:0x7f2eaf8f8289
  RSP: 002b:00007fff40939718 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2eaf8f8289
  RDX: 0000000000000000 RSI: 0000000000006f71 RDI: 0000000000000004
  RBP: 00007fff409397a0 R08: 0000000000000000 R09: 0000000000000039
  R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004006d0
  R13: 00007fff40939880 R14: 0000000000000000 R15: 0000000000000000
   </TASK>
  Kernel panic - not syncing: panic_on_warn set ...

We don't have a NULL check on file_ptr in io_msg_send_fd() function, so when file_ptr is NULL src_file is also NULL and get_file() dereferences a NULL pointer and leads to above crash.

Add a NULL check to fix this issue.

Fixes: e6130eb ("io_uring: add support for passing fixed file descriptors")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20221019171218.1337614-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

[ Upstream commit e8b7445 ] For quite some time we were chasing a bug which looked like a sudden permanent failure of networking and mmc on some of our devices. The bug was very sensitive to any software changes and even more to any kernel debug options. Finally we got a setup where the problem was reproducible with CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma: [ 16.992082] ------------[ cut here ]------------ [ 16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes] [ 17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900 [ 17.018977] Modules linked in: xxxxx [ 17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 starfive-tech#28 [ 17.045345] Hardware name: xxxxx [ 17.049528] pstate: 60000005 (nZCv daif -PAN -UAO) [ 17.054322] pc : check_unmap+0x6a0/0x900 [ 17.058243] lr : check_unmap+0x6a0/0x900 [ 17.062163] sp : ffffffc010003c40 [ 17.065470] x29: ffffffc010003c40 x28: 000000004000c03c [ 17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800 [ 17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8 [ 17.081407] x23: 0000000000000000 x22: ffffffc010a08750 [ 17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000 [ 17.092032] x19: 0000000875e3e244 x18: 0000000000000010 [ 17.097343] x17: 0000000000000000 x16: 0000000000000000 [ 17.102647] x15: ffffff8879e4a988 x14: 0720072007200720 [ 17.107959] x13: 0720072007200720 x12: 0720072007200720 [ 17.113261] x11: 0720072007200720 x10: 0720072007200720 [ 17.118565] x9 : 0720072007200720 x8 : 000000000000022d [ 17.123869] x7 : 0000000000000015 x6 : 0000000000000098 [ 17.129173] x5 : 0000000000000000 x4 : 0000000000000000 [ 17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370 [ 17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000 [ 17.145082] Call trace: [ 17.147524] check_unmap+0x6a0/0x900 [ 17.151091] debug_dma_unmap_page+0x88/0x90 [ 17.155266] 
gem_rx+0x114/0x2f0 [ 17.158396] macb_poll+0x58/0x100 [ 17.161705] net_rx_action+0x118/0x400 [ 17.165445] __do_softirq+0x138/0x36c [ 17.169100] irq_exit+0x98/0xc0 [ 17.172234] __handle_domain_irq+0x64/0xc0 [ 17.176320] gic_handle_irq+0x5c/0xc0 [ 17.179974] el1_irq+0xb8/0x140 [ 17.183109] xiic_process+0x5c/0xe30 [ 17.186677] irq_thread_fn+0x28/0x90 [ 17.190244] irq_thread+0x208/0x2a0 [ 17.193724] kthread+0x130/0x140 [ 17.196945] ret_from_fork+0x10/0x20 [ 17.200510] ---[ end trace 7240980785f81d6f ]--- [ 237.021490] ------------[ cut here ]------------ [ 237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b [ 237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240 [ 237.041802] Modules linked in: xxxxx [ 237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.4.0 starfive-tech#28 [ 237.068941] Hardware name: xxxxx [ 237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 237.077900] pc : add_dma_entry+0x214/0x240 [ 237.081986] lr : add_dma_entry+0x214/0x240 [ 237.086072] sp : ffffffc010003c30 [ 237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00 [ 237.094683] x27: 0000000000000180 x26: ffffff8878e387c0 [ 237.099987] x25: 0000000000000002 x24: 0000000000000000 [ 237.105290] x23: 000000000000003b x22: ffffffc010a0fa00 [ 237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600 [ 237.115897] x19: 00000000ffffffef x18: 0000000000000010 [ 237.121201] x17: 0000000000000000 x16: 0000000000000000 [ 237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720 [ 237.131807] x13: 0720072007200720 x12: 0720072007200720 [ 237.137111] x11: 0720072007200720 x10: 0720072007200720 [ 237.142415] x9 : 0720072007200720 x8 : 0000000000000259 [ 237.147718] x7 : 0000000000000001 x6 : 0000000000000000 [ 237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001 [ 237.158325] x3 : 0000000000000006 x2 : 0000000000000007 [ 237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000 [ 237.168932] Call trace: [ 237.171373] 
add_dma_entry+0x214/0x240 [ 237.175115] debug_dma_map_page+0xf8/0x120 [ 237.179203] gem_rx_refill+0x190/0x280 [ 237.182942] gem_rx+0x224/0x2f0 [ 237.186075] macb_poll+0x58/0x100 [ 237.189384] net_rx_action+0x118/0x400 [ 237.193125] __do_softirq+0x138/0x36c [ 237.196780] irq_exit+0x98/0xc0 [ 237.199914] __handle_domain_irq+0x64/0xc0 [ 237.204000] gic_handle_irq+0x5c/0xc0 [ 237.207654] el1_irq+0xb8/0x140 [ 237.210789] arch_cpu_idle+0x40/0x200 [ 237.214444] default_idle_call+0x18/0x30 [ 237.218359] do_idle+0x200/0x280 [ 237.221578] cpu_startup_entry+0x20/0x30 [ 237.225493] rest_init+0xe4/0xf0 [ 237.228713] arch_call_rest_init+0xc/0x14 [ 237.232714] start_kernel+0x47c/0x4a8 [ 237.236367] ---[ end trace 7240980785f81d70 ]--- Lars was fast to find an explanation: according to the datasheet bit 2 of the rx buffer descriptor entry has a different meaning in the extended mode: Address [2] of beginning of buffer, or in extended buffer descriptor mode (DMA configuration register [28] = 1), indicates a valid timestamp in the buffer descriptor entry. The macb driver didn't mask this bit while getting an address and it eventually caused a memory corruption and a dma failure. The problem is resolved by explicitly clearing the problematic bit if hw timestamping is used. Fixes: 7b42961 ("net: macb: Add support for PTP timestamps in DMA descriptors") Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Co-developed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
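The macb explanation comes down to one bit: in extended descriptor mode, bit 2 of the rx descriptor address word is a "valid timestamp" flag rather than part of the address. A minimal sketch of the masking, assuming the other flag bits of the word have already been stripped; the constant and function names are made up for illustration and are not the driver's identifiers:

```python
RX_TS_VALID = 1 << 2    # bit 2: timestamp-valid flag in extended mode (illustrative name)


def rx_buffer_addr(desc_addr, hw_timestamping):
    """Return the DMA address to use for a descriptor's address word.

    When hardware timestamping is in use, bit 2 is a flag and must be
    cleared before the value is treated as an address.
    """
    if hw_timestamping:
        return desc_addr & ~RX_TS_VALID
    return desc_addr


# Device address from the first DMA-API warning above; note that bit 2 is set.
bad = 0x0000000875e3e244
print(hex(rx_buffer_addr(bad, hw_timestamping=True)))    # 0x875e3e240
print(hex(rx_buffer_addr(bad, hw_timestamping=False)))   # 0x875e3e244
```

The unmasked value is off by 4 from the masked one, which fits the DMA-API complaint that the driver tried to free memory it had not allocated.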

farzadfch and others added 30 commits November 1, 2018 23:09

nvdla: add NVDLA driver

ae5775c

add the atom size for nv_small

0b8fd07

update nvdla driver

b6d83bc

Stable update

89362d1

Add efi_status_to_str() and rework efi_status_to_err().

e0e99dd

This adds efi_status_to_str() for use when printing efi_status_t messages, and reworks efi_status_to_err() so that the two use a common list of errors. Upstream Status: RHEL only Signed-off-by: Peter Jones <pjones@redhat.com>

Make get_cert_list() use efi_status_to_str() to print error messages.

a82f63c

Upstream Status: RHEL only Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

security: lockdown: expose a hook to lock the kernel down

cd7c2d9

In order to automatically lock down kernels running on UEFI machines booted in Secure Boot mode, expose the lock_kernel_down() hook. Upstream Status: RHEL only Signed-off-by: Jeremy Cline <jcline@redhat.com>

s390: Lock down the kernel when the IPL secure flag is set

7916966

Automatically lock down the kernel to LOCKDOWN_CONFIDENTIALITY_MAX if the IPL secure flag is set. Upstream Status: RHEL only Suggested-by: Philipp Rudo <prudo@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com>

arm: make CONFIG_HIGHPTE optional without CONFIG_EXPERT

681317e

We will use this to force CONFIG_HIGHPTE off on LPAE for now Signed-off-by: Jon Masters <jcm@redhat.com>

ARM: tegra: usb no reset

20ec6de

Patch for disconnect issues with storage attached to a tegra-ehci controller

Drop that for now

52b3593

pdp7 and others added 5 commits June 16, 2021 01:57

corrected API changes

09aa15b

Manual merged the NVDLA

f3c4bfe

Merge branch 'nvdla_driver' of github.com:cybergaszcz/linux into nvdla_driver

8817c2c

Corrected API changes for the NVDLA

ca466b8

cybergaszcz closed this Jun 20, 2021

cybergaszcz reopened this Jun 20, 2021

manual merge

e148a52

esmil force-pushed the esmil_starlight branch from 0b0891d to 0304da1 Compare June 20, 2021 22:17

cybergaszcz closed this Jun 21, 2021

cybergaszcz deleted the nvdla_driver branch June 21, 2021 06:11
