Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "drivers/tty/serial/8250: update driver for VIC7100" #21

Closed
wants to merge 61 commits into from
Closed

Revert "drivers/tty/serial/8250: update driver for VIC7100" #21

wants to merge 61 commits into from

Conversation

pdp7
Copy link

@pdp7 pdp7 commented Jun 10, 2021

This reverts commit 3cdd633.

drivers/tty/serial/8250 does not need this #ifdef CONFIG_SOC_STARFIVE_VIC7100 hack for the console (uart3) to work.

@davidlt noticed that this was likely added for the "high speed" uart ports (hs_uart) but we also suspect that it might be possible to just use PORT_16550A for those as well. hs_uart is used for uart0 and uart1. uart0 is used for the Bluetooth module.

@esmil this revert commit probably not necessary. I would recommend just dropping 3cdd633 next time you rebase the starlight branch.

sunnanyong and others added 30 commits May 22, 2021 10:19
In riscv, a page table entry is leaf when any bit of read, write,
or execute bit is set. So add a macro:_PAGE_LEAF instead of
(_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC), which is frequently used
to determine if it is a leaf page. This make code easier to read,
without any functional change.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
In the definition in Documentation/vm/arch_pgtable_helpers.rst,
pmd_bad() means test a non-table mapped PMD, so it should also
return true when it is a leaf page.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Add a parameter: stride for __sbi_tlb_flush_range(),
represent the page stride between the address of start and end.
Normally, the stride is PAGE_SIZE, and when flush huge page
address, the stride can be the huge page size such as:PMD_SIZE,
then it only need to flush one tlb entry if the address range
within PMD_SIZE.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Bring Transparent HugePage support to riscv. A
transparent huge page is always represented as a pmd.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
HAVE_MOVE_PUD enables remapping pages at the PUD level if both the source
and destination addresses are PUD-aligned.
HAVE_MOVE_PMD does similar speedup on the PMD level.

With HAVE_MOVE_PUD enabled, there is about a 143x improvement on qemu
With HAVE_MOVE_PMD enabled, there is about a 5x improvement on qemu

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The empty_zero_page sits at .bss..page_aligned section, so will be
cleared to zero during clearing bss, we don't need to clear it again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Enable the PCI resource mapping on RISC-V using the generic framework.
This allows userspace applications to mmap PCI resources using
/sys/devices/pci*/*/resource* interface.
The mmap has been tested with Intel x520-DA2 NIC card on a HiFive
Unmatched board (SiFive FU740 SoC).

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Make setup_bootmem() static.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The _sdata/_edata is already in sections.h, drop redundant
declaration.

Also move _xiprom/_exiprom declarations at the beginning of
the file, cleanup one CONFIG_XIP_KERNEL.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Directly passing the cpu to flush_icache_deferred() rather than calling
smp_processor_id() again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Palmer: drop the QEMU performance numbers, and update the comment]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The has_fpu check sits at hot code path: switch_to(). Currently, has_fpu
is a bool variable if FPU=y, switch_to() checks it each time, we can
optimize out this check by turning the has_fpu into a static key.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Inspired by commit ba090f9 ("arm64: kprobes: Remove redundant
kprobe_step_ctx"), the ss_pending and match_addr of kprobe_step_ctx
are redundant because those can be replaced by KPROBE_HIT_SS and
&cur_kprobe->ainsn.api.insn[0] + GET_INSN_LENGTH(cur->opcode)
respectively.

Remove the kprobe_step_ctx to simplify the code.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
These functions are not needed after booting, so mark them as __init
to move them to the __init section.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Consolidate the following items in init.c

Staticize global vars as much as possible;
Add __initdata mark if the global var isn't needed after init
Add __init mark if the func isn't needed after init
Add __ro_after_init if the global var is read only after init

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Fix a Kconfig warning and many build errors:

WARNING: unmet direct dependencies detected for COMPACTION
  Depends on [n]: MMU [=n]
  Selected by [y]:
  - TRANSPARENT_HUGEPAGE [=y] && HAVE_ARCH_TRANSPARENT_HUGEPAGE [=y]

and the subseqent thousands of build errors and warnings.

Fixes: e88b333 ("riscv: mm: add THP support on 64-bit")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
We map kernel pages into all addresses spages, so they can be marked as
global.  This allows hardware to avoid flushing the kernel mappings when
moving between address spaces.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[Palmer: commit text]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Correct the order of the descriptions for the "interrupts" property to
match the order of the "interrupt-names" property.

Fixes: 68989fe ("dt-bindings: usb: Convert cdns-usb3.txt to YAML schema")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
As of commit 4cdc2ec ("mmc: dw_mmc: move rockchip related code
to a separate file"), dw_mmc-pltfm.c no longer uses the clock API.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add vendor prefix for StarFive Technology Co. Ltd [1]. StarFive was
formed in 2018 and has now produced their first SoC, the JH7100, which
contains 64-bit RISC-V cores [2]. It used in the BeagleV Starlight [3].

[1] https://starfivetech.com/site/company
[2] https://github.com/beagleboard/beaglev-starlight
[3] https://github.com/starfive-tech/beaglev_doc

Signed-off-by: Drew Fustini <drew@beagleboard.org>
Add preliminary Device Tree bindings for the StarFive JH7100 Clock
Generator.

To be verified against documentation when it becomes available.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
…nitions

Add all clock outputs for the StarFive JH7100 Clock Generator, based on
the list of fixed-frequency clocks defined in jh7100.dtsi.

To be verified against documentation when it becomes available.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add a preliminary driver for the StarFive JH7100 Clock Generator.
For now, all clocks are implemented as fixed-factor clocks relative to
osc0, based on the list of fixed-frequency clocks defined in
jh7100.dtsi.

To be updated when the documentation becomes available.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add bindings for the GPIO controller in the StarFive JH7100 SoC [1].

[1] https://github.com/starfive-tech/beaglev_doc

Signed-off-by: Drew Fustini <drew@beagleboard.org>
This SoC is used on the BeagleV Starlight JH7100 board [1].

[1] https://github.com/beagleboard/beaglev-starlight

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
The first DMAC instance in the StarFive JH7100 SoC supports 16 DMA
channels.

FIXME Given there are more changes to the driver than just increasing
      DMAC_MAX_CHANNELS, we probably need a new compatible value, too.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add bindings for the temperature sensor on the Starfive JH7100 SoC.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Register definitions based on sfctemp driver in the StarFive
5.10 kernel by Samin Guo <samin.guo@starfivetech.com>.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Tom and others added 17 commits June 7, 2021 01:11
drivers/usb/cdns3/
drivers/usb/core/
drivers/usb/host/
include/linux/usb.h

Geert: Rebase to v5.13-rc1
Stafford: Don't flush NULL values

Signed-off-by: Stafford Horne <shorne@gmail.com>
1, add ov5640&sc2235 drivers, update stf_isp
2, add MIPI/CSI/DSI drivers for VIC7100
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Stephen L Arnold <nerdboy@gentoo.org>
…_EMULATION

riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L0 ':
tda998x.c:(.text+0x51c): undefined reference to `drm_encoder_cleanup'
riscv64-linux-gnu-ld: tda998x.c:(.text+0x534): undefined reference to `drm_encoder_cleanup'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L75':
tda998x.c:(.text+0x564): undefined reference to `drm_of_find_possible_crtcs'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `tda998x_encoder_destroy':
tda998x.c:(.text+0x58a): undefined reference to `drm_encoder_init'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `tda998x_bind':
tda998x.c:(.text+0x5b2): undefined reference to `drm_bridge_attach'
riscv64-linux-gnu-ld: tda998x.c:(.text+0x5c0): undefined reference to `drm_encoder_cleanup'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L0 ':
tda998x.c:(.text+0x692): undefined reference to `drm_bridge_remove'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L124':
tda998x.c:(.text+0x904): undefined reference to `hdmi_infoframe_pack'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L135':
tda998x.c:(.text+0x9e8): undefined reference to `drm_connector_cleanup'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L0 ':
tda998x.c:(.text+0xa00): undefined reference to `drm_connector_cleanup'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L143':
tda998x.c:(.text+0xa56): undefined reference to `drm_connector_init'
riscv64-linux-gnu-ld: tda998x.c:(.text+0xa66): undefined reference to `drm_connector_attach_encoder'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `tda998x_bridge_detach':
tda998x.c:(.text+0xa8a): undefined reference to `__drm_err'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L189':
tda998x.c:(.text+0xd16): undefined reference to `drm_do_get_edid'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L0 ':
tda998x.c:(.text+0xd34): undefined reference to `drm_connector_update_edid_property'
riscv64-linux-gnu-ld: tda998x.c:(.text+0xd4e): undefined reference to `drm_add_edid_modes'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L165':
tda998x.c:(.text+0xd5a): undefined reference to `drm_detect_monitor_audio'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L0 ':
tda998x.c:(.text+0xfb0): undefined reference to `__drm_dbg'
riscv64-linux-gnu-ld: tda998x.c:(.text+0x108a): undefined reference to `drm_kms_helper_hotplug_event'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L283':
tda998x.c:(.text+0x1844): undefined reference to `drm_hdmi_avi_infoframe_from_display_mode'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.L361':
tda998x.c:(.text+0x2078): undefined reference to `drm_bridge_add'
riscv64-linux-gnu-ld: drivers/video/fbdev/starfive/tda998x.o: in function `.LANCHOR0':
tda998x.c:(.rodata+0x90): undefined reference to `drm_helper_connector_dpms'
riscv64-linux-gnu-ld: tda998x.c:(.rodata+0x98): undefined reference to `drm_atomic_helper_connector_reset'
riscv64-linux-gnu-ld: tda998x.c:(.rodata+0xb0): undefined reference to `drm_helper_probe_single_connector_modes'
riscv64-linux-gnu-ld: tda998x.c:(.rodata+0xd8): undefined reference to `drm_atomic_helper_connector_duplicate_state'
riscv64-linux-gnu-ld: tda998x.c:(.rodata+0xe0): undefined reference to `drm_atomic_helper_connector_destroy_state'

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
…flict

    starfive,vpp-lcdc 12000000.sfivefb: can't request region for resource [mem 0xfb000000-0xfcffffff]
    starfive,vpp-lcdc 12000000.sfivefb: Fail to allocate video RAM
    starfive,vpp-lcdc 12000000.sfivefb: starfive fb init fail
    starfive,vpp-lcdc 12000000.sfivefb: fb info init FAIL
    starfive,vpp-lcdc: probe of 12000000.sfivefb failed with error -16

devm_ioremap_resource() calls devm_request_mem_region(), which fails as
the reserved memory for the frame buffer is already present in the
resource list, cfr. /proc/iomem:

    fb000000-fcffffff : Reserved

Fix this by mapping the frame buffer memory using devm_ioremap().

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
This IP is also used on the Starfive JH7100 riscv64 SoC and presumably
also the upcoming JH7110 SoC.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
The BeagleV Starlight v0.9 board doesn't have the IRQB line from the
pmic routed to the SoC, so this hack is needed to allow the driver
to be loaded without it. See
beagleboard/beaglev-starlight#14

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Based on the device tree in https://github.com/starfive-tech/u-boot/
with contributions from:
yanhong.wang <yanhong.wang@starfivetech.com>
Huan.Feng <huan.feng@starfivetech.com>
ke.zhu <ke.zhu@starfivetech.com>
yiming.li <yiming.li@starfivetech.com>
jack.zhu <jack.zhu@starfivetech.com>
Samin Guo <samin.guo@starfivetech.com>
Chenjieqin <Jessica.Chen@starfivetech.com>
bo.li <bo.li@starfivetech.com>

Rearranged, cleanups, fixes and TPS65086 added by Emil.
Cleanups, fixes and LED added by Geert.
Cleanups and GPIO fixes from Drew.
Thermal zone added by Stephen.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Stephen L Arnold <nerdboy@gentoo.org>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
For convenience this also adds a small starlight_defconfig and the
firmware needed for the brcmfmac driver along with the signed regulatory
database.

The firmware is from the linux-firmware repo and the regulatory database
from the wireless-regdb Fedora package.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
@esmil
Copy link
Owner

esmil commented Jun 13, 2021

I've dropped this patch when I rebased on rc6

@esmil esmil closed this Jun 13, 2021
esmil pushed a commit that referenced this pull request Aug 16, 2021
When using kprobe on powerpc booke series processor, Oops happens
as show bellow:

/ # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
/ # sleep 1
[   50.076730] Oops: Exception in kernel mode, sig: 5 [#1]
[   50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
[   50.077221] Modules linked in:
[   50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21
[   50.077887] NIP:  c0b9c4e0 LR: c00ebecc CTR: 00000000
[   50.078067] REGS: c3883de0 TRAP: 0700   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.078349] MSR:  00029000 <CE,EE,ME>  CR: 24000228  XER: 20000000
[   50.078675]
[   50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001
[   50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4
[   50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000
[   50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000
[   50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190
[   50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0
[   50.080638] Call Trace:
[   50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable)
[   50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110
[   50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28
[   50.081541] --- interrupt: c00 at 0x100a4d08
[   50.081749] NIP:  100a4d08 LR: 101b5234 CTR: 00000003
[   50.081931] REGS: c3883f50 TRAP: 0c00   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.082183] MSR:  0002f902 <CE,EE,PR,FP,ME>  CR: 24000222  XER: 00000000
[   50.082457]
[   50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff
[   50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4
[   50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000
[   50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8
[   50.083789] NIP [100a4d08] 0x100a4d08
[   50.083917] LR [101b5234] 0x101b5234
[   50.084042] --- interrupt: c00
[   50.084238] Instruction dump:
[   50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010
[   50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048
[   50.085487] ---[ end trace f6fffe98e2fa8f3e ]---
[   50.085678]
Trace/breakpoint trap

There is no real mode for booke arch and the MMU translation is
always on. The corresponding MSR_IS/MSR_DS bit in booke is used
to switch the address space, but not for real mode judgment.

Fixes: 21f8b2f ("powerpc/kprobes: Ignore traps that happened in real mode")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com
esmil pushed a commit that referenced this pull request Aug 18, 2021
[ Upstream commit 43e8f76 ]

When using kprobe on powerpc booke series processor, Oops happens
as show bellow:

/ # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
/ # sleep 1
[   50.076730] Oops: Exception in kernel mode, sig: 5 [#1]
[   50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
[   50.077221] Modules linked in:
[   50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21
[   50.077887] NIP:  c0b9c4e0 LR: c00ebecc CTR: 00000000
[   50.078067] REGS: c3883de0 TRAP: 0700   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.078349] MSR:  00029000 <CE,EE,ME>  CR: 24000228  XER: 20000000
[   50.078675]
[   50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001
[   50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4
[   50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000
[   50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000
[   50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190
[   50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0
[   50.080638] Call Trace:
[   50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable)
[   50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110
[   50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28
[   50.081541] --- interrupt: c00 at 0x100a4d08
[   50.081749] NIP:  100a4d08 LR: 101b5234 CTR: 00000003
[   50.081931] REGS: c3883f50 TRAP: 0c00   Not tainted (5.14.0-rc4-00022-g251a1524293d)
[   50.082183] MSR:  0002f902 <CE,EE,PR,FP,ME>  CR: 24000222  XER: 00000000
[   50.082457]
[   50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff
[   50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4
[   50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f 10240000 00500000
[   50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8
[   50.083789] NIP [100a4d08] 0x100a4d08
[   50.083917] LR [101b5234] 0x101b5234
[   50.084042] --- interrupt: c00
[   50.084238] Instruction dump:
[   50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010
[   50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048
[   50.085487] ---[ end trace f6fffe98e2fa8f3e ]---
[   50.085678]
Trace/breakpoint trap

There is no real mode for booke arch and the MMU translation is
always on. The corresponding MSR_IS/MSR_DS bit in booke is used
to switch the address space, but not for real mode judgment.

Fixes: 21f8b2f ("powerpc/kprobes: Ignore traps that happened in real mode")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
esmil pushed a commit that referenced this pull request Oct 3, 2021
The id of slv_cnoc_mnoc_cfg node is mistakenly coded as id of
slv_blsp_1.  It causes the following warning on slv_blsp_1 node adding.
Correct the id of slv_cnoc_mnoc_cfg node.

[    1.948180] ------------[ cut here ]------------
[    1.954122] WARNING: CPU: 2 PID: 7 at drivers/interconnect/core.c:962 icc_node_add+0xe4/0xf8
[    1.958994] Modules linked in:
[    1.967399] CPU: 2 PID: 7 Comm: kworker/u16:0 Not tainted 5.14.0-rc6-next-20210818 #21
[    1.970275] Hardware name: Xiaomi Redmi Note 7 (DT)
[    1.978169] Workqueue: events_unbound deferred_probe_work_func
[    1.982945] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.988849] pc : icc_node_add+0xe4/0xf8
[    1.995699] lr : qnoc_probe+0x350/0x438
[    1.999519] sp : ffff80001008bb10
[    2.003337] x29: ffff80001008bb10 x28: 000000000000001a x27: ffffb83ddc61ee28
[    2.006818] x26: ffff2fe341d44080 x25: ffff2fe340f3aa80 x24: ffffb83ddc98f0e8
[    2.013938] x23: 0000000000000024 x22: ffff2fe3408b7400 x21: 0000000000000000
[    2.021054] x20: ffff2fe3408b7410 x19: ffff2fe341d44080 x18: 0000000000000010
[    2.028173] x17: ffff2fe3bdd0aac0 x16: 0000000000000281 x15: ffff2fe3400f5528
[    2.035290] x14: 000000000000013f x13: ffff2fe3400f5528 x12: 00000000ffffffea
[    2.042410] x11: ffffb83ddc9109d0 x10: ffffb83ddc8f8990 x9 : ffffb83ddc8f89e8
[    2.049527] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
[    2.056645] x5 : 0000000000057fa8 x4 : 0000000000000000 x3 : ffffb83ddc9903b0
[    2.063764] x2 : 1a1f6fde34d45500 x1 : ffff2fe340f3a880 x0 : ffff2fe340f3a880
[    2.070882] Call trace:
[    2.077989]  icc_node_add+0xe4/0xf8
[    2.080247]  qnoc_probe+0x350/0x438
[    2.083718]  platform_probe+0x68/0xd8
[    2.087191]  really_probe+0xb8/0x300
[    2.091011]  __driver_probe_device+0x78/0xe0
[    2.094659]  driver_probe_device+0x80/0x110
[    2.098911]  __device_attach_driver+0x90/0xe0
[    2.102818]  bus_for_each_drv+0x78/0xc8
[    2.107331]  __device_attach+0xf0/0x150
[    2.110977]  device_initial_probe+0x14/0x20
[    2.114796]  bus_probe_device+0x9c/0xa8
[    2.118963]  deferred_probe_work_func+0x88/0xc0
[    2.122784]  process_one_work+0x1a4/0x338
[    2.127296]  worker_thread+0x1f8/0x420
[    2.131464]  kthread+0x150/0x160
[    2.135107]  ret_from_fork+0x10/0x20
[    2.138495] ---[ end trace 5eea8768cb620e87 ]---

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Fixes: f80a1d4 ("interconnect: qcom: Add SDM660 interconnect provider driver")
Link: https://lore.kernel.org/r/20210823014003.31391-1-shawn.guo@linaro.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
esmil pushed a commit that referenced this pull request Mar 7, 2022
When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
esmil pushed a commit that referenced this pull request Mar 16, 2022
[ Upstream commit 4224cfd ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
esmil pushed a commit that referenced this pull request Jun 20, 2023
The cited commit adds a compeletion to remove dependency on rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, flow will be added to encap
entry's flows list. If neigh update is running at this time, the
following encaps of the flow can't hold the encap_tbl_lock and
sleep. If neigh update thread is waiting for that flow's init_done,
deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
esmil pushed a commit that referenced this pull request Sep 24, 2023
commit 133466c ("net: stmmac: use per-queue 64 bit statistics
where necessary") caused one regression as found by Uwe, the backtrace
looks like:

	INFO: trying to register non-static key.
	The code is fine but needs lockdep annotation, or maybe
	you didn't initialize this object before use?
	turning off the locking correctness validator.
	CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc1-00449-g133466c3bbe1-dirty #21
	Hardware name: STM32 (Device Tree Support)
	 unwind_backtrace from show_stack+0x18/0x1c
	 show_stack from dump_stack_lvl+0x60/0x90
	 dump_stack_lvl from register_lock_class+0x98c/0x99c
	 register_lock_class from __lock_acquire+0x74/0x293c
	 __lock_acquire from lock_acquire+0x134/0x398
	 lock_acquire from stmmac_get_stats64+0x2ac/0x2fc
	 stmmac_get_stats64 from dev_get_stats+0x44/0x130
	 dev_get_stats from rtnl_fill_stats+0x38/0x120
	 rtnl_fill_stats from rtnl_fill_ifinfo+0x834/0x17f4
	 rtnl_fill_ifinfo from rtmsg_ifinfo_build_skb+0xc0/0x144
	 rtmsg_ifinfo_build_skb from rtmsg_ifinfo+0x50/0x88
	 rtmsg_ifinfo from __dev_notify_flags+0xc0/0xec
	 __dev_notify_flags from dev_change_flags+0x50/0x5c
	 dev_change_flags from ip_auto_config+0x2f4/0x1260
	 ip_auto_config from do_one_initcall+0x70/0x35c
	 do_one_initcall from kernel_init_freeable+0x2ac/0x308
	 kernel_init_freeable from kernel_init+0x1c/0x138
	 kernel_init from ret_from_fork+0x14/0x2c

The reason is the rxq|txq_stats structures are not what expected
because stmmac_open() -> __stmmac_open() the structure is overwritten
by "memcpy(&priv->dma_conf, dma_conf, sizeof(*dma_conf));"
This causes the well initialized syncp member of rxq|txq_stats is
overwritten unexpectedly as pointed out by Johannes and Uwe.

Fix this issue by moving rxq|txq_stats back to stmmac_extra_stats. For
SMP cache friendly, we also mark stmmac_txq_stats and stmmac_rxq_stats
as ____cacheline_aligned_in_smp.

Fixes: 133466c ("net: stmmac: use per-queue 64 bit statistics where necessary")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230917165328.3403-1-jszhang@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
esmil pushed a commit that referenced this pull request Sep 26, 2023
commit 0b0747d upstream.

The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
esmil pushed a commit that referenced this pull request Dec 11, 2023
When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   #15 [ffff88aa841efb88] device_del at ffffffff82179d23
   #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   #20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.