Commit Graph

1323858 Commits

Author SHA1 Message Date
Takashi Iwai
c34e9ab9a6 ASoC: Fixes for v6.13
A few small fixes for v6.13, all system specific - the biggest thing is
 the fix for jack handling over suspend on some Intel laptops.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmdR3IYACgkQJNaLcl1U
 h9CaAAgAhU+wN7LEym7648q33gVEy/I335+ZHf0gLEMnAF1iNzoOx0Gy3cXBPHq3
 sic1P37UmkIIWi6BTr19aBxQ9Z0Vk3WhUsk+elmg3vhR1lodBZ9m8lYlLyEGbCh/
 Ur/AFSoewPbYGdJAVL7FclDiMXlnallF6xFWbh9O9Fw85hTh4WF07dRyws8j9tZQ
 qMkoic95lLPZTCplt8vHVC9sTXWuVp1HNiUKZDLLQ044PRlehLH21W4HJJgk/Dtl
 TO5U1zZpY3gB1QsxaR3+6vMDgPHHCUxvRkb4/hydHmKIqoFGuu0Ootipm9su1b/L
 jOKeEX2Gk6416fHpTWPUvJQTlv9MAA==
 =lnCs
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v6.13-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.13

A few small fixes for v6.13, all system specific - the biggest thing is
the fix for jack handling over suspend on some Intel laptops.
2024-12-05 18:09:29 +01:00
Ming Lei
7678abee08 virtio-blk: don't keep queue frozen during system suspend
Commit 4ce6e2db00 ("virtio-blk: Ensure no requests in virtqueues before
deleting vqs.") replaces queue quiesce with queue freeze in virtio-blk's
PM callbacks. And the motivation is to drain inflight IOs before suspending.

block layer's queue freeze looks very handy, but it is also easy to cause
deadlock, such as, any attempt to call into bio_queue_enter() may run into
deadlock if the queue is frozen in current context. There are all kinds
of ->suspend() called in suspend context, so keeping queue frozen in the
whole suspend context isn't one good idea. And Marek reported lockdep
warning[1] caused by virtio-blk's freeze queue in virtblk_freeze().

[1] https://lore.kernel.org/linux-block/ca16370e-d646-4eee-b9cc-87277c89c43c@samsung.com/

Given the motivation is to drain in-flight IOs, it can be done by calling
freeze & unfreeze, meantime restore to previous behavior by keeping queue
quiesced during suspend.

Cc: Yi Sun <yi.sun@unisoc.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: virtualization@lists.linux.dev
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20241112125821.1475793-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-05 10:00:20 -07:00
Thomas Gleixner
76031d9536 clocksource: Make negative motion detection more robust
Guenter reported boot stalls on a emulated ARM 32-bit platform, which has a
24-bit wide clocksource.

It turns out that the calculated maximal idle time, which limits idle
sleeps to prevent clocksource wrap arounds, is close to the point where the
negative motion detection triggers.

  max_idle_ns:                    597268854 ns
  negative motion tripping point: 671088640 ns

If the idle wakeup is delayed beyond that point, the clocksource
advances far enough to trigger the negative motion detection. This
prevents the clock to advance and in the worst case the system stalls
completely if the consecutive sleeps based on the stale clock are
delayed as well.

Cure this by calculating a more robust cut-off value for negative motion,
which covers 87.5% of the actual clocksource counter width. Compare the
delta against this value to catch negative motion. This is specifically for
clock sources with a small counter width as their wrap around time is close
to the half counter width. For clock sources with wide counters this is not
a problem because the maximum idle time is far from the half counter width
due to the math overflow protection constraints.

For the case at hand this results in a tripping point of 1174405120ns.

Note, that this cannot prevent issues when the delay exceeds the 87.5%
margin, but that's not different from the previous unchecked version which
allowed arbitrary time jumps.

Systems with small counter width are prone to invalid results, but this
problem is unlikely to be seen on real hardware. If such a system
completely stalls for more than half a second, then there are other more
urgent problems than the counter wrapping around.

Fixes: c163e40af9 ("timekeeping: Always check for negative motion")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/8734j5ul4x.ffs@tglx
Closes: https://lore.kernel.org/all/387b120b-d68a-45e8-b6ab-768cd95d11c2@roeck-us.net
2024-12-05 16:03:24 +01:00
Suzuki K Poulose
16d5306629 coco: virt: arm64: Do not enable cca guest driver by default
As per the guidelines, new drivers may not be set to default on.
An expert user can always select it.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Sami Mujawar <sami.mujawar@arm.com>
Link: https://lore.kernel.org/r/6750c695194cd_2508129427@dwillia2-xfh.jf.intel.com.notmuch
Link: https://lore.kernel.org/r/20241205143634.306114-1-suzuki.poulose@arm.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05 14:50:20 +00:00
Imre Deak
59ca0e1c97 drm/dp_mst: Use reset_msg_rx_state() instead of open coding it
Use reset_msg_rx_state() in drm_dp_mst_handle_up_req() instead of
open-coding it.

Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-8-imre.deak@intel.com
2024-12-05 16:49:41 +02:00
Steven Rostedt
dc1b157b82 tracing: Fix archs that still call tracepoints without RCU watching
Tracepoints require having RCU "watching" as it uses RCU to do updates to
the tracepoints. There are some cases that would call a tracepoint when
RCU was not "watching". This was usually in the idle path where RCU has
"shutdown". For the few locations that had tracepoints without RCU
watching, there was an trace_*_rcuidle() variant that could be used. This
used SRCU for protection.

There are tracepoints that trace when interrupts and preemption are
enabled and disabled. In some architectures, these tracepoints are called
in a path where RCU is not watching. When x86 and arm64 removed these
locations, it was incorrectly assumed that it would be safe to remove the
trace_*_rcuidle() variant and also remove the SRCU logic, as it made the
code more complex and harder to implement new tracepoint features (like
faultable tracepoints and tracepoints in rust).

Instead of bringing back the trace_*_rcuidle(), as it will not be trivial
to do as new code has already been added depending on its removal, add a
workaround to the one file that still requires it (trace_preemptirq.c). If
the architecture does not define CONFIG_ARCH_WANTS_NO_INSTR, then check if
the code is in the idle path, and if so, call ct_irq_enter/exit() which
will enable RCU around the tracepoint.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/20241204100414.4d3e06d0@gandalf.local.home
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 48bcda6848 ("tracing: Remove definition of trace_*_rcuidle()")
Closes: https://lore.kernel.org/all/bddb02de-957a-4df5-8e77-829f55728ea2@roeck-us.net/
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-05 09:28:58 -05:00
Imre Deak
2b245c97b1 drm/dp_mst: Reset message rx state after OOM in drm_dp_mst_handle_up_req()
After an out-of-memory error the reception state should be reset, so
that the next attempt receiving a message doesn't fail (due to getting a
start-of-message packet, while the reception state has already the
start-of-message flag set).

Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-7-imre.deak@intel.com
2024-12-05 16:19:35 +02:00
Imre Deak
e54b00086f drm/dp_mst: Ensure mst_primary pointer is valid in drm_dp_mst_handle_up_req()
While receiving an MST up request message from one thread in
drm_dp_mst_handle_up_req(), the MST topology could be removed from
another thread via drm_dp_mst_topology_mgr_set_mst(false), freeing
mst_primary and setting drm_dp_mst_topology_mgr::mst_primary to NULL.
This could lead to a NULL deref/use-after-free of mst_primary in
drm_dp_mst_handle_up_req().

Avoid the above by holding a reference for mst_primary in
drm_dp_mst_handle_up_req() while it's used.

v2: Fix kfreeing the request if getting an mst_primary reference fails.

Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241204132007.3132494-1-imre.deak@intel.com
2024-12-05 16:19:30 +02:00
Imre Deak
3f61185503 drm/dp_mst: Fix down request message timeout handling
If receiving a reply for an MST down request message times out, the
thread receiving the reply in drm_dp_mst_handle_down_rep() could try to
dereference the drm_dp_sideband_msg_tx txmsg request message after the
thread waiting for the reply - calling drm_dp_mst_wait_tx_reply() - has
timed out and freed txmsg, hence leading to a use-after-free in
drm_dp_mst_handle_down_rep().

Prevent the above by holding the drm_dp_mst_topology_mgr::qlock in
drm_dp_mst_handle_down_rep() for the whole duration txmsg is looked up
from the request list and dereferenced.

v2: Fix unlocking mgr->qlock after verify_rx_request_type() fails.

Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203174632.2941402-1-imre.deak@intel.com
2024-12-05 16:19:09 +02:00
Imre Deak
b559b68d27 drm/dp_mst: Simplify error path in drm_dp_mst_handle_down_rep()
Simplify the error return path in drm_dp_mst_handle_down_rep(),
preparing for the next patch.

While at it use reset_msg_rx_state() instead of open-coding it.

Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-4-imre.deak@intel.com
2024-12-05 16:18:58 +02:00
Imre Deak
4d49e77a97 drm/dp_mst: Verify request type in the corresponding down message reply
After receiving the response for an MST down request message, the
response should be accepted/parsed only if the response type matches
that of the request. Ensure this by checking if the request type code
stored both in the request and the reply match, dropping the reply in
case of a mismatch.

This fixes the topology detection for an MST hub, as described in the
Closes link below, where the hub sends an incorrect reply message after
a CLEAR_PAYLOAD_TABLE -> LINK_ADDRESS down request message sequence.

Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12804
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-3-imre.deak@intel.com
2024-12-05 16:18:46 +02:00
Imre Deak
a6fa67d26d drm/dp_mst: Fix resetting msg rx state after topology removal
If the MST topology is removed during the reception of an MST down reply
or MST up request sideband message, the
drm_dp_mst_topology_mgr::up_req_recv/down_rep_recv states could be reset
from one thread via drm_dp_mst_topology_mgr_set_mst(false), racing with
the reading/parsing of the message from another thread via
drm_dp_mst_handle_down_rep() or drm_dp_mst_handle_up_req(). The race is
possible since the reader/parser doesn't hold any lock while accessing
the reception state. This in turn can lead to a memory corruption in the
reader/parser as described by commit bd2fccac61 ("drm/dp_mst: Fix MST
sideband message body length check").

Fix the above by resetting the message reception state if needed before
reading/parsing a message. Another solution would be to hold the
drm_dp_mst_topology_mgr::lock for the whole duration of the message
reception/parsing in drm_dp_mst_handle_down_rep() and
drm_dp_mst_handle_up_req(), however this would require a bigger change.
Since the fix is also needed for stable, opting for the simpler solution
in this patch.

Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org>
Fixes: 1d082618bb ("drm/display/dp_mst: Fix down/up message handling after sink disconnect")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13056
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-2-imre.deak@intel.com
2024-12-05 16:18:07 +02:00
Fernando Fernandez Mancera
73da582a47 x86/cpu/topology: Remove limit of CPUs due to disabled IO/APIC
The rework of possible CPUs management erroneously disabled SMP when the
IO/APIC is disabled either by the 'noapic' command line parameter or during
IO/APIC setup. SMP is possible without IO/APIC.

Remove the ioapic_is_disabled conditions from the relevant possible CPU
management code paths to restore the orgininal behaviour.

Fixes: 7c0edad364 ("x86/cpu/topology: Rework possible CPU management")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241202145905.1482-1-ffmancera@riseup.net
2024-12-05 14:43:32 +01:00
Colin Ian King
20c3b3e5f2 ALSA: hda/realtek: Fix spelling mistake "Firelfy" -> "Firefly"
There is a spelling mistake in a literal string in the alc269_fixup_tbl
quirk table. Fix it.

Fixes: 0d08f0eec9 ("ALSA: hda/realtek: fix micmute LEDs don't work on HP Laptops")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20241205102833.476190-1-colin.i.king@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-12-05 14:40:05 +01:00
Nícolas F. R. A. Prado
ec16a3cdf3
ASoC: mediatek: mt8188-mt6359: Remove hardcoded dmic codec
Remove hardcoded dmic codec from the UL_SRC dai link to avoid requiring
a dmic codec to be present for the driver to probe, as not every
MT8188-based platform might need a dmic codec. The codec can be assigned
to the dai link through the dai-link property in Devicetree on the
platforms where it is needed.

No Devicetree currently relies on it so it is safe to remove without
worrying about backward compatibility.

Fixes: 9f08dcbdde ("ASoC: mediatek: mt8188-mt6359: support new board with nau88255")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20241203-mt8188-6359-unhardcode-dmic-v1-1-346e3e5cbe6d@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-05 13:39:58 +00:00
David Woodhouse
d0ceea662d x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating userspace page tables
The set_p4d() and set_pgd() functions (in 4-level or 5-level page table setups
respectively) assume that the root page table is actually a 8KiB allocation,
with the userspace root immediately after the kernel root page table (so that
the former can enforce NX on on all the subordinate page tables, which are
actually shared).

However, users of the kernel_ident_mapping_init() code do not give it an 8KiB
allocation for its PGD. Both swsusp_arch_resume() and acpi_mp_setup_reset()
allocate only a single 4KiB page. The kexec code on x86_64 currently gets
away with it purely by chance, because it allocates 8KiB for its "control
code page" and then actually uses the first half for the PGD, then copies the
actual trampoline code into the second half only after the identmap code has
finished scribbling over it.

Fix this by defining a _PAGE_NOPTISHADOW bit (which can use the same bit as
_PAGE_SAVED_DIRTY since one is only for the PGD/P4D root and the other is
exclusively for leaf PTEs.). This instructs __pti_set_user_pgtbl() not to
write to the userspace 'shadow' PGD.

Strictly, the _PAGE_NOPTISHADOW bit doesn't need to be written out to the
actual page tables; since __pti_set_user_pgtbl() returns the value to be
written to the kernel page table, it could be filtered out. But there seems
to be no benefit to actually doing so.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/412c90a4df7aef077141d9f68d19cbe5602d6c6d.camel@infradead.org
Cc: stable@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
2024-12-05 13:04:00 +01:00
Purushothama Siddaiah
4c6ac5446d
spi: omap2-mcspi: Fix the IS_ERR() bug for devm_clk_get_optional_enabled()
The devm_clk_get_optional_enabled() function returns error
pointers(PTR_ERR()). So use IS_ERR() to check it.

Verified on K3-J7200 EVM board, without clock node mentioned
in the device tree.

Signed-off-by: Purushothama Siddaiah <psiddaiah@mvista.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Link: https://patch.msgid.link/20241205070426.1861048-1-psiddaiah@mvista.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-05 11:32:26 +00:00
Richard Weinberger
b29bf7119d jffs2: Fix rtime decompressor
The fix for a memory corruption contained a off-by-one error and
caused the compressor to fail in legit cases.

Cc: Kinsey Moore <kinsey.moore@oarcorp.com>
Cc: stable@vger.kernel.org
Fixes: fe051552f5 ("jffs2: Prevent rtime decompress memory corruption")
Signed-off-by: Richard Weinberger <richard@nod.at>
2024-12-05 12:31:40 +01:00
Catalin Marinas
cf3b16dae4 arm64: mte: Fix copy_highpage() warning on hugetlb folios
Commit 25c17c4b55 ("hugetlb: arm64: add mte support") improved the
copy_highpage() function to update the tags in the destination hugetlb
folio. However, when the source folio isn't tagged, the code takes the
non-hugetlb path where try_page_mte_tagging() warns as the destination
is a hugetlb folio:

  WARNING: CPU: 0 PID: 363 at arch/arm64/include/asm/mte.h:58 copy_highpage+0x1d4/0x2d8
  [...]
  pc : copy_highpage+0x1d4/0x2d8
  lr : copy_highpage+0x78/0x2d8
  [...]
  Call trace:
   copy_highpage+0x1d4/0x2d8 (P)
   copy_highpage+0x78/0x2d8 (L)
   copy_user_highpage+0x20/0x48
   copy_user_large_folio+0x1bc/0x268
   hugetlb_wp+0x190/0x860
   hugetlb_fault+0xa28/0xc10
   handle_mm_fault+0x2a0/0x2c0
   do_page_fault+0x12c/0x578
   do_mem_abort+0x4c/0xa8
   el0_da+0x44/0xb0
   el0t_64_sync_handler+0xc4/0x138
   el0t_64_sync+0x198/0x1a0

Change the check for the tagged status of the source folio so that it
does not fall through the non-hugetlb case. In addition, only perform
the copy (for the full folio) if the source page is the folio head and
warn if the destination folio is already tagged, for symmetry with the
non-hugetlb case.

Fixes: 25c17c4b55 ("hugetlb: arm64: add mte support")
Reported-by: Sasha Levin <sashal@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/Z0STR6VLt2MCalnY@sashalap
Link: https://lore.kernel.org/r/20241204175004.906754-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05 11:25:44 +00:00
Catalin Marinas
c0900d15d3 arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs
Linux currently sets the TCR_EL1.AS bit unconditionally during CPU
bring-up. On an 8-bit ASID CPU, this is RES0 and ignored, otherwise
16-bit ASIDs are enabled. However, if running in a VM and the hypervisor
reports 8-bit ASIDs (ID_AA64MMFR0_EL1.ASIDBits == 0) on a 16-bit ASIDs
CPU, Linux uses bits 8 to 63 as a generation number for tracking old
process ASIDs. The bottom 8 bits of this generation end up being written
to TTBR1_EL1 and also used for the ASID-based TLBI operations as the
upper 8 bits of the ASID. Following an ASID roll-over event we can have
threads of the same application with the same 8-bit ASID but different
generation numbers running on separate CPUs. Both TLB caching and the
TLBI operations will end up using different actual 16-bit ASIDs for the
same process.

A similar scenario can happen in a big.LITTLE configuration if the boot
CPU only uses 8-bit ASIDs while secondary CPUs have 16-bit ASIDs.

Ensure that the ASID generation is only tracked by bits 16 and up,
leaving bits 15:8 as 0 if the kernel uses 8-bit ASIDs. Note that
clearing TCR_EL1.AS is not sufficient since the architecture requires
that the top 8 bits of the ASID passed to TLBI instructions are 0 rather
than ignored in such configuration.

Cc: stable@vger.kernel.org
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241203151941.353796-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05 11:24:38 +00:00
Qinxin Xia
c2b46ae022 ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A
HiSilicon HIP09A platforms using the same SMMU PMCG with HIP09
and thus suffers the same erratum. List them in the PMCG platform
information list without introducing a new SMMU PMCG Model.

Update the silicon-errata.rst as well.

Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
Link: https://lore.kernel.org/r/20241205013331.1484017-1-xiaqinxin@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05 11:24:18 +00:00
Shradha Gupta
31f1b55d5d net :mana :Request a V2 response version for MANA_QUERY_GF_STAT
The current requested response version(V1) for MANA_QUERY_GF_STAT query
results in STATISTICS_FLAGS_TX_ERRORS_GDMA_ERROR value being set to
0 always.
In order to get the correct value for this counter we request the response
version to be V2.

Cc: stable@vger.kernel.org
Fixes: e1df5202e8 ("net :mana :Add remaining GDMA stats for MANA to ethtool")
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1733291300-12593-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 12:02:15 +01:00
Eric Dumazet
750e516033 net: avoid potential UAF in default_operstate()
syzbot reported an UAF in default_operstate() [1]

Issue is a race between device and netns dismantles.

After calling __rtnl_unlock() from netdev_run_todo(),
we can not assume the netns of each device is still alive.

Make sure the device is not in NETREG_UNREGISTERED state,
and add an ASSERT_RTNL() before the call to
__dev_get_by_index().

We might move this ASSERT_RTNL() in __dev_get_by_index()
in the future.

[1]

BUG: KASAN: slab-use-after-free in __dev_get_by_index+0x5d/0x110 net/core/dev.c:852
Read of size 8 at addr ffff888043eba1b0 by task syz.0.0/5339

CPU: 0 UID: 0 PID: 5339 Comm: syz.0.0 Not tainted 6.12.0-syzkaller-10296-gaaf20f870da0 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:378 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:489
  kasan_report+0x143/0x180 mm/kasan/report.c:602
  __dev_get_by_index+0x5d/0x110 net/core/dev.c:852
  default_operstate net/core/link_watch.c:51 [inline]
  rfc2863_policy+0x224/0x300 net/core/link_watch.c:67
  linkwatch_do_dev+0x3e/0x170 net/core/link_watch.c:170
  netdev_run_todo+0x461/0x1000 net/core/dev.c:10894
  rtnl_unlock net/core/rtnetlink.c:152 [inline]
  rtnl_net_unlock include/linux/rtnetlink.h:133 [inline]
  rtnl_dellink+0x760/0x8d0 net/core/rtnetlink.c:3520
  rtnetlink_rcv_msg+0x791/0xcf0 net/core/rtnetlink.c:6911
  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2541
  netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
  netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1347
  netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1891
  sock_sendmsg_nosec net/socket.c:711 [inline]
  __sock_sendmsg+0x221/0x270 net/socket.c:726
  ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2583
  ___sys_sendmsg net/socket.c:2637 [inline]
  __sys_sendmsg+0x269/0x350 net/socket.c:2669
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f2a3cb80809
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f2a3d9cd058 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2a3cd45fa0 RCX: 00007f2a3cb80809
RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000008
RBP: 00007f2a3cbf393e R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f2a3cd45fa0 R15: 00007ffd03bc65c8
 </TASK>

Allocated by task 5339:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
  kasan_kmalloc include/linux/kasan.h:260 [inline]
  __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4314
  kmalloc_noprof include/linux/slab.h:901 [inline]
  kmalloc_array_noprof include/linux/slab.h:945 [inline]
  netdev_create_hash net/core/dev.c:11870 [inline]
  netdev_init+0x10c/0x250 net/core/dev.c:11890
  ops_init+0x31e/0x590 net/core/net_namespace.c:138
  setup_net+0x287/0x9e0 net/core/net_namespace.c:362
  copy_net_ns+0x33f/0x570 net/core/net_namespace.c:500
  create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110
  unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228
  ksys_unshare+0x57d/0xa70 kernel/fork.c:3314
  __do_sys_unshare kernel/fork.c:3385 [inline]
  __se_sys_unshare kernel/fork.c:3383 [inline]
  __x64_sys_unshare+0x38/0x40 kernel/fork.c:3383
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 12:
  kasan_save_stack mm/kasan/common.c:47 [inline]
  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
  kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582
  poison_slab_object mm/kasan/common.c:247 [inline]
  __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
  kasan_slab_free include/linux/kasan.h:233 [inline]
  slab_free_hook mm/slub.c:2338 [inline]
  slab_free mm/slub.c:4598 [inline]
  kfree+0x196/0x420 mm/slub.c:4746
  netdev_exit+0x65/0xd0 net/core/dev.c:11992
  ops_exit_list net/core/net_namespace.c:172 [inline]
  cleanup_net+0x802/0xcc0 net/core/net_namespace.c:632
  process_one_work kernel/workqueue.c:3229 [inline]
  process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
  worker_thread+0x870/0xd30 kernel/workqueue.c:3391
  kthread+0x2f0/0x390 kernel/kthread.c:389
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The buggy address belongs to the object at ffff888043eba000
 which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 432 bytes inside of
 freed 2048-byte region [ffff888043eba000, ffff888043eba800)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x43eb8
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 04fff00000000040 ffff88801ac42000 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000080008 00000001f5000000 0000000000000000
head: 04fff00000000040 ffff88801ac42000 dead000000000122 0000000000000000
head: 0000000000000000 0000000000080008 00000001f5000000 0000000000000000
head: 04fff00000000003 ffffea00010fae01 ffffffffffffffff 0000000000000000
head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5339, tgid 5338 (syz.0.0), ts 69674195892, free_ts 69663220888
  set_page_owner include/linux/page_owner.h:32 [inline]
  post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1556
  prep_new_page mm/page_alloc.c:1564 [inline]
  get_page_from_freelist+0x3649/0x3790 mm/page_alloc.c:3474
  __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4751
  alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
  alloc_slab_page+0x6a/0x140 mm/slub.c:2408
  allocate_slab+0x5a/0x2f0 mm/slub.c:2574
  new_slab mm/slub.c:2627 [inline]
  ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3815
  __slab_alloc+0x58/0xa0 mm/slub.c:3905
  __slab_alloc_node mm/slub.c:3980 [inline]
  slab_alloc_node mm/slub.c:4141 [inline]
  __do_kmalloc_node mm/slub.c:4282 [inline]
  __kmalloc_noprof+0x2e6/0x4c0 mm/slub.c:4295
  kmalloc_noprof include/linux/slab.h:905 [inline]
  sk_prot_alloc+0xe0/0x210 net/core/sock.c:2165
  sk_alloc+0x38/0x370 net/core/sock.c:2218
  __netlink_create+0x65/0x260 net/netlink/af_netlink.c:629
  __netlink_kernel_create+0x174/0x6f0 net/netlink/af_netlink.c:2015
  netlink_kernel_create include/linux/netlink.h:62 [inline]
  uevent_net_init+0xed/0x2d0 lib/kobject_uevent.c:783
  ops_init+0x31e/0x590 net/core/net_namespace.c:138
  setup_net+0x287/0x9e0 net/core/net_namespace.c:362
page last free pid 1032 tgid 1032 stack trace:
  reset_page_owner include/linux/page_owner.h:25 [inline]
  free_pages_prepare mm/page_alloc.c:1127 [inline]
  free_unref_page+0xdf9/0x1140 mm/page_alloc.c:2657
  __slab_free+0x31b/0x3d0 mm/slub.c:4509
  qlink_free mm/kasan/quarantine.c:163 [inline]
  qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179
  kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
  __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329
  kasan_slab_alloc include/linux/kasan.h:250 [inline]
  slab_post_alloc_hook mm/slub.c:4104 [inline]
  slab_alloc_node mm/slub.c:4153 [inline]
  kmem_cache_alloc_node_noprof+0x1d9/0x380 mm/slub.c:4205
  __alloc_skb+0x1c3/0x440 net/core/skbuff.c:668
  alloc_skb include/linux/skbuff.h:1323 [inline]
  alloc_skb_with_frags+0xc3/0x820 net/core/skbuff.c:6612
  sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2881
  sock_alloc_send_skb include/net/sock.h:1797 [inline]
  mld_newpack+0x1c3/0xaf0 net/ipv6/mcast.c:1747
  add_grhead net/ipv6/mcast.c:1850 [inline]
  add_grec+0x1492/0x19a0 net/ipv6/mcast.c:1988
  mld_send_initial_cr+0x228/0x4b0 net/ipv6/mcast.c:2234
  ipv6_mc_dad_complete+0x88/0x490 net/ipv6/mcast.c:2245
  addrconf_dad_completed+0x712/0xcd0 net/ipv6/addrconf.c:4342
 addrconf_dad_work+0xdc2/0x16f0
  process_one_work kernel/workqueue.c:3229 [inline]
  process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310

Memory state around the buggy address:
 ffff888043eba080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888043eba100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888043eba180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                     ^
 ffff888043eba200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888043eba280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 8c55facecd ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down")
Reported-by: syzbot+1939f24bdb783e9e43d9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/674f3a18.050a0220.48a03.0041.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241203170933.2449307-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:57:26 +01:00
Paolo Abeni
7b998e073f netfilter pull request 24-12-05
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmdQ7eIACgkQ1w0aZmrP
 KyEk+g//VdVM+7cVY+yqV+1gvST4yzCU7OkqGLhUZPYFsvARtLKqeSS2hygZAQzg
 bEqRuYCYShIh8zSCble9NxWd2u6h+QQWEkvxaWldsOHQ3olDBpzM0vWuqfbRV+GO
 TXtLwXXTzktMIW+itsZm2Dkz7MWhwcllKHuQRc6t5cu82Lm7vLjXRxEYyhcFbg7n
 2waSwhPsRKkNY3dLiIkJQS/4FXp7SQM+kRv0jyeQQ4oC9mmPYYyteVpjphTIPji7
 wpu1GgnmWEyMwNxYeoO7JZ69ESIc6wwUr712WB1FnGhjwQ1D0KdfgYnSxoc81TC4
 BGyP5i4yYHU9DmAKb04zve+LaF1gR6L51agi6sYcIsZQw3Cnr/YIY7w8O+EavwHb
 VDPvDouMtvDIKd4Hdtsna2o8d/UrkuE5/AuijHU40eEEEuMeuqiaby6Dj3ORYFrs
 edWjnkSqcvLGhx0bXzuFAWm/CxYqZsPcH0rWSUfuJu4wLevlrvKF0bkcA3NtcjDc
 bOEwXB1McKnWNBFeOnetf3PvA5Eu2HbYtt87rO1UeeBS+Bw8IO2uTu5X73suSb4f
 TJyWcQk51HMNa230tT2vEjsA5+FhXL/ZUYGu1ckVfghmoYdmWAEx8fuzec2Lqe/V
 hsn3Csl+JOPmqOEHyRvQ7t0UL4j3918hNP21RWyee6qTYK09YSM=
 =3N6k
 -----END PGP SIGNATURE-----

Merge tag 'nf-24-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix esoteric undefined behaviour due to uninitialized stack access
   in ip_vs_protocol_init(), from Jinghao Jia.

2) Fix iptables xt_LED slab-out-of-bounds due to incorrect sanitization
   of the led string identifier, reported by syzbot. Patch from
   Dmitry Antipov.

3) Remove WARN_ON_ONCE reachable from userspace to check for the maximum
   cgroup level, nft_socket cgroup matching is restricted to 255 levels,
   but cgroups allow for INT_MAX levels by default. Reported by syzbot.

4) Fix nft_inner incorrect use of percpu area to store tunnel parser
   context with softirqs, resulting in inconsistent inner header
   offsets that could lead to bogus rule mismatches, reported by syzbot.

5) Grab module reference on ipset core while requesting set type modules,
   otherwise kernel crash is possible by removing ipset core module,
   patch from Phil Sutter.

6) Fix possible double-free in nft_hash garbage collector due to unstable
   walk interator that can provide twice the same element. Use a sequence
   number to skip expired/dead elements that have been already scheduled
   for removal. Based on patch from Laurent Fasnach

netfilter pull request 24-12-05

* tag 'nf-24-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_set_hash: skip duplicated elements pending gc run
  netfilter: ipset: Hold module reference while requesting a module
  netfilter: nft_inner: incorrect percpu area handling under softirq
  netfilter: nft_socket: remove WARN_ON_ONCE on maximum cgroup level
  netfilter: x_tables: fix LED ID check in led_tg_check()
  ipvs: fix UB due to uninitialized stack access in ip_vs_protocol_init()
====================

Link: https://patch.msgid.link/20241205002854.162490-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:49:14 +01:00
Paolo Abeni
0e21a47c83 Merge branch 'vsock-test-fix-wrong-setsockopt-parameters'
Konstantin Shkolnyy says:

====================
vsock/test: fix wrong setsockopt() parameters

Parameters were created using wrong C types, which caused them to be of
wrong size on some architectures, causing problems.

The problem with SO_RCVLOWAT was found on s390 (big endian), while x86-64
didn't show it. After the fix, all tests pass on s390.
Then Stefano Garzarella pointed out that SO_VM_SOCKETS_* calls might have
a similar problem, which turned out to be true, hence, the second patch.

Changes for v8:
- Fix whitespace warnings from "checkpatch.pl --strict"
- Add maintainers to Cc:
Changes for v7:
- Rebase on top of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
- Add the "net" tags to the subjects
Changes for v6:
- rework the patch #3 to avoid creating a new file for new functions,
and exclude vsock_perf from calling the new functions.
- add "Reviewed-by:" to the patch #2.
Changes for v5:
- in the patch #2 replace the introduced uint64_t with unsigned long long
to match documentation
- add a patch #3 that verifies every setsockopt() call.
Changes for v4:
- add "Reviewed-by:" to the first patch, and add a second patch fixing
SO_VM_SOCKETS_* calls, which depends on the first one (hence, it's now
a patch series.)
Changes for v3:
- fix the same problem in vsock_perf and update commit message
Changes for v2:
- add "Fixes:" lines to the commit message
====================

Link: https://patch.msgid.link/20241203150656.287028-1-kshk@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:39:36 +01:00
Konstantin Shkolnyy
86814d8ffd vsock/test: verify socket options after setting them
Replace setsockopt() calls with calls to functions that follow
setsockopt() with getsockopt() and check that the returned value and its
size are the same as have been set. (Except in vsock_perf.)

Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:39:34 +01:00
Konstantin Shkolnyy
3f36ee29e7 vsock/test: fix parameter types in SO_VM_SOCKETS_* calls
Change parameters of SO_VM_SOCKETS_* to unsigned long long as documented
in the vm_sockets.h, because the corresponding kernel code requires them
to be at least 64-bit, no matter what architecture. Otherwise they are
too small on 32-bit machines.

Fixes: 5c338112e4 ("test/vsock: rework message bounds test")
Fixes: 685a21c314 ("test/vsock: add big message test")
Fixes: 542e893fba ("vsock/test: two tests to check credit update logic")
Fixes: 8abbffd27c ("test/vsock: vsock_perf utility")
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:39:33 +01:00
Konstantin Shkolnyy
7ce1c0921a vsock/test: fix failures due to wrong SO_RCVLOWAT parameter
This happens on 64-bit big-endian machines.
SO_RCVLOWAT requires an int parameter. However, instead of int, the test
uses unsigned long in one place and size_t in another. Both are 8 bytes
long on 64-bit machines. The kernel, having received the 8 bytes, doesn't
test for the exact size of the parameter, it only cares that it's >=
sizeof(int), and casts the 4 lower-addressed bytes to an int, which, on
a big-endian machine, contains 0. 0 doesn't trigger an error, SO_RCVLOWAT
returns with success and the socket stays with the default SO_RCVLOWAT = 1,
which results in vsock_test failures, while vsock_perf doesn't even notice
that it's failed to change it.

Fixes: b1346338fb ("vsock_test: POLLIN + SO_RCVLOWAT test")
Fixes: 542e893fba ("vsock/test: two tests to check credit update logic")
Fixes: 8abbffd27c ("test/vsock: vsock_perf utility")
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05 11:39:33 +01:00
Yafang shao
d9381508ea audit: workaround a GCC bug triggered by task comm changes
A build failure has been reported with the following details:

   In file included from include/linux/string.h:390,
                    from include/linux/bitmap.h:13,
                    from include/linux/cpumask.h:12,
                    from include/linux/smp.h:13,
                    from include/linux/lockdep.h:14,
                    from include/linux/spinlock.h:63,
                    from include/linux/wait.h:9,
                    from include/linux/wait_bit.h:8,
                    from include/linux/fs.h:6,
                    from kernel/auditsc.c:37:
   In function 'sized_strscpy',
       inlined from '__audit_ptrace' at kernel/auditsc.c:2732:2:
>> include/linux/fortify-string.h:293:17:
   error: call to '__write_overflow' declared with attribute error:
   detected write beyond size of object (1st parameter)
     293 |                 __write_overflow();
         |                 ^~~~~~~~~~~~~~~~~~
   In function 'sized_strscpy',
       inlined from 'audit_signal_info_syscall' at kernel/auditsc.c:2759:3:
>> include/linux/fortify-string.h:293:17:
   error: call to '__write_overflow' declared with attribute error:
   detected write beyond size of object (1st parameter)
     293 |                 __write_overflow();
         |                 ^~~~~~~~~~~~~~~~~~

The issue appears to be a GCC bug, though the root cause remains
unclear at this time. For now, let's implement a workaround.

A bug report has also been filed with GCC [0].

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912 [0]

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410171420.1V00ICVG-lkp@intel.com/
Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Closes: https://lore.kernel.org/all/20241128182435.57a1ea6f@gandalf.local.home/
Reported-by: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>
Closes: https://lore.kernel.org/all/CY8PR11MB71348E568DBDA576F17DAFF389362@CY8PR11MB7134.namprd11.prod.outlook.com/
Originally-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/linux-hardening/202410171059.C2C395030@keescook/
Signed-off-by: Yafang shao <laoar.shao@gmail.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Yafang Shao <laoar.shao@gmail.com>
[PM: subject tweak, description line wrapping]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 22:57:46 -05:00
Jakub Kicinski
4615855ea8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-12-03 (ice, idpf, ixgbe, ixgbevf, igb)

This series contains updates to ice, idpf, ixgbe, ixgbevf, and igb
drivers.

For ice:
Arkadiusz corrects search for determining whether PHY clock recovery is
supported on the device.

Przemyslaw corrects mask used for PHY timestamps on ETH56G devices.

Wojciech adds missing virtchnl ops which caused NULL pointer
dereference.

Marcin fixes VLAN filter settings for uplink VSI in switchdev mode.

For idpf:
Josh restores setting of completion tag for empty buffers.

For ixgbevf:
Jake removes incorrect initialization/support of IPSEC for mailbox
version 1.5.

For ixgbe:
Jake rewords and downgrades misleading message when negotiation
of VF mailbox version is not supported.

Tore Amundsen corrects value for BASE-BX10 capability.

For igb:
Yuan Can adds proper teardown on failed pci_register_driver() call.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igb: Fix potential invalid memory access in igb_init_module()
  ixgbe: Correct BASE-BX10 compliance code
  ixgbe: downgrade logging of unsupported VF API version to debug
  ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5
  idpf: set completion tag for "empty" bufs associated with a packet
  ice: Fix VLAN pruning in switchdev mode
  ice: Fix NULL pointer dereference in switchdev
  ice: fix PHY timestamp extraction for ETH56G
  ice: fix PHY Clock Recovery availability check
====================

Link: https://patch.msgid.link/20241203215521.1646668-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:46:49 -08:00
Jakub Kicinski
18317290b2 Merge branch 'mlx5-misc-fixes-2024-12-03'
Tariq Toukan says:

====================
mlx5 misc fixes 2024-12-03

This patchset provides misc bug fixes from the team to the mlx5 core and
Eth drivers.
====================

Link: https://patch.msgid.link/20241203204920.232744-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:49 -08:00
Jianbo Liu
5085f861b4 net/mlx5e: Remove workaround to avoid syndrome for internal port
Previously a workaround was added to avoid syndrome 0xcdb051. It is
triggered when offload a rule with tunnel encapsulation, and
forwarding to another table, but not matching on the internal port in
firmware steering mode. The original workaround skips internal tunnel
port logic, which is not correct as not all cases are considered. As
an example, if vlan is configured on the uplink port, traffic can't
pass because vlan header is not added with this workaround. Besides,
there is no such issue for software steering. So, this patch removes
that, and returns error directly if trying to offload such rule for
firmware steering.

Fixes: 06b4eac9c4 ("net/mlx5e: Don't offload internal port if filter device is out device")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Tested-by: Frode Nordahl <frode.nordahl@canonical.com>
Reviewed-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:46 -08:00
Tariq Toukan
31f114c3d1 net/mlx5e: SD, Use correct mdev to build channel param
In a multi-PF netdev, each traffic channel creates its own resources
against a specific PF.
In the cited commit, where this support was added, the channel_param
logic was mistakenly kept unchanged, so it always used the primary PF
which is found at priv->mdev.
In this patch we fix this by moving the logic to be per-channel, and
passing the correct mdev instance.

This bug happened to be usually harmless, as the resulting cparam
structures would be the same for all channels, due to identical FW logic
and decisions.
However, in some use cases, like fwreset, this gets broken.

This could lead to different symptoms. Example:
Error cqe on cqn 0x428, ci 0x0, qn 0x10a9, opcode 0xe, syndrome 0x4,
vendor syndrome 0x32

Fixes: e4f9686bde ("net/mlx5e: Let channels be SD-aware")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:46 -08:00
Patrisious Haddad
d04c81a3e3 net/mlx5: E-Switch, Fix switching to switchdev mode in MPV
Fix the mentioned commit change for MPV mode, since in MPV mode the IB
device is shared between different core devices, so under this change
when moving both devices simultaneously to switchdev mode the IB device
removal and re-addition can race with itself causing unexpected behavior.

In such case do rescan_drivers() only once in order to add the ethernet
representor auxiliary device, and skip adding and removing IB devices.

Fixes: ab85ebf437 ("net/mlx5: E-switch, refactor eswitch mode change")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:46 -08:00
Patrisious Haddad
5f9b2bf019 net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled
In case that IB device is already disabled when moving to switchdev mode,
which can happen when working with LAG, need to do rescan_drivers()
before leaving in order to add ethernet representor auxiliary device.

Fixes: ab85ebf437 ("net/mlx5: E-switch, refactor eswitch mode change")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:46 -08:00
Cosmin Ratiu
10e0f0c018 net/mlx5: HWS: Properly set bwc queue locks lock classes
The mentioned "Fixes" patch forgot to do that.

Fixes: 9addffa343 ("net/mlx5: HWS, use lock classes for bwc locks")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:46 -08:00
Cosmin Ratiu
530b69a269 net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout
It allocates a match template, which creates a compressed definer fc
struct, but that is not deallocated.

This commit fixes that.

Fixes: 74a778b4a6 ("net/mlx5: HWS, added definers handling")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241203204920.232744-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:43:45 -08:00
Jakub Kicinski
5f4d0356f0 Merge branch 'bnxt_en-support-header-page-pool-in-queue-api'
David Wei says:

====================
bnxt_en: support header page pool in queue API

Commit 7ed816be35 ("eth: bnxt: use page pool for head frags") added a
separate page pool for header frags. Now, frags are allocated from this
header page pool e.g. rxr->tpa_info.data.

The queue API did not properly handle rxr->tpa_info and so using the
queue API to i.e. reset any queues will result in pages being returned
to the incorrect page pool, causing inflight != 0 warnings.

Fix this bug by properly allocating/freeing tpa_info and copying/freeing
head_pool in the queue API implementation.

The 1st patch is a prep patch that refactors helpers out to be used by
the implementation patch later.

The 2nd patch is a drive-by refactor. Happy to take it out and re-send
to net-next if there are any objections.

The 3rd patch is the implementation patch that will properly alloc/free
rxr->tpa_info.
====================

Link: https://patch.msgid.link/20241204041022.56512-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:23:37 -08:00
David Wei
bd649c5cc9 bnxt_en: handle tpa_info in queue API implementation
Commit 7ed816be35 ("eth: bnxt: use page pool for head frags") added a
page pool for header frags, which may be distinct from the existing pool
for the aggregation ring. Prior to this change, frags used in the TPA
ring rx_tpa were allocated from system memory e.g. napi_alloc_frag()
meaning their lifetimes were not associated with a page pool. They can
be returned at any time and so the queue API did not alloc or free
rx_tpa.

But now frags come from a separate head_pool which may be different to
page_pool. Without allocating and freeing rx_tpa, frags allocated from
the old head_pool may be returned to a different new head_pool which
causes a mismatch between the pp hold/release count.

Fix this problem by properly freeing and allocating rx_tpa in the queue
API implementation.

Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241204041022.56512-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:23:35 -08:00
David Wei
bf1782d70d bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap()
Refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap() for
allocating rx_agg_bmap.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241204041022.56512-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:23:35 -08:00
David Wei
5883a3e0ba bnxt_en: refactor tpa_info alloc/free into helpers
Refactor bnxt_rx_ring_info->tpa_info operations into helpers that work
on a single tpa_info in prep for queue API using them.

There are 2 pairs of operations:

* bnxt_alloc_one_tpa_info()
* bnxt_free_one_tpa_info()

These alloc/free the tpa_info array itself.

* bnxt_alloc_one_tpa_info_data()
* bnxt_free_one_tpa_info_data()

These alloc/free the frags stored in tpa_info array.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241204041022.56512-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:23:35 -08:00
Eric Dumazet
8588c99c7d geneve: do not assume mac header is set in geneve_xmit_skb()
We should not assume mac header is set in output path.

Use skb_eth_hdr() instead of eth_hdr() to fix the issue.

sysbot reported the following :

 WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 skb_mac_header include/linux/skbuff.h:3052 [inline]
 WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 eth_hdr include/linux/if_ether.h:24 [inline]
 WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit_skb drivers/net/geneve.c:898 [inline]
 WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039
Modules linked in:
CPU: 0 UID: 0 PID: 11635 Comm: syz.4.1423 Not tainted 6.12.0-syzkaller-10296-gaaf20f870da0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
 RIP: 0010:skb_mac_header include/linux/skbuff.h:3052 [inline]
 RIP: 0010:eth_hdr include/linux/if_ether.h:24 [inline]
 RIP: 0010:geneve_xmit_skb drivers/net/geneve.c:898 [inline]
 RIP: 0010:geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039
Code: 21 c6 02 e9 35 d4 ff ff e8 a5 48 4c fb 90 0f 0b 90 e9 fd f5 ff ff e8 97 48 4c fb 90 0f 0b 90 e9 d8 f5 ff ff e8 89 48 4c fb 90 <0f> 0b 90 e9 41 e4 ff ff e8 7b 48 4c fb 90 0f 0b 90 e9 cd e7 ff ff
RSP: 0018:ffffc90003b2f870 EFLAGS: 00010283
RAX: 000000000000037a RBX: 000000000000ffff RCX: ffffc9000dc3d000
RDX: 0000000000080000 RSI: ffffffff86428417 RDI: 0000000000000003
RBP: ffffc90003b2f9f0 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000002 R12: ffff88806603c000
R13: 0000000000000000 R14: ffff8880685b2780 R15: 0000000000000e23
FS:  00007fdc2deed6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30a1dff8 CR3: 0000000056b8c000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
  __netdev_start_xmit include/linux/netdevice.h:5002 [inline]
  netdev_start_xmit include/linux/netdevice.h:5011 [inline]
  __dev_direct_xmit+0x58a/0x720 net/core/dev.c:4490
  dev_direct_xmit include/linux/netdevice.h:3181 [inline]
  packet_xmit+0x1e4/0x360 net/packet/af_packet.c:285
  packet_snd net/packet/af_packet.c:3146 [inline]
  packet_sendmsg+0x2700/0x5660 net/packet/af_packet.c:3178
  sock_sendmsg_nosec net/socket.c:711 [inline]
  __sock_sendmsg net/socket.c:726 [inline]
  __sys_sendto+0x488/0x4f0 net/socket.c:2197
  __do_sys_sendto net/socket.c:2204 [inline]
  __se_sys_sendto net/socket.c:2200 [inline]
  __x64_sys_sendto+0xe0/0x1c0 net/socket.c:2200
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: a025fb5f49 ("geneve: Allow configuration of DF behaviour")
Reported-by: syzbot+3ec5271486d7cb2d242a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/674f4b72.050a0220.17bd51.004a.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://patch.msgid.link/20241203182122.2725517-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:21:47 -08:00
Ido Schimmel
217bbf156f mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4
The driver is currently using an ACL key block that is not supported by
Spectrum-4. This works because the driver is only using a single field
from this key block which is located in the same offset in the
equivalent Spectrum-4 key block.

The issue was discovered when the firmware started rejecting the use of
the unsupported key block. The change has been reverted to avoid
breaking users that only update their firmware.

Nonetheless, fix the issue by using the correct key block.

Fixes: 07ff135958 ("mlxsw: Introduce flex key elements for Spectrum-4")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/35e72c97bdd3bc414fb8e4d747e5fb5d26c29658.1733237440.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:18:35 -08:00
Kory Maincent
910c4788d6 ethtool: Fix wrong mod state in case of verbose and no_mask bitset
A bitset without mask in a _SET request means we want exactly the bits in
the bitset to be set. This works correctly for compact format but when
verbose format is parsed, ethnl_update_bitset32_verbose() only sets the
bits present in the request bitset but does not clear the rest. The commit
6699170376 ("ethtool: fix application of verbose no_mask bitset") fixes
this issue by clearing the whole target bitmap before we start iterating.
The solution proposed brought an issue with the behavior of the mod
variable. As the bitset is always cleared the old value will always
differ to the new value.

Fix it by adding a new function to compare bitmaps and a temporary variable
which save the state of the old bitmap.

Fixes: 6699170376 ("ethtool: fix application of verbose no_mask bitset")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20241202153358.1142095-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 18:54:43 -08:00
Paolo Abeni
50b9420444 ipmr: tune the ipmr_can_free_table() checks.
Eric reported a syzkaller-triggered splat caused by recent ipmr changes:

WARNING: CPU: 2 PID: 6041 at net/ipv6/ip6mr.c:419
ip6mr_free_table+0xbd/0x120 net/ipv6/ip6mr.c:419
Modules linked in:
CPU: 2 UID: 0 PID: 6041 Comm: syz-executor183 Not tainted
6.12.0-syzkaller-10681-g65ae975e97d5 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:ip6mr_free_table+0xbd/0x120 net/ipv6/ip6mr.c:419
Code: 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c
02 00 75 58 49 83 bc 24 c0 0e 00 00 00 74 09 e8 44 ef a9 f7 90 <0f> 0b
90 e8 3b ef a9 f7 48 8d 7b 38 e8 12 a3 96 f7 48 89 df be 0f
RSP: 0018:ffffc90004267bd8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88803c710000 RCX: ffffffff89e4d844
RDX: ffff88803c52c880 RSI: ffffffff89e4d87c RDI: ffff88803c578ec0
RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88803c578000
R13: ffff88803c710000 R14: ffff88803c710008 R15: dead000000000100
FS: 00007f7a855ee6c0(0000) GS:ffff88806a800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a85689938 CR3: 000000003c492000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ip6mr_rules_exit+0x176/0x2d0 net/ipv6/ip6mr.c:283
ip6mr_net_exit_batch+0x53/0xa0 net/ipv6/ip6mr.c:1388
ops_exit_list+0x128/0x180 net/core/net_namespace.c:177
setup_net+0x4fe/0x860 net/core/net_namespace.c:394
copy_net_ns+0x2b4/0x6b0 net/core/net_namespace.c:500
create_new_namespaces+0x3ea/0xad0 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
ksys_unshare+0x45d/0xa40 kernel/fork.c:3334
__do_sys_unshare kernel/fork.c:3405 [inline]
__se_sys_unshare kernel/fork.c:3403 [inline]
__x64_sys_unshare+0x31/0x40 kernel/fork.c:3403
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7a856332d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7a855ee238 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f7a856bd308 RCX: 00007f7a856332d9
RDX: 00007f7a8560f8c6 RSI: 0000000000000000 RDI: 0000000062040200
RBP: 00007f7a856bd300 R08: 00007fff932160a7 R09: 00007f7a855ee6c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7a856bd30c
R13: 0000000000000000 R14: 00007fff93215fc0 R15: 00007fff932160a8
</TASK>

The root cause is a network namespace creation failing after successful
initialization of the ipmr subsystem. Such a case is not currently
matched by the ipmr_can_free_table() helper.

New namespaces are zeroed on allocation and inserted into net ns list
only after successful creation; when deleting an ipmr table, the list
next pointer can be NULL only on netns initialization failure.

Update the ipmr_can_free_table() checks leveraging such condition.

Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+6e8cb445d4b43d006e0c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6e8cb445d4b43d006e0c
Fixes: 11b6e701bc ("ipmr: add debug check for mr table cleanup")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/8bde975e21bbca9d9c27e36209b2dd4f1d7a3f00.1733212078.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 18:49:16 -08:00
Norbert Szetei
06a025448b ksmbd: align aux_payload_buf to avoid OOB reads in cryptographic operations
The aux_payload_buf allocation in SMB2 read is performed without ensuring
alignment, which could result in out-of-bounds (OOB) reads during
cryptographic operations such as crypto_xor or ghash. This patch aligns
the allocation of aux_payload_buf to prevent these issues.
(Note that to add this patch to stable would require modifications due
to recent patch "ksmbd: use __GFP_RETRY_MAYFAIL")

Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04 19:45:28 -06:00
Ralph Boehme
8cb0bc5436 fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse points
Spares an extra revalidation request

Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04 17:39:51 -06:00
Ralph Boehme
6a832bc8bb fs/smb/client: Implement new SMB3 POSIX type
Fixes special files against current Samba.

On the Samba server:

insgesamt 20
131958 brw-r--r--  1 root  root  0, 0 15. Nov 12:04 blockdev
131965 crw-r--r--  1 root  root  1, 1 15. Nov 12:04 chardev
131966 prw-r--r--  1 samba samba    0 15. Nov 12:05 fifo
131953 -rw-rwxrw-+ 2 samba samba    4 18. Nov 11:37 file
131953 -rw-rwxrw-+ 2 samba samba    4 18. Nov 11:37 hardlink
131957 lrwxrwxrwx  1 samba samba    4 15. Nov 12:03 symlink -> file
131954 -rwxrwxr-x+ 1 samba samba    0 18. Nov 15:28 symlinkoversmb

Before:

ls: cannot access '/mnt/smb3unix/posix/blockdev': No data available
ls: cannot access '/mnt/smb3unix/posix/chardev': No data available
ls: cannot access '/mnt/smb3unix/posix/symlinkoversmb': No data available
ls: cannot access '/mnt/smb3unix/posix/fifo': No data available
ls: cannot access '/mnt/smb3unix/posix/symlink': No data available
total 16
     ? -????????? ? ?    ?     ?            ? blockdev
     ? -????????? ? ?    ?     ?            ? chardev
     ? -????????? ? ?    ?     ?            ? fifo
131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 file
131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 hardlink
     ? -????????? ? ?    ?     ?            ? symlink
     ? -????????? ? ?    ?     ?            ? symlinkoversmb

After:

insgesamt 21
131958 brw-r--r-- 1 root root  0, 0 15. Nov 12:04 blockdev
131965 crw-r--r-- 1 root root  1, 1 15. Nov 12:04 chardev
131966 prw-r--r-- 1 root samba    0 15. Nov 12:05 fifo
131953 -rw-rwxrw- 2 root samba    4 18. Nov 11:37 file
131953 -rw-rwxrw- 2 root samba    4 18. Nov 11:37 hardlink
131957 lrwxrwxrwx 1 root samba    4 15. Nov 12:03 symlink -> file
131954 lrwxrwxr-x 1 root samba   23 18. Nov 15:28 symlinkoversmb -> mnt/smb3unix/posix/file

Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04 17:39:48 -06:00
Pablo Neira Ayuso
7ffc748115 netfilter: nft_set_hash: skip duplicated elements pending gc run
rhashtable does not provide stable walk, duplicated elements are
possible in case of resizing. I considered that checking for errors when
calling rhashtable_walk_next() was sufficient to detect the resizing.
However, rhashtable_walk_next() returns -EAGAIN only at the end of the
iteration, which is too late, because a gc work containing duplicated
elements could have been already scheduled for removal to the worker.

Add a u32 gc worker sequence number per set, bump it on every workqueue
run. Annotate gc worker sequence number on the expired element. Use it
to skip those already seen in this gc workqueue run.

Note that this new field is never reset in case gc transaction fails, so
next gc worker run on the expired element overrides it. Wraparound of gc
worker sequence number should not be an issue with stale gc worker
sequence number in the element, that would just postpone the element
removal in one gc run.

Note that it is not possible to use flags to annotate that element is
pending gc run to detect duplicates, given that gc transaction can be
invalidated in case of update from the control plane, therefore, not
allowing to clear such flag.

On x86_64, pahole reports no changes in the size of nft_rhash_elem.

Fixes: f6c383b8c3 ("netfilter: nf_tables: adapt set backend to use GC transaction API")
Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch>
Tested-by: Laurent Fasnacht <laurent.fasnacht@proton.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-12-04 21:37:41 +01:00
Len Brown
c9a4b55431 x86/cpu: Add Lunar Lake to list of CPUs with a broken MONITOR implementation
Under some conditions, MONITOR wakeups on Lunar Lake processors
can be lost, resulting in significant user-visible delays.

Add Lunar Lake to X86_BUG_MONITOR so that wake_up_idle_cpu()
always sends an IPI, avoiding this potential delay.

Reported originally here:

	https://bugzilla.kernel.org/show_bug.cgi?id=219364

[ dhansen: tweak subject ]

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/a4aa8842a3c3bfdb7fe9807710eef159cbf0e705.1731463305.git.len.brown%40intel.com
2024-12-04 12:30:14 -08:00