Add a new ethtool -W dump flag (2) to include driver coredump segments.
This patch adds the host backing store context memory pages used by the
chip and FW to store various states to the coredump. The pages for
each context memory type is dumped into a separate coredump segment.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Shruti Parab <shruti.parab@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-11-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pass the component ID and segment ID to this function to create
the coredump segment header. This will be needed in the next
patches to create more segments for the coredump.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-10-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Host context memory is used by the newer chips to store context
information for various L2 and RoCE states and FW logs. This
information will be useful for debugging. This patch adds the
functions to copy all pages of a context memory type to a contiguous
buffer. The next patches will include the context memory dump
during ethtool -w coredump.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Co-developed-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-9-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If FW supports appending new FW logs to an offset in the context
memory after FW reset, then do not free this type of context memory
during reset. The driver will provide the initial offset to the FW
when configuring this type of context memory. This way, we don't lose
the older FW logs after reset.
Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-8-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The FW trace memory pages will be added to the ethtool -w coredump
in later patches. In addition to the raw data, the driver has to
add a header to provide the head and tail information on each FW
trace log segment when creating the coredump. The FW sends an async
message to the driver after DMAing a chunk of logs to the context
memory to indicate the last offset containing the tail of the logs.
The driver needs to keep track of that.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allocate the new FW trace log backing store context memory types
if they are supported by the FW. FW debug logs are DMA'ed to the host
backing store memory when the on-chip buffers are full. If host
memory cannot be allocated for these memory types, the driver
will not abort.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If 'force' is false, it will keep the memory pages and all data
structures for the context memory type if the memory is valid.
This patch always passes true for the 'force' parameter so there is
no change in behavior. Later patches will adjust the 'force' parameter
for the FW log context memory types so that the logs will not be reset
after FW reset.
Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new function bnxt_free_one_ctx_mem() to free one context
memory type. bnxt_free_ctx_mem() now calls the new function in
the loop to free each context memory type. There is no change in
behavior. Later patches will further make use of the new function.
Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new bit to struct bnxt_ctx_mem_type to indicate that host
memory has been successfully allocated for this context memory type.
In the next patches, we'll be adding some additional context memory
types for FW debugging/logging. If memory cannot be allocated for
any of these new types, we will not abort and the cleared mem_valid
bit will indicate to skip configuring the memory type.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-of-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The major change is the new firmware command to flush the FW debug
logs to the host backing store context memory buffers.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add global clock controllers for QCS8300, and IPQ5424.
Add camera, display and video clock controllers for SA8775P.
Add global, display, gpu, tcsr, and rpmh clock controllers for SAR2130P.
Add global, camera, display, gpu, and video clock controllers for
SM8475.
Support for IPQ9574 is added to the Alpha PLL clock driver, and the
checks for already configured PLL at boot are cleaned up.
QCS404 GPLL3 initial rate is corrected.
A new ops for shared rcg2 floor_ops is introduced, for dealing with
shared SDCC clocks.
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmc4ykgVHGFuZGVyc3Nv
bkBrZXJuZWwub3JnAAoJEAsfOT8Nma3Fu08QAM7Hgh75SqfeVVFZYz7parQmG29t
xQtEtbNOVvcRjxiZK94/QnZwcEyCi9OJbikV7o7Fo+GBYI09dSCnoZ9FDyeJnXGg
6beYrvna3wIENYbrKEpJW4tBBWC6WI5Rxc6GU6SHQIx1kAXKxzAlTyRnCM/UlBBD
0qg7Pm+SYif+gSoNhNN1Sx4PJGGffNzZnFX1Ft13AN+t3scIZKPV7xxWFE/qUoI4
SmixfvDdURPPsiG7P6MS9rDg81wnwgqB/iwYFtytCVBkLc6tYyWCKmRtpv4iXAUc
U8YO3UWXPyvpgFlGEF16wZ4/WA2dtgfrunk/v0yyxmky5e5grBRJrqe7SZ4sDUUe
a9cSTnlou3t0aK1LS0e7xW2HOMUxwd4SqlnijDFBPxSZZ4gK5Oq1Sx025gu7kIyR
GX9bqULYGlvJgHjtpNjXX0IhVhx9sH4NWLJqr27wYwEGGbnx1JkoTEWZaRbpYB2d
hhVQ4uO4ZRpTEf3p0+fN8poE8nH1sHmhi829ic3wGyFitIYp94KDNTxQ11lkcUxM
BxXhoNTdh9E0cuWDn0Ittdlfvp7QZlRhxaUL0i0ocrSSkQjKIK2/KTnm+1sBJiJs
7DAImseFUUNwLcV5RlVoAFT0nOUB2W+lZSIVRziDI/5FLbCvxjFYtUhHusmfpDnk
nlE/Yh43Dw7ldnSc
=ULNj
-----END PGP SIGNATURE-----
Merge tag 'qcom-clk-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into clk-qcom
Pull Qualcomm clk driver updates from Bjorn Andersson:
- Global clock controllers for Qualcomm QCS8300 and IPQ5424 SoCs
- Camera, display and video clock controllers for Qualcomm SA8775P SoCs
- Global, display, GPU, TCSR, and RPMh clock controllers for Qualcomm SAR2130P
- Global, camera, display, GPU, and video clock controllers for
Qualcomm SM8475 SoCs
- Support for Qualcomm IPQ9574 in the Alpha PLL clock driver
- Cleanup checks for already configured PLLs at boot in the Qualcomm
Alpha PLL driver
- Fix the initial rate for Qualcomm QCS404 GPLL3
- Add shared rcg2 floor clk_ops for shared SDCC clks on Qualcomm SoCs
* tag 'qcom-clk-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (43 commits)
clk: qcom: remove unused data from gcc-ipq5424.c
clk: qcom: Add support for Global Clock Controller on QCS8300
dt-bindings: clock: qcom: Add GCC clocks for QCS8300
clk: qcom: add Global Clock controller (GCC) driver for IPQ5424 SoC
clk: qcom: clk-alpha-pll: Add NSS HUAYRA ALPHA PLL support for ipq9574
dt-bindings: clock: Add Qualcomm IPQ5424 GCC binding
clk: qcom: add SAR2130P GPU Clock Controller support
clk: qcom: dispcc-sm8550: enable support for SAR2130P
clk: qcom: tcsrcc-sm8550: add SAR2130P support
clk: qcom: add support for GCC on SAR2130P
clk: qcom: rpmh: add support for SAR2130P
clk: qcom: rcg2: add clk_rcg2_shared_floor_ops
dt-bindings: clk: qcom,sm8450-gpucc: add SAR2130P compatibles
dt-bindings: clock: qcom,sm8550-dispcc: Add SAR2130P compatible
dt-bindings: clock: qcom,sm8550-tcsr: Add SAR2130P compatible
dt-bindings: clock: qcom: document SAR2130P Global Clock Controller
dt-bindings: clock: qcom,rpmhcc: Add SAR2130P compatible
clk: qcom: Make GCC_6125 depend on QCOM_GDSC
dt-bindings: clock: qcom: gcc-ipq9574: remove q6 bring up clock macros
dt-bindings: clock: qcom: gcc-ipq5332: remove q6 bring up clock macros
...
Jiayuan Chen says:
====================
bpf: fix recursive lock and add test
1. fix recursive lock when ebpf prog return SK_PASS.
2. add selftest to reproduce recursive lock.
Note that the test code can reproduce the 'dead-lock' and if just
the selftest merged without first patch, the test case will
definitely fail, because the issue of deadlock is inevitable.
v1: https://lore.kernel.org/55fc6114-7e64-4b65-86d2-92cfd1e9e92f@linux.dev/
====================
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20241118030910.36230-1-mrpre@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new tests in sockmap_basic.c to test SK_PASS for sockmap
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20241118030910.36230-3-mrpre@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the stream_verdict program returns SK_PASS, it places the received skb
into its own receive queue, but a recursive lock eventually occurs, leading
to an operating system deadlock. This issue has been present since v6.9.
'''
sk_psock_strp_data_ready
write_lock_bh(&sk->sk_callback_lock)
strp_data_ready
strp_read_sock
read_sock -> tcp_read_sock
strp_recv
cb.rcv_msg -> sk_psock_strp_read
# now stream_verdict return SK_PASS without peer sock assign
__SK_PASS = sk_psock_map_verd(SK_PASS, NULL)
sk_psock_verdict_apply
sk_psock_skb_ingress_self
sk_psock_skb_ingress_enqueue
sk_psock_data_ready
read_lock_bh(&sk->sk_callback_lock) <= dead lock
'''
This topic has been discussed before, but it has not been fixed.
Previous discussion:
https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch
Fixes: 6648e613226e ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20241118030910.36230-2-mrpre@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason A. Donenfeld says:
====================
wireguard updates and fixes for 6.13
This tiny series (+3/-2) fixes one bug and has three small improvements.
1) Fix running the netns.sh test suite on systems that haven't yet
inserted the nf_conntrack module.
2) Remove a stray useless function call in a selftest.
3) There's no need to zero out the netdev private data in recent
kernels.
4) Set the TSO max size to be GSO_MAX_SIZE, so that we aggregate larger
packets. Daniel reports seeing a 15% improvement in a simple load and
suggested the speedups would be even better in more complex loads.
====================
Link: https://patch.msgid.link/20241117212030.629159-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Advertise GSO_MAX_SIZE as TSO max size in order support BIG TCP for wireguard.
This helps to improve wireguard performance a bit when enabled as it allows
wireguard to aggregate larger skbs in wg_packet_consume_data_done() via
napi_gro_receive(), but also allows the stack to build larger skbs on xmit
where the driver then segments them before encryption inside wg_xmit().
We've seen a 15% improvement in TCP stream performance.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20241117212030.629159-5-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some distros may not load nf_conntrack by default, which will cause
subsequent nf_conntrack sets to fail. Load this module if it is not
already loaded.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
[ Jason: add [[ -e ... ]] check so this works in the qemu harness. ]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20241117212030.629159-4-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit fixes a useless call issue detected by Coverity (CID
1508092). The call to horrible_allowedips_lookup_v4 is unnecessary as
its return value is never checked.
Signed-off-by: Dheeraj Reddy Jonnalagadda <dheeraj.linuxdev@gmail.com>
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20241117212030.629159-3-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The memory for netdev_priv is allocated using kvzalloc in
alloc_netdev_mqs before rtnl_link_ops->setup is called so there is no
need to zero it again in wg_setup.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20241117212030.629159-2-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Enhance the vpa_pmu driver with a feature to observe context switch
latency event for both per-task (tid) and per-pid (pid) option.
Couple of new helper functions are added to hide the abstraction of
reading the context switch latency counter from kvm_vcpu_arch struct
and these helper functions are defined in the "kvm/book3s_hv.c".
"PERF_ATTACH_TASK" flag is used to decide whether to read the counter
values from lppaca or kvm_vcpu_arch struct.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241118114114.208964-4-kjain@linux.ibm.com
Commit e1f288d2f9c69 ("KVM: PPC: Book3S HV nestedv2: Add support
for reading VPA counters for pseries guests") introduced support for new
Virtual Process Area(VPA) based software counters. These counters are
useful when observing context switch latency of L1 <-> L2. It also
added access to counters in lppaca, which is good enough to understand
latency details per-cpu level. But to extend and aggregate
per-process level(qemu) or per-pid/tid level(vcpu), these
counters also needs to be added as part of kvm_vcpu_arch struct.
Additional code added to update these new kvm_vcpu_arch variables
in do_trace_nested_cs_time function.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241118114114.208964-3-kjain@linux.ibm.com
Details are added for the vpa_pmu event and format
attributes in the ABI documentation.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241118114114.208964-2-kjain@linux.ibm.com
To support performance measurement for KVM on PowerVM(KoP)
feature, PowerVM hypervisor has added couple of new software
counters in Virtual Process Area(VPA) of the partition.
Commit e1f288d2f9c69 ("KVM: PPC: Book3S HV nestedv2: Add
support for reading VPA counters for pseries guests")
have updated the paca fields with corresponding changes.
Proposed perf interface is to expose these new software
counters for monitoring of context switch latencies and
runtime aggregate. Perf interface driver is called
"vpa_pmu" and it has dependency on KVM and perf, hence
added new config called "VPA_PMU" which depends on
"CONFIG_KVM_BOOK3S_64_HV" and "CONFIG_HV_PERF_CTRS".
Since, kvm and kvm_host are currently compiled as built-in
modules, this perf interface takes the same path and
registered as a module.
vpa_pmu perf interface needs access to some of the kvm
functions and structures like kvmhv_get_l2_counters_status(),
hence kvm_book3s_64.h and kvm_ppc.h are included.
Below are the events added to monitor KoP:
vpa_pmu/l1_to_l2_lat/
vpa_pmu/l2_to_l1_lat/
vpa_pmu/l2_runtime_agg/
and vpa_pmu driver supports only per-cpu monitoring with this patch.
Example usage:
[command]# perf stat -e vpa_pmu/l1_to_l2_lat/ -a -I 1000
1.001017682 727,200 vpa_pmu/l1_to_l2_lat/
2.003540491 1,118,824 vpa_pmu/l1_to_l2_lat/
3.005699458 1,919,726 vpa_pmu/l1_to_l2_lat/
4.007827011 2,364,630 vpa_pmu/l1_to_l2_lat/
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241118114114.208964-1-kjain@linux.ibm.com
Breno Leitao says:
====================
netpoll: Use RCU primitives for npinfo pointer access
The net_device->npinfo pointer is marked with __rcu, indicating it requires
proper RCU access primitives:
struct net_device {
...
struct netpoll_info __rcu *npinfo;
...
};
Direct access to this pointer can lead to issues such as:
- Compiler incorrectly caching/reusing stale pointer values
- Missing memory ordering guarantees
- Non-atomic pointer loads
Replace direct NULL checks of npinfo with rcu_access_pointer(),
which provides the necessary memory ordering guarantees without the
overhead of a full RCU dereference, since we only need to verify
if the pointer is NULL.
In both cases, the RCU read lock is not held when the function is being
called. I checked that by using lockdep_assert_in_rcu_read_lock(), and
seeing the warning on both cases.
====================
Link: https://patch.msgid.link/20241118-netpoll_rcu-v1-0-a1888dcb4a02@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ndev->npinfo pointer in netpoll_poll_lock() is RCU-protected but is
being accessed directly for a NULL check. While no RCU read lock is held
in this context, we should still use proper RCU primitives for
consistency and correctness.
Replace the direct NULL check with rcu_access_pointer(), which is the
appropriate primitive when only checking for NULL without dereferencing
the pointer. This function provides the necessary ordering guarantees
without requiring RCU read-side protection.
Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://patch.msgid.link/20241118-netpoll_rcu-v1-2-a1888dcb4a02@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ndev->npinfo pointer in __netpoll_setup() is RCU-protected but is being
accessed directly for a NULL check. While no RCU read lock is held in this
context, we should still use proper RCU primitives for consistency and
correctness.
Replace the direct NULL check with rcu_access_pointer(), which is the
appropriate primitive when only checking for NULL without dereferencing
the pointer. This function provides the necessary ordering guarantees
without requiring RCU read-side protection.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20241118-netpoll_rcu-v1-1-a1888dcb4a02@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
fun_create_queue was added in 2022 by
commit e1ffcc66818f ("net/fungible: Add service module for Fungible
drivers")
but hasn't been used.
Remove it.
Also remove the static helper functions it was the only user of.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kees Cook says:
====================
UAPI: ethtool: Avoid flex-array in struct ethtool_link_settings
This reverts the tagged struct group in struct ethtool_link_settings and
instead just removes the flexible array member from Linux's view as it
is entirely unused.
====================
Link: https://patch.msgid.link/20241115204115.work.686-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
struct ethtool_link_settings tends to be used as a header for other
structures that have trailing bytes[1], but has a trailing flexible array
itself. Using this overlapped with other structures leads to ambiguous
object sizing in the compiler, so we want to avoid such situations (which
have caused real bugs in the past). Detecting this can be done with
-Wflex-array-member-not-at-end, which will need to be enabled globally.
Using a tagged struct_group() to create a new ethtool_link_settings_hdr
structure isn't possible as it seems we cannot use the tagged variant of
struct_group() due to syntax issues from C++'s perspective (even within
"extern C")[2]. Instead, we can just leave the offending member defined
in UAPI and remove it from the kernel's view of the structure, as Linux
doesn't actually use this member at all. There is also no change in
size since it was already a flexible array that didn't contribute to
size returned by any use of sizeof().
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/lkml/20241109100213.262a2fa0@kernel.org/ [2]
Link: https://lore.kernel.org/lkml/0bc2809fe2a6c11dd4c8a9a10d9bd65cccdb559b.1730238285.git.gustavoars@kernel.org/ [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241115204308.3821419-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 43d3487035e9a86fad952de4240a518614240d43. We cannot
use tagged struct groups in UAPI because C++ will throw syntax errors
even under "extern C".
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20241115204308.3821419-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 3bd9b9abdf1563a22041b7255baea6d449902f1a. We cannot
use the new tagged struct group because it throws C++ errors even under
"extern C".
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20241115204308.3821419-1-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bpf_offload caught a spurious warning in TC recently, but the error
message did not provide enough information to know what the problem
is:
FAIL: Found 'netdevsim' in command output, leaky extack?
Add the extack to the output:
FAIL: Unexpected command output, leaky extack? ('netdevsim', 'Warning: Filter with specified priority/protocol not found.')
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
CAN networking and drivers are maintained by Marc, Oliver and Vincent.
Marc sends us already pull requests with reviewed and validated code.
Exclude the CAN patch postings from the netdev@ mailing list to lower
the patch volume there.
Link: https://lore.kernel.org/20241113193709.395c18b0@kernel.org
Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://patch.msgid.link/20241115195609.981049-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commits for the SMC protocol usually get carried through the netdev
mailing list. Some portions use InfiniBand verbs that are discussed on
the RDMA mailing list. So run patches by that list too to increase the
likelihood that all interested parties can see them.
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts says:
====================
mptcp: pm: lockless list traversal and cleanup
Here are two patches improving the MPTCP in-kernel path-manager.
- Patch 1: the get and dump endpoints operations are iterating over the
endpoints list in a lockless way.
- Patch 2: reduce the code duplication to lookup an endpoint.
====================
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-0-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The helper __lookup_addr() can be used in mptcp_pm_nl_get_local_id()
and mptcp_pm_nl_is_backup() to simplify the code, and avoid code
duplication.
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-2-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To return an endpoint to the userspace via Netlink, and to dump all of
them, the endpoint list was iterated while holding the pernet->lock, but
only to read the content of the list.
In these cases, the spin locks can be replaced by RCU read ones, and use
the _rcu variants to iterate over the entries list in a lockless way.
Note that the __lookup_addr_by_id() helper has been modified to use the
_rcu variants of list_for_each_entry(), but with an extra conditions, so
it can be called either while the RCU read lock is held, or when the
associated pernet->lock is held.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-1-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver’s compatibility with devices is confirmed earlier in
platform_match(). Since reaching probe means the device is valid,
the extra check can be removed to simplify the code.
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For 1000BASE-X or SGMII interface mode, the PCS also need to be selected.
Only return null pointer when there is a copper NIC with external PHY.
Fixes: 02b2a6f91b90 ("net: txgbe: support copper NIC with external PHY")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20241115073508.1130046-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the GPIO interrupt controller is always not working properly, we need
to constantly add workaround to cope with hardware deficiencies. So just
remove GPIO interrupt controller, and let the SFP driver poll the GPIO
status.
Fixes: b4a2496c17ed ("net: txgbe: fix GPIO interrupt blocking")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20241115071527.1129458-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
eth: fbnic: cleanup and add a few stats
Cleanup trival problems with fbnic and add the PCIe and RPC (Rx parser)
stats.
All stats are read under rtnl_lock for now, so the code is pretty
trivial. We'll need to add more locking when we start gathering
drops used by .ndo_get_stats64.
====================
Link: https://patch.msgid.link/20241115015344.757567-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Report Rx parser statistics via ethtool -S.
The parser stats are 32b, so we need to add refresh to the service
task to make sure we don't miss overflows.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241115015344.757567-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add PCIe hardware statistics support to the fbnic driver. These stats
provide insight into PCIe transaction performance and error conditions.
Which includes, read/write and completion TLP counts and DWORD counts and
debug counters for tag, completion credit and NP credit exhaustion
The stats are exposed via debugfs and can be used to monitor PCIe
performance and debug PCIe issues.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241115015344.757567-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the usual debugfs structure:
fbnic/
$pci-id/
device-fileA
device-fileB
This patch only adds the directories, subsequent changes
will add files.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20241115015344.757567-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While adding the SPDX headers I noticed we're also missing
a header guard.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20241115015344.757567-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo noticed that we are missing SPDX headers, add them.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20241115015344.757567-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We use pcim_enable_device(), there is no need to call pci_disable_device().
Fixes: 546dd90be979 ("eth: fbnic: Add scaffolding for Meta's NIC driver")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241115014809.754860-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The sanity checks are going to get silently cast to unsigned
and always pass. Cast the sizeof to signed size.
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241115003248.733862-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 51183d233b5a ("net/neighbor: Update neigh_dump_info for strict
data checking") added strict checking. The err variable is not cleared,
so if we find no table to dump we will return the validation error even
if user did not want strict checking.
I think the only way to hit this is to send an buggy request, and ask
for a table which doesn't exist, so there's no point treating this
as a real fix. I only noticed it because a syzbot repro depended on it
to trigger another bug.
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241115003221.733593-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>