Core
----
- Increase size limits for to-be-sent skb frag allocations. This
allows tun, tap devices and packet sockets to better cope with large
writes operations.
- Store netdevs in an xarray, to simplify iterating over netdevs.
- Refactor nexthop selection for multipath routes.
- Improve sched class lifetime handling.
- Add backup nexthop ID support for bridge.
- Implement drop reasons support in openvswitch.
- Several data races annotations and fixes.
- Constify the sk parameter of routing functions.
- Prepend kernel version to netconsole message.
Protocols
---------
- Implement support for TCP probing the peer being under memory
pressure.
- Remove hard coded limitation on IPv6 specific info placement
inside the socket struct.
- Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated
per socket scaling factor.
- Scaling-up the IPv6 expired route GC via a separated list of
expiring routes.
- In-kernel support for the TLS alert protocol.
- Better support for UDP reuseport with connected sockets.
- Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
header size.
- Get rid of additional ancillary per MPTCP connection struct socket.
- Implement support for BPF-based MPTCP packet schedulers.
- Format MPTCP subtests selftests results in TAP.
- Several new SMC 2.1 features including unique experimental options,
max connections per lgr negotiation, max links per lgr negotiation.
BPF
---
- Multi-buffer support in AF_XDP.
- Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds.
- Implement an fd-based tc BPF attach API (TCX) and BPF link support on
top of it.
- Add SO_REUSEPORT support for TC bpf_sk_assign.
- Support new instructions from cpu v4 to simplify the generated code and
feature completeness, for x86, arm64, riscv64.
- Support defragmenting IPv(4|6) packets in BPF.
- Teach verifier actual bounds of bpf_get_smp_processor_id()
and fix perf+libbpf issue related to custom section handling.
- Introduce bpf map element count and enable it for all program types.
- Add a BPF hook in sys_socket() to change the protocol ID
from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy.
- Introduce bpf_me_mcache_free_rcu() and fix OOM under stress.
- Add uprobe support for the bpf_get_func_ip helper.
- Check skb ownership against full socket.
- Support for up to 12 arguments in BPF trampoline.
- Extend link_info for kprobe_multi and perf_event links.
Netfilter
---------
- Speed-up process exit by aborting ruleset validation if a
fatal signal is pending.
- Allow NLA_POLICY_MASK to be used with BE16/BE32 types.
Driver API
----------
- Page pool optimizations, to improve data locality and cache usage.
- Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need
for raw ioctl() handling in drivers.
- Simplify genetlink dump operations (doit/dumpit) providing them
the common information already populated in struct genl_info.
- Extend and use the yaml devlink specs to [re]generate the split ops.
- Introduce devlink selective dumps, to allow SF filtering SF based on
handle and other attributes.
- Add yaml netlink spec for netlink-raw families, allow route, link and
address related queries via the ynl tool.
- Remove phylink legacy mode support.
- Support offload LED blinking to phy.
- Add devlink port function attributes for IPsec.
New hardware / drivers
----------------------
- Ethernet:
- Broadcom ASP 2.0 (72165) ethernet controller
- MediaTek MT7988 SoC
- Texas Instruments AM654 SoC
- Texas Instruments IEP driver
- Atheros qca8081 phy
- Marvell 88Q2110 phy
- NXP TJA1120 phy
- WiFi:
- MediaTek mt7981 support
- Can:
- Kvaser SmartFusion2 PCI Express devices
- Allwinner T113 controllers
- Texas Instruments tcan4552/4553 chips
- Bluetooth:
- Intel Gale Peak
- Qualcomm WCN3988 and WCN7850
- NXP AW693 and IW624
- Mediatek MT2925
Drivers
-------
- Ethernet NICs:
- nVidia/Mellanox:
- mlx5:
- support UDP encapsulation in packet offload mode
- IPsec packet offload support in eswitch mode
- improve aRFS observability by adding new set of counters
- extends MACsec offload support to cover RoCE traffic
- dynamic completion EQs
- mlx4:
- convert to use auxiliary bus instead of custom interface logic
- Intel
- ice:
- implement switchdev bridge offload, even for LAG interfaces
- implement SRIOV support for LAG interfaces
- igc:
- add support for multiple in-flight TX timestamps
- Broadcom:
- bnxt:
- use the unified RX page pool buffers for XDP and non-XDP
- use the NAPI skb allocation cache
- OcteonTX2:
- support Round Robin scheduling HTB offload
- TC flower offload support for SPI field
- Freescale:
- add XDP_TX feature support
- AMD:
- ionic: add support for PCI FLR event
- sfc:
- basic conntrack offload
- introduce eth, ipv4 and ipv6 pedit offloads
- ST Microelectronics:
- stmmac: maximze PTP timestamping resolution
- Virtual NICs:
- Microsoft vNIC:
- batch ringing RX queue doorbell on receiving packets
- add page pool for RX buffers
- Virtio vNIC:
- add per queue interrupt coalescing support
- Google vNIC:
- add queue-page-list mode support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add port range matching tc-flower offload
- permit enslavement to netdevices with uppers
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- convert to phylink_pcs
- Renesas:
- r8A779fx: add speed change support
- rzn1: enables vlan support
- Ethernet PHYs:
- convert mv88e6xxx to phylink_pcs
- WiFi:
- Qualcomm Wi-Fi 7 (ath12k):
- extremely High Throughput (EHT) PHY support
- RealTek (rtl8xxxu):
- enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
RTL8192EU and RTL8723BU
- RealTek (rtw89):
- Introduce Time Averaged SAR (TAS) support
- Connector:
- support for event filtering
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTt1ZoSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkgFUP/REFaYWdWUvAzmWeezyx9dqgZMfSOjWq
9QvySiA94OAOcjIYkb7wfzQ5BBAZqaBQ/f8XqWwS1EDDDEBs8sP1cxmABKwW7Hsr
qFRu2sOqLzKBk223d0jIgEocfQaFpGbF71gXoTlDivBjBi5UxWm9bF0XnbYWcKgO
/QEvzNosi9uNdi85Fzmv62J6YzAdidEpwGsM7X2CfejwNRmStxAEg/NwvRR0Hyiq
OJCo97omEgTRaUle8nc64PDx33u4h5kQ1BkaeHEv0rbE3hftFC2YPKn/InmqSFGz
6ew2xnrGPR37LCuAiCcIIv6yR7K0eu0iYJ7jXwZxBDqxGavEPuwWGBoCP6qFiitH
ZLWhIrAUrdmSbySkTOCONhJ475qFAuQoYHYpZnX/bJZUHlSsb/9lwDJYJQGpVfd1
/daqJVSb7lhaifmNO1iNd/ibCIXq9zapwtkRwA897M8GkZBTsnVvazFld1Em+Se3
Bx6DSDUVBqVQ9fpZG2IAGD6odDwOzC1lF2IoceFvK9Ff6oE0psI+A0qNLMkHxZbW
Qlo7LsNe53hpoCC+yHTfXX7e/X8eNt0EnCGOQJDusZ0Nr3K7H4LKFA0i8UBUK05n
4lKnnaSQW7GQgdofLWt103OMDR9GoDxpFsm7b1X9+AEk6Fz6tq50wWYeMZETUKYP
DCW8VGFOZjZM
=9CsR
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Increase size limits for to-be-sent skb frag allocations. This
allows tun, tap devices and packet sockets to better cope with
large writes operations
- Store netdevs in an xarray, to simplify iterating over netdevs
- Refactor nexthop selection for multipath routes
- Improve sched class lifetime handling
- Add backup nexthop ID support for bridge
- Implement drop reasons support in openvswitch
- Several data races annotations and fixes
- Constify the sk parameter of routing functions
- Prepend kernel version to netconsole message
Protocols:
- Implement support for TCP probing the peer being under memory
pressure
- Remove hard coded limitation on IPv6 specific info placement inside
the socket struct
- Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per
socket scaling factor
- Scaling-up the IPv6 expired route GC via a separated list of
expiring routes
- In-kernel support for the TLS alert protocol
- Better support for UDP reuseport with connected sockets
- Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
header size
- Get rid of additional ancillary per MPTCP connection struct socket
- Implement support for BPF-based MPTCP packet schedulers
- Format MPTCP subtests selftests results in TAP
- Several new SMC 2.1 features including unique experimental options,
max connections per lgr negotiation, max links per lgr negotiation
BPF:
- Multi-buffer support in AF_XDP
- Add multi uprobe BPF links for attaching multiple uprobes and usdt
probes, which is significantly faster and saves extra fds
- Implement an fd-based tc BPF attach API (TCX) and BPF link support
on top of it
- Add SO_REUSEPORT support for TC bpf_sk_assign
- Support new instructions from cpu v4 to simplify the generated code
and feature completeness, for x86, arm64, riscv64
- Support defragmenting IPv(4|6) packets in BPF
- Teach verifier actual bounds of bpf_get_smp_processor_id() and fix
perf+libbpf issue related to custom section handling
- Introduce bpf map element count and enable it for all program types
- Add a BPF hook in sys_socket() to change the protocol ID from
IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
- Introduce bpf_me_mcache_free_rcu() and fix OOM under stress
- Add uprobe support for the bpf_get_func_ip helper
- Check skb ownership against full socket
- Support for up to 12 arguments in BPF trampoline
- Extend link_info for kprobe_multi and perf_event links
Netfilter:
- Speed-up process exit by aborting ruleset validation if a fatal
signal is pending
- Allow NLA_POLICY_MASK to be used with BE16/BE32 types
Driver API:
- Page pool optimizations, to improve data locality and cache usage
- Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the
need for raw ioctl() handling in drivers
- Simplify genetlink dump operations (doit/dumpit) providing them the
common information already populated in struct genl_info
- Extend and use the yaml devlink specs to [re]generate the split ops
- Introduce devlink selective dumps, to allow SF filtering SF based
on handle and other attributes
- Add yaml netlink spec for netlink-raw families, allow route, link
and address related queries via the ynl tool
- Remove phylink legacy mode support
- Support offload LED blinking to phy
- Add devlink port function attributes for IPsec
New hardware / drivers:
- Ethernet:
- Broadcom ASP 2.0 (72165) ethernet controller
- MediaTek MT7988 SoC
- Texas Instruments AM654 SoC
- Texas Instruments IEP driver
- Atheros qca8081 phy
- Marvell 88Q2110 phy
- NXP TJA1120 phy
- WiFi:
- MediaTek mt7981 support
- Can:
- Kvaser SmartFusion2 PCI Express devices
- Allwinner T113 controllers
- Texas Instruments tcan4552/4553 chips
- Bluetooth:
- Intel Gale Peak
- Qualcomm WCN3988 and WCN7850
- NXP AW693 and IW624
- Mediatek MT2925
Drivers:
- Ethernet NICs:
- nVidia/Mellanox:
- mlx5:
- support UDP encapsulation in packet offload mode
- IPsec packet offload support in eswitch mode
- improve aRFS observability by adding new set of counters
- extends MACsec offload support to cover RoCE traffic
- dynamic completion EQs
- mlx4:
- convert to use auxiliary bus instead of custom interface
logic
- Intel
- ice:
- implement switchdev bridge offload, even for LAG
interfaces
- implement SRIOV support for LAG interfaces
- igc:
- add support for multiple in-flight TX timestamps
- Broadcom:
- bnxt:
- use the unified RX page pool buffers for XDP and non-XDP
- use the NAPI skb allocation cache
- OcteonTX2:
- support Round Robin scheduling HTB offload
- TC flower offload support for SPI field
- Freescale:
- add XDP_TX feature support
- AMD:
- ionic: add support for PCI FLR event
- sfc:
- basic conntrack offload
- introduce eth, ipv4 and ipv6 pedit offloads
- ST Microelectronics:
- stmmac: maximze PTP timestamping resolution
- Virtual NICs:
- Microsoft vNIC:
- batch ringing RX queue doorbell on receiving packets
- add page pool for RX buffers
- Virtio vNIC:
- add per queue interrupt coalescing support
- Google vNIC:
- add queue-page-list mode support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add port range matching tc-flower offload
- permit enslavement to netdevices with uppers
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- convert to phylink_pcs
- Renesas:
- r8A779fx: add speed change support
- rzn1: enables vlan support
- Ethernet PHYs:
- convert mv88e6xxx to phylink_pcs
- WiFi:
- Qualcomm Wi-Fi 7 (ath12k):
- extremely High Throughput (EHT) PHY support
- RealTek (rtl8xxxu):
- enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
RTL8192EU and RTL8723BU
- RealTek (rtw89):
- Introduce Time Averaged SAR (TAS) support
- Connector:
- support for event filtering"
* tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits)
net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show
net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler
net: stmmac: clarify difference between "interface" and "phy_interface"
r8152: add vendor/device ID pair for D-Link DUB-E250
devlink: move devlink_notify_register/unregister() to dev.c
devlink: move small_ops definition into netlink.c
devlink: move tracepoint definitions into core.c
devlink: push linecard related code into separate file
devlink: push rate related code into separate file
devlink: push trap related code into separate file
devlink: use tracepoint_enabled() helper
devlink: push region related code into separate file
devlink: push param related code into separate file
devlink: push resource related code into separate file
devlink: push dpipe related code into separate file
devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper
devlink: push shared buffer related code into separate file
devlink: push port related code into separate file
devlink: push object register/unregister notifications into separate helpers
inet: fix IP_TRANSPARENT error handling
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTKAAKCRCRxhvAZXjc
oifJAQCzi/p+AdQu8LA/0XvR7fTwaq64ZDCibU4BISuLGT2kEgEAuGbuoFZa0rs2
XYD/s4+gi64p9Z01MmXm2XO1pu3GPg0=
=eJz5
-----END PGP SIGNATURE-----
Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs timestamp updates from Christian Brauner:
"This adds VFS support for multi-grain timestamps and converts tmpfs,
xfs, ext4, and btrfs to use them. This carries acks from all relevant
filesystems.
The VFS always uses coarse-grained timestamps when updating the ctime
and mtime after a change. This has the benefit of allowing filesystems
to optimize away a lot of metadata updates, down to around 1 per
jiffy, even when a file is under heavy writes.
Unfortunately, this has always been an issue when we're exporting via
NFSv3, which relies on timestamps to validate caches. A lot of changes
can happen in a jiffy, so timestamps aren't sufficient to help the
client decide to invalidate the cache.
Even with NFSv4, a lot of exported filesystems don't properly support
a change attribute and are subject to the same problems with timestamp
granularity. Other applications have similar issues with timestamps
(e.g., backup applications).
If we were to always use fine-grained timestamps, that would improve
the situation, but that becomes rather expensive, as the underlying
filesystem would have to log a lot more metadata updates.
This introduces fine-grained timestamps that are used when they are
actively queried.
This uses the 31st bit of the ctime tv_nsec field to indicate that
something has queried the inode for the mtime or ctime. When this flag
is set, on the next mtime or ctime update, the kernel will fetch a
fine-grained timestamp instead of the usual coarse-grained one.
As POSIX generally mandates that when the mtime changes, the ctime
must also change the kernel always stores normalized ctime values, so
only the first 30 bits of the tv_nsec field are ever used.
Filesytems can opt into this behavior by setting the FS_MGTIME flag in
the fstype. Filesystems that don't set this flag will continue to use
coarse-grained timestamps.
Various preparatory changes, fixes and cleanups are included:
- Fixup all relevant places where POSIX requires updating ctime
together with mtime. This is a wide-range of places and all
maintainers provided necessary Acks.
- Add new accessors for inode->i_ctime directly and change all
callers to rely on them. Plain accesses to inode->i_ctime are now
gone and it is accordingly rename to inode->__i_ctime and commented
as requiring accessors.
- Extend generic_fillattr() to pass in a request mask mirroring in a
sense the statx() uapi. This allows callers to pass in a request
mask to only get a subset of attributes filled in.
- Rework timestamp updates so it's possible to drop the @now
parameter the update_time() inode operation and associated helpers.
- Add inode_update_timestamps() and convert all filesystems to it
removing a bunch of open-coding"
* tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits)
btrfs: convert to multigrain timestamps
ext4: switch to multigrain timestamps
xfs: switch to multigrain timestamps
tmpfs: add support for multigrain timestamps
fs: add infrastructure for multigrain timestamps
fs: drop the timespec64 argument from update_time
xfs: have xfs_vn_update_time gets its own timestamp
fat: make fat_update_time get its own timestamp
fat: remove i_version handling from fat_update_time
ubifs: have ubifs_update_time use inode_update_timestamps
btrfs: have it use inode_update_timestamps
fs: drop the timespec64 arg from generic_update_time
fs: pass the request_mask to generic_fillattr
fs: remove silly warning from current_time
gfs2: fix timestamp handling on quota inodes
fs: rename i_ctime field to __i_ctime
selinux: convert to ctime accessor functions
security: convert to ctime accessor functions
apparmor: convert to ctime accessor functions
sunrpc: convert to ctime accessor functions
...
At last, move the last bits out of leftover.c,
the devlink_notify_register/unregister() functions to dev.c
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-16-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for the trap code move, use tracepoint_enabled() helper
instead of trace_devlink_trap_report_enabled() which would not be
defined in that scope.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since both dpipe and resource code is using this helper, in preparation
for code split to separate files, move
devlink_dpipe_send_and_alloc_skb() helper into netlink.c. Rename it on
the way.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparations of leftover.c split to individual files, avoid need to
have object structures exposed in devl_internal.h and allow to have them
maintained in object files.
The register/unregister notifications need to know the structures
to iterate lists. To avoid the need, introduce per-object
register/unregister notification helpers and use them.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230828061657.300667-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
My recent patch forgot to change error handling for IP_TRANSPARENT
socket option.
WARNING: bad unlock balance detected!
6.5.0-rc7-syzkaller-01717-g59da9885767a #0 Not tainted
-------------------------------------
syz-executor151/5028 is trying to release lock (sk_lock-AF_INET) at:
[<ffffffff88213983>] sockopt_release_sock+0x53/0x70 net/core/sock.c:1073
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syz-executor151/5028:
stack backtrace:
CPU: 0 PID: 5028 Comm: syz-executor151 Not tainted 6.5.0-rc7-syzkaller-01717-g59da9885767a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
__lock_release kernel/locking/lockdep.c:5438 [inline]
lock_release+0x4b5/0x680 kernel/locking/lockdep.c:5781
sock_release_ownership include/net/sock.h:1824 [inline]
release_sock+0x175/0x1b0 net/core/sock.c:3527
sockopt_release_sock+0x53/0x70 net/core/sock.c:1073
do_ip_setsockopt+0x12c1/0x3640 net/ipv4/ip_sockglue.c:1364
ip_setsockopt+0x59/0xe0 net/ipv4/ip_sockglue.c:1419
raw_setsockopt+0x218/0x290 net/ipv4/raw.c:833
__sys_setsockopt+0x2cd/0x5b0 net/socket.c:2305
__do_sys_setsockopt net/socket.c:2316 [inline]
__se_sys_setsockopt net/socket.c:2313 [inline]
Fixes: 4bd0623f04ee ("inet: move inet->transparent to inet->inet_flags")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While looking at TC_ACT_* handling, the TC_ACT_CONSUMED is only handled in
sch_handle_ingress but not sch_handle_egress. This was added via cd11b164073b
("net/tc: introduce TC_ACT_REINSERT.") and e5cf1baf92cb ("act_mirred: use
TC_ACT_REINSERT when possible") and later got renamed into TC_ACT_CONSUMED
via 720f22fed81b ("net: sched: refactor reinsert action").
The initial work was targeted for ovs back then and only needed on ingress,
and the mirred action module also restricts it to only that. However, given
it's an API contract it would still make sense to make this consistent to
sch_handle_ingress and handle it on egress side in the same way, that is,
setting return code to "success" and returning NULL back to the caller as
otherwise an action module sitting on egress returning TC_ACT_CONSUMED could
lead to an UAF when untreated.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a memory leak for the tc egress path with TC_ACT_{STOLEN,QUEUED,TRAP}:
[...]
unreferenced object 0xffff88818bcb4f00 (size 232):
comm "softirq", pid 0, jiffies 4299085078 (age 134.028s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 80 70 61 81 88 ff ff 00 41 31 14 81 88 ff ff ..pa.....A1.....
backtrace:
[<ffffffff9991b938>] kmem_cache_alloc_node+0x268/0x400
[<ffffffff9b3d9231>] __alloc_skb+0x211/0x2c0
[<ffffffff9b3f0c7e>] alloc_skb_with_frags+0xbe/0x6b0
[<ffffffff9b3bf9a9>] sock_alloc_send_pskb+0x6a9/0x870
[<ffffffff9b6b3f00>] __ip_append_data+0x14d0/0x3bf0
[<ffffffff9b6ba24e>] ip_append_data+0xee/0x190
[<ffffffff9b7e1496>] icmp_push_reply+0xa6/0x470
[<ffffffff9b7e4030>] icmp_reply+0x900/0xa00
[<ffffffff9b7e42e3>] icmp_echo.part.0+0x1a3/0x230
[<ffffffff9b7e444d>] icmp_echo+0xcd/0x190
[<ffffffff9b7e9566>] icmp_rcv+0x806/0xe10
[<ffffffff9b699bd1>] ip_protocol_deliver_rcu+0x351/0x3d0
[<ffffffff9b699f14>] ip_local_deliver_finish+0x2b4/0x450
[<ffffffff9b69a234>] ip_local_deliver+0x174/0x1f0
[<ffffffff9b69a4b2>] ip_sublist_rcv_finish+0x1f2/0x420
[<ffffffff9b69ab56>] ip_sublist_rcv+0x466/0x920
[...]
I was able to reproduce this via:
ip link add dev dummy0 type dummy
ip link set dev dummy0 up
tc qdisc add dev eth0 clsact
tc filter add dev eth0 egress protocol ip prio 1 u32 match ip protocol 1 0xff action mirred egress redirect dev dummy0
ping 1.1.1.1
<stolen>
After the fix, there are no kmemleak reports with the reproducer. This is
in line with what is also done on the ingress side, and from debugging the
skb_unref(skb) on dummy xmit and sch_handle_egress() side, it is visible
that these are two different skbs with both skb_unref(skb) as true. The two
seen skbs are due to mirred doing a skb_clone() internally as use_reinsert
is false in tcf_mirred_act() for egress. This was initially reported by Gal.
Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
Reported-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bdfc2640-8f65-5b56-4472-db8e2b161aab@nvidia.com
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a previous attempt to fix an out-of-bounds access in the DCCP
error handlers, but that fix assumed that the error handlers only want
to access the first 8 bytes of the DCCP header. Actually, they also look
at the DCCP sequence number, which is stored beyond 8 bytes, so an
explicit pskb_may_pull() is required.
Fixes: 6706a97fec96 ("dccp: fix out of bound access in dccp_v4_err()")
Fixes: 1aa9d1a0e7ee ("ipv6: dccp: fix out of bound access in dccp_v6_err()")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_cipher_desc also contains the algorithm name needed by
crypto_alloc_aead, use it.
Finally, use get_cipher_desc to check if the cipher_type coming from
userspace is valid, and remove the cipher_type switch.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/53d021d80138aa125a9cef4468aa5ce531975a7b.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We can get rid of some local variables, but we have to keep nonce_size
because tls1.3 uses nonce_size = 0 for all ciphers.
We can also drop the runtime sanity checks on iv/rec_seq/tag size,
since we have compile time checks on those values.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/deed9c4430a62c31751a72b8c03ad66ffe710717.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Every cipher uses the same code to update its crypto_info struct based
on the values contained in the cctx, with only the struct type and
size/offset changing. We can get those from tls_cipher_desc, and use
a single pair of memcpy and final copy_to_user.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/c21a904b91e972bdbbf9d1c6d2731ccfa1eedf72.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tls_sw_fallback_init already gets the key and tag size from
tls_cipher_desc. We can now also check that the cipher type is valid,
and stop hard-coding the algorithm name passed to crypto_alloc_aead.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/c8c94b8fcafbfb558e09589c1f1ad48dbdf92f76.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tls_set_device_offload is already getting iv and rec_seq sizes from
tls_cipher_desc. We can now also check if the cipher_type coming from
userspace is valid and can be offloaded.
We can also remove the runtime check on rec_seq, since we validate it
at compile time.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/8ab71b8eca856c7aaf981a45fe91ac649eb0e2e9.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- add nonce, usually equal to iv_size but not for chacha
- add offsets into the crypto_info for each field
- add algorithm name
- add offloadable flag
Also add helpers to access each field of a crypto_info struct
described by a tls_cipher_desc.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/39d5f476d63c171097764e8d38f6f158b7c109ae.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tls_cipher_size_desc indexes ciphers by their type, but we're not
using indices 0..50 of the array. Each struct tls_cipher_size_desc is
20B, so that's a lot of unused memory. We can reindex the array
starting at the lowest used cipher_type.
Introduce the get_cipher_size_desc helper to find the right item and
avoid out-of-bounds accesses, and make tls_cipher_size_desc's size
explicit so that gcc reminds us to update TLS_CIPHER_MIN/MAX when we
add a new cipher.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/5e054e370e240247a5d37881a1cd93a67c15f4ca.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose port function commands to enable / disable IPsec packet offloads,
this is used to control the port IPsec capabilities.
When IPsec packet is disabled for a function of the port (default),
function cannot offload IPsec packet operations (encapsulation and XFRM
policy offload). When enabled, IPsec packet operations can be offloaded
by the function of the port, which includes crypto operation
(Encrypt/Decrypt), IPsec encapsulation and XFRM state and policy
offload.
Example of a PCI VF port which supports IPsec packet offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose port function commands to enable / disable IPsec crypto offloads,
this is used to control the port IPsec capabilities.
When IPsec crypto is disabled for a function of the port (default),
function cannot offload any IPsec crypto operations (Encrypt/Decrypt and
XFRM state offloading). When enabled, IPsec crypto operations can be
offloaded by the function of the port.
Example of a PCI VF port which supports IPsec crypto offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
HFSC assumes that inner classes have an fsc curve, but it is currently
possible for classes without an fsc curve to become parents. This leads
to bugs including a use-after-free.
Don't allow non-root classes without HFSC_FSC to become parents.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230824084905.422-1-markovicbudimir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZOjkTAAKCRDbK58LschI
gx32AP9gaaHFBtOYBfoenKTJfMgv1WhtQHIBas+WN9ItmBx9MAEA4gm/VyQ6oD7O
EBjJKJQ2CZ/QKw7cNacXw+l5jF7/+Q0=
=8P7g
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-25
We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).
The main changes are:
1) Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds,
from Jiri Olsa.
2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
from Xu Kuohai.
3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
from Pu Lehui.
4) Fix LWT BPF xmit hooks wrt their return values where propagating
the result from skb_do_redirect() would trigger a use-after-free,
from Yan Zhai.
5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
where the map's value kptr type and locally allocated obj type
mismatch, from Yonghong Song.
6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
root/node which bypassed reg->off == 0 enforcement,
from Kumar Kartikeya Dwivedi.
7) Lift BPF verifier restriction in networking BPF programs to treat
comparison of packet pointers not as a pointer leak,
from Yafang Shao.
8) Remove unmaintained XDP BPF samples as they are maintained
in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.
9) Batch of fixes for the tracing programs from BPF samples in order
to make them more libbpf-aware, from Daniel T. Lee.
10) Fix a libbpf signedness determination bug in the CO-RE relocation
handling logic, from Andrii Nakryiko.
11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
fixes for bpf_refcount shared ownership implementation,
both from Dave Marchevsky.
12) Add a new bpf_object__unpin() API function to libbpf,
from Daniel Xu.
13) Fix a memory leak in libbpf to also free btf_vmlinux
when the bpf_object gets closed, from Hao Luo.
14) Small error output improvements to test_bpf module, from Helge Deller.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
bpf: Consider non-owning refs to refcounted nodes RCU protected
bpf: Reenable bpf_refcount_acquire
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
bpf: Consider non-owning refs trusted
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
selftests/bpf: Enable cpu v4 tests for RV64
riscv, bpf: Support unconditional bswap insn
riscv, bpf: Support signed div/mod insns
riscv, bpf: Support 32-bit offset jmp insn
riscv, bpf: Support sign-extension mov insns
riscv, bpf: Support sign-extension load insns
riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
samples/bpf: Cleanup .gitignore
samples/bpf: Remove the xdp_sample_pkts utility
samples/bpf: Remove the xdp1 and xdp2 utilities
samples/bpf: Remove the xdp_rxq_info utility
samples/bpf: Remove the xdp_redirect* utilities
...
====================
Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The second pull request for v6.6, this time with both stack and driver
changes. Unusually we have only one major new feature but lots of
small cleanup all over, I guess this is due to people have been on
vacation the last month.
Major changes:
rtw89
* Introduce Time Averaged SAR (TAS) support
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmToqosRHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZv9XQf9HDq9smbuWLvwzNjbbS31hHFLmnfhN8Zp
+Zzn47gpMCle9ahGLQyw8lcfNPWCMyqOu4sGQ6hyyuH+YXoxZryuq9QDwWo9L/b1
5Cpm4IaBYBMm0ZoOkWw2lQSzGyNrXgvCEKRVC+pYQMvr5V2aEWxT/kT4guiou9D5
OXPRFN2iqZP0Q3TKcfKWRnWn3S0Ok3kZCFuXcWkL0sgwjqP/wbAPO1XNI1IImKNM
xUd0zT4vK/layYq7i20y8blglI5kcp/aKCFEwYpQC2WPeZ3Wtl1G9PQ8eze5Gc2Q
NTw3xfr6tENIcAmYoLdBdKbUq6e6pwLwXlojlZ2beR6s7LHM30AinQ==
=2Hja
-----END PGP SIGNATURE-----
Merge tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.6
The second pull request for v6.6, this time with both stack and driver
changes. Unusually we have only one major new feature but lots of
small cleanup all over, I guess this is due to people have been on
vacation the last month.
Major changes:
rtw89
- Introduce Time Averaged SAR (TAS) support
* tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (114 commits)
wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet()
wifi: rtw88: usb: kill and free rx urbs on probe failure
wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar()
wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG
wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips
wifi: rtw89: mac: define register address of rx_filter to generalize code
wifi: rtw89: mac: define internal memory address for WiFi 7 chip
wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory
wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address
wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code
wifi: rtw89: initialize multi-channel handling
wifi: rtw89: provide functions to configure NoA for beacon update
wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif
wifi: rtw89: sar: let caller decide the center frequency to query
wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band()
wifi: rtw89: add function prototype for coex request duration
Fix nomenclature for USB and PCI wireless devices
wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address
wifi: ath12k: Remove unused declarations
wifi: ath12k: add check max message length while scanning with extraie
...
====================
Link: https://lore.kernel.org/r/20230825132230.A0833C433C8@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the case of a Periodic Synchronized Receiver,
the PA report received from a Broadcaster contains the BASE,
which has information about codec and other parameters of a BIG.
This isnformation is stored and the application can retrieve it
using getsockopt(BT_ISO_BASE).
Signed-off-by: Claudia Draghicescu <claudia.rosu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Not calling hci_(dis)connect_cfm before deleting conn referred to by a
socket generally results to use-after-free.
When cleaning up SCO connections when the parent ACL is deleted too
early, use hci_conn_failed to do the connection cleanup properly.
We also need to clean up ISO connections in a similar situation when
connecting has started but LE Create CIS is not yet sent, so do it too
here.
Fixes: ca1fd42e7dbf ("Bluetooth: Fix potential double free caused by hci_conn_unlink")
Reported-by: syzbot+cf54c1da6574b6c1b049@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-bluetooth/00000000000013b93805fbbadc50@google.com/
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
There a few instances still using HCI_MAX_AD_LENGTH instead of using
max_adv_len which takes care of detecting what is the actual maximum
length depending on if the controller supports EA or not.
Fixes: 112b5090c219 ("Bluetooth: MGMT: Fix always using HCI_MAX_AD_LENGTH")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This commit implements defer setup support for the Broadcast Sink
scenario: By setting defer setup on a broadcast socket before calling
listen, the user is able to trigger the PA sync and BIG sync procedures
separately.
This is useful if the user first wants to synchronize to the periodic
advertising transmitted by a Broadcast Source, and trigger the BIG sync
procedure later on.
If defer setup is set, once a PA sync established event arrives, a new
hcon is created and notified to the ISO layer. A child socket associated
with the PA sync connection will be added to the accept queue of the
listening socket.
Once the accept call returns the fd for the PA sync child socket, the
user should call read on that fd. This will trigger the BIG create sync
procedure, and the PA sync socket will become a listening socket itself.
When the BIG sync established event is notified to the ISO layer, the
bis connections will be added to the accept queue of the PA sync parent.
The user should call accept on the PA sync socket to get the final bis
connections.
Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This fixes sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL when
hci_le_create_conn_sync has not been called because HCI_CONN_SCANNING
has been clear too early before its cmd_sync callback has been run.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Use-after-free can occur in hci_disconnect_all_sync if a connection is
deleted by concurrent processing of a controller event.
To prevent this the code now tries to iterate over the list backwards
to ensure the links are cleanup before its parents, also it no longer
relies on a cursor, instead it always uses the last element since
hci_abort_conn_sync is guaranteed to call hci_conn_del.
UAF crash log:
==================================================================
BUG: KASAN: slab-use-after-free in hci_set_powered_sync
(net/bluetooth/hci_sync.c:5424) [bluetooth]
Read of size 8 at addr ffff888009d9c000 by task kworker/u9:0/124
CPU: 0 PID: 124 Comm: kworker/u9:0 Tainted: G W
6.5.0-rc1+ #10
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.2-1.fc38 04/01/2014
Workqueue: hci0 hci_cmd_sync_work [bluetooth]
Call Trace:
<TASK>
dump_stack_lvl+0x5b/0x90
print_report+0xcf/0x670
? __virt_addr_valid+0xdd/0x160
? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
kasan_report+0xa6/0xe0
? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
? __pfx_hci_set_powered_sync+0x10/0x10 [bluetooth]
? __pfx_lock_release+0x10/0x10
? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
hci_cmd_sync_work+0x137/0x220 [bluetooth]
process_one_work+0x526/0x9d0
? __pfx_process_one_work+0x10/0x10
? __pfx_do_raw_spin_lock+0x10/0x10
? mark_held_locks+0x1a/0x90
worker_thread+0x92/0x630
? __pfx_worker_thread+0x10/0x10
kthread+0x196/0x1e0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Allocated by task 1782:
kasan_save_stack+0x33/0x60
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x8f/0xa0
hci_conn_add+0xa5/0xa80 [bluetooth]
hci_bind_cis+0x881/0x9b0 [bluetooth]
iso_connect_cis+0x121/0x520 [bluetooth]
iso_sock_connect+0x3f6/0x790 [bluetooth]
__sys_connect+0x109/0x130
__x64_sys_connect+0x40/0x50
do_syscall_64+0x60/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Freed by task 695:
kasan_save_stack+0x33/0x60
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2b/0x50
__kasan_slab_free+0x10a/0x180
__kmem_cache_free+0x14d/0x2e0
device_release+0x5d/0xf0
kobject_put+0xdf/0x270
hci_disconn_complete_evt+0x274/0x3a0 [bluetooth]
hci_event_packet+0x579/0x7e0 [bluetooth]
hci_rx_work+0x287/0xaa0 [bluetooth]
process_one_work+0x526/0x9d0
worker_thread+0x92/0x630
kthread+0x196/0x1e0
ret_from_fork+0x2c/0x50
==================================================================
Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Remove the necessity to modify skb_ext_total_length() when new extension
types are added.
Also reduces the line count a bit.
With optimizations enabled the function is folded down to the same
constant value as before during compilation.
This has been validated on x86 with GCC 6.5.0 and 13.2.1.
Also a similar construct has been validated on godbolt.org with GCC 5.1.
In any case the compiler has to be able to evaluate the construct at
compile-time for the BUILD_BUG_ON() in skb_extensions_init().
Even if not evaluated at compile-time this function would only ever
be executed once at run-time, so the overhead would be very minuscule.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230823-skb_ext-simplify-v2-1-66e26cd66860@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>