Each region has an independently configurable number of maximum
snapshots. This information is not reported to userspace, making it not
very discoverable. Fix this by adding a new
DEVLINK_ATTR_REGION_MAX_SNAPSHOST attribute which is used to report this
maximum.
Ex:
$devlink region
pci/0000:af:00.0/nvm-flash: size 10485760 snapshot [] max 1
pci/0000:af:00.0/device-caps: size 4096 snapshot [] max 10
pci/0000:af:00.1/nvm-flash: size 10485760 snapshot [] max 1
pci/0000:af:00.1/device-caps: size 4096 snapshot [] max 10
This information enables users to understand why a new region command
may fail due to having too many existing snapshots.
Reported-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Describe common flows and refcounting behaviour.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/mptcp/protocol.c
977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")
same patch merged in both trees, keep net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts the following patches :
- commit 2e05fcae83 ("tcp: fix compile error if !CONFIG_SYSCTL")
- commit 4f661542a4 ("tcp: fix zerocopy and notsent_lowat issues")
- commit 472c2e07ee ("tcp: add one skb cache for tx")
- commit 8b27dae5a2 ("tcp: add one skb cache for rx")
Having a cache of one skb (in each direction) per TCP socket is fragile,
since it can cause a significant increase of memory needs,
and not good enough for high speed flows anyway where more than one skb
is needed.
We want instead to add a generic infrastructure, with more flexible
per-cpu caches, for alien NUMA nodes.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documents devlink params, fw update & cd collection commands
and its usage.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The file sja1105.txt was converted to nxp,sja1105.yaml.
Signed-off-by: Alejandro Concepcion-Rodriguez <asconcepcion@acoro.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
wireless and can.
Current release - regressions:
- qrtr: revert check in qrtr_endpoint_post(), fixes audio and wifi
- ip_gre: validate csum_start only on pull
- bnxt_en: fix 64-bit doorbell operation on 32-bit kernels
- ionic: fix double use of queue-lock, fix a sleeping in atomic
- can: c_can: fix null-ptr-deref on ioctl()
- cs89x0: disable compile testing on powerpc
Current release - new code bugs:
- bridge: mcast: fix vlan port router deadlock, consistently disable BH
Previous releases - regressions:
- dsa: tag_rtl4_a: fix egress tags, only port 0 was working
- mptcp: fix possible divide by zero
- netfilter: nft_ct: protect nft_ct_pcpu_template_refcnt with mutex
- netfilter: socket: icmp6: fix use-after-scope
- stmmac: fix MAC not working when system resume back with WoL active
Previous releases - always broken:
- ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
address
- seg6: set fc_nlinfo in nh_create_ipv4, nh_create_ipv6
- mptcp: only send extra TCP acks in eligible socket states
- dsa: lantiq_gswip: fix maximum frame length
- stmmac: fix overall budget calculation for rxtx_napi
- bnxt_en: fix firmware version reporting via devlink
- renesas: sh_eth: add missing barrier to fix freeing wrong tx descriptor
Stragglers:
- netfilter: conntrack: switch to siphash
- netfilter: refuse insertion if chain has grown too large
- ncsi: add get MAC address command to get Intel i210 MAC address
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmE3uicACgkQMUZtbf5S
IrtJVA//XdE8qAmw1JukjyYC87JH2ale20eoZ6ERn7/09e4tdv3M6dOTI4YfrM6+
CMNP5MP2qit3IzY+lN0+yt9AAFH7k85z3MA8zLxsXN4z63OJcZvFv/G/OWy4Wp/0
vOo/DH+rF3LR+fZZvjJI+8Xi9/orsRpD12cwGmjGRxybh+XcnHKI/GvK2RgE6oBR
015RfBbbQBpzFQvESLnSwDzabN1XFEL1x/bz7N8ek3okfO/tab+f3E1tb6eYtTy+
jyDyOWpayd4xDttKNMUuxwS1q+/oAWOAq8PzkaF/ZG2sBH1Z4yZN9ZtsLNZmPG8N
5L1FEem/Nmgr54T9v/FhfiryhhGGysVfVgtQcCBkKRmVn1Kk2L6dFvtuanPtFFd3
llbi5PvCDJy3rbMmxKmyoM3T4jpMwWxQRZKsosw+k/WQfb8/SUOjgpY713V1Wx/P
S+2uadU4l9Ql9sF6X0IqZABnnt+j/BuDo6C6vVq7vyj0iQ9hEX9YxC0ybrAHOYpH
suHWKndodRfTxxVOg8xRNYwXyRLNbm1AP6LMDNKBlFUjwNSZ362qFX7W7DuXoRup
Rrnb8V1QFvM+pyFb2a0qNtBS68IXbjCdVQX5e8a5ELaAUnDPefNrfPN+/rrTLEtV
LnusmBF+02llVSYdr88t1e+LmzqS/aqXFy2ry4y6owjq20ld2O0=
=Zvuz
-----END PGP SIGNATURE-----
Merge tag 'net-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes and stragglers from Jakub Kicinski:
"Networking stragglers and fixes, including changes from netfilter,
wireless and can.
Current release - regressions:
- qrtr: revert check in qrtr_endpoint_post(), fixes audio and wifi
- ip_gre: validate csum_start only on pull
- bnxt_en: fix 64-bit doorbell operation on 32-bit kernels
- ionic: fix double use of queue-lock, fix a sleeping in atomic
- can: c_can: fix null-ptr-deref on ioctl()
- cs89x0: disable compile testing on powerpc
Current release - new code bugs:
- bridge: mcast: fix vlan port router deadlock, consistently disable
BH
Previous releases - regressions:
- dsa: tag_rtl4_a: fix egress tags, only port 0 was working
- mptcp: fix possible divide by zero
- netfilter: nft_ct: protect nft_ct_pcpu_template_refcnt with mutex
- netfilter: socket: icmp6: fix use-after-scope
- stmmac: fix MAC not working when system resume back with WoL active
Previous releases - always broken:
- ip/ip6_gre: use the same logic as SIT interfaces when computing
v6LL address
- seg6: set fc_nlinfo in nh_create_ipv4, nh_create_ipv6
- mptcp: only send extra TCP acks in eligible socket states
- dsa: lantiq_gswip: fix maximum frame length
- stmmac: fix overall budget calculation for rxtx_napi
- bnxt_en: fix firmware version reporting via devlink
- renesas: sh_eth: add missing barrier to fix freeing wrong tx
descriptor
Stragglers:
- netfilter: conntrack: switch to siphash
- netfilter: refuse insertion if chain has grown too large
- ncsi: add get MAC address command to get Intel i210 MAC address"
* tag 'net-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
ieee802154: Remove redundant initialization of variable ret
net: stmmac: fix MAC not working when system resume back with WoL active
net: phylink: add suspend/resume support
net: renesas: sh_eth: Fix freeing wrong tx descriptor
bonding: 3ad: pass parameter bond_params by reference
cxgb3: fix oops on module removal
can: c_can: fix null-ptr-deref on ioctl()
can: rcar_canfd: add __maybe_unused annotation to silence warning
net: wwan: iosm: Unify IO accessors used in the driver
net: wwan: iosm: Replace io.*64_lo_hi() with regular accessors
net: qcom/emac: Replace strlcpy with strscpy
ip6_gre: Revert "ip6_gre: add validation for csum_start"
net: hns3: make hclgevf_cmd_caps_bit_map0 and hclge_cmd_caps_bit_map0 static
selftests/bpf: Test XDP bonding nest and unwind
bonding: Fix negative jump label count on nested bonding
MAINTAINERS: add VM SOCKETS (AF_VSOCK) entry
stmmac: dwmac-loongson:Fix missing return value
iwlwifi: fix printk format warnings in uefi.c
net: create netdev->dev_addr assignment helpers
bnxt_en: Fix possible unintended driver initiated error recovery
...
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Protect nft_ct template with global mutex, from Pavel Skripkin.
2) Two recent commits switched inet rt and nexthop exception hashes
from jhash to siphash. If those two spots are problematic then
conntrack is affected as well, so switch voer to siphash too.
While at it, add a hard upper limit on chain lengths and reject
insertion if this is hit. Patches from Florian Westphal.
3) Fix use-after-scope in nf_socket_ipv6 reported by KASAN,
from Benjamin Hesmans.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: socket: icmp6: fix use-after-scope
netfilter: refuse insertion if chain has grown too large
netfilter: conntrack: switch to siphash
netfilter: conntrack: sanitize table size default settings
netfilter: nft_ct: protect nft_ct_pcpu_template_refcnt with mutex
====================
Link: https://lore.kernel.org/r/20210903163020.13741-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- A reworking of PDF generation to yield better results for documents
using CJK fonts in particular.
- A new set of translations into traditional Chinese, a dialect for which
I am assured there is a community of interested readers.
- A lot more regular Chinese translation work as well.
...plus the usual assortment of updates, fixes, typo tweaks, etc.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmEugrgACgkQF0NaE2wM
fliWWQf/RXf34QkMIe+r77WlTRKc+/6R/cO9VlYPtM9vqreKHZZvGgM1t76aOusb
M5QHwQGoZDzaE1wrv0PPm00HtB0Tw7GfZRUbZ4D+niJD1+gcbDTkTR6NdjOvWWUR
zHX2Sx8KJiNrFDtLtRtlUexM8GD124KZ0A8GF6Hpu3WR3HTFDInTdiylUOmj/4eO
3zUGgrJnUVzkqHLGZzV/kmE4kEHGpxyps2JwGq2iF7362t8R6xH3mEdKKKc1pUpx
lGSxfHs+OPWRsNxVJsdYh8kneIpML8OK6lKda1pzwNj8QhIMz/6tZoutKziHsalI
HkbC3exh+SHak2U6Had303vqkIM7cg==
=2QUy
-----END PGP SIGNATURE-----
Merge tag 'docs-5.15' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Yet another set of documentation changes:
- A reworking of PDF generation to yield better results for documents
using CJK fonts in particular.
- A new set of translations into traditional Chinese, a dialect for
which I am assured there is a community of interested readers.
- A lot more regular Chinese translation work as well.
... plus the usual assortment of updates, fixes, typo tweaks, etc"
* tag 'docs-5.15' of git://git.lwn.net/linux: (55 commits)
docs: sphinx-requirements: Move sphinx_rtd_theme to top
docs: pdfdocs: Enable language-specific font choice of zh_TW translations
docs: pdfdocs: Teach xeCJK about character classes of quotation marks
docs: pdfdocs: Permit AutoFakeSlant for CJK fonts
docs: pdfdocs: One-half spacing for CJK translations
docs: pdfdocs: Add conf.py local to translations for ascii-art alignment
docs: pdfdocs: Preserve inter-phrase space in Korean translations
docs: pdfdocs: Choose Serif font as CJK mainfont if possible
docs: pdfdocs: Add CJK-language-specific font settings
docs: pdfdocs: Refactor config for CJK document
scripts/kernel-doc: Override -Werror from KCFLAGS with KDOC_WERROR
docs/zh_CN: Add zh_CN/accounting/psi.rst
doc: align Italian translation
Documentation/features/vm: riscv supports THP now
docs/zh_CN: add infiniband user_verbs translation
docs/zh_CN: add infiniband user_mad translation
docs/zh_CN: add infiniband tag_matching translation
docs/zh_CN: add infiniband sysfs translation
docs/zh_CN: add infiniband opa_vnic translation
docs/zh_CN: add infiniband ipoib translation
...
Daniel Borkmann says:
====================
bpf-next 2021-08-31
We've added 116 non-merge commits during the last 17 day(s) which contain
a total of 126 files changed, 6813 insertions(+), 4027 deletions(-).
The main changes are:
1) Add opaque bpf_cookie to perf link which the program can read out again,
to be used in libbpf-based USDT library, from Andrii Nakryiko.
2) Add bpf_task_pt_regs() helper to access userspace pt_regs, from Daniel Xu.
3) Add support for UNIX stream type sockets for BPF sockmap, from Jiang Wang.
4) Allow BPF TCP congestion control progs to call bpf_setsockopt() e.g. to switch
to another congestion control algorithm during init, from Martin KaFai Lau.
5) Extend BPF iterator support for UNIX domain sockets, from Kuniyuki Iwashima.
6) Allow bpf_{set,get}sockopt() calls from setsockopt progs, from Prankur Gupta.
7) Add bpf_get_netns_cookie() helper for BPF_PROG_TYPE_{SOCK_OPS,CGROUP_SOCKOPT}
progs, from Xu Liu and Stanislav Fomichev.
8) Support for __weak typed ksyms in libbpf, from Hao Luo.
9) Shrink struct cgroup_bpf by 504 bytes through refactoring, from Dave Marchevsky.
10) Fix a smatch complaint in verifier's narrow load handling, from Andrey Ignatov.
11) Fix BPF interpreter's tail call count limit, from Daniel Borkmann.
12) Big batch of improvements to BPF selftests, from Magnus Karlsson, Li Zhijian,
Yucong Sun, Yonghong Song, Ilya Leoshkevich, Jussi Maki, Ilya Leoshkevich, others.
13) Another big batch to revamp XDP samples in order to give them consistent look
and feel, from Kumar Kartikeya Dwivedi.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (116 commits)
MAINTAINERS: Remove self from powerpc BPF JIT
selftests/bpf: Fix potential unreleased lock
samples: bpf: Fix uninitialized variable in xdp_redirect_cpu
selftests/bpf: Reduce more flakyness in sockmap_listen
bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS
bpf: selftests: Add dctcp fallback test
bpf: selftests: Add connect_to_fd_opts to network_helpers
bpf: selftests: Add sk_state to bpf_tcp_helpers.h
bpf: tcp: Allow bpf-tcp-cc to call bpf_(get|set)sockopt
selftests: xsk: Preface options with opt
selftests: xsk: Make enums lower case
selftests: xsk: Generate packets from specification
selftests: xsk: Generate packet directly in umem
selftests: xsk: Simplify cleanup of ifobjects
selftests: xsk: Decrease sending speed
selftests: xsk: Validate tx stats on tx thread
selftests: xsk: Simplify packet validation in xsk tests
selftests: xsk: Rename worker_* functions that are not thread entry points
selftests: xsk: Disassociate umem size with packets sent
selftests: xsk: Remove end-of-test packet
...
====================
Link: https://lore.kernel.org/r/20210830225618.11634-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Clean up and consolidate ct ecache infrastructure by merging ct and
expect notifiers, from Florian Westphal.
2) Missing counters and timestamp in nfnetlink_queue and _log conntrack
information.
3) Missing error check for xt_register_template() in iptables mangle,
as a incremental fix for the previous pull request, also from
Florian Westphal.
4) Add netfilter hooks for the SRv6 lightweigh tunnel driver, from
Ryoga Sato. The hooks are enabled via nf_hooks_lwtunnel sysctl
to make sure existing netfilter rulesets do not break. There is
a static key to disable the hooks by default.
The pktgen_bench_xmit_mode_netif_receive.sh shows no noticeable
impact in the seg6_input path for non-netfilter users: similar
numbers with and without this patch.
This is a sample of the perf report output:
11.67% kpktgend_0 [ipv6] [k] ipv6_get_saddr_eval
7.89% kpktgend_0 [ipv6] [k] __ipv6_addr_label
7.52% kpktgend_0 [ipv6] [k] __ipv6_dev_get_saddr
6.63% kpktgend_0 [kernel.vmlinux] [k] asm_exc_nmi
4.74% kpktgend_0 [ipv6] [k] fib6_node_lookup_1
3.48% kpktgend_0 [kernel.vmlinux] [k] pskb_expand_head
3.33% kpktgend_0 [ipv6] [k] ip6_rcv_core.isra.29
3.33% kpktgend_0 [ipv6] [k] seg6_do_srh_encap
2.53% kpktgend_0 [ipv6] [k] ipv6_dev_get_saddr
2.45% kpktgend_0 [ipv6] [k] fib6_table_lookup
2.24% kpktgend_0 [kernel.vmlinux] [k] ___cache_free
2.16% kpktgend_0 [ipv6] [k] ip6_pol_route
2.11% kpktgend_0 [kernel.vmlinux] [k] __ipv6_addr_type
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
conntrack has two distinct table size settings:
nf_conntrack_max and nf_conntrack_buckets.
The former limits how many conntrack objects are allowed to exist
in each namespace.
The second sets the size of the hashtable.
As all entries are inserted twice (once for original direction, once for
reply), there should be at least twice as many buckets in the table than
the maximum number of conntrack objects that can exist at the same time.
Change the default multiplier to 1 and increase the chosen bucket sizes.
This results in the same nf_conntrack_max settings as before but reduces
the average bucket list length.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch introduces netfilter hooks for solving the problem that
conntrack couldn't record both inner flows and outer flows.
This patch also introduces a new sysctl toggle for enabling lightweight
tunnel netfilter hooks.
Signed-off-by: Ryoga Saito <contact@proelbtn.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, the pktgen.rst documentation doesn't cover the latest pktgen
sample usage options such as count and IPv6, and so on. Also, this
documentation includes the old sample scripts which are no longer use
because it was removed by the commit a4b6ade835 ("samples/pktgen :
remove remaining old pktgen sample scripts")
Thus, this commit documents pktgen sample usage using the latest options
and removes old sample scripts, and fixes a minor typo.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there are many drivers who support CQE mode configuration,
some configure it as a fixed when initialized, some provide an
interface to change it by ethtool private flags. In order to make it
more generic, add two new 'ETHTOOL_A_COALESCE_USE_CQE_TX' and
'ETHTOOL_A_COALESCE_USE_CQE_RX' coalesce attributes, then these
parameters can be accessed by ethtool netlink coalesce uAPI.
Also add an new structure kernel_ethtool_coalesce, then the
new parameter can be added into this struct.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As suggested by David, document a somewhat unexpected behavior that results
from net.ipv4.tcp_l3mdev_accept=1. This behavior was encountered while
debugging FRR, a VRF-aware application, on a system which used
net.ipv4.tcp_l3mdev_accept=1 and where TCP connections for BGP with MD5
keys were failing to establish.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two new methods have been introduced, add some verbiage about what they do.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function has disappeared in commit edac6f6332 ("Revert "net: dsa:
Allow drivers to filter packets they can decode source port from"").
Also, since commit 4e50025129 ("net: dsa: generalize overhead for
taggers that use both headers and trailers"), the next paragraph is no
longer true (it is still discouraged to do that, but it is now
supported, so no point in mentioning it). Delete.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the paragraphs that talk about the various modes of traffic
support, bridging with foreign interfaces, etc etc. There is nothing
that the user needs to know now, it should all work out of the box as
expected.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105 driver has removed its devlink params, so there is nothing
to see here.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series introduces the support for two new mlx5 features:
1) Sample offload for tunneled traffic
2) devlink rate objects support
1) From Chris Mi: Sample offload for tunneled traffic
=====================================================
Background and solution
-----------------------
Currently the sample offload actions send the encapsulated packet
to software. This series de-capsulates the packet before performing
the sampling and set the tunnel properties on the skb metadata
fields to make the behavior consistent with OVS sFlow.
If de-capsulating first, we can't use the same match like before in
default table. So instantiate a post action instance to continue
processing the action list. If HW can preserve reg_c, also use the
post action instance.
Post action infrastructure
--------------------------
Some tc actions are modeled in hardware using multiple tables
causing a tc action list split. For example, CT action is modeled
by jumping to a ct table which is controlled by nf flow table.
sFlow jumps in hardware to a sample table, which continues to a
"default table" where it should continue processing the action list.
Multi table actions are modeled in hardware using a unique fte_id.
The fte_id is set before jumping to a table. Split actions continue
to a post-action table where the matched fte_id value continues the
execution the tc action list.
This series also introduces post action infrastructure. Both ct and
sample use it.
Sample for tunnel in TC SW
--------------------------
tc filter add dev vxlan1 protocol ip parent ffff: prio 3 \
flower src_mac 24:25:d0:e1:00:00 dst_mac 02:25:d0:13:01:02 \
enc_src_ip 192.168.1.14 enc_dst_ip 192.168.1.13 \
enc_dst_port 4789 enc_key_id 4 \
action sample rate 1 group 6 \
action tunnel_key unset \
action mirred egress redirect dev enp4s0f0_1
MLX5 sample HW offload
----------------------
For the following typical flow table:
+-------------------------------+
+ original flow table +
+-------------------------------+
+ original match +
+-------------------------------+
+ sample action + other actions +
+-------------------------------+
We translate the tc filter with sample action to the following HW model:
+---------------------+
+ original flow table +
+---------------------+
+ original match +
+---------------------+
| set fte_id (if reg_c preserve cap)
| do decap
v
+------------------------------------------------+
+ Flow Sampler Object +
+------------------------------------------------+
+ sample ratio +
+------------------------------------------------+
+ sample table id | default table id +
+------------------------------------------------+
| |
v v
+-----------------------------+ +-------------------+
+ sample table + + default table +
+-----------------------------+ +-------------------+
+ forward to management vport + |
+-----------------------------+ |
+-------+------+
| |reg_c preserve cap
| |or decap action
v v
+-----------------+ +-------------+
+ per vport table + + post action +
+-----------------+ +-------------+
+ original match +
+-----------------+
+ other actions +
+-----------------+
2) From Dmytro Linkin: devlink rate object support for mlx5_core driver
=======================================================================
HIGH-LEVEL OVERVIEW
Devlink leaf rate objects created per vport (VF/SF, and PF on BlueField)
in switchdev mode on devlink port registration.
Implement devlink ops callbacks to create/destroy rate groups, set TX
rate values of the vport/group, assign vport to the group.
Driver accepts TX rate values as fraction of 1Mbps.
Refactor existing eswitch QoS infrastructure to be accessible by legacy
NDO rate API and new devlink rate API. NDO rate API is not
removed/disabled in switchdev mode to not break existing users. Rate
values configured with NDO rate API are not visible for devlink
infrastructure, therefore APIs should not be used simultaneously.
IMPLEMENTATION DETAILS
Driver provide two level rate hierarchy to manage bandwidth - group
level and vport level. Initially each vport added to internal unlimited
group created by default. Each rate element (vport or group) receive
bandwidth relative to its parent element (for groups the parent is a
physical link itself) in a Round Robin manner, where element get
bandwidth value according to its weight. Example:
Created four rate groups with tx_share limits:
$ devlink port function rate add \
pci/0000:06:00.0/group_1 tx_share 30gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_2 tx_share 20gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_3 tx_share 20gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_4 tx_share 10gbit
Weights created in HW for each group are relative to the bigest tx_share
value, which is 30gbit:
<group_1> 1.0
<group_2> 0.67
<group_3> 0.67
<group_4> 0.33
Assuming link speed is 50 Gbit/sec and each group can sustain such
amount of traffic, maximum bandwidth is 50 / (1.0 + 0.67 + 0.67 + 0.33)
= ~18.75 Gbit/sec. Normilized bandwidth values for groups:
<group_1> 18.75 * 1.0 = 18.75 Gbit/sec
<group_2> 18.75 * 0.67 = 12.5 Gbit/sec
<group_3> 18.75 * 0.67 = 12.5 Gbit/sec
<group_4> 18.75 * 0.33 = 6.25 Gbit/sec
If in example above group_1 doesn't produce any traffic, then maximum
bandwidth becomes 50 / (0.67 + 0.67 + 0.33) = ~30.0 Gbit/sec. Normalized
values:
<group_2> 30.0 * 0.67 = 20.0 Gbit/sec
<group_3> 30.0 * 0.67 = 20.0 Gbit/sec
<group_4> 30.0 * 0.33 = 10.0 Gbit/sec
Same normalization applied to each vport in the group.
Normalized values are internal, therefore driver provides QoS
tracepoints for next events:
* vport rate element creation/deletion:
* vport rate element configuration;
* group rate element creation/deletion;
* group rate element configuration.
PATCHES OVERVIEW
1 - Moving and isolation of eswitch QoS logic in separate file;
2 - Implement devlink leaf rate object support for vports;
3 - Implement rate groups creation/deletion;
4 - Implement TX rate management for the groups;
5 - Implement parent set for vports;
6 - Eswitch QoS tracepoints.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmEfNKEACgkQSD+KveBX
+j4DVgf/ZTX3n7xDVrqNgM3hkUOT7QKVCq5zUlDKw1IizE6I+8xB6LO3KmUPyWIn
+VBXE3c+aqUNZu4XlqdkVn1JskVdfTdAGXIIeBgMQskJ/VFCNqU4E9uieEmpiHnI
DUUkEI6eBURJSu1KPD0xdMqtdGJE+/KjwmfZFnCsa4uxmRuV7B0BdxzGIA6AMFKn
+jNS/PFbQM6bUDqP2UUwd97sThtTzDIVH86gu36yK/mdcwdLreqKeuxoHJWePWHC
qLReBC5OQ9zXD2F1Dv2u3WU7EJT7qyCLNrBUrTwHcR9N0Di+2a6lGvjRL5tjWKKC
KOrNkkviurmPp+VieJU+rHHYQwjoGQ==
=XfOY
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2021-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-08-19
This series introduces the support for two new mlx5 features:
1) Sample offload for tunneled traffic
2) devlink rate objects support
1) From Chris Mi: Sample offload for tunneled traffic
=====================================================
Background and solution
-----------------------
Currently the sample offload actions send the encapsulated packet
to software. This series de-capsulates the packet before performing
the sampling and set the tunnel properties on the skb metadata
fields to make the behavior consistent with OVS sFlow.
If de-capsulating first, we can't use the same match like before in
default table. So instantiate a post action instance to continue
processing the action list. If HW can preserve reg_c, also use the
post action instance.
Post action infrastructure
--------------------------
Some tc actions are modeled in hardware using multiple tables
causing a tc action list split. For example, CT action is modeled
by jumping to a ct table which is controlled by nf flow table.
sFlow jumps in hardware to a sample table, which continues to a
"default table" where it should continue processing the action list.
Multi table actions are modeled in hardware using a unique fte_id.
The fte_id is set before jumping to a table. Split actions continue
to a post-action table where the matched fte_id value continues the
execution the tc action list.
This series also introduces post action infrastructure. Both ct and
sample use it.
Sample for tunnel in TC SW
--------------------------
tc filter add dev vxlan1 protocol ip parent ffff: prio 3 \
flower src_mac 24:25:d0:e1:00:00 dst_mac 02:25:d0:13:01:02 \
enc_src_ip 192.168.1.14 enc_dst_ip 192.168.1.13 \
enc_dst_port 4789 enc_key_id 4 \
action sample rate 1 group 6 \
action tunnel_key unset \
action mirred egress redirect dev enp4s0f0_1
MLX5 sample HW offload
----------------------
For the following typical flow table:
+-------------------------------+
+ original flow table +
+-------------------------------+
+ original match +
+-------------------------------+
+ sample action + other actions +
+-------------------------------+
We translate the tc filter with sample action to the following HW model:
+---------------------+
+ original flow table +
+---------------------+
+ original match +
+---------------------+
| set fte_id (if reg_c preserve cap)
| do decap
v
+------------------------------------------------+
+ Flow Sampler Object +
+------------------------------------------------+
+ sample ratio +
+------------------------------------------------+
+ sample table id | default table id +
+------------------------------------------------+
| |
v v
+-----------------------------+ +-------------------+
+ sample table + + default table +
+-----------------------------+ +-------------------+
+ forward to management vport + |
+-----------------------------+ |
+-------+------+
| |reg_c preserve cap
| |or decap action
v v
+-----------------+ +-------------+
+ per vport table + + post action +
+-----------------+ +-------------+
+ original match +
+-----------------+
+ other actions +
+-----------------+
2) From Dmytro Linkin: devlink rate object support for mlx5_core driver
=======================================================================
HIGH-LEVEL OVERVIEW
Devlink leaf rate objects created per vport (VF/SF, and PF on BlueField)
in switchdev mode on devlink port registration.
Implement devlink ops callbacks to create/destroy rate groups, set TX
rate values of the vport/group, assign vport to the group.
Driver accepts TX rate values as fraction of 1Mbps.
Refactor existing eswitch QoS infrastructure to be accessible by legacy
NDO rate API and new devlink rate API. NDO rate API is not
removed/disabled in switchdev mode to not break existing users. Rate
values configured with NDO rate API are not visible for devlink
infrastructure, therefore APIs should not be used simultaneously.
IMPLEMENTATION DETAILS
Driver provide two level rate hierarchy to manage bandwidth - group
level and vport level. Initially each vport added to internal unlimited
group created by default. Each rate element (vport or group) receive
bandwidth relative to its parent element (for groups the parent is a
physical link itself) in a Round Robin manner, where element get
bandwidth value according to its weight. Example:
Created four rate groups with tx_share limits:
$ devlink port function rate add \
pci/0000:06:00.0/group_1 tx_share 30gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_2 tx_share 20gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_3 tx_share 20gbit
$ devlink port function rate add \
pci/0000:06:00.0/group_4 tx_share 10gbit
Weights created in HW for each group are relative to the bigest tx_share
value, which is 30gbit:
<group_1> 1.0
<group_2> 0.67
<group_3> 0.67
<group_4> 0.33
Assuming link speed is 50 Gbit/sec and each group can sustain such
amount of traffic, maximum bandwidth is 50 / (1.0 + 0.67 + 0.67 + 0.33)
= ~18.75 Gbit/sec. Normilized bandwidth values for groups:
<group_1> 18.75 * 1.0 = 18.75 Gbit/sec
<group_2> 18.75 * 0.67 = 12.5 Gbit/sec
<group_3> 18.75 * 0.67 = 12.5 Gbit/sec
<group_4> 18.75 * 0.33 = 6.25 Gbit/sec
If in example above group_1 doesn't produce any traffic, then maximum
bandwidth becomes 50 / (0.67 + 0.67 + 0.33) = ~30.0 Gbit/sec. Normalized
values:
<group_2> 30.0 * 0.67 = 20.0 Gbit/sec
<group_3> 30.0 * 0.67 = 20.0 Gbit/sec
<group_4> 30.0 * 0.33 = 10.0 Gbit/sec
Same normalization applied to each vport in the group.
Normalized values are internal, therefore driver provides QoS
tracepoints for next events:
* vport rate element creation/deletion:
* vport rate element configuration;
* group rate element creation/deletion;
* group rate element configuration.
PATCHES OVERVIEW
1 - Moving and isolation of eswitch QoS logic in separate file;
2 - Implement devlink leaf rate object support for vports;
3 - Implement rate groups creation/deletion;
4 - Implement TX rate management for the groups;
5 - Implement parent set for vports;
6 - Eswitch QoS tracepoints.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- bump version strings, by Simon Wunderlich
- update docs about move IRC channel away from freenode,
by Sven Eckelmann
- Switch to kstrtox.h for kstrtou64, by Sven Eckelmann
- Update NULL checks, by Sven Eckelmann (2 patches)
- remove remaining skb-copy calls for broadcast packets,
by Linus Lüssing
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmEeeK8WHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeofhWD/0YwNndM/FFo/NHcO3GFDZx9eLM
dFuO7zdMilzgg462q7+mgi0jXA2Kp50Y+JcCqS2XVRIsMgKTVABflgmSlUIOdDoC
A3KKRVgQ1HNPD4WREaEV2CLvBdhR9wEI0jRHvZou7n/VWrfJcUHgdl9aDA2/ptlP
NcuYCKC99HCQmvaBt4GZgOunYDeplmo2qLip2gpwJWf9/vkL7HiBe3HtQSh1HI2y
EIn4SExZOFcxMmKeJMsYl35OZh9oFv7nTnpZBGyKjA+HS0pu03aaPNRGMjW/pdhF
f7V61aDJBU0xU6PjWvUegY4VMInrjW8F10EEJck461J/B9PXjUHUaH8BXXuGBkRM
0kU0Cv21a3Ovz23lgnXSnXu/xjqq5/zZHjnGvyPAMMppAI5f73q/0THtv9iOu+Cz
Qf/tYl0BIRir20ZWtddQ9x2W3+cBYPOYrf/tnmWqFhPddenn+xitwTysVA6fOykQ
pVksQ5UVpDZasZI9Al+R2M0CBttn7tS/iu95PV9CMST8aRgUuU90yd2Ocg3rRDNQ
iEor0AozmO879W460BFQcTILw+D7OdlErUV8H8VW4507imZ7JXGPwZTFxhjM2Xhx
wUXo/o2sxt/ITSdtZAeQj8zOXQMtOi3KlXtTl8ZzyRT//YLWah0j4oBf0a8K62/y
i1Pd5MgXDQAm8fHkBg==
=sHz+
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-pullrequest-20210819' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- update docs about move IRC channel away from freenode,
by Sven Eckelmann
- Switch to kstrtox.h for kstrtou64, by Sven Eckelmann
- Update NULL checks, by Sven Eckelmann (2 patches)
- remove remaining skb-copy calls for broadcast packets,
by Linus Lüssing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn BPF_PROG_RUN into a proper always inlined function. No functional and
performance changes are intended, but it makes it much easier to understand
what's going on with how BPF programs are actually get executed. It's more
obvious what types and callbacks are expected. Also extra () around input
parameters can be dropped, as well as `__` variable prefixes intended to avoid
naming collisions, which makes the code simpler to read and write.
This refactoring also highlighted one extra issue. BPF_PROG_RUN is both
a macro and an enum value (BPF_PROG_RUN == BPF_PROG_TEST_RUN). Turning
BPF_PROG_RUN into a function causes naming conflict compilation error. So
rename BPF_PROG_RUN into lower-case bpf_prog_run(), similar to
bpf_prog_run_xdp(), bpf_prog_run_pin_on_cpu(), etc. All existing callers of
BPF_PROG_RUN, the macro, are switched to bpf_prog_run() explicitly.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210815070609.987780-2-andrii@kernel.org
Add documentation for two bad signal integrity substates:
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_REFERENCE_CLOCK_LOST
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_ALOS.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The msk can use backup subflows to transmit in-sequence data
only if there are no other active subflow. On active backup
scenario, the MPTCP connection can do forward progress only
due to MPTCP retransmissions - rtx can pick backup subflows.
This patch introduces a new flag flow MPTCP subflows: if the
underlying TCP connection made no progresses for long time,
and there are other less problematic subflows available, the
given subflow become stale.
Stale subflows are not considered active: if all non backup
subflows become stale, the MPTCP scheduler can pick backup
subflows for plain transmissions.
Stale subflows can return in active state, as soon as any reply
from the peer is observed.
Active backup scenarios can now leverage the available b/w
with no restrinction.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new device generic parameter to enable/disable creation of
VDPA net auxiliary device and associated device functionality
in the devlink instance.
User who prefers to disable such functionality can disable it using below
example.
$ devlink dev param set pci/0000:06:00.0 \
name enable_vnet value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0
At this point devlink instance do not create auxiliary device for the
VDPA net functionality.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new device generic parameter to enable/disable creation of
RDMA auxiliary device and associated device functionality
in the devlink instance.
User who prefers to disable such functionality can disable it using below
example.
$ devlink dev param set pci/0000:06:00.0 \
name enable_rdma value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0
At this point devlink instance do not create auxiliary device for the
RDMA functionality.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new device generic parameter to enable/disable creation of
Ethernet auxiliary device and associated device functionality
in the devlink instance.
User who prefers to disable such functionality can disable it using below
example.
$ devlink dev param set pci/0000:06:00.0 \
name enable_eth value false cmode driverinit
$ devlink dev reload pci/0000:06:00.0
At this point devlink instance do not create auxiliary device for the
Ethernet functionality.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to recent developments around the Freenode.org IRC network, the
opinions about the usage of this service shifted dramatically. The majority
of the still active users of the #batman channel prefers a move to the
hackint.org network.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Restrict range element expansion in ipset to avoid soft lockup,
from Jozsef Kadlecsik.
2) Memleak in error path for nf_conntrack_bridge for IPv4 packets,
from Yajun Deng.
3) Simplify conntrack garbage collection strategy to avoid frequent
wake-ups, from Florian Westphal.
4) Fix NFNLA_HOOK_FUNCTION_NAME string, do not include module name.
5) Missing chain family netlink attribute in chain description
in nfnetlink_hook.
6) Incorrect sequence number on nfnetlink_hook dumps.
7) Use netlink request family in reply message for consistency.
8) Remove offload_pickup sysctl, use conntrack for established state
instead, from Florian Westphal.
9) Translate NFPROTO_INET/ingress to NFPROTO_NETDEV/ingress, since
NFPROTO_INET is not exposed through nfnetlink_hook.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: nfnetlink_hook: translate inet ingress to netdev
netfilter: conntrack: remove offload_pickup sysctl again
netfilter: nfnetlink_hook: Use same family as request message
netfilter: nfnetlink_hook: use the sequence number of the request message
netfilter: nfnetlink_hook: missing chain family
netfilter: nfnetlink_hook: strip off module name from hookfn
netfilter: conntrack: collect all entries in one cycle
netfilter: nf_conntrack_bridge: Fix memory leak when error
netfilter: ipset: Limit the maximal range of consecutive elements to add/delete
====================
Link: https://lore.kernel.org/r/20210806151149.6356-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These two sysctls were added because the hardcoded defaults (2 minutes,
tcp, 30 seconds, udp) turned out to be too low for some setups.
They appeared in 5.14-rc1 so it should be fine to remove it again.
Marcelo convinced me that there should be no difference between a flow
that was offloaded vs. a flow that was not wrt. timeout handling.
Thus the default is changed to those for TCP established and UDP stream,
5 days and 120 seconds, respectively.
Marcelo also suggested to account for the timeout value used for the
offloading, this avoids increase beyond the value in the conntrack-sysctl
and will also instantly expire the conntrack entry with altered sysctls.
Example:
nf_conntrack_udp_timeout_stream=60
nf_flowtable_udp_timeout=60
This will remove offloaded udp flows after one minute, rather than two.
An earlier version of this patch also cleared the ASSURED bit to
allow nf_conntrack to evict the entry via early_drop (i.e., table full).
However, it looks like we can safely assume that connection timed out
via HW is still in established state, so this isn't needed.
Quoting Oz:
[..] the hardware sends all packets with a set FIN flags to sw.
[..] Connections that are aged in hardware are expected to be in the
established state.
In case it turns out that back-to-sw-path transition can occur for
'dodgy' connections too (e.g., one side disappeared while software-path
would have been in RETRANS timeout), we can adjust this later.
Cc: Oz Shlomo <ozsh@nvidia.com>
Cc: Paul Blakey <paulb@nvidia.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Build failure in drivers/net/wwan/mhi_wwan_mbim.c:
add missing parameter (0, assuming we don't want buffer pre-alloc).
Conflict in drivers/net/dsa/sja1105/sja1105_main.c between:
589918df93 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
0fac6aa098 ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode")
Follow the instructions from the commit message of the former commit
- removed the if conditions. When looking at commit 589918df93 ("net:
dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
note that the mask_iotag fields get removed by the following patch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are aspects of netdevsim which are commonly
misunderstood and pointed out in review. Cong
suggest we document them.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an option lacp_active, which is similar with team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames acts as "speak when spoken to".
Note, the LACPDU state frames still will be sent when init or unbind port.
v2: remove module parameter
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IF_OPER_TESTING is in fact used today.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko says:
====================
bpf-next 2021-07-30
We've added 64 non-merge commits during the last 15 day(s) which contain
a total of 83 files changed, 5027 insertions(+), 1808 deletions(-).
The main changes are:
1) BTF-guided binary data dumping libbpf API, from Alan.
2) Internal factoring out of libbpf CO-RE relocation logic, from Alexei.
3) Ambient BPF run context and cgroup storage cleanup, from Andrii.
4) Few small API additions for libbpf 1.0 effort, from Evgeniy and Hengqi.
5) bpf_program__attach_kprobe_opts() fixes in libbpf, from Jiri.
6) bpf_{get,set}sockopt() support in BPF iterators, from Martin.
7) BPF map pinning improvements in libbpf, from Martynas.
8) Improved module BTF support in libbpf and bpftool, from Quentin.
9) Bpftool cleanups and documentation improvements, from Quentin.
10) Libbpf improvements for supporting CO-RE on old kernels, from Shuyi.
11) Increased maximum cgroup storage size, from Stanislav.
12) Small fixes and improvements to BPF tests and samples, from various folks.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits)
tools: bpftool: Complete metrics list in "bpftool prog profile" doc
tools: bpftool: Document and add bash completion for -L, -B options
selftests/bpf: Update bpftool's consistency script for checking options
tools: bpftool: Update and synchronise option list in doc and help msg
tools: bpftool: Complete and synchronise attach or map types
selftests/bpf: Check consistency between bpftool source, doc, completion
tools: bpftool: Slightly ease bash completion updates
unix_bpf: Fix a potential deadlock in unix_dgram_bpf_recvmsg()
libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
tools: bpftool: Support dumping split BTF by id
libbpf: Add split BTF support for btf__load_from_kernel_by_id()
tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
tools: Free BTF objects at various locations
libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
libbpf: Rename btf__load() as btf__load_into_kernel()
libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
tools/resolve_btfids: Emit warnings and patch zero id for missing symbols
bpf: Increase supported cgroup storage value size
libbpf: Fix race when pinning maps in parallel
...
====================
Link: https://lore.kernel.org/r/20210730225606.1897330-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Document the mirroring capabilities of the dpaa2-switch driver,
any restrictions that are imposed and some example commands.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds a brief document about the sockets API provided for
sending and receiving MCTP messages from userspace.
This is roughly based on the OpenBMC design document, at:
https://github.com/openbmc/docs/blob/master/designs/mctp/mctp-kernel.md
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Append ioam6-sysctl to toctree in order to get rid of building warnings.
Signed-off-by: Hu Haowen <src.res@email.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
All other user triggered operations are gone from ndo_ioctl, so move
the SIOCBOND family into a custom operation as well.
The .ndo_ioctl() helper is no longer called by the dev_ioctl.c code now,
but there are still a few definitions in obsolete wireless drivers as well
as the appletalk and ieee802154 layers to call SIOCSIFADDR/SIOCGIFADDR
helpers from inside the kernel.
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to further reduce the scope of ndo_do_ioctl(), move
out the SIOCWANDEV handling into a new network device operation
function.
Adjust the prototype to only pass the if_settings sub-structure
in place of the ifreq, and remove the redundant 'cmd' argument
in the process.
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: "Jan \"Yenya\" Kasprzak" <kas@fi.muni.cz>
Cc: Kevin Curtis <kevin.curtis@farsite.co.uk>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Cc: Martin Schiller <ms@dev.tdt.de>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: linux-x25@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most users of ndo_do_ioctl are ethernet drivers that implement
the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware
timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP.
Separate these from the few drivers that use ndo_do_ioctl to
implement SIOCBOND, SIOCBR and SIOCWANDEV commands.
This is a purely cosmetic change intended to help readers find
their way through the implementation.
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SIOCDEVPRIVATE ioctl commands are mainly used in really old
drivers, and they have a number of problems:
- They hide behind the normal .ndo_do_ioctl function that
is also used for other things in modern drivers, so it's
hard to spot a driver that actually uses one of these
- Since drivers use a number different calling conventions,
it is impossible to support compat mode for them in
a generic way.
- With all drivers using the same 16 commands codes, there
is no way to introspect the data being passed through
things like strace.
Add a new net_device_ops callback pointer, to address the
first two of these. Separating them from .ndo_do_ioctl
makes it easy to grep for drivers with a .ndo_siocdevprivate
callback, and the unwieldy name hopefully makes it easier
to spot in code review.
By passing the ifreq structure and the ifr_data pointer
separately, it is no longer necessary to overload these,
and the driver can use either one for a given command.
Cc: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a documentation entry for the DPAA2 switch listing its
requirements, features and some examples to go along them.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a file to document devlink support for hns3 driver, now support devlink
info and devlink reload.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the DPAA2 DPIO driver chapter title by adding the necessary
overline. Without this, the index page of the DPAA2 documentation
doesn't display properly.
Fixes: d8e516bac7 ("soc: fsl: dpio: Convert DPIO documentation to .rst")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20210722100356.635078-5-ciorneiioana@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Multiple complaints have been raised from the TFO users on the internet
stating that the TFO blackhole logic is too aggressive and gets falsely
triggered too often.
(e.g. https://blog.apnic.net/2021/07/05/tcp-fast-open-not-so-fast/)
Considering that most middleboxes no longer drop TFO packets, we decide
to disable the blackhole logic by setting
/proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_set to 0 by default.
Fixes: cf1ef3f071 ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add documentation for new IOAM sysctls:
- ioam6_id and ioam6_id_wide: two per-namespace sysctls
- ioam6_enabled, ioam6_id and ioam6_id_wide: three per-interface sysctls
Example of IOAM configuration based on the following simple topology:
_____ _____ _____
| | eth0 eth0 | | eth1 eth0 | |
| A |.----------.| B |.----------.| C |
|_____| |_____| |_____|
1) Node and interface IDs can be configured for IOAM:
# IOAM ID of A = 1, IOAM ID of A.eth0 = 11
(A) sysctl -w net.ipv6.ioam6_id=1
(A) sysctl -w net.ipv6.conf.eth0.ioam6_id=11
# IOAM ID of B = 2, IOAM ID of B.eth0 = 21, IOAM ID of B.eth1 = 22
(B) sysctl -w net.ipv6.ioam6_id=2
(B) sysctl -w net.ipv6.conf.eth0.ioam6_id=21
(B) sysctl -w net.ipv6.conf.eth1.ioam6_id=22
# IOAM ID of C = 3, IOAM ID of C.eth0 = 31
(C) sysctl -w net.ipv6.ioam6_id=3
(C) sysctl -w net.ipv6.conf.eth0.ioam6_id=31
Note that "_wide" IDs equivalents can be configured the same way.
2) Each node can be configured to form an IOAM domain. For instance,
we allow IOAM from A to C only (not the reverse path), i.e. enable
IOAM on ingress for B.eth0 and C.eth0:
(B) sysctl -w net.ipv6.conf.eth0.ioam6_enabled=1
(C) sysctl -w net.ipv6.conf.eth0.ioam6_enabled=1
3) An IOAM domain (e.g. ID=123) is defined and made known to each node:
(A) ip ioam namespace add 123
(B) ip ioam namespace add 123
(C) ip ioam namespace add 123
4) Finally, an IOAM Pre-allocated Trace can be inserted in traffic sent
by A when C (e.g. db02::2) is the destination:
(A) ip -6 route add db02::2/128 encap ioam6 trace type 0x800000 ns 123
size 12 dev eth0
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new heading for extensions to make it more readable. Also, add one
more example of filtering interface index for better understanding.
Signed-off-by: Roy, UjjaL <royujjal@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/CAADnVQJ=DoRDcVkaXmY3EmNdLoO7gq1mkJOn5G=00wKH8qUtZQ@mail.gmail.com
Andrii Nakryiko says:
====================
pull-request: bpf 2021-07-15
The following pull-request contains BPF updates for your *net* tree.
We've added 9 non-merge commits during the last 5 day(s) which contain
a total of 9 files changed, 37 insertions(+), 15 deletions(-).
The main changes are:
1) Fix NULL pointer dereference in BPF_TEST_RUN for BPF_XDP_DEVMAP and
BPF_XDP_CPUMAP programs, from Xuan Zhuo.
2) Fix use-after-free of net_device in XDP bpf_link, from Xuan Zhuo.
3) Follow-up fix to subprog poke descriptor use-after-free problem, from
Daniel Borkmann and John Fastabend.
4) Fix out-of-range array access in s390 BPF JIT backend, from Colin Ian King.
5) Fix memory leak in BPF sockmap, from John Fastabend.
6) Fix for sockmap to prevent proc stats reporting bug, from John Fastabend
and Jakub Sitnicki.
7) Fix NULL pointer dereference in bpftool, from Tobias Klauser.
8) AF_XDP documentation fixes, from Baruch Siach.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Current release - regressions:
- sock: fix parameter order in sock_setsockopt()
Current release - new code bugs:
- netfilter: nft_last:
- fix incorrect arithmetic when restoring last used
- honor NFTA_LAST_SET on restoration
Previous releases - regressions:
- udp: properly flush normal packet at GRO time
- sfc: ensure correct number of XDP queues; don't allow enabling the
feature if there isn't sufficient resources to Tx from any CPU
- dsa: sja1105: fix address learning getting disabled on the CPU port
- mptcp: addresses a rmem accounting issue that could keep packets
in subflow receive buffers longer than necessary, delaying
MPTCP-level ACKs
- ip_tunnel: fix mtu calculation for ETHER tunnel devices
- do not reuse skbs allocated from skbuff_fclone_cache in the napi
skb cache, we'd try to return them to the wrong slab cache
- tcp: consistently disable header prediction for mptcp
Previous releases - always broken:
- bpf: fix subprog poke descriptor tracking use-after-free
- ipv6:
- allocate enough headroom in ip6_finish_output2() in case
iptables TEE is used
- tcp: drop silly ICMPv6 packet too big messages to avoid
expensive and pointless lookups (which may serve as a DDOS
vector)
- make sure fwmark is copied in SYNACK packets
- fix 'disable_policy' for forwarded packets (align with IPv4)
- netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
- netfilter: conntrack: do not mark RST in the reply direction coming
after SYN packet for an out-of-sync entry
- mptcp: cleanly handle error conditions with MP_JOIN and syncookies
- mptcp: fix double free when rejecting a join due to port mismatch
- validate lwtstate->data before returning from skb_tunnel_info()
- tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path
- mt76: mt7921: continue to probe driver when fw already downloaded
- bonding: fix multiple issues with offloading IPsec to (thru?) bond
- stmmac: ptp: fix issues around Qbv support and setting time back
- bcmgenet: always clear wake-up based on energy detection
Misc:
- sctp: move 198 addresses from unusable to private scope
- ptp: support virtual clocks and timestamping
- openvswitch: optimize operation for key comparison
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDu3mMACgkQMUZtbf5S
Irsjxg//UwcPJMYFmXV+fGkEsWYe1Kf29FcUDEeANFtbltfAcIfZ0GoTbSDRnrVb
HcYAKcm4XRx5bWWdQrQsQq/yiLbnS/rSLc7VRB+uRHWRKl3eYcaUB2rnCXsxrjGw
wQJgOmztDCJS4BIky24iQpF/8lg7p/Gj2Ih532gh93XiYo612FrEJKkYb2/OQfYX
GkbnZ0kL2Y1SV+bhy6aT5azvhHKM4/3eA4fHeJ2p8e2gOZ5ni0vpX0xEzdzKOCd0
vwR/Wu3h/+2QuFYVcSsVguuM++JXACG8MAS/Tof78dtNM4a3kQxzqeh5Bv6IkfTu
rokENLq4pjNRy+nBAOeQZj8Jd0K0kkf/PN9WMdGQtplMoFhjjV25R6PeRrV9wwPo
peozIz2MuQo7Kfof1D+44h2foyLfdC28/Z0CvRbDpr5EHOfYynvBbrnhzIGdQp6V
xgftKTOdgz2Djgg8HiblZund1FA44OYerddVAASrIsnSFnIz1VLVQIsfV+GLBwwc
FawrIZ6WfIjzRSrDGOvDsbAQI47T/1jbaPJeK6XgjWkQmjEd6UtRWRZLYCxemQEw
4HP3sWC96BOehuD8ylipVE1oFqrxCiOB/fZxezXqjo8dSX3NLdak4cCHTHoW5SuZ
eEAxQRaBliKd+P7hoy9cZ57CAu3zUa8kijfM5QRlCAHF+zSxaPs=
=QFnb
-----END PGP SIGNATURE-----
Merge tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski.
"Including fixes from bpf and netfilter.
Current release - regressions:
- sock: fix parameter order in sock_setsockopt()
Current release - new code bugs:
- netfilter: nft_last:
- fix incorrect arithmetic when restoring last used
- honor NFTA_LAST_SET on restoration
Previous releases - regressions:
- udp: properly flush normal packet at GRO time
- sfc: ensure correct number of XDP queues; don't allow enabling the
feature if there isn't sufficient resources to Tx from any CPU
- dsa: sja1105: fix address learning getting disabled on the CPU port
- mptcp: addresses a rmem accounting issue that could keep packets in
subflow receive buffers longer than necessary, delaying MPTCP-level
ACKs
- ip_tunnel: fix mtu calculation for ETHER tunnel devices
- do not reuse skbs allocated from skbuff_fclone_cache in the napi
skb cache, we'd try to return them to the wrong slab cache
- tcp: consistently disable header prediction for mptcp
Previous releases - always broken:
- bpf: fix subprog poke descriptor tracking use-after-free
- ipv6:
- allocate enough headroom in ip6_finish_output2() in case
iptables TEE is used
- tcp: drop silly ICMPv6 packet too big messages to avoid
expensive and pointless lookups (which may serve as a DDOS
vector)
- make sure fwmark is copied in SYNACK packets
- fix 'disable_policy' for forwarded packets (align with IPv4)
- netfilter: conntrack:
- do not renew entry stuck in tcp SYN_SENT state
- do not mark RST in the reply direction coming after SYN packet
for an out-of-sync entry
- mptcp: cleanly handle error conditions with MP_JOIN and syncookies
- mptcp: fix double free when rejecting a join due to port mismatch
- validate lwtstate->data before returning from skb_tunnel_info()
- tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path
- mt76: mt7921: continue to probe driver when fw already downloaded
- bonding: fix multiple issues with offloading IPsec to (thru?) bond
- stmmac: ptp: fix issues around Qbv support and setting time back
- bcmgenet: always clear wake-up based on energy detection
Misc:
- sctp: move 198 addresses from unusable to private scope
- ptp: support virtual clocks and timestamping
- openvswitch: optimize operation for key comparison"
* tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits)
net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave()
sfc: add logs explaining XDP_TX/REDIRECT is not available
sfc: ensure correct number of XDP queues
sfc: fix lack of XDP TX queues - error XDP TX failed (-22)
net: fddi: fix UAF in fza_probe
net: dsa: sja1105: fix address learning getting disabled on the CPU port
net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload
net: Use nlmsg_unicast() instead of netlink_unicast()
octeontx2-pf: Fix uninitialized boolean variable pps
ipv6: allocate enough headroom in ip6_finish_output2()
net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific
net: bridge: multicast: fix MRD advertisement router port marking race
net: bridge: multicast: fix PIM hello router port marking race
net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340
dsa: fix for_each_child.cocci warnings
virtio_net: check virtqueue_add_sgs() return value
mptcp: properly account bulk freed memory
selftests: mptcp: fix case multiple subflows limited by server
mptcp: avoid processing packet if a subflow reset
mptcp: fix syncookie process if mptcp can not_accept new subflow
...
This patch adds a new sysctl tcp_ignore_invalid_rst to disable marking
out of segments RSTs as INVALID.
Signed-off-by: Ali Abdallah <aabdallah@suse.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Here is the big set of tty and serial driver patches for 5.14-rc1.
A bit more than normal, but nothing major, lots of cleanups. Highlights
are:
- lots of tty api cleanups and mxser driver cleanups from Jiri
- build warning fixes
- various serial driver updates
- coding style cleanups
- various tty driver minor fixes and updates
- removal of broken and disable r3964 line discipline (finally!)
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYOM4qQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylKvQCfbh+OmTkDlDlDhSWlxuV05M1XTXoAoLUcLZru
s5JCnwSZztQQLMDHj7Pd
=Zupm
-----END PGP SIGNATURE-----
Merge tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial updates from Greg KH:
"Here is the big set of tty and serial driver patches for 5.14-rc1.
A bit more than normal, but nothing major, lots of cleanups.
Highlights are:
- lots of tty api cleanups and mxser driver cleanups from Jiri
- build warning fixes
- various serial driver updates
- coding style cleanups
- various tty driver minor fixes and updates
- removal of broken and disable r3964 line discipline (finally!)
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (227 commits)
serial: mvebu-uart: remove unused member nb from struct mvebu_uart
arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART
dt-bindings: mvebu-uart: fix documentation
serial: mvebu-uart: correctly calculate minimal possible baudrate
serial: mvebu-uart: do not allow changing baudrate when uartclk is not available
serial: mvebu-uart: fix calculation of clock divisor
tty: make linux/tty_flip.h self-contained
serial: Prefer unsigned int to bare use of unsigned
serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs
serial: qcom_geni_serial: use DT aliases according to DT bindings
Revert "tty: serial: Add UART driver for Cortina-Access platform"
tty: serial: Add UART driver for Cortina-Access platform
MAINTAINERS: add me back as mxser maintainer
mxser: Documentation, fix typos
mxser: Documentation, make the docs up-to-date
mxser: Documentation, remove traces of callout device
mxser: introduce mxser_16550A_or_MUST helper
mxser: rename flags to old_speed in mxser_set_serial_info
mxser: use port variable in mxser_set_serial_info
mxser: access info->MCR under info->slock
...
kernel-doc for TIPC is too simple, we need to add more information for it.
This patch is to extend the abstract, and add the Features and Links items.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an interface for getting PHC (PTP Hardware Clock)
virtual clocks, which are based on PHC physical clock
providing hardware timestamp to network packets.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener
to another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance
(for pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require
jump labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast address
allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks
in NAPI context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
(our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm 60GHz WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S
Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX
M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE
mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS
OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/
w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc
HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/
io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y
5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF
78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh
Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu
hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4=
=LvtX
-----END PGP SIGNATURE-----
Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener to
another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance (for
pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require jump
labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast
address allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks in NAPI
context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and
NXP (our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support"
* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
We want to add reference counting for FDB entries in cross-chip
topologies, and in order for that to have any chance of working and not
be unbalanced (leading to entries which are never deleted), we need to
ensure that higher layers are sane, because if they aren't, it's garbage
in, garbage out.
For example, if we add a bridge FDB entry twice, the bridge properly
errors out:
$ bridge fdb add dev swp0 00:01:02:03:04:07 master static
$ bridge fdb add dev swp0 00:01:02:03:04:07 master static
RTNETLINK answers: File exists
However, the same thing cannot be said about the bridge bypass
operations:
$ bridge fdb add dev swp0 00:01:02:03:04:07
$ bridge fdb add dev swp0 00:01:02:03:04:07
$ bridge fdb add dev swp0 00:01:02:03:04:07
$ bridge fdb add dev swp0 00:01:02:03:04:07
$ echo $?
0
But one 'bridge fdb del' is enough to remove the entry, no matter how
many times it was added.
The bridge bypass operations are impossible to maintain in these
circumstances and lack of support for reference counting the cross-chip
notifiers is holding us back from making further progress, so just drop
support for them. The only way left for users to install static bridge
FDB entries is the proper one, using the "master static" flags.
With this change, rtnl_fdb_add() falls back to calling
ndo_dflt_fdb_add() which uses the duplicate-exclusive variant of
dev_uc_add(): dev_uc_add_excl(). Because DSA does not (yet) declare
IFF_UNICAST_FLT, this results in us going to promiscuous mode:
$ bridge fdb add dev swp0 00:01:02:03:04:05
[ 28.206743] device swp0 entered promiscuous mode
$ bridge fdb add dev swp0 00:01:02:03:04:05
RTNETLINK answers: File exists
So even if it does not completely fail, there is at least some indication
that it is behaving differently from before, and closer to user space
expectations, I would argue (the lack of a "local|static" specifier
defaults to "local", or "host-only", so dev_uc_add() is a reasonable
default implementation). If the generic implementation of .ndo_fdb_add
provided by Vlad Yasevich is a proof of anything, it only proves that
the implementation provided by DSA was always wrong, by not looking at
"ndm->ndm_state & NUD_NOARP" (the "static" flag which means that the FDB
entry points outwards) and "ndm->ndm_state & NUD_PERMANENT" (the "local"
flag which means that the FDB entry points towards the host). It all
used to mean the same thing to DSA.
Update the documentation so that the users are not confused about what's
going on.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-06-28
The following pull-request contains BPF updates for your *net-next* tree.
We've added 37 non-merge commits during the last 12 day(s) which contain
a total of 56 files changed, 394 insertions(+), 380 deletions(-).
The main changes are:
1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney.
2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski.
3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev.
4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria.
5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards.
6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa.
7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi.
8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim.
9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young.
10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer.
11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski.
12) Fix up bpfilter startup log-level to info level, from Gary Lin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These is no need to wait for 'interval' period for the next probe
if the last probe is already acked in search state. The 'interval'
period waiting should be only for probe failure timeout and the
current pmtu check when it's in search complete state.
This change will shorten the probe time a lot in search state, and
also fix the document accordingly.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denote that the new switch generation is supported, detail its pin
strapping options (with differences compared to SJA1105) and explain how
MDIO access to the internal 100base-T1 and 100base-TX PHYs is performed.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DQO is a new descriptor format for our next generation virtual NIC.
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Skip non-SCTP packets in the new SCTP chunk support for nft_exthdr,
from Phil Sutter.
2) Simplify TCP option sanity check for TCP packets, also from Phil.
3) Add a new expression to store when the rule has been used last time.
4) Pass the hook state object to log function, from Florian Westphal.
5) Document the new sysctl knobs to tune the flowtable timeouts,
from Oz Shlomo.
6) Fix snprintf error check in the new nfnetlink_hook infrastructure,
from Dan Carpenter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Examples in this document use all kinds of indentation from 3 to 5
spaces and even mixed with tabs. Making them all even and equal to
4 spaces.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20210622185647.3705104-1-i.maximets@ovn.org
This patch added a new sysctl, named allow_join_initial_addr_port, to
control whether allow peers to send join requests to the IP address and
port number used by the initial subflow.
Suggested-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PLPMTUD can be enabled by doing 'sysctl -w net.sctp.probe_interval=n'.
'n' is the interval for PLPMTUD probe timer in milliseconds, and it
can't be less than 5000 if it's not 0.
All asoc/transport's PLPMTUD in a new socket will be enabled by default.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel assumes bank 0 when 'ETHTOOL_MSG_MODULE_EEPROM_GET' is sent
without 'ETHTOOL_A_MODULE_EEPROM_BANK'.
Document it as part of the interface documentation.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'ETHTOOL_A_MODULE_EEPROM_DATA' is a binary attribute, not a nested one.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The command is called 'ETHTOOL_MSG_MODULE_EEPROM_GET', not
'ETHTOOL_MSG_MODULE_EEPROM'.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a new sysctl, named checksum_enabled, to control
whether DSS checksum can be enabled.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conversion tools used during DocBook/LaTeX/html/Markdown->ReST
conversion and some cut-and-pasted text contain some characters that
aren't easily reachable on standard keyboards and/or could cause
troubles when parsed by the documentation build system.
Replace the occurences of the following characters:
- U+00a0 (' '): NO-BREAK SPACE
as it can cause lines being truncated on PDF output
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/9bd9f5c067c4b068a974730a14fe8d68e1be0c9a.1623826294.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-06-17
The following pull-request contains BPF updates for your *net-next* tree.
We've added 50 non-merge commits during the last 25 day(s) which contain
a total of 148 files changed, 4779 insertions(+), 1248 deletions(-).
The main changes are:
1) BPF infrastructure to migrate TCP child sockets from a listener to another
in the same reuseport group/map, from Kuniyuki Iwashima.
2) Add a provably sound, faster and more precise algorithm for tnum_mul() as
noted in https://arxiv.org/abs/2105.05398, from Harishankar Vishwanathan.
3) Streamline error reporting changes in libbpf as planned out in the
'libbpf: the road to v1.0' effort, from Andrii Nakryiko.
4) Add broadcast support to xdp_redirect_map(), from Hangbin Liu.
5) Extends bpf_map_lookup_and_delete_elem() functionality to 4 more map
types, that is, {LRU_,PERCPU_,LRU_PERCPU_,}HASH, from Denis Salopek.
6) Support new LLVM relocations in libbpf to make them more linker friendly,
also add a doc to describe the BPF backend relocations, from Yonghong Song.
7) Silence long standing KUBSAN complaints on register-based shifts in
interpreter, from Daniel Borkmann and Eric Biggers.
8) Add dummy PT_REGS macros in libbpf to fail BPF program compilation when
target arch cannot be determined, from Lorenz Bauer.
9) Extend AF_XDP to support large umems with 1M+ pages, from Magnus Karlsson.
10) Fix two minor libbpf tc BPF API issues, from Kumar Kartikeya Dwivedi.
11) Move libbpf BPF_SEQ_PRINTF/BPF_SNPRINTF macros that can be used by BPF
programs to bpf_helpers.h header, from Florent Revest.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes .rst file warnings seen on linux-next build.
Fixes: f7af616c63 ("net: iosm: infrastructure")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds a new sysctl option: net.ipv4.tcp_migrate_req. If this
option is enabled or eBPF program is attached, we will be able to migrate
child sockets from a listener to another in the same reuseport group after
close() or shutdown() syscalls.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210612123224.12525-2-kuniyu@amazon.co.jp
Add documentation for the devlink feature prestera switchdev driver supports:
add description for the support of the driver-specific devlink traps
(include both traps with action TRAP and action DROP);
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 25gbase-r phy interface mode
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some really really weird switches just couldn't decide whether to use a
normal or a tail tagger, so they just did both.
This creates problems for DSA, because we only have the concept of an
'overhead' which can be applied to the headroom or to the tailroom of
the skb (like for example during the central TX reallocation procedure),
depending on the value of bool tail_tag, but not to both.
We need to generalize DSA to cater for these odd switches by
transforming the 'overhead / tail_tag' pair into 'needed_headroom /
needed_tailroom'.
The DSA master's MTU is increased to account for both.
The flow dissector code is modified such that it only calls the DSA
adjustment callback if the tagger has a non-zero header length.
Taggers are trivially modified to declare either needed_headroom or
needed_tailroom, based on the tail_tag value that they currently
declare.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement support for pushing vlan header into untagged packet on ingress
of port that has pvid configured and support for popping vlan on egress of
port that has the matching vlan configured as untagged. To support such
configurations packet reformat contexts of {INSERT|REMOVE}_HEADER types are
created per such vlan and saved to struct mlx5_esw_bridge_vlan which allows
all FDB entries on particular vlan to share single packet reformat
instance. When initializing FDB entries with pvid or untagged vlan type set
its mlx5_flow_act->pkt_reformat action accordingly.
Flush all flows when removing vlan from port. This is necessary because
even though software bridge removes all FDB entries before removing their
vlan, mlx5 bridge implementation deletes their corresponding flow entries
from hardware in asynchronous workqueue task, which will cause firmware
error if vlan packet reformat context is deleted before all flows that
point to it.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add support for FDB vlan-tagged entries. Extend ingress and egress flow
tables with flow groups to match packet vlan tag. Modify the flow creation
code to include vlan tag, if vlan is configured on port and vlan
configuration is supported for offload.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Hardware supported by mlx5 driver doesn't provide learning and requires the
driver to emulate all switch-like behavior in software. As such, all
packets by default go through miss path, appear on representor and get to
software bridge, if it is the upper device of the representor. This causes
bridge to process packet in software, learn the MAC address to FDB and send
SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers.
In order to offload FDB entries in mlx5, register switchdev notifier
callback and implement support for both 'added_by_user' and dynamic FDB
entry SWITCHDEV_FDB_ADD_TO_DEVICE events asynchronously using new
mlx5_esw_bridge_offloads->wq ordered workqueue. In workqueue callback
offload the ingress rule (matching FDB entry MAC as packet source MAC) and
egress table rule (matching FDB entry MAC as destination MAC). For ingress
table rule also match source vport to ensure that only traffic coming from
expected bridge port is matched by offloaded rule. Save all the relevant
FDB entry data in struct mlx5_esw_bridge_fdb_entry instance and insert the
instance in new mlx5_esw_bridge->fdb_list list (for traversing all entries
by software ageing implementation in following patch) and in new
mlx5_esw_bridge->fdb_ht hash table for fast retrieval. Notify the bridge
that FDB entry has been offloaded by sending SWITCHDEV_FDB_OFFLOADED
notification.
Delete FDB entry on reception of SWITCHDEV_FDB_DEL_TO_DEVICE event.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The documentation file used to be written in markdown format but was
converted to reStructuredText (rst).
The converted file doesn't keep up with rst format requirements which
results in hard-to-read text.
This patch fixes the formatting of the file. The patch also
* Highlights and emphasizes some lines to improve readability
* Rephrases some hard-to-understand text
* Updates outdated function descriptions.
* Removes TSO description which falsely claims the driver supports it
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devlink rate objects section at devlink port documentation.
Add devlink rate support info at netdevsim devlink documentation.
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding documentation explaining the new MAPv4/v5 packet formats
and the corresponding checksum offload headers.
Signed-off-by: Sharath Chandra Vurukala <sharathv@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid confusions, it seems better to parse this sysctl parameter as a
boolean. We use it as a boolean, no need to parse an integer and bring
confusions if we see a value different from 0 and 1, especially with
this parameter name: enabled.
It seems fine to do this modification because the default value is 1
(enabled). Then the only other interesting value to set is 0 (disabled).
All other values would not have changed the default behaviour.
Suggested-by: Florian Westphal <fw@strlen.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new multipath hash policy where the packet fields used for hash
calculation are determined by user space via the
fib_multipath_hash_fields sysctl that was introduced in the previous
patch.
The current set of available packet fields includes both outer and inner
fields, which requires two invocations of the flow dissector. Avoid
unnecessary dissection of the outer or inner flows by skipping
dissection if none of the outer or inner fields are required.
In accordance with the existing policies, when an skb is not available,
packet fields are extracted from the provided flow key. In which case,
only outer fields are considered.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A subsequent patch will add a new multipath hash policy where the packet
fields used for multipath hash calculation are determined by user space.
This patch adds a sysctl that allows user space to set these fields.
The packet fields are represented using a bitmask and are common between
IPv4 and IPv6 to allow user space to use the same numbering across both
protocols. For example, to hash based on standard 5-tuple:
# sysctl -w net.ipv6.fib_multipath_hash_fields=0x0037
net.ipv6.fib_multipath_hash_fields = 0x0037
To avoid introducing holes in 'struct netns_sysctl_ipv6', move the
'bindv6only' field after the multipath hash fields.
The kernel rejects unknown fields, for example:
# sysctl -w net.ipv6.fib_multipath_hash_fields=0x1000
sysctl: setting key "net.ipv6.fib_multipath_hash_fields": Invalid argument
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new multipath hash policy where the packet fields used for hash
calculation are determined by user space via the
fib_multipath_hash_fields sysctl that was introduced in the previous
patch.
The current set of available packet fields includes both outer and inner
fields, which requires two invocations of the flow dissector. Avoid
unnecessary dissection of the outer or inner flows by skipping
dissection if none of the outer or inner fields are required.
In accordance with the existing policies, when an skb is not available,
packet fields are extracted from the provided flow key. In which case,
only outer fields are considered.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A subsequent patch will add a new multipath hash policy where the packet
fields used for multipath hash calculation are determined by user space.
This patch adds a sysctl that allows user space to set these fields.
The packet fields are represented using a bitmask and are common between
IPv4 and IPv6 to allow user space to use the same numbering across both
protocols. For example, to hash based on standard 5-tuple:
# sysctl -w net.ipv4.fib_multipath_hash_fields=0x0037
net.ipv4.fib_multipath_hash_fields = 0x0037
The kernel rejects unknown fields, for example:
# sysctl -w net.ipv4.fib_multipath_hash_fields=0x1000
sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument
More fields can be added in the future, if needed.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Group the flow flags under a single struct called flow. The new struct
contains 'stopped' and 'tco_stopped' bools which used to be bits in a
bitfield. The struct also contains the lock protecting them to
potentially share the same cache line.
Note that commit c545b66c69 (tty: Serialize tcflow() with other tty
flow control changes) added a padding to the original bitfield. It was
for the bitfield to occupy a whole 64b word to avoid interferring stores
on Alpha (cannot we evaporate this arch with weird implications to C
code yet?). But it doesn't work as expected as the padding
(tty_struct::unused) is aligned to a 8B boundary too and occupies some
bytes from the next word.
So make it reliable by:
1) setting __aligned of the struct -- that aligns the start, and
2) making 'unsigned long unused[0]' as the last member of the struct --
pads the end.
This is also the perfect time to start the documentation of tty_struct
where all this lives. So we start by documenting what these bools
actually serve for. And why we do all the alignment dances. Only the few
up-to-date information from the Theodore's comment made it into this new
Kerneldoc comment.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: "Maciej W. Rozycki" <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/20210505091928.22010-13-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Probably because the original file was pre-processed by some
tool, both i40e.rst and iavf.rst files are using this character:
- U+2013 ('–'): EN DASH
meaning an hyphen when calling a command line application, which
is obviously wrong. So, replace them by an hyphen, ensuring
that it will be properly displayed as literals when building
the documentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/95eb2a48d0ca3528780ce0dfce64359977fa8cb3.1620744606.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-04-28
The main changes are:
1) Add link detach and following re-attach for trampolines, from Jiri Olsa.
2) Use kernel's "binary printf" lib for formatted output BPF helpers (which
avoids the needs for variadic argument handling), from Florent Revest.
3) Fix verifier 64 to 32 bit min/max bound propagation, from Daniel Borkmann.
4) Convert cpumap to use netif_receive_skb_list(), from Lorenzo Bianconi.
5) Add generic batched-ops support to percpu array map, from Pedro Tammela.
6) Various CO-RE relocation BPF selftests fixes, from Andrii Nakryiko.
7) Misc doc rst fixes, from Hengqi Chen.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
bpf, docs: Fix literal block for example code
bpf, cpumap: Bulk skb using netif_receive_skb_list
bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
bpf: Lock bpf_trace_printk's tmp buf before it is written to
selftests/bpf: Fix core_reloc test runner
selftests/bpf: Fix field existence CO-RE reloc tests
selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE
selftests/bpf: Add remaining ASSERT_xxx() variants
selftests/bpf: Use ASSERT macros in lsm test
selftests/bpf: Test that module can't be unloaded with attached trampoline
selftests/bpf: Add re-attach test to lsm test
selftests/bpf: Add re-attach test to fexit_test
selftests/bpf: Add re-attach test to fentry_test
bpf: Allow trampoline re-attach for tracing and lsm programs
====================
Link: https://lore.kernel.org/r/20210427233740.22238-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update timestamping doc for DSA switches to describe current
implementation accurately. On TX, the skb cloning is no longer
in DSA generic code.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some parts of the documentation may lead the reader to think that the
socket's own frames are always received when CAN_RAW_RECV_OWN_MSGS is
enabled, but all frames are subject to filtering.
As explained by Marc Kleine-Budde:
On TX complete of a CAN frame it's pushed into the RX path of the
networking stack, along with the information of the originating socket.
Then the CAN frame is delivered into AF_CAN, where it is passed on to
all registered receivers depending on filters. One receiver is the
sending socket in CAN_RAW. Then in CAN_RAW the it is checked if the
sending socket has RECV_OWN_MSGS enabled.
Link: https://lore.kernel.org/r/20210420191212.42753-1-erik@flodin.me
Signed-off-by: Erik Flodin <erik@flodin.me>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
ETHTOOL_MSG_MODULE_EEPROM_GET is missing from the list of messages.
ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY is sadly a rather long name
so we need to adjust column length.
v2: use spaces (Andrew)
Fixes: c781ff12a2 ("ethtool: Allow network drivers to dump arbitrary EEPROM data")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
- keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
- fix build after move to net_generic
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make the lack of expectations for switching NICs explicit,
describe the new stats.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly to pause statistics add stats for FEC.
The IEEE standard mandates two sets of counters:
- 30.5.1.1.17 aFECCorrectedBlocks
- 30.5.1.1.18 aFECUncorrectableBlocks
where block is a block of bits FEC operates on.
Each of these counters is defined per lane (PCS instance).
Multiple vendors provide number of corrected _bits_ rather
than/as well as blocks.
This set adds the 2 standard-based block counters and a extra
one for corrected bits.
Counters are exposed to user space via netlink in new attributes.
Each attribute carries an array of u64s, first element is
the total count, and the following ones are a per-lane break down.
Much like with pause stats the operation will not fail when driver
does not implement the get_fec_stats callback (nor can the driver
fail the operation by returning an error). If stats can't be
reported the relevant attributes will be empty.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let's have all seg6 sysctl at the same place.
Fixes: a6dc6670cd ("ipv6: sr: Add documentation for seg_flowlabel sysctl")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently each packet inserted in eswitch is tagged with a internal
metadata to indicate source vport. Metadata tagging is not always
needed. Metadata insertion is needed for multi-port RoCE, failover
between representors and stacked devices. In many other cases,
metadata enablement is not needed.
Metadata insertion slows down the packet processing rate of the E-switch
when it is in switchdev mode.
Below table show performance gain with metadata disabled for VXLAN
offload rules in both SMFS and DMFS steering mode on ConnectX-5 device.
----------------------------------------------
| steering | metadata | pkt size | rx pps |
| mode | | | (million) |
----------------------------------------------
| smfs | disabled | 128Bytes | 42 |
----------------------------------------------
| smfs | enabled | 128Bytes | 36 |
----------------------------------------------
| dmfs | disabled | 128Bytes | 42 |
----------------------------------------------
| dmfs | enabled | 128Bytes | 36 |
----------------------------------------------
Hence, allow user to disable metadata using driver specific devlink
parameter. Metadata setting of the eswitch is applicable only for the
switchdev mode.
Example to show and disable metadata before changing eswitch mode:
$ devlink dev param show pci/0000:06:00.0 name esw_port_metadata
pci/0000:06:00.0:
name esw_port_metadata type driver-specific
values:
cmode runtime value true
$ devlink dev param set pci/0000:06:00.0 \
name esw_port_metadata value false cmode runtime
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
changelog:
v1->v2:
- added performance numbers in commit log
- updated commit log and documentation for switchdev mode
- added explicit note on when user can disable metadata in
documentation
Define get_module_eeprom_by_page() ethtool callback and implement
netlink infrastructure.
get_module_eeprom_by_page() allows network drivers to dump a part of
module's EEPROM specified by page and bank numbers along with offset and
length. It is effectively a netlink replacement for get_module_info()
and get_module_eeprom() pair, which is needed due to emergence of
complex non-linear EEPROM layouts.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
MAINTAINERS
- keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
- simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
- trivial
include/linux/ethtool.h
- trivial, fix kdoc while at it
include/linux/skmsg.h
- move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
- add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
- trivial
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Quotes to backticks. All commands use backticks since the names
are constants.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix incorrect documentation. Mostly referring to other objects,
likely because the text was copied and not adjusted.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
X.25 Layer 3 (the Packet Layer) expects layer 2 to provide a reliable
datalink service such that no packets are reordered or dropped. And
X.25 Layer 2 (the LAPB layer) is indeed designed to provide such service.
However, this reliability is not preserved when a driver calls "netif_rx"
to deliver the received packets to layer 3, because "netif_rx" will put
the packets into per-CPU queues before they are delivered to layer 3.
If there are multiple CPUs, the order of the packets may not be preserved.
The per-CPU queues may also drop packets if there are too many.
Therefore, we should not call "netif_rx" to let it queue the packets.
Instead, we should use our own queue that won't reorder or drop packets.
This patch changes all X.25 drivers to use their own queues instead of
calling "netif_rx". The patch also documents this requirement in the
"x25-iface" documentation.
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is overlapp between ip_local_port_range and ip_local_reserved_ports with a huge reserved block, it will affect probability of selecting ephemeral ports, see file net/ipv4/inet_hashtables.c:723
int __inet_hash_connect(
...
for (i = 0; i < remaining; i += 2, port += 2) {
if (unlikely(port >= high))
port -= remaining;
if (inet_is_local_reserved_port(net, port))
continue;
E.g. if there is reserved block of 10000 ports, two ports right after this block will be 5000 more likely selected than others.
If this was intended, we can/should add note into documentation as proposed in this commit, otherwise we should think about different solution. One option could be mapping table of continuous port ranges. Second option could be letting user to modify step (port+=2) in above loop, e.g. using new sysctl parameter.
Signed-off-by: Otto Hollmann <otto.hollmann@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add FEC API to netlink.
This is not a 1-to-1 conversion.
FEC settings already depend on link modes to tell user which
modes are supported. Take this further an use link modes for
manual configuration. Old struct ethtool_fecparam is still
used to talk to the drivers, so we need to translate back
and forth. We can revisit the internal API if number of FEC
encodings starts to grow.
Enforce only one active FEC bit (by using a bit position
rather than another mask).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Section 8 of RFC 8335 specifies potential security concerns of
responding to PROBE requests, and states that nodes that support PROBE
functionality MUST be able to enable/disable responses and that
responses MUST be disabled by default
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a document describing the principles behind resilient next-hop groups,
and some notes about how to configure and offload them.
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
... cannot be used in block quote, it breaks compilation, remove it.
Fix warnings due to missing blank line such as:
net-next/Documentation/networking/nf_flowtable.rst:142: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: 143490cde5 ("docs: nf_flowtable: update documentation with enhancements")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates the flowtable documentation to describe recent
enhancements:
- Offload action is available after the first packets go through the
classic forwarding path.
- IPv4 and IPv6 are supported. Only TCP and UDP layer 4 are supported at
this stage.
- Tuple has been augmented to track VLAN id and PPPoE session id.
- Bridge and IP forwarding integration, including bridge VLAN filtering
support.
- Hardware offload support.
- Describe the [OFFLOAD] and [HW_OFFLOAD] tags in the conntrack table
listing.
- Replace 'flow offload' by 'flow add' in example rulesets (preferred
syntax).
- Describe existing cache limitations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
s/subsytem/subsystem/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 9d5ef190e5 ("net: dsa: automatically bring up DSA master
when opening user port"), DSA manages the administrative status of the
host port automatically. Update the configuration steps to reflect this.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"make htmldocs" complains:
configuration.rst:165: WARNING: duplicate label networking/dsa/configuration:single port, other instance in (...)
configuration.rst:212: WARNING: duplicate label networking/dsa/configuration:bridge, other instance in (...)
configuration.rst:252: WARNING: duplicate label networking/dsa/configuration:gateway, other instance in (...)
And for good reason, because the "single port", "bridge" and "gateway"
use cases are replicated twice, once for normal taggers and twice for
DSA_TAG_PROTO_NONE. So when trying to reference these sections via a
hyperlink such as:
https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#single-port
it will always reference the first occurrence, and never the second one.
This change makes the "single port", "bridge" and "gateway"
configuration examples consistent with the formatting used in the
"Configuration showcases" subsection.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"make htmldocs" produces these warnings:
Documentation/networking/dsa/dsa.rst:468: WARNING: Unexpected indentation.
Documentation/networking/dsa/dsa.rst:477: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: 8411abbcad ("Documentation: networking: dsa: mention integration with devlink")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though this is clear from the context, it is nice to actually be
grammatically correct.
Fixes: 0f22ad45f4 ("Documentation: networking: switchdev: clarify device driver behavior")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like "make htmldocs" produces this warning:
Documentation/networking/switchdev.rst:482: WARNING: Unexpected indentation.
Fixes: 0f22ad45f4 ("Documentation: networking: switchdev: clarify device driver behavior")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "bridge fdb add" command provided in the switchdev documentation is
junk now, not only because it is syntactically incorrect and rejected by
the iproute2 bridge program, but also because it was not updated in
light of Arkadi Sharshevsky's radical switchdev refactoring in commit
29ab586c3d ("net: switchdev: Remove bridge bypass support from
switchdev"). Try to explain what the intended usage pattern is with the
new kernel implementation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides details on the expected behavior of switchdev
enabled network devices when operating in a "stand alone" mode, as well
as when being bridge members. This clarifies a number of things that
recently came up during a bug fixing session on the b53 DSA switch
driver.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a short summary of the methods that a driver writer must implement
for offloading a HSR/PRP network interface.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a short summary of the methods that a driver writer must implement
for getting an MRP instance to work on top of a DSA switch.
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a short summary of the methods that a driver writer must implement
for offloading a link aggregation group, and what is still missing.
Cc: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a short summary of the devlink features supported by the DSA core.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation was already lagging behind by not mentioning the old
version of port_bridge_flags (port_set_egress_floods). So now we are
skipping one step and just explaining how a DSA driver should configure
address learning and flooding settings.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
On one hand, the link is dead and therefore useless.
On the other hand, there are always more drivers to port, but at this
stage, DSA does not need to affirm itself as the driver model to use for
Ethernet-connected switches (since we already have 15 tagging protocols
supported and probably more switch families from various vendors), so
there is nothing actionable to do.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the recent series containing commit bae33f2b5a ("net: switchdev:
remove the transaction structure from port attributes"), there aren't
prepare/commit transactional phases anymore in most of the switchdev
objects/attributes, and as a result, there aren't any in the DSA driver
API either. So remove this piece.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
After Vivien's series from 2019 containing commits 27d4d19d7c ("net:
dsa: remove limitation of switch index value") and ab8ccae122 ("net:
dsa: add ports list in the switch fabric"), this is basically no longer
true.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chapter about tagging protocols is out of date because it doesn't
mention all taggers that have been added since last documentation
update. But judging based on that, it will always tend to lag behind,
and there's no good reason why we would enumerate the supported
hardware. Instead we could do something more useful and explain what
there is to know about tagging protocols instead.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While preparing some slides for a customer presentation, I found the
existing high-level view to be a bit confusing, so I modified it a
little bit.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ena.rst documentation referred to end_start_xmit() when it should refer
to ena_start_xmit(). Fix the typo.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The batching logic in netvsc_send is non-trivial, due to
a combination of the Linux API and the underlying hypervisor
interface. Add a comment explaining why the code is written this
way.
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documentation is missing and it's not very clear what
this callback is for - presumably testing the recovery?
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor tweaks and improvement of wording about the diagnose callback.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit fixes three spelling typos in devlink-dpipe.rst and
devlink-port.rst.
Signed-off-by: Eva Dengler <eva.dengler@fau.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
"either" is outside the parentheses, so the matching "or" should be too.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the recent update to MAINTAINERS update my e-mail address.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a spelling typo in bonding.rst.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the "big" set of staging and IIO driver patches for 5.12-rc1.
Nothing really huge in here, the number of staging tree patches has gone
down for a bit, maybe there's only so much churn to happen in here at
the moment.
The IIO changes are:
- new drivers
- new DT bindings
- new iio driver features
with full details in the shortlog.
The staging driver patches are just a lot of tiny coding style cleanups,
along with some semi-larger hikey driver cleanups as those are _almost_
good enough to get out of the staging tree, but will probably have to
wait until 5.13 to have happen.
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYCqelQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymIDQCguWtPGy6U1sgaL3GAK/ROt2aet3wAn3TP1WgB
GeKAKKPshu3cskYQzlou
=UPZR
-----END PGP SIGNATURE-----
Merge tag 'staging-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver updates from Greg KH:
"Here is the "big" set of staging and IIO driver patches for 5.12-rc1.
Nothing really huge in here, the number of staging tree patches has
gone down for a bit, maybe there's only so much churn to happen in
here at the moment.
The IIO changes are:
- new drivers
- new DT bindings
- new iio driver features
with full details in the shortlog.
The staging driver patches are just a lot of tiny coding style
cleanups, along with some semi-larger hikey driver cleanups as those
are _almost_ good enough to get out of the staging tree, but will
probably have to wait until 5.13 to have happen.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (189 commits)
staging: hikey9xx: Fix alignment of function parameters
staging: greybus: Fixed a misspelling in hid.c
staging: wimax/i2400m: fix some byte order issues found by sparse
staging: wimax: i2400m: fix some incorrect type warnings
staging: greybus: minor code style fix
staging:wlan-ng: use memdup_user instead of kmalloc/copy_from_user
staging:r8188eu: use IEEE80211_FCTL_* kernel definitions
staging: rtl8192e: remove multiple blank lines
staging: greybus: Fixed alignment issue in hid.c
staging: wfx: remove unused included header files
staging: nvec: minor coding style fix
staging: wimax: Fix some coding style problem
staging: fbtft: add tearing signal detect
staging: vt6656: Fixed issue with alignment in rf.c
staging: qlge: Remove duplicate word in comment
staging: rtl8723bs: remove obsolete commented out code
staging: rtl8723bs: fix function comments to follow kernel-doc
staging: wfx: avoid defining array of flexible struct
staging: rtl8723bs: Replace one-element array with flexible-array member in struct ndis_80211_var_ie
staging: Replace lkml.org links with lore
...
Here is the big set of tty/serial driver changes for 5.12-rc1.
Nothing huge, just lots of good cleanups and additions:
- Your n_tty line discipline cleanups
- vt core cleanups and reworks to make the code more "modern"
- stm32 driver additions
- tty led support added to the tty core and led layer
- minor serial driver fixups and additions
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYCqgqw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymJYQCgnxHmkhzJ2VarTDR3cWm1gu0NU7AAoNe5wWUh
4TQbhB9LSNo78HnIVze0
=Chcg
-----END PGP SIGNATURE-----
Merge tag 'tty-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty/serial driver changes for 5.12-rc1.
Nothing huge, just lots of good cleanups and additions:
- n_tty line discipline cleanups
- vt core cleanups and reworks to make the code more "modern"
- stm32 driver additions
- tty led support added to the tty core and led layer
- minor serial driver fixups and additions
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (54 commits)
serial: core: Remove BUG_ON(in_interrupt()) check
vt_ioctl: Remove in_interrupt() check
dt-bindings: serial: imx: Switch to my personal address
vt: keyboard, use new API for keyboard_tasklet
serial: stm32: improve platform_get_irq condition handling in init_port
serial: ifx6x60: Remove driver for deprecated platform
tty: fix up iterate_tty_read() EOVERFLOW handling
tty: fix up hung_up_tty_read() conversion
tty: fix up hung_up_tty_write() conversion
tty: teach the n_tty ICANON case about the new "cookie continuations" too
tty: teach n_tty line discipline about the new "cookie continuations"
tty: clean up legacy leftovers from n_tty line discipline
tty: implement read_iter
tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer
serial: remove sirf prima/atlas driver
serial: mxs-auart: Remove <asm/cacheflush.h>
serial: mxs-auart: Remove serial_mxs_probe_dt()
serial: fsl_lpuart: Use of_device_get_match_data()
dt-bindings: serial: renesas,hscif: Add r8a779a0 support
tty: serial: Drop unused efm32 serial driver
...
We have no in-tree users, also update the sfp-phylink.rst documentation
to indicate that phy_attach_direct() is used instead of of_phy_attach().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-02-16
The following pull-request contains BPF updates for your *net-next* tree.
There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:
[...]
lock_sock(sk);
err = tcp_zerocopy_receive(sk, &zc, &tss);
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
&zc, &len, err);
release_sock(sk);
[...]
We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).
The main changes are:
1) Adds support of pointers to types with known size among global function
args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.
2) Add bpf_iter for task_vma which can be used to generate information similar
to /proc/pid/maps, from Song Liu.
3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
rewriting bind user ports from BPF side below the ip_unprivileged_port_start
range, both from Stanislav Fomichev.
4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
as well as per-cpu maps for the latter, from Alexei Starovoitov.
5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
for sleepable programs, both from KP Singh.
6) Extend verifier to enable variable offset read/write access to the BPF
program stack, from Andrei Matei.
7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
query device MTU from programs, from Jesper Dangaard Brouer.
8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
tracing programs, from Florent Revest.
9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
otherwise too many passes are required, from Gary Lin.
10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
verification both related to zero-extension handling, from Ilya Leoshkevich.
11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.
12) Batch of AF_XDP selftest cleanups and small performance improvement
for libbpf's xsk map redirect for newer kernels, from Björn Töpel.
13) Follow-up BPF doc and verifier improvements around atomics with
BPF_FETCH, from Brendan Jackman.
14) Permit zero-sized data sections e.g. if ELF .rodata section contains
read-only data from local variables, from Yonghong Song.
15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some internal PHY's have their events like link change reported by the
MAC interrupt. We have PHY_IGNORE_INTERRUPT to deal with this scenario.
I'm not too happy with this name. We don't ignore interrupts, typically
there is no interrupt exposed at a PHY level. So let's rename it to
PHY_MAC_INTERRUPT. This is in line with phy_mac_interrupt(), which is
called from the MAC interrupt handler to handle PHY events.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
J721e, J7200 and AM64 have multi port switches which can work in multi
mac mode and in switch mode. Add documentation explaining how to use
different modes.
Borrowed from:
Documentation/networking/device_drivers/ethernet/ti/cpsw_switchdev.rst
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
AM65 NUSS ethernet switch on K3 devices can be configured to work either
in independent mac mode where each port acts as independent network
interface (multi mac) or switch mode.
Add devlink hooks to provide a way to switch b/w these modes.
Rationale to use devlink instead of defaulting to bridge mode is that
SoC use cases require to support multiple independent MAC ports with no
switching so that users can use software bridges with multi-mac
configuration (e.g: to support LAG, HSR/PRP, etc). Also, switching
between multi mac and switch mode requires significant Port and ALE
reconfiguration, therefore is easier to be made as part of mode change
devlink hooks. It also allows to keep user interface similar to what
was implemented for the previous generation of TI CPSW IP
(on AM33/AM43/AM57 SoCs).
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_rmem[1] has been changed to 131072, we should update the documentation
to reflect this.
Fixes: a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Zhibin Liu <zhibinliu@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for offloading of HSR/PRP (IEC 62439-3) tag insertion
tag removal, duplicate generation and forwarding.
For HSR, insertion involves the switch adding a 6 byte HSR header after
the 14 byte Ethernet header. For PRP it adds a 6 byte trailer.
Tag removal involves automatically stripping the HSR/PRP header/trailer
in the switch. This is possible when the switch also performs auto
deduplication using the HSR/PRP header/trailer (making it no longer
required).
Forwarding involves automatically forwarding between redundant ports in
an HSR. This is crucial because delay is accumulated as a frame passes
through each node in the ring.
Duplication involves the switch automatically sending a single frame
from the CPU port to both redundant ports. This is required because the
inserted HSR/PRP header/trailer must contain the same sequence number
on the frames sent out both redundant ports.
Export is_hsr_master so DSA can tell them apart from other devices in
dsa_slave_changeupper.
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Point out where patchwork bot's code lives, and that we don't want
people posting stuff that doesn't build.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 142d93d12d ("net/mlx5: Add devlink subfunction port
documentation") refers to a section 'mlx5 port function' in the table of
contents, but includes a section 'mlx5 function attributes' instead.
Hence, make htmldocs warns:
mlx5.rst:16: WARNING: Unknown target name: "mlx5 port function".
Correct the section reference in table of contents to the actual name of
section in the documentation.
Also, tune another section underline while visiting this document.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-02-08
This series contains updates to the ice driver and documentation.
Brett adds a log message when a trusted VF goes in and out of promiscuous
for consistency with i40e driver.
Dave implements a new LLDP command that allows adding VSI destinations to
existing filters and adds support for netdev bonding events, current
support is software based.
Michal refactors code to move from VSI stored xsk_buff_pools to
netdev-provided ones.
Kiran implements the creation scheduler aggregator nodes and distributing
VSIs within the nodes.
Ben modifies rate limit calculations to use clock frequency from the
hardware instead of using a hardcoded one.
Jesse adds support for user to control writeback frequency.
Chinh refactors DCB variables out of the ice_port_info struct.
Bruce removes some unnecessary casting.
Mitch fixes an error message that was reported as if_up instead of if_down.
Tony adjusts fallback allocation for MSI-X to use all given vectors instead
of using only the minimum configuration and updates documentation for
the ice driver.
====================
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide documentation for src_valid_mark sysctl, which was added
in commit 28f6aeea3f ("net: restore ip source validation").
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the value '2' to 'fib_notify_on_flag_change' to allow sending
notifications only for failed route installation.
Separate value is added for such notifications because there are less of
them, so they do not impact performance and some users will find them more
important.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the value '2' to 'fib_notify_on_flag_change' to allow sending
notifications only for failed route installation.
Separate value is added for such notifications because there are less of
them, so they do not impact performance and some users will find them more
important.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ice documentation has not been updated since the initial commits of the
driver. Update the documentation with features and information that are now
available.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
DSA wants the master interface to be open before the user port is due to
historical reasons. The promiscuity of interfaces that are down used to
have issues, as referenced Lennert Buytenhek in commit df02c6ff2e
("dsa: fix master interface allmulti/promisc handling").
The bugfix mentioned there, commit b6c40d68ff ("net: only invoke
dev->change_rx_flags when device is UP"), was basically a "don't do
that" approach to working around the promiscuity while down issue.
Further work done by Vlad Yasevich in commit d2615bf450 ("net: core:
Always propagate flag changes to interfaces") has resolved the
underlying issue, and it is strictly up to the DSA and 8021q drivers
now, it is no longer mandated by the networking core that the master
interface must be up when changing its promiscuity.
From DSA's point of view, deciding to error out in dsa_slave_open
because the master isn't up is
(a) a bad user experience and
(b) knocking at an open door.
Even if there still was an issue with promiscuity while down, DSA could
still just open the master and avoid it.
Doing it this way has the additional benefit that user space can now
remove DSA-specific workarounds, like systemd-networkd with BindCarrier:
https://github.com/systemd/systemd/issues/7478
And we can finally remove one of the 2 bullets in the "Common pitfalls
using DSA setups" chapter.
Tested with two cascaded DSA switches:
$ ip link set sw0p2 up
fsl_enetc 0000:00:00.2 eno2: configuring for fixed/internal link mode
fsl_enetc 0000:00:00.2 eno2: Link is Up - 1Gbps/Full - flow control rx/tx
mscc_felix 0000:00:00.5 swp0: configuring for fixed/sgmii link mode
mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Full - flow control off
8021q: adding VLAN 0 to HW filter on device swp0
sja1105 spi2.0 sw0p2: configuring for phy/rgmii-id link mode
IPv6: ADDRCONF(NETDEV_CHANGE): eno2: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): swp0: link becomes ready
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, when auto negotiation is on, the user can advertise all the
linkmodes which correspond to a specific speed, but does not have a
similar selector for the number of lanes. This is significant when a
specific speed can be achieved using different number of lanes. For
example, 2x50 or 4x25.
Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
ethtool_link_settings' with lanes field in order to implement a new
lanes-selector that will enable the user to advertise a specific number
of lanes as well.
When auto negotiation is off, lanes parameter can be forced only if the
driver supports it. Add a capability bit in 'struct ethtool_ops' that
allows ethtool know if the driver can handle the lanes parameter when
auto negotiation is off, so if it does not, an error message will be
returned when trying to set lanes.
Example:
$ ethtool -s swp1 lanes 4
$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseKX/Full
10000baseKR/Full
40000baseCR4/Full
40000baseSR4/Full
40000baseLR4/Full
25000baseCR/Full
25000baseSR/Full
50000baseCR2/Full
100000baseSR4/Full
100000baseCR4/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 40000baseCR4/Full
40000baseSR4/Full
40000baseLR4/Full
100000baseSR4/Full
100000baseCR4/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: on
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Link detected: no
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After installing a route to the kernel, user space receives an
acknowledgment, which means the route was installed in the kernel,
but not necessarily in hardware.
The asynchronous nature of route installation in hardware can lead
to a routing daemon advertising a route before it was actually installed in
hardware. This can result in packet loss or mis-routed packets until the
route is installed in hardware.
It is also possible for a route already installed in hardware to change
its action and therefore its flags. For example, a host route that is
trapping packets can be "promoted" to perform decapsulation following
the installation of an IPinIP/VXLAN tunnel.
Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
are changed. The aim is to provide an indication to user-space
(e.g., routing daemons) about the state of the route in hardware.
Introduce a sysctl that controls this behavior.
Keep the default value at 0 (i.e., do not emit notifications) for several
reasons:
- Multiple RTM_NEWROUTE notification per-route might confuse existing
routing daemons.
- Convergence reasons in routing daemons.
- The extra notifications will negatively impact the insertion rate.
- Not all users are interested in these notifications.
Move fib6_info_hw_flags_set() to C file because it is no longer a short
function.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After installing a route to the kernel, user space receives an
acknowledgment, which means the route was installed in the kernel,
but not necessarily in hardware.
The asynchronous nature of route installation in hardware can lead to a
routing daemon advertising a route before it was actually installed in
hardware. This can result in packet loss or mis-routed packets until the
route is installed in hardware.
It is also possible for a route already installed in hardware to change
its action and therefore its flags. For example, a host route that is
trapping packets can be "promoted" to perform decapsulation following
the installation of an IPinIP/VXLAN tunnel.
Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
are changed. The aim is to provide an indication to user-space
(e.g., routing daemons) about the state of the route in hardware.
Introduce a sysctl that controls this behavior.
Keep the default value at 0 (i.e., do not emit notifications) for several
reasons:
- Multiple RTM_NEWROUTE notification per-route might confuse existing
routing daemons.
- Convergence reasons in routing daemons.
- The extra notifications will negatively impact the insertion rate.
- Not all users are interested in these notifications.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This section was missed during the conversion to ReST, so convert it in the
same style as the surrounding section titles.
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Link: https://lore.kernel.org/r/20210128111930.29473-1-jlu@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/can/dev.c
b552766c87 ("can: dev: prevent potential information leak in can_fill_info()")
3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")
0a042c6ec9 ("can: dev: move netlink related code into seperate file")
Code move.
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
57ac4a31c4 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down")
214baf2287 ("net/mlx5e: Support HTB offload")
Adjacent code changes
net/switchdev/switchdev.c
20776b465c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP")
ffb68fc58e ("net: switchdev: remove the transaction structure from port object notifiers")
bae33f2b5a ("net: switchdev: remove the transaction structure from port attributes")
Transaction parameter gets dropped otherwise keep the fix.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Parav Pandit Says:
=================
This patchset introduces support for mlx5 subfunction (SF).
A subfunction is a lightweight function that has a parent PCI function on
which it is deployed. mlx5 subfunction has its own function capabilities
and its own resources. This means a subfunction has its own dedicated
queues(txq, rxq, cq, eq). These queues are neither shared nor stolen from
the parent PCI function.
When subfunction is RDMA capable, it has its own QP1, GID table and rdma
resources neither shared nor stolen from the parent PCI function.
A subfunction has dedicated window in PCI BAR space that is not shared
with the other subfunctions or parent PCI function. This ensures that all
class devices of the subfunction accesses only assigned PCI BAR space.
A Subfunction supports eswitch representation through which it supports tc
offloads. User must configure eswitch to send/receive packets from/to
subfunction port.
Subfunctions share PCI level resources such as PCI MSI-X IRQs with
their other subfunctions and/or with its parent PCI function.
Patch summary:
--------------
Patch 1 to 4 prepares devlink
patch 5 to 7 mlx5 adds SF device support
Patch 8 to 11 mlx5 adds SF devlink port support
Patch 12 and 14 adds documentation
Patch-1 prepares code to handle multiple port function attributes
Patch-2 introduces devlink pcisf port flavour similar to pcipf and pcivf
Patch-3 adds port add and delete driver callbacks
Patch-4 adds port function state get and set callbacks
Patch-5 mlx5 vhca event notifier support to distribute subfunction
state change notification
Patch-6 adds SF auxiliary device
Patch-7 adds SF auxiliary driver
Patch-8 prepares eswitch to handler SF vport
Patch-9 adds eswitch helpers to add/remove SF vport
Patch-10 implements devlink port add/del callbacks
Patch-11 implements devlink port function get/set callbacks
Patch-12 to 14 adds documentation
Patch-12 added mlx5 port function documentation
Patch-13 adds subfunction documentation
Patch-14 adds mlx5 subfunction documentation
Subfunction support is discussed in detail in RFC [1] and [2].
RFC [1] and extension [2] describes requirements, design and proposed
plumbing using devlink, auxiliary bus and sysfs for systemd/udev
support. Functionality of this patchset is best explained using real
examples further below.
overview:
--------
A subfunction can be created and deleted by a user using devlink port
add/delete interface.
A subfunction can be configured using devlink port function attribute
before its activated.
When a subfunction is activated, it results in an auxiliary device on
the host PCI device where it is deployed. A driver binds to the
auxiliary device that further creates supported class devices.
example subfunction usage sequence:
-----------------------------------
Change device to switchdev mode:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
Add a devlink port of subfunction flavour:
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
Configure mac address of the port function:
$ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88
Now activate the function:
$ devlink port function set ens2f0npf0sf88 state active
Now use the auxiliary device and class devices:
$ devlink dev show
pci/0000:06:00.0
auxiliary/mlx5_core.sf.4
$ ip link show
127: ens2f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 24:8a:07:b3:d1:12 brd ff:ff:ff:ff:ff:ff
altname enp6s0f0np0
129: p0sf88: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:88:88 brd ff:ff:ff:ff:ff:ff
$ rdma dev show
43: rdmap6s0f0: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d112 sys_image_guid 248a:0703:00b3:d112
44: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112
After use inactivate the function:
$ devlink port function set ens2f0npf0sf88 state inactive
Now delete the subfunction port:
$ devlink port del ens2f0npf0sf88
[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2
=================
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmALKDwACgkQSD+KveBX
+j7qjQf6A1moPhhIlXROCzaJUjlAj2U291LWBveU+I6na6fjYjAAWHYwfv0YKQpo
Qb0NRt+9abgEpGidc4hOwIJKhK+vlWrQuehRt83aAfAwaN3OEeGuNllniWo821Hj
sNiJfSC/DslOlQSxKLsAs3Fduy/sV3GN9Zv7hEwOFgEr5QvB2c6H1XiypVP2Ecsd
ZXC3SuEWxIoRtfXEkTkJne9LNoiDChlvT1FR/z75h8HUBdAOjzBTQzBbM+8M4Msw
8aKUPya3FMRAPWsOgPhkpU0xTtH2Mi7MC9TlwiWmrK4Q3uvesIav8pVf7r3GNAZA
sipIZ4gP0M5SiCaZa8rIBpTXBHxmvg==
=jEG4
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2021-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 subfunction support
Parav Pandit says:
This patchset introduces support for mlx5 subfunction (SF).
A subfunction is a lightweight function that has a parent PCI function on
which it is deployed. mlx5 subfunction has its own function capabilities
and its own resources. This means a subfunction has its own dedicated
queues(txq, rxq, cq, eq). These queues are neither shared nor stolen from
the parent PCI function.
When subfunction is RDMA capable, it has its own QP1, GID table and rdma
resources neither shared nor stolen from the parent PCI function.
A subfunction has dedicated window in PCI BAR space that is not shared
with the other subfunctions or parent PCI function. This ensures that all
class devices of the subfunction accesses only assigned PCI BAR space.
A Subfunction supports eswitch representation through which it supports tc
offloads. User must configure eswitch to send/receive packets from/to
subfunction port.
Subfunctions share PCI level resources such as PCI MSI-X IRQs with
their other subfunctions and/or with its parent PCI function.
Subfunction support is discussed in detail in RFC [1] and [2].
RFC [1] and extension [2] describes requirements, design and proposed
plumbing using devlink, auxiliary bus and sysfs for systemd/udev
support. Functionality of this patchset is best explained using real
examples further below.
overview:
--------
A subfunction can be created and deleted by a user using devlink port
add/delete interface.
A subfunction can be configured using devlink port function attribute
before its activated.
When a subfunction is activated, it results in an auxiliary device on
the host PCI device where it is deployed. A driver binds to the
auxiliary device that further creates supported class devices.
example subfunction usage sequence:
-----------------------------------
Change device to switchdev mode:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
Add a devlink port of subfunction flavour:
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
Configure mac address of the port function:
$ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88
Now activate the function:
$ devlink port function set ens2f0npf0sf88 state active
Now use the auxiliary device and class devices:
$ devlink dev show
pci/0000:06:00.0
auxiliary/mlx5_core.sf.4
$ ip link show
127: ens2f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 24:8a:07:b3:d1:12 brd ff:ff:ff:ff:ff:ff
altname enp6s0f0np0
129: p0sf88: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:88:88 brd ff:ff:ff:ff:ff:ff
$ rdma dev show
43: rdmap6s0f0: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d112 sys_image_guid 248a:0703:00b3:d112
44: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112
After use inactivate the function:
$ devlink port function set ens2f0npf0sf88 state inactive
Now delete the subfunction port:
$ devlink port del ens2f0npf0sf88
[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2
=================
* tag 'mlx5-updates-2021-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Add devlink subfunction port documentation
devlink: Extend devlink port documentation for subfunctions
devlink: Add devlink port documentation
net/mlx5: SF, Port function state change support
net/mlx5: SF, Add port add delete functionality
net/mlx5: E-switch, Add eswitch helpers for SF vport
net/mlx5: E-switch, Prepare eswitch to handle SF vport
net/mlx5: SF, Add auxiliary device driver
net/mlx5: SF, Add auxiliary device support
net/mlx5: Introduce vhca state event notifier
devlink: Support get and set state of port function
devlink: Support add and delete devlink port
devlink: Introduce PCI SF port flavour and port attribute
devlink: Prepare code to fill multiple port function attributes
====================
Link: https://lore.kernel.org/r/20210122193658.282884-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add packet trap that can report packets that were dropped due to
destination MAC filtering.
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.
Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This patch adds sysctl for the same.
Logs:
For IPv4:
Config in etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
metric 4261413864
IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
i.e. User can prefer Default Router learned via Routing Protocol in IPv4.
Similar behavior is not possible for IPv6, without this fix.
After fix [for IPv6]:
sudo sysctl -w net.ipv6.conf.eth0.net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high
Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
If the metric is changed later, the effect will be seen only when next IPv6
RA is received, because the default route must be fully controlled by RA msg.
Below metric is changed from 1996489705 to 1996489704.
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds documentation for sysctl conf/all/disable_ipv6 and
conf/default/disable_ipv6 settings which is currently missing.
Signed-off-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20210121150244.20483-1-pali@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The switch ASIC has a limited capacity of physical ('flavour physical'
in devlink terminology) ports that it can support. While each system is
brought up with a different number of ports, this number can be
increased via splitting up to the ASIC's limit.
Expose physical ports as a devlink resource so that user space will have
visibility to the maximum number of ports that can be supported and the
current occupancy.
In addition, add a "Generic Resources" section in devlink-resource
documentation so the different drivers will be aligned by the same resource
name when exposing to user space.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Added documentation for devlink port and port function related commands.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Introduce API to add and delete an auxiliary device for an SF.
Each SF has its own dedicated window in the PCI BAR 2.
SF device is similar to PCI PF and VF that supports multiple class of
devices such as net, rdma and vdpa.
SF device will be added or removed in subsequent patch during SF
devlink port function state change command.
A subfunction device exposes user supplied subfunction number which will
be further used by systemd/udev to have deterministic name for its
netdevice and rdma device.
An mlx5 subfunction auxiliary device example:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
$ devlink port show
pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
pci/0000:08:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
function:
hw_addr 00:00:00:00:00:00 state inactive opstate detached
$ devlink port show ens2f0npf0sf88
pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
function:
hw_addr 00:00:00:00:88:88 state inactive opstate detached
$ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88 state active
On activation,
$ ls -l /sys/bus/auxiliary/devices/
mlx5_core.sf.4 -> ../../../devices/pci0000:00/0000:00:03.0/0000:06:00.0/mlx5_core.sf.4
$ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum
88
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add devlink health reporter documentation for NIX block.
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes up the markup to fix a warning, be more consistent with
use of monospace, and use the correct .rst syntax for <em> (* instead
of _).
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/bpf/20210120133946.2107897-2-jackmanb@google.com
Commit 91c960b005 ("bpf: Rename BPF_XADD and prepare to encode other
atomics in .imm") modified the BPF documentation, but missed some ReST
markup.
Hence, make htmldocs warns on Documentation/networking/filter.rst:1053:
WARNING: Inline emphasis start-string without end-string.
Add some minimal markup to address this warning.
Fixes: 91c960b005 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/bpf/20210118080004.6367-1-lukas.bulwahn@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Conflicts:
drivers/net/can/dev.c
commit 03f16c5075 ("can: dev: can_restart: fix use after free bug")
commit 3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")
Code move.
drivers/net/dsa/b53/b53_common.c
commit 8e4052c32d ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
commit b7a9e0da2d ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")
Field rename.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This comes from an end-user request, where they're running multiple VMs on
hosts with bonded interfaces connected to some interest switch topologies,
where 802.3ad isn't an option. They're currently running a proprietary
solution that effectively achieves load-balancing of VMs and bandwidth
utilization improvements with a similar form of transmission algorithm.
Basically, each VM has it's own vlan, so it always sends its traffic out
the same interface, unless that interface fails. Traffic gets split
between the interfaces, maintaining a consistent path, with failover still
available if an interface goes down.
Unlike bond_eth_hash(), this hash function is using the full source MAC
address instead of just the last byte, as there are so few components to
the hash, and in the no-vlan case, we would be returning just the last
byte of the source MAC as the hash value. It's entirely possible to have
two NICs in a bond with the same last byte of their MAC, but not the same
MAC, so this adjustment should guarantee distinct hashes in all cases.
This has been rudimetarily tested to provide similar results to the
proprietary solution it is aiming to replace. A patch for iproute2 is also
posted, to properly support the new mode there as well.
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Thomas Davis <tadavis@lbl.gov>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Link: https://lore.kernel.org/r/20210119010927.1191922-1-jarod@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With NETIF_F_HW_TLS_RX packets are decrypted in HW. This cannot be
logically done when RXCSUM offload is off.
Fixes: 14136564c8 ("net: Add TLS RX offload feature")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Link: https://lore.kernel.org/r/20210117151538.9411-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-01-16
1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support,
that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman.
2) Add support for using kernel module global variables (__ksym externs in BPF
programs) retrieved via module's BTF, from Andrii Nakryiko.
3) Generalize BPF stackmap's buildid retrieval and add support to have buildid
stored in mmap2 event for perf, from Jiri Olsa.
4) Various fixes for cross-building BPF sefltests out-of-tree which then will
unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker.
5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann.
6) Clean up driver's XDP buffer init and split into two helpers to init per-
descriptor and non-changing fields during processing, from Lorenzo Bianconi.
7) Minor misc improvements to libbpf & bpftool, from Ian Rogers.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
perf: Add build id data in mmap2 event
bpf: Add size arg to build_id_parse function
bpf: Move stack_map_get_build_id into lib
bpf: Document new atomic instructions
bpf: Add tests for new BPF atomic operations
bpf: Add bitwise atomic instructions
bpf: Pull out a macro for interpreting atomic ALU operations
bpf: Add instructions for atomic_[cmp]xchg
bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
bpf: Move BPF_STX reserved field check into BPF_STX verifier code
bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
bpf: x86: Factor out a lookup table for some ALU opcodes
bpf: x86: Factor out emission of REX byte
bpf: x86: Factor out emission of ModR/M for *(reg + off)
tools/bpftool: Add -Wall when building BPF programs
bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
selftests/bpf: Install btf_dump test cases
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix out-of-tree build
...
====================
Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.
In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.
This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.
All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com
Sparx-5 supports this mode and it is missing in the PHY core.
Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cited patch below blocked the TLS TX device offload unless HW_CSUM
is set. This broke devices that use IP_CSUM && IP6_CSUM.
Here we fix it.
Note that the single HW_TLS_TX feature flag indicates support for
both IPv4/6, hence it should still be disabled in case only one of
(IP_CSUM | IPV6_CSUM) is set.
Fixes: ae0b04b238 ("net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reported-by: Rohit Maheshwari <rohitm@chelsio.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Link: https://lore.kernel.org/r/20210114151215.7061-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 80b9414832 ("docs: octeontx2: Add Documentation for NPA health
reporters") added new documentation with improper formatting for rst, and
caused a few new warnings for make htmldocs in octeontx2.rst:169--202.
Tune markup and formatting for better presentation in the HTML view.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: George Cherian <george.cherian@marvell.com>
Link: https://lore.kernel.org/r/20210106161735.21751-1-lukas.bulwahn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The main purpose of tty_port::low_latency was removed in commit
a9c3f68f3c (tty: Fix low_latency BUG) back in 2014. It was left in
place for drivers as an optional tune knob. But only one driver has been
using it until the previous commit. So remove this misconcept
completely, given there are no users.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210105120239.28031-11-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes some spelling typos in snmp_counter.rst
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix calling context.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before commit 889b8f964f ("packet: Kill CONFIG_PACKET_MMAP.") there
used to be a CONFIG_PACKET_MMAP config symbol that depended on
CONFIG_PACKET. The text still implies that PACKET_MMAP can be disabled.
Remove that from the text, as well as reference to old kernel versions.
Also, drop reference to broken link to information for pre 2.6.5
kernels.
Make a slight working improvement (s/In/On/) while at it.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lore.kernel.org/r/80089f3783372c8fd7833f28ce774a171b2ef252.1609232919.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Join adjacent questions to a single question line. This fixes the
formatting of questions that were not part of the heading.
Also, drop Q: and A: prefixes. We don't need them now that questions and
answers are visually separated.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lore.kernel.org/r/f76078ba5547744f2ec178984c32fbc7dcd29a2b.1608454187.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are a couple of subsystems maintained by other people that
merge their drivers through the SoC tree, those changes include:
- The SCMI firmware framework gains support for sensor notifications
and for controlling voltage domains.
- A large update for the Tegra memory controller driver, integrating
it better with the interconnect framework
- The memory controller subsystem gains support for Mediatek MT8192
- The reset controller framework gains support for sharing pulsed
resets
For Soc specific drivers in drivers/soc, the main changes are
- The Allwinner/sunxi MBUS gets a rework for the way it handles
dma_map_ops and offsets between physical and dma address spaces.
- An errata fix plus some cleanups for Freescale Layerscape SoCs
- A cleanup for renesas drivers regarding MMIO accesses.
- New SoC specific drivers for Mediatek MT8192 and MT8183 power domains
- New SoC specific drivers for Aspeed AST2600 LPC bus control
and SoC identification.
- Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660
and SDX55.
- A rework of the TI AM33xx 'genpd' power domain support to use
information from DT instead of platform data
- Support for TI AM64x SoCs
- Allow building some Amlogic drivers as modules instead of built-in
Finally, there are numerous cleanups and smaller bug fixes for
Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips,
Renesas, and Xilinx SoCs.
There is a trivial conflict in the cedrus driver, with two branches
adding the same CEDRUS_CAPABILITY_H265_DEC flag, and another trivial
remove/remove conflict in linux/dma-mapping.h.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/alSUACgkQmmx57+YA
GNm7GRAAlNMVi7F0f4Ixf1bEh+J2QUonYIpZfrdxOLFwISGQ+nstGrFW2He/OeQv
KAi027tZLl6Sdzjy809cLDPA4Z2IKwjVWhEbBHybvy1+irPYjnixtLd0x3YvPhjH
iadlcjQ3uaGue8PvubK6CVnBEy82A+Pp29n9i4A4wX/8w+BVIhVsxwQWUBF8pFXE
3La2UZYZMVMvVZMrpTOqwCgdmLDCk+RLMVZ1IiRqBEBq5/DVq03uIXgjGEOrq8tl
PXC89w7K510Is891mbBdBThQf+pZkU1vwORuknDcEJKWs9ngbEha7ebVgp32kbFl
pi8DEK205d106WQgfn0Zxkpbsp8XD058wDILwkhBcteXlBaUEL6btGVLDTUCJZuv
/pkH8tL4lNGpThQFbCEXC8oHZBp2xk55P+SW9RRZOoA5tAp+sz7hlf3y3YKdCSxv
4xybeeVOAgjl01WtbEC7CuIkqcKVSQ7njhLhC8r5ASteNywDThqxLT6nd0VegcQc
YH3Eu9QRXpvFwQ35zMkTMWa27bMG5d60fp90bWT0R5amXZpxJJot87w8trFCxv74
mE5KvCbefCRNsTt8GOBA/WR7hVaG369g07qOvs7g4LjJEM3Nl2h/A4/zVFef9O0t
yq3Nm4YCGfDSAQXzGR2SJ3nxiqbDknzJTAtZPf4BmbaMuPOIJ5k=
=BjJf
-----END PGP SIGNATURE-----
Merge tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
"There are a couple of subsystems maintained by other people that merge
their drivers through the SoC tree, those changes include:
- The SCMI firmware framework gains support for sensor notifications
and for controlling voltage domains.
- A large update for the Tegra memory controller driver, integrating
it better with the interconnect framework
- The memory controller subsystem gains support for Mediatek MT8192
- The reset controller framework gains support for sharing pulsed
resets
For Soc specific drivers in drivers/soc, the main changes are
- The Allwinner/sunxi MBUS gets a rework for the way it handles
dma_map_ops and offsets between physical and dma address spaces.
- An errata fix plus some cleanups for Freescale Layerscape SoCs
- A cleanup for renesas drivers regarding MMIO accesses.
- New SoC specific drivers for Mediatek MT8192 and MT8183 power
domains
- New SoC specific drivers for Aspeed AST2600 LPC bus control and SoC
identification.
- Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660 and
SDX55.
- A rework of the TI AM33xx 'genpd' power domain support to use
information from DT instead of platform data
- Support for TI AM64x SoCs
- Allow building some Amlogic drivers as modules instead of built-in
Finally, there are numerous cleanups and smaller bug fixes for
Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips,
Renesas, and Xilinx SoCs"
* tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (222 commits)
soc: mediatek: mmsys: Specify HAS_IOMEM dependency for MTK_MMSYS
firmware: xilinx: Properly align function parameter
firmware: xilinx: Add a blank line after function declaration
firmware: xilinx: Remove additional newline
firmware: xilinx: Fix kernel-doc warnings
firmware: xlnx-zynqmp: fix compilation warning
soc: xilinx: vcu: add missing register NUM_CORE
soc: xilinx: vcu: use vcu-settings syscon registers
dt-bindings: soc: xlnx: extract xlnx, vcu-settings to separate binding
soc: xilinx: vcu: drop useless success message
clk: samsung: mark PM functions as __maybe_unused
soc: samsung: exynos-chipid: initialize later - with arch_initcall
soc: samsung: exynos-chipid: order list of SoCs by name
memory: jz4780_nemc: Fix potential NULL dereference in jz4780_nemc_probe()
memory: ti-emif-sram: only build for ARMv7
memory: tegra30: Support interconnect framework
memory: tegra20: Support hardware versioning and clean up OPP table initialization
dt-bindings: memory: tegra20-emc: Document opp-supported-hw property
soc: rockchip: io-domain: Fix error return code in rockchip_iodomain_probe()
reset-controller: ti: force the write operation when assert or deassert
...
Core:
- support "prefer busy polling" NAPI operation mode, where we defer softirq
for some time expecting applications to periodically busy poll
- AF_XDP: improve efficiency by more batching and hindering
the adjacency cache prefetcher
- af_packet: make packet_fanout.arr size configurable up to 64K
- tcp: optimize TCP zero copy receive in presence of partial or unaligned
reads making zero copy a performance win for much smaller messages
- XDP: add bulk APIs for returning / freeing frames
- sched: support fragmenting IP packets as they come out of conntrack
- net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
BPF:
- BPF switch from crude rlimit-based to memcg-based memory accounting
- BPF type format information for kernel modules and related tracing
enhancements
- BPF implement task local storage for BPF LSM
- allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage
Protocols:
- mptcp: improve multiple xmit streams support, memory accounting and
many smaller improvements
- TLS: support CHACHA20-POLY1305 cipher
- seg6: add support for SRv6 End.DT4/DT6 behavior
- sctp: Implement RFC 6951: UDP Encapsulation of SCTP
- ppp_generic: add ability to bridge channels directly
- bridge: Connectivity Fault Management (CFM) support as is defined in
IEEE 802.1Q section 12.14.
Drivers:
- mlx5: make use of the new auxiliary bus to organize the driver internals
- mlx5: more accurate port TX timestamping support
- mlxsw:
- improve the efficiency of offloaded next hop updates by using
the new nexthop object API
- support blackhole nexthops
- support IEEE 802.1ad (Q-in-Q) bridging
- rtw88: major bluetooth co-existance improvements
- iwlwifi: support new 6 GHz frequency band
- ath11k: Fast Initial Link Setup (FILS)
- mt7915: dual band concurrent (DBDC) support
- net: ipa: add basic support for IPA v4.5
Refactor:
- a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior
- phy: add support for shared interrupts; get rid of multiple driver
APIs and have the drivers write a full IRQ handler, slight growth
of driver code should be compensated by the simpler API which
also allows shared IRQs
- add common code for handling netdev per-cpu counters
- move TX packet re-allocation from Ethernet switch tag drivers to
a central place
- improve efficiency and rename nla_strlcpy
- number of W=1 warning cleanups as we now catch those in a patchwork
build bot
Old code removal:
- wan: delete the DLCI / SDLA drivers
- wimax: move to staging
- wifi: remove old WDS wifi bridging support
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/YXmUACgkQMUZtbf5S
IrvSQBAAgOrt4EFopEvVqlTHZbqI45IEqgtXS+YWmlgnjZCgshyMj8q1yK1zzane
qYxr/NNJ9kV3FdtaynmmHPgEEEfR5kJ/D3B2BsxYDkaDDrD0vbNsBGw+L+/Gbhxl
N/5l/9FjLyLY1D+EErknuwR5XGuQ6BSDVaKQMhYOiK2hgdnAAI4hszo8Chf6wdD0
XDBslQ7vpD/05r+eMj0IkS5dSAoGOIFXUxhJ5dqrDbRHiKsIyWqA3PLbYemfAhxI
s2XckjfmSgGE3FKL8PSFu+EcfHbJQQjLcULJUnqgVcdwEEtRuE9ggEi52nZRXMWM
4e8sQJAR9Fx7pZy0G1xfS149j6iPU5LjRlU9TNSpVABz14Vvvo3gEL6gyIdsz+xh
hMN7UBdp0FEaP028CXoIYpaBesvQqj0BSndmee8qsYAtN6j+QKcM2AOSr7JN1uMH
C/86EDoGAATiEQIVWJvnX5MPmlAoblyLA+RuVhmxkIBx2InGXkFmWqRkXT5l4jtk
LVl8/TArR4alSQqLXictXCjYlCm9j5N4zFFtEVasSYi7/ZoPfgRNWT+lJ2R8Y+Zv
+htzGaFuyj6RJTVeFQMrkl3whAtBamo2a0kwg45NnxmmXcspN6kJX1WOIy82+MhD
Yht7uplSs7MGKA78q/CDU0XBeGjpABUvmplUQBIfrR/jKLW2730=
=GXs1
-----END PGP SIGNATURE-----
Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- support "prefer busy polling" NAPI operation mode, where we defer
softirq for some time expecting applications to periodically busy
poll
- AF_XDP: improve efficiency by more batching and hindering the
adjacency cache prefetcher
- af_packet: make packet_fanout.arr size configurable up to 64K
- tcp: optimize TCP zero copy receive in presence of partial or
unaligned reads making zero copy a performance win for much smaller
messages
- XDP: add bulk APIs for returning / freeing frames
- sched: support fragmenting IP packets as they come out of conntrack
- net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
BPF:
- BPF switch from crude rlimit-based to memcg-based memory accounting
- BPF type format information for kernel modules and related tracing
enhancements
- BPF implement task local storage for BPF LSM
- allow the FENTRY/FEXIT/RAW_TP tracing programs to use
bpf_sk_storage
Protocols:
- mptcp: improve multiple xmit streams support, memory accounting and
many smaller improvements
- TLS: support CHACHA20-POLY1305 cipher
- seg6: add support for SRv6 End.DT4/DT6 behavior
- sctp: Implement RFC 6951: UDP Encapsulation of SCTP
- ppp_generic: add ability to bridge channels directly
- bridge: Connectivity Fault Management (CFM) support as is defined
in IEEE 802.1Q section 12.14.
Drivers:
- mlx5: make use of the new auxiliary bus to organize the driver
internals
- mlx5: more accurate port TX timestamping support
- mlxsw:
- improve the efficiency of offloaded next hop updates by using
the new nexthop object API
- support blackhole nexthops
- support IEEE 802.1ad (Q-in-Q) bridging
- rtw88: major bluetooth co-existance improvements
- iwlwifi: support new 6 GHz frequency band
- ath11k: Fast Initial Link Setup (FILS)
- mt7915: dual band concurrent (DBDC) support
- net: ipa: add basic support for IPA v4.5
Refactor:
- a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
Siewior
- phy: add support for shared interrupts; get rid of multiple driver
APIs and have the drivers write a full IRQ handler, slight growth
of driver code should be compensated by the simpler API which also
allows shared IRQs
- add common code for handling netdev per-cpu counters
- move TX packet re-allocation from Ethernet switch tag drivers to a
central place
- improve efficiency and rename nla_strlcpy
- number of W=1 warning cleanups as we now catch those in a patchwork
build bot
Old code removal:
- wan: delete the DLCI / SDLA drivers
- wimax: move to staging
- wifi: remove old WDS wifi bridging support"
* tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
net: hns3: fix expression that is currently always true
net: fix proc_fs init handling in af_packet and tls
nfc: pn533: convert comma to semicolon
af_vsock: Assign the vsock transport considering the vsock address flags
af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
vsock_addr: Check for supported flag values
vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
vm_sockets: Add flags field in the vsock address data structure
net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
nfc: s3fwrn5: Release the nfc firmware
net: vxget: clean up sparse warnings
mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
mlxsw: reg: Add Router LPM Cache Enable Register
mlxsw: reg: Add Router LPM Cache ML Delete Register
mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
mlxsw: reg: Add XM Router M Table Register
...
With NETIF_F_HW_TLS_TX packets are encrypted in HW. This cannot be
logically done when HW_CSUM offload is off.
Fixes: 2342a8512a ("net: Add TLS TX offload features")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Link: https://lore.kernel.org/r/20201213143929.26253-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add Documentation for devlink health reporters for NPA block.
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
of the churn behind us. Significant stuff in this pull includes:
- A set of new Chinese translations
- Italian translation updates
- A mechanism from Mauro to automatically format Documentation/features
for the built docs
- Automatic cross references without explicit :ref: markup
- A new reset-controller document
- An extensive new document on reporting problems from Thorsten
That last patch also adds the CC-BY-4.0 license to LICENSES/dual; there was
some discussion on this, but we seem to have consensus and an ack from Greg
for that addition.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl/XyewPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YUeYH/AiNNlVIF/80T45TAkm+1kpy2Fb+d/5wbtGK
PB7OTXPyDmmqwZNldlF9IsRhp5W+wYC3PNlulYMG44hT7/Jqf2CMFw8SOZqGLmBV
LhWwoS+TAWLB19IOOMrVXbhAlNsX01NwBDY/dwONjW1Jcu+tuAsBR47T9lKjw4kJ
qGFGMQTvZG9Ig1x7E6X38mAd7W3SD1viNuUePS2YcoB15GAocWfVVHvu1r+RHUTS
27ET8tWzMMuiaCAD6toVY9L4T7iCI7YSPXQm8BOkf/f4LXDnpo8Fo11LE5ozTAh3
+avnNt8vnrRXc06MnzwsvNHm2TqN97B4shkeDiPAV3ySXI8Zu/w=
=HScX
-----END PGP SIGNATURE-----
Merge tag 'docs-5.11' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"A much quieter cycle for documentation (happily), with, one hopes, the
bulk of the churn behind us. Significant stuff in this pull includes:
- A set of new Chinese translations
- Italian translation updates
- A mechanism from Mauro to automatically format
Documentation/features for the built docs
- Automatic cross references without explicit :ref: markup
- A new reset-controller document
- An extensive new document on reporting problems from Thorsten
That last patch also adds the CC-BY-4.0 license to LICENSES/dual;
there was some discussion on this, but we seem to have consensus and
an ack from Greg for that addition"
* tag 'docs-5.11' of git://git.lwn.net/linux: (50 commits)
docs: fix broken cross reference in translations/zh_CN
docs: Note that sphinx 1.7 will be required soon
docs: update requirements to install six module
docs: reporting-issues: move 'outdated, need help' note to proper place
docs: Update documentation to reflect what TAINT_CPU_OUT_OF_SPEC means
docs: add a reset controller chapter to the driver API docs
docs: make reporting-bugs.rst obsolete
docs: Add a new text describing how to report bugs
LICENSES: Add the CC-BY-4.0 license
Documentation: fix multiple typos found in the admin-guide subdirectory
Documentation: fix typos found in admin-guide subdirectory
kernel-doc: Fix example in Nested structs/unions
docs: clean up sysctl/kernel: titles, version
docs: trace: fix event state structure name
docs: nios2: add missing ReST file
scripts: get_feat.pl: reduce table width for all features output
scripts: get_feat.pl: change the group by order
scripts: get_feat.pl: make complete table more coincise
scripts: kernel-doc: fix parsing function-like typedefs
Documentation: fix typos found in process, dev-tools, and doc-guide subdirectories
...
According to the X.25 documentation, there was a plan to implement
X.25-over-802.2-LLC. It never finished but left various code stubs in the
X.25 code. At this time it is unlikely that it would ever finish so it
may be better to remove those code stubs.
Also change the documentation to make it clear that this is not a ongoing
plan anymore. Change words like "will" to "could", "would", etc.
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201209033346.83742-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add documentation of the newly-added PPPIOCBRIDGECHAN and
PPPIOCUNBRIDGECHAN ioctls.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make various places which point to
Documentation/admin-guide/reporting-bugs.rst point to
Documentation/admin-guide/reporting-issues.rst instead. That document is
brand new and as of now is not completely finished. But even at this
stage it's a lot more helpful and accurate than reporting-bugs.rst.
Hence also add a note to reporting-bugs.rst, telling people they're
better off reading reporting-issues.rst instead.
reporting-bugs.rst is scheduled for removal once reporting-issues.rst
is considered ready.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/3df7c2d16de112b47bb6e6158138608e78562bf5.1607063223.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Trivial conflict in CAN, keep the net-next + the byteswap wrapper.
Conflicts:
drivers/net/can/usb/gs_usb.c
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make an explicit suggestion how to post user space side of kernel
patches to avoid reposts when patchwork groups the wrong patches.
v2: mention the cases unlike iproute2 explicitly
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a packet trap to report packets that were dropped due to a
blackhole nexthop.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The extension of struct can_frame with the len8_dlc element and the
can_dlc naming issue required an update of the documentation.
Additionally introduce the term 'Classical CAN' which has been established
by CAN in Automation to separate the original CAN2.0 A/B from CAN FD.
Updated some data structures and flags.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20201110101852.1973-7-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The helper functions can_len2dlc and can_dlc2len are only relevant for
CAN FD data length code (DLC) conversion.
To fit the introduced can_cc_dlc2len for Classical CAN we rename:
can_dlc2len -> can_fd_dlc2len to get the payload length from the DLC
can_len2dlc -> can_fd_len2dlc to get the DLC from the payload length
Suggested-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20201110101852.1973-6-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Use table markup to show the structure of the CAN identifier, PGN, PDU1, and
PDU2 formats. Also add introductory sentence.
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Link: https://lore.kernel.org/r/20201104155730.25196-1-yegorslists@googlemail.com
[mkl: removed trailing whitespace]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
commit f73659192b ("net: wan: Delete the DLCI / SDLA drivers")
deleted "Documentation/networking/framerelay.rst". However, it is still
referenced in "Documentation/networking/index.rst". We need to remove the
reference, too.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201118124226.15588-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The DLCI driver (dlci.c) implements the Frame Relay protocol. However,
we already have another newer and better implementation of Frame Relay
provided by the HDLC_FR driver (hdlc_fr.c).
The DLCI driver's implementation of Frame Relay is used by only one
hardware driver in the kernel - the SDLA driver (sdla.c).
The SDLA driver provides Frame Relay support for the Sangoma S50x devices.
However, the vendor provides their own driver (along with their own
multi-WAN-protocol implementations including Frame Relay), called WANPIPE.
I believe most users of the hardware would use the vendor-provided WANPIPE
driver instead.
(The WANPIPE driver was even once in the kernel, but was deleted in
commit 8db60bcf30 ("[WAN]: Remove broken and unmaintained Sangoma
drivers.") because the vendor no longer updated the in-kernel WANPIPE
driver.)
Cc: Mike McLagan <mike.mclagan@linux.org>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201114150921.685594-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move to the kernel.org patchwork instance, it has significantly
lower latency for accessing from Europe and the US. Other quirks
include the reply bot.
Link: https://lore.kernel.org/r/20201110035120.642746-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduced in 0eeb075fad, the "ignore_routes_with_linkdown" sysctl
ignores a route whose interface is down. It is provided as a
per-interface sysctl. However, while a "all" variant is exposed, it
was a noop since it was never evaluated. We use the usual "or" logic
for this kind of sysctls.
Tested with:
ip link add type veth # veth0 + veth1
ip link add type veth # veth1 + veth2
ip link set up dev veth0
ip link set up dev veth1 # link-status paired with veth0
ip link set up dev veth2
ip link set up dev veth3 # link-status paired with veth2
# First available path
ip -4 addr add 203.0.113.${uts#H}/24 dev veth0
ip -6 addr add 2001:db8:1::${uts#H}/64 dev veth0
# Second available path
ip -4 addr add 192.0.2.${uts#H}/24 dev veth2
ip -6 addr add 2001:db8:2::${uts#H}/64 dev veth2
# More specific route through first path
ip -4 route add 198.51.100.0/25 via 203.0.113.254 # via veth0
ip -6 route add 2001:db8:3::/56 via 2001:db8:1::ff # via veth0
# Less specific route through second path
ip -4 route add 198.51.100.0/24 via 192.0.2.254 # via veth2
ip -6 route add 2001:db8:3::/48 via 2001:db8:2::ff # via veth2
# H1: enable on "all"
# H2: enable on "veth0"
for v in ipv4 ipv6; do
case $uts in
H1)
sysctl -qw net.${v}.conf.all.ignore_routes_with_linkdown=1
;;
H2)
sysctl -qw net.${v}.conf.veth0.ignore_routes_with_linkdown=1
;;
esac
done
set -xe
# When veth0 is up, best route is through veth0
ip -o route get 198.51.100.1 | grep -Fw veth0
ip -o route get 2001:db8:3::1 | grep -Fw veth0
# When veth0 is down, best route should be through veth2 on H1/H2,
# but on veth0 on H2
ip link set down dev veth1 # down veth0
ip route show
[ $uts != H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth0
[ $uts != H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth0
[ $uts = H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth2
[ $uts = H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth2
Without this patch, the two last lines would fail on H1 (the one using
the "all" sysctl). With the patch, everything succeeds as expected.
Also document the sysctl in `ip-sysctl.rst`.
Fixes: 0eeb075fad ("net: ipv4 sysctl option to ignore routes when nexthop link is down")
Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
and netfilter subtrees.
Current release - bugs in new features:
- can: isotp: isotp_rcv_cf(): enable RX timeout handling in
listen-only mode
Previous release - regressions:
- mac80211:
- don't require VHT elements for HE on 2.4 GHz
- fix regression where EAPOL frames were sent in plaintext
- netfilter:
- ipset: Update byte and packet counters regardless of whether
they match
- ip_tunnel: fix over-mtu packet send by allowing fragmenting even
if inner packet has IP_DF (don't fragment) set in its header
(when TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)
- net: fec: fix MDIO probing for some FEC hardware blocks
- ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break
gso support
- sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
platforms, sparse-related fix used the wrong integer size
Previous release - always broken:
- netfilter: use actual socket sk rather than skb sk when routing
harder
- r8169: work around short packet hw bug on RTL8125 by padding frames
- net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
advertisement, the hardware does not support it
- chelsio/chtls: fix always leaking ctrl_skb and another leak caused
by a race condition
- fix drivers incorrectly writing into skbs on TX:
- cadence: force nonlinear buffers to be cloned
- gianfar: Account for Tx PTP timestamp in the skb headroom
- gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
- can: flexcan:
- remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
- add ECC initialization for VF610 and LX2160A
- flexcan_remove(): disable wakeup completely
- can: fix packet echo functionality:
- peak_canfd: fix echo management when loopback is on
- make sure skbs are not freed in IRQ context in case they need
to be dropped
- always clone the skbs to make sure they have a reference on
the socket, and prevent it from disappearing
- fix real payload length return value for RTR frames
- can: j1939: return failure on bind if netdev is down, rather than
waiting indefinitely
Misc:
- IPv6: reply ICMP error if the first fragment don't include all
headers to improve compliance with RFC 8200
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+kTDcACgkQMUZtbf5S
IrtC9A//f9rwNFI7sRaz9FYi6ljtWY7paPxdOxy3pWRoNzbfffjTGSPheNvy1pQb
IPaLsNwRrckQNSEPTbQqlUYcjzk1W74ffvq0sQOan4kNKxjX3uf78E6RuWARJsRC
dLqfcJctO6bFi6sEMwIFZ2tLOO5lUIA+Pd0GbjhSdObWzl3uqJ26v7wC6vVk29vS
116Mmhe8/TDVtCOzwlZnBPHqBJkTAirB+MAEX4Sp6FB9YirlcNZbWyHX5L6ejGqC
WQVjU2tPBBugeo0j72tc+y0mD3iK0aLcPL+dk0EQQYHRDMVTebl+gxNPUXCo9Out
HGe5z4e4qrR4Rx1W6MQ3pKwTYuCdwKjMRGd72JAi428/l4NN3y9W/HkI2Zuppd2l
7ifURkNQllYjGCSoHBviJbajyFBeA1nkFJgMSJiRs4T167K3zTbsyjNnfa4LnsvS
B3SrYMGqIH+oR20R9EoV8prVX+Alj1hh/jX02J8zsCcHmBqF2yZi17NarVAWoarm
v/AAqehlP+D1vjAmbCG9DeborrjaNi+v6zFTKK6ZadvLXRJX/N+wEPIpG4KjiK8W
DWKIVlee0R+kgCXE1n9AuZaZLWb7VwrAjkG1Pmfi3vkZhWeAhOW4X98ehhi/hVR/
Gq+e48ZECW5yuOA1q4hbsCYkGr2qAn/LPbsXxhEmW8qwkJHZYkI=
=5R2w
-----END PGP SIGNATURE-----
Merge tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.10-rc3, including fixes from wireless, can, and
netfilter subtrees.
Current merge window - bugs in new features:
- can: isotp: isotp_rcv_cf(): enable RX timeout handling in
listen-only mode
Previous releases - regressions:
- mac80211:
- don't require VHT elements for HE on 2.4 GHz
- fix regression where EAPOL frames were sent in plaintext
- netfilter:
- ipset: Update byte and packet counters regardless of whether
they match
- ip_tunnel: fix over-mtu packet send by allowing fragmenting even if
inner packet has IP_DF (don't fragment) set in its header (when
TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)
- net: fec: fix MDIO probing for some FEC hardware blocks
- ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break gso
support
- sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
platforms, sparse-related fix used the wrong integer size
Previous releases - always broken:
- netfilter: use actual socket sk rather than skb sk when routing
harder
- r8169: work around short packet hw bug on RTL8125 by padding frames
- net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
advertisement, the hardware does not support it
- chelsio/chtls: fix always leaking ctrl_skb and another leak caused
by a race condition
- fix drivers incorrectly writing into skbs on TX:
- cadence: force nonlinear buffers to be cloned
- gianfar: Account for Tx PTP timestamp in the skb headroom
- gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
- can: flexcan:
- remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
- add ECC initialization for VF610 and LX2160A
- flexcan_remove(): disable wakeup completely
- can: fix packet echo functionality:
- peak_canfd: fix echo management when loopback is on
- make sure skbs are not freed in IRQ context in case they need to
be dropped
- always clone the skbs to make sure they have a reference on the
socket, and prevent it from disappearing
- fix real payload length return value for RTR frames
- can: j1939: return failure on bind if netdev is down, rather than
waiting indefinitely
Misc:
- IPv6: reply ICMP error if the first fragment don't include all
headers to improve compliance with RFC 8200"
* tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
ionic: check port ptr before use
r8169: work around short packet hw bug on RTL8125
net: openvswitch: silence suspicious RCU usage warning
chelsio/chtls: fix always leaking ctrl_skb
chelsio/chtls: fix memory leaks caused by a race
can: flexcan: flexcan_remove(): disable wakeup completely
can: flexcan: add ECC initialization for VF610
can: flexcan: add ECC initialization for LX2160A
can: flexcan: remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
can: mcp251xfd: remove unneeded break
can: mcp251xfd: mcp251xfd_regmap_nocrc_read(): fix semicolon.cocci warnings
can: mcp251xfd: mcp251xfd_regmap_crc_read(): increase severity of CRC read error messages
can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is on
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
can: peak_usb: add range checking in decode operations
can: xilinx_can: handle failure cases of pm_runtime_get_sync
can: ti_hecc: ti_hecc_probe(): add missed clk_disable_unprepare() in error path
can: isotp: padlen(): make const array static, makes object smaller
can: isotp: isotp_rcv_cf(): enable RX timeout handling in listen-only mode
can: isotp: Explain PDU in CAN_ISOTP help text
...
The Spectrum ASIC has a dedicated table where nexthops (i.e., adjacency
entries) are populated. The size of this table can be controlled via
devlink-resource.
Add such a resource to netdevsim so that its occupancy will reflect the
number of nexthop objects currently programmed to the device.
By limiting the size of the resource, error paths could be exercised and
tested.
Example output:
# devlink resource show netdevsim/netdevsim10
netdevsim/netdevsim10:
name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
resources:
name fib size unlimited occ 4 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name fib-rules size unlimited occ 3 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name IPv6 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
resources:
name fib size unlimited occ 1 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name fib-rules size unlimited occ 2 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
name nexthops size unlimited occ 0 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables none
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Describe the two MPTCP sysctls, what the values mean, and the default
settings.
Acked-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Documentation references Samsung S3C24xx and S3C64xx machine files in
multiple places but the files were traveling around the kernel multiple
times.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200911143343.498-1-krzk@kernel.org
This patch is to enable udp tunneling socks by calling
sctp_udp_sock_start() in sctp_ctrlsock_init(), and
sctp_udp_sock_stop() in sctp_ctrlsock_exit().
Also add sysctl udp_port to allow changing the listening
sock's port by users.
Wit this patch, the whole sctp over udp feature can be
enabled and used.
v1->v2:
- Also update ctl_sock udp_port in proc_sctp_do_udp_port()
where netns udp_port gets changed.
v2->v3:
- Call htons() when setting sk udp_port from netns udp_port.
v3->v4:
- Not call sctp_udp_sock_start() when new_value is 0.
- Add udp_port entry in ip-sysctl.rst.
v4->v5:
- Not call sctp_udp_sock_start/stop() in sctp_ctrlsock_init/exit().
- Improve the description of udp_port in ip-sysctl.rst.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
encap_port is added as per netns/sock/assoc/transport, and the
latter one's encap_port inherits the former one's by default.
The transport's encap_port value would mostly decide if one
packet should go out with udp encapsulated or not.
This patch also allows users to set netns' encap_port by sysctl.
v1->v2:
- Change to define encap_port as __be16 for sctp_sock, asoc and
transport.
v2->v3:
- No change.
v3->v4:
- Add 'encap_port' entry in ip-sysctl.rst.
v4->v5:
- Improve the description of encap_port in ip-sysctl.rst.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are no known users of this driver as of October 2020, and it will
be removed unless someone turns out to still need it in future releases.
According to https://en.wikipedia.org/wiki/List_of_WiMAX_networks, there
have been many public wimax networks, but it appears that many of these
have migrated to LTE or discontinued their service altogether.
As most PCs and phones lack WiMAX hardware support, the remaining
networks tend to use standalone routers. These almost certainly
run Linux, but not a modern kernel or the mainline wimax driver stack.
NetworkManager appears to have dropped userspace support in 2015
https://bugzilla.gnome.org/show_bug.cgi?id=747846, the
www.linuxwimax.org
site had already shut down earlier.
WiMax is apparently still being deployed on airport campus networks
("AeroMACS"), but in a frequency band that was not supported by the old
Intel 2400m (used in Sandy Bridge laptops and earlier), which is the
only driver using the kernel's wimax stack.
Move all files into drivers/staging/wimax, including the uapi header
files and documentation, to make it easier to remove it when it gets
to that. Only minimal changes are made to the source files, in order
to make it possible to port patches across the move.
Also remove the MAINTAINERS entry that refers to a broken mailing
list and website.
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-By: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Suggested-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Changeset 410d06879c ("ice: add the DDP Track ID to devlink info")
added description for a new devlink field, but forgot to add
one of its columns, causing it to break:
.../Documentation/networking/devlink/ice.rst:15: WARNING: Error parsing content block for the "list-table" directive: uniform two-level bullet list expected, but row 11 does not contain the same number of items as row 1 (3 vs 4).
.. list-table:: devlink info versions implemented
:widths: 5 5 5 90
...
* - ``fw.app.bundle_id``
- 0xc0000001
- Unique identifier for the DDP package loaded in the device. Also
referred to as the DDP Track ID. Can be used to uniquely identify
the specific DDP package.
Add the type field to the ``fw.app.bundle_id`` row.
Fixes: 410d06879c ("ice: add the DDP Track ID to devlink info")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/84ae28bda1987284033966b7b56a4b27ae40713b.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
include/linux/ethtool.h is included twice with kernel-doc,
both to document ethtool_pause_stats(). The first one is
at statistics.rst, and the second one at ethtool-netlink.rst.
Replace one of the references to use the name of the
function. The automarkup.py extension should create the
cross-references.
Solves this warning:
../Documentation/networking/ethtool-netlink.rst: WARNING: Duplicate C declaration, also defined in 'networking/statistics'.
Declaration is 'ethtool_pause_stats'.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/r/fdbf853bbdaf3bc1d38f32744b739d175c5c31f5.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>