The EEE support has not been enabled on ENETC, but it may connect
to a PHY which supports EEE and advertises EEE by default, while
its link partner also advertises EEE. If this happens, the PHY enters
low power mode when the traffic rate is low and causes packet loss.
This patch disables EEE advertisement by default for any PHY that
ENETC connects to, to prevent the above unwanted outcome.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2019-12-05
The following pull-request contains BPF updates for your *net* tree.
We've added 6 non-merge commits during the last 1 day(s) which contain
a total of 14 files changed, 116 insertions(+), 37 deletions(-).
The main changes are:
1) three selftests fixes, from Stanislav.
2) one samples fix, from Jesper.
3) one verifier fix, from Yonghong.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
sock_fprog_kern::len is in units of struct sock_filter, not bytes.
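To illustrate the unit mismatch, here is a small userspace sketch in Python (not the kernel code). It assumes the usual Linux layout of struct sock_filter, a 16-bit code, two 8-bit jump offsets and a 32-bit constant, which is 8 bytes per instruction. The helper name filter_bytes is hypothetical.

```python
import struct

# Assumed layout of struct sock_filter: __u16 code, __u8 jt, __u8 jf, __u32 k.
SOCK_FILTER_FMT = "=HBBI"
SOCK_FILTER_SIZE = struct.calcsize(SOCK_FILTER_FMT)


def filter_bytes(n_insns: int) -> int:
    """Bytes occupied by a filter of n_insns instructions (hypothetical helper)."""
    return n_insns * SOCK_FILTER_SIZE
```

Treating a length counted in instructions as a byte count is therefore off by a factor of the structure size.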
Fixes: 3e859adf3643 ("compat_ioctl: unify copy-in of ppp filters")
Reported-by: syzbot+eb853b51b10f1befa0b7@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan says:
====================
net: hns3: fixes for -net
This patchset includes misc fixes for the HNS3 ethernet driver.
[patch 1/3] fixes a TX queue not restarted problem.
[patch 2/3] fixes a use-after-free issue.
[patch 3/3] fixes a VF ID issue for setting VF VLAN.
change log:
V1->V2: keeps 'ring' as parameter in hns3_nic_maybe_stop_tx()
in [patch 1/3], suggested by David.
rewrites [patch 2/3]'s commit log to make it easier
to understand, suggested by David.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, when setting a VF VLAN with the command "ip link set <pf name>
vf <vf id> vlan <vlan id>", VF ID 0 was incorrectly handled as the PF,
although it should be the first VF. This patch fixes it.
Fixes: 21e043cd8124 ("net: hns3: fix set port based VLAN for PF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, hns3_nic_maybe_stop_tx() uses skb_copy() to linearize an
SKB if the BD num required by the SKB does not meet the hardware
limitation; it does so by allocating a new linearized SKB and freeing
the old one. If hns3_nic_maybe_stop_tx() then returns -EBUSY because
there is not enough space in the ring to send the linearized SKB to
hardware, sch_direct_xmit() still holds a reference to the old SKB
and tries to retransmit it when dev_hard_start_xmit() returns
TX_BUSY, which may cause a use-after-free.
This patch fixes it by using __skb_linearize() to linearize the
SKB in hns3_nic_maybe_stop_tx().
Fixes: 51e8439f3496 ("net: hns3: add 8 BD limit for tx flow")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a timing window between the ring_space check and
netif_stop_subqueue when transmitting an SKB, and the TX BD
cleaning may be executed during that window, which may
cause the TX queue not to be restarted.
This patch fixes it by rechecking the ring_space after
netif_stop_subqueue to make sure the TX queue is restarted.
Also, ring->next_to_clean is updated even when pkts is
zero, because all the cleaned TX BDs may be non-SKB, so the
code needs to check whether the TX queue needs to be restarted.
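The stop-then-recheck pattern can be sketched in Python as a toy model; this is not the hns3 driver code. TxRing, maybe_stop_tx and the in_window callback are hypothetical names. The callback stands in for TX BD cleaning that happens to run inside the timing window.

```python
class TxRing:
    """Toy stand-in for a TX ring; not the hns3 data structure."""

    def __init__(self, free_bds):
        self.free_bds = free_bds
        self.stopped = False


def maybe_stop_tx(ring, needed, in_window=None):
    """Return True if the caller may transmit, False if it must back off."""
    if ring.free_bds >= needed:
        return True
    ring.stopped = True          # analogous to netif_stop_subqueue()
    if in_window is not None:
        in_window(ring)          # cleaning that runs inside the window
    if ring.free_bds >= needed:  # recheck after stopping the queue
        ring.stopped = False     # restart here; the cleaner already ran
        return True
    return False
```

Without the second check, a cleaner that finishes inside the window would leave the queue stopped with nothing left to wake it.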
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "select NET_SWITCHDEV" with "depends on NET_SWITCHDEV" to fix the
Kconfig warning seen with CONFIG_COMPILE_TEST=y:
WARNING: unmet direct dependencies detected for NET_SWITCHDEV
Depends on [n]: NET [=y] && INET [=n]
Selected by [y]:
- TI_CPSW_SWITCHDEV [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && (ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST [=y])
because TI_CPSW_SWITCHDEV blindly selects NET_SWITCHDEV even though
INET is not set/enabled, while NET_SWITCHDEV depends on INET.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2019-12-05
please apply the following fixes to your net tree.
The first two patches target the RX data path, the third fixes a memory
leak when shutting down a qeth device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The cio layer's intparm logic does not align itself well with how qeth
manages cmd IOs. When an active IO gets terminated via halt/clear, the
corresponding IRQ's intparm does not reflect the cmd buffer but rather
the intparm that was passed to ccw_device_halt() / ccw_device_clear().
This behaviour was recently clarified in
commit b91d9e67e50b ("s390/cio: fix intparm documentation").
As a result, qeth_irq() currently doesn't cancel a cmd that was
terminated via halt/clear. This primarily causes us to leak
card->read_cmd after the qeth device is removed, since our IO path still
holds a refcount for this cmd.
For qeth this means that we need to keep track of which IO is pending on
a device ('active_cmd'), and use this as the intparm when calling
halt/clear. Otherwise qeth_irq() can't match the subsequent IRQ to its
cmd buffer.
Since we now keep track of the _expected_ intparm, we can also detect
any mismatch; this would constitute a bug somewhere in the lower layers.
In this case cancel the active cmd - we effectively "lost" the IRQ and
should not expect any further notification for this IO.
Fixes: 405548959cc7 ("s390/qeth: add support for dynamically allocated cmds")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the RX path builds non-linear skbs, the packet headers can
currently spill over into page fragments. Depending on the packet type
and what fields we need to access in the headers, this could cause us
to go past the end of skb->data.
So for non-linear packets, copy precisely the length of the necessary
headers ('linear_len') into skb->data.
And don't copy more; upper-level protocols will peel whatever additional
packet headers they need.
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Depending on a packet's type, the RX path needs to access fields in the
packet headers and thus requires a minimum packet length.
Enforce this length when building the skb.
On the other hand a single runt packet is no reason to drop the whole
RX buffer. So just skip it, and continue processing on the next packet.
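The skip-and-continue behaviour can be sketched in Python as a simplified model, not the qeth RX path. The name build_skbs and the idea of a single min_len value are assumptions for illustration; the text above says the required minimum depends on the packet type.

```python
def build_skbs(packets, min_len):
    """Keep packets of at least min_len bytes; skip runts and keep going."""
    accepted = []
    skipped = 0
    for pkt in packets:
        if len(pkt) < min_len:
            skipped += 1   # drop only this runt packet
            continue       # carry on with the next packet in the buffer
        accepted.append(pkt)
    return accepted, skipped
```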
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 2b3e88ea6528 ("net: phy: improve phy state checking")
phy_start_aneg() expects phy state to be >= PHY_UP. Call phy_start()
before calling phy_start_aneg() during probe so that autonegotiation
is initiated.
As phy_start() takes care of calling phy_start_aneg(), drop the explicit
call to phy_start_aneg().
Network fails without this patch on Octeon TX.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing fexit_bpf2bpf test covers the target program with callees.
This patch adds a test for the target program without callees.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191205010607.177904-1-yhs@fb.com
For a jited bpf program, if the subprogram count is 1, i.e.,
there are no callees in the program, prog->aux->func will be NULL
and prog->bpf_func points to the image address of the program.
If there is more than one subprogram, prog->aux->func is populated,
and subprogram 0 can be accessed through either prog->bpf_func or
prog->aux->func[0]. Other subprograms should be accessed through
prog->aux->func[subprog_id].
This patch fixes a bug in check_attach_btf_id(), where
prog->aux->func[subprog_id] was used to access any subprogram, which
caused a segfault like the one below:
[79162.619208] BUG: kernel NULL pointer dereference, address:
0000000000000000
......
[79162.634255] Call Trace:
[79162.634974] ? _cond_resched+0x15/0x30
[79162.635686] ? kmem_cache_alloc_trace+0x162/0x220
[79162.636398] ? selinux_bpf_prog_alloc+0x1f/0x60
[79162.637111] bpf_prog_load+0x3de/0x690
[79162.637809] __do_sys_bpf+0x105/0x1740
[79162.638488] do_syscall_64+0x5b/0x180
[79162.639147] entry_SYSCALL_64_after_hwframe+0x44/0xa9
......
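The access rule described above can be modelled in a few lines of Python. This is only a sketch, not the verifier code: addresses are plain integers, func is either None or a list, and the function name subprog_addr is hypothetical.

```python
from typing import List, Optional


def subprog_addr(bpf_func: int, func: Optional[List[int]], subprog: int) -> int:
    """Entry address of a subprogram, following the rule described above."""
    if func is None:
        # Single-subprogram case: only subprogram 0 exists.
        if subprog != 0:
            raise IndexError("program has no callees")
        return bpf_func
    return func[subprog]
```

Indexing func unconditionally is the pattern that fails when the program has no callees.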
Fixes: 5b92a28aae4d ("bpf: Support attaching tracing BPF program to other BPF programs")
Reported-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191205010606.177774-1-yhs@fb.com
It looks like the BPF program that handles the BPF_SOCK_OPS_STATE_CB state
can race with the bpf_map_lookup_elem("global_map"); I sometimes
see failures in this test, and re-running helps.
Since we know that we expect the callback to be called 3 times (one
time for listener socket, two times for both ends of the connection),
let's export this number and add simple retry logic around that.
Also, let's make EXPECT_EQ() not return on failure, but continue
evaluating all conditions; that should make potential debugging
easier.
With this fix in place I don't observe the flakiness anymore.
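The retry idea can be sketched in Python; the selftest itself is written in C, so this is only a model. The name wait_for_count, the retry limit and the delay are all assumptions. read_count stands in for reading the exported callback counter.

```python
import time


def wait_for_count(read_count, expected, retries=10, delay=0.0):
    """Poll read_count() until it returns expected or the retries run out."""
    for _ in range(retries):
        if read_count() == expected:
            return True
        time.sleep(delay)
    return False
```

With an expected value of 3, the caller keeps polling until all three callbacks have been observed, instead of failing on the first read.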
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Lawrence Brakmo <brakmo@fb.com>
Link: https://lore.kernel.org/bpf/20191204190955.170934-1-sdf@google.com
Commit 5c26f9a78358 ("libbpf: Don't use cxx to test_libpf target")
converted existing c++ test to c. We still want to include and
link against libbpf from c++ code, so reinstate this test back,
this time in a form of a selftest with a clear comment about
its purpose.
v2:
* -lelf -> $(LDLIBS) (Andrii Nakryiko)
Fixes: 5c26f9a78358 ("libbpf: Don't use cxx to test_libpf target")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191202215931.248178-1-sdf@google.com
Commit 40430452fd5d ("kernfs: use 64bit inos if ino_t is 64bit") changed
the way cgroup ids are exposed to the userspace. Instead of assuming
fixed root id, let's query it.
Fixes: 40430452fd5d ("kernfs: use 64bit inos if ino_t is 64bit")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191202200143.250793-1-sdf@google.com
In the days of using bpf_load.c, the order in which the 'maps' sections
were defined in the BPF-side (*_kern.c) file was used by the userspace side
to identify a map, using the map order as an index. In effect, the
order-index is created based on the order the maps sections are stored
in the ELF-object file by the LLVM compiler.
This has also carried over into libbpf via the API bpf_map__next(NULL, obj),
which extracts maps in the order libbpf parsed the ELF-object file.
When BTF-based maps were introduced, a new section type ".maps" was
created. I found that the LLVM compiler doesn't create the ".maps"
sections in the order they are defined in the C-file. The order in the
ELF file is based on the order the map pointer is referenced in the code.
This combination of changes led to xdp_rxq_info mixing up the map
file-descriptors in userspace, resulting in very broken behaviour, but
without warning the user.
This patch fixes the issue by instead using bpf_object__find_map_by_name()
to find maps via their names. (Note, this is the ELF name, which can
be longer than the name the kernel retains).
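The difference between index-based and name-based lookup can be shown with a toy Python model; it does not use libbpf. The map names map_a and map_b, the fd numbers and the helper find_map_by_name are all made up for illustration.

```python
def find_map_by_name(maps, name):
    """Return the fd of the map called name, or None (toy name lookup)."""
    for map_name, fd in maps:
        if map_name == name:
            return fd
    return None


# Order in which the maps are defined in the C source (hypothetical names).
source_order = ["map_a", "map_b"]

# Order found in the ELF file, which may differ from the source order.
elf_order = [("map_b", 4), ("map_a", 3)]

# Index-based lookup assumes position 0 is map_a and picks the wrong fd.
fd_by_index = elf_order[source_order.index("map_a")][1]

# Name-based lookup does not depend on section order.
fd_by_name = find_map_by_name(elf_order, "map_a")
```

In this model fd_by_index refers to map_b, while fd_by_name refers to map_a as intended.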
Fixes: be5bca44aa6b ("samples: bpf: convert some XDP samples from bpf_load to libbpf")
Fixes: 451d1dc886b5 ("samples: bpf: update map definition to new syntax BTF-defined map")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157529025128.29832.5953245340679936909.stgit@firesoul
The skb_mpls_push was not updating ethertype of an ethernet packet if
the packet was originally received from a non ARPHRD_ETHER device.
In the below OVS data path flow, since the device corresponding to
port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_push function does
not update the ethertype of the packet even though the previous
push_eth action had added an ethernet header to the packet.
recirc_id(0),in_port(7),eth_type(0x0800),ipv4(tos=0/0xfc,ttl=64,frag=no),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
push_mpls(label=13,tc=0,ttl=64,bos=1,eth_type=0x8847),4
Fixes: 8822e270d697 ("net: core: move push MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a recent change to the SPI subsystem [1], a new `delay` struct was added
to replace the `delay_usecs`. This change replaces the current `delay_usecs`
with `delay` for this driver.
The `spi_transfer_delay_exec()` function [in the SPI framework] makes sure
that both `delay_usecs` & `delay` are used (in this order to preserve
backwards compatibility).
[1] commit bebcfd272df6485 ("spi: introduce `delay` field for
`spi_transfer` + spi_transfer_delay_exec()")
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The referenced commit below allowed more than one hwmon device to be
created per SFP, which is definitely not what we want. Avoid this by
only creating the hwmon device just as we transition to WAITDEV state.
Fixes: 139d3a212a1f ("net: sfp: allow modules with slow diagnostics to probe")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When unbinding, we don't correctly tear down the module state, leaving
(for example) the hwmon registration behind. Ensure everything is
properly removed by sending a remove event at unbind.
Fixes: 6b0da5c9c1a3 ("net: sfp: track upstream's attachment state in state machine")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user has specified their own RSS hash key, don't
lose it across queue resets such as DOWN/UP, MTU change,
and number of channels change. This is fixed by moving
the key initialization to a little earlier in the lif
creation.
Also, let's clean up the RSS config a little better on
the way down by setting it all to 0.
Fixes: aa3198819bea ("ionic: Add RSS support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
A lockdep splat was observed when trying to remove an xdp memory
model from the table since the mutex was obtained when trying to
remove the entry, but not before the table walk started.
Fix the splat by obtaining the lock before starting the table walk.
Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The act_ct TC module shares a common conntrack and NAT infrastructure
exposed via netfilter. It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision. Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress. The act_ct action doesn't have such capability.
Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The openvswitch module shares a common conntrack and NAT infrastructure
exposed via netfilter. It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision. Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress. The openvswitch module doesn't have such capability.
Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.
Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca says:
====================
net: convert ipv6_stub to ip6_dst_lookup_flow
Xiumei Mu reported a bug in a VXLAN over IPsec setup:
IPv6 | ESP | VXLAN
Using this setup, packets go out unencrypted, because VXLAN over IPv6
gets its route from ipv6_stub->ipv6_dst_lookup (in vxlan6_get_route),
which doesn't perform an XFRM lookup.
This patchset first makes ip6_dst_lookup_flow suitable for some
existing users of ipv6_stub->ipv6_dst_lookup by adding a 'net'
argument, then converts all those users.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_stub uses the ip6_dst_lookup function to allow other modules to
perform IPv6 lookups. However, this function skips the XFRM layer
entirely.
All users of ipv6_stub->ipv6_dst_lookup use ip_route_output_flow (via the
ip_route_output_key and ip_route_output helpers) for their IPv4 lookups,
which calls xfrm_lookup_route(). This patch fixes this inconsistent
behavior by switching the stub to ip6_dst_lookup_flow, which also calls
xfrm_lookup_route().
This requires some changes in all the callers, as these two functions
take different arguments and have different return types.
Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
as some modules currently pass a net argument without a socket to
ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change
ipv6_stub_impl.ipv6_dst_lookup to take net argument").
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent commit 5c72299fba9d ("net: sched: cls_flower: Classify
packets using port ranges") added filtering based on port ranges
to tc flower. However, the commit missed the necessary changes in the
hw-offload code, so the feature generated incorrect offloaded flow
keys in the NIC.
One more detailed example is below:
$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0 ingress protocol ip flower ip_proto tcp \
dst_port 100-200 action drop
With the setup above, an exact match filter with dst_port == 0 will be
installed in NIC by hw-offload. IOW, the NIC will have a rule which is
equivalent to the following one.
$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0 ingress protocol ip flower ip_proto tcp \
dst_port 0 action drop
The behavior was caused by the flow dissector which extracts packet
data into the flow key in the tc flower. More specifically, regardless
of exact match or specified port ranges, fl_init_dissector() set the
FLOW_DISSECTOR_KEY_PORTS flag in struct flow_dissector to extract port
numbers from skb in skb_flow_dissect() called by fl_classify(). Note
that device drivers received the same struct flow_dissector object as
used in skb_flow_dissect(). Thus, offloaded drivers could not identify
which of these is used because the FLOW_DISSECTOR_KEY_PORTS flag was
set to struct flow_dissector in either case.
This patch adds the new FLOW_DISSECTOR_KEY_PORTS_RANGE flag and the new
tp_range field in struct fl_flow_key to recognize which filters are applied
to offloaded drivers. At this point, when filters based on port ranges
are passed to drivers, the drivers return the EOPNOTSUPP error because they
do not support the feature (the newly created FLOW_DISSECTOR_KEY_PORTS_RANGE
flag).
Fixes: 5c72299fba9d ("net: sched: cls_flower: Classify packets using port ranges")
Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sch->q.len hasn't been set if the subqueue is a NOLOCK qdisc
in mq_dump() and mqprio_dump().
Fixes: ce679e8df7ed ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio")
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears linux-4.14 stable needs a backport of commit
88f8598d0a30 ("tcp: exit if nothing to retransmit on RTO timeout")
Since tcp_rtx_queue_empty() is not in pre 4.15 kernels,
let's refactor tcp_retransmit_timer() to only use tcp_rtx_queue_head()
I will provide to stable teams the squashed patches.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to filling the node_guid and port_guid attributes,
there is a need to populate the VF index too, otherwise users of the netlink
interface will see the same VF index for all VFs.
Fixes: 30aad41721e0 ("net/core: Add support for getting VF GUIDs")
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have an interesting memory leak in the bridge when it is being
unregistered and is a slave to a master device which would change the
mac of its slaves on unregister (e.g. bond, team). This is a very
unusual setup but we do end up leaking 1 fdb entry because
dev_set_mac_address() would cause the bridge to insert the new mac address
into its table after all fdbs are flushed. That is, after dellink() on the
bridge has finished and we call NETDEV_UNREGISTER, the bond/team
releases it and calls dev_set_mac_address() to restore its original
address, which in turn adds an fdb in the bridge.
One fix is to check for the bridge dev's reg_state in its
ndo_set_mac_address callback and return an error if the bridge is not in
NETREG_REGISTERED.
Easy steps to reproduce:
1. add bond in mode != A/B
2. add any slave to the bond
3. add bridge dev as a slave to the bond
4. destroy the bridge device
Trace:
unreferenced object 0xffff888035c4d080 (size 128):
comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
hex dump (first 32 bytes):
41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00 A..6............
d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00 ...^?...........
backtrace:
[<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f
[<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge]
[<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge]
[<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge]
[<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge]
[<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge]
[<000000006846a77f>] dev_set_mac_address+0x63/0x9b
[<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding]
[<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding]
[<00000000305d7795>] notifier_call_chain+0x38/0x56
[<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23
[<000000008279477b>] rollback_registered_many+0x353/0x6a4
[<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f
[<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43
[<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a
[<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268
Fixes: 43598813386f ("bridge: add local MAC address to forwarding table (v2)")
Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to free "dev->name_node" on this error path.
Fixes: ff92741270bf ("net: introduce name_node struct to be used in hashlist")
Reported-by: syzbot+6e13e65ffbaa33757bcb@syzkaller.appspotmail.com
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'linux-can-fixes-for-5.5-20191203' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2019-12-03
this is a pull request of 6 patches for net/master.
The first two patches are against the MAINTAINERS file and add Appana
Durga Kedareswara rao as maintainer for the xilinx-can driver and Sriram
Dash for the m_can (mmio) driver.
The next patch is by Jouni Hogander and fixes a use-after-free in the
slcan driver.
Johan Hovold's patch for the ucan driver fixes the non-atomic allocation
in the completion handler.
The last two patches target the xilinx-can driver. The first one is by
Venkatesh Yadav Abbarapu and skips the error message on deferred probe,
the second one is by Srinivas Neeli and fixes the usage of the skb after
can_put_echo_skb().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
As per the Linux CAN framework, a driver is not allowed to touch the skb
memory after the can_put_echo_skb() call.
This patch fixes that.
https://www.spinics.net/lists/linux-can/msg02199.html
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Reviewed-by: Appana Durga Kedareswara Rao <appana.durga.rao@xilinx.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
When the CAN bus clock is provided by the clock wizard, the clock wizard
driver may not be available when the CAN driver probes, resulting in the
error message "bus clock not found error".
As this error message is not very useful to the end user, skip printing
in the case of deferred probe.
Signed-off-by: Venkatesh Yadav Abbarapu <venkatesh.abbarapu@xilinx.com>
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Appana Durga Kedareswara Rao <appana.durga.rao@xilinx.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
USB completion handlers are called in atomic context and must
specifically not allocate memory using GFP_KERNEL.
Fixes: 9f2d3eae88d2 ("can: ucan: add driver for Theobroma Systems UCAN devices")
Cc: stable <stable@vger.kernel.org> # 4.19
Cc: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Cc: Martin Elshuber <martin.elshuber@theobroma-systems.com>
Cc: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
slcan_open() doesn't remove a device whose registration failed from the
slcan_devs device list. On the next open this list is iterated and the freed
device is accessed. Fix this by calling slc_free_netdev() in the error path.
drivers/net/can/slcan.c is derived from slip.c. A use-after-free error was
identified in slip_open() by syzbot. The same bug is in slcan.c. Here is the
trace from the syzbot slip report:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
__kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:634
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
sl_sync drivers/net/slip/slip.c:725 [inline]
slip_open+0xecd/0x11b7 drivers/net/slip/slip.c:801
tty_ldisc_open.isra.0+0xa3/0x110 drivers/tty/tty_ldisc.c:469
tty_set_ldisc+0x30e/0x6b0 drivers/tty/tty_ldisc.c:596
tiocsetd drivers/tty/tty_io.c:2334 [inline]
tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2594
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: ed50e1600b44 ("slcan: Fix memory leak in error path")
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: David Miller <davem@davemloft.net>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.4
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Since we are actively working on MMIO MCAN device driver,
as discussed with Marc, I am adding myself as a maintainer.
Signed-off-by: Sriram Dash <sriram.dash@samsung.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The skb_mpls_pop was not updating ethertype of an ethernet packet if the
packet was originally received from a non ARPHRD_ETHER device.
In the below OVS data path flow, since the device corresponding to port 7
is an l3 device (ARPHRD_NONE) the skb_mpls_pop function does not update
the ethertype of the packet even though the previous push_eth action had
added an ethernet header to the packet.
recirc_id(0),in_port(7),eth_type(0x8847),
mpls(label=12/0xfffff,tc=0/0,ttl=0/0x0,bos=1/1),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
pop_mpls(eth_type=0x800),4
Fixes: ed246cee09b9 ("net: core: move pop MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This field has never been checked since its introduction in the mainline kernel.
Signed-off-by: Victorien Molle <victorien.molle@wifirst.fr>
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Fixes: 2db6dc2662ba ("sch_cake: Make gso-splitting configurable from userspace")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2019-12-02
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 6 day(s) which contain
a total of 10 files changed, 60 insertions(+), 51 deletions(-).
The main changes are:
1) Fix vmlinux BTF generation for binutils pre v2.25, from Stanislav Fomichev.
2) Fix libbpf global variable relocation to take symbol's st_value offset
into account, from Andrii Nakryiko.
3) Fix libbpf build on powerpc where check_abi target fails due to different
readelf output format, from Aurelien Jarno.
4) Don't set BPF insns RO for the case when they are JITed in order to avoid
fragmenting the direct map, from Daniel Borkmann.
5) Fix static checker warning in btf_distill_func_proto() as well as a build
error due to empty enum when BPF is compiled out, from Alexei Starovoitov.
6) Fix up generation of bpf_helper_defs.h for perf, from Arnaldo Carvalho de Melo.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
On powerpc with recent versions of binutils, readelf outputs an extra
field when dumping the symbols of an object file. For example:
35: 0000000000000838 96 FUNC LOCAL DEFAULT [<localentry>: 8] 1 btf_is_struct
The extra "[<localentry>: 8]" prevents the GLOBAL_SYM_COUNT variable from
being computed correctly and causes the check_abi target to fail.
Fix that by looking for the symbol name in the last field instead of the
8th one. This way it should also cope with future extra fields.
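The effect of reading the last field rather than the 8th one can be illustrated in Python. The actual fix lives in the libbpf build rules; this sketch only models the field selection, and the function name symbol_name is hypothetical. The second row is the powerpc example quoted above; the first is the same row without the extra field.

```python
def symbol_name(readelf_row: str) -> str:
    """Symbol name from one symbol-table row: the last whitespace-separated field."""
    return readelf_row.split()[-1]


plain_row = "35: 0000000000000838 96 FUNC LOCAL DEFAULT 1 btf_is_struct"
ppc_row = "35: 0000000000000838 96 FUNC LOCAL DEFAULT [<localentry>: 8] 1 btf_is_struct"

# The 8th field (index 7) is only the symbol name when no extra field is present.
eighth_plain = plain_row.split()[7]
eighth_ppc = ppc_row.split()[7]
```

Here eighth_plain is the symbol name, eighth_ppc is a fragment of the extra field, and symbol_name() returns the symbol name for both rows.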
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/bpf/20191201195728.4161537-1-aurelien@aurel32.net
Merge updates from Andrew Morton:
"Incoming:
- a small number of updates to scripts/, ocfs2 and fs/buffer.c
- most of MM
I still have quite a lot of material (mostly not MM) staged after
linux-next due to -next dependencies. I'll send those across next week
as the prerequisites get merged up"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (135 commits)
mm/page_io.c: annotate refault stalls from swap_readpage
mm/Kconfig: fix trivial help text punctuation
mm/Kconfig: fix indentation
mm/memory_hotplug.c: remove __online_page_set_limits()
mm: fix typos in comments when calling __SetPageUptodate()
mm: fix struct member name in function comments
mm/shmem.c: cast the type of unmap_start to u64
mm: shmem: use proper gfp flags for shmem_writepage()
mm/shmem.c: make array 'values' static const, makes object smaller
userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
fs/userfaultfd.c: wp: clear VM_UFFD_MISSING or VM_UFFD_WP during userfaultfd_register()
userfaultfd: wrap the common dst_vma check into an inlined function
userfaultfd: remove unnecessary WARN_ON() in __mcopy_atomic_hugetlb()
userfaultfd: use vma_pagesize for all huge page size calculation
mm/madvise.c: use PAGE_ALIGN[ED] for range checking
mm/madvise.c: replace with page_size() in madvise_inject_error()
mm/mmap.c: make vma_merge() comment more easy to understand
mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops
autonuma: reduce cache footprint when scanning page tables
autonuma: fix watermark checking in migrate_balanced_pgdat()
...