14324 Commits

Author SHA1 Message Date
Nicolin Chen
1d4684fbe8 iommufd: Reorder include files
Reorder include files to alphabetic order to simplify maintenance, and
separate local headers and global headers with a blank line.

No functional change intended.

Link: https://patch.msgid.link/r/7524b037cc05afe19db3c18f863253e1d1554fa2.1722644866.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-26 12:02:03 -03:00
Jens Axboe
7ed9e09e2d io_uring: wire up min batch wake timeout
Expose min_wait_usec in io_uring_getevents_arg, replacing the pad member
that is currently in there. The value is in usecs, which is explained in
the name as well.

Note that if min_wait_usec and a normal timeout is used in conjunction,
the normal timeout is still relative to the base time. For example, if
min_wait_usec is set to 100 and the normal timeout is 1000, the max
total time waited is still 1000. This also means that if the normal
timeout is shorter than min_wait_usec, then only the min_wait_usec will
take effect.

See previous commit for an explanation of how this works.

IORING_FEAT_MIN_TIMEOUT is added as a feature flag for this, as
applications doing submit_and_wait_timeout() style operations will
generally not see the -EINVAL from the wait side as they return the
number of IOs submitted. Only if no IOs are submitted will the -EINVAL
bubble back up to the application.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-25 08:27:01 -06:00
Pavel Begunkov
2b8e976b98 io_uring: user registered clockid for wait timeouts
Add a new registration opcode IORING_REGISTER_CLOCK, which allows the
user to select which clock id it wants to use with CQ waiting timeouts.
It only allows a subset of all posix clocks and currently supports
CLOCK_MONOTONIC and CLOCK_BOOTTIME.

Suggested-by: Lewis Baker <lewissbaker@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/98f2bc8a3c36cdf8f0e6a275245e81e903459703.1723039801.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-25 08:27:01 -06:00
Pavel Begunkov
d29cb3726f io_uring: add absolute mode wait timeouts
In addition to current relative timeouts for the waiting loop, where the
timespec argument specifies the maximum time it can wait for, add
support for the absolute mode, with the value carrying a CLOCK_MONOTONIC
absolute time until which we should return control back to the user.

Suggested-by: Lewis Baker <lewissbaker@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4d5b74d67ada882590b2e42aa3aa7117bbf6b55f.1723039801.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-25 08:27:01 -06:00
Jordan Rome
65ab5ac4df bpf: Add bpf_copy_from_user_str kfunc
This adds a kfunc wrapper around strncpy_from_user,
which can be called from sleepable BPF programs.

This matches the non-sleepable 'bpf_probe_read_user_str'
helper except it includes an additional 'flags'
param, which allows consumers to clear the entire
destination buffer on success or failure.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
Link: https://lore.kernel.org/r/20240823195101.3621028-1-linux@jordanrome.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-23 15:40:01 -07:00
Dave Marchevsky
b0966c7245 bpf: Support bpf_kptr_xchg into local kptr
Currently, users can only stash kptr into map values with bpf_kptr_xchg().
This patch further supports stashing kptr into local kptr by adding local
kptr as a valid destination type.

When stashing into local kptr, btf_record in program BTF is used instead
of btf_record in map to search for the btf_field of the local kptr.

The local kptr specific checks in check_reg_type() only apply when the
source argument of bpf_kptr_xchg() is local kptr. Therefore, we make the
scope of the check explicit as the destination now can also be local kptr.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Amery Hung <amery.hung@bytedance.com>
Link: https://lore.kernel.org/r/20240813212424.2871455-5-amery.hung@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-23 11:39:33 -07:00
Maxime Chevallier
17194be4c8 net: ethtool: Introduce a command to list PHYs on an interface
As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.

Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:34 +01:00
Maxime Chevallier
c15e065b46 net: ethtool: Allow passing a phy index for some commands
Some netlink commands are target towards ethernet PHYs, to control some
of their features. As there's several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the passed phy_index.

Add a helper that netlink command handlers need to use to grab the
targeted PHY from the req_info. This helper needs to hold rtnl_lock()
while interacting with the PHY, as it may be removed at any point.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:34 +01:00
Maxime Chevallier
3849687869 net: phy: Introduce ethernet link topology representation
Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason being that most of the logic for these configuration, coming
from either ethtool netlink or ioctls tend to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.

Introduce a numbering scheme allowing to enumerate PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:34 +01:00
Jakub Kicinski
761d527d5d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt.h
  c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops")
  f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC")

Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22 17:06:18 -07:00
Kees Cook
5ac86f0ed0 virt: vbox: struct vmmdev_hgcm_pagelist: Replace 1-element array with flexible array
Replace the deprecated[1] use of a 1-element array in
struct vmmdev_hgcm_pagelist with a modern flexible array. As this is
UAPI, we cannot trivially change the size of the struct, so use a union
to retain the old first element's size, but switch "pages" to a flexible
array.

No binary differences are present after this conversion.

Link: https://github.com/KSPP/linux/issues/79 [1]
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20240710231555.work.406-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-22 16:56:24 -07:00
Chris Wulff
a139c98f76 USB: gadget: f_hid: Add GET_REPORT via userspace IOCTL
While supporting GET_REPORT is a mandatory request per the HID
specification the current implementation of the GET_REPORT request responds
to the USB Host with an empty reply of the request length. However, some
USB Hosts will request the contents of feature reports via the GET_REPORT
request. In addition, some proprietary HID 'protocols' will expect
different data, for the same report ID, to be to become available in the
feature report by sending a preceding SET_REPORT to the USB Device that
defines what data is to be presented when that feature report is
subsequently retrieved via GET_REPORT (with a very fast < 5ms turn around
between the SET_REPORT and the GET_REPORT).

There are two other patch sets already submitted for adding GET_REPORT
support. The first [1] allows for pre-priming a list of reports via IOCTLs
which then allows the USB Host to perform the request, with no further
userspace interaction possible during the GET_REPORT request. And another
[2] which allows for a single report to be setup by userspace via IOCTL,
which will be fetched and returned by the kernel for subsequent GET_REPORT
requests by the USB Host, also with no further userspace interaction
possible.

This patch, while loosely based on both the patch sets, differs by allowing
the option for userspace to respond to each GET_REPORT request by setting
up a poll to notify userspace that a new GET_REPORT request has arrived. To
support this, two extra IOCTLs are supplied. The first of which is used to
retrieve the report ID of the GET_REPORT request (in the case of having
non-zero report IDs in the HID descriptor). The second IOCTL allows for
storing report responses in a list for responding to requests.

The report responses are stored in a list (it will be either added if it
does not exist or updated if it exists already). A flag (userspace_req) can
be set to whether subsequent requests notify userspace or not.

Basic operation when a GET_REPORT request arrives from USB Host:

- If the report ID exists in the list and it is set for immediate return
  (i.e. userspace_req == false) then response is sent immediately,
userspace is not notified

- The report ID does not exist, or exists but is set to notify userspace
  (i.e. userspace_req == true) then notify userspace via poll:

	- If userspace responds, and either adds or update the response in
	  the list and respond to the host with the contents

	- If userspace does not respond within the fixed timeout (2500ms)
	  but the report has been set prevously, then send 'old' report
	  contents

	- If userspace does not respond within the fixed timeout (2500ms)
	  and the report does not exist in the list then send an empty
	  report

Note that userspace could 'prime' the report list at any other time.

While this patch allows for flexibility in how the system responds to
requests, and therefore the HID 'protocols' that could be supported, a
drawback is the time it takes to service the requests and therefore the
maximum throughput that would be achievable. The USB HID Specification
v1.11 itself states that GET_REPORT is not intended for periodic data
polling, so this limitation is not severe.

Testing on an iMX8M Nano Ultra Lite with a heavy multi-core CPU loading
showed that userspace can typically respond to the GET_REPORT request
within 1200ms - which is well within the 5000ms most operating systems seem
to allow, and within the 2500ms set by this patch.

[1] https://lore.kernel.org/all/20220805070507.123151-2-sunil@amarulasolutions.com/
[2] https://lore.kernel.org/all/20220726005824.2817646-1-vi@endrift.com/

Signed-off-by: David Sands <david.sands@biamp.com>
Signed-off-by: Chris Wulff <chris.wulff@biamp.com>
Link: https://lore.kernel.org/r/20240817142850.1311460-2-crwulff@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-22 17:25:59 +08:00
Justin Iurman
273f8c1420 net: ipv6: ioam6: new feature tunsrc
This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e.,
"encap") mode of ioam6. Just like seg6 already does, except it is
attached to a route. The "tunsrc" is optional: when not provided (by
default), the automatic resolution is applied. Using "tunsrc" when
possible has a benefit: performance. See the comparison:
 - before (= "encap" mode): https://ibb.co/bNCzvf7
 - after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22 10:45:12 +02:00
Juha-Pekka Heikkila
5151fa35ae
drm/fourcc: define Intel Xe2 related tile4 ccs modifiers
Add Tile4 type ccs modifiers to indicate presence of compression on Xe2.
Here is defined I915_FORMAT_MOD_4_TILED_LNL_CCS which is meant for
integrated graphics with igpu related limitations
Here is also defined I915_FORMAT_MOD_4_TILED_BMG_CCS which is meant
for discrete graphics with dgpu related limitations

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816115229.531671-3-juhapekka.heikkila@gmail.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-21 16:29:42 -04:00
Deven Bowers
f44554b506 audit,ipe: add IPE auditing support
Users of IPE require a way to identify when and why an operation fails,
allowing them to both respond to violations of policy and be notified
of potentially malicious actions on their systems with respect to IPE
itself.

This patch introduces 3 new audit events.

AUDIT_IPE_ACCESS(1420) indicates the result of an IPE policy evaluation
of a resource.
AUDIT_IPE_CONFIG_CHANGE(1421) indicates the current active IPE policy
has been changed to another loaded policy.
AUDIT_IPE_POLICY_LOAD(1422) indicates a new IPE policy has been loaded
into the kernel.

This patch also adds support for success auditing, allowing users to
identify why an allow decision was made for a resource. However, it is
recommended to use this option with caution, as it is quite noisy.

Here are some examples of the new audit record types:

AUDIT_IPE_ACCESS(1420):

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=297 comm="sh" path="/root/vol/bin/hello" dev="tmpfs"
      ino=3897 rule="op=EXECUTE boot_verified=TRUE action=ALLOW"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
     pid=300 path="/tmp/tmpdp2h1lub/deny/bin/hello" dev="tmpfs"
      ino=131 rule="DEFAULT action=DENY"

The above three records were generated when the active IPE policy only
allows binaries from the initramfs to run. The three identical `hello`
binary were placed at different locations, only the first hello from
the rootfs(initramfs) was allowed.

Field ipe_op followed by the IPE operation name associated with the log.

Field ipe_hook followed by the name of the LSM hook that triggered the IPE
event.

Field enforcing followed by the enforcement state of IPE. (it will be
introduced in the next commit)

Field pid followed by the pid of the process that triggered the IPE
event.

Field comm followed by the command line program name of the process that
triggered the IPE event.

Field path followed by the file's path name.

Field dev followed by the device name as found in /dev where the file is
from.
Note that for device mappers it will use the name `dm-X` instead of
the name in /dev/mapper.
For a file in a temp file system, which is not from a device, it will use
`tmpfs` for the field.
The implementation of this part is following another existing use case
LSM_AUDIT_DATA_INODE in security/lsm_audit.c

Field ino followed by the file's inode number.

Field rule followed by the IPE rule made the access decision. The whole
rule must be audited because the decision is based on the combination of
all property conditions in the rule.

Along with the syscall audit event, user can know why a blocked
happened. For example:

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=2138 comm="bash" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"
    audit[1956]: SYSCALL arch=c000003e syscall=59
      success=no exit=-13 a0=556790138df0 a1=556790135390 a2=5567901338b0
      a3=ab2a41a67f4f1f4e items=1 ppid=147 pid=1956 auid=4294967295 uid=0
      gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0
      ses=4294967295 comm="bash" exe="/usr/bin/bash" key=(null)

The above two records showed bash used execve to run "hello" and got
blocked by IPE. Note that the IPE records are always prior to a SYSCALL
record.

AUDIT_IPE_CONFIG_CHANGE(1421):

    audit: AUDIT1421
      old_active_pol_name="Allow_All" old_active_pol_version=0.0.0
      old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649
      new_active_pol_name="boot_verified" new_active_pol_version=0.0.0
      new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record showed the current IPE active policy switch from
`Allow_All` to `boot_verified` along with the version and the hash
digest of the two policies. Note IPE can only have one policy active
at a time, all access decision evaluation is based on the current active
policy.
The normal procedure to deploy a policy is loading the policy to deploy
into the kernel first, then switch the active policy to it.

AUDIT_IPE_POLICY_LOAD(1422):

    audit: AUDIT1422 policy_name="boot_verified" policy_version=0.0.0
      policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F2676
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record showed a new policy has been loaded into the kernel
with the policy name, policy version and policy hash.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-20 14:02:22 -04:00
Ido Schimmel
1fa3314c14 ipv4: Centralize TOS matching
The TOS field in the IPv4 flow information structure ('flowi4_tos') is
matched by the kernel against the TOS selector in IPv4 rules and routes.
The field is initialized differently by different call sites. Some treat
it as DSCP (RFC 2474) and initialize all six DSCP bits, some treat it as
RFC 1349 TOS and initialize it using RT_TOS() and some treat it as RFC
791 TOS and initialize it using IPTOS_RT_MASK.

What is common to all these call sites is that they all initialize the
lower three DSCP bits, which fits the TOS definition in the initial IPv4
specification (RFC 791).

Therefore, the kernel only allows configuring IPv4 FIB rules that match
on the lower three DSCP bits which are always guaranteed to be
initialized by all call sites:

 # ip -4 rule add tos 0x1c table 100
 # ip -4 rule add tos 0x3c table 100
 Error: Invalid tos.

While this works, it is unlikely to be very useful. RFC 791 that
initially defined the TOS and IP precedence fields was updated by RFC
2474 over twenty five years ago where these fields were replaced by a
single six bits DSCP field.

Extending FIB rules to match on DSCP can be done by adding a new DSCP
selector while maintaining the existing semantics of the TOS selector
for applications that rely on that.

A prerequisite for allowing FIB rules to match on DSCP is to adjust all
the call sites to initialize the high order DSCP bits and remove their
masking along the path to the core where the field is matched on.

However, making this change alone will result in a behavior change. For
example, a forwarded IPv4 packet with a DS field of 0xfc will no longer
match a FIB rule that was configured with 'tos 0x1c'.

This behavior change can be avoided by masking the upper three DSCP bits
in 'flowi4_tos' before comparing it against the TOS selectors in FIB
rules and routes.

Implement the above by adding a new function that checks whether a given
DSCP value matches the one specified in the IPv4 flow information
structure and invoke it from the three places that currently match on
'flowi4_tos'.

Use RT_TOS() for the masking of 'flowi4_tos' instead of IPTOS_RT_MASK
since the latter is not uAPI and we should be able to remove it at some
point.

Include <linux/ip.h> in <linux/in_route.h> since the former defines
IPTOS_TOS_MASK which is used in the definition of RT_TOS() in
<linux/in_route.h>.

No regressions in FIB tests:

 # ./fib_tests.sh
 [...]
 Tests passed: 218
 Tests failed:   0

And FIB rule tests:

 # ./fib_rule_tests.sh
 [...]
 Tests passed: 116
 Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-20 14:57:08 +02:00
Wen Gu
e0d103542b net/smc: introduce statistics for ringbufs usage of net namespace
The buffer size histograms in smc_stats, namely rx/tx_rmbsize, record
the sizes of ringbufs for all connections that have ever appeared in
the net namespace. They are incremental and we cannot know the actual
ringbufs usage from these. So here introduces statistics for current
ringbufs usage of existing smc connections in the net namespace into
smc_stats, it will be incremented when new connection uses a ringbuf
and decremented when the ringbuf is unused.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-20 11:38:23 +02:00
Wen Gu
d386d59b7c net/smc: introduce statistics for allocated ringbufs of link group
Currently we have the statistics on sndbuf/RMB sizes of all connections
that have ever been on the link group, namely smc_stats_memsize. However
these statistics are incremental and since the ringbufs of link group
are allowed to be reused, we cannot know the actual allocated buffers
through these. So here introduces the statistic on actual allocated
ringbufs of the link group, it will be incremented when a new ringbuf is
added into buf_list and decremented when it is deleted from buf_list.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-20 11:38:23 +02:00
Deven Bowers
0311507792 lsm: add IPE lsm
Integrity Policy Enforcement (IPE) is an LSM that provides an
complimentary approach to Mandatory Access Control than existing LSMs
today.

Existing LSMs have centered around the concept of access to a resource
should be controlled by the current user's credentials. IPE's approach,
is that access to a resource should be controlled by the system's trust
of a current resource.

The basis of this approach is defining a global policy to specify which
resource can be trusted.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-19 22:36:26 -04:00
Geert Uytterhoeven
ad614a706b
drm/xe/oa/uapi: Make bit masks unsigned
When building with gcc-5:

    In function ‘decode_oa_format.isra.26’,
	inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
    ././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant
    [...]
    ./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’
       __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
       ^
    drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’
      u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
		      ^

Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240729092634.2227611-1-geert+renesas@glider.be
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit f2881dfdaaa9ec873dbd383ef5512fc31e576cbb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-19 10:44:18 -04:00
Christian Brauner
820a185896 fcntl: add F_CREATED_QUERY
Systemd has a helper called openat_report_new() that returns whether a
file was created anew or it already existed before for cases where
O_CREAT has to be used without O_EXCL (cf. [1]). That apparently isn't
something that's specific to systemd but it's where I noticed it.

The current logic is that it first attempts to open the file without
O_CREAT | O_EXCL and if it gets ENOENT the helper tries again with both
flags. If that succeeds all is well. If it now reports EEXIST it
retries.

That works fairly well but some corner cases make this more involved. If
this operates on a dangling symlink the first openat() without O_CREAT |
O_EXCL will return ENOENT but the second openat() with O_CREAT | O_EXCL
will fail with EEXIST. The reason is that openat() without O_CREAT |
O_EXCL follows the symlink while O_CREAT | O_EXCL doesn't for security
reasons. So it's not something we can really change unless we add an
explicit opt-in via O_FOLLOW which seems really ugly.

The caller could try and use fanotify() to register to listen for
creation events in the directory before calling openat(). The caller
could then compare the returned tid to its own tid to ensure that even
in threaded environments it actually created the file. That might work
but is a lot of work for something that should be fairly simple and I'm
uncertain about it's reliability.

The caller could use a bpf lsm hook to hook into security_file_open() to
figure out whether they created the file. That also seems a bit wild.

So let's add F_CREATED_QUERY which allows the caller to check whether
they actually did create the file. That has caveats of course but I
don't think they are problematic:

* In multi-threaded environments a thread can only be sure that it did
  create the file if it calls openat() with O_CREAT. In other words,
  it's obviously not enough to just go through it's fdtable and check
  these fds because another thread could've created the file.

* If there's any codepaths where an openat() with O_CREAT would yield
  the same struct file as that of another thread it would obviously
  cause wrong results. I'm not aware of any such codepaths from openat()
  itself. Imho, that would be a bug.

* Related to the previous point, calling the new fcntl() on files created
  and opened via special-purpose system calls or ioctl()s would cause
  wrong results only if the affected subsystem a) raises FMODE_CREATED
  and b) may return the same struct file for two different calls. I'm
  not seeing anything outside of regular VFS code that raises
  FMODE_CREATED.

  There is code for b) in e.g., the drm layer where the same struct file
  is resurfaced but again FMODE_CREATED isn't used and it would be very
  misleading if it did.

Link: 11d5e2b5fb/src/basic/fs-util.c (L1078) [1]
Link: https://lore.kernel.org/r/20240724-work-fcntl-v1-1-e8153a2f1991@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-19 13:44:34 +02:00
Takashi Iwai
41776e4008 Merge branch 'topic/seq-filter-cleanup' into for-next
Pull ALSA sequencer cleanup.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-19 10:48:46 +02:00
Greg Kroah-Hartman
ca7df2c7bb Merge 6.11-rc4 into usb-next
We need the usb / thunderbolt fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19 06:16:49 +02:00
Greg Kroah-Hartman
10c8d1bd78 Merge 6.11-rc4 into char-misc-next
We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19 06:02:26 +02:00
Linus Torvalds
e1bc113215 Char/Misc fixes for 6.11-rc4
Here are some small char/misc fixes for 6.11-rc4 to resolve reported
 problems.  Included in here are:
   - fastrpc revert of a change that broke userspace
   - xillybus fixes for reported issues
 
 Half of these have been in linux-next this week with no reported
 problems, I don't know if the last bit of xillybus driver changes made
 it in, but they are "obviously correct" so will be safe :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZsIT4A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymYQACfeV6tIMHlHSVS9VSbiXHJAovjvFEAoLQvuvBS
 0dN1Q4iYF+5Sy4JWAZb/
 =ztLK
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc fixes from Greg KH:
 "Here are some small char/misc fixes for 6.11-rc4 to resolve reported
  problems. Included in here are:

   - fastrpc revert of a change that broke userspace

   - xillybus fixes for reported issues

  Half of these have been in linux-next this week with no reported
  problems, I don't know if the last bit of xillybus driver changes made
  it in, but they are 'obviously correct' so will be safe :)"

* tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  char: xillybus: Check USB endpoints when probing device
  char: xillybus: Refine workqueue handling
  Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD"
  char: xillybus: Don't destroy workqueue from work item running on it
2024-08-18 10:16:34 -07:00
Ivan Orlov
37745918e0 ALSA: timer: Introduce virtual userspace-driven timers
Implement two ioctl calls in order to support virtual userspace-driven
ALSA timers.

The first ioctl is SNDRV_TIMER_IOCTL_CREATE, which gets the
snd_timer_uinfo struct as a parameter and puts a file descriptor of a
virtual timer into the `fd` field of the snd_timer_unfo structure. It
also updates the `id` field of the snd_timer_uinfo struct, which
provides a unique identifier for the timer (basically, the subdevice
number which can be used when creating timer instances).

This patch also introduces a tiny id allocator for the userspace-driven
timers, which guarantees that we don't have more than 128 of them in the
system.

Another ioctl is SNDRV_TIMER_IOCTL_TRIGGER, which allows us to trigger
the virtual timer (and calls snd_timer_interrupt for the timer under
the hood), causing all of the timer instances binded to this timer to
execute their callbacks.

The maximum amount of ticks available for the timer is 1 for the sake of
simplicity of the userspace API. 'start', 'stop', 'open' and 'close'
callbacks for the userspace-driven timers are empty since we don't
really do any hardware initialization here.

Suggested-by: Axel Holzinger <aholzinger@gmx.de>
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20240813120701.171743-4-ivan.orlov0322@gmail.com
2024-08-18 09:55:54 +02:00
Linus Torvalds
c5ac744cdd io_uring-6.11-20240824
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAma/oWYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgppdiEACNHrpmc95/gLpi9sGz11vDxWm1W/gh5Kig
 kCiRdFlRnIOiKjWJjVv7Le6FqvC1e1KkW3N+1yuzr7nDI6J/yjgSPPSeR10AWYXU
 A3HWWXRnRZ/AhM0abt7XgEcn+TK2liw/wX0S9hjLht5xz087erLAw+2PQrCTmuxV
 Ge1Vt9S2SMmr0PkrM0xQI8QoQMC+wUzWLrtDUd1xUhHWru4Fl6qs6LFckjhixloD
 FMgVQgt/Vd2sf6Uxd+XLy6QhnOZ5vZ1jYtLPB2wVywxewM2FeLhkJRuTViuLMhnP
 dbkmFPS+iQHGJXjCvU1QD8yv4qNjXnBEaEwbbnd9L9KG48FwddCRrFZA/j/LDxQ1
 1VTof0Bd3EGDvu1e/N70uDp2Vqn620SWKmUrWF/eShbMQq5Vjqa6micuxlMmvnxj
 uzcQ65ePYVzro/PlhELAVxeJL6r1LNFnPjmijBlf349Tj58IXrowW35QEnhh8ouX
 6DGQ45pyANN/Uio65XWzMoc97IvRsP72lmO9iIwd5UBRCH7QxcfdueYH+jQEu2JD
 Sir0ChRfT9HulaNV953KocGGBDAhzPk+AGd1Vm8h7eu8RMy0oqvRDsraXB6Ig/Qh
 VFnlKwuObgfmKHRBFvLX53n3zWKH9Ewo+0kw6qNPC+DyHSEL/zDJ4SAyOh+cliNo
 6DrfJq5jgA==
 =m6tl
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Fix a comment in the uapi header using the wrong member name (Caleb)

 - Fix KCSAN warning for a debug check in sqpoll (me)

 - Two more NAPI tweaks (Olivier)

* tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux:
  io_uring: fix user_data field name in comment
  io_uring/sqpoll: annotate debug task == current with data_race()
  io_uring/napi: remove duplicate io_napi_entry timeout assignation
  io_uring/napi: check napi_enabled in io_napi_add() before proceeding
2024-08-16 14:00:05 -07:00
Caleb Sander Mateos
1fc2ac428e io_uring: fix user_data field name in comment
io_uring_cqe's user_data field refers to `sqe->data`, but io_uring_sqe
does not have a data field. Fix the comment to say `sqe->user_data`.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://github.com/axboe/liburing/pull/1206
Link: https://lore.kernel.org/r/20240816181526.3642732-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-08-16 12:31:26 -06:00
Oleksij Rempel
2140e63cd8 ethtool: Add new result codes for TDR diagnostics
Add new result codes to support TDR diagnostics in preparation for
Open Alliance 1000BaseT1 TDR support:

- ETHTOOL_A_CABLE_RESULT_CODE_NOISE: TDR not possible due to high noise
  level.
- ETHTOOL_A_CABLE_RESULT_CODE_RESOLUTION_NOT_POSSIBLE: TDR resolution not
  possible / out of distance.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20240812073046.1728288-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-16 10:16:16 -07:00
Jakub Kicinski
4d3d3559fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml
  c25504a0ba36 ("dt-bindings: net: fsl,qoriq-mc-dpmac: add missed property phys")
  be034ee6c33d ("dt-bindings: net: fsl,qoriq-mc-dpmac: using unevaluatedProperties")
https://lore.kernel.org/20240815110934.56ae623a@canb.auug.org.au

drivers/net/dsa/vitesse-vsc73xx-core.c
  5b9eebc2c7a5 ("net: dsa: vsc73xx: pass value in phy_write operation")
  fa63c6434b6f ("net: dsa: vsc73xx: check busy flag in MDIO operations")
  2524d6c28bdc ("net: dsa: vsc73xx: use defined values in phy operations")
https://lore.kernel.org/20240813104039.429b9fe6@canb.auug.org.au
Resolve by using FIELD_PREP(), Stephen's resolution is simpler.

Adjacent changes:

net/vmw_vsock/af_vsock.c
  69139d2919dd ("vsock: fix recursive ->recvmsg calls")
  744500d81f81 ("vsock: add support for SIOCOUTQ ioctl")

Link: https://patch.msgid.link/20240815141149.33862-1-pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-15 17:18:52 -07:00
Griffin Kroah-Hartman
9bb5e74b2b Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD"
This reverts commit bab2f5e8fd5d2f759db26b78d9db57412888f187.

Joel reported that this commit breaks userspace and stops sensors in
SDM845 from working. Also breaks other qcom SoC devices running postmarketOS.

Cc: stable <stable@kernel.org>
Cc: Ekansh Gupta <quic_ekangupt@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reported-by: Joel Selvaraj <joelselvaraj.oss@gmail.com>
Link: https://lore.kernel.org/r/9a9f5646-a554-4b65-8122-d212bb665c81@umsystem.edu
Signed-off-by: Griffin Kroah-Hartman <griffin@kroah.com>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Fixes: bab2f5e8fd5d ("misc: fastrpc: Restrict untrusted app to attach to privileged PD")
Link: https://lore.kernel.org/r/20240815094920.8242-1-griffin@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-15 16:59:14 +02:00
Gustavo A. R. Silva
216203bdc2 UAPI: net/sched: Use __struct_group() in flex struct tc_u32_sel
Use the `__struct_group()` helper to create a new tagged
`struct tc_u32_sel_hdr`. This structure groups together all the
members of the flexible `struct tc_u32_sel` except the flexible
array. As a result, the array is effectively separated from the
rest of the members without modifying the memory layout of the
flexible structure.

This new tagged struct will be used to fix problematic declarations
of middle-flex-arrays in composite structs[1].

[1] https://git.kernel.org/linus/d88cabfd9abc

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/e59fe833564ddc5b2cc83056a4c504be887d6193.1723586870.git.gustavoars@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14 20:37:47 -07:00
Linus Torvalds
d07b43284a s390:
* Fix failure to start guests with kvm.use_gisa=0
 
 * Panic if (un)share fails to maintain security.
 
 ARM:
 
 * Use kvfree() for the kvmalloc'd nested MMUs array
 
 * Set of fixes to address warnings in W=1 builds
 
 * Make KVM depend on assembler support for ARMv8.4
 
 * Fix for vgic-debug interface for VMs without LPIs
 
 * Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest
 
 * Minor code / comment cleanups for configuring PAuth traps
 
 * Take kvm->arch.config_lock to prevent destruction / initialization
   race for a vCPU's CPUIF which may lead to a UAF
 
 x86:
 
 * Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)
 
 * Fix smatch issues
 
 * Small cleanups
 
 * Make x2APIC ID 100% readonly
 
 * Fix typo in uapi constant
 
 Generic:
 
 * Use synchronize_srcu_expedited() on irqfd shutdown
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAma85nQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPq5gf8DVZjs2yNwdPLAUN7AOpElsobFnVN
 etJ1V09XsfSYMIAV6ksvC1VBzHpyOJN2QKuF0nfIiISsV8W9xLcV2rKIodsGBvdV
 K5ODL/yxeYI27t6Uferra1AGlchtn3tlpZzVarZIgRvZa3NMXaQPYJdcQr2Oybou
 7hZsboMTl6jaSl6NELzcBRksfkcOqQLQoUqVBqlkBTM3yyFRmV85BisRkOWCIBfA
 9c+kn7ZWBfOqYaXjiLeCrdCKrUuFLfR3ejJibYFan5MULYHIL95W/WJo9uQJ1QBr
 BNjMfmtVZ2JOWya40uUSKrvxJ0IErAlMNgmnpjeA4cBqYK5GRHUKvO5jNw==
 =A8SH
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "s390:

   - Fix failure to start guests with kvm.use_gisa=0

   - Panic if (un)share fails to maintain security.

  ARM:

   - Use kvfree() for the kvmalloc'd nested MMUs array

   - Set of fixes to address warnings in W=1 builds

   - Make KVM depend on assembler support for ARMv8.4

   - Fix for vgic-debug interface for VMs without LPIs

   - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest

   - Minor code / comment cleanups for configuring PAuth traps

   - Take kvm->arch.config_lock to prevent destruction / initialization
     race for a vCPU's CPUIF which may lead to a UAF

  x86:

   - Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)

   - Fix smatch issues

   - Small cleanups

   - Make x2APIC ID 100% readonly

   - Fix typo in uapi constant

  Generic:

   - Use synchronize_srcu_expedited() on irqfd shutdown"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG
  KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)
  KVM: eventfd: Use synchronize_srcu_expedited() on shutdown
  KVM: selftests: Add a testcase to verify x2APIC is fully readonly
  KVM: x86: Make x2APIC ID 100% readonly
  KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id())
  KVM: x86: hyper-v: Remove unused inline function kvm_hv_free_pa_page()
  KVM: SVM: Fix an error code in sev_gmem_post_populate()
  KVM: SVM: Fix uninitialized variable bug
  KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface
  KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list
  KVM: arm64: Tidying up PAuth code in KVM
  KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI
  KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain
  s390/uv: Panic for set and remove shared access UVC errors
  KVM: s390: fix validity interception issue when gisa is switched off
  docs: KVM: Fix register ID of SPSR_FIQ
  KVM: arm64: vgic: fix unexpected unlock sparse warnings
  KVM: arm64: fix kdoc warnings in W=1 builds
  KVM: arm64: fix override-init warnings in W=1 builds
  ...
2024-08-14 13:46:24 -07:00
Amit Shah
1c0e588169 KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG
"INVALID" is misspelt in "SEV_RET_INAVLID_CONFIG". Since this is part of
the UAPI, keep the current definition and add a new one with the fix.

Fix-suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Amit Shah <amit.shah@amd.com>
Message-ID: <20240814083113.21622-1-amit@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14 13:05:42 -04:00
Linus Torvalds
4ac0f08f44 vfs-6.11-rc4.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZrym4AAKCRCRxhvAZXjc
 oqT3AP9ydoUNavaZcRayH8r3ybvz9+aJGJ6Q7NznFVCk71vn0gD/buLzmq96Muns
 M5DWHbft2AFwK0Rz2nx8j5OXUeHwrQg=
 =HZBL
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "VFS:

   - Fix the name of file lease slab cache. When file leases were split
     out of file locks the name of the file lock slab cache was used for
     the file leases slab cache as well.

   - Fix a type in take_fd() helper.

   - Fix infinite directory iteration for stable offsets in tmpfs.

   - When the icache is pruned all reclaimable inodes are marked with
     I_FREEING and other processes that try to lookup such inodes will
     block.

     But some filesystems like ext4 can trigger lookups in their inode
     evict callback causing deadlocks. Ext4 does such lookups if the
     ea_inode feature is used whereby a separate inode may be used to
     store xattrs.

     Introduce I_LRU_ISOLATING which pins the inode while its pages are
     reclaimed. This avoids inode deletion during inode_lru_isolate()
     avoiding the deadlock and evict is made to wait until
     I_LRU_ISOLATING is done.

  netfs:

   - Fault in smaller chunks for non-large folio mappings for
     filesystems that haven't been converted to large folios yet.

   - Fix the CONFIG_NETFS_DEBUG config option. The config option was
     renamed a short while ago and that introduced two minor issues.
     First, it depended on CONFIG_NETFS whereas it wants to depend on
     CONFIG_NETFS_SUPPORT. The former doesn't exist, while the latter
     does. Second, the documentation for the config option wasn't fixed
     up.

   - Revert the removal of the PG_private_2 writeback flag as ceph is
     using it and fix how that flag is handled in netfs.

   - Fix DIO reads on 9p. A program watching a file on a 9p mount
     wouldn't see any changes in the size of the file being exported by
     the server if the file was changed directly in the source
     filesystem. Fix this by attempting to read the full size specified
     when a DIO read is requested.

   - Fix a NULL pointer dereference bug due to a data race where a
     cachefiles cookies was retired even though it was still in use.
     Check the cookie's n_accesses counter before discarding it.

  nsfs:

   - Fix ioctl declaration for NS_GET_MNTNS_ID from _IO() to _IOR() as
     the kernel is writing to userspace.

  pidfs:

   - Prevent the creation of pidfds for kthreads until we have a
     use-case for it and we know the semantics we want. It also confuses
     userspace why they can get pidfds for kthreads.

  squashfs:

   - Fix an unitialized value bug reported by KMSAN caused by a
     corrupted symbolic link size read from disk. Check that the
     symbolic link size is not larger than expected"

* tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  Squashfs: sanity check symbolic link size
  9p: Fix DIO read through netfs
  vfs: Don't evict inode under the inode lru traversing context
  netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags
  netfs, ceph: Revert "netfs: Remove deprecated use of PG_private_2 as a second writeback flag"
  file: fix typo in take_fd() comment
  pidfd: prevent creation of pidfds for kthreads
  netfs: clean up after renaming FSCACHE_DEBUG config
  libfs: fix infinite directory reads for offset dir
  nsfs: fix ioctl declaration
  fs/netfs/fscache_cookie: add missing "n_accesses" check
  filelock: fix name of file_lease slab cache
  netfs: Fault in smaller chunks for non-large folio mappings
2024-08-14 09:06:28 -07:00
Paul Elder
ac79beb913 media: rkisp1: Add support for the companding block
Add support to the rkisp1 driver for the companding block that exists on
the i.MX8MP version of the ISP. This requires usage of the new
extensible parameters format, and showcases how the format allows for
extensions without breaking backward compatibility.

Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-14 16:42:58 +03:00
David Sands
c26cee817f usb: gadget: f_fs: add capability for dfu functional descriptor
Add the ability for the USB FunctionFS (FFS) gadget driver to be able
to create Device Firmware Upgrade (DFU) functional descriptors. [1]

This patch allows implementation of DFU in userspace using the
FFS gadget. The DFU protocol uses the control pipe (ep0) for all
messaging so only the addition of the DFU functional descriptor
is needed in the kernel driver.

The DFU functional descriptor is written to the ep0 file along with
any other descriptors during FFS setup. DFU requires an interface
descriptor followed by the DFU functional descriptor.

This patch includes documentation of the added descriptor for DFU
and conversion of some existing documentation to kernel-doc format
so that it can be included in the generated docs.

An implementation of DFU 1.1 that implements just the runtime descriptor
using the FunctionFS gadget (with rebooting into u-boot for DFU mode)
has been tested on an i.MX8 Nano.

An implementation of DFU 1.1 that implements both runtime and DFU mode
using the FunctionFS gadget has been tested on Xilinx Zynq UltraScale+.
Note that for the best performance of firmware update file transfers, the
userspace program should respond as quick as possible to the setup packets.

[1] https://www.usb.org/sites/default/files/DFU_1.1.pdf

Signed-off-by: David Sands <david.sands@biamp.com>
Co-developed-by: Chris Wulff <crwulff@gmail.com>
Signed-off-by: Chris Wulff <crwulff@gmail.com>
Link: https://lore.kernel.org/r/20240811000004.1395888-2-crwulff@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13 10:38:46 +02:00
Petr Machata
b72a6a7ab9 net: nexthop: Increase weight to u16
In CLOS networks, as link failures occur at various points in the network,
ECMP weights of the involved nodes are adjusted to compensate. With high
fan-out of the involved nodes, and overall high number of nodes,
a (non-)ECMP weight ratio that we would like to configure does not fit into
8 bits. Instead of, say, 255:254, we might like to configure something like
1000:999. For these deployments, the 8-bit weight may not be enough.

To that end, in this patch increase the next hop weight from u8 to u16.

Increasing the width of an integral type can be tricky, because while the
code still compiles, the types may not check out anymore, and numerical
errors come up. To prevent this, the conversion was done in two steps.
First the type was changed from u8 to a single-member structure, which
invalidated all uses of the field. This allowed going through them one by
one and audit for type correctness. Then the structure was replaced with a
vanilla u16 again. This should ensure that no place was missed.

The UAPI for configuring nexthop group members is that an attribute
NHA_GROUP carries an array of struct nexthop_grp entries:

	struct nexthop_grp {
		__u32	id;	  /* nexthop id - must exist */
		__u8	weight;   /* weight of this nexthop */
		__u8	resvd1;
		__u16	resvd2;
	};

The field resvd1 is currently validated and required to be zero. We can
lift this requirement and carry high-order bits of the weight in the
reserved field:

	struct nexthop_grp {
		__u32	id;	  /* nexthop id - must exist */
		__u8	weight;   /* weight of this nexthop */
		__u8	weight_high;
		__u16	resvd2;
	};

Keeping the fields split this way was chosen in case an existing userspace
makes assumptions about the width of the weight field, and to sidestep any
endianness issues.

The weight field is currently encoded as the weight value minus one,
because weight of 0 is invalid. This same trick is impossible for the new
weight_high field, because zero must mean actual zero. With this in place:

- Old userspace is guaranteed to carry weight_high of 0, therefore
  configuring 8-bit weights as appropriate. When dumping nexthops with
  16-bit weight, it would only show the lower 8 bits. But configuring such
  nexthops implies existence of userspace aware of the extension in the
  first place.

- New userspace talking to an old kernel will work as long as it only
  attempts to configure 8-bit weights, where the high-order bits are zero.
  Old kernel will bounce attempts at configuring >8-bit weights.

Renaming reserved fields as they are allocated for some purpose is commonly
done in Linux. Whoever touches a reserved field is doing so at their own
risk. nexthop_grp::resvd1 in particular is currently used by at least
strace, however they carry an own copy of UAPI headers, and the conversion
should be trivial. A helper is provided for decoding the weight out of the
two fields. Forcing a conversion seems preferable to bending backwards and
introducing anonymous unions or whatever.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/483e2fcf4beb0d9135d62e7d27b46fa2685479d4.1723036486.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-12 17:50:34 -07:00
Petr Machata
75bab45e6b net: nexthop: Add flag to assert that NHGRP reserved fields are zero
There are many unpatched kernel versions out there that do not initialize
the reserved fields of struct nexthop_grp. The issue with that is that if
those fields were to be used for some end (i.e. stop being reserved), old
kernels would still keep sending random data through the field, and a new
userspace could not rely on the value.

In this patch, use the existing NHA_OP_FLAGS, which is currently inbound
only, to carry flags back to the userspace. Add a flag to indicate that the
reserved fields in struct nexthop_grp are zeroed before dumping. This is
reliant on the actual fix from commit 6d745cd0e972 ("net: nexthop:
Initialize all fields in dumped nexthops").

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/21037748d4f9d8ff486151f4c09083bcf12d5df8.1723036486.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-12 17:50:34 -07:00
Christian Brauner
42b0f8da3a
nsfs: fix ioctl declaration
The kernel is writing an object of type __u64, so the ioctl has to be
defined to _IOR(NSIO, 0x5, __u64) instead of _IO(NSIO, 0x5).

Reported-by: Dmitry V. Levin <ldv@strace.io>
Link: https://lore.kernel.org/r/20240730164554.GA18486@altlinux.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-12 22:03:26 +02:00
Greg Kroah-Hartman
9ca12e50a4 Merge 6.11-rc3 into char-misc-next
We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-12 18:44:54 +02:00
Jakub Kicinski
3d50c66c06 ethtool: rss: support skipping contexts during dump
Applications may want to deal with dynamic RSS contexts only.
So dumping context 0 will be counter-productive for them.
Support starting the dump from a given context ID.

Alternative would be to implement a dump flag to skip just
context 0, not sure which is better...

Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-12 14:16:24 +01:00
Thomas Zimmermann
5c61f59824 Merge drm/drm-next into drm-misc-next
Get drm-misc-next to the state of v6.11-rc2.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-08-12 14:14:18 +02:00
Jacopo Mondi
1fc379f624 media: uapi: videodev2: Add V4L2_META_FMT_RK_ISP1_EXT_PARAMS
The rkisp1 driver stores ISP configuration parameters in the fixed
rkisp1_params_cfg structure. As the members of the structure are part of
the userspace API, the structure layout is immutable and cannot be
extended further. Introducing new parameters or modifying the existing
ones would change the buffer layout and cause breakages in existing
applications.

The allow for future extensions to the ISP parameters, introduce a new
extensible parameters format, with a new format 4CC. Document usage of
the new format in the rkisp1 admin guide.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com>
Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-12 13:36:32 +03:00
Jacopo Mondi
e9d05e9d5d media: uapi: rkisp1-config: Add extensible params format
Add to the rkisp1-config.h header data types and documentation of
the extensible parameters format.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-12 13:36:29 +03:00
Mohammed Anees
0dc4fb69eb drm: Add missing documentation for struct drm_plane_size_hint
This patch takes care of the following warnings during documentation
compiling:

./include/uapi/drm/drm_mode.h:869: warning: Function parameter or struct member 'width' not described in 'drm_plane_size_hint'
./include/uapi/drm/drm_mode.h:869: warning: Function parameter or struct member 'height' not described in 'drm_plane_size_hint'

Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240811101653.170223-1-pvmohammedanees2003@gmail.com
2024-08-12 11:15:14 +02:00
Yishai Hadas
ec7ad65309 RDMA/mlx5: Introduce GET_DATA_DIRECT_SYSFS_PATH ioctl
Introduce the 'GET_DATA_DIRECT_SYSFS_PATH' ioctl to return the sysfs
path of the affiliated 'data direct' device for a given device.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/403745463e0ef52adbef681ff09aa6a29a756352.1722512548.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-11 11:12:50 +03:00
Yishai Hadas
de8f847a51 RDMA/mlx5: Add support for DMABUF MR registrations with Data-direct
Add support for DMABUF MR registrations with Data-direct device.

Upon userspace calling to register a DMABUF MR with the data direct bit
set, the below algorithm will be followed.

1) Obtain a pinned DMABUF umem from the IB core using the user input
parameters (FD, offset, length) and the DMA PF device.  The DMA PF
device is needed to allow the IOMMU to enable the DMA PF to access the
user buffer over PCI.

2) Create a KSM MKEY by setting its entries according to the user buffer
VA to IOVA mapping, with the MKEY being the data direct device-crossed
MKEY. This KSM MKEY is umrable and will be used as part of the MR cache.
The PD for creating it is the internal device 'data direct' kernel one.

3) Create a crossing MKEY that points to the KSM MKEY using the crossing
access mode.

4) Manage the KSM MKEY by adding it to a list of 'data direct' MKEYs
managed on the mlx5_ib device.

5) Return the crossing MKEY to the user, created with its supplied PD.

Upon DMA PF unbind flow, the driver will revoke the KSM entries.
The final deregistration will occur under the hood once the application
deregisters its MKEY.

Notes:
- This version supports only the PINNED UMEM mode, so there is no
  dependency on ODP.
- The IOVA supplied by the application must be system page aligned due to
  HW translations of KSM.
- The crossing MKEY will not be umrable or part of the MR cache, as we
  cannot change its crossed (i.e. KSM) MKEY over UMR.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/1f99d8020ed540d9702b9e2252a145a439609ba6.1722512548.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-11 11:12:50 +03:00
Takashi Iwai
4004f3029e Merge branch 'topic/control-lookup-rwlock' into for-next
Pull control lookup optimization changes.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-09 14:25:24 +02:00
Christian Brauner
49224a345c
Merge patch series "nsfs: iterate through mount namespaces"
Christian Brauner <brauner@kernel.org> says:

Recently, we added the ability to list mounts in other mount namespaces
and the ability to retrieve namespace file descriptors without having to
go through procfs by deriving them from pidfds.

This extends nsfs in two ways:

(1) Add the ability to retrieve information about a mount namespace via
    NS_MNT_GET_INFO. This will return the mount namespace id and the
    number of mounts currently in the mount namespace. The number of
    mounts can be used to size the buffer that needs to be used for
    listmount() and is in general useful without having to actually
    iterate through all the mounts.

    The structure is extensible.

(2) Add the ability to iterate through all mount namespaces over which
    the caller holds privilege returning the file descriptor for the
    next or previous mount namespace.

    To retrieve a mount namespace the caller must be privileged wrt to
    it's owning user namespace. This means that PID 1 on the host can
    list all mounts in all mount namespaces or that a container can list
    all mounts of its nested containers.

    Optionally pass a structure for NS_MNT_GET_INFO with
    NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount
    namespace in one go.

(1) and (2) can be implemented for other namespace types easily.

Together with recent api additions this means one can iterate through
all mounts in all mount namespaces without ever touching procfs. Here's
a sample program list_all_mounts_everywhere.c:

  // SPDX-License-Identifier: GPL-2.0-or-later

  #define _GNU_SOURCE
  #include <asm/unistd.h>
  #include <assert.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <getopt.h>
  #include <linux/stat.h>
  #include <sched.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <sys/param.h>
  #include <sys/pidfd.h>
  #include <sys/stat.h>
  #include <sys/statfs.h>

  #define die_errno(format, ...)                                             \
  	do {                                                               \
  		fprintf(stderr, "%m | %s: %d: %s: " format "\n", __FILE__, \
  			__LINE__, __func__, ##__VA_ARGS__);                \
  		exit(EXIT_FAILURE);                                        \
  	} while (0)

  /* Get the id for a mount namespace */
  #define NS_GET_MNTNS_ID		_IO(0xb7, 0x5)
  /* Get next mount namespace. */

  struct mnt_ns_info {
  	__u32 size;
  	__u32 nr_mounts;
  	__u64 mnt_ns_id;
  };

  #define MNT_NS_INFO_SIZE_VER0 16 /* size of first published struct */

  /* Get information about namespace. */
  #define NS_MNT_GET_INFO		_IOR(0xb7, 10, struct mnt_ns_info)
  /* Get next namespace. */
  #define NS_MNT_GET_NEXT		_IOR(0xb7, 11, struct mnt_ns_info)
  /* Get previous namespace. */
  #define NS_MNT_GET_PREV		_IOR(0xb7, 12, struct mnt_ns_info)

  #define PIDFD_GET_MNT_NAMESPACE _IO(0xFF, 3)

  #define STATX_MNT_ID_UNIQUE	0x00004000U	/* Want/got extended stx_mount_id */

  #define __NR_listmount 458
  #define __NR_statmount 457

  /*
   * @mask bits for statmount(2)
   */
  #define STATMOUNT_SB_BASIC		0x00000001U     /* Want/got sb_... */
  #define STATMOUNT_MNT_BASIC		0x00000002U	/* Want/got mnt_... */
  #define STATMOUNT_PROPAGATE_FROM	0x00000004U	/* Want/got propagate_from */
  #define STATMOUNT_MNT_ROOT		0x00000008U	/* Want/got mnt_root  */
  #define STATMOUNT_MNT_POINT		0x00000010U	/* Want/got mnt_point */
  #define STATMOUNT_FS_TYPE		0x00000020U	/* Want/got fs_type */
  #define STATMOUNT_MNT_NS_ID             0x00000040U     /* Want/got mnt_ns_id */
  #define STATMOUNT_MNT_OPTS              0x00000080U     /* Want/got mnt_opts */

  struct statmount {
  	__u32 size;		/* Total size, including strings */
  	__u32 mnt_opts;
  	__u64 mask;		/* What results were written */
  	__u32 sb_dev_major;	/* Device ID */
  	__u32 sb_dev_minor;
  	__u64 sb_magic;		/* ..._SUPER_MAGIC */
  	__u32 sb_flags;		/* SB_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
  	__u32 fs_type;		/* [str] Filesystem type */
  	__u64 mnt_id;		/* Unique ID of mount */
  	__u64 mnt_parent_id;	/* Unique ID of parent (for root == mnt_id) */
  	__u32 mnt_id_old;	/* Reused IDs used in proc/.../mountinfo */
  	__u32 mnt_parent_id_old;
  	__u64 mnt_attr;		/* MOUNT_ATTR_... */
  	__u64 mnt_propagation;	/* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
  	__u64 mnt_peer_group;	/* ID of shared peer group */
  	__u64 mnt_master;	/* Mount receives propagation from this ID */
  	__u64 propagate_from;	/* Propagation from in current namespace */
  	__u32 mnt_root;		/* [str] Root of mount relative to root of fs */
  	__u32 mnt_point;	/* [str] Mountpoint relative to current root */
  	__u64 mnt_ns_id;
  	__u64 __spare2[49];
  	char str[];		/* Variable size part containing strings */
  };

  struct mnt_id_req {
  	__u32 size;
  	__u32 spare;
  	__u64 mnt_id;
  	__u64 param;
  	__u64 mnt_ns_id;
  };

  #define MNT_ID_REQ_SIZE_VER1	32 /* sizeof second published struct */

  #define LSMT_ROOT		0xffffffffffffffff	/* root mount */

  static int __statmount(__u64 mnt_id, __u64 mnt_ns_id, __u64 mask,
  		       struct statmount *stmnt, size_t bufsize, unsigned int flags)
  {
  	struct mnt_id_req req = {
  		.size = MNT_ID_REQ_SIZE_VER1,
  		.mnt_id = mnt_id,
  		.param = mask,
  		.mnt_ns_id = mnt_ns_id,
  	};

  	return syscall(__NR_statmount, &req, stmnt, bufsize, flags);
  }

  static struct statmount *sys_statmount(__u64 mnt_id, __u64 mnt_ns_id,
  				       __u64 mask, unsigned int flags)
  {
  	size_t bufsize = 1 << 15;
  	struct statmount *stmnt = NULL, *tmp = NULL;
  	int ret;

  	for (;;) {
  		tmp = realloc(stmnt, bufsize);
  		if (!tmp)
  			goto out;

  		stmnt = tmp;
  		ret = __statmount(mnt_id, mnt_ns_id, mask, stmnt, bufsize, flags);
  		if (!ret)
  			return stmnt;

  		if (errno != EOVERFLOW)
  			goto out;

  		bufsize <<= 1;
  		if (bufsize >= UINT_MAX / 2)
  			goto out;

  	}

  out:
  	free(stmnt);
  	printf("statmount failed");
  	return NULL;
  }

  static ssize_t sys_listmount(__u64 mnt_id, __u64 last_mnt_id, __u64 mnt_ns_id,
  			     __u64 list[], size_t num, unsigned int flags)
  {
  	struct mnt_id_req req = {
  		.size = MNT_ID_REQ_SIZE_VER1,
  		.mnt_id = mnt_id,
  		.param = last_mnt_id,
  		.mnt_ns_id = mnt_ns_id,
  	};

  	return syscall(__NR_listmount, &req, list, num, flags);
  }

  int main(int argc, char *argv[])
  {
  #define LISTMNT_BUFFER 10
  	__u64 list[LISTMNT_BUFFER], last_mnt_id = 0;
  	int ret, pidfd, fd_mntns;
  	struct mnt_ns_info info = {};

  	pidfd = pidfd_open(getpid(), 0);
  	if (pidfd < 0)
  		die_errno("pidfd_open failed");

  	fd_mntns = ioctl(pidfd, PIDFD_GET_MNT_NAMESPACE, 0);
  	if (fd_mntns < 0)
  		die_errno("ioctl(PIDFD_GET_MNT_NAMESPACE) failed");

  	ret = ioctl(fd_mntns, NS_MNT_GET_INFO, &info);
  	if (ret < 0)
  		die_errno("ioctl(NS_GET_MNTNS_ID) failed");

  	printf("Listing %u mounts for mount namespace %d:%llu\n", info.nr_mounts, fd_mntns, info.mnt_ns_id);
  	for (;;) {
  		ssize_t nr_mounts;
  	next:
  		nr_mounts = sys_listmount(LSMT_ROOT, last_mnt_id, info.mnt_ns_id, list, LISTMNT_BUFFER, 0);
  		if (nr_mounts <= 0) {
  			printf("Finished listing mounts for mount namespace %d:%llu\n\n", fd_mntns, info.mnt_ns_id);
  			ret = ioctl(fd_mntns, NS_MNT_GET_NEXT, 0);
  			if (ret < 0)
  				die_errno("ioctl(NS_MNT_GET_NEXT) failed");
  			close(ret);
  			ret = ioctl(fd_mntns, NS_MNT_GET_NEXT, &info);
  			if (ret < 0) {
  				if (errno == ENOENT) {
  					printf("Finished listing all mount namespaces\n");
  					exit(0);
  				}
  				die_errno("ioctl(NS_MNT_GET_NEXT) failed");
  			}
  			close(fd_mntns);
  			fd_mntns = ret;
  			last_mnt_id = 0;
  			printf("Listing %u mounts for mount namespace %d:%llu\n", info.nr_mounts, fd_mntns, info.mnt_ns_id);
  			goto next;
  		}

  		for (size_t cur = 0; cur < nr_mounts; cur++) {
  			struct statmount *stmnt;

  			last_mnt_id = list[cur];

  			stmnt = sys_statmount(last_mnt_id, info.mnt_ns_id,
  					      STATMOUNT_SB_BASIC |
  					      STATMOUNT_MNT_BASIC |
  					      STATMOUNT_MNT_ROOT |
  					      STATMOUNT_MNT_POINT |
  					      STATMOUNT_MNT_NS_ID |
  					      STATMOUNT_MNT_OPTS |
  					      STATMOUNT_FS_TYPE,
  					  0);
  			if (!stmnt) {
  				printf("Failed to statmount(%llu) in mount namespace(%llu)\n", last_mnt_id, info.mnt_ns_id);
  				continue;
  			}

  			printf("mnt_id(%u/%llu) | mnt_parent_id(%u/%llu): %s @ %s ==> %s with options: %s\n",
  			       stmnt->mnt_id_old, stmnt->mnt_id,
  			       stmnt->mnt_parent_id_old, stmnt->mnt_parent_id,
  			       stmnt->str + stmnt->fs_type,
  			       stmnt->str + stmnt->mnt_root,
  			       stmnt->str + stmnt->mnt_point,
  			       stmnt->str + stmnt->mnt_opts);
  			free(stmnt);
  		}
  	}

  	exit(0);
  }

* patches from https://lore.kernel.org/r/20240719-work-mount-namespace-v1-0-834113cab0d2@kernel.org:
  nsfs: iterate through mount namespaces
  file: add fput() cleanup helper
  fs: add put_mnt_ns() cleanup helper
  fs: allow mount namespace fd

Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-09 12:47:05 +02:00