vfs-6.12-rc2.fixes

-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZvqoUgAKCRCRxhvAZXjc
 oveLAQC952BB+giATJvX5lPz400g+MoDVlWiPdqAVbF0PCGRRwEA+/7TedOfOkQx
 Df3iAzDXVbX/WvWYdZ/DyLp/r44WHAw=
 =tGkE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "afs:

   - Fix setting of the server responding flag

   - Remove unused struct afs_address_list and afs_put_address_list()
     function

   - Fix infinite loop because of unresponsive servers

   - Ensure that afs_retry_request() function is correctly added to the
     afs_req_ops netfs operations table

  netfs:

   - Fix netfs_folio tracepoint handling to handle NULL mappings

   - Add a missing folio_queue API documentation

   - Ensure that netfs_write_folio() correctly advances the iterator via
     iov_iter_advance()

   - Fix a dentry leak during concurrent cull and cookie lookup
     operations in cachefiles

  pidfs:

   - Correctly handle accessing another task's pid namespace"

* tag 'vfs-6.12-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix the netfs_folio tracepoint to handle NULL mapping
  netfs: Add folio_queue API documentation
  netfs: Advance iterator correctly rather than jumping it
  afs: Fix the setting of the server responding flag
  afs: Remove unused struct and function prototype
  afs: Fix possible infinite loop with unresponsive servers
  pidfs: check for valid pid namespace
  afs: Fix missing wire-up of afs_retry_request()
  cachefiles: fix dentry leak in cachefiles_open_file()
This commit is contained in:
Linus Torvalds 2024-09-30 10:59:44 -07:00
commit a5f24c7955
11 changed files with 410 additions and 24 deletions

View File

@ -0,0 +1,212 @@
.. SPDX-License-Identifier: GPL-2.0+
===========
Folio Queue
===========
:Author: David Howells <dhowells@redhat.com>
.. Contents:
* Overview
* Initialisation
* Adding and removing folios
* Querying information about a folio
* Querying information about a folio_queue
* Folio queue iteration
* Folio marks
* Lockless simultaneous production/consumption issues
Overview
========
The folio_queue struct forms a single segment in a segmented list of folios
that can be used to form an I/O buffer. As such, the list can be iterated over
using the ITER_FOLIOQ iov_iter type.
The publicly accessible members of the structure are::
struct folio_queue {
struct folio_queue *next;
struct folio_queue *prev;
...
};
A pair of pointers are provided, ``next`` and ``prev``, that point to the
segments on either side of the segment being accessed. Whilst this is a
doubly-linked list, it is intentionally not a circular list; the outward
sibling pointers in terminal segments should be NULL.
Each segment in the list also stores:
* an ordered sequence of folio pointers,
* the size of each folio and
* three 1-bit marks per folio,
but hese should not be accessed directly as the underlying data structure may
change, but rather the access functions outlined below should be used.
The facility can be made accessible by::
#include <linux/folio_queue.h>
and to use the iterator::
#include <linux/uio.h>
Initialisation
==============
A segment should be initialised by calling::
void folioq_init(struct folio_queue *folioq);
with a pointer to the segment to be initialised. Note that this will not
necessarily initialise all the folio pointers, so care must be taken to check
the number of folios added.
Adding and removing folios
==========================
Folios can be set in the next unused slot in a segment struct by calling one
of::
unsigned int folioq_append(struct folio_queue *folioq,
struct folio *folio);
unsigned int folioq_append_mark(struct folio_queue *folioq,
struct folio *folio);
Both functions update the stored folio count, store the folio and note its
size. The second function also sets the first mark for the folio added. Both
functions return the number of the slot used. [!] Note that no attempt is made
to check that the capacity wasn't overrun and the list will not be extended
automatically.
A folio can be excised by calling::
void folioq_clear(struct folio_queue *folioq, unsigned int slot);
This clears the slot in the array and also clears all the marks for that folio,
but doesn't change the folio count - so future accesses of that slot must check
if the slot is occupied.
Querying information about a folio
==================================
Information about the folio in a particular slot may be queried by the
following function::
struct folio *folioq_folio(const struct folio_queue *folioq,
unsigned int slot);
If a folio has not yet been set in that slot, this may yield an undefined
pointer. The size of the folio in a slot may be queried with either of::
unsigned int folioq_folio_order(const struct folio_queue *folioq,
unsigned int slot);
size_t folioq_folio_size(const struct folio_queue *folioq,
unsigned int slot);
The first function returns the size as an order and the second as a number of
bytes.
Querying information about a folio_queue
========================================
Information may be retrieved about a particular segment with the following
functions::
unsigned int folioq_nr_slots(const struct folio_queue *folioq);
unsigned int folioq_count(struct folio_queue *folioq);
bool folioq_full(struct folio_queue *folioq);
The first function returns the maximum capacity of a segment. It must not be
assumed that this won't vary between segments. The second returns the number
of folios added to a segments and the third is a shorthand to indicate if the
segment has been filled to capacity.
Not that the count and fullness are not affected by clearing folios from the
segment. These are more about indicating how many slots in the array have been
initialised, and it assumed that slots won't get reused, but rather the segment
will get discarded as the queue is consumed.
Folio marks
===========
Folios within a queue can also have marks assigned to them. These marks can be
used to note information such as if a folio needs folio_put() calling upon it.
There are three marks available to be set for each folio.
The marks can be set by::
void folioq_mark(struct folio_queue *folioq, unsigned int slot);
void folioq_mark2(struct folio_queue *folioq, unsigned int slot);
void folioq_mark3(struct folio_queue *folioq, unsigned int slot);
Cleared by::
void folioq_unmark(struct folio_queue *folioq, unsigned int slot);
void folioq_unmark2(struct folio_queue *folioq, unsigned int slot);
void folioq_unmark3(struct folio_queue *folioq, unsigned int slot);
And the marks can be queried by::
bool folioq_is_marked(const struct folio_queue *folioq, unsigned int slot);
bool folioq_is_marked2(const struct folio_queue *folioq, unsigned int slot);
bool folioq_is_marked3(const struct folio_queue *folioq, unsigned int slot);
The marks can be used for any purpose and are not interpreted by this API.
Folio queue iteration
=====================
A list of segments may be iterated over using the I/O iterator facility using
an ``iov_iter`` iterator of ``ITER_FOLIOQ`` type. The iterator may be
initialised with::
void iov_iter_folio_queue(struct iov_iter *i, unsigned int direction,
const struct folio_queue *folioq,
unsigned int first_slot, unsigned int offset,
size_t count);
This may be told to start at a particular segment, slot and offset within a
queue. The iov iterator functions will follow the next pointers when advancing
and prev pointers when reverting when needed.
Lockless simultaneous production/consumption issues
===================================================
If properly managed, the list can be extended by the producer at the head end
and shortened by the consumer at the tail end simultaneously without the need
to take locks. The ITER_FOLIOQ iterator inserts appropriate barriers to aid
with this.
Care must be taken when simultaneously producing and consuming a list. If the
last segment is reached and the folios it refers to are entirely consumed by
the IOV iterators, an iov_iter struct will be left pointing to the last segment
with a slot number equal to the capacity of that segment. The iterator will
try to continue on from this if there's another segment available when it is
used again, but care must be taken lest the segment got removed and freed by
the consumer before the iterator was advanced.
It is recommended that the queue always contain at least one segment, even if
that segment has never been filled or is entirely spent. This prevents the
head and tail pointers from collapsing.
API Function Reference
======================
.. kernel-doc:: include/linux/folio_queue.h

View File

@ -134,13 +134,4 @@ struct afs_uvldbentry__xdr {
__be32 spares9;
};
struct afs_address_list {
refcount_t usage;
unsigned int version;
unsigned int nr_addrs;
struct sockaddr_rxrpc addrs[];
};
extern void afs_put_address_list(struct afs_address_list *alist);
#endif /* AFS_VL_H */

View File

@ -420,6 +420,7 @@ const struct netfs_request_ops afs_req_ops = {
.begin_writeback = afs_begin_writeback,
.prepare_write = afs_prepare_write,
.issue_write = afs_issue_write,
.retry_request = afs_retry_request,
};
static void afs_add_open_mmap(struct afs_vnode *vnode)

View File

@ -201,7 +201,7 @@ void afs_wait_for_operation(struct afs_operation *op)
}
}
if (op->call_responded)
if (op->call_responded && op->server)
set_bit(AFS_SERVER_FL_RESPONDING, &op->server->flags);
if (!afs_op_error(op)) {

View File

@ -506,10 +506,10 @@ int afs_wait_for_one_fs_probe(struct afs_server *server, struct afs_endpoint_sta
finish_wait(&server->probe_wq, &wait);
dont_wait:
if (estate->responsive_set & ~exclude)
return 1;
if (test_bit(AFS_ESTATE_SUPERSEDED, &estate->flags))
return 0;
if (estate->responsive_set & ~exclude)
return 1;
if (is_intr && signal_pending(current))
return -ERESTARTSYS;
if (timo == 0)

View File

@ -632,8 +632,10 @@ bool afs_select_fileserver(struct afs_operation *op)
wait_for_more_probe_results:
error = afs_wait_for_one_fs_probe(op->server, op->estate, op->addr_tried,
!(op->flags & AFS_OPERATION_UNINTR));
if (!error)
if (error == 1)
goto iterate_address;
if (!error)
goto restart_from_beginning;
/* We've now had a failure to respond on all of a server's addresses -
* immediately probe them again and consider retrying the server.
@ -644,10 +646,13 @@ bool afs_select_fileserver(struct afs_operation *op)
error = afs_wait_for_one_fs_probe(op->server, op->estate, op->addr_tried,
!(op->flags & AFS_OPERATION_UNINTR));
switch (error) {
case 0:
case 1:
op->flags &= ~AFS_OPERATION_RETRY_SERVER;
trace_afs_rotate(op, afs_rotate_trace_retry_server, 0);
trace_afs_rotate(op, afs_rotate_trace_retry_server, 1);
goto retry_server;
case 0:
trace_afs_rotate(op, afs_rotate_trace_retry_server, 0);
goto restart_from_beginning;
case -ERESTARTSYS:
afs_op_set_error(op, error);
goto failed;

View File

@ -595,14 +595,12 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
* write and readdir but not lookup or open).
*/
touch_atime(&file->f_path);
dput(dentry);
return true;
check_failed:
fscache_cookie_lookup_negative(object->cookie);
cachefiles_unmark_inode_in_use(object, file);
fput(file);
dput(dentry);
if (ret == -ESTALE)
return cachefiles_create_file(object);
return false;
@ -611,7 +609,6 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
fput(file);
error:
cachefiles_do_unmark_inode_in_use(object, d_inode(dentry));
dput(dentry);
return false;
}
@ -654,7 +651,9 @@ bool cachefiles_look_up_object(struct cachefiles_object *object)
goto new_file;
}
if (!cachefiles_open_file(object, dentry))
ret = cachefiles_open_file(object, dentry);
dput(dentry);
if (!ret)
return false;
_leave(" = t [%lu]", file_inode(object->file)->i_ino);

View File

@ -317,6 +317,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
struct netfs_io_stream *stream;
struct netfs_group *fgroup; /* TODO: Use this with ceph */
struct netfs_folio *finfo;
size_t iter_off = 0;
size_t fsize = folio_size(folio), flen = fsize, foff = 0;
loff_t fpos = folio_pos(folio), i_size;
bool to_eof = false, streamw = false;
@ -472,7 +473,12 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
if (choose_s < 0)
break;
stream = &wreq->io_streams[choose_s];
wreq->io_iter.iov_offset = stream->submit_off;
/* Advance the iterator(s). */
if (stream->submit_off > iter_off) {
iov_iter_advance(&wreq->io_iter, stream->submit_off - iter_off);
iter_off = stream->submit_off;
}
atomic64_set(&wreq->issued_to, fpos + stream->submit_off);
stream->submit_extendable_to = fsize - stream->submit_off;
@ -487,8 +493,8 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
debug = true;
}
wreq->io_iter.iov_offset = 0;
iov_iter_advance(&wreq->io_iter, fsize);
if (fsize > iter_off)
iov_iter_advance(&wreq->io_iter, fsize - iter_off);
atomic64_set(&wreq->issued_to, fpos + fsize);
if (!debug)

View File

@ -120,6 +120,7 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
struct nsproxy *nsp __free(put_nsproxy) = NULL;
struct pid *pid = pidfd_pid(file);
struct ns_common *ns_common = NULL;
struct pid_namespace *pid_ns;
if (arg)
return -EINVAL;
@ -202,7 +203,9 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case PIDFD_GET_PID_NAMESPACE:
if (IS_ENABLED(CONFIG_PID_NS)) {
rcu_read_lock();
ns_common = to_ns_common( get_pid_ns(task_active_pid_ns(task)));
pid_ns = task_active_pid_ns(task);
if (pid_ns)
ns_common = to_ns_common(get_pid_ns(pid_ns));
rcu_read_unlock();
}
break;

View File

@ -3,6 +3,12 @@
*
* Copyright (C) 2024 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com)
*
* See:
*
* Documentation/core-api/folio_queue.rst
*
* for a description of the API.
*/
#ifndef _LINUX_FOLIO_QUEUE_H
@ -33,6 +39,13 @@ struct folio_queue {
#endif
};
/**
* folioq_init - Initialise a folio queue segment
* @folioq: The segment to initialise
*
* Initialise a folio queue segment. Note that the folio pointers are
* left uninitialised.
*/
static inline void folioq_init(struct folio_queue *folioq)
{
folio_batch_init(&folioq->vec);
@ -43,62 +56,155 @@ static inline void folioq_init(struct folio_queue *folioq)
folioq->marks3 = 0;
}
/**
* folioq_nr_slots: Query the capacity of a folio queue segment
* @folioq: The segment to query
*
* Query the number of folios that a particular folio queue segment might hold.
* [!] NOTE: This must not be assumed to be the same for every segment!
*/
static inline unsigned int folioq_nr_slots(const struct folio_queue *folioq)
{
return PAGEVEC_SIZE;
}
/**
* folioq_count: Query the occupancy of a folio queue segment
* @folioq: The segment to query
*
* Query the number of folios that have been added to a folio queue segment.
* Note that this is not decreased as folios are removed from a segment.
*/
static inline unsigned int folioq_count(struct folio_queue *folioq)
{
return folio_batch_count(&folioq->vec);
}
/**
* folioq_count: Query if a folio queue segment is full
* @folioq: The segment to query
*
* Query if a folio queue segment is fully occupied. Note that this does not
* change if folios are removed from a segment.
*/
static inline bool folioq_full(struct folio_queue *folioq)
{
//return !folio_batch_space(&folioq->vec);
return folioq_count(folioq) >= folioq_nr_slots(folioq);
}
/**
* folioq_is_marked: Check first folio mark in a folio queue segment
* @folioq: The segment to query
* @slot: The slot number of the folio to query
*
* Determine if the first mark is set for the folio in the specified slot in a
* folio queue segment.
*/
static inline bool folioq_is_marked(const struct folio_queue *folioq, unsigned int slot)
{
return test_bit(slot, &folioq->marks);
}
/**
* folioq_mark: Set the first mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Set the first mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_mark(struct folio_queue *folioq, unsigned int slot)
{
set_bit(slot, &folioq->marks);
}
/**
* folioq_unmark: Clear the first mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Clear the first mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_unmark(struct folio_queue *folioq, unsigned int slot)
{
clear_bit(slot, &folioq->marks);
}
/**
* folioq_is_marked2: Check second folio mark in a folio queue segment
* @folioq: The segment to query
* @slot: The slot number of the folio to query
*
* Determine if the second mark is set for the folio in the specified slot in a
* folio queue segment.
*/
static inline bool folioq_is_marked2(const struct folio_queue *folioq, unsigned int slot)
{
return test_bit(slot, &folioq->marks2);
}
/**
* folioq_mark2: Set the second mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Set the second mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_mark2(struct folio_queue *folioq, unsigned int slot)
{
set_bit(slot, &folioq->marks2);
}
/**
* folioq_unmark2: Clear the second mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Clear the second mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_unmark2(struct folio_queue *folioq, unsigned int slot)
{
clear_bit(slot, &folioq->marks2);
}
/**
* folioq_is_marked3: Check third folio mark in a folio queue segment
* @folioq: The segment to query
* @slot: The slot number of the folio to query
*
* Determine if the third mark is set for the folio in the specified slot in a
* folio queue segment.
*/
static inline bool folioq_is_marked3(const struct folio_queue *folioq, unsigned int slot)
{
return test_bit(slot, &folioq->marks3);
}
/**
* folioq_mark3: Set the third mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Set the third mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_mark3(struct folio_queue *folioq, unsigned int slot)
{
set_bit(slot, &folioq->marks3);
}
/**
* folioq_unmark3: Clear the third mark on a folio in a folio queue segment
* @folioq: The segment to modify
* @slot: The slot number of the folio to modify
*
* Clear the third mark for the folio in the specified slot in a folio queue
* segment.
*/
static inline void folioq_unmark3(struct folio_queue *folioq, unsigned int slot)
{
clear_bit(slot, &folioq->marks3);
@ -111,6 +217,19 @@ static inline unsigned int __folio_order(struct folio *folio)
return folio->_flags_1 & 0xff;
}
/**
* folioq_append: Add a folio to a folio queue segment
* @folioq: The segment to add to
* @folio: The folio to add
*
* Add a folio to the tail of the sequence in a folio queue segment, increasing
* the occupancy count and returning the slot number for the folio just added.
* The folio size is extracted and stored in the queue and the marks are left
* unmodified.
*
* Note that it's left up to the caller to check that the segment capacity will
* not be exceeded and to extend the queue.
*/
static inline unsigned int folioq_append(struct folio_queue *folioq, struct folio *folio)
{
unsigned int slot = folioq->vec.nr++;
@ -120,6 +239,19 @@ static inline unsigned int folioq_append(struct folio_queue *folioq, struct foli
return slot;
}
/**
* folioq_append_mark: Add a folio to a folio queue segment
* @folioq: The segment to add to
* @folio: The folio to add
*
* Add a folio to the tail of the sequence in a folio queue segment, increasing
* the occupancy count and returning the slot number for the folio just added.
* The folio size is extracted and stored in the queue, the first mark is set
* and and the second and third marks are left unmodified.
*
* Note that it's left up to the caller to check that the segment capacity will
* not be exceeded and to extend the queue.
*/
static inline unsigned int folioq_append_mark(struct folio_queue *folioq, struct folio *folio)
{
unsigned int slot = folioq->vec.nr++;
@ -130,21 +262,57 @@ static inline unsigned int folioq_append_mark(struct folio_queue *folioq, struct
return slot;
}
/**
* folioq_folio: Get a folio from a folio queue segment
* @folioq: The segment to access
* @slot: The folio slot to access
*
* Retrieve the folio in the specified slot from a folio queue segment. Note
* that no bounds check is made and if the slot hasn't been added into yet, the
* pointer will be undefined. If the slot has been cleared, NULL will be
* returned.
*/
static inline struct folio *folioq_folio(const struct folio_queue *folioq, unsigned int slot)
{
return folioq->vec.folios[slot];
}
/**
* folioq_folio_order: Get the order of a folio from a folio queue segment
* @folioq: The segment to access
* @slot: The folio slot to access
*
* Retrieve the order of the folio in the specified slot from a folio queue
* segment. Note that no bounds check is made and if the slot hasn't been
* added into yet, the order returned will be 0.
*/
static inline unsigned int folioq_folio_order(const struct folio_queue *folioq, unsigned int slot)
{
return folioq->orders[slot];
}
/**
* folioq_folio_size: Get the size of a folio from a folio queue segment
* @folioq: The segment to access
* @slot: The folio slot to access
*
* Retrieve the size of the folio in the specified slot from a folio queue
* segment. Note that no bounds check is made and if the slot hasn't been
* added into yet, the size returned will be PAGE_SIZE.
*/
static inline size_t folioq_folio_size(const struct folio_queue *folioq, unsigned int slot)
{
return PAGE_SIZE << folioq_folio_order(folioq, slot);
}
/**
* folioq_clear: Clear a folio from a folio queue segment
* @folioq: The segment to clear
* @slot: The folio slot to clear
*
* Clear a folio from a sequence in a folio queue segment and clear its marks.
* The occupancy count is left unchanged.
*/
static inline void folioq_clear(struct folio_queue *folioq, unsigned int slot)
{
folioq->vec.folios[slot] = NULL;

View File

@ -448,7 +448,8 @@ TRACE_EVENT(netfs_folio,
),
TP_fast_assign(
__entry->ino = folio->mapping->host->i_ino;
struct address_space *__m = READ_ONCE(folio->mapping);
__entry->ino = __m ? __m->host->i_ino : 0;
__entry->why = why;
__entry->index = folio_index(folio);
__entry->nr = folio_nr_pages(folio);